Algorithms in ReinforceUI-Studio
ReinforceUI-Studio includes a growing collection of algorithms. This list is continuously updated as new algorithms are added. If you have an algorithm suggestion, feel free to contact us to have it included.
Algorithm | Paper Link | Citation |
---|---|---|
TD3 | https://arxiv.org/abs/1802.09477v3 | Fujimoto, Scott, Herke van Hoof, and David Meger. “Addressing function approximation error in actor-critic methods.” In International Conference on Machine Learning, pp. 1587-1596. PMLR, 2018. |
SAC | https://arxiv.org/abs/1801.01290 | Haarnoja, Tuomas, Aurick Zhou, Pieter Abbeel, and Sergey Levine. “Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor.” In International conference on machine learning, pp. 1861-1870. PMLR, 2018. |
TQC | https://arxiv.org/abs/2005.04269 | Kuznetsov, Arsenii, Pavel Shvechikov, Alexander Grishin, and Dmitry Vetrov. “Controlling overestimation bias with truncated mixture of continuous distributional quantile critics.” In International Conference on Machine Learning, pp. 5556-5566. PMLR, 2020. |
DDPG | https://arxiv.org/abs/1509.02971 | Lillicrap, Timothy P., Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. “Continuous control with deep reinforcement learning.” arXiv preprint arXiv:1509.02971 (2015). |
CTD4 | https://arxiv.org/abs/2405.02576 | Valencia, David, Henry Williams, Yuning Xing, Trevor Gee, Bruce A. MacDonald, and Minas Liarokapis. “CTD4 - A Deep Continuous Distributional Actor-Critic Agent with a Kalman Fusion of Multiple Critics.” arXiv preprint arXiv:2405.02576 (2024). |
PPO | https://arxiv.org/abs/1707.06347 | Schulman, John, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. “Proximal policy optimization algorithms.” arXiv preprint arXiv:1707.06347 (2017). |
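
As a quick illustration of the kind of update several of the off-policy algorithms above perform, the sketch below shows the clipped double-Q target described in the TD3 paper cited in the table. This is a minimal example for orientation only, not ReinforceUI-Studio's implementation; `actor_target`, `critic1_target`, and `critic2_target` are assumed to be PyTorch modules you would define yourself.

```python
# Minimal sketch of the TD3 clipped double-Q target (Fujimoto et al., 2018).
# Not ReinforceUI-Studio code: actor_target, critic1_target, and critic2_target
# are hypothetical target networks (torch.nn.Module instances).
import torch

def td3_target(reward, next_state, done, actor_target, critic1_target, critic2_target,
               gamma=0.99, policy_noise=0.2, noise_clip=0.5, max_action=1.0):
    with torch.no_grad():
        # Target policy smoothing: perturb the target action with clipped noise.
        next_action_mean = actor_target(next_state)
        noise = (torch.randn_like(next_action_mean) * policy_noise).clamp(-noise_clip, noise_clip)
        next_action = (next_action_mean + noise).clamp(-max_action, max_action)

        # Clipped double-Q: use the minimum of the two target critics to
        # reduce the overestimation bias the paper addresses.
        q1 = critic1_target(next_state, next_action)
        q2 = critic2_target(next_state, next_action)
        target_q = reward + gamma * (1.0 - done) * torch.min(q1, q2)
    return target_q
```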