John Schulman, Sergey Levine, Philipp Moritz, Michael Jordan and Pieter Abbeel, "Trust Region Policy Optimization", International Conference on Machine Learning (ICML), arXiv:1502.05477.
John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford and Oleg Klimov, "Proximal Policy Optimization Algorithms", arXiv:1707.06347.
Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver and Koray Kavukcuoglu, "Asynchronous Methods for Deep Reinforcement Learning", International Conference on Machine Learning (ICML), arXiv:1602.01783.
Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel and Sergey Levine, "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor", International Conference on Machine Learning (ICML), arXiv:1801.01290.
Patrik Reizinger and Márton Szemenyei, "Attention-based Curiosity-driven Exploration in Deep Reinforcement Learning", IEEE, arXiv:1910.10840.
Marcin Andrychowicz, Filip Wolski, Alex Ray, Jonas Schneider, Rachel Fong, Peter Welinder, Bob McGrew, Josh Tobin, Pieter Abbeel and Wojciech Zaremba, "Hindsight Experience Replay", Advances in Neural Information Processing Systems (NeurIPS), arXiv:1707.01495.
Levine, Finn, Darrell and Abbeel, "End-to-end training of deep visuomotor policies", JMLR, vol. 17, arXiv:1504.00702.
Tom Schaul, Daniel Horgan, Karol Gregor and David Silver, "Universal Value Function Approximators", International Conference on Machine Learning (ICML).
Katsunari Shibata and Yoichi Okabe, "Reinforcement Learning When Visual Sensory Signals are Directly Given as Inputs", International Conference on Neural Networks (ICNN) 1997.
Katsunari Shibata and Masaru Iida, "Acquisition of Box Pushing by Direct-Vision-Based Reinforcement Learning", SICE Annual Conference 2003.
Hiroki Utsunomiya and Katsunari Shibata, "Contextual Behavior and Internal Representations Acquired by Reinforcement Learning with a Recurrent Neural Network in a Continuous State and Action Space Task", International Conference on Neural Information Processing (ICONIP) '08.
Katsunari Shibata and Tomohiko Kawano, "Learning of Action Generation from Raw Camera Images in a Real-World-like Environment by Simple Coupling of Reinforcement Learning and a Neural Network", International Conference on Neural Information Processing (ICONIP) '08.