Advanced Actor Critic and Policy Gradient Methods

12 videos • 13,401 views • by Machine Learning with Phil

In this series of deep reinforcement learning tutorials, you will learn how to apply advanced actor-critic methods to environments from the OpenAI Gym with continuous action spaces. You will read the original deep deterministic policy gradients (DDPG) paper and implement the algorithm. We'll also cover how to handle multithreaded processing in Python with the asynchronous advantage actor-critic (A3C) algorithm. We then move on to more advanced topics such as proximal policy optimization (PPO), twin delayed deep deterministic policy gradients (TD3), and soft actor-critic (SAC). Tutorials are presented in both the PyTorch and TensorFlow deep learning frameworks.