Deep Reinforcement Learning Explained

Content of this series

Below is the up-to-date index of the posts published in this series.

Part 1: Essential concepts in Reinforcement Learning and Deep Learning

01: A gentle introduction to Deep Reinforcement Learning, Learning the basics of Reinforcement Learning (15/05/2020)

02: Formalization of a Reinforcement Learning Problem, Agent-Environment interaction in a MDP (22/05/2020)

03: Deep Learning Basics, Basic concepts for Beginners (27/05/2020)

04: Deep Learning with PyTorch, First contact with PyTorch for Beginners (01/06/2020)

05: PyTorch Performance Analysis with TensorBoard, How to run TensorBoard for PyTorch inside Colab (03/06/2020)

06: Solving an RL Problem Using Cross-Entropy Method, Agent Creation Using Deep Neural Networks (04/06/2020)

07: Cross-Entropy Method Performance Analysis, Implementation of the Cross-Entropy Training Loop (08/06/2020)

08: The Bellman Equation, V-function and Q-function Explained (11/06/2020)

09: The Value Iteration Algorithm, Estimation of Transitions and Rewards from the Agent’s experience (13/06/2020)

10: Value Iteration for V-function, V-function in Practice for Frozen-Lake Environment (14/06/2020)

11: Value Iteration for Q-function, Frozen-Lake code for Q-function (15/06/2020)

Part 2: Implementation of Reinforcement Learning classical methods

12: Reviewing Essential Concepts from Part 1, Mathematical Notation Updated (12/07/2020)

13: Monte Carlo Methods, Exploration-Exploitation Dilemma (22/07/2020)

14: MC Control Methods and Temporal-Difference Methods, Constant-alpha MC Control, Sarsa, Q-Learning (26/07/2020)

15: Deep Q-Network – I: OpenAI Gym and Wrappers (16/08/2020)

16: Deep Q-Network – II: Experience Replay & Target Network (16/08/2020)

17: Deep Q-Network – III: Performance & Use (16/08/2020)

18: Policy-based Methods, Hill Climbing algorithm (07/09/2020)

19: Policy-Gradient Methods, REINFORCE algorithm (10/09/2020)

20: Reinforcement Learning Frameworks, Solving CartPole Environment using RLlib on Ray framework (27/09/2020)




This is an introductory series with a practical approach. It covers the basic concepts of Reinforcement Learning and Deep Learning needed to get started in the area of Deep Reinforcement Learning.

How did this series start?

I started writing this series during the lockdown in Barcelona. Honestly, writing these posts in my spare time helped me to #StayAtHome. Thank you for reading this publication in those days; it justifies the effort I made.

Our research in DRL

Our research group at UPC Barcelona Tech and Barcelona Supercomputing Center is doing research on this topic. Our latest paper in this area is “Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills”, presented at the 37th International Conference on Machine Learning (ICML2020). The paper presents a novel paradigm for unsupervised skill discovery in Reinforcement Learning. It is the latest contribution of @vcampos7, one of our Ph.D. students, co-advised with @DocXavi. The paper is co-authored with @alexrtrott, @CaimingXiong, and @RichardSocher from Salesforce Research.

About BSC and UPC

The Barcelona Supercomputing Center (BSC) is a public research center located in Barcelona. It hosts MareNostrum, a 13.7 Petaflops supercomputer, which also includes clusters of emerging technologies. In June 2017, it ranked 13th in the world.

The Polytechnic University of Catalonia (Universitat Politècnica de Catalunya), currently referred to as BarcelonaTech, and commonly known as UPC, is the largest engineering university in Catalonia, Spain. It also offers programs in other disciplines such as mathematics and architecture.

About Towards Data Science

Towards Data Science provides a platform to exchange knowledge through Medium. I thank Towards Data Science very much for accepting my contributions for publication, which has allowed me to become one of the top writers in Artificial Intelligence on Medium.