This repository contains Jupyter notebooks documenting my study of foundational reinforcement learning algorithms.
The DQN notebook includes:
- A vanilla implementation of the DQN algorithm.
- The PyTorch example version of DQN.
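
For orientation, the core of a DQN update is the one-step temporal-difference target. The sketch below is a minimal, dependency-free illustration and is not the code from the notebook: the function names `dqn_td_targets` and `mse_loss` are hypothetical, and the squared-error loss is an assumption (the PyTorch tutorial, for instance, uses a Huber-style loss instead).

```python
from typing import List, Sequence


def dqn_td_targets(
    rewards: Sequence[float],
    next_q_values: Sequence[Sequence[float]],
    dones: Sequence[bool],
    gamma: float = 0.99,
) -> List[float]:
    """Compute y = r + gamma * max_a Q_target(s', a) for each transition.

    next_q_values[i] holds the target network's Q-values for every action
    in the next state of transition i. Terminal transitions do not bootstrap.
    """
    targets = []
    for r, q_next, done in zip(rewards, next_q_values, dones):
        bootstrap = 0.0 if done else gamma * max(q_next)
        targets.append(r + bootstrap)
    return targets


def mse_loss(q_taken: Sequence[float], targets: Sequence[float]) -> float:
    """Mean squared error between Q(s, a) for the taken actions and the targets."""
    return sum((q - y) ** 2 for q, y in zip(q_taken, targets)) / len(targets)


# Hypothetical batch of two transitions; the second one is terminal.
targets = dqn_td_targets(
    rewards=[1.0, 0.0],
    next_q_values=[[0.5, 2.0], [3.0, 1.0]],
    dones=[False, True],
    gamma=0.9,
)
loss = mse_loss(q_taken=[2.0, 0.5], targets=targets)
```

In a real implementation the same arithmetic is done on tensors, and the loss is minimised with respect to the online network only, with the target network held fixed.
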
The policy gradient notebook includes:
- REINFORCE
- A2C
- PPO
These algorithms are implemented as vanilla versions, except that the policy update direction is normalised.
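
One possible reading of "normalised update direction" is that the policy gradient is rescaled to unit L2 norm before the step is taken, so the learning rate alone controls the step size. The sketch below illustrates that reading only. It is an assumption, not the notebook's code: the phrase could equally refer to normalising returns or advantages, and the names `normalise_direction` and `gradient_ascent_step` are hypothetical.

```python
import math
from typing import List, Sequence


def normalise_direction(grad: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Rescale a gradient vector to (approximately) unit L2 norm.

    eps guards against division by zero when the gradient vanishes.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    return [g / (norm + eps) for g in grad]


def gradient_ascent_step(
    params: Sequence[float], grad: Sequence[float], lr: float = 0.01
) -> List[float]:
    """Take one ascent step of length ~lr along the normalised gradient."""
    direction = normalise_direction(grad)
    return [p + lr * d for p, d in zip(params, direction)]


# Hypothetical two-parameter policy and gradient estimate.
new_params = gradient_ascent_step(params=[0.0, 0.0], grad=[3.0, 4.0], lr=0.1)
```

With this scheme the step length does not depend on the gradient's magnitude, which can make a single learning rate easier to share across REINFORCE, A2C, and PPO, at the cost of discarding magnitude information.
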
This project uses a Conda environment specified in environment.yml.
From the project root directory, run:
conda env create -f environment.yml