Reinforcement-Learning

Reinforcement learning is a computational approach to understanding and automating goal-directed learning and decision making.

maze

Agents learn to navigate a maze with Q-learning, with comparisons against Expected SARSA, Dyna-Q, and Dyna-Q+.
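
The difference between the methods named above comes down to the update target and to whether a learned model is replayed. The following is a minimal tabular sketch, not the code in this repository: the epsilon-greedy policy, the dict-of-lists Q-table, the deterministic `model` mapping `(state, action)` to `(reward, next_state)`, and all function names are assumptions made for illustration. Terminal states and the Dyna-Q+ exploration bonus are left out.

```python
import random


def epsilon_greedy_probs(q_values, epsilon):
    """Action probabilities of an epsilon-greedy policy (ties split evenly)."""
    n = len(q_values)
    best = max(q_values)
    greedy = [i for i, q in enumerate(q_values) if q == best]
    probs = [epsilon / n] * n
    for i in greedy:
        probs[i] += (1.0 - epsilon) / len(greedy)
    return probs


def q_learning_target(reward, next_q, gamma):
    """Q-learning bootstraps from the greedy (max) next action value."""
    return reward + gamma * max(next_q)


def expected_sarsa_target(reward, next_q, gamma, epsilon):
    """Expected SARSA bootstraps from the policy-weighted average instead."""
    probs = epsilon_greedy_probs(next_q, epsilon)
    return reward + gamma * sum(p * q for p, q in zip(probs, next_q))


def td_update(q, state, action, target, alpha):
    """Move Q(state, action) a fraction alpha toward the target."""
    q[state][action] += alpha * (target - q[state][action])


def dyna_q_planning(q, model, n_steps, alpha, gamma, rng):
    """Dyna-Q planning: replay n_steps simulated transitions from the model."""
    seen = list(model.keys())
    for _ in range(n_steps):
        state, action = rng.choice(seen)
        reward, next_state = model[(state, action)]
        target = q_learning_target(reward, q[next_state], gamma)
        td_update(q, state, action, target, alpha)
```

With next-state values `[1.0, 3.0]`, reward `1.0`, and `gamma = 0.9`, the Q-learning target is 3.7, while the Expected SARSA target under `epsilon = 0.1` is about 3.61, because the exploratory action pulls the expectation down.
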

mountain-car

On-policy control with function approximation: an underpowered car learns to climb a hill using episodic semi-gradient one-step SARSA.
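
As a rough illustration of the semi-gradient SARSA update, the sketch below assumes a linear action-value function over binary features (for example, the active tiles of a tile coder), so that the gradient is 1 at each active index. The function names and the list-based weight vector are hypothetical, and the feature indices passed in are assumed to be unique.

```python
def q_hat(weights, active_features):
    """Linear action value: sum of the weights at the active binary features."""
    return sum(weights[i] for i in active_features)


def semi_gradient_sarsa_update(weights, features, reward, next_features,
                               alpha, gamma, terminal):
    """One episodic semi-gradient one-step SARSA update.

    features / next_features are the active feature indices for (S, A)
    and (S', A'). Returns the TD error.
    """
    # At the end of an episode there is no next action value to bootstrap from.
    if terminal:
        target = reward
    else:
        target = reward + gamma * q_hat(weights, next_features)
    td_error = target - q_hat(weights, features)
    # "Semi-gradient": only q_hat(S, A) is differentiated, not the target.
    for i in features:
        weights[i] += alpha * td_error
    return td_error
```

With tile coding, `alpha` is commonly divided by the number of tilings so that a single update does not overshoot the target.
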

pendulum

Average Reward Softmax Actor-Critic on a continuing task.
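
In the average-reward setting the TD error subtracts a running estimate of the reward rate instead of discounting. Here is a minimal tabular sketch of one update, assuming the agent is stored as a plain dict with hypothetical keys (`avg_reward`, `v`, `theta`, and three step sizes); a function-approximation version would replace the table lookups with feature vectors.

```python
import math


def softmax(prefs):
    """Numerically stable softmax over action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(agent, state, action, reward, next_state):
    """One average-reward softmax actor-critic update; returns the TD error."""
    # Differential TD error: reward relative to the average reward estimate.
    delta = (reward - agent["avg_reward"]
             + agent["v"][next_state] - agent["v"][state])
    agent["avg_reward"] += agent["alpha_r"] * delta
    # Critic: tabular state-value update.
    agent["v"][state] += agent["alpha_v"] * delta
    # Actor: gradient of ln pi(a|s) w.r.t. theta[s][b] is 1{a == b} - pi(b|s).
    probs = softmax(agent["theta"][state])
    for b in range(len(probs)):
        grad = (1.0 if b == action else 0.0) - probs[b]
        agent["theta"][state][b] += agent["alpha_theta"] * delta * grad
    return delta
```

A positive TD error raises the preference for the action just taken and lowers the others, so the softmax policy shifts toward it.
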

lunar-lander

Deep RL using experience replay and Expected SARSA.
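
The two ingredients can be sketched without a deep-learning framework: a fixed-size buffer of past transitions, and Expected SARSA targets computed for a sampled minibatch. In this sketch the softmax policy with temperature `tau`, the transition layout, and `q_fn` (a stand-in for the network's forward pass that returns one value per action) are assumptions, not details taken from this repository.

```python
import math
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity, seed=None):
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def append(self, state, action, reward, terminal, next_state):
        self.buffer.append((state, action, reward, terminal, next_state))

    def sample(self, batch_size):
        """Uniformly sample a minibatch without replacement."""
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def softmax(values, tau):
    """Softmax policy over action values with temperature tau."""
    m = max(values)
    exps = [math.exp((v - m) / tau) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def expected_sarsa_targets(batch, q_fn, gamma, tau):
    """Expected SARSA target for every transition in the minibatch."""
    targets = []
    for _state, _action, reward, terminal, next_state in batch:
        if terminal:
            targets.append(reward)  # nothing to bootstrap from
        else:
            next_q = q_fn(next_state)
            probs = softmax(next_q, tau)
            expected = sum(p * q for p, q in zip(probs, next_q))
            targets.append(reward + gamma * expected)
    return targets
```

The network would then be trained to move its prediction for each sampled `(state, action)` pair toward the corresponding target.
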

pong

Learning to play Pong from raw pixels using policy gradients.
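
A policy-gradient method weights the gradient of each action's log-probability by the return that followed it. The sketch below shows only that return computation and makes two assumptions not stated above: the running sum is reset whenever a nonzero reward appears (a point scored ends a rally in Pong), and returns are standardized before use to reduce variance. Both function names are hypothetical.

```python
def discount_rewards(rewards, gamma):
    """Discounted return at every step, reset at each scored point."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        if rewards[t] != 0:
            running = 0.0  # assumed Pong-specific rally boundary
        running = running * gamma + rewards[t]
        returns[t] = running
    return returns


def standardize(values):
    """Shift to zero mean and scale to (roughly) unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = var ** 0.5
    return [(v - mean) / (std + 1e-8) for v in values]
```

For example, `discount_rewards([0, 0, 1, 0, -1], 0.5)` gives `[0.25, 0.5, 1.0, -0.5, -1.0]`: actions before the won point get positive credit, and the action before the lost point gets negative credit.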

References

Richard S. Sutton and Andrew G. Barto, Reinforcement Learning: An Introduction (second edition).