
Reinforcement-Learning

A computational approach to understanding and automating goal-directed learning and decision making.

  • maze

    Training agents to navigate a maze with Q-learning, with comparisons against Expected SARSA, Dyna-Q, and Dyna-Q+.

  • mountain-car

    On-policy control with function approximation: an underpowered car climbs a hill using episodic semi-gradient one-step SARSA.

  • pendulum

    Average Reward Softmax Actor-Critic on a continuing task.

  • lunar-lander

    Deep RL using experience replay and Expected SARSA.

  • pong

    Pong from pixels using policy gradients.
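
Several of the projects above rely on temporal-difference updates, so a small illustration of how the Q-learning and Expected SARSA targets differ may help. This is a minimal tabular sketch and is not taken from the repository code. The function names (`epsilon_greedy_probs`, `q_learning_target`, `expected_sarsa_target`, `td_update`) are hypothetical, an epsilon-greedy policy is assumed, and terminal-state handling is left out for brevity.

```python
def epsilon_greedy_probs(q_values, epsilon):
    """Action probabilities of an epsilon-greedy policy (assumed policy form)."""
    n = len(q_values)
    best = max(q_values)
    greedy = [i for i, q in enumerate(q_values) if q == best]
    # Every action gets epsilon / n; the remaining mass is split among greedy actions.
    probs = [epsilon / n] * n
    for i in greedy:
        probs[i] += (1.0 - epsilon) / len(greedy)
    return probs


def q_learning_target(reward, next_q, gamma):
    """Q-learning bootstraps from the maximum next-state action value."""
    return reward + gamma * max(next_q)


def expected_sarsa_target(reward, next_q, gamma, epsilon):
    """Expected SARSA bootstraps from the policy-weighted average of next-state values."""
    probs = epsilon_greedy_probs(next_q, epsilon)
    return reward + gamma * sum(p * q for p, q in zip(probs, next_q))


def td_update(q, state, action, target, alpha):
    """Move Q(state, action) a fraction alpha toward the given target, in place."""
    q[state][action] += alpha * (target - q[state][action])


if __name__ == "__main__":
    next_q = [1.0, 3.0]  # hypothetical action values in the next state
    print(q_learning_target(0.0, next_q, gamma=0.5))
    print(expected_sarsa_target(0.0, next_q, gamma=0.5, epsilon=0.2))
```

With a nonzero `epsilon`, the Expected SARSA target is never larger than the Q-learning target, because it averages over exploratory actions instead of taking the maximum. As `epsilon` goes to 0 the two targets coincide.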

References

Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction (second edition). (Full PDF)