Solving the Bipedal-Walker Gym environment.

As shown in the image below, the algorithm achieves an average reward of 300+ over 100 episodes, which is the criterion for "solving" the environment.

bipedal_walker_demo.mov
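
The "solved" criterion above can be checked with a trailing 100-episode average. The following is a minimal sketch, not the code used in this repository: the function name `first_solved_episode`, the constants, and the choice of `>=` for "300+" are assumptions made for illustration.

```python
from collections import deque

# Assumed constants, taken from the criterion stated above.
SOLVED_THRESHOLD = 300.0  # required average reward
WINDOW = 100              # number of most recent episodes averaged


def first_solved_episode(episode_rewards, threshold=SOLVED_THRESHOLD, window=WINDOW):
    """Return the 1-based episode at which the trailing-window mean
    first reaches the threshold, or None if it never does."""
    recent = deque(maxlen=window)
    running_sum = 0.0
    for episode, reward in enumerate(episode_rewards, start=1):
        # Drop the oldest reward from the sum before the deque evicts it.
        if len(recent) == window:
            running_sum -= recent[0]
        recent.append(reward)
        running_sum += reward
        # Only a full window counts toward the criterion.
        if len(recent) == window and running_sum / window >= threshold:
            return episode
    return None
```

Passing the list of per-episode total rewards logged during training returns the episode at which the criterion is first met; a run with fewer than 100 episodes always returns `None`.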

Repository contents:

models
plots
README.md
bipedal_walker_demo.mov
main.py
ppoM.py
ppo_attn.py
ppo_crossattn.py
requirements.txt
solved1.png
train_ppoM.py
train_ppo_attn.py
train_ppo_crossattn.py