An MXNet implementation of meta-RL

This is an attempt to implement the bandit algorithm from the paper "Learning to reinforcement learn".

The algorithm should be mostly correct. This work is based on the TensorFlow implementation at https://github.com/awjuliani/Meta-RL
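
To make the bandit setting concrete, here is a minimal sketch of a two-armed bandit episode loop in plain Python. It is not taken from a3c-bandit.py: the class TwoArmedBandit, the functions run_episode and random_policy, the dependent-arm setup (p and 1 - p), and the exact input encoding (one-hot previous action, previous reward, timestep) are all assumptions chosen for illustration. The random policy stands in for the recurrent A3C agent.

```python
import random


class TwoArmedBandit:
    """Hypothetical two-armed Bernoulli bandit; arm probabilities change every episode."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Assumed dependent arms: one arm pays with probability p, the other with 1 - p.
        p = self.rng.random()
        self.probs = [p, 1.0 - p]
        return self.probs

    def pull(self, action):
        # Bernoulli reward for the chosen arm.
        return 1.0 if self.rng.random() < self.probs[action] else 0.0


def run_episode(env, policy, episode_len=100):
    """Roll out one episode, feeding the previous action and reward back to the policy."""
    env.reset()
    prev_action, prev_reward = 0, 0.0
    total = 0.0
    for t in range(episode_len):
        # Assumed meta-RL input: one-hot previous action, previous reward, timestep.
        obs = [1.0 if prev_action == a else 0.0 for a in range(2)]
        obs += [prev_reward, float(t)]
        action = policy(obs)
        reward = env.pull(action)
        prev_action, prev_reward = action, reward
        total += reward
    return total


def random_policy(obs):
    # Placeholder for the recurrent A3C policy; it ignores the observation.
    return random.randrange(2)


if __name__ == "__main__":
    env = TwoArmedBandit(seed=0)
    print("episode return:", run_episode(env, random_policy, episode_len=100))
```

Because the arm probabilities are resampled on every reset, a policy can only do well by using the reward history within an episode, which is the reason the previous action and reward are fed back as input.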

The Labyrinth experiments will be pushed soon.

Usage

Run: python a3c-bandit.py --num-threads=32 --episode-len=100
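
For readers unfamiliar with this style of command line, the following is a sketch of how the two flags above could be parsed with Python's standard argparse module. It is an illustration only and may not match what a3c-bandit.py actually does: the function name build_parser, the default values, and the help strings are assumptions.

```python
import argparse


def build_parser():
    # Hypothetical parser; only the two flag names come from the usage line.
    parser = argparse.ArgumentParser(description="A3C bandit training (sketch)")
    parser.add_argument("--num-threads", type=int, default=32,
                        help="number of parallel worker threads (assumed meaning)")
    parser.add_argument("--episode-len", type=int, default=100,
                        help="number of bandit pulls per episode (assumed meaning)")
    return parser


# Parse the same flags shown in the usage line; argparse maps dashes to underscores.
args = build_parser().parse_args(["--num-threads=32", "--episode-len=100"])
print(args.num_threads, args.episode_len)
```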
