Code for "Efficient Multi-Goal Reinforcement Learning via Value Consistency Prioritization".
Requirements: Python 3.6+, TensorFlow, Gym, MuJoCo.
Clone the repo and cd into it:
git clone https://github.com/jiawei415/VCP.git
cd VCP
Install the vcp package:
pip install -e .
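To confirm that the dependencies are visible to the interpreter you will train with, a small check like the following may help. It is only a sketch: `missing_modules` is a hypothetical helper, not part of this repo, and the import name `mujoco_py` is an assumption about which MuJoCo binding your Gym version uses.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "mujoco_py" is an assumed import name; adjust it to match your MuJoCo binding.
    required = ["vcp", "tensorflow", "gym", "mujoco_py"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only looks the modules up; it does not verify versions or that MuJoCo itself is correctly licensed and configured.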
Environments: PointMassEmptyEnv-v1, Reacher-v2, FetchReach-v1, HandReach-v0, HandManipulatePenRotate-v0.
VCP:
python -m vcp.run --env PointMassEmptyEnv-v1 --num_epoch 50 --num_env 1 --alg_config "{'k_heads':16,'priority_temperature':9.0}"
HER:
python -m vcp.run --env PointMassEmptyEnv-v1 --num_epoch 50 --num_env 16 --alg_config "{'k_heads':1,'prioritized_replay':False,'use_her_buffer':False}"
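The `--alg_config` argument is a quoted Python-style dict, which is easy to mistype in a shell. When launching many runs, one option is to build the command from a real dict instead. The sketch below uses only the flags shown above; `format_alg_config` and `build_vcp_command` are hypothetical helpers, not part of the vcp package, and the compact `{'key':value}` formatting is simply copied from the commands above rather than taken from the parser's documentation.

```python
import shlex
import sys


def format_alg_config(overrides):
    """Render a dict as a compact Python-literal string, e.g. {'k_heads':16}."""
    return "{" + ",".join(f"{key!r}:{value!r}" for key, value in overrides.items()) + "}"


def build_vcp_command(env, num_epoch, num_env, overrides):
    """Return the argv list for `python -m vcp.run` with the given settings."""
    return [
        sys.executable, "-m", "vcp.run",
        "--env", env,
        "--num_epoch", str(num_epoch),
        "--num_env", str(num_env),
        "--alg_config", format_alg_config(overrides),
    ]


if __name__ == "__main__":
    # Same settings as the VCP command above.
    cmd = build_vcp_command(
        "PointMassEmptyEnv-v1", 50, 1,
        {"k_heads": 16, "priority_temperature": 9.0},
    )
    # Print a shell-safe version of the command for inspection.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would start the run without any shell quoting involved.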