Hierarchical Actor-Critic (HAC) helps agents learn tasks more quickly by enabling them to break problems down into short sequences of actions. It builds on three components (a rough sketch of how they fit together follows this list):
- DDPG (Lillicrap et al. 2016),
- Universal Value Function Approximators (UVFA) (Schaul et al. 2015), and
- Hindsight Experience Replay (HER) (Andrychowicz et al. 2017).
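Concretely, each level of the hierarchy is a goal-conditioned agent: it receives a goal as part of its input (the UVFA idea), proposes subgoals for the level below or primitive actions at the lowest level, and relabels transitions with achieved states as goals (the HER idea). The sketch below is illustrative only, assuming a two-level setup, the classic Gym 4-tuple `step` API, and hypothetical class and method names; it is not the repository's actual code.

```python
import numpy as np

class HACLevel:
    """One level of a hypothetical HAC hierarchy (illustrative only).

    `policy` maps a (state, goal) pair to either a subgoal for the level
    below or, at the lowest level, a primitive action.
    """

    def __init__(self, policy, horizon, is_lowest, goal_tol=0.1):
        self.policy = policy
        self.horizon = horizon      # H: max attempts per goal at this level
        self.is_lowest = is_lowest
        self.goal_tol = goal_tol
        self.replay = []            # (state, action, reward, next_state, goal)

    def run(self, env, state, goal, lower=None):
        """Pursue `goal` for at most `horizon` attempts, delegating to `lower`."""
        for _ in range(self.horizon):
            action = self.policy(state, goal)
            if self.is_lowest:
                # Classic Gym 4-tuple step API assumed here.
                next_state, _, done, _ = env.step(action)
            else:
                # Higher levels treat their action as a subgoal for the level below.
                next_state = lower.run(env, state, action)
                done = False
            # Simplification: the goal is compared against the full state.
            achieved = np.allclose(next_state, goal, atol=self.goal_tol)
            reward = 0.0 if achieved else -1.0
            self.replay.append((state, action, reward, next_state, goal))
            state = next_state
            if achieved or done:
                break
        # HER-style relabeling: also store the last transition as if the
        # state actually reached had been the goal all along.
        s0, a0, _, s1, _ = self.replay[-1]
        self.replay.append((s0, a0, 0.0, s1, s1))
        return state
```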
Deep Deterministic Policy Gradient (DDPG) is an actor-critic, model-free, off-policy algorithm for learning a policy over continuous action domains. It was proposed by Lillicrap et al. 2016 after the success of the Deep Q-Network (DQN) for discrete action domains in Mnih et al. 2015. DDPG builds on the Deterministic Policy Gradient (DPG) actor-critic algorithm proposed by Silver et al. 2014 and adopts two key innovations from DQN:
- the network is trained off-policy with samples from a "replay buffer" to minimize correlations between samples;
- the network is trained with a separate "target Q network" to give consistent targets during temporal difference backups.
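A minimal sketch of those two ingredients, assuming a PyTorch implementation: a fixed-capacity replay buffer sampled uniformly, and a Polyak (soft) update that makes the target network slowly track the learned one. The capacity, batch size, and rate `tau` below are illustrative defaults, not values taken from this repository.

```python
import random
from collections import deque

import torch.nn as nn

class ReplayBuffer:
    """Fixed-capacity buffer; uniform sampling breaks temporal correlations."""

    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=64):
        # Copy to a list so random.sample gets a plain sequence.
        return random.sample(list(self.buffer), batch_size)

def soft_update(target: nn.Module, source: nn.Module, tau: float = 0.005):
    """Polyak-average the learned weights into the target network."""
    for t_param, s_param in zip(target.parameters(), source.parameters()):
        t_param.data.copy_((1.0 - tau) * t_param.data + tau * s_param.data)
```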
You can modify the training and testing configuration and the parameters of HAC and DDPG in the main.py file.
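For orientation, the kinds of settings involved typically look like the dictionary below; the names and values are illustrative placeholders and may not match the variables actually used in main.py.

```python
# Illustrative placeholders only -- check main.py for the real names and defaults.
config = {
    "env_name": "Pendulum-v1",     # environment id, passed on the command line
    "num_levels": 2,               # number of HAC levels in the hierarchy
    "horizon": 10,                 # H: attempts each level gets per (sub)goal
    "subgoal_test_rate": 0.3,      # lambda: fraction of subgoals that are tested
    "gamma": 0.95,                 # DDPG discount factor
    "actor_lr": 1e-3,              # DDPG actor learning rate
    "critic_lr": 1e-3,             # DDPG critic learning rate
    "tau": 0.005,                  # target-network soft-update rate
    "batch_size": 64,
    "replay_capacity": 100_000,
    "train_episodes": 1000,
    "test_episodes": 10,
}
```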
To train or test on an environment, run for example:

python3 main.py 'MountainCarContinuous-v1'

python3 main.py 'Pendulum-v1'