Merge pull request #243 from cpnota/release/0.7.0
Release/0.7.0
Showing 173 changed files with 4,980 additions and 2,049 deletions.
New file (@@ -0,0 +1,38 @@): a GitHub Actions workflow named "Python package".

# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions

name: Python package

on:
  push:
    branches: [ master, develop ]
  pull_request:
    branches: [ master, develop ]

jobs:
  build:

    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [3.6, 3.7, 3.8]

    steps:
      - uses: actions/checkout@v2
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dependencies
        run: |
          sudo apt-get install swig
          sudo apt-get install unrar
          pip install torch==1.8.0+cpu -f https://download.pytorch.org/whl/torch_stable.html
          make install
          AutoROM -v
      - name: Lint code
        run: |
          make lint
      - name: Run tests
        run: |
          make test
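Since the matrix above fixes both the interpreter range and the torch build, a quick local sanity check can confirm that a development environment matches what CI actually tests. This is an illustrative sketch, not part of the diff:

import sys
import torch  # the workflow installs torch 1.8.0+cpu before "make install"

# The CI matrix covers CPython 3.6, 3.7, and 3.8 and pins torch 1.8.0.
assert sys.version_info[:2] in [(3, 6), (3, 7), (3, 8)], "Python version outside the CI matrix"
assert torch.__version__.startswith("1.8.0"), "CI pins torch==1.8.0+cpu"
print("Local environment matches the CI matrix.")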
New file (@@ -0,0 +1,31 @@): a GitHub Actions workflow named "Upload Python Package".

# This workflow will upload a Python Package using Twine when a release is created
# For more information see: https://help.github.com/en/actions/language-and-framework-guides/using-python-with-github-actions#publishing-to-package-registries

name: Upload Python Package

on:
  release:
    types: [created]

jobs:
  deploy:

    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.x'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install setuptools wheel twine
      - name: Build and publish
        env:
          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
        run: |
          python setup.py sdist bdist_wheel
          twine upload dist/*
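The publish step builds an sdist and a wheel into dist/ and hands everything there to twine. Before tagging a release, the artifacts can be inspected locally with a few lines of Python (a hedged sketch; the actual artifact names depend on setup.py):

from pathlib import Path

dist = Path("dist")
if dist.exists():
    # expect one .tar.gz (sdist) and one .whl (wheel) per version
    for artifact in sorted(dist.iterdir()):
        print(artifact.name)
else:
    print("no dist/ directory; run the build step first")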
Changed file (@@ -1,4 +1,26 @@), evidently all/__init__.py: the one-line export list __all__ = ['nn', 'State', 'StateArray'] is replaced by imports of every subpackage and an expanded __all__:

import all.agents
import all.approximation
import all.core
import all.environments
import all.logging
import all.memory
import all.nn
import all.optim
import all.policies
import all.presets
from all.core import State, StateArray

__all__ = [
    'agents',
    'approximation',
    'core',
    'environments',
    'logging',
    'memory',
    'nn',
    'optim',
    'policies',
    'presets',
    'State',
    'StateArray'
]
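With this change, the major subpackages and the core State types are importable straight from the top level. A minimal usage sketch, assuming the package is installed; the dict-based State constructor is an assumption about all.core, not shown in this diff:

import torch
import all
from all import State  # re-exported from all.core above

print(all.__all__)  # the ten subpackages plus State and StateArray
state = State({'observation': torch.zeros(4)})  # assumed constructor form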
Changed file (@@ -1,29 +1,50 @@), evidently all/agents/__init__.py: the old single-name imports (from .a2c import A2C through from .vsarsa import VSarsa) are replaced so that each algorithm also exports its TestAgent variant, and the new Multiagent, ParallelAgent, and IndependentMultiagent are exposed:

from ._agent import Agent
from ._multiagent import Multiagent
from ._parallel_agent import ParallelAgent
from .a2c import A2C, A2CTestAgent
from .c51 import C51, C51TestAgent
from .ddpg import DDPG, DDPGTestAgent
from .ddqn import DDQN, DDQNTestAgent
from .dqn import DQN, DQNTestAgent
from .independent import IndependentMultiagent
from .ppo import PPO, PPOTestAgent
from .rainbow import Rainbow, RainbowTestAgent
from .sac import SAC, SACTestAgent
from .vac import VAC, VACTestAgent
from .vpg import VPG, VPGTestAgent
from .vqn import VQN, VQNTestAgent
from .vsarsa import VSarsa, VSarsaTestAgent


__all__ = [
    # Agent interfaces
    "Agent",
    "Multiagent",
    "ParallelAgent",
    # Agent implementations
    "A2C",
    "A2CTestAgent",
    "C51",
    "C51TestAgent",
    "DDPG",
    "DDPGTestAgent",
    "DDQN",
    "DDQNTestAgent",
    "DQN",
    "DQNTestAgent",
    "PPO",
    "PPOTestAgent",
    "Rainbow",
    "RainbowTestAgent",
    "SAC",
    "SACTestAgent",
    "VAC",
    "VACTestAgent",
    "VPG",
    "VPGTestAgent",
    "VQN",
    "VQNTestAgent",
    "VSarsa",
    "VSarsaTestAgent",
    "IndependentMultiagent",
]
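Each algorithm now exports a paired TestAgent, and the three interfaces (Agent, Multiagent, ParallelAgent) sit alongside the implementations. A minimal sketch of filling in the base Agent interface; the constructor and the uniform-random policy are invented for illustration:

import torch
from all.agents import Agent

class RandomDiscreteAgent(Agent):
    def __init__(self, n_actions):
        self.n_actions = n_actions

    def act(self, state):
        # a real agent would also update its value function and/or policy here
        return torch.randint(self.n_actions, (1,))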
New file (@@ -0,0 +1,34 @@): all/agents/_multiagent.py, the module imported by the agents __init__ above.

from abc import ABC, abstractmethod
from all.optim import Schedulable


class Multiagent(ABC, Schedulable):
    """
    A multiagent RL agent. Differs from standard agents in that it accepts a multiagent state.

    In reinforcement learning, an Agent learns by interacting with an Environment.
    Usually, an agent tries to maximize a reward signal.
    It does this by observing environment "states", taking "actions", receiving "rewards",
    and learning which state-action pairs correlate with high rewards.
    An Agent implementation should encapsulate some particular reinforcement learning algorithm.
    """

    @abstractmethod
    def act(self, multiagent_state):
        """
        Select an action for the current timestep and update internal parameters.

        In general, a reinforcement learning agent does several things during a timestep:
        1. Choose an action
        2. Compute the TD error from the previous timestep
        3. Update the value function and/or policy

        The order of these steps differs depending on the agent.
        This method allows the agent to do whatever is necessary for itself on a given timestep.
        However, the agent must ultimately return an action.

        Args:
            multiagent_state (all.core.MultiagentState): The environment state at the current timestep.

        Returns:
            torch.Tensor: The action for the current agent to take at the current timestep.
        """
New file (@@ -0,0 +1,36 @@): all/agents/_parallel_agent.py, likewise imported by the agents __init__ above.

from abc import ABC, abstractmethod
from all.optim import Schedulable


class ParallelAgent(ABC, Schedulable):
    """
    A reinforcement learning agent that chooses actions for multiple states simultaneously.
    Differs from a single-environment Agent in that it accepts a StateArray instead of a State,
    so that it can process input from multiple environments in parallel.

    In reinforcement learning, an Agent learns by interacting with an Environment.
    Usually, an Agent tries to maximize a reward signal.
    It does this by observing environment "states", taking "actions", receiving "rewards",
    and learning which state-action pairs correlate with high rewards.
    An Agent implementation should encapsulate some particular reinforcement learning algorithm.
    """

    @abstractmethod
    def act(self, state_array):
        """
        Select an action for the current timestep and update internal parameters.

        In general, a reinforcement learning agent does several things during a timestep:
        1. Choose an action
        2. Compute the TD error from the previous timestep
        3. Update the value function and/or policy

        The order of these steps differs depending on the agent.
        This method allows the agent to do whatever is necessary for itself on a given timestep.
        However, the agent must ultimately return an action.

        Args:
            state_array (all.core.StateArray): An array of states, one for each parallel environment.

        Returns:
            torch.Tensor: The actions to take, one for each parallel environment.
        """