BricksRL


BricksRL allows the training of custom LEGO robots using deep reinforcement learning. By integrating PyBricks and TorchRL, it facilitates efficient real-world training via Bluetooth communication between LEGO hubs and a local computing device. Check out our paper!

For additional information and building instructions for the robots, see the BricksRL project page.

Prerequisites


Enable Web Bluetooth in Chrome

  1. Go to "chrome://flags/".
  2. Enable "Experimental Web Platform features".
  3. Restart Chrome.
  4. Use beta.pybricks.com to edit and upload the client scripts for each environment.

Environment Setup

  1. Create a Conda environment:
    conda create --name bricksrl python=3.8
  2. Activate the environment:
    conda activate bricksrl
  3. Install PyTorch:
    pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
    
  4. Install additional packages:
    pip install -r requirements.txt
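Once the steps above have finished, a quick check along the following lines can confirm that the main dependencies are importable from the active environment. This is a minimal sketch: `check_packages` is a hypothetical helper, not part of BricksRL, and the module list assumes that `torchrl` is among the packages installed from requirements.txt.

```python
import importlib.util


def check_packages(names):
    """Return a dict mapping each top-level module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # The module names below are assumptions based on the install steps,
    # not a list taken from requirements.txt.
    modules = ["torch", "torchvision", "torchaudio", "torchrl"]
    for name, found in check_packages(modules).items():
        print(f"{name}: {'ok' if found else 'missing'}")
```

Any module reported as missing usually means the commands were run outside the `bricksrl` Conda environment.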

Usage

Client

Update your client script on the PyBricks Hub whenever you want to run a new environment with your robot.

Repo Structure

project_root/
│
├── configs/                    # Centralized configuration directory
│   ├── config.yaml             # Base config
│   ├── env/                    # Environment and task specific configs
│   │   ├── runaway-v0.yaml
│   │   ├── spinning_v0.yaml
│   │   ├── walker-v0.yaml
│   │   ├── walker_sim-v0.yaml
│   │   ├── roboarm-v0.yaml
│   │   ├── roboarm_sim-v0.yaml
│   │   └── roboarm_mixed-v0.yaml
│   └── agent/                  # Agent specific configs
│       ├── sac.yaml
│       ├── td3.yaml
│       └── droq.yaml
│
├── experiments/                # Experiments directory
│   ├── 2wheeler/               # 2wheeler robot specific experiments
│   │   ├── train.py
│   │   └── eval.py
│   ├── walker/                 # Walker robot specific experiments
│   │   ├── train.py
│   │   └── eval.py
│   └── roboarm/                # Roboarm specific experiments
│       ├── train.py
│       └── eval.py
│
├── environments/               # Environments directory
│   ├── __init__.py
│   ├── base/                   # Base environment class
│   │   ├── base_env.py
│   │   └── PybricksHubClass.py # For Async-Communication with the robot
│   ├── runaway_v0.py           # Environment for the 2wheeler robot
│   │   ├── client.py
│   │   └── Env.py
│   ├── walker_v0.py            # Environment for the walker
│   │   ├── client.py
│   │   └── Env.py
│   └── ...
│
├── src/                        # Source code for common utilities, robot models, etc.
│   ├── __init__.py
│   ├── utils/
│   ├── agents/
│   │   ├── sac.py
│   │   └── td3.py
│   └── networks/
│       └── ...
│
└── tests/                      # Unit tests and integration tests
    └── ...

Config

Before running experiments, review and adjust the configuration settings to suit your setup. Each environment and agent has its own configuration file under the configs/ directory. For more information, check out the config README.
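To see which environment and agent configurations are available before editing them, the configs/ directory can be listed with a few lines of Python. This is a minimal sketch that only assumes the `configs/env/` and `configs/agent/` layout shown in the repo structure; `list_configs` is a hypothetical helper and does not reflect how BricksRL itself loads its configs.

```python
from pathlib import Path


def list_configs(config_root="configs"):
    """Map each config group ('env', 'agent') to the sorted names of its YAML files."""
    root = Path(config_root)
    return {
        group: sorted(p.stem for p in (root / group).glob("*.yaml"))
        for group in ("env", "agent")
    }


if __name__ == "__main__":
    # Run from the project root so that the relative "configs" path resolves.
    for group, names in list_configs().items():
        print(f"{group}: {', '.join(names) or '(none found)'}")
```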

Robots

These are the robots used in our experiments. Building instructions can be found here.

[Images: 2Wheeler, Walker, RoboArm]

Run Experiments

Train an Agent

python experiments/walker/train.py

Evaluate an Agent

python experiments/walker/eval.py

Results


Evaluation videos of the trained agents can be found here.

2Wheeler Results:

[Figure: 2Wheeler results]

Walker Results:

[Figure: Walker results]

RoboArm Results:

[Figures: RoboArm results, RoboArm Mixed results]

Offline RL

Using precollected offline datasets, we can pretrain agents with offline RL to perform a task without any real-world interaction. Such pretrained policies can be evaluated directly or fine-tuned later on the real robot.

Datasets

The datasets can be downloaded from Hugging Face and contain expert and random transitions for the 2Wheeler (RunAway-v0 and Spinning-v0), Walker (Walker-v0), and RoboArm (RoboArm-v0) robots.

   git lfs install
   git clone git@hf.co:datasets/compsciencelab/BricksRL-Datasets

The datasets consist of TensorDicts containing expert and random transitions, which can be directly loaded into the replay buffer. When initiating (pre-)training, simply provide the path to the desired TensorDict when prompted to load the replay buffer.

Pretrain an Agent

Offline training is run in the same way as online training, except that you run the pretrain.py script:

python experiments/walker/pretrain.py

Trained policies can then be evaluated as before with:

python experiments/walker/eval.py

Or run training for fine-tuning the policy on the real robot:

python experiments/walker/train.py
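The pretrain, evaluate, and fine-tune sequence can also be chained from a small driver script. The following is a minimal sketch: `build_pipeline` and `run_pipeline` are hypothetical helpers, not part of BricksRL, and the script paths follow the walker commands shown above. Whether a pretrain.py exists for the other robots is an assumption. By default the sketch only prints the commands; pass `dry_run=False` to execute them.

```python
import subprocess
import sys


def build_pipeline(robot="walker", finetune=False):
    """Return the commands for pretrain -> eval, plus train if fine-tuning is requested."""
    stages = ["pretrain", "eval"] + (["train"] if finetune else [])
    return [[sys.executable, f"experiments/{robot}/{stage}.py"] for stage in stages]


def run_pipeline(commands, dry_run=True):
    """Print each command, and run it unless dry_run is set; stop at the first failure."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # stdin is inherited, so interactive prompts (such as the replay
            # buffer path) still reach the terminal.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pipeline(build_pipeline("walker", finetune=True))
```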