This repository implements prospective learning and reinforcement learning algorithms for a 1D non-stationary foraging task from the L4DC 2026 paper *Optimal control of the future via prospective learning with control*.
This project uses uv for fast, reproducible dependency management.
- Clone and enter the repository:

  ```shell
  git clone https://github.com/ashwindesilva/procontrol.git
  cd procontrol
  ```

- Sync the environment and install the package:

  ```shell
  uv sync
  uv pip install -e .
  ```
The project uses a Makefile to standardize experiment execution. Each command runs the specified agent with optimized defaults.
Run these commands to execute both standard and time-aware versions of the baselines:
- FQI: `make run-fqi`
- SAC: `make run-sac`
- PPO: `make run-ppo`
PL+C experiments are split by regressor type:
- Neural Network (MLP): `make run-plc-nn`, which runs with `--eval_period 100` and `--terminal_time 3000` by default.
- Random Forest: `make run-plc-rf`, which runs with `--eval_period 50` and `--terminal_time 1000` by default.
Results are stored in the `data/` directory as `.joblib` files.

- Visualization: use `notebooks/figures.ipynb` to generate plots.
- Animation: visualize the agent at work.
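Saved results can be read back with `joblib`. The snippet below is a minimal sketch: the filename and the dictionary keys are hypothetical, since the actual schema of the `.joblib` files depends on the agent and run settings.

```python
import os
import tempfile

import joblib

# Illustrative results object; "agent" and "rewards" are hypothetical
# keys, not the repository's actual schema.
results = {"agent": "plc-nn", "rewards": [1.0, 0.5, 0.75]}

# Round-trip through a temporary file to mimic loading a file from data/.
path = os.path.join(tempfile.mkdtemp(), "demo.joblib")
joblib.dump(results, path)

loaded = joblib.load(path)
print(loaded["agent"])  # → plc-nn
```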
If you need to override default replicates or parameters, you can pass them via the ARGS variable:
```shell
make run-plc-nn ARGS="--num_reps 50 --gamma 0.95"
```

If you find this work useful, please consider citing:
```bibtex
@article{bai2025optimal,
  title={Optimal control of the future via prospective learning with control},
  author={Bai, Yuxin and Acharyya, Aranyak and De Silva, Ashwin and Shen, Zeyu and Hassett, James and Vogelstein, Joshua T},
  journal={arXiv preprint arXiv:2511.08717},
  year={2025}
}
```
