Releases: HumanCompatibleAI/imitation
v1.0.1
Fix bug with tensors being on the wrong device when there is more than one device available (#831).
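To illustrate the kind of issue addressed in #831, here is a minimal sketch of a conversion helper that always honours the requested device. The function name `to_tensor_on_device` and its signature are assumptions made for illustration; this is not the library's `safe_to_tensor` implementation.

```python
from typing import Union

import numpy as np
import torch


def to_tensor_on_device(
    array: Union[np.ndarray, torch.Tensor],
    device: Union[str, torch.device] = "cpu",
) -> torch.Tensor:
    """Hypothetical helper: return `array` as a tensor that lives on `device`."""
    if isinstance(array, torch.Tensor):
        # The easy-to-miss case: an existing tensor must still be moved,
        # otherwise it stays on whatever device it was created on.
        return array.to(device)
    # NumPy input: create the tensor directly on the requested device.
    return torch.as_tensor(array, device=device)


# Use a second GPU only if one exists; otherwise fall back to the CPU.
target = torch.device("cuda:1" if torch.cuda.device_count() > 1 else "cpu")
moved = to_tensor_on_device(torch.zeros(3), device=target)
converted = to_tensor_on_device(np.ones(3, dtype=np.float32), device=target)
```

On a single-device machine both branches behave identically, which is presumably why a bug of this kind only shows up when more than one device is available.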
What's Changed
- Update the README files by @ernestum in #817
- Add more benchmarking documentation by @ernestum in #822
- Clarify in README.md that we switched to gymnasium. by @ernestum in #824
- Switch to lualatex to generate the documentation PDF by @ernestum in #826
- Fix documentation pipeline by @ernestum in #827
- Fix warning in quickstart.py by @ernestum in #823
- Fix coverage issue in BC tests by @ernestum in #830
- Remove FloatReward by @ernestum in #829
- Ensure safe_to_tensor moves tensors to the specified device by @ernestum in #831
Full Changelog: v1.0.0...v1.0.1
v1.0.0 -- first stable release
We're pleased to announce the first stable release of imitation. Key improvements include:
- Compatibility with Gymnasium, which has superseded Gym
- Tuned hyperparameters and benchmark results for common algorithm-environment pairs (see release artifact attached).
- New algorithm (beta): SQIL
For more information, see the changelog below.
What's Changed
- Updated Installation Instructions by @ernestum in #760
- Download experts from hf inside tutorials and docs by @jas-ho in #766
- Implementation of the SQIL algorithm by @RedTachyon in #744
- Additional examples of CLI usage by @EdoardoPona in #761
- Dependency fixes by @ernestum in #775
- Tune hyperparameters for kernel density estimation tutorial by @michalzajac-ml in #774
- Tune hyperparameters in tutorials for GAIL and AIRL by @michalzajac-ml in #772
- Introduce interactive policies to gather data from a user by @michalzajac-ml in #776
- Add an option to run SQIL with various off-policy algorithms by @michalzajac-ml in #778
- Complete PR #771 (Tune preference comparison example hyperparameters) by @lukasberglund in #782
- Add CLI for SQIL by @lukasberglund in #784
- Gymnasium Compatibility by @ernestum in #735
- Ensure MyST-NB raises an error when rendering a notebook fails. by @ernestum in #803
- Add a test timeout by @ernestum in #779
- Fix MacOS Pipeline: Include tests not in subdirectories by @AdamGleave in #797
- Remove MuJoCo dependency from SQIL notebook by @AdamGleave in #800
- Add partial support for dictionary observation spaces (bc, density) by @NixGD in #785
- Update gymnasium dependency and render_mode in gym.make by @taufeeque9 in #806
- Upgrade pytype by @ZiyueWang25 in #801
- Reduce training time and improve expert loading code in the tutorials by @ernestum in #810
- Add scripts and configs for hyperparameter tuning by @taufeeque9 in #675
- SQIL and PC performance check fixes by @ernestum in #811
- Running benchmarks by @ernestum in #812
New Contributors
- @jas-ho made their first contribution in #766
- @EdoardoPona made their first contribution in #761
- @michalzajac-ml made their first contribution in #774
- @lukasberglund made their first contribution in #782
- @NixGD made their first contribution in #785
- @ZiyueWang25 made their first contribution in #801
Full Changelog: v0.4.0...v1.0.0
v0.4.0
What's Changed
- Continuous integration: add support for macOS; remove dependency on MuJoCo.
- Preference comparison: improved logging, support for active learning based on variance of ensemble.
- HuggingFace integration for model and dataset loading.
- Benchmarking: add results and example configs.
- Documentation: add notebook tutorials; other general improvements.
- General changes: migrate to pathlib; add more type hints to enable mypy as well as pytype.
Full Changelog: v0.3.1...v0.4.0
v0.3.1
What's Changed
Main changes:
- Added reward ensembles and conservative reward functions by @levmckinney in #460
- Dropping support for python 3.7 by @levmckinney in #505
Minor changes:
- Docstring and other fixes after #472 by @Rocamonde in #497
- Improve Windows CI by @AdamGleave in #495
Full Changelog: v0.3.0...v0.3.1
Major improvements
New features:
- New algorithm: Deep RL from Human Preferences (thanks to @ejnnr @norabelrose et al)
- Notebooks with examples (thanks to @ernestum)
- Serialized trajectories using NumPy arrays rather than pickles, ensuring stability across versions and saving space on disk (thanks to @norabelrose)
- Weights and Biases logging support (thanks to @yawen-d)
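As a rough illustration of why plain NumPy archives are more stable across versions than pickles, the sketch below saves a toy trajectory to an `.npz` file and reloads it with pickling disabled. The field names `obs`, `acts`, and `rews` and the file layout are assumptions for this example and are not necessarily the on-disk format imitation uses.

```python
import tempfile
from pathlib import Path

import numpy as np

# Hypothetical toy trajectory: 3 observations, 2 actions, 2 rewards.
obs = np.arange(6, dtype=np.float32).reshape(3, 2)
acts = np.array([0, 1], dtype=np.int64)
rews = np.array([1.0, 0.5], dtype=np.float32)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "trajectory.npz"
    # Compressed archive of plain arrays; no pickled Python objects inside.
    np.savez_compressed(path, obs=obs, acts=acts, rews=rews)

    # allow_pickle=False makes loading fail loudly if an array would need pickle.
    with np.load(path, allow_pickle=False) as data:
        loaded = {key: data[key] for key in data.files}
```

Because the archive holds only typed arrays, loading it does not depend on the Python class definitions that were present when the data was written.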
Improvements:
- Port MCE IRL from JAX to Torch, eliminating the JAX dependency. (thanks to @qxcv)
- Refactor RewardNet code to be independent from AIRL, and shared across algorithms. (thanks to @ejnnr)
- Add Windows support including continuous integration. (thanks to @taufeeque9)
First PyTorch release
compute_train_stats: Fix logits passed in as proba (#273). This led to an error during training.
v0.1.1 Final TF1 release
Initial release
Prototype versions of AIRL, GAIL, BC, and DAgger.