Releases: markub3327/rl-toolkit

RL Toolkit v5.0.0

11 Jan 08:06

RL Toolkit v4.1.1

02 Sep 14:31
862c317

Release v4.1.1

Changelog

  • update default config.yaml

RL Toolkit v4.1.0

09 Feb 03:40

Release v4.1.0

Changelog

Features 🔊

  • .fit()
  • AgentCallback

RL Toolkit v4.0.0

05 Feb 17:40
668f128

Release v4.0.0

Changelog

Features 🔊

  • Render environments to Weights & Biases (W&B)
  • Grouping of runs in Weights & Biases (W&B)
  • SampleToInsertRatio rate limiter
  • Global Gradient Clipping to avoid exploding gradients
  • Softplus for numerical stability
  • YAML configuration file
  • LogCosh instead of Huber loss
  • Critic network with Add layer applied on state & action branches
  • Custom uniform initializer
  • XLA (Accelerated Linear Algebra) compiler
  • Optimized Replay Buffer (google-deepmind/reverb#90)
  • Split into Agent, Learner, Tester and Server
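
Two of the items above, the LogCosh loss and global gradient clipping, can be sketched in plain Python. This is only an illustration of the general techniques, not the toolkit's own code: the function names `log_cosh`, `huber` and `clip_by_global_norm`, the `delta` parameter and the list-of-lists gradient layout are assumptions made for this example.

```python
import math


def log_cosh(error):
    """log(cosh(error)), written so that large errors do not overflow.

    Uses the identity log(cosh(x)) = |x| + log(1 + exp(-2|x|)) - log(2).
    """
    a = abs(error)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)


def huber(error, delta=1.0):
    """Huber loss, shown only for comparison (delta is an assumed default)."""
    a = abs(error)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def clip_by_global_norm(grads, clip_norm):
    """Rescale all gradients together so their joint L2 norm is at most clip_norm.

    grads is a list of per-variable gradients, each a list of floats
    (a hypothetical layout chosen for this sketch).
    Returns the clipped gradients and the norm measured before clipping.
    """
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    if global_norm <= clip_norm:
        # Already within the limit (this also covers an all-zero gradient).
        return [list(grad) for grad in grads], global_norm
    scale = clip_norm / global_norm
    return [[g * scale for g in grad] for grad in grads], global_norm


if __name__ == "__main__":
    for e in (0.1, 1.0, 10.0):
        print(e, log_cosh(e), huber(e))
    print(clip_by_global_norm([[3.0], [4.0]], clip_norm=1.0))
```

Both losses are roughly quadratic for small errors and linear for large ones; LogCosh is smooth everywhere, while Huber switches form at `delta`. Clipping by the global norm scales every gradient by the same factor, so the update direction is preserved.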

Bug fixes 🛠️

  • Fixed creation of the save path for models
  • Fixed model's summary()

RL Toolkit v3.2.5

03 Aug 03:55
a5795cc

Release v3.2.5

Changelog

  • Fix out-of-memory error

RL Toolkit v3.2.4

07 Jul 21:06
11f4c7d

Release v3.2.4

Changelog

  • Reverb
  • setup.py (package is available on PyPI)
  • Split into agent, learner and tester roles
  • Use custom model and layer for defining Actor-Critic
  • MultiCritic - concatenating multiple critic networks into one network
  • Truncated Quantile Critics
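
The truncation step behind Truncated Quantile Critics can be sketched as follows: the quantile estimates (atoms) of all critics are pooled, sorted, and the largest ones are dropped before the target is formed, which counteracts overestimation. This is a minimal sketch of the general idea under stated assumptions, not the toolkit's implementation; the function name `truncated_atoms` and the parameter `drop_per_critic` are hypothetical.

```python
def truncated_atoms(critic_quantiles, drop_per_critic):
    """Pool the atoms of all critics, sort ascending, drop the largest ones.

    critic_quantiles: list with one list of quantile estimates per critic.
    drop_per_critic:  number of atoms to drop per critic (hypothetical name);
                      the total number dropped is drop_per_critic * n_critics.
    """
    pooled = sorted(q for critic in critic_quantiles for q in critic)
    keep = len(pooled) - drop_per_critic * len(critic_quantiles)
    if keep <= 0:
        raise ValueError("truncation would remove every atom")
    return pooled[:keep]


if __name__ == "__main__":
    critics = [[1.0, 5.0, 9.0], [2.0, 4.0, 10.0]]
    kept = truncated_atoms(critics, drop_per_critic=1)
    print(kept)                   # atoms that remain after truncation
    print(sum(kept) / len(kept))  # their mean, a pessimistic value estimate
```

Dropping more atoms per critic makes the estimate more conservative; dropping none reduces it to a plain average over the pooled atoms.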

RL Toolkit v2.0.2

23 May 20:49
d1bd3f4

Release v2.0.2

Changelog

  • + update Dockerfile
  • + update README.md
  • + formatted code by Black & Flake8

RL-Toolkit v2.0.1

27 Apr 08:58
032d16c

Release v2.0.1

Changelog

  • fix Critic model

RL-Toolkit v2.0

22 Apr 19:58
5bbbed6
Pre-release

Release v2.0

Changelog

  • + Huber loss
  • + Rendering to a video file (test mode)
  • + Observations normalized by the min-max method
  • - TD3 support removed
  • ± Add layer is used instead of Concatenate layer (Critic network)
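
Min-max normalization of observations can be sketched like this. The changelog does not state the target range or where the bounds come from, so this example assumes a mapping to [0, 1] using known per-dimension lower and upper bounds; the function name `min_max_normalize` and its parameters are hypothetical.

```python
def min_max_normalize(obs, low, high):
    """Scale each observation component to [0, 1] using its known bounds.

    obs, low, high are equal-length lists of floats. A dimension whose
    bounds coincide is mapped to 0.0 to avoid division by zero
    (an assumption made for this sketch).
    """
    normalized = []
    for x, lo, hi in zip(obs, low, high):
        if hi == lo:
            normalized.append(0.0)
        else:
            normalized.append((x - lo) / (hi - lo))
    return normalized


if __name__ == "__main__":
    print(min_max_normalize([0.0, 5.0], low=[-1.0, 0.0], high=[1.0, 10.0]))
```

Scaling every dimension to the same range keeps components with large raw magnitudes from dominating the network inputs.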