Releases: sail-sg/oat

v0.2.4

23 Dec 05:15
1b52eed

What's Changed

  • chore: minor updates on logging and resource allocation by @lkevinzc in #73

Full Changelog: v0.2.3...v0.2.4

v0.2.3

31 Oct 01:08
c1a074c

What's Changed

  • chore: update lora and add metrics by @lkevinzc in #66
  • Fix incorrect state indexing in PPOMultiTurnLearner critic training by @MozerWang in #67
  • fix micro batch training issue in DPO training by @hmhuy0 in #68
  • feat: add fp16 training by @lkevinzc in #70

Full Changelog: v0.2.2...v0.2.3

v0.2.2

02 Oct 02:43
e1164ac

What's Changed

  • feat: support turn-level ppo for general agentic rl by @lkevinzc in #63
  • feat: support LoRA RL training by @lkevinzc in #64

Full Changelog: v0.2.1...v0.2.2

v0.2.1

24 Aug 06:25
f9adda7

What's Changed

  • fix: use semantic version comparison for vLLM compatibility with 0.10.0+ by @simonucl in #60
  • chore: updates for online preference learning by @lkevinzc in #61
  • fix: truncated importance sampling to handle precision mismatch by @lkevinzc in #62
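The version-comparison fix in #60 addresses a common pitfall: comparing version strings lexicographically places "0.10.0" before "0.9.0". The sketch below illustrates the general idea only; it is not the code from that PR, and the helper names `parse_version` and `at_least` are hypothetical. It also ignores pre-release ordering, which a full version parser would handle.

```python
import re


def parse_version(version: str) -> tuple:
    """Return the leading numeric components of a version string as integers.

    Hypothetical helper for illustration; stops at the first non-numeric piece.
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # e.g. a "post1" or "dev0" suffix
        parts.append(int(match.group()))
    return tuple(parts)


def at_least(installed: str, minimum: str) -> bool:
    """Compare versions numerically, component by component."""
    return parse_version(installed) >= parse_version(minimum)


# Plain string comparison gets this wrong because "1" sorts before "9".
print("0.10.0" >= "0.9.0")          # False
print(at_least("0.10.0", "0.9.0"))  # True
```

Comparing integer tuples keeps the example dependency-free; in a real project a dedicated version parser is usually the safer choice.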

Full Changelog: v0.2.0...v0.2.1

v0.2.0

24 Jul 15:00

What's Changed

  • Fix tensor slicing in SFTLearner when batch_size=1 by @longxudou in #57
  • feat: refactor SFT to support multi turn chat data by @lkevinzc in #59

Full Changelog: v0.1.4...v0.2.0

v0.1.4

09 Jul 02:44
e6fa2ec

What's Changed

Full Changelog: v0.1.2...v0.1.4

v0.1.3.post2

28 Jun 12:18

What's Changed

Full Changelog: v0.1.2...v0.1.3.post2

v0.1.2

06 May 08:17
52ceaa7

What's Changed

Full Changelog: v0.1.0...v0.1.2

v0.1.0

18 Apr 03:34
43532b3

What's Changed

Full Changelog: v0.0.9...v0.1.0

v0.0.9

21 Mar 09:42
59eb01b

What's Changed

Full Changelog: v0.0.6...v0.0.9