What's New
Features
- **Pretraining Data Processing API** (#672)
  - Added a new API for processing pretraining-style datasets
  - Documents are now chunked by a configurable `block_size`
  - Chunks are treated as independent, fully-unmasked samples (see the sketch after this list)
  - Updated the training loop to ingest pretraining-style datasets
  - Includes comprehensive test coverage (`test_pretraining_data_process.py`, `test_pretraining_mode.py`, `test_pretraining_sampler.py`)
- **AdamW Optimizer Configuration** (#674)
  - Exposed the `weight_decay`, `betas`, and `eps` parameters in `TrainingArgs`
  - Users can now tune AdamW hyperparameters through the `run_training()` API (see the sketch after this list)
  - Provides more control over optimizer behavior
- **Granite 4 Model Support** (#669)
  - Added support for Granite 4 models as Mixture of Experts (MoE) models in training
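To make the chunking behavior concrete, here is a minimal sketch of the idea under stated assumptions: the function name `chunk_document`, the sample dict keys, and the handling of the trailing partial block are illustrative and are not taken from the library's actual `data_process.py`.

```python
# Minimal sketch of pretraining-style chunking (illustrative names, not the
# library's API): a tokenized document is split into fixed-size blocks, and
# every block becomes an independent, fully-unmasked sample (labels == inputs).
from typing import Dict, List


def chunk_document(token_ids: List[int], block_size: int) -> List[Dict[str, List[int]]]:
    """Split one tokenized document into independent pretraining samples."""
    samples = []
    for start in range(0, len(token_ids), block_size):
        block = token_ids[start : start + block_size]
        samples.append({
            "input_ids": block,
            # Fully unmasked: loss is computed on every token in the block.
            "labels": list(block),
        })
    return samples


if __name__ == "__main__":
    doc = list(range(10))  # stand-in for a tokenized document
    for sample in chunk_document(doc, block_size=4):
        print(sample)
```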
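For the optimizer change, a hedged sketch of what the three exposed knobs control. The `torch.optim.AdamW` call is standard PyTorch; how the values travel from `TrainingArgs` through `run_training()` to the optimizer, and any field names beyond the three listed above, are assumptions to verify against the library.

```python
# Sketch of the three AdamW hyperparameters exposed in TrainingArgs and how
# such values are typically consumed by torch.optim.AdamW. Everything other
# than the parameter names weight_decay / betas / eps is illustrative.
import torch

weight_decay = 0.0    # decoupled weight decay strength
betas = (0.9, 0.999)  # exponential decay rates for the moment estimates
eps = 1e-8            # numerical stability term in the denominator

model = torch.nn.Linear(16, 16)  # stand-in for the real model

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=2e-5,
    betas=betas,
    eps=eps,
    weight_decay=weight_decay,
)
print(optimizer.defaults)
```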
Bug Fixes
- **Process Timing Fix** (#675)
  - Fixed a race condition where a process had not completed by the time its result was read (see the sketch after this list)
- **Variable Access Fix** (#668)
  - Fixed a stray invalid variable access bug
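The process-timing fix addresses a common pattern: reading a child process's result before the process has produced it. The sketch below is a generic illustration using Python's `multiprocessing`, not the project's actual code; which process API the project uses is an assumption.

```python
# Generic illustration of the race class fixed in #675 (not the project's code):
# reading a worker's result without waiting can observe missing output; blocking
# on the result and joining the process removes the race.
import multiprocessing as mp


def worker(queue):
    queue.put("done")


if __name__ == "__main__":
    queue = mp.Queue()
    proc = mp.Process(target=worker, args=(queue,))
    proc.start()

    # Buggy pattern: queue.get_nowait() here can raise queue.Empty because the
    # child may not have finished producing its result yet.

    result = queue.get()  # blocks until the worker has actually put its result
    proc.join()           # then wait for the process itself to complete
    print(result)
```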
Dependencies
- **Build Dependency Update** (#670)
  - Updated the hynek build dependency
Files Changed
17 files changed with 1,642 insertions and 52 deletions:
- Core training modules: `data_process.py`, `main_ds.py`, `sampler.py`, `model.py`, `config.py`
- New test suites for pretraining functionality
- Updated README with new capabilities
Full Changelog
All Changes:
- 574f946 Exposes API for processing pretraining data (#672)
- 638a753 fixes bug where process isn't completed by the time the process gets read (#675)
- c495035 Expose AdamW optimizer parameters in training API (#674)
- 3d05302 Handle granite 4 as MoE models in training (#669)
- 781c36f fixes stray invalid variable access bug (#668)
- 529c2f7 bumps hynek build dep (#670)
Full Diff: v0.12.1...v0.13.0