Breaking Changes
- Transform System: Complete refactor to PyTorch named tensors for GPU acceleration. Transforms now use ('channel', 'time') or ('batch', 'channel', 'time') tensor conventions
- Dataset Architecture: Refactored datasets module with layered architecture (WindowedDataset, DataModule, DatasetCreator)
- Models Restructure: Flattened model hierarchy from 4 levels to 2 (models/raul_net/v16.py). Removed deprecated models (V1, V4, V9 variants, classification), removed online/offline distinction
- Datatypes Package: Split datatypes.py into organized package structure
- Storage Format: Enforced .zip format for datasets, removed GPU Direct Storage support
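As a rough illustration of the named-tensor convention mentioned above, the sketch below uses only plain PyTorch named-tensor calls. The signal shape and variable names are hypothetical, and it does not use any transform class from this project.

```python
import warnings

import torch

# Named tensors are a prototype PyTorch feature and emit a UserWarning on use.
warnings.filterwarnings("ignore", category=UserWarning)

# Hypothetical signal: 8 channels, 1000 samples, in ('channel', 'time') layout.
emg = torch.randn(8, 1000, names=("channel", "time"))

# Reduce over a dimension by name instead of by position.
abs_sum_per_channel = emg.abs().sum("time")

# Add a size-one 'batch' dimension to reach the batched convention.
batched = emg.align_to("batch", "channel", "time")
```

Addressing dimensions by name means the same reduction works whether or not a batch dimension is present.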
Added
- SlidingWindowTransform base class for consistent transform implementations
- MLflow experiment tracking and visualization (myoverse/tracking.py)
- RAM caching and multiprocessing support in ContinuousDataset
- Lazy dataloader initialization for faster startup
- Cache pre-loading in main process before spawning workers
- Nested dataset storage structure (split/modality/task)
- ZipStore for faster dataset I/O on Windows
- py.typed marker for PEP 561 compliance
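To make the nested split/modality/task layout concrete, here is a minimal sketch using only the Python standard library. The entry names and byte payloads are made up for illustration, and the project's actual storage backend and on-disk encoding are not shown in these notes.

```python
import io
import zipfile
from collections import defaultdict

# Hypothetical entries following a split/modality/task layout.
entries = {
    "training/emg/task_1": b"\x00\x01",
    "training/kinematics/task_1": b"\x02\x03",
    "testing/emg/task_1": b"\x04\x05",
}

buffer = io.BytesIO()  # in-memory stand-in for a dataset .zip file
with zipfile.ZipFile(buffer, mode="w") as archive:
    for path, payload in entries.items():
        archive.writestr(path, payload)

# Rebuild the nesting from the archive member names.
layout = defaultdict(lambda: defaultdict(list))
with zipfile.ZipFile(buffer) as archive:
    for name in archive.namelist():
        split, modality, task = name.split("/")
        layout[split][modality].append(task)
```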
Changed
- Lazy import of heavy dependencies in datasets module for faster imports
- Modernized codebase to Python 3.12+ type hints and modern super() calls
- Ran ruff format across entire codebase
- Removed graph visualization from _Data class
- Removed Rich auto-coloring of numbers and text for cleaner output
- Moved emg_xarray and emg_tensor to datatypes module
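The lazy-import item above follows a common Python pattern: defer an expensive import until the code path that needs it actually runs. The sketch below is a generic illustration, not the project's implementation. The helper names are hypothetical and the standard-library statistics module stands in for a heavy dependency.

```python
import importlib
from types import ModuleType


def load_dependency(name: str) -> ModuleType:
    """Import a module on first use; later calls hit the sys.modules cache."""
    return importlib.import_module(name)


def summarize(values: list[float]) -> float:
    # 'statistics' stands in for a heavy dependency. It is imported only when
    # this function is called, not when the enclosing module is imported.
    statistics = load_dependency("statistics")
    return statistics.mean(values)
```

Importing the enclosing module stays cheap, and the cost of the dependency is paid once, on the first call to summarize.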
Removed
- Unused utils module
- tests/workflow.py
- icecream dependency
- Deprecated model variants (V1, V4, V9, classification models)
- GPU Direct Storage (GDS) support
Fixed
- MedianFilter padding
- Tests updated to match simplified _Data API
Full Changelog: v1.1.6...v2.0.0