v1.90.0

@pelikhan released this 10 Jan 17:40
· 145 commits to main since this release

New Features and Enhancements

  • Added CLI commands for video processing:
    • Extract audio from video files.
    • Extract video frames with options for count, size, and output folder.
  • Integrated Hugging Face's pipeline API for running transformer models.
  • Introduced transcription support via OpenAI's Whisper API, enabling audio-to-text conversion with caching and SRT/VTT format generation.
  • Enhanced data slicing in defData to support object field sampling and filtering.
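
To illustrate the SRT/VTT generation mentioned above, here is a minimal sketch of turning timed transcript segments into both subtitle formats. The `Segment` shape and the `toSrt`/`toVtt` helper names are assumptions made for this example and are not taken from the release; only the SRT and WebVTT layouts themselves are standard.

```typescript
// Hypothetical segment shape; the actual transcription result may differ.
interface Segment {
  start: number; // start time in seconds
  end: number; // end time in seconds
  text: string;
}

// Left-pad a non-negative integer with zeros to the given width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Format seconds as HH:MM:SS<sep>mmm.
// SRT uses "," before the milliseconds, WebVTT uses ".".
function formatTimestamp(seconds: number, sep: "," | "."): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return pad(h, 2) + ":" + pad(m, 2) + ":" + pad(s, 2) + sep + pad(ms, 3);
}

// SRT: numbered cues separated by a blank line.
function toSrt(segments: Segment[]): string {
  return segments
    .map(
      (seg, i) =>
        `${i + 1}\n` +
        `${formatTimestamp(seg.start, ",")} --> ${formatTimestamp(seg.end, ",")}\n` +
        `${seg.text.trim()}\n`
    )
    .join("\n");
}

// WebVTT: a "WEBVTT" header line, then unnumbered cues.
function toVtt(segments: Segment[]): string {
  const cues = segments.map(
    (seg) =>
      `${formatTimestamp(seg.start, ".")} --> ${formatTimestamp(seg.end, ".")}\n` +
      `${seg.text.trim()}\n`
  );
  return "WEBVTT\n\n" + cues.join("\n");
}
```

The two formats differ mainly in the millisecond separator, the cue numbering, and the `WEBVTT` header, so a single timestamp helper can serve both.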

Performance Improvements

  • Optimized hashing with streaming file support and salt integration.
  • Improved concurrency handling for video frame extraction and audio transcoding using FFmpeg.
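
As a rough illustration of streaming hashing with a salt, the sketch below feeds content into a SHA-256 hash chunk by chunk using Node's `crypto` module, so a large file never has to be held in memory at once. The function names, the choice of SHA-256, and the way the salt is mixed in (as a prefix) are assumptions for this example; the release does not describe the actual implementation.

```typescript
import { createHash } from "crypto";
import { createReadStream } from "fs";

// Hash a sequence of chunks incrementally.
// Assumption: the salt is fed into the hash before any content.
function hashChunks(chunks: Iterable<string | Uint8Array>, salt: string): string {
  const hash = createHash("sha256");
  hash.update(salt);
  for (const chunk of chunks) {
    hash.update(chunk);
  }
  return hash.digest("hex");
}

// Same idea for a file on disk: the read stream yields chunks
// as they are read, and each one updates the running hash.
async function hashFile(path: string, salt: string): Promise<string> {
  const hash = createHash("sha256");
  hash.update(salt);
  for await (const chunk of createReadStream(path)) {
    hash.update(chunk);
  }
  return hash.digest("hex");
}
```

Because the hash is updated incrementally, the digest depends only on the concatenated bytes and the salt, not on how the content happens to be split into chunks.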

🛠️ Bug Fixes and Stability

  • Fixed edge cases in file handling for workspace paths.
  • Resolved hashing inconsistencies for various data types like buffers and blobs.
  • Enhanced error handling and logging for video processing and transcription workflows.

🎥 Video and Audio Processing

  • Integrated FFmpeg for video and audio tasks.
  • Added caching for video probes, audio extraction, and frame generation.
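
For readers unfamiliar with FFmpeg, the following sketch shows how argument lists for audio extraction and frame extraction might be assembled. The option names (`count`, `width`, `outDir`), the helper names, and the output file pattern are hypothetical and chosen for this example only; the FFmpeg flags (`-i`, `-vn`, `-vf scale=`, `-frames:v`, `-y`) are standard, but the release does not state which flags the tool actually passes.

```typescript
// Hypothetical options for this sketch only.
interface FrameOptions {
  count?: number; // maximum number of frames to write
  width?: number; // output width in pixels; height follows the aspect ratio
  outDir: string; // folder that receives the numbered images
}

// Arguments for extracting the audio track of a video.
function extractAudioArgs(input: string, output: string): string[] {
  // -y overwrites an existing output file, -vn drops the video stream
  return ["-y", "-i", input, "-vn", output];
}

// Arguments for writing frames as numbered PNG files.
function extractFramesArgs(input: string, opts: FrameOptions): string[] {
  const args = ["-y", "-i", input];
  if (opts.width !== undefined) {
    // scale=W:-1 resizes to width W and keeps the aspect ratio
    args.push("-vf", `scale=${opts.width}:-1`);
  }
  if (opts.count !== undefined) {
    // -frames:v N stops after the first N video frames
    args.push("-frames:v", String(opts.count));
  }
  args.push(`${opts.outDir}/frame_%04d.png`);
  return args;
}
```

The resulting arrays could be passed to a process launcher such as Node's `child_process.spawn("ffmpeg", args)`. Building the arguments as an array rather than a single command string avoids shell-quoting problems with file names that contain spaces.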

💡 Developer Experience

  • Simplified runtime configuration for transcription and video utilities.
  • Improved CLI usability with detailed command descriptions and argument validation.