Skip to content

Conversation

@tbarbugli
Copy link
Member

@tbarbugli tbarbugli commented Dec 9, 2025

  1. Added PcmData.from_g711() for decoding G.711 (μ-law/A-law) to PCM
  2. Added PcmData.g711_bytes() for encoding PCM to G.711 using PyAV codecs
  3. Modified PcmData.resample() to use PyAV resampler when duration > 0.5s
  4. Added PcmData.to_av_frame() helper method
from getstream.video.rtc import PcmData, G711Mapping, G711Encoding

# Encode PCM to G.711
g711_bytes = pcm.g711_bytes(sample_rate=8000, channels=1, mapping=G711Mapping.MULAW)

# Decode G.711 to PCM (bytes)
decoded = PcmData.from_g711(g711_bytes, sample_rate=8000, channels=1, mapping=G711Mapping.MULAW)

# Decode G.711 to PCM (base64)
decoded = PcmData.from_g711("base64_encoded_audio", sample_rate=8000, encoding=G711Encoding.BASE64)

Summary by CodeRabbit

  • New Features

    • Added G.711 audio codec support (μ-law and A-law) for decoding from bytes/base64 and encoding to G.711.
    • Added enums/options for G.711 mapping and input encoding; exposed constants/tables for decoding.
    • Added AV interoperability to convert PCM to audio frames and a higher-quality resampling path for longer audio; improved multi-channel and flexible sample-rate handling.
  • Tests

    • Added comprehensive tests covering G.711 encode/decode, base64 handling, resampling, stereo, format preservation, and roundtrips.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Dec 9, 2025

Walkthrough

Adds G.711 (μ‑law / A‑law) support plus PyAV audio interop: new g711 module with decode tables/enums, PcmData G.711 encode/decode/AV-frame methods, PyAV resampling/encoding helpers, public exports, and expanded tests for roundtrips and resampling.

Changes

Cohort / File(s) Summary
Public API exports
getstream/video/rtc/__init__.py
Import and expose G711Encoding and G711Mapping on the package public API.
G.711 codec data
getstream/video/rtc/g711.py
New module adding G711Mapping and G711Encoding enums and precomputed MULAW_DECODE_TABLE / ALAW_DECODE_TABLE numpy arrays and related constants.
PcmData, resampling & AV interop, G.711 logic
getstream/video/rtc/track_util.py
Add G.711 enums/constants; implement PcmData.from_g711(...), PcmData.g711_bytes(...), PcmData.to_av_frame(); add PyAV helpers _resample_with_pyav(), _encode_g711_with_pyav(), _encode_frame_with_codec(); integrate G.711 decode/encode and resampling paths; add imports (base64, fractions, g711 tables).
Tests
tests/rtc/test_pcm_data.py
New and extended tests covering G.711 decoding/encoding (μ‑law / A‑law, raw/base64), multi‑channel/sample‑rate variants, roundtrips/integration, and PyAV resampling format preservation tests.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

  • Pay special attention to the correctness of MULAW_DECODE_TABLE / ALAW_DECODE_TABLE against the G.711 spec.
  • Verify PyAV usage: AudioFrame formats, resampler parameters, codec names, packet/frame handling, and buffer flushing.
  • Check sample‑rate and channel conversions, data layout (planar vs packed), and dtype preservation across resample paths.
  • Review base64/raw handling and input validation in PcmData.from_g711() and corresponding tests.

Poem

🐰
I nibble μ and A by night,
hop frames until the samples light.
PyAV hums a velvet tune,
bytes uncoil into PCM rune—
carrots cheer the codec flight! 🥕

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: adding G.711 audio support with from_g711() decoding, g711_bytes() encoding, and related enums/constants.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch g711

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (5)
tests/rtc/test_pcm_data.py (2)

1-6: Consider importing G711Mapping and G711Encoding for type safety.

The test file uses string literals like "alaw" and "base64" instead of the enum types. While this works due to the string-based API compatibility, importing and using the enums would provide better IDE support and catch typos at static analysis time.

 import numpy as np
 import pytest
 import av
 from fractions import Fraction
 
-from getstream.video.rtc.track_util import PcmData, AudioFormat, Resampler
+from getstream.video.rtc.track_util import PcmData, AudioFormat, Resampler, G711Mapping, G711Encoding

1847-1900: Integration test writes files to assets directory on every run.

The test creates g711_original.wav, g711_decoded_mulaw.wav, and g711_decoded_alaw.wav in tests/assets/ on every test execution. While the file sizes (~16KB each) comply with the 256KB limit per coding guidelines, consider:

  1. These files will accumulate if tests run repeatedly without cleanup
  2. If these files are meant to be committed as test assets for manual review, consider generating them once and committing them
  3. If they're temporary, use tmp_path fixture for cleanup
-def test_g711_integration():
+def test_g711_integration(tmp_path):
     """Integration test that generates test files for manual review."""
     import os
 
     # ... audio generation code ...
 
-    # Save files for manual review
-    assets_dir = os.path.join(os.path.dirname(__file__), "..", "assets")
-    os.makedirs(assets_dir, exist_ok=True)
+    # Save files for manual review (use tmp_path for automatic cleanup)
+    assets_dir = tmp_path

Alternatively, if these files are intentionally persisted for manual audio review, add a comment clarifying this is intentional.

getstream/video/rtc/track_util.py (3)

646-651: Unused encoding constants - consider removal.

MULAW_ENCODE_BIAS, MULAW_MAX, ALAW_ENCODE_BIAS, and ALAW_MAX are defined but not used. The encoding path uses PyAV codecs (pcm_mulaw/pcm_alaw) instead of table-based encoding. Consider removing these unused constants to avoid confusion.

-# G.711 encoding constants
-MULAW_ENCODE_BIAS = 33
-MULAW_MAX = 32635
-ALAW_ENCODE_BIAS = 33
-ALAW_MAX = 32635
-

1311-1345: to_av_frame always converts to int16, which is appropriate for G.711.

The method always converts to int16 via to_int16() before creating the PyAV frame. This is correct for G.711 encoding (which operates on 16-bit PCM), but the docstring could clarify this limitation since the method is public.

Consider adding to the docstring:

     def to_av_frame(self) -> "av.AudioFrame":
         """Convert PcmData to a PyAV AudioFrame.
+        
+        Note: This method converts audio to int16 format (s16p) for compatibility
+        with PyAV codecs. Float32 audio will be converted to int16.
 
         Returns:
             av.AudioFrame: A PyAV AudioFrame with the audio data

2-3: New standard library imports are appropriate.

The base64 and fractions imports are correctly added for G.711 base64 decoding and PyAV time_base handling.

Note: There's also an inline from fractions import Fraction at line 1089 in from_av_frame. Since fractions is now imported at module level, consider using the module-level import consistently:

         # Convert time_base from Fraction to float if present
         time_base = None
         if hasattr(frame, "time_base") and frame.time_base is not None:
-            from fractions import Fraction
-
             if isinstance(frame.time_base, Fraction):
+            if isinstance(frame.time_base, fractions.Fraction):
                 time_base = float(frame.time_base)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between eaee60d and f9fd991.

📒 Files selected for processing (3)
  • getstream/video/rtc/__init__.py (2 hunks)
  • getstream/video/rtc/track_util.py (5 hunks)
  • tests/rtc/test_pcm_data.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Plugins that work with audio, video, or WebRTC functionality should depend on getstream[webrtc] instead of just getstream to access WebRTC-related dependencies like aiortc, numpy, torch, torchaudio, soundfile, scipy, deepgram-sdk, and elevenlabs

Files:

  • getstream/video/rtc/__init__.py
  • tests/rtc/test_pcm_data.py
  • getstream/video/rtc/track_util.py
**/*test*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*test*.py: Use fixtures to inject objects in tests
Keep tests well organized and use test classes for similar tests
Tests that rely on file assets should use files from tests/assets/ folder; new files should be added there, existing ones reused if possible
Do not use mocks or mock things in tests unless explicitly asked to

Files:

  • tests/rtc/test_pcm_data.py
**/test_*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/test_*.py: Use fixtures to inject objects in tests; test client fixtures can use the Stream API client
Load credentials from .env file; the client fixture will load credentials automatically
Keep tests well organized and use test classes to group similar tests
Tests that rely on file assets should use files from tests/assets/ folder; add new files there and reuse existing ones when possible. Do not use files larger than 256 kilobytes
Do not use mocks or mock objects in tests unless directly requested

Files:

  • tests/rtc/test_pcm_data.py
🧠 Learnings (1)
📚 Learning: 2025-12-02T12:16:57.243Z
Learnt from: CR
Repo: GetStream/stream-py PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-02T12:16:57.243Z
Learning: Applies to **/*.py : Plugins that work with audio, video, or WebRTC functionality should depend on `getstream[webrtc]` instead of just `getstream` to access WebRTC-related dependencies like `aiortc`, `numpy`, `torch`, `torchaudio`, `soundfile`, `scipy`, `deepgram-sdk`, and `elevenlabs`

Applied to files:

  • getstream/video/rtc/__init__.py
🧬 Code graph analysis (3)
getstream/video/rtc/__init__.py (1)
getstream/video/rtc/track_util.py (4)
  • PcmData (653-2199)
  • AudioFormat (31-81)
  • G711Encoding (104-116)
  • G711Mapping (89-101)
tests/rtc/test_pcm_data.py (1)
getstream/video/rtc/track_util.py (5)
  • PcmData (653-2199)
  • from_g711 (1107-1187)
  • g711_bytes (1347-1379)
  • duration (800-830)
  • to_wav_bytes (1290-1309)
getstream/video/rtc/track_util.py (2)
getstream/video/rtc/recording.py (1)
  • AudioFrame (64-70)
getstream/video/rtc/__init__.py (1)
  • join (55-76)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Tests (3.12)
  • GitHub Check: Tests (3.13)
  • GitHub Check: Tests (3.11)
  • GitHub Check: Tests (3.10)
🔇 Additional comments (6)
getstream/video/rtc/__init__.py (1)

20-26: LGTM! Clean public API expansion for G.711 support.

The new G711Encoding and G711Mapping enums are properly imported and re-exported, following the existing patterns in the module. The webrtc optional dependency check at lines 30-37 ensures required dependencies are available.

Also applies to: 95-96

tests/rtc/test_pcm_data.py (2)

1688-1747: LGTM! Good test coverage for G.711 decoding.

The tests cover:

  • Basic μ-law and A-law decoding with known byte values
  • Base64 input handling (both bytes and string)
  • Proper error handling for invalid encoding combinations
  • Verification of output metadata (sample_rate, channels, format, dtype)

1772-1845: LGTM! Comprehensive encoding and roundtrip tests.

The tests properly account for G.711's lossy compression by:

  • Checking output length matches input length
  • Verifying decoded samples are in reasonable ranges
  • Not asserting exact equality for non-zero values
  • Confirming near-zero samples remain close to zero
getstream/video/rtc/track_util.py (3)

89-117: LGTM! Well-documented enum definitions.

The G711Mapping and G711Encoding enums follow the same pattern as AudioFormat, inheriting from str for backward compatibility with string-based APIs. Documentation is clear and appropriate.


119-380: Large decode tables are appropriate for G.711 implementation.

The 256-entry lookup tables for μ-law and A-law decoding follow the standard ITU-T G.711 implementation approach, providing O(1) decoding performance. The table values appear consistent with the G.711 specification.


1347-1416: LGTM! G.711 encoding implementation is well-structured.

The encoding flow correctly:

  1. Resamples to target rate/channels
  2. Converts to int16 (required for G.711)
  3. Creates PyAV AudioFrame
  4. Encodes using PyAV codec (pcm_mulaw/pcm_alaw)
  5. Flushes the encoder to capture any buffered data

The empty sample handling and proper codec flushing ensure robust behavior.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
getstream/video/rtc/track_util.py (1)

671-704: PyAV resampler hardcodes s16p format, losing f32 audio quality.

This issue was identified in a previous review. The _resample_with_pyav method always uses format="s16p" (line 681) and outputs AudioFormat.S16 (line 697), which means float32 audio will be silently downgraded to int16 during resampling for audio longer than 0.5 seconds.

This contradicts the existing test test_resample_float32_preserves_float32_dtype which expects f32 format preservation.

🧹 Nitpick comments (2)
tests/rtc/test_pcm_data.py (1)

1714-1755: Comprehensive base64 input handling tests.

The test properly verifies:

  1. Base64-encoded bytes input
  2. Base64-encoded string input
  3. Error handling when string input is used with encoding="raw" (both string literal and enum)

The comment on line 1749 ("This is the bug") is outdated since the test now expects and verifies the correct behavior (TypeError is raised).

Consider removing the misleading comment on line 1749:

     # Test that string with encoding=G711Encoding.RAW (enum) also raises TypeError
-    # This is the bug: currently it doesn't raise an error, it just decodes as base64
     from getstream.video.rtc import G711Encoding
getstream/video/rtc/track_util.py (1)

95-100: Unused G.711 encoding constants.

These constants (MULAW_ENCODE_BIAS, MULAW_MAX, ALAW_ENCODE_BIAS, ALAW_MAX) are defined but never used. The actual G.711 encoding is performed via PyAV's pcm_mulaw/pcm_alaw codecs in _encode_g711_with_pyav().

Consider removing these dead constants:

-# G.711 encoding constants
-MULAW_ENCODE_BIAS = 33
-MULAW_MAX = 32635
-ALAW_ENCODE_BIAS = 33
-ALAW_MAX = 32635
-

Or if they're intended for future use, add a comment explaining the intent.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between f9fd991 and 74a08d5.

📒 Files selected for processing (4)
  • getstream/video/rtc/__init__.py (2 hunks)
  • getstream/video/rtc/g711.py (1 hunks)
  • getstream/video/rtc/track_util.py (6 hunks)
  • tests/rtc/test_pcm_data.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • getstream/video/rtc/init.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Plugins that work with audio, video, or WebRTC functionality should depend on getstream[webrtc] instead of just getstream to access WebRTC-related dependencies like aiortc, numpy, torch, torchaudio, soundfile, scipy, deepgram-sdk, and elevenlabs

Files:

  • getstream/video/rtc/g711.py
  • tests/rtc/test_pcm_data.py
  • getstream/video/rtc/track_util.py
**/*test*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*test*.py: Use fixtures to inject objects in tests
Keep tests well organized and use test classes for similar tests
Tests that rely on file assets should use files from tests/assets/ folder; new files should be added there, existing ones reused if possible
Do not use mocks or mock things in tests unless explicitly asked to

Files:

  • tests/rtc/test_pcm_data.py
**/test_*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/test_*.py: Use fixtures to inject objects in tests; test client fixtures can use the Stream API client
Load credentials from .env file; the client fixture will load credentials automatically
Keep tests well organized and use test classes to group similar tests
Tests that rely on file assets should use files from tests/assets/ folder; add new files there and reuse existing ones when possible. Do not use files larger than 256 kilobytes
Do not use mocks or mock objects in tests unless directly requested

Files:

  • tests/rtc/test_pcm_data.py
🧠 Learnings (1)
📚 Learning: 2025-12-02T12:16:57.243Z
Learnt from: CR
Repo: GetStream/stream-py PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-02T12:16:57.243Z
Learning: Applies to **/*.py : Plugins that work with audio, video, or WebRTC functionality should depend on `getstream[webrtc]` instead of just `getstream` to access WebRTC-related dependencies like `aiortc`, `numpy`, `torch`, `torchaudio`, `soundfile`, `scipy`, `deepgram-sdk`, and `elevenlabs`

Applied to files:

  • getstream/video/rtc/track_util.py
🧬 Code graph analysis (2)
tests/rtc/test_pcm_data.py (2)
getstream/video/rtc/track_util.py (5)
  • PcmData (102-1652)
  • from_g711 (556-640)
  • g711_bytes (800-832)
  • duration (249-279)
  • to_wav_bytes (743-762)
getstream/video/rtc/g711.py (1)
  • G711Encoding (23-35)
getstream/video/rtc/track_util.py (1)
getstream/video/rtc/g711.py (2)
  • G711Encoding (23-35)
  • G711Mapping (8-20)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Tests (3.12)
  • GitHub Check: Tests (3.13)
  • GitHub Check: Tests (3.10)
  • GitHub Check: Tests (3.11)
🔇 Additional comments (11)
getstream/video/rtc/g711.py (3)

1-36: Well-structured enums for G.711 codec support.

The G711Mapping and G711Encoding enums are correctly implemented as str subclasses for backward compatibility with string-based APIs. This allows both enum and string comparisons (e.g., mapping == G711Mapping.MULAW or mapping == "mulaw").


38-299: MULAW_DECODE_TABLE is correctly sized.

The μ-law decode table has exactly 256 entries (indices 0-255), matching the uint8 input range for G.711 decoding. The values follow the ITU-T G.711 standard.


301-563: ALAW_DECODE_TABLE is correctly sized.

The A-law decode table has exactly 256 entries (indices 0-255), matching the uint8 input range for G.711 decoding. The values follow the ITU-T G.711 standard.

tests/rtc/test_pcm_data.py (4)

1688-1712: Basic G.711 decoding tests are well-structured.

These tests verify the essential properties (sample_rate, channels, format, length, dtype) for both μ-law and A-law decoding paths.


1757-1778: Good coverage for custom sample rates and stereo channels.


1780-1822: Encoding tests with roundtrip verification.

The tests properly verify:

  1. Output is bytes with expected length
  2. Encoded data can be decoded back
  3. Auto-resampling works for non-8kHz input

1824-1853: Roundtrip test correctly handles lossy G.711 encoding.

The test appropriately:

  1. Tests both μ-law and A-law paths
  2. Uses tolerance for decoded values (G.711 is lossy)
  3. Verifies sample count and dtype preservation
getstream/video/rtc/track_util.py (4)

2-3: Imports are correctly added for G.711 support.

Also applies to: 28-34


555-640: G.711 decoding implementation is correct.

The implementation properly:

  1. Normalizes encoding enum to string early (lines 592-594), addressing the previous review concern about enum vs string distinction
  2. Raises TypeError for string input with encoding="raw" (both string and enum)
  3. Uses efficient lookup table decoding
  4. Handles multi-channel data by reshaping to (channels, frames) format

764-798: to_av_frame() implementation is correct for G.711 encoding.

The method correctly:

  1. Converts to int16 (required for G.711 codecs)
  2. Handles both 1D mono and 2D multi-channel arrays
  3. Creates appropriate PyAV AudioFrame with correct layout

800-833: g711_bytes() method is correctly implemented.

The method properly:

  1. Resamples to target sample rate and channels
  2. Converts to int16 (required for G.711)
  3. Delegates to PyAV for actual encoding
  4. Supports both μ-law and A-law mappings

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
tests/rtc/test_pcm_data.py (1)

1918-1971: Integration test should avoid writing permanent files under tests/assets/.

This test generates WAV files in tests/assets/ at runtime. That can pollute git state, cause issues on read-only CI filesystems, and introduce race conditions between parallel test runs. Prefer writing to a temporary directory (e.g., pytest’s tmp_path fixture or tempfile.TemporaryDirectory) rather than committing artifacts on each run; if you want permanent fixtures, generate them once and commit them instead of re-creating them in the test.

Based on learnings, tests should rely on static assets in tests/assets or on per-run temp dirs, not generate new committed files.

🧹 Nitpick comments (2)
tests/rtc/test_pcm_data.py (1)

667-735: PyAV resample tests look correct; consider aligning threshold commentary.

The tests accurately exercise the new PyAV-based resampling path and validate dtype/format preservation for both f32 and s16. One minor nit: comments mention a “>100ms” threshold, while PcmData.resample() currently switches to PyAV for duration > 0.5 seconds. Updating the comments to “>500ms” (or matching whatever threshold you settle on) would avoid future confusion.

getstream/video/rtc/track_util.py (1)

642-713: PyAV resample path correctly preserves formats; just ensure threshold and tests/docs are coherent.

Delegating to _resample_with_pyav for duration > 0.5 seconds and preserving f32 vs s16 via av_format / target_format lines up with the new tests (test_resample_*_pyav_preserves_format). The frame-by-frame conversion back through from_av_frame and append also keeps formats and dtypes consistent.

Two minor follow-ups:

  • The comment mentions “audio longer than 500ms” here, while one of the tests still refers to a 100ms threshold; make sure comments and tests agree on the actual cutoff.
  • For long clips this path may be more allocation-heavy due to concatenating frames via append; if you anticipate multi-second segments, consider a future optimization that pre-allocates or concatenates NumPy arrays directly.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 74a08d5 and bed137c.

📒 Files selected for processing (2)
  • getstream/video/rtc/track_util.py (6 hunks)
  • tests/rtc/test_pcm_data.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*test*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*test*.py: Use fixtures to inject objects in tests
Keep tests well organized and use test classes for similar tests
Tests that rely on file assets should use files from tests/assets/ folder; new files should be added there, existing ones reused if possible
Do not use mocks or mock things in tests unless explicitly asked to

Files:

  • tests/rtc/test_pcm_data.py
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Plugins that work with audio, video, or WebRTC functionality should depend on getstream[webrtc] instead of just getstream to access WebRTC-related dependencies like aiortc, numpy, torch, torchaudio, soundfile, scipy, deepgram-sdk, and elevenlabs

Files:

  • tests/rtc/test_pcm_data.py
  • getstream/video/rtc/track_util.py
**/test_*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/test_*.py: Use fixtures to inject objects in tests; test client fixtures can use the Stream API client
Load credentials from .env file; the client fixture will load credentials automatically
Keep tests well organized and use test classes to group similar tests
Tests that rely on file assets should use files from tests/assets/ folder; add new files there and reuse existing ones when possible. Do not use files larger than 256 kilobytes
Do not use mocks or mock objects in tests unless directly requested

Files:

  • tests/rtc/test_pcm_data.py
🧠 Learnings (3)
📚 Learning: 2025-12-02T12:16:45.739Z
Learnt from: CR
Repo: GetStream/stream-py PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-02T12:16:45.739Z
Learning: Applies to **/*test*.py : Tests that rely on file assets should use files from `tests/assets/` folder; new files should be added there, existing ones reused if possible

Applied to files:

  • tests/rtc/test_pcm_data.py
📚 Learning: 2025-12-02T12:16:57.243Z
Learnt from: CR
Repo: GetStream/stream-py PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-02T12:16:57.243Z
Learning: Applies to **/test_*.py : Tests that rely on file assets should use files from `tests/assets/` folder; add new files there and reuse existing ones when possible. Do not use files larger than 256 kilobytes

Applied to files:

  • tests/rtc/test_pcm_data.py
📚 Learning: 2025-12-02T12:16:57.243Z
Learnt from: CR
Repo: GetStream/stream-py PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-02T12:16:57.243Z
Learning: Applies to **/*.py : Plugins that work with audio, video, or WebRTC functionality should depend on `getstream[webrtc]` instead of just `getstream` to access WebRTC-related dependencies like `aiortc`, `numpy`, `torch`, `torchaudio`, `soundfile`, `scipy`, `deepgram-sdk`, and `elevenlabs`

Applied to files:

  • getstream/video/rtc/track_util.py
🪛 GitHub Actions: CI
getstream/video/rtc/track_util.py

[error] 586-586: TypeError in PcmData.from_g711: Cannot use string input with encoding='raw'. Strings are only supported for base64-encoded data. Either pass bytes with encoding='raw', or use encoding='base64' for string input.

🔇 Additional comments (2)
tests/rtc/test_pcm_data.py (1)

1751-1917: G.711 unit tests provide good coverage of bytes/base64 and enum/string paths.

These tests cover μ-law vs A-law, bytes vs base64, custom sample rates, stereo reshaping, and the enum vs string encoding="raw"/"base64" behavior of from_g711 and g711_bytes. They should catch the earlier inconsistency around G711Encoding.RAW for string input and validate the basic roundtrip behavior. No issues from a test-design perspective.

getstream/video/rtc/track_util.py (1)

772-815: to_av_frame looks correct and consistent with from_av_frame.

The method normalizes to planar formats (fltp for f32, s16p for s16), ensures a (channels, samples) ndarray (or (1, N) for mono), and sets layout/sample_rate on the resulting AudioFrame. This matches the assumptions in _resample_with_pyav and _encode_g711_with_pyav and is symmetric with the logic in from_av_frame. No changes needed here.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
getstream/video/rtc/track_util.py (1)

555-640: Fix incorrect docstring and example in from_g711.

The docstring at line 572 claims "If g711_data is a string, encoding is automatically set to BASE64," but this is false. The implementation (lines 599-604) raises a TypeError when a string is passed with encoding="raw" (the default), rather than automatically converting to BASE64.

The example at lines 585-586 will fail:

>>> g711_base64 = "//8A"
>>> pcm = PcmData.from_g711(g711_base64, sample_rate=8000)  # Missing encoding parameter!

This call will raise TypeError because the default encoding=G711Encoding.RAW conflicts with string input.

Update the docstring and example to match the implementation:

         Args:
             g711_data: G.711 encoded audio data (bytes or base64 string)
             sample_rate: Sample rate in Hz (default: 8000)
             channels: Number of channels (default: 1 for mono)
             mapping: G.711 mapping type (default: MULAW)
-            encoding: Input encoding format (default: RAW, can be BASE64).
-                     If g711_data is a string, encoding is automatically set to BASE64.
+            encoding: Input encoding format (default: RAW, can be BASE64).
+                     When g711_data is a string, you must explicitly pass encoding="base64".

         Returns:
             PcmData object with decoded audio

         Example:
             >>> import numpy as np
             >>> # Decode μ-law bytes
             >>> g711_data = bytes([0xFF, 0x7F, 0x00, 0x80])
             >>> pcm = PcmData.from_g711(g711_data, sample_rate=8000, channels=1)
             >>> pcm.sample_rate
             8000
             >>> # Decode from base64 string
             >>> g711_base64 = "//8A"
-            >>> pcm = PcmData.from_g711(g711_base64, sample_rate=8000)
+            >>> pcm = PcmData.from_g711(g711_base64, sample_rate=8000, encoding="base64")
             >>> pcm.sample_rate
             8000

Based on learnings, as per past review comments this was supposed to be addressed but the fix was not applied correctly.

🧹 Nitpick comments (3)
tests/rtc/test_pcm_data.py (2)

667-701: Update misleading comment about PyAV resampler threshold.

The comment at lines 668 and 672 references a "100ms threshold," but the actual threshold in track_util.py line 662 is 0.5 seconds (500ms). While the test correctly uses 1 second of audio (which exceeds 500ms and triggers PyAV), the comment misleads readers about the actual threshold.

Update the comments to reflect the correct threshold:

-def test_resample_float32_pyav_preserves_format():
-    """Test that float32 stays float32 when using PyAV resampler (audio > 100ms)."""
-    # Create float32 audio longer than 100ms to trigger PyAV resampler
+def test_resample_float32_pyav_preserves_format():
+    """Test that float32 stays float32 when using PyAV resampler (audio > 500ms)."""
+    # Create float32 audio longer than 500ms to trigger PyAV resampler
     sample_rate_in = 16000
     sample_rate_out = 48000
-    duration_sec = 1.0  # 1 second > 100ms threshold
+    duration_sec = 1.0  # 1 second > 500ms threshold

703-734: Update misleading comment about PyAV resampler threshold.

Same issue as the previous test—the comment references "100ms threshold" but the actual threshold is 500ms (line 662 in track_util.py).

Update the comments:

-def test_resample_int16_pyav_preserves_format():
-    """Test that int16 stays int16 when using PyAV resampler (audio > 100ms)."""
-    # Create int16 audio longer than 100ms to trigger PyAV resampler
+def test_resample_int16_pyav_preserves_format():
+    """Test that int16 stays int16 when using PyAV resampler (audio > 500ms)."""
+    # Create int16 audio longer than 500ms to trigger PyAV resampler
     sample_rate_in = 16000
     sample_rate_out = 48000
-    duration_sec = 1.0  # 1 second > 100ms threshold
+    duration_sec = 1.0  # 1 second > 500ms threshold
getstream/video/rtc/track_util.py (1)

95-99: Dead code: G.711 encoding constants are unused.

These constants (MULAW_ENCODE_BIAS, MULAW_MAX, ALAW_ENCODE_BIAS, ALAW_MAX) are defined but never referenced in the codebase. G.711 encoding is handled by PyAV codecs (lines 843-848) rather than manual encoding, making these constants unnecessary.

Consider removing the unused constants:

-# G.711 encoding constants
-MULAW_ENCODE_BIAS = 33
-MULAW_MAX = 32635
-ALAW_ENCODE_BIAS = 33
-ALAW_MAX = 32635
-
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between bed137c and 1420491.

📒 Files selected for processing (2)
  • getstream/video/rtc/track_util.py (6 hunks)
  • tests/rtc/test_pcm_data.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*test*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*test*.py: Use fixtures to inject objects in tests
Keep tests well organized and use test classes for similar tests
Tests that rely on file assets should use files from tests/assets/ folder; new files should be added there, existing ones reused if possible
Do not use mocks or mock things in tests unless explicitly asked to

Files:

  • tests/rtc/test_pcm_data.py
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Plugins that work with audio, video, or WebRTC functionality should depend on getstream[webrtc] instead of just getstream to access WebRTC-related dependencies like aiortc, numpy, torch, torchaudio, soundfile, scipy, deepgram-sdk, and elevenlabs

Files:

  • tests/rtc/test_pcm_data.py
  • getstream/video/rtc/track_util.py
**/test_*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/test_*.py: Use fixtures to inject objects in tests; test client fixtures can use the Stream API client
Load credentials from .env file; the client fixture will load credentials automatically
Keep tests well organized and use test classes to group similar tests
Tests that rely on file assets should use files from tests/assets/ folder; add new files there and reuse existing ones when possible. Do not use files larger than 256 kilobytes
Do not use mocks or mock objects in tests unless directly requested

Files:

  • tests/rtc/test_pcm_data.py
🧠 Learnings (3)
📚 Learning: 2025-12-02T12:16:45.739Z
Learnt from: CR
Repo: GetStream/stream-py PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-02T12:16:45.739Z
Learning: Applies to **/*test*.py : Tests that rely on file assets should use files from `tests/assets/` folder; new files should be added there, existing ones reused if possible

Applied to files:

  • tests/rtc/test_pcm_data.py
📚 Learning: 2025-12-02T12:16:57.243Z
Learnt from: CR
Repo: GetStream/stream-py PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-02T12:16:57.243Z
Learning: Applies to **/test_*.py : Tests that rely on file assets should use files from `tests/assets/` folder; add new files there and reuse existing ones when possible. Do not use files larger than 256 kilobytes

Applied to files:

  • tests/rtc/test_pcm_data.py
📚 Learning: 2025-12-02T12:16:57.243Z
Learnt from: CR
Repo: GetStream/stream-py PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-02T12:16:57.243Z
Learning: Applies to **/*.py : Plugins that work with audio, video, or WebRTC functionality should depend on `getstream[webrtc]` instead of just `getstream` to access WebRTC-related dependencies like `aiortc`, `numpy`, `torch`, `torchaudio`, `soundfile`, `scipy`, `deepgram-sdk`, and `elevenlabs`

Applied to files:

  • getstream/video/rtc/track_util.py
🧬 Code graph analysis (2)
tests/rtc/test_pcm_data.py (1)
getstream/video/rtc/g711.py (1)
  • G711Encoding (23-35)
getstream/video/rtc/track_util.py (2)
getstream/video/rtc/g711.py (2)
  • G711Encoding (23-35)
  • G711Mapping (8-20)
getstream/video/rtc/recording.py (1)
  • AudioFrame (64-70)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Tests (3.13)
  • GitHub Check: Tests (3.12)
  • GitHub Check: Tests (3.10)
  • GitHub Check: Tests (3.11)
🔇 Additional comments (6)
tests/rtc/test_pcm_data.py (2)

1751-1916: LGTM! Comprehensive G.711 test coverage.

The G.711 tests thoroughly cover:

  • μ-law and A-law decoding (basic cases)
  • Base64 encoding/decoding with proper error handling for invalid encoding combinations
  • Custom sample rates and stereo channels
  • Encoding to G.711 bytes
  • Roundtrip encode/decode verification

The error handling tests (lines 1807-1817) correctly verify that string input with encoding="raw" raises TypeError for both the string literal and enum variants, ensuring consistent behavior.


1918-1970: LGTM! Integration test correctly uses temporary directory.

The integration test now properly uses pytest's tmp_path fixture instead of writing to tests/assets/. This addresses the previous review concern about:

  • Git pollution from uncommitted test artifacts
  • CI environment compatibility with read-only filesystems
  • Avoiding race conditions in parallel test execution

The temporary directory is automatically cleaned up by pytest after the test completes.

getstream/video/rtc/track_util.py (4)

660-669: LGTM! Smart hybrid resampling strategy.

The modification to use PyAV resampling for audio longer than 500ms is a good optimization. The comment clearly explains the trade-off: PyAV provides higher quality but is stateful and unsuitable for small real-time chunks (e.g., WebRTC's 20ms frames), while the in-house resampler handles short chunks with lower latency.


671-712: LGTM! PyAV resampler correctly preserves audio format.

The _resample_with_pyav method properly preserves the original audio format:

  • Float32 input → fltp (float32 planar) PyAV format → Float32 output
  • Int16 input → s16p (int16 planar) PyAV format → Int16 output

This addresses the previous concern about hardcoding s16p and losing float32 quality. The format detection (lines 680-685) and result format assignment (lines 703-706) ensure audio quality is maintained through the resampling process.


772-814: LGTM! Correct conversion of PcmData to PyAV AudioFrame.

The to_av_frame method properly handles:

  • Format conversion preserving quality (f32 → fltp, s16 → s16p)
  • Shape normalization to PyAV's expected (channels, samples) format
  • Channel layout configuration (mono/stereo)
  • Sample rate preservation

This provides the foundation for PyAV-based operations like resampling and G.711 encoding.


816-886: LGTM! G.711 encoding implementation is correct.

The G.711 encoding implementation properly:

  • Resamples and converts to int16 before encoding (line 840)
  • Selects the appropriate codec (pcm_mulaw/pcm_alaw) based on mapping (lines 843-846)
  • Configures the codec with correct parameters (lines 868-872)
  • Opens the codec before encoding (line 873) - this addresses the previous review concern
  • Flushes the encoder to get remaining buffered data (lines 881-884)

The addition of codec.open() at line 873 ensures proper codec initialization across all PyAV versions.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
getstream/video/rtc/track_util.py (1)

571-573: Docstring and example still incorrect per previous review.

The docstring claim at lines 571-573 that "encoding is automatically set to BASE64" remains false—the implementation requires callers to explicitly pass encoding="base64" when using string input. The example at line 586 will raise TypeError because it omits this required parameter.

This issue was flagged in a previous review and shows a GitHub Actions error, but appears unresolved.

Also applies to: 586-586

🧹 Nitpick comments (1)
getstream/video/rtc/track_util.py (1)

95-99: Unused G.711 constants—consider removing or documenting.

These module-level constants are defined but never referenced in the implementation. The G.711 encoding/decoding uses PyAV codecs and lookup tables rather than these values. If they're intended as part of the public API for external use, add docstrings to explain their purpose. Otherwise, remove them to reduce maintenance burden.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1420491 and 39b27c2.

📒 Files selected for processing (1)
  • getstream/video/rtc/track_util.py (6 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Plugins that work with audio, video, or WebRTC functionality should depend on getstream[webrtc] instead of just getstream to access WebRTC-related dependencies like aiortc, numpy, torch, torchaudio, soundfile, scipy, deepgram-sdk, and elevenlabs

Files:

  • getstream/video/rtc/track_util.py
🧠 Learnings (1)
📚 Learning: 2025-12-02T12:16:57.243Z
Learnt from: CR
Repo: GetStream/stream-py PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-02T12:16:57.243Z
Learning: Applies to **/*.py : Plugins that work with audio, video, or WebRTC functionality should depend on `getstream[webrtc]` instead of just `getstream` to access WebRTC-related dependencies like `aiortc`, `numpy`, `torch`, `torchaudio`, `soundfile`, `scipy`, `deepgram-sdk`, and `elevenlabs`

Applied to files:

  • getstream/video/rtc/track_util.py
🧬 Code graph analysis (1)
getstream/video/rtc/track_util.py (3)
getstream/video/rtc/g711.py (2)
  • G711Encoding (23-35)
  • G711Mapping (8-20)
getstream/video/rtc/recording.py (1)
  • AudioFrame (64-70)
getstream/video/rtc/__init__.py (1)
  • join (57-78)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Tests (3.12)
  • GitHub Check: Tests (3.11)
  • GitHub Check: Tests (3.13)
  • GitHub Check: Tests (3.10)
🔇 Additional comments (3)
getstream/video/rtc/track_util.py (3)

660-712: LGTM!

The PyAV resampler integration is well-implemented: properly preserves audio format (f32/s16), uses appropriate format mappings (fltp/s16p), flushes the resampler to capture buffered samples, and correctly assembles results across multiple frames. The 0.5-second threshold for choosing the high-quality path is reasonable.


772-814: LGTM!

The to_av_frame() method correctly converts PcmData to PyAV AudioFrame: preserves the original format (f32→fltp, s16→s16p), handles both 1D and 2D sample arrays, and properly sets up the frame with the correct layout and sample rate.


816-886: LGTM!

The G.711 encoding implementation is correct: properly resamples to target rate and converts to int16, sets up the PyAV codec context with correct parameters (format="s16", proper layout and sample_rate, appropriate time_base), calls codec.open() before encoding, and properly flushes the encoder to capture buffered packets.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants