-
Notifications
You must be signed in to change notification settings - Fork 14
Add support for G711 audio #202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughAdds G.711 (μ‑law / A‑law) support plus PyAV audio interop: new Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~40 minutes
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (5)
tests/rtc/test_pcm_data.py (2)
1-6: Consider importing G711Mapping and G711Encoding for type safety.The test file uses string literals like
"alaw"and"base64"instead of the enum types. While this works due to the string-based API compatibility, importing and using the enums would provide better IDE support and catch typos at static analysis time.import numpy as np import pytest import av from fractions import Fraction -from getstream.video.rtc.track_util import PcmData, AudioFormat, Resampler +from getstream.video.rtc.track_util import PcmData, AudioFormat, Resampler, G711Mapping, G711Encoding
1847-1900: Integration test writes files to assets directory on every run.The test creates
g711_original.wav,g711_decoded_mulaw.wav, andg711_decoded_alaw.wavintests/assets/on every test execution. While the file sizes (~16KB each) comply with the 256KB limit per coding guidelines, consider:
- These files will accumulate if tests run repeatedly without cleanup
- If these files are meant to be committed as test assets for manual review, consider generating them once and committing them
- If they're temporary, use
tmp_pathfixture for cleanup-def test_g711_integration(): +def test_g711_integration(tmp_path): """Integration test that generates test files for manual review.""" import os # ... audio generation code ... - # Save files for manual review - assets_dir = os.path.join(os.path.dirname(__file__), "..", "assets") - os.makedirs(assets_dir, exist_ok=True) + # Save files for manual review (use tmp_path for automatic cleanup) + assets_dir = tmp_pathAlternatively, if these files are intentionally persisted for manual audio review, add a comment clarifying this is intentional.
getstream/video/rtc/track_util.py (3)
646-651: Unused encoding constants - consider removal.
MULAW_ENCODE_BIAS,MULAW_MAX,ALAW_ENCODE_BIAS, andALAW_MAXare defined but not used. The encoding path uses PyAV codecs (pcm_mulaw/pcm_alaw) instead of table-based encoding. Consider removing these unused constants to avoid confusion.-# G.711 encoding constants -MULAW_ENCODE_BIAS = 33 -MULAW_MAX = 32635 -ALAW_ENCODE_BIAS = 33 -ALAW_MAX = 32635 -
1311-1345: to_av_frame always converts to int16, which is appropriate for G.711.The method always converts to int16 via
to_int16()before creating the PyAV frame. This is correct for G.711 encoding (which operates on 16-bit PCM), but the docstring could clarify this limitation since the method is public.Consider adding to the docstring:
def to_av_frame(self) -> "av.AudioFrame": """Convert PcmData to a PyAV AudioFrame. + + Note: This method converts audio to int16 format (s16p) for compatibility + with PyAV codecs. Float32 audio will be converted to int16. Returns: av.AudioFrame: A PyAV AudioFrame with the audio data
2-3: New standard library imports are appropriate.The
base64andfractionsimports are correctly added for G.711 base64 decoding and PyAV time_base handling.Note: There's also an inline
from fractions import Fractionat line 1089 infrom_av_frame. Sincefractionsis now imported at module level, consider using the module-level import consistently:# Convert time_base from Fraction to float if present time_base = None if hasattr(frame, "time_base") and frame.time_base is not None: - from fractions import Fraction - if isinstance(frame.time_base, Fraction): + if isinstance(frame.time_base, fractions.Fraction): time_base = float(frame.time_base)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
getstream/video/rtc/__init__.py(2 hunks)getstream/video/rtc/track_util.py(5 hunks)tests/rtc/test_pcm_data.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Plugins that work with audio, video, or WebRTC functionality should depend on
getstream[webrtc]instead of justgetstreamto access WebRTC-related dependencies likeaiortc,numpy,torch,torchaudio,soundfile,scipy,deepgram-sdk, andelevenlabs
Files:
getstream/video/rtc/__init__.pytests/rtc/test_pcm_data.pygetstream/video/rtc/track_util.py
**/*test*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*test*.py: Use fixtures to inject objects in tests
Keep tests well organized and use test classes for similar tests
Tests that rely on file assets should use files fromtests/assets/folder; new files should be added there, existing ones reused if possible
Do not use mocks or mock things in tests unless explicitly asked to
Files:
tests/rtc/test_pcm_data.py
**/test_*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/test_*.py: Use fixtures to inject objects in tests; test client fixtures can use the Stream API client
Load credentials from .env file; the client fixture will load credentials automatically
Keep tests well organized and use test classes to group similar tests
Tests that rely on file assets should use files fromtests/assets/folder; add new files there and reuse existing ones when possible. Do not use files larger than 256 kilobytes
Do not use mocks or mock objects in tests unless directly requested
Files:
tests/rtc/test_pcm_data.py
🧠 Learnings (1)
📚 Learning: 2025-12-02T12:16:57.243Z
Learnt from: CR
Repo: GetStream/stream-py PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-02T12:16:57.243Z
Learning: Applies to **/*.py : Plugins that work with audio, video, or WebRTC functionality should depend on `getstream[webrtc]` instead of just `getstream` to access WebRTC-related dependencies like `aiortc`, `numpy`, `torch`, `torchaudio`, `soundfile`, `scipy`, `deepgram-sdk`, and `elevenlabs`
Applied to files:
getstream/video/rtc/__init__.py
🧬 Code graph analysis (3)
getstream/video/rtc/__init__.py (1)
getstream/video/rtc/track_util.py (4)
PcmData(653-2199)AudioFormat(31-81)G711Encoding(104-116)G711Mapping(89-101)
tests/rtc/test_pcm_data.py (1)
getstream/video/rtc/track_util.py (5)
PcmData(653-2199)from_g711(1107-1187)g711_bytes(1347-1379)duration(800-830)to_wav_bytes(1290-1309)
getstream/video/rtc/track_util.py (2)
getstream/video/rtc/recording.py (1)
AudioFrame(64-70)getstream/video/rtc/__init__.py (1)
join(55-76)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Tests (3.12)
- GitHub Check: Tests (3.13)
- GitHub Check: Tests (3.11)
- GitHub Check: Tests (3.10)
🔇 Additional comments (6)
getstream/video/rtc/__init__.py (1)
20-26: LGTM! Clean public API expansion for G.711 support.The new
G711EncodingandG711Mappingenums are properly imported and re-exported, following the existing patterns in the module. Thewebrtcoptional dependency check at lines 30-37 ensures required dependencies are available.Also applies to: 95-96
tests/rtc/test_pcm_data.py (2)
1688-1747: LGTM! Good test coverage for G.711 decoding.The tests cover:
- Basic μ-law and A-law decoding with known byte values
- Base64 input handling (both bytes and string)
- Proper error handling for invalid encoding combinations
- Verification of output metadata (sample_rate, channels, format, dtype)
1772-1845: LGTM! Comprehensive encoding and roundtrip tests.The tests properly account for G.711's lossy compression by:
- Checking output length matches input length
- Verifying decoded samples are in reasonable ranges
- Not asserting exact equality for non-zero values
- Confirming near-zero samples remain close to zero
getstream/video/rtc/track_util.py (3)
89-117: LGTM! Well-documented enum definitions.The
G711MappingandG711Encodingenums follow the same pattern asAudioFormat, inheriting fromstrfor backward compatibility with string-based APIs. Documentation is clear and appropriate.
119-380: Large decode tables are appropriate for G.711 implementation.The 256-entry lookup tables for μ-law and A-law decoding follow the standard ITU-T G.711 implementation approach, providing O(1) decoding performance. The table values appear consistent with the G.711 specification.
1347-1416: LGTM! G.711 encoding implementation is well-structured.The encoding flow correctly:
- Resamples to target rate/channels
- Converts to int16 (required for G.711)
- Creates PyAV AudioFrame
- Encodes using PyAV codec (pcm_mulaw/pcm_alaw)
- Flushes the encoder to capture any buffered data
The empty sample handling and proper codec flushing ensure robust behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (1)
getstream/video/rtc/track_util.py (1)
671-704: PyAV resampler hardcodes s16p format, losing f32 audio quality.This issue was identified in a previous review. The
_resample_with_pyavmethod always usesformat="s16p"(line 681) and outputsAudioFormat.S16(line 697), which means float32 audio will be silently downgraded to int16 during resampling for audio longer than 0.5 seconds.This contradicts the existing test
test_resample_float32_preserves_float32_dtypewhich expects f32 format preservation.
🧹 Nitpick comments (2)
tests/rtc/test_pcm_data.py (1)
1714-1755: Comprehensive base64 input handling tests.The test properly verifies:
- Base64-encoded bytes input
- Base64-encoded string input
- Error handling when string input is used with
encoding="raw"(both string literal and enum)The comment on line 1749 ("This is the bug") is outdated since the test now expects and verifies the correct behavior (TypeError is raised).
Consider removing the misleading comment on line 1749:
# Test that string with encoding=G711Encoding.RAW (enum) also raises TypeError - # This is the bug: currently it doesn't raise an error, it just decodes as base64 from getstream.video.rtc import G711Encodinggetstream/video/rtc/track_util.py (1)
95-100: Unused G.711 encoding constants.These constants (
MULAW_ENCODE_BIAS,MULAW_MAX,ALAW_ENCODE_BIAS,ALAW_MAX) are defined but never used. The actual G.711 encoding is performed via PyAV'spcm_mulaw/pcm_alawcodecs in_encode_g711_with_pyav().Consider removing these dead constants:
-# G.711 encoding constants -MULAW_ENCODE_BIAS = 33 -MULAW_MAX = 32635 -ALAW_ENCODE_BIAS = 33 -ALAW_MAX = 32635 -Or if they're intended for future use, add a comment explaining the intent.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
getstream/video/rtc/__init__.py(2 hunks)getstream/video/rtc/g711.py(1 hunks)getstream/video/rtc/track_util.py(6 hunks)tests/rtc/test_pcm_data.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- getstream/video/rtc/init.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Plugins that work with audio, video, or WebRTC functionality should depend on
getstream[webrtc]instead of justgetstreamto access WebRTC-related dependencies likeaiortc,numpy,torch,torchaudio,soundfile,scipy,deepgram-sdk, andelevenlabs
Files:
getstream/video/rtc/g711.pytests/rtc/test_pcm_data.pygetstream/video/rtc/track_util.py
**/*test*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*test*.py: Use fixtures to inject objects in tests
Keep tests well organized and use test classes for similar tests
Tests that rely on file assets should use files fromtests/assets/folder; new files should be added there, existing ones reused if possible
Do not use mocks or mock things in tests unless explicitly asked to
Files:
tests/rtc/test_pcm_data.py
**/test_*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/test_*.py: Use fixtures to inject objects in tests; test client fixtures can use the Stream API client
Load credentials from .env file; the client fixture will load credentials automatically
Keep tests well organized and use test classes to group similar tests
Tests that rely on file assets should use files fromtests/assets/folder; add new files there and reuse existing ones when possible. Do not use files larger than 256 kilobytes
Do not use mocks or mock objects in tests unless directly requested
Files:
tests/rtc/test_pcm_data.py
🧠 Learnings (1)
📚 Learning: 2025-12-02T12:16:57.243Z
Learnt from: CR
Repo: GetStream/stream-py PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-02T12:16:57.243Z
Learning: Applies to **/*.py : Plugins that work with audio, video, or WebRTC functionality should depend on `getstream[webrtc]` instead of just `getstream` to access WebRTC-related dependencies like `aiortc`, `numpy`, `torch`, `torchaudio`, `soundfile`, `scipy`, `deepgram-sdk`, and `elevenlabs`
Applied to files:
getstream/video/rtc/track_util.py
🧬 Code graph analysis (2)
tests/rtc/test_pcm_data.py (2)
getstream/video/rtc/track_util.py (5)
PcmData(102-1652)from_g711(556-640)g711_bytes(800-832)duration(249-279)to_wav_bytes(743-762)getstream/video/rtc/g711.py (1)
G711Encoding(23-35)
getstream/video/rtc/track_util.py (1)
getstream/video/rtc/g711.py (2)
G711Encoding(23-35)G711Mapping(8-20)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Tests (3.12)
- GitHub Check: Tests (3.13)
- GitHub Check: Tests (3.10)
- GitHub Check: Tests (3.11)
🔇 Additional comments (11)
getstream/video/rtc/g711.py (3)
1-36: Well-structured enums for G.711 codec support.The
G711MappingandG711Encodingenums are correctly implemented asstrsubclasses for backward compatibility with string-based APIs. This allows both enum and string comparisons (e.g.,mapping == G711Mapping.MULAWormapping == "mulaw").
38-299: MULAW_DECODE_TABLE is correctly sized.The μ-law decode table has exactly 256 entries (indices 0-255), matching the uint8 input range for G.711 decoding. The values follow the ITU-T G.711 standard.
301-563: ALAW_DECODE_TABLE is correctly sized.The A-law decode table has exactly 256 entries (indices 0-255), matching the uint8 input range for G.711 decoding. The values follow the ITU-T G.711 standard.
tests/rtc/test_pcm_data.py (4)
1688-1712: Basic G.711 decoding tests are well-structured.These tests verify the essential properties (sample_rate, channels, format, length, dtype) for both μ-law and A-law decoding paths.
1757-1778: Good coverage for custom sample rates and stereo channels.
1780-1822: Encoding tests with roundtrip verification.The tests properly verify:
- Output is bytes with expected length
- Encoded data can be decoded back
- Auto-resampling works for non-8kHz input
1824-1853: Roundtrip test correctly handles lossy G.711 encoding.The test appropriately:
- Tests both μ-law and A-law paths
- Uses tolerance for decoded values (G.711 is lossy)
- Verifies sample count and dtype preservation
getstream/video/rtc/track_util.py (4)
2-3: Imports are correctly added for G.711 support.Also applies to: 28-34
555-640: G.711 decoding implementation is correct.The implementation properly:
- Normalizes encoding enum to string early (lines 592-594), addressing the previous review concern about enum vs string distinction
- Raises
TypeErrorfor string input withencoding="raw"(both string and enum)- Uses efficient lookup table decoding
- Handles multi-channel data by reshaping to
(channels, frames)format
764-798: to_av_frame() implementation is correct for G.711 encoding.The method correctly:
- Converts to int16 (required for G.711 codecs)
- Handles both 1D mono and 2D multi-channel arrays
- Creates appropriate PyAV AudioFrame with correct layout
800-833: g711_bytes() method is correctly implemented.The method properly:
- Resamples to target sample rate and channels
- Converts to int16 (required for G.711)
- Delegates to PyAV for actual encoding
- Supports both μ-law and A-law mappings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (1)
tests/rtc/test_pcm_data.py (1)
1918-1971: Integration test should avoid writing permanent files undertests/assets/.This test generates WAV files in
tests/assets/at runtime. That can pollute git state, cause issues on read-only CI filesystems, and introduce race conditions between parallel test runs. Prefer writing to a temporary directory (e.g., pytest’stmp_pathfixture ortempfile.TemporaryDirectory) rather than committing artifacts on each run; if you want permanent fixtures, generate them once and commit them instead of re-creating them in the test.Based on learnings, tests should rely on static assets in
tests/assetsor on per-run temp dirs, not generate new committed files.
🧹 Nitpick comments (2)
tests/rtc/test_pcm_data.py (1)
667-735: PyAV resample tests look correct; consider aligning threshold commentary.The tests accurately exercise the new PyAV-based resampling path and validate dtype/format preservation for both f32 and s16. One minor nit: comments mention a “>100ms” threshold, while
PcmData.resample()currently switches to PyAV forduration > 0.5seconds. Updating the comments to “>500ms” (or matching whatever threshold you settle on) would avoid future confusion.getstream/video/rtc/track_util.py (1)
642-713: PyAV resample path correctly preserves formats; just ensure threshold and tests/docs are coherent.Delegating to
_resample_with_pyavforduration > 0.5seconds and preservingf32vss16viaav_format/target_formatlines up with the new tests (test_resample_*_pyav_preserves_format). The frame-by-frame conversion back throughfrom_av_frameandappendalso keeps formats and dtypes consistent.Two minor follow-ups:
- The comment mentions “audio longer than 500ms” here, while one of the tests still refers to a 100ms threshold; make sure comments and tests agree on the actual cutoff.
- For long clips this path may be more allocation-heavy due to concatenating frames via
append; if you anticipate multi-second segments, consider a future optimization that pre-allocates or concatenates NumPy arrays directly.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
getstream/video/rtc/track_util.py(6 hunks)tests/rtc/test_pcm_data.py(2 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*test*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*test*.py: Use fixtures to inject objects in tests
Keep tests well organized and use test classes for similar tests
Tests that rely on file assets should use files fromtests/assets/folder; new files should be added there, existing ones reused if possible
Do not use mocks or mock things in tests unless explicitly asked to
Files:
tests/rtc/test_pcm_data.py
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Plugins that work with audio, video, or WebRTC functionality should depend on
getstream[webrtc]instead of justgetstreamto access WebRTC-related dependencies likeaiortc,numpy,torch,torchaudio,soundfile,scipy,deepgram-sdk, andelevenlabs
Files:
tests/rtc/test_pcm_data.pygetstream/video/rtc/track_util.py
**/test_*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/test_*.py: Use fixtures to inject objects in tests; test client fixtures can use the Stream API client
Load credentials from .env file; the client fixture will load credentials automatically
Keep tests well organized and use test classes to group similar tests
Tests that rely on file assets should use files fromtests/assets/folder; add new files there and reuse existing ones when possible. Do not use files larger than 256 kilobytes
Do not use mocks or mock objects in tests unless directly requested
Files:
tests/rtc/test_pcm_data.py
🧠 Learnings (3)
📚 Learning: 2025-12-02T12:16:45.739Z
Learnt from: CR
Repo: GetStream/stream-py PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-02T12:16:45.739Z
Learning: Applies to **/*test*.py : Tests that rely on file assets should use files from `tests/assets/` folder; new files should be added there, existing ones reused if possible
Applied to files:
tests/rtc/test_pcm_data.py
📚 Learning: 2025-12-02T12:16:57.243Z
Learnt from: CR
Repo: GetStream/stream-py PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-02T12:16:57.243Z
Learning: Applies to **/test_*.py : Tests that rely on file assets should use files from `tests/assets/` folder; add new files there and reuse existing ones when possible. Do not use files larger than 256 kilobytes
Applied to files:
tests/rtc/test_pcm_data.py
📚 Learning: 2025-12-02T12:16:57.243Z
Learnt from: CR
Repo: GetStream/stream-py PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-02T12:16:57.243Z
Learning: Applies to **/*.py : Plugins that work with audio, video, or WebRTC functionality should depend on `getstream[webrtc]` instead of just `getstream` to access WebRTC-related dependencies like `aiortc`, `numpy`, `torch`, `torchaudio`, `soundfile`, `scipy`, `deepgram-sdk`, and `elevenlabs`
Applied to files:
getstream/video/rtc/track_util.py
🪛 GitHub Actions: CI
getstream/video/rtc/track_util.py
[error] 586-586: TypeError in PcmData.from_g711: Cannot use string input with encoding='raw'. Strings are only supported for base64-encoded data. Either pass bytes with encoding='raw', or use encoding='base64' for string input.
🔇 Additional comments (2)
tests/rtc/test_pcm_data.py (1)
1751-1917: G.711 unit tests provide good coverage of bytes/base64 and enum/string paths.These tests cover μ-law vs A-law, bytes vs base64, custom sample rates, stereo reshaping, and the enum vs string
encoding="raw"/"base64"behavior offrom_g711andg711_bytes. They should catch the earlier inconsistency aroundG711Encoding.RAWfor string input and validate the basic roundtrip behavior. No issues from a test-design perspective.getstream/video/rtc/track_util.py (1)
772-815:to_av_framelooks correct and consistent withfrom_av_frame.The method normalizes to planar formats (
fltpfor f32,s16pfor s16), ensures a(channels, samples)ndarray (or(1, N)for mono), and sets layout/sample_rate on the resultingAudioFrame. This matches the assumptions in_resample_with_pyavand_encode_g711_with_pyavand is symmetric with the logic infrom_av_frame. No changes needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
getstream/video/rtc/track_util.py (1)
555-640: Fix incorrect docstring and example infrom_g711.The docstring at line 572 claims "If g711_data is a string, encoding is automatically set to BASE64," but this is false. The implementation (lines 599-604) raises a
TypeErrorwhen a string is passed withencoding="raw"(the default), rather than automatically converting to BASE64.The example at lines 585-586 will fail:
>>> g711_base64 = "//8A" >>> pcm = PcmData.from_g711(g711_base64, sample_rate=8000) # Missing encoding parameter!This call will raise
TypeErrorbecause the defaultencoding=G711Encoding.RAWconflicts with string input.Update the docstring and example to match the implementation:
Args: g711_data: G.711 encoded audio data (bytes or base64 string) sample_rate: Sample rate in Hz (default: 8000) channels: Number of channels (default: 1 for mono) mapping: G.711 mapping type (default: MULAW) - encoding: Input encoding format (default: RAW, can be BASE64). - If g711_data is a string, encoding is automatically set to BASE64. + encoding: Input encoding format (default: RAW, can be BASE64). + When g711_data is a string, you must explicitly pass encoding="base64". Returns: PcmData object with decoded audio Example: >>> import numpy as np >>> # Decode μ-law bytes >>> g711_data = bytes([0xFF, 0x7F, 0x00, 0x80]) >>> pcm = PcmData.from_g711(g711_data, sample_rate=8000, channels=1) >>> pcm.sample_rate 8000 >>> # Decode from base64 string >>> g711_base64 = "//8A" - >>> pcm = PcmData.from_g711(g711_base64, sample_rate=8000) + >>> pcm = PcmData.from_g711(g711_base64, sample_rate=8000, encoding="base64") >>> pcm.sample_rate 8000Based on learnings, as per past review comments this was supposed to be addressed but the fix was not applied correctly.
🧹 Nitpick comments (3)
tests/rtc/test_pcm_data.py (2)
667-701: Update misleading comment about PyAV resampler threshold.The comment at lines 668 and 672 references a "100ms threshold," but the actual threshold in
track_util.pyline 662 is0.5seconds (500ms). While the test correctly uses 1 second of audio (which exceeds 500ms and triggers PyAV), the comment misleads readers about the actual threshold.Update the comments to reflect the correct threshold:
-def test_resample_float32_pyav_preserves_format(): - """Test that float32 stays float32 when using PyAV resampler (audio > 100ms).""" - # Create float32 audio longer than 100ms to trigger PyAV resampler +def test_resample_float32_pyav_preserves_format(): + """Test that float32 stays float32 when using PyAV resampler (audio > 500ms).""" + # Create float32 audio longer than 500ms to trigger PyAV resampler sample_rate_in = 16000 sample_rate_out = 48000 - duration_sec = 1.0 # 1 second > 100ms threshold + duration_sec = 1.0 # 1 second > 500ms threshold
703-734: Update misleading comment about PyAV resampler threshold.Same issue as the previous test—the comment references "100ms threshold" but the actual threshold is 500ms (line 662 in
track_util.py).Update the comments:
-def test_resample_int16_pyav_preserves_format(): - """Test that int16 stays int16 when using PyAV resampler (audio > 100ms).""" - # Create int16 audio longer than 100ms to trigger PyAV resampler +def test_resample_int16_pyav_preserves_format(): + """Test that int16 stays int16 when using PyAV resampler (audio > 500ms).""" + # Create int16 audio longer than 500ms to trigger PyAV resampler sample_rate_in = 16000 sample_rate_out = 48000 - duration_sec = 1.0 # 1 second > 100ms threshold + duration_sec = 1.0 # 1 second > 500ms thresholdgetstream/video/rtc/track_util.py (1)
95-99: Dead code: G.711 encoding constants are unused.These constants (
MULAW_ENCODE_BIAS,MULAW_MAX,ALAW_ENCODE_BIAS,ALAW_MAX) are defined but never referenced in the codebase. G.711 encoding is handled by PyAV codecs (lines 843-848) rather than manual encoding, making these constants unnecessary.Consider removing the unused constants:
-# G.711 encoding constants -MULAW_ENCODE_BIAS = 33 -MULAW_MAX = 32635 -ALAW_ENCODE_BIAS = 33 -ALAW_MAX = 32635 -
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
getstream/video/rtc/track_util.py(6 hunks)tests/rtc/test_pcm_data.py(2 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*test*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*test*.py: Use fixtures to inject objects in tests
Keep tests well organized and use test classes for similar tests
Tests that rely on file assets should use files fromtests/assets/folder; new files should be added there, existing ones reused if possible
Do not use mocks or mock things in tests unless explicitly asked to
Files:
tests/rtc/test_pcm_data.py
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Plugins that work with audio, video, or WebRTC functionality should depend on
getstream[webrtc]instead of justgetstreamto access WebRTC-related dependencies likeaiortc,numpy,torch,torchaudio,soundfile,scipy,deepgram-sdk, andelevenlabs
Files:
tests/rtc/test_pcm_data.pygetstream/video/rtc/track_util.py
**/test_*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/test_*.py: Use fixtures to inject objects in tests; test client fixtures can use the Stream API client
Load credentials from .env file; the client fixture will load credentials automatically
Keep tests well organized and use test classes to group similar tests
Tests that rely on file assets should use files fromtests/assets/folder; add new files there and reuse existing ones when possible. Do not use files larger than 256 kilobytes
Do not use mocks or mock objects in tests unless directly requested
Files:
tests/rtc/test_pcm_data.py
🧠 Learnings (3)
📚 Learning: 2025-12-02T12:16:45.739Z
Learnt from: CR
Repo: GetStream/stream-py PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-02T12:16:45.739Z
Learning: Applies to **/*test*.py : Tests that rely on file assets should use files from `tests/assets/` folder; new files should be added there, existing ones reused if possible
Applied to files:
tests/rtc/test_pcm_data.py
📚 Learning: 2025-12-02T12:16:57.243Z
Learnt from: CR
Repo: GetStream/stream-py PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-02T12:16:57.243Z
Learning: Applies to **/test_*.py : Tests that rely on file assets should use files from `tests/assets/` folder; add new files there and reuse existing ones when possible. Do not use files larger than 256 kilobytes
Applied to files:
tests/rtc/test_pcm_data.py
📚 Learning: 2025-12-02T12:16:57.243Z
Learnt from: CR
Repo: GetStream/stream-py PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-02T12:16:57.243Z
Learning: Applies to **/*.py : Plugins that work with audio, video, or WebRTC functionality should depend on `getstream[webrtc]` instead of just `getstream` to access WebRTC-related dependencies like `aiortc`, `numpy`, `torch`, `torchaudio`, `soundfile`, `scipy`, `deepgram-sdk`, and `elevenlabs`
Applied to files:
getstream/video/rtc/track_util.py
🧬 Code graph analysis (2)
tests/rtc/test_pcm_data.py (1)
getstream/video/rtc/g711.py (1)
G711Encoding(23-35)
getstream/video/rtc/track_util.py (2)
getstream/video/rtc/g711.py (2)
G711Encoding(23-35)G711Mapping(8-20)getstream/video/rtc/recording.py (1)
AudioFrame(64-70)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Tests (3.13)
- GitHub Check: Tests (3.12)
- GitHub Check: Tests (3.10)
- GitHub Check: Tests (3.11)
🔇 Additional comments (6)
tests/rtc/test_pcm_data.py (2)
1751-1916: LGTM! Comprehensive G.711 test coverage.The G.711 tests thoroughly cover:
- μ-law and A-law decoding (basic cases)
- Base64 encoding/decoding with proper error handling for invalid encoding combinations
- Custom sample rates and stereo channels
- Encoding to G.711 bytes
- Roundtrip encode/decode verification
The error handling tests (lines 1807-1817) correctly verify that string input with
encoding="raw"raisesTypeErrorfor both the string literal and enum variants, ensuring consistent behavior.
1918-1970: LGTM! Integration test correctly uses temporary directory.The integration test now properly uses pytest's
tmp_pathfixture instead of writing totests/assets/. This addresses the previous review concern about:
- Git pollution from uncommitted test artifacts
- CI environment compatibility with read-only filesystems
- Avoiding race conditions in parallel test execution
The temporary directory is automatically cleaned up by pytest after the test completes.
getstream/video/rtc/track_util.py (4)
660-669: LGTM! Smart hybrid resampling strategy.The modification to use PyAV resampling for audio longer than 500ms is a good optimization. The comment clearly explains the trade-off: PyAV provides higher quality but is stateful and unsuitable for small real-time chunks (e.g., WebRTC's 20ms frames), while the in-house resampler handles short chunks with lower latency.
671-712: LGTM! PyAV resampler correctly preserves audio format.The
_resample_with_pyavmethod properly preserves the original audio format:
- Float32 input →
fltp(float32 planar) PyAV format → Float32 output- Int16 input →
s16p(int16 planar) PyAV format → Int16 outputThis addresses the previous concern about hardcoding
s16pand losing float32 quality. The format detection (lines 680-685) and result format assignment (lines 703-706) ensure audio quality is maintained through the resampling process.
772-814: LGTM! Correct conversion of PcmData to PyAV AudioFrame.The
to_av_framemethod properly handles:
- Format conversion preserving quality (f32 → fltp, s16 → s16p)
- Shape normalization to PyAV's expected (channels, samples) format
- Channel layout configuration (mono/stereo)
- Sample rate preservation
This provides the foundation for PyAV-based operations like resampling and G.711 encoding.
816-886: LGTM! G.711 encoding implementation is correct.The G.711 encoding implementation properly:
- Resamples and converts to int16 before encoding (line 840)
- Selects the appropriate codec (pcm_mulaw/pcm_alaw) based on mapping (lines 843-846)
- Configures the codec with correct parameters (lines 868-872)
- Opens the codec before encoding (line 873) - this addresses the previous review concern
- Flushes the encoder to get remaining buffered data (lines 881-884)
The addition of
codec.open()at line 873 ensures proper codec initialization across all PyAV versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
getstream/video/rtc/track_util.py (1)
571-573: Docstring and example still incorrect per previous review.The docstring claim at lines 571-573 that "encoding is automatically set to BASE64" remains false—the implementation requires callers to explicitly pass
encoding="base64"when using string input. The example at line 586 will raiseTypeErrorbecause it omits this required parameter.This issue was flagged in a previous review and shows a GitHub Actions error, but appears unresolved.
Also applies to: 586-586
🧹 Nitpick comments (1)
getstream/video/rtc/track_util.py (1)
95-99: Unused G.711 constants—consider removing or documenting.These module-level constants are defined but never referenced in the implementation. The G.711 encoding/decoding uses PyAV codecs and lookup tables rather than these values. If they're intended as part of the public API for external use, add docstrings to explain their purpose. Otherwise, remove them to reduce maintenance burden.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
getstream/video/rtc/track_util.py(6 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Plugins that work with audio, video, or WebRTC functionality should depend on
getstream[webrtc]instead of justgetstreamto access WebRTC-related dependencies likeaiortc,numpy,torch,torchaudio,soundfile,scipy,deepgram-sdk, andelevenlabs
Files:
getstream/video/rtc/track_util.py
🧠 Learnings (1)
📚 Learning: 2025-12-02T12:16:57.243Z
Learnt from: CR
Repo: GetStream/stream-py PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-02T12:16:57.243Z
Learning: Applies to **/*.py : Plugins that work with audio, video, or WebRTC functionality should depend on `getstream[webrtc]` instead of just `getstream` to access WebRTC-related dependencies like `aiortc`, `numpy`, `torch`, `torchaudio`, `soundfile`, `scipy`, `deepgram-sdk`, and `elevenlabs`
Applied to files:
getstream/video/rtc/track_util.py
🧬 Code graph analysis (1)
getstream/video/rtc/track_util.py (3)
getstream/video/rtc/g711.py (2)
G711Encoding(23-35)G711Mapping(8-20)getstream/video/rtc/recording.py (1)
AudioFrame(64-70)getstream/video/rtc/__init__.py (1)
join(57-78)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Tests (3.12)
- GitHub Check: Tests (3.11)
- GitHub Check: Tests (3.13)
- GitHub Check: Tests (3.10)
🔇 Additional comments (3)
getstream/video/rtc/track_util.py (3)
660-712: LGTM!The PyAV resampler integration is well-implemented: properly preserves audio format (f32/s16), uses appropriate format mappings (fltp/s16p), flushes the resampler to capture buffered samples, and correctly assembles results across multiple frames. The 0.5-second threshold for choosing the high-quality path is reasonable.
772-814: LGTM!The
to_av_frame()method correctly converts PcmData to PyAV AudioFrame: preserves the original format (f32→fltp, s16→s16p), handles both 1D and 2D sample arrays, and properly sets up the frame with the correct layout and sample rate.
816-886: LGTM!The G.711 encoding implementation is correct: properly resamples to target rate and converts to int16, sets up the PyAV codec context with correct parameters (format="s16", proper layout and sample_rate, appropriate time_base), calls
codec.open()before encoding, and properly flushes the encoder to capture buffered packets.
PcmData.from_g711()for decoding G.711 (μ-law/A-law) to PCMPcmData.g711_bytes()for encoding PCM to G.711 using PyAV codecsPcmData.resample()to use PyAV resampler whenduration > 0.5sPcmData.to_av_frame()helper methodSummary by CodeRabbit
New Features
Tests
✏️ Tip: You can customize this high-level summary in your review settings.