
Commit 740b9cf

Add captureTimestamp and senderCaptureTimeOffset to frame metadata
Fixes #225
1 parent 4b61373 commit 740b9cf

1 file changed: +79 -0 lines changed

index.bs

@@ -48,6 +48,11 @@ spec:webidl; type:dfn; text:resolve
 "CloneArrayBuffer": {
 "href": "https://tc39.es/ecma262/#sec-clonearraybuffer",
 "title": "CloneArrayBuffer"
+},
+"RTP-EXT-CAPTURE-TIME": {
+"href": "https://webrtc.googlesource.com/src/+/refs/heads/main/docs/native-code/rtp-hdrext/abs-capture-time",
+"title": "RTP Header Extension for Absolute Capture Time",
+"publisher": "WebRTC Project"
 }
 }
 </pre>
@@ -134,6 +139,20 @@ The <dfn abstract-op>readEncodedData</dfn> algorithm is given a |rtcObject| as p
 1. Let |frame| be the newly produced frame.
 1. Set |frame|.`[[owner]]` to |rtcObject|.
 1. Set |frame|.`[[counter]]` to |rtcObject|.`[[lastEnqueuedFrameCounter]]`.
+1. If the frame has been produced by an {{RTCRtpReceiver}}:
+    1. If the relevant RTP packet contains the
+       [[RTP-EXT-CAPTURE-TIME|RTP Header Extension for Absolute Capture Time]], set |frame|.`[[captureTimestamp]]` to the
+       [[RTP-EXT-CAPTURE-TIME#absolute-capture-timestamp|absolute capture timestamp]] field and set |frame|.`[[senderCaptureTimeOffset]]`
+       to the [[RTP-EXT-CAPTURE-TIME#estimated-capture-clock-offset|capture clock offset field]] if it is present.
+    1. Otherwise, if the relevant RTP packet does not contain the
+       [[RTP-EXT-CAPTURE-TIME|RTP Header Extension for Absolute Capture Time]] but a previous RTP packet did,
+       set |frame|.`[[captureTimestamp]]` to the result of calculating the absolute capture timestamp according to
+       [[RTP-EXT-CAPTURE-TIME#timestamp-interpolation|timestamp interpolation]] and set |frame|.`[[senderCaptureTimeOffset]]`
+       to the most recent value that was present.
+    1. Otherwise, set |frame|.`[[captureTimestamp]]` to undefined and set |frame|.`[[senderCaptureTimeOffset]]` to undefined.
+1. If the frame has been produced by an {{RTCRtpSender}}, set |frame|.`[[captureTimestamp]]` to the capture timestamp
+   obtained using the methodology described in [[RTP-EXT-CAPTURE-TIME#absolute-capture-timestamp]] and set |frame|.`[[senderCaptureTimeOffset]]`
+   to undefined.
 1. [=ReadableStream/Enqueue=] |frame| in |rtcObject|.`[[readable]]`.
 
 The <dfn abstract-op>writeEncodedData</dfn> algorithm is given a |rtcObject| as parameter and a |frame| as input. It is defined by running the following steps:
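
The receiver branch added to readEncodedData in the hunk above can be illustrated with a small sketch. It is not part of the commit. All type and function names here are hypothetical, timestamps are assumed to be already converted to milliseconds, and the interpolation step is assumed to be a linear extrapolation from the last packet that carried the extension, scaled by the RTP clock rate. A packet that carries the extension without the offset field is assumed to yield an undefined offset.

```typescript
// Hypothetical types: none of these names are defined by the specification.
interface AbsCaptureTimeExtension {
  absoluteCaptureTimestampMs: number; // capture system's NTP clock, in ms (assumed unit)
  captureClockOffsetMs?: number;      // optional field of the extension, in ms
}

interface PacketInfo {
  rtpTimestamp: number;                // unsigned 32-bit RTP timestamp
  extension?: AbsCaptureTimeExtension; // absent on packets without the extension
}

interface CaptureSlots {
  captureTimestamp: number | undefined;
  senderCaptureTimeOffset: number | undefined;
}

// Returns a function that fills the two slots for each frame of one received stream.
function createReceiverCaptureTimeTracker(
  clockRateHz: number,
): (packet: PacketInfo) => CaptureSlots {
  let lastRtpTimestamp: number | undefined;
  let lastCaptureTimestampMs: number | undefined;
  let lastOffsetMs: number | undefined;

  return (packet: PacketInfo): CaptureSlots => {
    const ext = packet.extension;
    if (ext !== undefined) {
      // Case 1: the packet carries the extension, so its fields are used directly.
      lastRtpTimestamp = packet.rtpTimestamp;
      lastCaptureTimestampMs = ext.absoluteCaptureTimestampMs;
      if (ext.captureClockOffsetMs !== undefined) {
        lastOffsetMs = ext.captureClockOffsetMs;
      }
      return {
        captureTimestamp: ext.absoluteCaptureTimestampMs,
        senderCaptureTimeOffset: ext.captureClockOffsetMs,
      };
    }
    if (lastRtpTimestamp !== undefined && lastCaptureTimestampMs !== undefined) {
      // Case 2: no extension on this packet, but an earlier packet had one.
      // `| 0` yields a signed 32-bit difference, which tolerates RTP timestamp wraparound.
      const deltaTicks = (packet.rtpTimestamp - lastRtpTimestamp) | 0;
      return {
        captureTimestamp: lastCaptureTimestampMs + (deltaTicks * 1000) / clockRateHz,
        senderCaptureTimeOffset: lastOffsetMs, // most recent offset that was present
      };
    }
    // Case 3: the extension has never been seen on this stream.
    return { captureTimestamp: undefined, senderCaptureTimeOffset: undefined };
  };
}
```

Keeping the state per stream in a closure mirrors the idea that interpolation depends only on the most recent packet that carried the extension.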
@@ -293,6 +312,10 @@ The <dfn method for="SFrameTransform">setEncryptionKey(|key|, |keyID|)</dfn> met
 
 # RTCRtpScriptTransform # {#scriptTransform}
 
+In this section, the capture system refers to the system where media is sourced from and the sender system
+refers to the system that is sending RTP and RTCP packets to the receiver system where {{RTCEncodedVideoFrameMetadata}} data
+or {{RTCEncodedAudioFrameMetadata}} data is populated.
+
 ## <dfn enum>RTCEncodedVideoFrameType</dfn> dictionary ## {#RTCEncodedVideoFrameType}
 <pre class="idl">
 // New enum for video frame types. Will eventually re-use the equivalent defined
@@ -358,6 +381,8 @@ dictionary RTCEncodedVideoFrameMetadata {
 sequence&lt;unsigned long&gt; contributingSources;
 long long timestamp; // microseconds
 unsigned long rtpTimestamp;
+DOMHighResTimeStamp captureTimestamp;
+DOMHighResTimeStamp senderCaptureTimeOffset;
 DOMString mimeType;
 };
 </pre>
@@ -431,6 +456,32 @@ dictionary RTCEncodedVideoFrameMetadata {
 that reflects the sampling instant of the first octet in the RTP data packet.
 </p>
 </dd>
+<dt>
+<dfn dict-member>captureTimestamp</dfn> <span class="idlMemberType">DOMHighResTimeStamp</span>
+</dt>
+<dd>
+<p>
+The {{RTCEncodedVideoFrameMetadata/captureTimestamp}} is set by the frame source, and for frames that come
+from the {{RTCRtpReceiver}}, it is extracted by the [[#stream-processing]] algorithm. Its reference clock
+is the capture system's NTP clock (the same clock used to generate NTP timestamps for RTCP sender reports on
+that system).
+
+On populating this member, the user agent MUST return the value of the frame's `[[captureTimestamp]]` slot.
+</p>
+</dd>
+<dt>
+<dfn dict-member>senderCaptureTimeOffset</dfn> <span class="idlMemberType">DOMHighResTimeStamp</span>
+</dt>
+<dd>
+<p>
+The {{RTCEncodedVideoFrameMetadata/senderCaptureTimeOffset}} is the sender system's estimate of the offset
+between its own NTP clock and the capture system's NTP clock, for the same frame that the
+{{RTCEncodedVideoFrameMetadata/captureTimestamp}} originated from. It is extracted by the
+[[#stream-processing]] algorithm.
+
+On populating this member, the user agent MUST return the value of the frame's `[[senderCaptureTimeOffset]]` slot.
+</p>
+</dd>
 <dt>
 <dfn dict-member>mimeType</dfn> <span class="idlMemberType">DOMString</span>
 </dt>
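
As an illustration of how the two new members relate, the sketch below re-expresses a frame's capture time on the sender system's clock. It is not part of the commit. It assumes the offset convention "capture clock = sender clock + offset", which is my reading of the header extension's documentation and should be checked against it. `FrameLike` and `captureTimeOnSenderClock` are hypothetical names, and only the existing `getMetadata()` method of encoded frames is relied upon.

```typescript
// Hypothetical structural types covering only the members added by this commit.
interface CaptureTimeMetadata {
  captureTimestamp?: number;        // capture system's NTP clock
  senderCaptureTimeOffset?: number; // sender's estimate of the clock offset
}

interface FrameLike {
  getMetadata(): CaptureTimeMetadata;
}

// Returns the capture time expressed on the sender system's clock, or
// undefined when either member is missing from the frame's metadata.
function captureTimeOnSenderClock(frame: FrameLike): number | undefined {
  const { captureTimestamp, senderCaptureTimeOffset } = frame.getMetadata();
  if (captureTimestamp === undefined || senderCaptureTimeOffset === undefined) {
    return undefined;
  }
  // Assumed convention: capture clock = sender clock + offset,
  // hence sender clock = capture clock - offset.
  return captureTimestamp - senderCaptureTimeOffset;
}
```

Inside a transform, a frame read from the readable side could be passed to this helper directly, since the helper only calls `getMetadata()`.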
@@ -611,6 +662,8 @@ dictionary RTCEncodedAudioFrameMetadata {
 sequence&lt;unsigned long&gt; contributingSources;
 short sequenceNumber;
 unsigned long rtpTimestamp;
+DOMHighResTimeStamp captureTimestamp;
+DOMHighResTimeStamp senderCaptureTimeOffset;
 DOMString mimeType;
 };
 </pre>
@@ -664,6 +717,32 @@ dictionary RTCEncodedAudioFrameMetadata {
 that reflects the sampling instant of the first octet in the RTP data packet.
 </p>
 </dd>
+<dt>
+<dfn dict-member>captureTimestamp</dfn> <span class="idlMemberType">DOMHighResTimeStamp</span>
+</dt>
+<dd>
+<p>
+The {{RTCEncodedAudioFrameMetadata/captureTimestamp}} is set by the frame source, and for frames that come
+from the {{RTCRtpReceiver}}, it is extracted by the [[#stream-processing]] algorithm. Its reference clock
+is the capture system's NTP clock (the same clock used to generate NTP timestamps for RTCP sender reports on
+that system).
+
+On populating this member, the user agent MUST return the value of the frame's `[[captureTimestamp]]` slot.
+</p>
+</dd>
+<dt>
+<dfn dict-member>senderCaptureTimeOffset</dfn> <span class="idlMemberType">DOMHighResTimeStamp</span>
+</dt>
+<dd>
+<p>
+The {{RTCEncodedAudioFrameMetadata/senderCaptureTimeOffset}} is the sender system's estimate of the offset
+between its own NTP clock and the capture system's NTP clock, for the same frame that the
+{{RTCEncodedAudioFrameMetadata/captureTimestamp}} originated from. It is extracted by the
+[[#stream-processing]] algorithm.
+
+On populating this member, the user agent MUST return the value of the frame's `[[senderCaptureTimeOffset]]` slot.
+</p>
+</dd>
 <dt>
 <dfn dict-member>mimeType</dfn> <span class="idlMemberType">DOMString</span>
 </dt>
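
Both members are tied to NTP clocks, and the referenced header extension carries them as fixed-point NTP values: an unsigned UQ32.32 timestamp, optionally followed by a signed Q32.32 offset. The sketch below decodes those fields into milliseconds and is not part of the commit. The function and type names are hypothetical, the input is assumed to be the extension's data bytes without the RTP extension header, and the way a user agent maps the result onto a `DOMHighResTimeStamp` is not specified in this diff, so the output is simply milliseconds on the NTP clock.

```typescript
// Hypothetical result type; values are milliseconds on the NTP clock.
interface DecodedAbsCaptureTime {
  captureTimestampMs: number;
  captureClockOffsetMs: number | undefined;
}

const TWO_POW_32 = 4294967296;

// Decodes 8 bytes (timestamp only) or 16 bytes (timestamp plus offset),
// both in network byte order.
function decodeAbsCaptureTime(data: Uint8Array): DecodedAbsCaptureTime {
  if (data.byteLength !== 8 && data.byteLength !== 16) {
    throw new RangeError("expected 8 or 16 bytes of extension data");
  }
  const view = new DataView(data.buffer, data.byteOffset, data.byteLength);

  // UQ32.32: upper 32 bits are whole seconds, lower 32 bits are the fraction.
  const seconds = view.getUint32(0);
  const fraction = view.getUint32(4);
  const captureTimestampMs = (seconds + fraction / TWO_POW_32) * 1000;

  let captureClockOffsetMs: number | undefined;
  if (data.byteLength === 16) {
    // Q32.32 in two's complement: read the upper half as signed, the lower as unsigned.
    const high = view.getInt32(8);
    const low = view.getUint32(12);
    captureClockOffsetMs = (high + low / TWO_POW_32) * 1000;
  }
  return { captureTimestampMs, captureClockOffsetMs };
}
```

Reading the value as two 32-bit halves avoids 64-bit integer arithmetic, at the cost of the usual double-precision rounding for very large timestamps.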
