Add PSNR (Y/U/V) for outbound-rtp #794
Conversation
This is similar to qpSum but codec-independent. Since PSNR requires additional computation, it is defined with an accompanying psnrMeasurements counter to allow the computation of an average PSNR. Defined as three components for the Y, U and V planes respectively. See also https://datatracker.ietf.org/doc/html/rfc8761#section-5
@henbos ^
This needs to be presented at the next virtual interim. Youenn mentions he'd like to hear about use cases for the metric.
https://www.researchgate.net/publication/383545049_Low-Complexity_Video_PSNR_Measurement_in_Real-Time_Communication_Products has a whole paper about this. tl;dr: QP is codec-dependent, PSNR is not (but comes at a cost, hence this cannot be a simple sum). @youennf the folks who implemented https://developer.apple.com/documentation/videotoolbox/kvtcompressionpropertykey_calculatemeansquarederror?changes=l_4_8 might be able to tell you more too. cc @taste1981
I have not read the paper; is a preprint available? But based on my own interactions with PSNR, there is a source and a decoded image. Is the measurement on the outbound-rtp between the source and the encoded image, i.e., the PSNR due to the encoder?
This one is encoder PSNR, not scaling PSNR. Scaling PSNR would end up living on media-source in stats.
The other thing that I am curious about is whether the PSNR requires decoding the encoded video, or whether it is calculated as part of the encoder operation. Mainly the impact on CPU if it requires some kind of decode step; I wonder if this is only calculated for I-frames or huge frames.
@sprangerik have you looked at this?
@jesup thoughts?
I am supportive of this. For context, see also https://webrtc-review.googlesource.com/c/src/+/368960
The idea is to have it as part of the encoder process. The encoder is by definition also a decoder, so it can directly use both the raw input and reconstructed state without penalty. The actual PSNR calculation will of course often incur an extra CPU hit, unless it is already a part of e.g. a rate-distortion aware rate controller - but that's not often the case for real-time encoders. That's why it's proposed to limit the frequency of PSNR calculations. This of course means the user cannot count on PSNR metrics being populated. Even for a given stream, the PSNR values might suddenly disappear if e.g. there is a software/hardware switching event and only one implementation supports PSNR output.
Since webrtc-stats aggregates values anyway, we could do a sumPsnr and countFrames, i.e., each time a PSNR is calculated it is added and the corresponding frame count counter goes up. If it is done for all frames, we would not need a frame counter.
This issue was discussed in the WebRTC February 2025 meeting – (#794 Add PSNR)
This seems like a nice feature that could have a few uses. I do wonder if it could be a separate API instead of part of the outbound RTP stats. My initial concerns are calculating this data regardless of whether the application is even interested in it, and the lack of any specification or recommendation on the frequency of measurements. Some pros and cons come to mind if this were implemented as a separate API instead.
Cons:
If this should remain in the stats could we consider adding some sort of getStats object to enable logging for this kind of data?
WebRTC users routinely log getStats data, so adding this would not be any big overhead. If the stats are collected on a timescale of seconds, the overhead is usually negligible. (Polling stats for every frame is not a good idea.)
If I understand correctly, the concern is the overhead of the browser doing an expensive calculation most websites would never request (though per-frame is not an issue due to caching, per-second might be; is never an acceptable frequency?). How expensive is this computation? Our web stats model is like a boat we keep loading with new stuff. Eventually, it becomes problematic. At some point (maybe now?) might we wish we had something like this? await sender.getStats({verbosity: "high"}) // low | medium (default) | high
I don't think WebRTC has to do these measurements very often for the PSNR measurements to be valuable and if they aren't done very often (say every second or every several seconds) then I don't think we need to make API changes. A similar example is that if you negotiate corruption-detection we do corruptionMeasurements, but since we only make these once per second they don't have any significant performance implications compared to the rest of the decoding pipeline.
Btw this is unrelated to the polling frequency since the metrics only update when a measurement is made and a measurement happens in the background whether or not the app is polling getStats. (Polling getStats several times per second is bad because of the overhead of that call, not because of counters incrementing in the background.)
One concern is this stat seems to require making two getStats calls over some interval. E.g. is the use case here to try one encoder setting, get stats, then wait 1 second and call getStats again expecting two different measurements? If so, this might cause a divide-by-zero error in one browser but not another.
All metrics in the getStats API are used like so: "delta foo / delta bar". That is true whether it is a rate (delta bytesSent / delta timestamp), a measurement thingy (delta totalCorruptionProbability / delta corruptionMeasurements), something more exotic like (delta qpSum / delta framesEncoded), or even (jitterBufferDelay / jitterBufferEmittedCount). I could go on with more examples, but "divide by zero" is something that the user of this API should be aware of.
qpSum / framesDecoded might be a better example of a foot gun, since that could fail when the network glitches but not in a stable environment. In practice web developers will write helper functions that do lookups of deltas and rates, taking care of the foot guns. Also, you have to be prepared for a metric suddenly not being present.
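For illustration, a sketch of what such a helper might look like. The helper name, and the plain-object snapshot shape, are made up for this example; real code would read fields off RTCStats dictionaries.

// Hypothetical helper: compute "delta numerator / delta denominator" between
// two stats snapshots. Returns undefined when the metric is absent or when
// the denominator did not advance (the divide-by-zero foot gun).
function deltaRatio(prev, curr, numKey, denKey) {
  if (!(numKey in prev) || !(numKey in curr)) return undefined;
  const dNum = curr[numKey] - prev[numKey];
  const dDen = curr[denKey] - prev[denKey];
  if (!(dDen > 0)) return undefined; // also catches a missing denominator (NaN)
  return dNum / dDen;
}

E.g. deltaRatio(ortp1, ortp2, "qpSum", "framesEncoded") yields an average QP for the interval, or undefined when no frames were encoded.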
Yes, I didn't mean to suggest the divide-by-zero hazard was limited to this API. I think the concern in this case is:
PSNR is similar to QP, so having it in getStats makes sense. As the paper says, we have done this at a frequency higher than once per second on devices where battery consumption is a concern, and it works there. Hardware encoder support makes this even "cheaper". Note that the calculation is done by the encoder, so it cannot be triggered by calling getStats with some magic option. I considered whether it was possible to gate it on the corruption detection RTP header extension, but that would have been quite awkward since it is not closely related (not without precedent; quite a few statistics depend on header extensions). When I say "A/B testing", consider a project like Jitsi moving to AV1, in particular the "Metrics Captured" section which, unsurprisingly, relies on getStats. See here for how one uses PSNR to evaluate when it is available. Such experiments are designed not to compare 🍎 to 🍌 (different browsers, different operating systems), so letting a UA decide on sampling frequency is not a concern as long as it does so consistently.
Polling is just asking "do you have any new measurements for me?" It doesn't matter if the app polling interval and the browser measurement interval align or not, and it's clear from the guidelines that there is no control of the sampling period. So I would argue that the only thing that matters is whether the measurements arrive at a granular enough level to be useful. If the concern is that a browser implementer doesn't know what a useful measurement interval is, maybe we can provide some guidance there, but I fail to see the interop issue with different polling intervals that are all within a "useful" range. FTR I think 15 seconds is too large an interval since a lot can happen in that period of time.
I would not even poll getStats for A/B testing purposes. One would typically poll periodically and use the last result, or call getStats explicitly before closing the peerconnection, and then calculate the average PSNR as psnrSum_{y,u,v}/psnrMeasurements. Only calls with enough psnrMeasurements should be taken into account, which one needs to do irrespective of sampling frequency to exclude "short calls". (While we are rambling: it seems Firefox throws when calling getStats on a closed peerconnection, which is not my understanding of #3; arguably, with all the transceivers gone, all the interesting stats disappear nowadays.)
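A sketch of that end-of-call computation. Function name and the minimum-measurements threshold are made up; the psnrSum/psnrMeasurements keys follow this PR's proposal.

// End-of-call telemetry sketch: average per-plane PSNR from a final
// outbound-rtp stats object. minMeasurements (an arbitrary example value)
// excludes short calls where the average would just be noise.
function averagePsnr(outboundRtp, minMeasurements = 10) {
  const {psnrSum, psnrMeasurements} = outboundRtp;
  if (!psnrSum || !(psnrMeasurements >= minMeasurements)) return undefined;
  return {
    y: psnrSum.y / psnrMeasurements,
    u: psnrSum.u / psnrMeasurements,
    v: psnrSum.v / psnrMeasurements,
  };
}

Returning undefined (rather than throwing or returning zero) keeps "too few measurements" distinguishable from a genuinely low PSNR.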
That's fine for telemetry. I think our concern was more someone making runtime decisions off stats, e.g.:

// probe and switch to best codec for media being sent right now:
let bestCodec, bestY = 0;
for (const codec of sender.getParameters().codecs) {
  const params = sender.getParameters();
  params.encodings[0].codec = codec;
  await sender.setParameters(params);
  await wait(1000);
  const ortp1 = [...(await sender.getStats()).values()].find(({type}) => type == "outbound-rtp");
  await wait(1000);
  const ortp2 = [...(await sender.getStats()).values()].find(({type}) => type == "outbound-rtp");
  const y = (ortp2.psnrSum.y - ortp1.psnrSum.y) / (ortp2.psnrMeasurements - ortp1.psnrMeasurements);
  if (bestY < y) { bestY = y; bestCodec = codec; }
}
const params = sender.getParameters();
params.encodings[0].codec = bestCodec;
await sender.setParameters(params);
This might be good for an implementer to know.
I can replace that with "is implementation-defined", linking to https://infra.spec.whatwg.org/#implementation-defined, and copy @sprangerik's great "these metrics should primarily be used as a basis for statistical analysis rather than be used as an absolute truth on a per-frame basis".
We could say that the PSNR measurement frequency is implementation-specific, but that "the user agent SHOULD make PSNR measurements no less frequently than every X seconds, if PSNR measurements are supported for the current encoder implementation". I still think the app needs to handle the case where a PSNR measurement is not available for a given encoder implementation or browser, but this would give the app an upper bound on how long to wait.
From what I can see, browser implementations would boil down to the same line of code in libWebRTC for software encoders, no? None of the parameters for video encoding are "specified"; why do we need to be prescriptive here? Updated along the lines of #794 (comment)
webrtc-stats.html
Outdated
<p>
The PSNR is defined in [[ISO-29170-1:2017]].
</p>
This seems redundant with the same text under psnrSum (line 2235), which already links to here.
Suggested change:
-<p>
-The PSNR is defined in [[ISO-29170-1:2017]].
-</p>
webrtc-stats.html
Outdated
<p class="note">
PSNR metrics should primarily be used as a basis for statistical analysis rather
than be used as an absolute truth on a per-frame basis.
The frequency of PSNR measurements is [=implementation-defined=].
</p>
We want to avoid "should" and normative statements in non-normative notes. If we want to be normative we should lift it out.
Based on #794 (comment) are we comfortable picking a min frequency of every 5 seconds?
(Edit: added "or the encoding frame rate whichever is lower")
Suggested change:
-<p class="note">
-PSNR metrics should primarily be used as a basis for statistical analysis rather
-than be used as an absolute truth on a per-frame basis.
-The frequency of PSNR measurements is [=implementation-defined=].
-</p>
+<p>
+If the current encoder supports taking PSNR measurements, their
+frequency SHOULD be no less than every 5 seconds or the
+encoding frame rate, whichever is lower.
+</p>
+<p class="note">
+This allows for testing. PSNR measurements are intended for
+statistical analysis, and aren't expected to be accurate down
+to a frame.
+</p>
Can you explain why 5 seconds? keyFramesEncoded and keyFramesDecoded are examples of very-low-frequency events happening already, where you cannot make an assumption about the minimum interval between two getStats calls that will give you an increase. Same for packetsLost.
What value would you like? The frequency of keyFramesEncoded and keyFramesDecoded is determined by external factors unlikely to vary by user agent.
5 is arbitrary. In this thread I heard that 15 seconds was too high and that 1 second was not too expensive, but also that we'd rather not overconstrain implementations too much. 10 seconds?
This value, whatever we pick, would go into WPT and also give web developers who for some reason can't wait a minimum time to wait in order to be interoperable.
Hmm, maybe we need to also qualify that the encoder is actually encoding something? E.g. if the track is muted or is a canvas track, then the frame rate may be less than whatever value we pick. Maybe we add "... or the encoding frame rate, whichever is lower"?
tl;dr: this is a parameter of the video encoder configuration. It becomes observable in stats just like the others.
What the spec should say is "implementation defined"; the implied warning is "do not compare apples to oranges".
The frequency of keyFramesEncoded is a good example: it is determined by encoder settings such as GOP size. One wants this as large as possible for good performance, but this is something where the encoder can be tuned without "interoperability" constraints, since the decoder behavior is required to be flexible.
If one wants to get fancy about PSNR, one may need to take into account things like "is this a screen sharing track" (higher resolution, lower frame rate and wholly different content) or the frame rate (which may depend on BWE). Worth the effort for statistical analysis? Unlikely.
15 seconds was arbitrary too, I think, picked in the spirit of "I don't think this makes sense anymore". The one second picked in the code is 3x the value from the paper (which has a parameter study) and hence 1/3rd as expensive in terms of power impact.
A SHOULD implies implementation-defined. Is there a value that would be satisfactory?
Moved the normative statement out of the note (and moved the note to the actual values)
From editors meeting: Main reason for a number seems to be feature detection - people want to know that if they have waited this long between calls to getStats, and the frame counter has increased by X, and the stat doesn't show up, the feature is off. However, feature detection can be done by insisting that the stat is visible and with value zero if supported.
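A sketch of that feature-detection approach. Function names are made up; the psnrMeasurements key follows this PR's proposal, and the "present with value zero when supported" behavior is the editors' suggestion above, not shipped behavior.

// Feature detection per the editors' suggestion: psnrMeasurements is present
// (initially 0) when the current encoder supports PSNR, absent otherwise.
function psnrSupported(outboundRtp) {
  return outboundRtp.psnrMeasurements !== undefined;
}

// Distinguish "unsupported" from "supported but nothing measured yet".
function psnrState(outboundRtp) {
  if (!psnrSupported(outboundRtp)) return "unsupported";
  return outboundRtp.psnrMeasurements > 0 ? "measured" : "pending";
}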
If we have the psnr measurement frequency be implementation-defined, the spec should help the UA developer select a good frequency value. Spec should add some guidelines, for instance doing measurements every second or every 30 frames or so.
Huh? This is a counter; it cannot disappear, it can only stop increasing.
I think the gist of it is that we want the frequency to be as high as possible as long as the performance impact can be kept negligible. Perhaps we can add some text to that effect. Trying to give guidelines in terms of the expected frequency doesn't seem helpful to UA developers; on the other hand, those developers are probably in the best position to determine at which frequency the performance hit becomes non-negligible for their particular encoder implementations.
That's great, because I don't think anyone is proposing a ceiling on frequency. A floor would be nice though. I disagree that giving guidance is bad, since what you wrote sounds like good guidance already. Let's write it down.
As we say in German, "Guter Rat ist teuer" ("good advice is expensive"). The guidance should be "talk to your video encoding guys". That is what "implementation-defined" is for, no?
This naturally leads to privacy questions if the selection of the rate depends on the encoder, the system load, the CPU...
Who is proposing this?
How is this a new concern, given that video codecs in WebRTC already adapt to CPU, thermal state, etc.? Which is not described by any specification either. Note that PSNR should be possible to compute with JS and existing APIs already. And you are most welcome to invite privacy guys into your meetings, obviously.
webrtc-stats.html
Outdated
PSNR is defined in [[ISO-29170-1:2017]].
</p>
<p class="note">
PSNR metrics should primarily be used as a basis for statistical analysis rather
Avoid "should" in non-normative notes. How about

Suggested change:
-PSNR metrics should primarily be used as a basis for statistical analysis rather
+Authors are expected to use PSNR metrics primarily as a basis for statistical analysis rather
https://w3c.github.io/webrtc-stats/#dom-rtcinboundrtpstreamstats-totalcorruptionprobability -- literally copied from this note, can you explain why it is ok there but not here?
I missed it in review of #788. It's not a hard rule, but most WG editors I've spoken with agree that avoiding lowercase requirement-laden words avoids confusion and improves readability.
Both notes appear to be speaking to authors rather than implementers, which is fine. But the primary audience of specs is implementers, so it might help to clarify when speaking to someone else. Specs have no authority to say authors should or shouldn't do anything, so it seems more accurate to describe the usage the design anticipates, which is how I interpret these notes.
No, this note documents that implementers (well, the one) do not think authors should be abusing this for per-frame analysis. 810d67d avoids lower-case should.
</p>
<p>
The PSNR is defined in [[ISO-29170-1:2017]].
The frequency of PSNR measurements is [=implementation-defined=].
OK with me if for any reasonable value of X (I'm not married to any particular value, as long as we have one that lets us WPT test)
Suggested change:
-The frequency of PSNR measurements is [=implementation-defined=].
+The frequency of PSNR measurements is [=implementation-defined=],
+but SHOULD be no less than every X seconds.
A single value is sufficient for existence and can be used to write a WPT.
The argument here is for a WPT testing an interoperable floor on frequency, not just existence.
I only see benefits in the spec defining the implementor playground.
I believe @youennf also sought some implementer guidance here so we don't have to reverse engineer interoperable behavior.
If we can't resolve this in the PR let's aim to discuss this again next meeting.
I still have not seen an argument why the minimum frequency needs to be interoperable.
Also I assume you mean "intraoperable"?
Adding more unfounded numbers (unless you have run large scale experiments?) is not going to help.
1 second is quite likely to be an issue for high-resolution screen sharing.
Going once
Edit: ^ s/every second/every few seconds/
If situations arise where implementations fall outside these bounds we can always revisit.
Since a vague soft limit doesn't avoid the "numerator is zero" case, what problem does this solve?
What is the behavior when the track was removed via replaceTrack? Follow-up question in #619...
Since a vague soft limit doesn't avoid the "numerator is zero" case, what problem does this solve?
@youennf does it address your concern in #794 (comment)?
What is the behavior when the track was removed via replaceTrack?
The encoding frame rate drops to zero, which is lower than 1/15, and no PSNR measurements happen.
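For illustration, the "every X seconds or the encoding frame rate, whichever is lower" rule reduces to simple arithmetic. The function name is made up, and intervalSeconds is whatever X the spec settles on (5 and 15 were both floated in this thread):

// Floor on PSNR measurement rate: at least one measurement per X seconds,
// capped by the encoding frame rate. A muted or removed track encodes at
// 0 fps, so the required floor drops to 0 measurements per second.
function minPsnrMeasurementsPerSecond(encodingFrameRate, intervalSeconds) {
  return Math.min(1 / intervalSeconds, encodingFrameRate);
}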
My understanding is that, if the encoder is providing PSNR, there is probably no perf issue in doing so for every frame. Based on this, I would tend to remove
That's not quite the case. Some hardware encoders do expose PSNR at a minimal performance penalty, so exposing it for every frame is fine there. However, there are a number of software encoders (e.g. libvpx) that have the ability to calculate PSNR but do not typically do so. It's commonly used in RD-based control methods, which use a lot of CPU and are not suitable for realtime encoding. However, the PSNR feature can be enabled/disabled on a per-frame basis even in the fast encoding modes, which is what is being discussed here. Getting the values is still valuable, but we don't want to enable it for every frame as that will come at a noticeable CPU penalty. Then of course there are encoders that don't have the ability to output PSNR at all. Say we have a software encoder that does expose it and a hardware encoder of the same type that does not, and at runtime encoding is switched back and forth between hardware and software (e.g. due to resolution constraints or just random failures); what do we do then? I still think the easiest way is to allow the update frequency to be zero for extended periods of time. The user just has to be aware that the metric may or may not be available at any given time, so if no PSNR measurements have been added between calls to getStats() you have to interpret that as "undefined", not "not available".
I think psnrSum/psnrMeasurements gives us the information even in periods when it's not updated. I.e. if an app is calling getStats regularly, say every second or every 2 seconds, no updates in psnrSum and psnrMeasurements already tells us that no measurements were made. It would be great if PSNR were calculated per frame, but I'm okay with it being updated as often as the codec feels it can without a performance penalty.
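A sketch of that interpretation as a hypothetical helper: an interval where the counter did not advance yields "no data" rather than zero. Name and snapshot shape are made up; keys follow this PR's proposal.

// Given two consecutive outbound-rtp snapshots, return the interval's
// average Y-plane PSNR, or undefined when no new measurements landed.
function intervalPsnrY(prev, curr) {
  const n = (curr.psnrMeasurements ?? 0) - (prev.psnrMeasurements ?? 0);
  if (n <= 0) return undefined; // no update this interval: no data, not zero
  return (curr.psnrSum.y - (prev.psnrSum?.y ?? 0)) / n;
}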
The fact that some encoders can or cannot expose PSNR would be new information exposed to the web and could be used for fingerprinting. Also, this PR gives no implementor's guideline. My assumption is that a single measurement frequency would be used for all encoders, that this frequency value would be fixed for a given UA instance, and probably for a given UA across all devices it runs on (say, a specific version of Chrome). It would be good to clarify this; otherwise I could see potential additional threats. Maybe the PING WG should weigh in there.
This would map essentially 1:1 with the implementation used, and that can already pretty easily be inferred (e.g. via https://www.w3.org/TR/webrtc-stats/#dom-rtcoutboundrtpstreamstats-encoderimplementation or platform+https://www.w3.org/TR/webrtc-stats/#dom-rtcoutboundrtpstreamstats-powerefficientencoder, not to mention info from WebCodecs, WebGPU, parsing data from encoded transform, etc etc). So while it might be a new "bit", it doesn't actually provide any new information imo.
Can we let the implementor's guideline just be along the lines of what has been said above, e.g. "the frequency should be as high as possible as long as the performance impact can be kept negligible"? I don't see a reason to change the frequency based on codec type, only by implementation performance overhead. For a given implementation, though, I don't see a reason to change the frequency; detailing that this should be fixed for a given UA seems fine to me.
Both. A possibility is to restrict psnr in the same manner.
I find it useful information.
Those don't seem like great examples, as they're blocked on whether exposing hardware is allowed, unless we're suggesting adding that requirement here? What are the other examples?
Agreed. Clarifying these assumptions in the guidance can only help.
Doesn't tying it too tightly to performance make it another performance metric? I like the part that it should not vary by codec.
Even on hardware encoders it has an impact on power consumption and the return on doing it on every frame is not there, see the parts in the paper that talk about subsampling. I'm fine with gating on HW.
If so, PSNR gathering could be opt-in, something like:
Is it overkill?
Quoting https://w3c.github.io/webrtc-stats/#guidelines-for-design-of-stats-objects:
Until now, there was no concern about stats being potentially computer intensive. |
This is similar to qpSum but codec-independent.
Since PSNR requires additional computation it is defined with an
accompanying psnrMeasurements counter to allow the computation of
an average PSNR.
Defined as a record with components for the Y, U and V planes respectively.
See also
https://datatracker.ietf.org/doc/html/rfc8761#section-5