Commit 4d36513

Merge pull request #4 from qiyan98/master
Fix asset links on fvmd blog posts
2 parents 9c6e580 + 28ab904 commit 4d36513

2 files changed, +28 -28 lines changed

_posts/2024-06-20-fvmd-1.md

+6 -6
@@ -1,6 +1,6 @@
 ---
 layout: distill
-title: Video Evaluation Metrics 1/2 - A Review of the State of the Art
+title: A Review of Video Evaluation Metrics
 description: Video generative models have been rapidly improving recently, but how do we evaluate them efficiently and effectively? In this blog post, we review the existing evaluation metrics and highlight their pros and cons.
 tags: metrics video generative-models
 giscus_comments: true
@@ -62,7 +62,7 @@ toc:
 
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/video-metrics.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/video-metrics.png" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="caption">
@@ -113,7 +113,7 @@ The models (a) to (e) are sorted based on human ratings collected through a user
 
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include video.liquid path="assets/video/fvmd/FVMD.mp4" class="img-fluid rounded z-depth-1" controls=true autoplay=true loop=true %}
+{% include video.liquid path="blog/2024/fvmd/FVMD.mp4" class="img-fluid rounded z-depth-1" controls=true autoplay=true loop=true %}
 </div>
 </div>
 <div class="caption">
@@ -142,20 +142,20 @@ We also present visualizations of video frames for one randomly selected scene t
 
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/fig-eval-metric-comparison-v0.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/fig-eval-metric-comparison-v0.jpg" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 
 <details>
 <summary>click here for more frames comparison</summary>
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/fig-eval-metric-comparison-v1.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/fig-eval-metric-comparison-v1.jpg" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 </details>
 
 ## Summary
 We review the video evaluation metrics used to assess video generative models. These metrics can be categorized into two types: set-to-set comparison metrics (FID, FVD, KVD, FVMD, PSNR, and SSIM) and unary metrics (VBench, CLIP score, and IS). We discuss the pros and cons of each type and provide a detailed comparison using the TikTok dataset. The results show that the **FVMD metric aligns better with human judgments than other metrics, especially for assessing motion consistency**. This suggests that FVMD is a promising metric for evaluating video generative models.
 
-Wonder why FVMD performs so much better than other metrics? Check out [the second part of our blog post](https://qiyan98.github.io/blog/2024/fvmd-2/) to find out more! We will delve into the details of the FVMD metric and explain why it is more effective in assessing video quality and motion consistency.
+Wonder why FVMD performs so much better than other metrics? Check out [the second part of our blog post](https://dsl-lab.github.io/blog/2024/fvmd-2/) to find out more! We will delve into the details of the FVMD metric and explain why it is more effective in assessing video quality and motion consistency.

_posts/2024-06-20-fvmd-2.md

+22 -22
@@ -1,6 +1,6 @@
 ---
 layout: distill
-title: Video Evaluation Metrics 2/2 - Evaluating Motion Consistency by Fréchet Video Motion Distance (FVMD)
+title: Evaluating Motion Consistency by Fréchet Video Motion Distance (FVMD)
 description: In this blog post, we introduce a promising new metric for video generative models, Fréchet Video Motion Distance (FVMD), which focuses on the motion consistency of generated videos.
 tags: metrics video generative-models
 giscus_comments: true
@@ -67,7 +67,7 @@ toc:
 
 Recently, diffusion models have demonstrated remarkable capabilities in high-quality image generation. This advancement has been extended to the video domain, giving rise to text-to-video diffusion models, such as [Pika](https://pika.art/home), [Runway Gen-2](https://research.runwayml.com/gen2), and [Sora](https://openai.com/index/sora/) <d-cite key="videoworldsimulators2024"></d-cite>.
 
-Despite the rapid development of video generation models, research on evaluation metrics for video generation remains insufficient (see more discussion on our [blog](https://qiyan98.github.io/blog/2024/fvmd-1/)).
+Despite the rapid development of video generation models, research on evaluation metrics for video generation remains insufficient (see more discussion on our [blog](https://dsl-lab.github.io/blog/2024/fvmd-1/)).
 For example, FID-VID <d-cite key="balaji2019conditional"></d-cite> and FVD <d-cite key="unterthiner2018towards"></d-cite> are commonly used video metrics. FID-VID focuses on visual quality by comparing synthesized *frames* to real ones, ignoring motion quality. FVD adds temporal coherence by using features from a *pre-trained action recognition model*, Inflated 3D Convnet (I3D) <d-cite key="carreira2017quo"></d-cite>.
 Recently, VBench <d-cite key="huang2023vbench"></d-cite> introduces a 16-dimensional evaluation suite for text-to-video generative models. However, VBench's protocols for temporal consistency, like temporal flickering and motion smoothness, favor videos with smooth or static movement, *neglecting high-quality videos with intense motion*, such as dancing and sports videos.
 
@@ -80,7 +80,7 @@ The code is available at [GitHub](https://github.com/DSL-Lab/FVMD-frechet-video-
 ## Fréchet Video Motion Distance (FVMD)
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/pipeline.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/pipeline.png" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="caption">
@@ -92,10 +92,10 @@ The core idea of FVMD is to measure temporal motion consistency based on **the p
 
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/tracking_demo_1.gif" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/tracking_demo_1.gif" class="img-fluid rounded z-depth-1" %}
 </div>
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/tracking_demo_2.gif" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/tracking_demo_2.gif" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="caption">
@@ -155,24 +155,24 @@ If two videos are of very different quality, their histograms should look very *
 
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/gt.gif" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/gt.gif" class="img-fluid rounded z-depth-1" %}
 </div>
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/disco.gif" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/disco.gif" class="img-fluid rounded z-depth-1" %}
 </div>
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/anyone.gif" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/anyone.gif" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/gt_tracking.gif" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/gt_tracking.gif" class="img-fluid rounded z-depth-1" %}
 </div>
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/disco_tracking.gif" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/disco_tracking.gif" class="img-fluid rounded z-depth-1" %}
 </div>
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/anyone_tracking.gif" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/anyone_tracking.gif" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="caption">
@@ -185,13 +185,13 @@ Above, we show three pieces of video from the TikTok dataset <d-cite key="jafari
 
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/gt_v_1d.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/gt_v_1d.png" class="img-fluid rounded z-depth-1" %}
 </div>
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/disco_v_1d.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/disco_v_1d.png" class="img-fluid rounded z-depth-1" %}
 </div>
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/anyone_v_1d.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/anyone_v_1d.png" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="caption">
@@ -208,13 +208,13 @@ This is exactly what we want to observe in the motion features! These features c
 <summary>click here for 2D histogram result</summary>
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/gt_v_2d.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/gt_v_2d.png" class="img-fluid rounded z-depth-1" %}
 </div>
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/disco_v_2d.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/disco_v_2d.png" class="img-fluid rounded z-depth-1" %}
 </div>
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/anyone_v_2d.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/anyone_v_2d.png" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="caption">
@@ -243,7 +243,7 @@ To verify the efficacy of the extracted motion features in representing motion p
 
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/sanity_check.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/sanity_check.png" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="caption">
@@ -256,7 +256,7 @@ When measuring the FVMD of **two subsets from the same dataset**, it **converges
 Moreover, a sensitivity analysis is conducted to evaluate if the proposed metric can effectively detect temporal inconsistencies in generated videos, *i.e.*, being **numerically sensitive to temporal noises**. To this end, artificially-made temporal noises are injected to the TikTok dancing dataset <d-cite key="jafarian2022self"></d-cite> and FVMD scores are computed to assess its sensitivity to data corruption.
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/sensitivity_ana.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/sensitivity_ana.png" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="caption">
@@ -275,7 +275,7 @@ Note that the models (a) to (e) are sorted based on human ratings collected thro
 
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include video.liquid path="assets/video/fvmd/FVMD.mp4" class="img-fluid rounded z-depth-1" controls=true autoplay=true %}
+{% include video.liquid path="blog/2024/fvmd/FVMD.mp4" class="img-fluid rounded z-depth-1" controls=true autoplay=true %}
 </div>
 </div>
 <div class="caption">
@@ -305,7 +305,7 @@ The second setting, **One Metric Diverse**, evaluates the agreement among differ
 
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/human_study_eql.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/human_study_eql.png" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="caption">
@@ -314,7 +314,7 @@ The second setting, **One Metric Diverse**, evaluates the agreement among differ
 
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/human_study_div.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/human_study_div.png" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="caption">
