Commit 55d66b0

fix asset links on fvmd blog posts
1 parent 9c6e580 commit 55d66b0

2 files changed: +26 -26 lines changed

_posts/2024-06-20-fvmd-1.md

+5 -5
@@ -62,7 +62,7 @@ toc:
 
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/video-metrics.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/video-metrics.png" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="caption">
@@ -113,7 +113,7 @@ The models (a) to (e) are sorted based on human ratings collected through a user
 
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include video.liquid path="assets/video/fvmd/FVMD.mp4" class="img-fluid rounded z-depth-1" controls=true autoplay=true loop=true %}
+{% include video.liquid path="blog/2024/fvmd/FVMD.mp4" class="img-fluid rounded z-depth-1" controls=true autoplay=true loop=true %}
 </div>
 </div>
 <div class="caption">
@@ -142,20 +142,20 @@ We also present visualizations of video frames for one randomly selected scene t
 
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/fig-eval-metric-comparison-v0.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/fig-eval-metric-comparison-v0.jpg" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 
 <details>
 <summary>click here for more frames comparison</summary>
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/fig-eval-metric-comparison-v1.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/fig-eval-metric-comparison-v1.jpg" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 </details>
 
 ## Summary
 We review the video evaluation metrics used to assess video generative models. These metrics can be categorized into two types: set-to-set comparison metrics (FID, FVD, KVD, FVMD, PSNR, and SSIM) and unary metrics (VBench, CLIP score, and IS). We discuss the pros and cons of each type and provide a detailed comparison using the TikTok dataset. The results show that the **FVMD metric aligns better with human judgments than other metrics, especially for assessing motion consistency**. This suggests that FVMD is a promising metric for evaluating video generative models.
 
-Wonder why FVMD performs so much better than other metrics? Check out [the second part of our blog post](https://qiyan98.github.io/blog/2024/fvmd-2/) to find out more! We will delve into the details of the FVMD metric and explain why it is more effective in assessing video quality and motion consistency.
+Wonder why FVMD performs so much better than other metrics? Check out [the second part of our blog post](https://dsl-lab.github.io/blog/2024/fvmd-2/) to find out more! We will delve into the details of the FVMD metric and explain why it is more effective in assessing video quality and motion consistency.

_posts/2024-06-20-fvmd-2.md

+21 -21
@@ -67,7 +67,7 @@ toc:
 
 Recently, diffusion models have demonstrated remarkable capabilities in high-quality image generation. This advancement has been extended to the video domain, giving rise to text-to-video diffusion models, such as [Pika](https://pika.art/home), [Runway Gen-2](https://research.runwayml.com/gen2), and [Sora](https://openai.com/index/sora/) <d-cite key="videoworldsimulators2024"></d-cite>.
 
-Despite the rapid development of video generation models, research on evaluation metrics for video generation remains insufficient (see more discussion on our [blog](https://qiyan98.github.io/blog/2024/fvmd-1/)).
+Despite the rapid development of video generation models, research on evaluation metrics for video generation remains insufficient (see more discussion on our [blog](https://dsl-lab.github.io/blog/2024/fvmd-1/)).
 For example, FID-VID <d-cite key="balaji2019conditional"></d-cite> and FVD <d-cite key="unterthiner2018towards"></d-cite> are commonly used video metrics. FID-VID focuses on visual quality by comparing synthesized *frames* to real ones, ignoring motion quality. FVD adds temporal coherence by using features from a *pre-trained action recognition model*, Inflated 3D Convnet (I3D) <d-cite key="carreira2017quo"></d-cite>.
 Recently, VBench <d-cite key="huang2023vbench"></d-cite> introduces a 16-dimensional evaluation suite for text-to-video generative models. However, VBench's protocols for temporal consistency, like temporal flickering and motion smoothness, favor videos with smooth or static movement, *neglecting high-quality videos with intense motion*, such as dancing and sports videos.
 
@@ -80,7 +80,7 @@ The code is available at [GitHub](https://github.com/DSL-Lab/FVMD-frechet-video-
 ## Fréchet Video Motion Distance (FVMD)
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/pipeline.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/pipeline.png" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="caption">
@@ -92,10 +92,10 @@ The core idea of FVMD is to measure temporal motion consistency based on **the p
 
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/tracking_demo_1.gif" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/tracking_demo_1.gif" class="img-fluid rounded z-depth-1" %}
 </div>
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/tracking_demo_2.gif" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/tracking_demo_2.gif" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="caption">
@@ -155,24 +155,24 @@ If two videos are of very different quality, their histograms should look very *
 
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/gt.gif" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/gt.gif" class="img-fluid rounded z-depth-1" %}
 </div>
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/disco.gif" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/disco.gif" class="img-fluid rounded z-depth-1" %}
 </div>
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/anyone.gif" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/anyone.gif" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/gt_tracking.gif" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/gt_tracking.gif" class="img-fluid rounded z-depth-1" %}
 </div>
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/disco_tracking.gif" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/disco_tracking.gif" class="img-fluid rounded z-depth-1" %}
 </div>
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/anyone_tracking.gif" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/anyone_tracking.gif" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="caption">
@@ -185,13 +185,13 @@ Above, we show three pieces of video from the TikTok dataset <d-cite key="jafari
 
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/gt_v_1d.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/gt_v_1d.png" class="img-fluid rounded z-depth-1" %}
 </div>
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/disco_v_1d.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/disco_v_1d.png" class="img-fluid rounded z-depth-1" %}
 </div>
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/anyone_v_1d.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/anyone_v_1d.png" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="caption">
@@ -208,13 +208,13 @@ This is exactly what we want to observe in the motion features! These features c
 <summary>click here for 2D histogram result</summary>
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/gt_v_2d.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/gt_v_2d.png" class="img-fluid rounded z-depth-1" %}
 </div>
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/disco_v_2d.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/disco_v_2d.png" class="img-fluid rounded z-depth-1" %}
 </div>
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/anyone_v_2d.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/anyone_v_2d.png" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="caption">
@@ -243,7 +243,7 @@ To verify the efficacy of the extracted motion features in representing motion p
 
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/sanity_check.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/sanity_check.png" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="caption">
@@ -256,7 +256,7 @@ When measuring the FVMD of **two subsets from the same dataset**, it **converges
 Moreover, a sensitivity analysis is conducted to evaluate if the proposed metric can effectively detect temporal inconsistencies in generated videos, *i.e.*, being **numerically sensitive to temporal noises**. To this end, artificially-made temporal noises are injected to the TikTok dancing dataset <d-cite key="jafarian2022self"></d-cite> and FVMD scores are computed to assess its sensitivity to data corruption.
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/sensitivity_ana.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/sensitivity_ana.png" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="caption">
@@ -275,7 +275,7 @@ Note that the models (a) to (e) are sorted based on human ratings collected thro
 
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include video.liquid path="assets/video/fvmd/FVMD.mp4" class="img-fluid rounded z-depth-1" controls=true autoplay=true %}
+{% include video.liquid path="blog/2024/fvmd/FVMD.mp4" class="img-fluid rounded z-depth-1" controls=true autoplay=true %}
 </div>
 </div>
 <div class="caption">
@@ -305,7 +305,7 @@ The second setting, **One Metric Diverse**, evaluates the agreement among differ
 
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/human_study_eql.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/human_study_eql.png" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="caption">
@@ -314,7 +314,7 @@ The second setting, **One Metric Diverse**, evaluates the agreement among differ
 
 <div class="row mt-3">
 <div class="col-sm mt-3 mt-md-0">
-{% include figure.liquid loading="eager" path="assets/img/fvmd/human_study_div.png" class="img-fluid rounded z-depth-1" %}
+{% include figure.liquid loading="eager" path="blog/2024/fvmd/human_study_div.png" class="img-fluid rounded z-depth-1" %}
 </div>
 </div>
 <div class="caption">
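
Note: most of the hunks above apply the same mechanical change, swapping the `assets/img/fvmd/` or `assets/video/fvmd/` prefix inside a `path="..."` attribute for `blog/2024/fvmd/`. A minimal Python sketch of that rewrite is shown below. It is not part of this commit: the function name `rewrite_asset_paths` and the constants are hypothetical, and the prefix mapping is only inferred from the diff. It does not cover the two `.png` to `.jpg` renames or the `qiyan98.github.io` to `dsl-lab.github.io` link updates, which would still need to be handled separately.

```python
import re

# Assumed mapping, inferred from the diff above (not taken from the repository).
OLD_PREFIXES = ("assets/img/fvmd/", "assets/video/fvmd/")
NEW_PREFIX = "blog/2024/fvmd/"

# Matches any path="..." attribute, e.g. inside a Liquid include tag.
PATH_ATTR = re.compile(r'path="(?P<path>[^"]+)"')


def rewrite_asset_paths(text: str) -> str:
    """Return text with old fvmd asset prefixes replaced by NEW_PREFIX."""

    def _swap(match: re.Match) -> str:
        path = match.group("path")
        for old in OLD_PREFIXES:
            if path.startswith(old):
                # Keep the file name, replace only the directory prefix.
                return f'path="{NEW_PREFIX}{path[len(old):]}"'
        # Leave unrelated paths untouched.
        return match.group(0)

    return PATH_ATTR.sub(_swap, text)


if __name__ == "__main__":
    line = ('{% include figure.liquid loading="eager" '
            'path="assets/img/fvmd/pipeline.png" class="img-fluid rounded z-depth-1" %}')
    print(rewrite_asset_paths(line))
```

Running the function over the contents of each post in `_posts/` and diffing the result against this commit would be one way to check that no include was missed.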
