Merge pull request #4 from qiyan98/master

lrjconan · web-flow · commit 4d3651338b5a · 2024-07-07T11:06:12.000-07:00
Fix asset links on fvmd blog posts
diff --git a/_posts/2024-06-20-fvmd-1.md b/_posts/2024-06-20-fvmd-1.md
@@ -1,6 +1,6 @@
 ---
 layout: distill
-title: Video Evaluation Metrics 1/2 - A Review of the State of the Art
+title:  A Review of Video Evaluation Metrics
 description: Video generative models have been rapidly improving recently, but how do we evaluate them efficiently and effectively? In this blog post, we review the existing evaluation metrics and highlight their pros and cons. 
 tags: metrics video generative-models
 giscus_comments: true
@@ -62,7 +62,7 @@ toc:
 
 <div class="row mt-3">
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/video-metrics.png" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/video-metrics.png" class="img-fluid rounded z-depth-1" %}
     </div>
 </div>
 <div class="caption">
@@ -113,7 +113,7 @@ The models (a) to (e) are sorted based on human ratings collected through a user
 
 <div class="row mt-3">
     <div class="col-sm mt-3 mt-md-0">
-        {% include video.liquid path="assets/video/fvmd/FVMD.mp4" class="img-fluid rounded z-depth-1" controls=true autoplay=true loop=true %}
+        {% include video.liquid path="blog/2024/fvmd/FVMD.mp4" class="img-fluid rounded z-depth-1" controls=true autoplay=true loop=true %}
     </div>
 </div>
 <div class="caption">
@@ -142,20 +142,20 @@ We also present visualizations of video frames for one randomly selected scene t
 
 <div class="row mt-3">
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/fig-eval-metric-comparison-v0.png" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/fig-eval-metric-comparison-v0.jpg" class="img-fluid rounded z-depth-1" %}
     </div>
 </div>
 
 <details>
   <summary>click here for more frames comparison</summary>
 <div class="row mt-3">
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/fig-eval-metric-comparison-v1.png" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/fig-eval-metric-comparison-v1.jpg" class="img-fluid rounded z-depth-1" %}
     </div>
 </div>
 </details>
 
 ## Summary
 We review the video evaluation metrics used to assess video generative models. These metrics can be categorized into two types: set-to-set comparison metrics (FID, FVD, KVD, FVMD, PSNR, and SSIM) and unary metrics (VBench, CLIP score, and IS). We discuss the pros and cons of each type and provide a detailed comparison using the TikTok dataset. The results show that the **FVMD metric aligns better with human judgments than other metrics, especially for assessing motion consistency**. This suggests that FVMD is a promising metric for evaluating video generative models.
 
-Wonder why FVMD performs so much better than other metrics? Check out [the second part of our blog post](https://qiyan98.github.io/blog/2024/fvmd-2/) to find out more! We will delve into the details of the FVMD metric and explain why it is more effective in assessing video quality and motion consistency.
+Wonder why FVMD performs so much better than other metrics? Check out [the second part of our blog post](https://dsl-lab.github.io/blog/2024/fvmd-2/) to find out more! We will delve into the details of the FVMD metric and explain why it is more effective in assessing video quality and motion consistency.
diff --git a/_posts/2024-06-20-fvmd-2.md b/_posts/2024-06-20-fvmd-2.md
@@ -1,6 +1,6 @@
 ---
 layout: distill
-title: Video Evaluation Metrics 2/2 - Evaluating Motion Consistency by Fréchet Video Motion Distance (FVMD)
+title: Evaluating Motion Consistency by Fréchet Video Motion Distance (FVMD)
 description: In this blog post, we introduce a promising new metric for video generative models, Fréchet Video Motion Distance (FVMD), which focuses on the motion consistency of generated videos.
 tags: metrics video generative-models
 giscus_comments: true
@@ -67,7 +67,7 @@ toc:
 
 Recently, diffusion models have demonstrated remarkable capabilities in high-quality image generation. This advancement has been extended to the video domain, giving rise to text-to-video diffusion models, such as [Pika](https://pika.art/home), [Runway Gen-2](https://research.runwayml.com/gen2), and [Sora](https://openai.com/index/sora/) <d-cite key="videoworldsimulators2024"></d-cite>. 
 
-Despite the rapid development of video generation models, research on evaluation metrics for video generation remains insufficient (see more discussion on our [blog](https://qiyan98.github.io/blog/2024/fvmd-1/)). 
+Despite the rapid development of video generation models, research on evaluation metrics for video generation remains insufficient (see more discussion on our [blog](https://dsl-lab.github.io/blog/2024/fvmd-1/)). 
 For example, FID-VID <d-cite key="balaji2019conditional"></d-cite> and FVD <d-cite key="unterthiner2018towards"></d-cite> are commonly used video metrics. FID-VID focuses on visual quality by comparing synthesized *frames* to real ones, ignoring motion quality. FVD adds temporal coherence by using features from a *pre-trained action recognition model*, Inflated 3D Convnet (I3D) <d-cite key="carreira2017quo"></d-cite>.
 Recently, VBench <d-cite key="huang2023vbench"></d-cite> introduces a 16-dimensional evaluation suite for text-to-video generative models. However, VBench's protocols for temporal consistency, like temporal flickering and motion smoothness, favor videos with smooth or static movement, *neglecting high-quality videos with intense motion*, such as dancing and sports videos.
 
@@ -80,7 +80,7 @@ The code is available at [GitHub](https://github.com/DSL-Lab/FVMD-frechet-video-
 ## Fréchet Video Motion Distance (FVMD)
 <div class="row mt-3">
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/pipeline.png" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/pipeline.png" class="img-fluid rounded z-depth-1" %}
     </div>
 </div>
 <div class="caption">
@@ -92,10 +92,10 @@ The core idea of FVMD is to measure temporal motion consistency based on **the p
 
 <div class="row mt-3">
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/tracking_demo_1.gif" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/tracking_demo_1.gif" class="img-fluid rounded z-depth-1" %}
     </div>
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/tracking_demo_2.gif" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/tracking_demo_2.gif" class="img-fluid rounded z-depth-1" %}
     </div>
 </div>
 <div class="caption">
@@ -155,24 +155,24 @@ If two videos are of very different quality, their histograms should look very *
 
 <div class="row mt-3">
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/gt.gif" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/gt.gif" class="img-fluid rounded z-depth-1" %}
     </div>
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/disco.gif" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/disco.gif" class="img-fluid rounded z-depth-1" %}
     </div>
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/anyone.gif" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/anyone.gif" class="img-fluid rounded z-depth-1" %}
     </div>
 </div>
 <div class="row mt-3">
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/gt_tracking.gif" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/gt_tracking.gif" class="img-fluid rounded z-depth-1" %}
     </div>
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/disco_tracking.gif" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/disco_tracking.gif" class="img-fluid rounded z-depth-1" %}
     </div>
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/anyone_tracking.gif" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/anyone_tracking.gif" class="img-fluid rounded z-depth-1" %}
     </div>
 </div>
 <div class="caption">
@@ -185,13 +185,13 @@ Above, we show three pieces of video from the TikTok dataset <d-cite key="jafari
 
 <div class="row mt-3">
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/gt_v_1d.png" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/gt_v_1d.png" class="img-fluid rounded z-depth-1" %}
     </div>
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/disco_v_1d.png" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/disco_v_1d.png" class="img-fluid rounded z-depth-1" %}
     </div>
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/anyone_v_1d.png" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/anyone_v_1d.png" class="img-fluid rounded z-depth-1" %}
     </div>
 </div>
 <div class="caption">
@@ -208,13 +208,13 @@ This is exactly what we want to observe in the motion features! These features c
   <summary>click here for 2D histogram result</summary>
   <div class="row mt-3">
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/gt_v_2d.png" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/gt_v_2d.png" class="img-fluid rounded z-depth-1" %}
     </div>
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/disco_v_2d.png" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/disco_v_2d.png" class="img-fluid rounded z-depth-1" %}
     </div>
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/anyone_v_2d.png" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/anyone_v_2d.png" class="img-fluid rounded z-depth-1" %}
     </div>
   </div>
   <div class="caption">
@@ -243,7 +243,7 @@ To verify the efficacy of the extracted motion features in representing motion p
 
 <div class="row mt-3">
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/sanity_check.png" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/sanity_check.png" class="img-fluid rounded z-depth-1" %}
     </div>
 </div>
 <div class="caption">
@@ -256,7 +256,7 @@ When measuring the FVMD of **two subsets from the same dataset**, it **converges
 Moreover, a sensitivity analysis is conducted to evaluate if the proposed metric can effectively detect temporal inconsistencies in generated videos, *i.e.*, being **numerically sensitive to temporal noises**. To this end, artificially-made temporal noises are injected to the TikTok dancing dataset <d-cite key="jafarian2022self"></d-cite> and FVMD scores are computed to assess its sensitivity to data corruption.
 <div class="row mt-3">
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/sensitivity_ana.png" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/sensitivity_ana.png" class="img-fluid rounded z-depth-1" %}
     </div>
 </div>
 <div class="caption">
@@ -275,7 +275,7 @@ Note that the models (a) to (e) are sorted based on human ratings collected thro
 
 <div class="row mt-3">
     <div class="col-sm mt-3 mt-md-0">
-        {% include video.liquid path="assets/video/fvmd/FVMD.mp4" class="img-fluid rounded z-depth-1" controls=true autoplay=true %}
+        {% include video.liquid path="blog/2024/fvmd/FVMD.mp4" class="img-fluid rounded z-depth-1" controls=true autoplay=true %}
     </div>
 </div>
 <div class="caption">
@@ -305,7 +305,7 @@ The second setting, **One Metric Diverse**, evaluates the agreement among differ
 
 <div class="row mt-3">
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/human_study_eql.png" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/human_study_eql.png" class="img-fluid rounded z-depth-1" %}
     </div>
 </div>
 <div class="caption">
@@ -314,7 +314,7 @@ The second setting, **One Metric Diverse**, evaluates the agreement among differ
 
 <div class="row mt-3">
     <div class="col-sm mt-3 mt-md-0">
-        {% include figure.liquid loading="eager" path="assets/img/fvmd/human_study_div.png" class="img-fluid rounded z-depth-1" %}
+        {% include figure.liquid loading="eager" path="blog/2024/fvmd/human_study_div.png" class="img-fluid rounded z-depth-1" %}
     </div>
 </div>
 <div class="caption">