A few bad samples in the dataset: still-frame vids, muted audio, short vids (< 10 sec) #6

v-iashin · 2020-08-22T19:32:18Z

I played with the dataset a little and found some flawed examples (please correct me if this is expected).

Still-frame segments (usually from a music/sound_effects YouTube channel) or duplicates
Example: hpDY7u8B8hE_5000_15000 "Horse Neigh Many / Horse, Horses..." (YouTube)
- Procedure 1: groups of examples at zero distance from each other, found with a nearest-neighbor search on some visual features
  How many: 408 examples
  List: filtered_in_0_dists_group.txt
- Procedure 2: the distance between consecutive video features (not frame-level features) is less than 0.01, computed as (feat[1:, :] - feat[:-1, :]).sum(-1), where feat has shape (20, 512)
  How many: 5965 examples
  List: filtered_video_has_still_frames.txt
  1st col: vid_id,
  2nd col: portion_of_same_features -- included because a value such as 0.05 means <1 sec of still frames while the rest of the video is ok, so it is up to you whether to use such examples.
- Procedure 3: Mostly manual work. I looked for examples close to the ones found in procedures 1 and 2 👆 in a 2D t-SNE embedding, clustered the embedding, and took nearby clusters if the majority of manually checked examples were flawed. Please note that this is a bit fuzzy and has some mistakes (some good examples are falsely flagged as bad). I sorted the lists by how confident I am that most examples in a list are flawed.
  List (how many) [sorted by confidence: high to low]:
  region_app.txt (2)
  region_babies.txt (12)
  region_bird.txt (13)
  region_blue_circle.txt (8)
  region_flipped.txt (19): content is ok but the vid is rotated by 90°
  region_footsteps.txt (24)
  region_tube.txt (2)
  filtered_manually_found copy.txt (18)
  145.txt (427)
  190.txt (358)
  299.txt (566)
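
Procedures 1 and 2 above might be sketched roughly as follows. This is a minimal illustration and not the code used to produce the lists: the function names are hypothetical, absolute differences are used (the expression in the post sums signed differences), the still fraction is defined as a share of consecutive feature pairs, and exact-match grouping stands in for a nearest-neighbor search. All of these are assumptions.

```python
from collections import defaultdict

import numpy as np


def still_fraction(feat, thresh=0.01):
    """Share of consecutive feature pairs that are nearly identical.

    feat: (T, D) array of video features for one clip, e.g. (20, 512).
    Assumption: L1 distance between neighbouring rows.
    """
    feat = np.asarray(feat, dtype=np.float64)
    diffs = np.abs(feat[1:, :] - feat[:-1, :]).sum(-1)  # shape (T - 1,)
    return float((diffs < thresh).mean())


def zero_distance_groups(ids, feats, decimals=6):
    """Group clip ids whose feature vectors are identical after rounding.

    Assumption: exact matching replaces a nearest-neighbor search.
    """
    groups = defaultdict(list)
    for vid_id, f in zip(ids, feats):
        # "+ 0.0" turns -0.0 into 0.0 so both produce the same bytes
        key = (np.round(np.asarray(f, dtype=np.float64), decimals) + 0.0).tobytes()
        groups[key].append(vid_id)
    return [g for g in groups.values() if len(g) > 1]
```

A clip with one repeated feature out of 20 gives a still fraction of 1/19 under this definition, which is close to the 0.05 case mentioned above.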
Segments with no audio
Example: YamCgQFbo7c_60000_70000 "Mécanismes pour lits muraux, à ouverture verticale" (YouTube)
Procedure: zero std on the middle 5 seconds of the audio track
How many: 1010 examples
List: filtered_audio_has_0_std.txt
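
The silence check above could look something like the sketch below. It assumes the audio track is already loaded as a mono NumPy array with a known sample rate; how the audio is decoded, and the function name, are assumptions and not part of the procedure described here.

```python
import numpy as np


def middle_is_silent(wave, sample_rate, seconds=5.0):
    """True if the central `seconds` of a mono waveform has zero std."""
    wave = np.asarray(wave)
    n = int(seconds * sample_rate)
    # Centre the window; clips shorter than the window are used in full
    start = max((len(wave) - n) // 2, 0)
    middle = wave[start:start + n]
    return middle.size > 0 and float(middle.std()) == 0.0
```

Checking only the middle of the track means a clip with sound at its edges but a constant middle is still flagged.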
Short videos (<5 seconds) -- if you like I can provide for <X seconds (for any X)
Example: 6dhXrzs8pJc_0_10000 (a bit loud) "Funny Goat saying hey" (YouTube)
Procedure: measured length of a video
How many: 37 examples
List: filtered_vid_has_short_vfeats.txt
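
For other values of X, a length filter could be sketched as below. This assumes ffprobe (from FFmpeg) is installed and that the container reports a duration; the list above may have been produced differently (its filename suggests feature length was used), and both function names are hypothetical.

```python
import subprocess
from typing import Dict, List


def clip_duration_seconds(path: str) -> float:
    """Ask ffprobe for the container duration of a media file, in seconds."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return float(out.strip())


def short_clips(durations: Dict[str, float], min_seconds: float = 5.0) -> List[str]:
    """Return the ids of clips shorter than `min_seconds`, sorted by id."""
    return sorted(vid for vid, d in durations.items() if d < min_seconds)
```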

My aim is not to criticize the paper but to share my findings with others who might want to use the dataset for their applications 🤗. The issue is not that significant considering the size of the dataset and the number of flawed examples (<5 %), and the sets do intersect! However, these lists might save someone from strange errors when dealing with the dataset.
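
Because the sets intersect, the per-list counts cannot simply be added up. A small sketch of merging the lists into one set of flagged ids follows; it assumes one id per line with optional extra columns separated by whitespace or commas, which may not match the actual file layout, and the function names are hypothetical.

```python
def read_ids(lines):
    """Collect the first column of every non-empty line as a clip id."""
    ids = set()
    for line in lines:
        line = line.strip()
        if line:
            ids.add(line.replace(",", " ").split()[0])
    return ids


def flagged_union(*id_sets):
    """Union of several id sets; overlapping ids are counted once."""
    return set().union(*id_sets)
```

With real files, something like `read_ids(open(path))` per list and then `flagged_union(...)` over the results would give the total number of distinct flagged clips.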


WeidiXie · 2020-08-22T20:48:02Z

Hi,

Thank you very much for pointing out these noisy or erroneous samples.

The dataset was collected mainly with an automatic pipeline, so some noise is inevitable, but these lists are super helpful. We will update the meta information accordingly.

Best,
Weidi

v-iashin · 2020-08-22T22:04:26Z

Thanks for the prompt reply.

Can I ask you to hold off on the update for a week or so? I am still in the process of finding more such examples. I will update the post if I find anything else.

WeidiXie · 2020-08-22T22:05:27Z

That would be amazing, thanks a lot for your help.

Best,
Weidi

v-iashin · 2020-08-28T09:02:36Z

@WeidiXie
Hey, I updated the post. Check it out.

WeidiXie · 2020-08-28T10:15:08Z

@v-iashin

Thank you so much for this, we are looking into it.

Best,
Weidi

v-iashin · 2020-08-28T10:56:40Z

Plus, of course, some of the videos are missing because they are no longer available on YouTube (~10k). I can provide a list of what was available a month ago.

WeidiXie · 2020-08-28T21:29:29Z

@v-iashin

Oh, I see, that's OK. We expected this to happen; Kinetics has the same problem. Unless we release the downloaded data, the dataset will be dynamic.

Best,
Weidi

daisukelab · 2021-08-04T09:24:08Z

> Plus, of course, some of the videos are missing because they are no longer available on YouTube (~10k). I can provide a list of a month-old state.

Hi @v-iashin, is it possible to share your list of the missing ones, please?
I'm trying to download the dataset, but I could only get 178k samples so far.
It seems to have lost 20k+ samples already...

v-iashin · 2021-08-04T09:54:03Z

Hi @daisukelab

Here is the list of available videos at the moment when I downloaded it:
available_clips.txt

daisukelab · 2021-08-13T13:56:07Z

Hi @v-iashin and all,
I'd like to share my list of missing videos.
https://drive.google.com/file/d/13g_3d-7btA48qu1DsBfyLqZbFo6iYZXp/view?usp=sharing

Downloaded from Japan.
199,176 items are listed in vggsound.csv and 181,683 items could be downloaded, so 17,493 items are missing.
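
A missing list like this can be derived as a set difference between the listed ids and the downloaded ones. The sketch below assumes both are available as collections of comparable id strings; the function name and the id format are assumptions.

```python
def missing_ids(listed, downloaded):
    """Ids that appear in the dataset listing but were not downloaded."""
    return sorted(set(listed) - set(downloaded))
```

The counts above are consistent with this: 199,176 - 181,683 = 17,493.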
