dcase2021_umaps

Paper	Data	Webpage

Repository holding code for our paper:

Using UMAP to Inspect Audio Data for Unsupervised Anomaly Detection under Domain-Shift Conditions. Andres Fernandez and Mark D. Plumbley, 2021.

You can cite our work as follows:

@inproceedings{aferro2021umap,
  author = {Fernandez, Andres and Plumbley, Mark D.},
  title = {Using {UMAP} to Inspect Audio Data for Unsupervised Anomaly Detection under Domain-Shift Conditions},
  booktitle = "Proceedings of the Detection and Classification of Acoustic Scenes and Events 2021 Workshop ({DCASE2021})",
  address = "Barcelona, Spain",
  month = "November",
  year = "2021",
}

Our work is released under liberal licenses (code: MIT, data: CC-BY). We're happy for others to build on it; refactoring the scatterplot scripts is particularly welcome.

Comprehensive UMAPs and plots generated for the paper can be downloaded at the Zenodo link above. Our results can be fully reproduced following the steps detailed below. The data pipeline can be summarized as follows:

1. Collect WAV audio datasets. In this case we have 3 (DCASE, AudioSet and Fraunhofer).
2. Compute the log-STFT, log-mel spectrograms and L3 embeddings and save them as HDF5 datasets (performed by the 00... Python scripts).
3. Compute the UMAPs and save them as pickled dictionaries (performed by the 01... Python script).
4. Render scatter plots for section, device and global scopes (performed by the 02... Python scripts).

We also include the 03... scripts used to render the plots in the paper. Note that step 2 requires a fair amount of disk space and time, and the L3 embeddings in particular can take a while to compute. Step 3 is very RAM-hungry and potentially slow.

Reproduction

If they do not exist yet, create the following directories inside this repository:

datasets
precomputed_features
umaps
umap_plots
logs
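
The directories can also be created programmatically. The following is a minimal sketch, assuming it is run from (or pointed at) the repository root; the helper name `ensure_dirs` is illustrative and not part of the repository.

```python
from pathlib import Path

# Directory names taken from the list above.
REQUIRED_DIRS = ("datasets", "precomputed_features", "umaps", "umap_plots", "logs")


def ensure_dirs(repo_root="."):
    """Create any missing working directories and return all their paths."""
    root = Path(repo_root)
    paths = []
    for name in REQUIRED_DIRS:
        path = root / name
        # exist_ok=True makes this a no-op for directories that already exist
        path.mkdir(parents=True, exist_ok=True)
        paths.append(path)
    return paths
```

Calling `ensure_dirs(".")` from the repository root is safe to repeat, since existing directories are left untouched.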

Datasets:

AudioSet: Download our custom AudioSet subset and extract its 39437 WAV files into datasets/AudioSet_fragments
Fraunhofer: Download from here and extract into datasets/IDMT-ISA-ELECTRIC-ENGINE. It should end up with the following structure:

IDMT-ISA-ELECTRIC-ENGINE/
├──   test
│   ├──   engine1_good
│   ├──   engine2_broken
│   └──   engine3_heavyload
├──   test_cut
│   ├──   engine1_good
│   ├──   engine2_broken
│   └──   engine3_heavyload
├──   train
│   ├──   engine1_good
│   ├──   engine2_broken
│   └──   engine3_heavyload
└──   train_cut
    ├──   engine1_good
    ├──   engine2_broken
    └──   engine3_heavyload

DCASE: Download and merge the Development and Additional Training datasets, and extract into datasets/DCASE2021/t2. It should end up with the following structure:

DCASE2021/
└──   t2
    ├──   dev
    │   ├──   fan
    │   │   ├──   source_test
    │   │   ├──   target_test
    │   │   └──   train
    │   ├──   gearbox
    │   │   ├──   source_test
    │   │   ├──   target_test
    │   │   └──   train
    │   ├──   pump
    │   │   ├──   source_test
    │   │   ├──   target_test
    │   │   └──   train
    │   ├──   slider
    │   │   ├──   source_test
    │   │   ├──   target_test
    │   │   └──   train
    │   ├──   ToyCar
    │   │   ├──   source_test
    │   │   ├──   target_test
    │   │   └──   train
    │   ├──   ToyTrain
    │   │   ├──   source_test
    │   │   ├──   target_test
    │   │   └──   train
    │   └──   valve
    │       ├──   source_test
    │       ├──   target_test
    │       └──   train
    └──   eval
        ├──   fan
        │   ├──   source_test
        │   ├──   target_test
        │   └──   train
        ├──   gearbox
        │   ├──   source_test
        │   ├──   target_test
        │   └──   train
        ├──   pump
        │   ├──   source_test
        │   ├──   target_test
        │   └──   train
        ├──   slider
        │   ├──   source_test
        │   ├──   target_test
        │   └──   train
        ├──   ToyCar
        │   ├──   source_test
        │   ├──   target_test
        │   └──   train
        ├──   ToyTrain
        │   ├──   source_test
        │   ├──   target_test
        │   └──   train
        └──   valve
            ├──   source_test
            ├──   target_test
            └──   train
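
Before launching the long-running feature extraction, it may help to verify that the merged DCASE data matches the tree above. The sketch below only checks that the expected directories exist (it does not inspect the WAV files); the function name `missing_dcase_dirs` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Names taken from the directory tree shown above.
SPLITS = ("dev", "eval")
DEVICES = ("fan", "gearbox", "pump", "slider", "ToyCar", "ToyTrain", "valve")
SUBDIRS = ("source_test", "target_test", "train")


def missing_dcase_dirs(t2_root="datasets/DCASE2021/t2"):
    """Return the expected DCASE directories that are absent under t2_root."""
    root = Path(t2_root)
    expected = [root / split / device / sub
                for split in SPLITS
                for device in DEVICES
                for sub in SUBDIRS]
    return [p for p in expected if not p.is_dir()]
```

An empty list means that all 42 expected directories are present.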

Python dependencies:

Tested on CUDA-enabled Ubuntu 20.04 with Conda and Python 3.8.1.

conda create --name dcase2021umaps python=3.8
conda activate dcase2021umaps
#
conda install -y -c conda-forge omegaconf
conda install -y -c conda-forge librosa
conda install -y -c anaconda h5py
conda install -y -c anaconda pytz
pip install coloredlogs
conda install -y -c anaconda pandas
#
conda install -y -c anaconda cython
pip install openl3==0.4.0  # TF backend should automatically recognize GPU
#
conda install -y -c conda-forge umap-learn
#
pip install randomcolor

The resulting environment has been frozen into requirements.txt. Check the file for full details on versions and dependencies.
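
A quick way to confirm that the environment is usable is to check that the main packages can be found. This is a sketch under the assumption that the import names below correspond to the packages installed above (for example, `umap-learn` is imported as `umap`); the helper `missing_modules` is illustrative only.

```python
import importlib.util

# Import names (not package names) for the dependencies installed above.
MODULES = ("omegaconf", "librosa", "h5py", "pytz", "coloredlogs",
           "pandas", "openl3", "umap", "randomcolor")


def missing_modules(names=MODULES):
    """Return the module names that cannot be located in this environment."""
    # find_spec returns None for a top-level module that is not installed
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Running `missing_modules()` inside the activated Conda environment should return an empty list if the installation succeeded.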

Precompute features:

Results are HDF5 files with 3 keys:

data: A matrix of shape (num_features, length) with the representation of all audio files concatenated across the length.
data_idxs: A matrix of shape (2, num_files), where each (beg, end) pair marks the beginning and end index of an audio file in the data matrix.
metadata: An array of length num_files in the same order as data_idxs. Each entry contains a string with the file metadata. For AudioSet and Fraunhofer, this is the relative filepath. For DCASE, it is a rich JSON object.

The reason for this design is that we want to have as much data as possible in a single contiguous memory chunk, for performance reasons. Encoding metadata as strings allows enough flexibility for all used datasets.
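
As a minimal sketch of how this layout can be consumed, the helper below slices the block of one audio file out of the concatenated matrix. It assumes that each (beg, end) pair is a half-open range over the columns of data, which is an assumption and may differ from the convention used by the scripts. The name `file_frames` and the file path in the comment are illustrative, not part of the repository.

```python
import numpy as np


def file_frames(data, data_idxs, file_index):
    """Return the (num_features, n_frames) block belonging to one audio file.

    Works with NumPy arrays and with h5py datasets, since both support
    this kind of 2D slicing.
    """
    beg, end = data_idxs[:, file_index]
    # Assumption: end is exclusive, as in standard Python slicing
    return data[:, int(beg):int(end)]


# With a real file, usage could look like this (path is a placeholder):
# import h5py
# with h5py.File("precomputed_features/<name>.h5", "r") as f:
#     block = file_frames(f["data"], f["data_idxs"], 0)
#     info = f["metadata"][0]
```

Because only the requested columns are read, a single file can be inspected without loading the whole matrix into RAM.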

Run the following commands to precompute all features:

# Fixed features:
python 00a_precompute_dcase_fixed.py WAV_NORM=absmax ROOT_PATH=datasets/DCASE2021/t2
python 00b_precompute_audioset_fixed.py WAV_NORM=absmax ROOT_PATH=datasets/AudioSet_fragments/
python 00c_precompute_fraunhofer_fixed.py WAV_NORM=absmax ROOT_PATH=datasets/IDMT-ISA-ELECTRIC-ENGINE/

# L3 embeddings: each call to L3 is slow, so a higher NUM_FILES_PER_L3_RUN is faster but consumes more RAM.
# GPU computation also speeds up processing, but GPU memory is limited; its usage can be controlled with L3_BATCHSIZE.
# The parameters below should be good for an 8GB GPU and 32GB of RAM.
python 00d_precompute_dcase_l3.py WAV_NORM=absmax ROOT_PATH=datasets/DCASE2021/t2 NUM_FILES_PER_L3_RUN=200 L3_BATCHSIZE=16
python 00e_precompute_audioset_l3.py WAV_NORM=absmax ROOT_PATH=datasets/AudioSet_fragments/ NUM_FILES_PER_L3_RUN=200 L3_BATCHSIZE=16
python 00f_precompute_fraunhofer_l3.py WAV_NORM=absmax ROOT_PATH=datasets/IDMT-ISA-ELECTRIC-ENGINE/ NUM_FILES_PER_L3_RUN=200 L3_BATCHSIZE=16

Commands to compute per-device UMAPs

Results are pickled dictionaries with the following keys: config, audioset, fraunhofer, (train, valve, 00, source), .... The config key contains a string with the parameters used. Each of the other entries corresponds to a dataset split and contains a dictionary with 4 keys: umaps, metadata, global_idxs, relative_idxs. The umaps entry is an array of shape (N, 2) containing N samples from the computed UMAP. The others are N-element lists containing per-sample info: metadata about file path and labels, the global index to find the frame in the original HDF5 matrix, and the relative index to find the frame in the original file. This makes it possible to trace each UMAP dot back to its corresponding audio wave or frame, which can be useful to e.g. compute the energies.
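
The structure described above can be inspected with a few lines of Python. The following is a sketch, not the repository's own loader: the helper names `load_umaps` and `summarize_umaps` are hypothetical, and the only thing assumed about the keys is that "config" holds the parameter string while every other key holds a split dictionary.

```python
import pickle


def load_umaps(path):
    """Load one pickled UMAP result dictionary from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


def summarize_umaps(results):
    """Map each dataset-split key to its number of UMAP samples (N)."""
    summary = {}
    for key, entry in results.items():
        if key == "config":  # parameter string, not a dataset split
            continue
        n = len(entry["umaps"])
        # The per-sample lists are described as having N elements each
        for field in ("metadata", "global_idxs", "relative_idxs"):
            if len(entry[field]) != n:
                raise ValueError(f"{key}: {field} length differs from umaps")
        summary[key] = n
    return summary
```

For example, `summarize_umaps(load_umaps(path))` for one of the files in umaps/ gives a quick overview of how many points each split contributes before plotting.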

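# Define these variables for all computations (choose ONE of the two blocks below)

# With external datasets (AudioSet and Fraunhofer):
STACK=1  # same with STACK=5
TRAIN_SZ=10000
TEST_SZ=20000
AUDIOSET_SZ=10000
FRAUNHOFER_SZ=10000

# Without external datasets:
STACK=10
TRAIN_SZ=10000
TEST_SZ=20000
AUDIOSET_SZ=1
FRAUNHOFER_SZ=1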


# Define these variables for the L3 computations
MOD=l3
AUDIOSET=precomputed_features/audioset_wavnorm=absmax_l3env_hop0.1_linear512.h5
FRAUNHOFER=precomputed_features/fraunhofer_wavnorm=absmax_l3env_hop0.1_linear512.h5
TRAIN=precomputed_features/dcase2021_t2_train_wavnorm=absmax_l3env_hop0.1_linear512.h5
TEST=precomputed_features/dcase2021_t2_cv_wavnorm=absmax_l3env_hop0.1_linear512.h5

# Define these variables for the mel computations
MOD=mel
AUDIOSET=precomputed_features/audioset_wavnorm=absmax_mel_win1024_hop512_m128.h5
FRAUNHOFER=precomputed_features/fraunhofer_wavnorm=absmax_mel_win1024_hop512_m128.h5
TRAIN=precomputed_features/dcase2021_t2_train_wavnorm=absmax_mel_win1024_hop512_m128.h5
TEST=precomputed_features/dcase2021_t2_cv_wavnorm=absmax_mel_win1024_hop512_m128.h5

# Define these variables for the stft computations
MOD=stft
AUDIOSET=precomputed_features/audioset_wavnorm=absmax_stft_win1024_hop512.h5
FRAUNHOFER=precomputed_features/fraunhofer_wavnorm=absmax_stft_win1024_hop512.h5
TRAIN=precomputed_features/dcase2021_t2_train_wavnorm=absmax_stft_win1024_hop512.h5
TEST=precomputed_features/dcase2021_t2_cv_wavnorm=absmax_stft_win1024_hop512.h5

# Once the variables of choice are defined, run one UMAP computation per device
for d in fan gearbox pump slider valve ToyCar ToyTrain; do python 01a_precompute_umaps.py STACK=$STACK MODALITY=$MOD MAX_AUDIOSET=$AUDIOSET_SZ MAX_FRAUNHOFER=$FRAUNHOFER_SZ MAX_DCASE_TRAIN=$TRAIN_SZ MAX_DCASE_TEST=$TEST_SZ SPLITS_NAME=$d DCASE_TRAIN_PATH=$TRAIN DCASE_TEST_PATH=$TEST AUDIOSET_PATH=$AUDIOSET FRAUNHOFER_PATH=$FRAUNHOFER "DCASE_SPLITS=[[$d, '00', source], [$d, '00', target], [$d, '01', source], [$d, '01', target], [$d, '02', source], [$d, '02', target], [$d, '03', source], [$d, '03', target], [$d, '04', source], [$d, '04', target], [$d, '05', source], [$d, '05', target]]"; done

Commands to compute global UMAPs

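# Define these variables for all computations (choose ONE of the two blocks below)

# With external datasets (AudioSet and Fraunhofer):
STACK=1  # STACK=5
TRAIN_SZ=1000
TEST_SZ=2000
AUDIOSET_SZ=50000
FRAUNHOFER_SZ=50000

# Without external datasets:
STACK=10
TRAIN_SZ=2000
TEST_SZ=2000
AUDIOSET_SZ=1
FRAUNHOFER_SZ=1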

# Define these variables for the L3 computations
MOD=L3
AUDIOSET=precomputed_features/audioset_wavnorm=absmax_l3env_hop0.1_linear512.h5
FRAUNHOFER=precomputed_features/fraunhofer_wavnorm=absmax_l3env_hop0.1_linear512.h5
TRAIN=precomputed_features/dcase2021_t2_train_wavnorm=absmax_l3env_hop0.1_linear512.h5
TEST=precomputed_features/dcase2021_t2_cv_wavnorm=absmax_l3env_hop0.1_linear512.h5

# Define these variables for the mel computations
MOD=mel
AUDIOSET=precomputed_features/audioset_wavnorm=absmax_mel_win1024_hop512_m128.h5
FRAUNHOFER=precomputed_features/fraunhofer_wavnorm=absmax_mel_win1024_hop512_m128.h5
TRAIN=precomputed_features/dcase2021_t2_train_wavnorm=absmax_mel_win1024_hop512_m128.h5
TEST=precomputed_features/dcase2021_t2_cv_wavnorm=absmax_mel_win1024_hop512_m128.h5

# Define these variables for the stft computations
MOD=stft
AUDIOSET=precomputed_features/audioset_wavnorm=absmax_stft_win1024_hop512.h5
FRAUNHOFER=precomputed_features/fraunhofer_wavnorm=absmax_stft_win1024_hop512.h5
TRAIN=precomputed_features/dcase2021_t2_train_wavnorm=absmax_stft_win1024_hop512.h5
TEST=precomputed_features/dcase2021_t2_cv_wavnorm=absmax_stft_win1024_hop512.h5

# Once the variables of choice are defined, run the global UMAP computation
python 01a_precompute_umaps.py STACK=$STACK MODALITY=$MOD MAX_AUDIOSET=$AUDIOSET_SZ MAX_FRAUNHOFER=$FRAUNHOFER_SZ MAX_DCASE_TRAIN=$TRAIN_SZ MAX_DCASE_TEST=$TEST_SZ SPLITS_NAME=GLOBAL DCASE_TRAIN_PATH=$TRAIN DCASE_TEST_PATH=$TEST AUDIOSET_PATH=$AUDIOSET FRAUNHOFER_PATH=$FRAUNHOFER "DCASE_SPLITS=[[fan, '00', source], [fan, '00', target], [fan, '01', source], [fan, '01', target], [fan, '02', source], [fan, '02', target], [fan, '03', source], [fan, '03', target], [fan, '04', source], [fan, '04', target], [fan, '05', source], [fan, '05', target], [gearbox, '00', source], [gearbox, '00', target], [gearbox, '01', source], [gearbox, '01', target], [gearbox, '02', source], [gearbox, '02', target], [gearbox, '03', source], [gearbox, '03', target], [gearbox, '04', source], [gearbox, '04', target], [gearbox, '05', source], [gearbox, '05', target], [pump, '00', source], [pump, '00', target], [pump, '01', source], [pump, '01', target], [pump, '02', source], [pump, '02', target], [pump, '03', source], [pump, '03', target], [pump, '04', source], [pump, '04', target], [pump, '05', source], [pump, '05', target], [slider, '00', source], [slider, '00', target], [slider, '01', source], [slider, '01', target], [slider, '02', source], [slider, '02', target], [slider, '03', source], [slider, '03', target], [slider, '04', source], [slider, '04', target], [slider, '05', source], [slider, '05', target], [valve, '00', source], [valve, '00', target], [valve, '01', source], [valve, '01', target], [valve, '02', source], [valve, '02', target], [valve, '03', source], [valve, '03', target], [valve, '04', source], [valve, '04', target], [valve, '05', source], [valve, '05', target], [ToyCar, '00', source], [ToyCar, '00', target], [ToyCar, '01', source], [ToyCar, '01', target], [ToyCar, '02', source], [ToyCar, '02', target], [ToyCar, '03', source], [ToyCar, '03', target], [ToyCar, '04', source], [ToyCar, '04', target], [ToyCar, '05', source], [ToyCar, '05', target], [ToyTrain, '00', source], 
[ToyTrain, '00', target], [ToyTrain, '01', source], [ToyTrain, '01', target], [ToyTrain, '02', source], [ToyTrain, '02', target], [ToyTrain, '03', source], [ToyTrain, '03', target], [ToyTrain, '04', source], [ToyTrain, '04', target], [ToyTrain, '05', source], [ToyTrain, '05', target]]"
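
The long DCASE_SPLITS argument follows a regular pattern (every device, sections 00 to 05, source and target), so it can be generated instead of typed by hand. The sketch below reproduces the string format used in the commands above; the function name `dcase_splits` is illustrative and not part of the repository.

```python
# Device order, sections and domains as they appear in the commands above.
DEVICES = ("fan", "gearbox", "pump", "slider", "valve", "ToyCar", "ToyTrain")
SECTIONS = ("00", "01", "02", "03", "04", "05")
DOMAINS = ("source", "target")


def dcase_splits(devices=DEVICES):
    """Build the list-of-triples string passed as DCASE_SPLITS."""
    items = [f"[{device}, '{section}', {domain}]"
             for device in devices
             for section in SECTIONS
             for domain in DOMAINS]
    return "[" + ", ".join(items) + "]"
```

`dcase_splits(["fan"])` yields the per-device value for fan, while `dcase_splits()` yields the global value; the result is meant to be passed as a single quoted shell argument of the form "DCASE_SPLITS=...".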

Per-Section Plots:

# LogMels and STFTs: choose ONE of the variable blocks below before running the loop. 500/100 is a reasonable, general approximation to locate the spike of the convex cone.

STACK=1
MODALITY=mel
excl=500
avg=100

STACK=5
MODALITY=mel
excl=500
avg=100

STACK=1
MODALITY=stft
excl=500
avg=100

STACK=5
MODALITY=stft
excl=500
avg=100

for d in fan gearbox pump slider valve ToyCar ToyTrain; do for s in 0 1 2; do pth=UMAP_modality=${MODALITY}_splits=${d}_stack=${STACK}_maxDcaseTrain=10000_maxDcaseTest=20000_maxAudioset=10000_maxFraunhofer=10000.pickle; python 02a_single_section_plot.py WITH_CROSS=true DEVICE=${d} DEVICE_UMAP_PATH=umaps/$pth SECTION=${s} SAVEFIG_PATH="umap_plots/${pth}_section${s}.png" CROSS_EXCLUDE_LOWEST=${excl} CROSS_AVERAGE_N=${avg}; done; done


# L3 embeddings don't have a defined energy, so they are plotted without a cross, which also computes faster. Choose ONE of the variable blocks below.

STACK=1
MODALITY=l3

STACK=5
MODALITY=l3

for d in fan gearbox pump slider valve ToyCar ToyTrain; do for s in 0 1 2; do pth=UMAP_modality=${MODALITY}_splits=${d}_stack=${STACK}_maxDcaseTrain=10000_maxDcaseTest=20000_maxAudioset=10000_maxFraunhofer=10000.pickle; python 02a_single_section_plot.py WITH_CROSS=false DEVICE=${d} DEVICE_UMAP_PATH=umaps/$pth SECTION=${s} SAVEFIG_PATH="umap_plots/${pth}_section${s}.png"; done; done
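
The plotting commands locate each UMAP file through a name that encodes the computation parameters. As a sketch, the helper below rebuilds that name from its parts, following the pattern visible in the commands of this README; the function `umap_pickle_name` is hypothetical, and it is an assumption that the files written by the 01... script follow exactly this pattern.

```python
def umap_pickle_name(modality, splits, stack, max_train, max_test,
                     max_audioset, max_fraunhofer):
    """Rebuild the UMAP pickle filename used in the plotting commands."""
    return (f"UMAP_modality={modality}_splits={splits}_stack={stack}"
            f"_maxDcaseTrain={max_train}_maxDcaseTest={max_test}"
            f"_maxAudioset={max_audioset}_maxFraunhofer={max_fraunhofer}.pickle")
```

For instance, `umap_pickle_name("mel", "pump", 5, 10000, 20000, 10000, 10000)` gives the per-device filename for the pump with mel features and a stack of 5.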

Per-Device Plots with external datasets:

# LogMels and STFTs:
STACK=1
MODALITY=mel
excl=500
avg=100

STACK=5
MODALITY=mel
excl=500
avg=100

STACK=1
MODALITY=stft
excl=500
avg=100

STACK=5
MODALITY=stft
excl=500
avg=100

for d in fan gearbox pump slider valve ToyCar ToyTrain; do pth=UMAP_modality=${MODALITY}_splits=${d}_stack=${STACK}_maxDcaseTrain=10000_maxDcaseTest=20000_maxAudioset=10000_maxFraunhofer=10000.pickle; python 02b_single_device_plot.py WITH_CROSS=true DEVICE=${d} DEVICE_UMAP_PATH=umaps/$pth  SAVEFIG_PATH="umap_plots/${pth}_device.png" CROSS_EXCLUDE_LOWEST=${excl} CROSS_AVERAGE_N=${avg}; done


# L3 embeddings
STACK=1
MODALITY=l3

STACK=5
MODALITY=l3

for d in fan gearbox pump slider valve ToyCar ToyTrain; do pth=UMAP_modality=${MODALITY}_splits=${d}_stack=${STACK}_maxDcaseTrain=10000_maxDcaseTest=20000_maxAudioset=10000_maxFraunhofer=10000.pickle; python 02b_single_device_plot.py WITH_CROSS=false DEVICE=${d} DEVICE_UMAP_PATH=umaps/$pth  SAVEFIG_PATH="umap_plots/${pth}_device.png"; done

Global plots with external datasets:

pth="UMAP_modality=stft_splits=GLOBAL_stack=5_maxDcaseTrain=1000_maxDcaseTest=2000_maxAudioset=50000_maxFraunhofer=50000.pickle"; python 02c_global_plot.py GLOBAL_UMAP_PATH=umaps/${pth} SAVEFIG_PATH=umap_plots/${pth}_global.png

pth="UMAP_modality=mel_splits=GLOBAL_stack=5_maxDcaseTrain=1000_maxDcaseTest=2000_maxAudioset=50000_maxFraunhofer=50000.pickle"; python 02c_global_plot.py GLOBAL_UMAP_PATH=umaps/${pth} SAVEFIG_PATH=umap_plots/${pth}_global.png

pth="UMAP_modality=L3_splits=GLOBAL_stack=5_maxDcaseTrain=1000_maxDcaseTest=2000_maxAudioset=50000_maxFraunhofer=50000.pickle"; python 02c_global_plot.py GLOBAL_UMAP_PATH=umaps/${pth} SAVEFIG_PATH=umap_plots/${pth}_global.png

Per-Device Plots without external datasets:

# LogMels, STFTs and L3 embeddings: keep only ONE of the MODALITY lines below
STACK=10
MODALITY=stft
MODALITY=mel
MODALITY=l3

for d in fan gearbox pump slider valve ToyCar ToyTrain; do pth=UMAP_modality=${MODALITY}_splits=${d}_stack=${STACK}_maxDcaseTrain=10000_maxDcaseTest=20000_maxAudioset=1_maxFraunhofer=1.pickle; python 02b_single_device_plot.py PLOT_AUDIOSET=false PLOT_FRAUNHOFER=false WITH_CROSS=false DEVICE=${d} DEVICE_UMAP_PATH=umaps/$pth  SAVEFIG_PATH="umap_plots/${pth}_device.png"; done

Global plots without external datasets:

pth="UMAP_modality=stft_splits=GLOBAL_stack=10_maxDcaseTrain=2000_maxDcaseTest=2000_maxAudioset=1_maxFraunhofer=1.pickle"; python 02c_global_plot.py GLOBAL_UMAP_PATH=umaps/${pth} SAVEFIG_PATH=umap_plots/${pth}_global.png

pth="UMAP_modality=mel_splits=GLOBAL_stack=10_maxDcaseTrain=2000_maxDcaseTest=2000_maxAudioset=1_maxFraunhofer=1.pickle"; python 02c_global_plot.py GLOBAL_UMAP_PATH=umaps/${pth} SAVEFIG_PATH=umap_plots/${pth}_global.png

pth="UMAP_modality=L3_splits=GLOBAL_stack=10_maxDcaseTrain=2000_maxDcaseTest=2000_maxAudioset=1_maxFraunhofer=1.pickle"; python 02c_global_plot.py GLOBAL_UMAP_PATH=umaps/${pth} SAVEFIG_PATH=umap_plots/${pth}_global.png

Paper plots:

# Mel pump stack 5 excerpt
pth=UMAP_modality=mel_splits=pump_stack=5_maxDcaseTrain=10000_maxDcaseTest=20000_maxAudioset=10000_maxFraunhofer=10000.pickle; python 03f_single_device_plot_paper_explanatory.py PLOT_AUDIOSET=false PLOT_FRAUNHOFER=false WITH_CROSS=false DEVICE=pump DEVICE_UMAP_PATH=umaps/${pth} CUT_TOP=0.597 CUT_LEFT=0.807 CUT_BOTTOM=0.33 CUT_RIGHT=0.095 FIG_MARGIN_RIGHT=0.7 FIG_LEGEND_POS=0.89 SAVEFIG_PATH=umap_plots/${pth}_device_plot_paper.png DCASE_SHADOW_SIZE=60 DOT_SIZE=20 LEGEND_WIDTH_FACTOR=1.5 LEGEND_FONT_SIZE=31

# Global STFT plot
pth="UMAP_modality=stft_splits=GLOBAL_stack=10_maxDcaseTrain=2000_maxDcaseTest=2000_maxAudioset=1_maxFraunhofer=1.pickle"; python 03c_global_plot_paper.py GLOBAL_UMAP_PATH=umaps/${pth} PLOT_LEGEND=true PLOT_AUDIOSET=false PLOT_FRAUNHOFER=false SAVEFIG_PATH=umap_plots/${pth}_global_paper.png

# L3 ToyCar stack 1 device
pth=UMAP_modality=l3_splits=ToyCar_stack=1_maxDcaseTrain=10000_maxDcaseTest=20000_maxAudioset=10000_maxFraunhofer=10000.pickle; python 03b_single_device_plot_paper.py PLOT_AUDIOSET=false PLOT_FRAUNHOFER=false WITH_CROSS=false DEVICE=ToyCar DEVICE_UMAP_PATH=umaps/${pth} CUT_TOP=0.05 CUT_LEFT=0.06 CUT_BOTTOM=0.04 CUT_RIGHT=0 FIG_MARGIN_RIGHT=0.7 FIG_LEGEND_POS=0.82 SAVEFIG_PATH=umap_plots/${pth}_device_plot_paper.png

# Mel Valve stack 1 with cross
pth=UMAP_modality=mel_splits=valve_stack=1_maxDcaseTrain=10000_maxDcaseTest=20000_maxAudioset=10000_maxFraunhofer=10000.pickle; python 03a_single_section_plot_paper.py WITH_CROSS=true CROSS_EXCLUDE_LOWEST=500 CROSS_AVERAGE_N=200 DEVICE=valve DEVICE_UMAP_PATH=umaps/$pth SECTION=0 CUT_TOP=0.42 CUT_LEFT=0.37 CUT_BOTTOM=0.14 CUT_RIGHT=0.25 SAVEFIG_PATH=umap_plots/${pth}_section0_plot_paper.png

# L3 fan stack 1
pth=UMAP_modality=l3_splits=fan_stack=1_maxDcaseTrain=10000_maxDcaseTest=20000_maxAudioset=10000_maxFraunhofer=10000.pickle; python 03b_single_device_plot_paper.py PLOT_AUDIOSET=true PLOT_FRAUNHOFER=true WITH_CROSS=false DEVICE=fan DEVICE_UMAP_PATH=umaps/${pth} CUT_TOP=0.05 CUT_LEFT=0.06 CUT_BOTTOM=0.04 CUT_RIGHT=-0.2 FIG_MARGIN_RIGHT=0.8 FIG_LEGEND_POS=0.8 SAVEFIG_PATH=umap_plots/${pth}_device_plot_paper.png

Trimming and compressing plots:

# trim
for i in umap_plots/*.png; do convert "$i" -trim "$i"; done
# compress: 2 is best quality, 31 worst
# https://stackoverflow.com/questions/10225403/how-can-i-extract-a-good-quality-jpeg-image-from-a-video-file-with-ffmpeg/10234065#10234065
for i in umap_plots/*.png; do ffmpeg -i "$i" -qscale:v 15 "${i/.png/.jpg}"; done
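
To preview what the shell loops above would do, the sketch below builds the same ImageMagick and ffmpeg command lines for every PNG in a directory without executing them. The function name `postprocess_commands` is illustrative; unlike the shell substitution, it replaces only the final .png suffix, and it lists the trim and compress commands file by file rather than in two separate passes.

```python
from pathlib import Path


def postprocess_commands(plot_dir="umap_plots", quality=15):
    """Return the trim and compress commands for each PNG, as argument lists."""
    commands = []
    for png in sorted(Path(plot_dir).glob("*.png")):
        jpg = png.with_suffix(".jpg")
        # Trim the image borders in place
        commands.append(["convert", str(png), "-trim", str(png)])
        # Compress to JPEG: 2 is best quality, 31 worst
        commands.append(["ffmpeg", "-i", str(png), "-qscale:v", str(quality), str(jpg)])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the preview looks right, provided ImageMagick and ffmpeg are installed.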

*Work supported by EPSRC grants EP/T019751/1 (AI for Sound) and EP/T022205/1 (JADE2 Tier 2 HPC facility)*
