This repo contains the code for the MMM24 paper "Find the Cliffhanger - Multi-Modal Trailerness in Soap Operas", also available on arXiv.
The dataset can be downloaded from here; place the GTST folder in the same folder as this file. The dataset contains extracted features for the four possible streams: visual features at the clip level, visual features at the shot level, textual features at the clip level, and textual features at the shot level.
TL;DR If you'd like to reproduce our main results:
Create the environment with mamba or conda: conda env create -f environment.yml
Then, simply run
python -m datamodule.as_semantic=False,True datamodule.as_shots=False,True general.seed=10,20,30,40,50
Then compute late fusion predictions with python
. This requires a WandB account that needs to be specified in
. Finally, to get the results table, run python
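To get a feel for how many training runs the sweep command above launches, here is a minimal sketch that expands its comma-separated overrides into individual runs. It assumes the command behaves like a Hydra-style multirun sweep (a Cartesian product over the listed values); the README does not name the tool, and the helper expand_sweep is hypothetical and not part of this repo. The two boolean flags plausibly select among the four feature streams, but that mapping is also an assumption.

```python
from itertools import product

# Override values copied from the sweep command in the steps above.
sweep = {
    "datamodule.as_semantic": [False, True],
    "datamodule.as_shots": [False, True],
    "general.seed": [10, 20, 30, 40, 50],
}


def expand_sweep(grid):
    """Return one list of key=value overrides per run (Cartesian product).

    Hypothetical helper for illustration only; it assumes grid-style
    sweep semantics and does not call any code from this repo.
    """
    keys = list(grid)
    return [
        [f"{key}={value}" for key, value in zip(keys, values)]
        for values in product(*(grid[key] for key in keys))
    ]


runs = expand_sweep(sweep)
print(f"{len(runs)} runs in total")
for overrides in runs:
    print(" ".join(overrides))
```

Under these assumptions the sweep covers 2 x 2 x 5 = 20 runs: five seeds for each of the four flag combinations.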