jena-shreyas / Awesome-Video-Language-Resources

A repository of Video Language papers, code and datasets.

Awesome Video Language Model Resources

This repo collects works on video-language models from the past few years that I found interesting. Hope this helps!

Models

Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval [CVPR 2023]
Code : https://github.com/XudongLinthu/upgradable-multimodal-intelligence
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners (VidIL) [NeurIPS 2022]
Code : https://github.com/MikeWangWZHL/VidIL
Invariant Grounding for Video Question Answering (IGV) [CVPR 2022]
Code : https://github.com/yl3800/IGV
Video as Conditional Graph Hierarchy for Multi-Granular Question Answering (HQGA) [AAAI 2022]
Code : https://github.com/doc-doc/HQGA

Datasets

ACQUIRED: A Dataset for Answering Counterfactual Questions In Real-Life Videos [EMNLP 2023]
Code : https://github.com/PlusLabNLP/acquired (NOT RELEASED YET)
From Representation to Reasoning: Towards both Evidence and Commonsense Reasoning for Video Question-Answering (Causal-VidQA) [CVPR 2022]
Code : https://github.com/bcmi/Causal-VidQA
ComPhy: Compositional Physical Reasoning of Objects and Events from Videos (ComPhy) [ICLR 2022]
Code : https://github.com/zfchenUnique/compositional_physics_learner
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (NExT-QA) [CVPR 2021]
Code : https://github.com/doc-doc/NExT-QA
CRAFT: A Benchmark for Causal Reasoning About Forces and inTeractions (CRAFT) [ACL 2022]
Code : https://github.com/hucvl/craft
CLEVRER: CoLlision Events for Video REpresentation and Reasoning (CLEVRER) [ICLR 2020]
Code : https://github.com/chuangg/CLEVRER

Summary papers

Video Question Answering: Datasets, Algorithms and Challenges
