by Lin Long, Changdae Oh, Seongheon Park, and Sharon Li.
This repository provides tools and scripts to analyze the language prior in Vision-Language Models by examining representation distances across different layers.
1. Prepare Data

First, set up the environment variable for your data path:
```bash
export DATA_PATH=/path/to/your/data
```

Create dataset files in JSONL format under the DATA_PATH directory. Each dataset should be named {dataset}.jsonl, where each line contains:
- image: the image path or base64 string starting with "data:image/"
- instruction: the instruction text
- target_tokens: the target tokens (e.g., ["Yes", "No"] or ["A", "B", "C", "D"])
- other keys: additional keys you want to include
Example JSONL entry:
{"image": "/path/to/image.jpg", "instruction": "What color is the sky?", "target_tokens": ["A", "B", "C", "D"], "answer": "A"}We provide reference data processing scripts for several datasets in the data_preparation/ folder.
2. Generate Hidden States
Generate hidden states for your model. Using Qwen2.5-VL as an example:
```bash
CUDA_VISIBLE_DEVICES=0 python generation/gen_qwenvl.py --dataset mme
```

Multi-GPU Support: This step supports multi-GPU parallel generation. After generation is complete, you need to merge the results:
```bash
python utils/merge.py
```
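The generation scripts handle this step end-to-end. Purely as an illustration of what the per-layer hidden states are, the sketch below pulls them from Qwen2.5-VL via the Hugging Face transformers integration; the checkpoint name, last-token pooling, and output file are assumptions for the sketch, not the behavior of gen_qwenvl.py:

```python
import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

# Assumed checkpoint; the repository's generation script may use a different one.
model_name = "Qwen/Qwen2.5-VL-3B-Instruct"
processor = AutoProcessor.from_pretrained(model_name)
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_name, torch_dtype=torch.bfloat16
).to("cuda")

# One example in the dataset format described above.
image = Image.open("/path/to/image.jpg")
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "What color is the sky?"},
]}]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], images=[image], return_tensors="pt").to("cuda")

with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# outputs.hidden_states is a tuple with one tensor per layer (plus the input embeddings),
# each of shape (batch, sequence_length, hidden_size). Here we keep the last token's
# representation at every layer, one common choice for this kind of analysis.
per_layer = torch.stack([h[0, -1].float().cpu() for h in outputs.hidden_states])
torch.save(per_layer, "hidden_states.pt")  # hypothetical output path
```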
3. Plot Representation Distance Curves

Use the plotting script to visualize representation distance curves:

```bash
python plot_divergences.py --model qwenvl --dataset mme
```

Available options:
- --model: Model name (e.g., qwenvl, llava, gemma)
- --dataset: Dataset name (e.g., mme, mmbench, vlind)
- --data_path: Path to the data directory (default: "data")
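As a rough illustration of what such a curve represents, the sketch below compares per-layer hidden states for the same instruction with and without the image and plots a cosine distance per layer. The file names and the choice of cosine distance are assumptions; plot_divergences.py defines the metric actually used by the repository:

```python
import matplotlib.pyplot as plt
import torch
import torch.nn.functional as F

# Hypothetical inputs: per-layer hidden states saved for the same instruction with and
# without the image, each of shape (num_layers + 1, hidden_size).
with_image = torch.load("hidden_states_with_image.pt")
text_only = torch.load("hidden_states_text_only.pt")

# Cosine distance at every layer (an illustrative stand-in for the repository's metric).
distance = 1 - F.cosine_similarity(with_image.float(), text_only.float(), dim=-1)

plt.plot(distance.numpy())
plt.xlabel("Layer")
plt.ylabel("Representation distance")
plt.savefig("distance_curve.png")
```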
Install the required dependencies:
```bash
pip install -r requirements.txt
```