feat(openai): add pass_video_url and enable_thinking_kwarg for vLLM-served video tasks by min1321 · Pull Request #1366 · EvolvingLMMs-Lab/lmms-eval

min1321 · 2026-06-16T13:14:50Z

Summary

Add two opt-in init params to OpenAICompatible: pass_video_url and enable_thinking_kwarg, so a vLLM-served Qwen3-VL / Qwen3.5-VL backend can do server-side video decoding instead of forcing client-side frame extraction.
pass_video_url=True sends each video as {"type": "video_url", "video_url": {"url": "file://..."}} so vLLM can apply media_io_kwargs.num_frames and attach absolute-time signals; enable_thinking_kwarg forwards chat_template_kwargs.enable_thinking via extra_body.
Defaults are unchanged — existing tasks/configs are unaffected.

lmms_eval/models/chat/openai.py:
- OpenAICompatible.__init__: accept pass_video_url: bool = False and enable_thinking_kwarg: object = None.
- build_payload_for_index: when pass_video_url=True, build the OpenAI messages list manually (skip to_openai_messages) and emit each video as a video_url part. When either flag is set, populate payload["extra_body"] with media_io_kwargs and/or chat_template_kwargs.

Other adapters (vllm, sglang, huggingface, litellm, …). Same idea would apply but each has its own video path; happy to follow up if maintainers want.
Changes to qwen_vl_utils or protocol.py.
Audio / image handling — unchanged.

python -m lmms_eval --model openai --tasks extremewhenbench --model_args "...,pass_video_url=True,max_frames_num=768,enable_thinking_kwarg=False,..." against vLLM-served Qwen3.5-9B | sample size: N=2,273 | key metrics: mIoU | result: 0.048 with new flags vs. 0.003 default (existing path) — pass. Matches a hand-rolled openai-client reference (0.047) within run-to-run noise.
Default-path regression: existing tasks via --model openai without the new flags produce identical output | result: pass.

Defaults preserve existing behavior; new flags are opt-in. No new dependencies.
extra_body is OpenAI-API spec-compliant; non-vLLM backends that don't understand media_io_kwargs simply ignore it (server-specific).

…erved video tasks

feat(openai): add pass_video_url and enable_thinking_kwarg for vLLM-s…

ca256c8

…erved video tasks