Video input fails with Gemini - Pydantic validation error #35

@ebowwa

Bug Report: Video input to Gemini fails with validation error

Description

When trying to send video content to Gemini using the multimodal format, the request fails with a Pydantic validation error.

Current Behavior

Sending a video file to Gemini results in:

Extra inputs are not permitted [type=extra_forbidden, input_value=b'\x00\x00\x00 ftypisom...']
Input should be a valid list [type=list_type, input_value={'mime_type': 'video/mp4'...}]

Expected Behavior

Video should be accepted and processed by Gemini's multimodal API.

Code to Reproduce

from ai_proxy_core import CompletionClient

async def analyze_video():
    client = CompletionClient()
    messages = [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Analyze this video"},
            {
                "type": "video",
                "video": {
                    "file_path": "test_video.mp4"
                }
            }
        ]
    }]
    
    response = await client.create_completion(
        model="gemini-1.5-flash",
        messages=messages
    )
    return response

Potential Fix

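The issue appears to be in how video data is formatted for the Gemini API. The _parse_content method in google.py accepts the video input, but the structure it produces does not seem to match what Gemini's API expects. Judging by the errors above, the raw video bytes end up in a field the request schema forbids (extra_forbidden), and a dict containing mime_type is passed where a list is expected (list_type).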
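As a rough illustration of the shape the fix might need to produce, here is a minimal sketch that converts the message format from the reproduction above into a Gemini-style request structure: a list of entries, each holding a list of parts, with the video carried as base64 text under inline_data. The helper names to_gemini_parts and to_gemini_contents are hypothetical and are not taken from ai-proxy-core. The sketch assumes the provider ultimately sends a REST-style generateContent body, and it has not been checked against the actual google.py code.

```python
import base64
import mimetypes
from pathlib import Path


def to_gemini_parts(content):
    """Convert the issue's content items into a list of Gemini-style parts.

    Hypothetical helper for illustration; not part of ai-proxy-core.
    """
    parts = []
    for item in content:
        if item["type"] == "text":
            parts.append({"text": item["text"]})
        elif item["type"] == "video":
            path = Path(item["video"]["file_path"])
            # Guess the MIME type from the file extension; fall back to video/mp4.
            mime_type = mimetypes.guess_type(path.name)[0] or "video/mp4"
            # A JSON request body cannot hold raw bytes, so encode them as base64 text.
            data = base64.b64encode(path.read_bytes()).decode("ascii")
            parts.append({"inline_data": {"mime_type": mime_type, "data": data}})
        else:
            raise ValueError(f"unsupported content type: {item['type']!r}")
    return parts


def to_gemini_contents(messages):
    """Wrap each message's parts in a role/parts entry; the result is a list.

    Only the "user" role is handled here. An assistant message would need its
    role mapped to "model", which this sketch does not attempt.
    """
    return [
        {"role": m["role"], "parts": to_gemini_parts(m["content"])}
        for m in messages
    ]
```

Calling to_gemini_contents(messages) with the messages list from the reproduction would yield a one-element list whose parts field is itself a list. That addresses the list_type error if the assumption about the target shape holds. Inline data is only suitable for small clips, so larger videos would likely need a different upload path.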

Environment

  • ai-proxy-core version: latest
  • Python: 3.12
  • Gemini model: gemini-1.5-flash

Related

This is blocking video editing functionality in Nano-Banana-Shorts-Editor where Gemini needs to analyze videos to identify frames for editing.
