
[Feature] mm and thinking model support structured output #2749


Open
wants to merge 3 commits into develop from mm_structred_output
Conversation

kevincheng2
Contributor

  1. mm and thinking models support structured output
  2. Offline inference supports structured output


paddle-bot bot commented Jul 8, 2025

Thanks for your contribution!

@kevincheng2 changed the title from "[vl] mm and thinking model support structured output" to "[Feature] mm and thinking model support structured output" on Jul 8, 2025
Copilot: This comment was marked as outdated.

@kevincheng2 force-pushed the mm_structred_output branch from d07f737 to 72de4a3 on July 11, 2025 06:41
@Jiang-Jia-Jun requested a review from Copilot on July 12, 2025 16:08

Copilot AI left a comment


Pull Request Overview

This PR adds structured output support via guided decoding (reasoning parsers) for multi-modal and thinking models, including offline inference capabilities.

  • Introduce a new --reasoning_parser CLI argument and propagate it through configuration to model runners.
  • Extend the sampling and guided decoding pipeline: updated Sampler, guided backend interfaces, and skip-index logic.
  • Enhance SamplingParams with GuidedDecodingParams and document offline inference usage for structured outputs (a usage sketch follows below).
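To make the offline path concrete, here is a minimal sketch of how a GuidedDecodingParams constraint might be attached to SamplingParams. The keyword names (json=, guided_decoding=) and the commented-out LLM entry point are assumptions based on the PR summary and docs/features/structured_outputs.md, not verified signatures.

    # Minimal offline-inference sketch; keyword names are assumed from the
    # PR summary rather than taken from the actual FastDeploy API.
    from fastdeploy.engine.sampling_params import GuidedDecodingParams, SamplingParams

    # Constrain generation to a small JSON schema (assumed "json" keyword).
    guided = GuidedDecodingParams(
        json={
            "type": "object",
            "properties": {"answer": {"type": "string"}},
            "required": ["answer"],
        }
    )
    params = SamplingParams(temperature=0.0, guided_decoding=guided)

    # llm = LLM(model="<model path>", reasoning_parser="<parser name>")
    # outputs = llm.generate(prompts=["Reply with a JSON object."], sampling_params=params)

Presumably the reasoning parser is what lets the sampler avoid applying the constraint mask while a thinking model is still producing its reasoning tokens, which is where the skip-index logic mentioned above comes in.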

Reviewed Changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.

Summary of changed files:
  • fastdeploy/worker/worker_process.py: Add the --reasoning_parser CLI argument and integrate it into FDConfig.
  • fastdeploy/worker/vl_gpu_model_runner.py: Initialize the guided backend and reasoning parser; update the guided decoding flow in the GPU model runner.
  • fastdeploy/model_executor/layers/sample/sampler.py: Enhance Sampler to support reasoning parsing and skip indices when masking tokens.
  • fastdeploy/engine/sampling_params.py: Introduce GuidedDecodingParams in SamplingParams for offline structured inference.
  • docs/features/structured_outputs.md: Add offline inference examples for structured output using GuidedDecodingParams.
Comments suppressed due to low confidence (3)

fastdeploy/worker/vl_gpu_model_runner.py:145

  • The code checks for guided_json, guided_regex, guided_grammar, and structural_tag but does not handle guided_choice from GuidedDecodingParams. Add support for guided_choice so that all constraint types are honored (a sketch of one possible branch follows below).
        elif request.guided_grammar is not None:
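One way to close that gap, sketched under the assumption that the backend already consumes regular expressions for guided_regex, is to compile the guided_choice list into a regex alternation and reuse the existing regex path. The helper below is hypothetical and only marks where the branch would be wired in.

    import re

    def choice_to_regex(choices):
        """Hypothetical helper: express a guided_choice constraint as a regex
        alternation that the existing guided_regex handling could consume."""
        return "(" + "|".join(re.escape(choice) for choice in choices) + ")"

    # The missing branch could then mirror the guided_regex case, e.g.:
    # elif request.guided_choice is not None:
    #     regex = choice_to_regex(request.guided_choice)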

fastdeploy/engine/engine.py:1049

  • The code references self.cfg.reasoning_parser, but reasoning_parser is not defined on the engine config object; it should likely reference self.cfg.model_config.reasoning_parser (the corrected line is sketched below).
            f" --reasoning_parser {self.cfg.reasoning_parser}")

fastdeploy/worker/vl_gpu_model_runner.py:152

  • Using request.get(...) may not work if request is not a dict-like object. Consider using getattr(request, 'enable_thinking', True) to access the attribute safely (see the sketch below).
            enable_thinking=request.get("enable_thinking", True),
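A small sketch of the safer access pattern suggested above; the helper name is hypothetical and the default of True mirrors the original call.

    def get_enable_thinking(request):
        """Hypothetical helper: read enable_thinking from either a dict-like
        request or an attribute-style request object, defaulting to True."""
        if isinstance(request, dict):
            return request.get("enable_thinking", True)
        return getattr(request, "enable_thinking", True)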

@kevincheng2 force-pushed the mm_structred_output branch from e99d5a7 to 8ae5d81 on July 17, 2025 08:39
@kevincheng2 force-pushed the mm_structred_output branch from aac8503 to 04c2f3c on July 17, 2025 12:44
Jiang-Jia-Jun previously approved these changes Jul 18, 2025
@kevincheng2 force-pushed the mm_structred_output branch from 3f12790 to 2ef373a on July 18, 2025 11:21
@kevincheng2 force-pushed the mm_structred_output branch from 2ef373a to 69fc3a2 on July 18, 2025 11:29