
[Feature] mm and thinking model support structured output #2749


Open
wants to merge 3 commits into develop from mm_structred_output
Conversation

kevincheng2
Contributor

  1. mm and thinking models support structured output
  2. Offline inference supports structured output


paddle-bot bot commented Jul 8, 2025

Thanks for your contribution!

@kevincheng2 changed the title from "[vl] mm and thinking model support structured output" to "[Feature] mm and thinking model support structured output" on Jul 8, 2025
Copilot: This comment was marked as outdated.

@kevincheng2 force-pushed the mm_structred_output branch from d07f737 to 72de4a3 on July 11, 2025 06:41
@Jiang-Jia-Jun requested a review from Copilot on July 12, 2025 16:08

Copilot AI left a comment


Pull Request Overview

This PR adds structured output support via guided decoding (reasoning parsers) for multi-modal and thinking models, including offline inference capabilities.

  • Introduce a new --reasoning_parser CLI argument and propagate it through configuration to model runners.
  • Extend the sampling and guided decoding pipeline: updated Sampler, guided backend interfaces, and skip-index logic.
  • Enhance SamplingParams with GuidedDecodingParams and document offline inference usage for structured outputs (a usage sketch follows below).
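To make the offline path concrete, here is a minimal sketch of how a GuidedDecodingParams constraint might be attached to SamplingParams. The keyword names (json=, guided_decoding=) and the commented-out LLM entry point are assumptions based on the PR summary and docs/features/structured_outputs.md, not verified signatures.

    # Minimal offline-inference sketch; keyword names are assumed from the
    # PR summary rather than taken from the actual FastDeploy API.
    from fastdeploy.engine.sampling_params import GuidedDecodingParams, SamplingParams

    # Constrain generation to a small JSON schema (assumed "json" keyword).
    guided = GuidedDecodingParams(
        json={
            "type": "object",
            "properties": {"answer": {"type": "string"}},
            "required": ["answer"],
        }
    )
    params = SamplingParams(temperature=0.0, guided_decoding=guided)

    # llm = LLM(model="<model path>", reasoning_parser="<parser name>")
    # outputs = llm.generate(prompts=["Reply with a JSON object."], sampling_params=params)

Presumably the reasoning parser is what lets the sampler avoid applying the constraint mask while a thinking model is still producing its reasoning tokens, which is where the skip-index logic mentioned above comes in.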

Reviewed Changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.

Summary of changed files:
  • fastdeploy/worker/worker_process.py: Add the --reasoning_parser CLI argument and integrate it into FDConfig.
  • fastdeploy/worker/vl_gpu_model_runner.py: Initialize the guided backend and reasoning parser; update the guided decoding flow in the GPU model runner.
  • fastdeploy/model_executor/layers/sample/sampler.py: Enhance Sampler to support reasoning parsing and skip indices when masking tokens.
  • fastdeploy/engine/sampling_params.py: Introduce GuidedDecodingParams in SamplingParams for offline structured inference.
  • docs/features/structured_outputs.md: Add offline inference examples for structured output using GuidedDecodingParams.
Comments suppressed due to low confidence (3)

fastdeploy/worker/vl_gpu_model_runner.py:145

  • The code checks for guided_json, guided_regex, guided_grammar, and structural_tag but does not handle guided_choice from GuidedDecodingParams. Add support for guided_choice so that all constraint types are honored (a sketch of one possible branch follows below).
        elif request.guided_grammar is not None:
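One way to close that gap, sketched under the assumption that the backend already consumes regular expressions for guided_regex, is to compile the guided_choice list into a regex alternation and reuse the existing regex path. The helper below is hypothetical and only marks where the branch would be wired in.

    import re

    def choice_to_regex(choices):
        """Hypothetical helper: express a guided_choice constraint as a regex
        alternation that the existing guided_regex handling could consume."""
        return "(" + "|".join(re.escape(choice) for choice in choices) + ")"

    # The missing branch could then mirror the guided_regex case, e.g.:
    # elif request.guided_choice is not None:
    #     regex = choice_to_regex(request.guided_choice)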

fastdeploy/engine/engine.py:1049

  • The code references self.cfg.reasoning_parser, but reasoning_parser is not defined on the engine config object; it should likely reference self.cfg.model_config.reasoning_parser (the corrected line is sketched below).
            f" --reasoning_parser {self.cfg.reasoning_parser}")

fastdeploy/worker/vl_gpu_model_runner.py:152

  • Using request.get(...) may not work if request is not a dict-like object. Consider using getattr(request, 'enable_thinking', True) to access the attribute safely (see the sketch below).
            enable_thinking=request.get("enable_thinking", True),
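A small sketch of the safer access pattern suggested above; the helper name is hypothetical and the default of True mirrors the original call.

    def get_enable_thinking(request):
        """Hypothetical helper: read enable_thinking from either a dict-like
        request or an attribute-style request object, defaulting to True."""
        if isinstance(request, dict):
            return request.get("enable_thinking", True)
        return getattr(request, "enable_thinking", True)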

@kevincheng2 force-pushed the mm_structred_output branch from e99d5a7 to 8ae5d81 on July 17, 2025 08:39
@kevincheng2 force-pushed the mm_structred_output branch from aac8503 to 04c2f3c on July 17, 2025 12:44
Jiang-Jia-Jun previously approved these changes Jul 18, 2025
@kevincheng2 force-pushed the mm_structred_output branch from 3f12790 to 2ef373a on July 18, 2025 11:21
@kevincheng2 force-pushed the mm_structred_output branch from 2ef373a to 69fc3a2 on July 18, 2025 11:29