[rollout, tool] feat: add experimental agent framework and gateway runtime#5931
zackcxb wants to merge 4 commits into verl-project:main
Code Review
This pull request introduces a new experimental agent framework for collecting and assembling agent trajectories. It includes a GatewayActor for managing OpenAI-compatible chat sessions, a GatewayManager for routing sessions across multiple gateways, and a TrajectoryAssembler to convert collected trajectories into a training-ready DataProto. Additionally, the AsyncLLMServerManager has been updated to support this new gateway-backed session runtime. I have no feedback to provide as there are no review comments.
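For readers unfamiliar with the assembly step, here is a minimal sketch of what turning collected trajectories into a rectangular training batch can look like. It is not the PR's `TrajectoryAssembler`: the `Trajectory` record, the `assemble` function, the right-padding layout, and the plain-list output are all assumptions made for illustration, and the real code returns a `DataProto`.

```python
from dataclasses import dataclass


@dataclass
class Trajectory:
    """Hypothetical record of one finished session (not the PR's type)."""

    prompt_ids: list[int]
    response_ids: list[int]


def assemble(trajectories: list[Trajectory], pad_id: int = 0) -> dict[str, list[list[int]]]:
    """Right-pad every trajectory to a common length and mark response tokens."""
    if not trajectories:
        raise ValueError("need at least one trajectory")
    max_len = max(len(t.prompt_ids) + len(t.response_ids) for t in trajectories)
    input_ids, attention_mask, response_mask = [], [], []
    for t in trajectories:
        tokens = t.prompt_ids + t.response_ids
        padding = max_len - len(tokens)
        input_ids.append(tokens + [pad_id] * padding)
        # 1 for real tokens, 0 for padding.
        attention_mask.append([1] * len(tokens) + [0] * padding)
        # 1 only on response tokens, so a loss can ignore prompt and padding.
        response_mask.append(
            [0] * len(t.prompt_ids) + [1] * len(t.response_ids) + [0] * padding
        )
    return {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "response_mask": response_mask,
    }


batch = assemble([Trajectory([5, 6], [7]), Trajectory([8], [9, 10, 11])])
```

Every row in `batch` has the same length, which is the property a tensor-based container needs before stacking.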
if "rm_scores" in batch_tensors:
    meta_info["reward_extra_keys"] = reward_extra_keys

return DataProto(
Do not use DataProto, use TensorDict instead.
  self.config = config
  self._load_balancer = load_balancer_handle
- self._server_id_to_handle: dict[str, ray.actor.ActorHandle] = dict(servers)
+ self._server_id_to_handle: dict[str, ray.actor.ActorHandle] = dict(servers or [])
Do not modify agent_loop, we will adapt to agent gateway once it's mature.
So should we rewrite the AsyncLLMServerManager class and put it, together with anything we inherit from the current agent_loop.py, under a new path (e.g. verl/agent)?
await asyncio.gather(*[_await_ray_ref(gateway.shutdown.remote()) for gateway in self.owned_gateway_actors])
self.owned_gateway_actors = []
self.gateway_manager = None
self._server_id_to_handle: dict[str, ray.actor.ActorHandle] = dict(servers)
self._server_id_to_handle: dict[str, ray.actor.ActorHandle] = dict(servers or [])
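The `or []` fallback matters because `dict(None)` raises `TypeError`, while `dict([])` yields an empty mapping. A minimal sketch of the difference, assuming `servers` may be `None`; `build_handle_map` is a hypothetical stand-in for the constructor line, and plain strings stand in for Ray actor handles:

```python
from typing import Iterable, Optional


def build_handle_map(
    servers: Optional[Iterable[tuple[str, str]]] = None,
) -> dict[str, str]:
    """Hypothetical stand-in for the constructor line under review."""
    # `servers or []` turns None into an empty iterable, so dict() never sees None.
    return dict(servers or [])


# No servers supplied: the map is simply empty.
empty_map = build_handle_map(None)

# Servers supplied as (server_id, handle) pairs: behaviour is unchanged.
filled_map = build_handle_map([("server-0", "handle-0")])

# Without the fallback, dict(None) raises TypeError.
try:
    dict(None)
    raised = False
except TypeError:
    raised = True
```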
What does this PR do?
This PR adds an experimental agent framework and gateway runtime for multi-turn agent-style rollout in VERL, following #5790.
Specifically, it:
verl.experimental.agent_framework for a new abstraction for agent systems, with an example implementation,
verl.experimental.agent_gateway for OpenAI-compatible session serving and sticky session routing,
AsyncLLMServerManager, ownership.
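To illustrate what "sticky session routing" means here, a minimal sketch follows. It is not the PR's `GatewayManager`: the `StickyRouter` class, its methods, and the hash-based initial placement are assumptions for illustration only, and the actual routing policy may differ.

```python
import hashlib


class StickyRouter:
    """Hypothetical router that pins each session ID to one gateway."""

    def __init__(self, gateways: list[str]) -> None:
        if not gateways:
            raise ValueError("at least one gateway is required")
        self._gateways = list(gateways)
        self._assignments: dict[str, str] = {}

    def route(self, session_id: str) -> str:
        # Reuse the earlier assignment so every turn of a session hits the same gateway.
        if session_id not in self._assignments:
            # A stable digest (not the per-process salted built-in hash()) picks the gateway.
            digest = hashlib.sha256(session_id.encode("utf-8")).digest()
            index = int.from_bytes(digest[:8], "big") % len(self._gateways)
            self._assignments[session_id] = self._gateways[index]
        return self._assignments[session_id]

    def release(self, session_id: str) -> None:
        # Forget a finished session so the table does not grow without bound.
        self._assignments.pop(session_id, None)


router = StickyRouter(["gateway-0", "gateway-1", "gateway-2"])
first = router.route("session-a")
second = router.route("session-a")
```

Keeping all turns of a session on one gateway lets that gateway hold the session's conversation state locally, which is the usual motivation for stickiness.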
WIP:
Core implementation is ready for review.
I still need to finish the remaining TODO items before marking this ready for review:
e2e recipe test with SWE agents or other tool agents
multi-modal support
hygiene items for CI
Checklist Before Starting
https://github.com/verl-project/verl/pulls?q=is%3Apr+agent+framework+gateway
[{modules}] {type}: {description} (This will be checked by the CI)
{modules} include fsdp, megatron, veomni, sglang, vllm, rollout, trainer, ci, training_utils, recipe, hardware, deployment, ray, worker, single_controller, misc, perf, model, algo, env, tool, ckpt, doc, data, cfg, reward, fully_async, one_step_off, like [megatron, fsdp, doc]
{type} is in feat, fix, refactor, chore, test
[BREAKING] to the beginning of the title.
[BREAKING][fsdp, megatron] feat: dynamic batching
Test
pytest -q tests/experimental/agent_framework tests/experimental/agent_gateway
Result:
22 passed, 1 warning
API and Usage Example
This PR adds experimental APIs under:
verl.experimental.agent_framework
verl.experimental.agent_gateway
Example:
Design & Code Changes
High-level changes:
Checklist Before Submitting
(https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md#code-linting-and-formatting): pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always
(https://github.com/volcengine/verl/tree/main/.github/workflows) to cover all the code. If not feasible, explain why: focused experimental framework / gateway tests are included in this PR.
(https://verl-project.slack.com/archives/C091TCESWB1) in the verl Slack workspace (https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ). (If not accessible, please try the Feishu group (飞书群) (https://applink.larkoffice.com/client/chat/chatter/add_by_link?link_token=772jd4f1-cd91-441e-a820-498c6614126a).)
submodule commit via git submodule update --remote or cd recipe && git pull origin main.