[feat] update swe-agent runtime params for long-context DP attention #959

Open
guapisolo wants to merge 1 commit into main from feat/update-swe-runtime-params

@guapisolo
Collaborator

  • Increase default session server timeout from 600s to 1800s
  • Increase max_seq_len to 64000 and rollout-max-response-len to 16384
  • Configure 8-GPU DP attention (data-parallel-size 8, enable-dp-attention)
  • Explicitly set --miles-router-timeout 3600 for long agent tasks
  • Add commented-out speculative decoding and MoE params for future use

Made-with: Cursor

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request updates the configuration for the swe-agent-v2 experiment, including increasing the maximum sequence length, adjusting rollout response lengths, and updating the SGLang engine arguments to support data-parallel attention and speculative decoding. Feedback was provided regarding incorrect GPU and data-parallel configuration flags, the need to comment out incomplete speculative decoding parameters to prevent initialization failures, and a recommendation to remove unused commented-out code.

# Agent tasks can run long (complex CoT + multi-step tool calls);
# default 1800s may not be enough for the hardest instances.
"--miles-router-timeout 3600 "
"--rollout-num-gpus-per-engine 8 "
Contributor

high

Setting --rollout-num-gpus-per-engine to 8 will set the Tensor Parallel (TP) size to 8 in the SGLang engine (see miles/backends/sglang_utils/sglang_engine.py:622). Combined with --sglang-data-parallel-size 8, this would require 64 GPUs (TP=8 * DP=8). For an 8-GPU setup intended to use Data Parallel (DP) attention, this should be set to 1 so that each of the 8 replicas uses a single GPU.

Suggested change
- "--rollout-num-gpus-per-engine 8 "
+ "--rollout-num-gpus-per-engine 1 "

Contributor

Here rollout engine = 8 is fine, but we will need EP 8 for MoE, right? Otherwise the MoE part would run as TP 8 instead of EP 8.
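
The two comments above count GPUs differently. The sketch below puts both accounting models side by side. It assumes, without confirmation from this PR, that DP attention shards the existing TP group instead of replicating it; `gpus_per_engine` is a hypothetical helper, not a miles or SGLang function.

```python
def gpus_per_engine(tp_size: int, dp_size: int, dp_attention: bool) -> int:
    """Return the GPUs one engine occupies under the two accounting models."""
    if dp_attention:
        # Assumption: DP attention partitions the TP group for the attention
        # layers, so dp_size must divide tp_size and adds no extra GPUs.
        if tp_size % dp_size != 0:
            raise ValueError("dp_size must divide tp_size when DP attention is on")
        return tp_size
    # Plain data parallelism: each of the dp_size replicas is a full TP group.
    return tp_size * dp_size


print(gpus_per_engine(8, 8, dp_attention=True))
print(gpus_per_engine(8, 8, dp_attention=False))
```

Under that assumption the first call matches the "engine = 8 is fine" reading, and the second matches the 64-GPU reading in the automated review.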

# default 1800s may not be enough for the hardest instances.
"--miles-router-timeout 3600 "
"--rollout-num-gpus-per-engine 8 "
"--sglang-data-parallel-size 8 "
Contributor

high

The SGLang engine implementation specifically looks for the sglang_dp_size attribute (see miles/backends/sglang_utils/sglang_engine.py:623). Using --sglang-data-parallel-size will likely result in the setting being ignored. Use --sglang-dp-size instead.

Suggested change
- "--sglang-data-parallel-size 8 "
+ "--sglang-dp-size 8 "

Contributor

This sglang-dp-size is correct
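
Whether a given spelling reaches `sglang_dp_size` depends on how the option was registered. The sketch below uses only Python's standard `argparse` to show the mechanism. Both parsers are hypothetical and do not reproduce the miles argument parser.

```python
import argparse

# Hypothetical registration 1: two spellings share one explicit dest.
aliased = argparse.ArgumentParser()
aliased.add_argument(
    "--sglang-data-parallel-size",
    "--sglang-dp-size",
    dest="sglang_dp_size",
    type=int,
    default=1,
)
long_form = aliased.parse_args(["--sglang-data-parallel-size", "8"])
short_form = aliased.parse_args(["--sglang-dp-size", "8"])
print(long_form.sglang_dp_size, short_form.sglang_dp_size)

# Hypothetical registration 2: no dest, so argparse derives the attribute
# name from the first long option string.
derived = argparse.ArgumentParser()
derived.add_argument("--sglang-data-parallel-size", type=int, default=1)
ns = derived.parse_args(["--sglang-data-parallel-size", "8"])
print(hasattr(ns, "sglang_dp_size"), ns.sglang_data_parallel_size)
```

With the first registration either spelling populates `sglang_dp_size`; with the second, code that reads `sglang_dp_size` would not see the value. Checking the actual `add_argument` call in miles would settle which case applies.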

Comment on lines +164 to +167
"--sglang-speculative-algorithm EAGLE "
"--sglang-speculative-num-steps 2 "
"--sglang-speculative-eagle-topk 1 "
"--sglang-speculative-num-draft-tokens 3 "
Contributor

medium

These speculative decoding parameters are currently active, but the pull request description states they should be commented out for future use. Furthermore, the EAGLE algorithm requires a draft model to be specified (e.g., via --sglang-speculative-draft-model), otherwise the engine will fail to initialize. Note that parameters like eagle-topk should ideally be retrieved from configuration rather than being hardcoded.

Suggested change
- "--sglang-speculative-algorithm EAGLE "
- "--sglang-speculative-num-steps 2 "
- "--sglang-speculative-eagle-topk 1 "
- "--sglang-speculative-num-draft-tokens 3 "
+ # "--sglang-speculative-algorithm EAGLE "
+ # "--sglang-speculative-num-steps 2 "
+ # "--sglang-speculative-eagle-topk 1 "
+ # "--sglang-speculative-num-draft-tokens 3 "
References
  1. Model parameters, such as index_topk, should be retrieved from the model configuration rather than being hardcoded.

Contributor

Spec decoding by default?

Comment on lines +168 to +170
# "--sglang-expert-parallel-size 8 "
# "--sglang-moe-dense-tp-size 1 "
# "--sglang-enable-dp-lm-head "
Contributor

medium

Avoid including commented-out code in the repository. These lines should be removed if they are not currently needed.
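
One possible way to keep optional flags in the script without either enabling them by default or leaving them commented out is to gate each group behind a boolean. This is a sketch only: the flag strings are copied from the diff in this PR, while `build_engine_args`, its parameters, and the grouping are hypothetical.

```python
# Flag strings copied from the PR diff; the grouping below is hypothetical.
SPEC_DECODING_FLAGS = (
    "--sglang-speculative-algorithm EAGLE "
    "--sglang-speculative-num-steps 2 "
    "--sglang-speculative-eagle-topk 1 "
    "--sglang-speculative-num-draft-tokens 3 "
)
MOE_FLAGS = (
    "--sglang-expert-parallel-size 8 "
    "--sglang-moe-dense-tp-size 1 "
    "--sglang-enable-dp-lm-head "
)


def build_engine_args(base_args: str, spec_decoding: bool = False, moe: bool = False) -> str:
    """Append optional flag groups to base_args only when explicitly requested."""
    parts = [base_args]
    if spec_decoding:
        parts.append(SPEC_DECODING_FLAGS)
    if moe:
        parts.append(MOE_FLAGS)
    return "".join(parts)


BASE_ARGS = "--miles-router-timeout 3600 --rollout-num-gpus-per-engine 8 "
default_args = build_engine_args(BASE_ARGS)
moe_args = build_engine_args(BASE_ARGS, moe=True)
print(default_args)
print(moe_args)
```

With both switches off the output is just the base arguments, so speculative decoding stays opt-in, and the MoE group can be turned on when EP 8 is wanted.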
