[Question] How to compare rollout accuracy using debug-rollout-only mode #1837

@cjy0x

Your Question

I found that the log stopped printing the rollout raw_reward value since this commit: log_rollout_data is no longer executed.

Is there any other way to compare rollout accuracy? Thanks.

What I've Tried

Only the rollout performance metrics are printed:

(RolloutManager pid=1805485) [2026-04-15 07:57:53] sglang_rollout.py:338 - Abort request for ['http://90.90.97.74:15006', 'http://90.90.97.74:15000', 'http://90.90.97.74:15004', 'http://90.90.97.74:15002']
(RolloutManager pid=1805485) [2026-04-15 07:57:53] rollout.py:1193 - perf 0: {'rollout/response_len/mean': 2773.71484375, 'rollout/response_len/median': 2837.5, 'rollout/response_len/max': 4096, 'rollout/response_len/min': 593, 'rollout/zero_std/count_0': 12, 'rollout/zero_std/count_1': 14, 'rollout/repetition_frac': 0.0, 'rollout/truncated_ratio': 0.46484375, 'perf/rollout_time': 371.1796889305115, 'perf/tokens_per_gpu_per_sec': 239.12643295688667, 'perf/longest_sample_tokens_per_sec': 11.035086568992766, 'perf/effective_tokens_per_gpu_per_sec': 239.12643295688667, 'perf/longest_effective_sample_tokens_per_sec': 11.035086568992766}
(SGLangEngine pid=1807332) [2026-04-15 07:57:53] INFO:     90.90.97.74:49018 - "POST /abort_request HTTP/1.1" 200 OK
(RolloutManager pid=1805485) 
Rollout generation:   0%|          | 0/256 [00:00<?, ?it/s]
(SGLangEngine pid=1807330) [2026-04-15 07:57:58 TP0] Prefill batch, #new-seq: 1, #new-token: 256, #cached-token: 0, full token usage: 0.01, mamba usage: 0.02, #running-req: 0, #queue-req: 0, npu graph: False, input throughput (token/s): 4.60
(SGLangEngine pid=1807331) [2026-04-15 07:58:03 TP0] Decode batch, #running-req: 60, #full token: 18560, full token usage: 0.06, mamba num: 60, mamba usage: 0.09, npu graph: False, gen throughput (token/s): 16.37, #queue-req: 0
(SGLangEngine pid=1807330) [2026-04-15 07:58:02 TP0] Prefill batch, #new-seq: 3, #new-token: 1536, #cached-token: 0, full token usage: 0.06, mamba usage: 0.09, #running-req: 57, #queue-req: 0, npu graph: False, input throughput (token/s): 41072.45 [repeated 24x across cluster]
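
To make the gap concrete, here is a minimal sketch of pulling the metrics dict out of a `perf N: {...}` line like the one in the log, so the values from two runs can be compared side by side. It is not part of slime: `parse_perf_line` and the regular expression are hypothetical helpers, and reading the number after `perf` as a rollout id is an assumption based only on this log. It can only recover what is printed, so no raw_reward value is available this way.

```python
import ast
import re

# Assumption: perf lines look like "... perf <id>: {<python dict literal>}".
PERF_RE = re.compile(r"perf (\d+): (\{.*\})")


def parse_perf_line(line):
    """Return (rollout_id, metrics) for a perf log line, or None otherwise."""
    match = PERF_RE.search(line)
    if match is None:
        return None
    # The dict is printed as a Python literal, so literal_eval can read it safely.
    return int(match.group(1)), ast.literal_eval(match.group(2))


# Shortened copy of the perf line from the log, for illustration only.
sample = (
    "[2026-04-15 07:57:53] rollout.py:1193 - perf 0: "
    "{'rollout/response_len/mean': 2773.71484375, "
    "'rollout/truncated_ratio': 0.46484375}"
)

rollout_id, metrics = parse_perf_line(sample)
print(rollout_id, metrics["rollout/truncated_ratio"])
print("raw_reward logged:", any("raw_reward" in key for key in metrics))
```

Run over the full log, this would list response length and truncation metrics per rollout, which is why an accuracy comparison still needs the reward to be logged somewhere.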

Environment (if relevant)

  • slime version: v0.2.4
