Bug Description
PR: #1823 (Add fallback for get_seqlen_balanced_partitions)
Issue: When rollout-max-response-len > max-tokens-per-gpu, a single sample's total length (prompt + response) can exceed max_tokens_per_gpu. get_minimum_num_micro_batch_size handles this correctly by isolating the oversized sample in its own micro-batch.
However, the fallback _get_capped_partitions enforces sums[i] + length <= max_tokens strictly, so the oversized sample cannot be placed in any partition and the function hits raise AssertionError("This should never happen.").
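A minimal sketch of the failing placement logic as described above (function and variable names are assumptions for illustration, not the actual slime source): a first-fit loop under a strict cap can never place a sample whose length alone exceeds max_tokens, not even in an empty partition, so it falls through to the assertion.

```python
def get_capped_partitions_sketch(lengths, max_tokens, num_partitions):
    """First-fit placement under a strict token cap (illustrative sketch).

    Each sample goes into the first partition whose running sum would
    stay within max_tokens. A sample with length > max_tokens satisfies
    the cap in NO partition, including empty ones, so the loop falls
    through to the assertion -- the bug reported here.
    """
    sums = [0] * num_partitions
    partitions = [[] for _ in range(num_partitions)]
    for length in sorted(lengths, reverse=True):
        placed = False
        for i in range(num_partitions):
            if sums[i] + length <= max_tokens:  # strict cap
                partitions[i].append(length)
                sums[i] += length
                placed = True
                break
        if not placed:
            # Oversized sample: fits nowhere under the strict check.
            raise AssertionError("This should never happen.")
    return partitions
```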
Steps to Reproduce
Repro config:
--rollout-max-response-len 8192
--max-tokens-per-gpu 4096
Any sample with prompt (~400 tokens) + response (>3696 tokens) triggers the crash.
Expected Behavior
Expected: _get_capped_partitions should match get_minimum_num_micro_batch_size's behavior — when a sample can't fit in any existing partition, place it alone in an empty partition (even if it exceeds max_tokens).
If this behavior is intentional (i.e., max-tokens-per-gpu is meant to be a hard cap), the assertion should be replaced with a meaningful error message and the limitation documented, or the constraint should be validated when the config is parsed.
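A hedged sketch of the suggested fallback (names are hypothetical, modeled on the description of get_minimum_num_micro_batch_size above): prefer a partition that keeps the cap, and only when none exists isolate the oversized sample in an empty partition rather than asserting.

```python
def place_with_fallback(lengths, max_tokens, num_partitions):
    """First-fit under the cap, with oversized samples isolated.

    An oversized sample (length > max_tokens) is placed alone in the
    first empty partition, deliberately exceeding the cap, instead of
    triggering an assertion.
    """
    sums = [0] * num_partitions
    partitions = [[] for _ in range(num_partitions)]
    for length in sorted(lengths, reverse=True):
        # Prefer a partition that stays within the cap.
        target = next((i for i in range(num_partitions)
                       if sums[i] + length <= max_tokens), None)
        if target is None:
            # Fallback: isolate the sample in an empty partition.
            target = next((i for i in range(num_partitions)
                           if sums[i] == 0), None)
        if target is None:
            raise ValueError(
                f"sample of length {length} exceeds max_tokens={max_tokens} "
                "and no empty partition remains; increase the number of "
                "micro-batches")
        partitions[target].append(length)
        sums[target] += length
    return partitions
```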
Actual Behavior
raise AssertionError("This should never happen.").
Environment
- slime version: v0.2.4 (commit 286750a)
- Python version: 3.12.3
- PyTorch version: 2.9.1+cu129
- CUDA version: 12.9
- GPU type and count: NVIDIA H200, 8 per node (4 nodes, 32 total)
- OS: Linux (Amazon Linux 2023, kernel 6.1.141)
- SGLang version: 0.5.9
- Megatron-LM version: 0.16.0
Logs
Additional Context
No response
Pre-submission Checklist