Support stopping engines externally and map cell_id to concrete engines#944

Open
fzyzcjy wants to merge 14 commits into rollout_ft/26 from rollout_ft/27

Conversation

Collaborator

@fzyzcjy fzyzcjy commented Apr 7, 2026

No description provided.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces functionality to stop specific cells within the rollout manager by mapping cell IDs to their corresponding server and engine indices. A new utility module, server_cell.py, was added to handle this mapping logic, and a stop_cell method was implemented in RolloutManager. The review feedback suggests renaming the mapping function for better clarity, optimizing the stop_cell method by caching the cell ID map to avoid redundant computations, and implementing input validation to prevent out-of-bounds access.
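To make the mapping described above concrete, here is a minimal sketch of how a list indexed by cell_id could be built. The field names srv_key, group_index, and engine_indices come from the diff in this PR; everything else is an assumption for illustration. In particular, FakeGroup, FakeServer, num_engines, engines_per_cell, and build_cell_id_to_indexer are hypothetical, and the rule that a cell is a fixed-size run of consecutive engines is a guess, not the actual logic in server_cell.py.

```python
from dataclasses import dataclass


@dataclass
class CellIndexer:
    # Field names taken from the PR diff (idx.srv_key, idx.group_index, idx.engine_indices).
    srv_key: str
    group_index: int
    engine_indices: list[int]


# Hypothetical stand-ins: the real RolloutServer and server group classes
# are not shown in this PR excerpt.
@dataclass
class FakeGroup:
    num_engines: int
    engines_per_cell: int  # assumption: a cell is a run of consecutive engines


@dataclass
class FakeServer:
    server_groups: list[FakeGroup]


def build_cell_id_to_indexer(servers: dict[str, FakeServer]) -> list[CellIndexer]:
    """Return a list whose position is the cell_id (illustrative only)."""
    cells: list[CellIndexer] = []
    for srv_key, server in servers.items():
        for group_index, group in enumerate(server.server_groups):
            # Walk the group's engines in chunks of engines_per_cell.
            for start in range(0, group.num_engines, group.engines_per_cell):
                stop = min(start + group.engines_per_cell, group.num_engines)
                cells.append(CellIndexer(srv_key, group_index, list(range(start, stop))))
    return cells
```

Because cell IDs are assigned by iteration order, this sketch relies on dict insertion order being stable, which Python guarantees since 3.7.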

engine_indices: list[int]


def get_cell_indexer_of_id_map(servers: dict[str, RolloutServer]) -> list[CellIndexer]:

medium

The function name get_cell_indexer_of_id_map is confusing and grammatically awkward. Since it returns a list where the index represents the cell_id, a more descriptive name like get_cell_id_to_indexer_map would improve readability and maintainability.

Suggested change
- def get_cell_indexer_of_id_map(servers: dict[str, RolloutServer]) -> list[CellIndexer]:
+ def get_cell_id_to_indexer_map(servers: dict[str, RolloutServer]) -> list[CellIndexer]:

from miles.ray.rollout.rollout_data_conversion import postprocess_rollout_data
from miles.ray.rollout.rollout_server import RolloutServer, start_rollout_servers
from miles.ray.rollout.router_manager import start_session_server
from miles.ray.rollout.server_cell import get_cell_indexer_of_id_map

medium

Update the import to match the function rename suggested for server_cell.py.

Suggested change
- from miles.ray.rollout.server_cell import get_cell_indexer_of_id_map
+ from miles.ray.rollout.server_cell import get_cell_id_to_indexer_map

Comment on lines +234 to +237
async def stop_cell(self, cell_id: int):
    idx = get_cell_indexer_of_id_map(self.servers)[cell_id]
    group = self.servers[idx.srv_key].server_groups[idx.group_index]
    group.stop_engines(engine_indices=idx.engine_indices)

medium

Recomputing the cell mapping on every call to stop_cell is inefficient, especially if the number of engines is large. Additionally, the cell_id should be validated to ensure it is within the valid range and to prevent unexpected behavior with negative indices (which Python lists allow). Caching the mapping lazily on the instance is a good way to optimize this since self.servers is static after initialization.

Suggested change
- async def stop_cell(self, cell_id: int):
-     idx = get_cell_indexer_of_id_map(self.servers)[cell_id]
-     group = self.servers[idx.srv_key].server_groups[idx.group_index]
-     group.stop_engines(engine_indices=idx.engine_indices)
+ async def stop_cell(self, cell_id: int):
+     if not hasattr(self, "_cell_id_to_indexer"):
+         self._cell_id_to_indexer = get_cell_id_to_indexer_map(self.servers)
+     if not (0 <= cell_id < len(self._cell_id_to_indexer)):
+         raise IndexError(f"cell_id {cell_id} is out of range (0-{len(self._cell_id_to_indexer) - 1})")
+     idx = self._cell_id_to_indexer[cell_id]
+     group = self.servers[idx.srv_key].server_groups[idx.group_index]
+     group.stop_engines(engine_indices=idx.engine_indices)
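As an alternative to the hasattr check, the same lazy caching and range validation could be sketched with functools.cached_property from the standard library. This is only an illustration under stated assumptions: CellRegistry, build_map, build_count, and lookup are hypothetical names standing in for the relevant part of RolloutManager, and the sketch assumes, as the comment above does, that the mapping does not change after initialization.

```python
from functools import cached_property


class CellRegistry:
    """Hypothetical stand-in for the part of RolloutManager that resolves cell IDs."""

    def __init__(self, build_map):
        # build_map: zero-argument callable returning a list indexed by cell_id.
        self._build_map = build_map
        self.build_count = 0  # only here to show the map is built once

    @cached_property
    def cell_id_to_indexer(self) -> list:
        # Runs on first access; the result is then stored on the instance.
        self.build_count += 1
        return self._build_map()

    def lookup(self, cell_id: int):
        cells = self.cell_id_to_indexer
        # Explicit bounds check: a bare cells[-1] would silently return the
        # last cell instead of failing.
        if not 0 <= cell_id < len(cells):
            raise IndexError(f"cell_id {cell_id} is out of range; {len(cells)} cells are known")
        return cells[cell_id]
```

If the set of servers could ever change at runtime, the cached value would go stale; with cached_property it can be dropped with `del registry.cell_id_to_indexer`, after which the next access rebuilds it.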
