Support stopping engines externally and map cell_id to concrete engines #944
base: rollout_ft/26
Changes from 11 commits
63c83c3
11c167a
504844a
a116d7b
e1219fb
4a665bf
9f89139
5c7f686
b8f8b47
3840993
b1b02cb
8131ae1
e6804eb
f6c9b80
@@ -10,6 +10,7 @@
 from miles.ray.rollout.rollout_data_conversion import postprocess_rollout_data
 from miles.ray.rollout.rollout_server import RolloutServer, start_rollout_servers
 from miles.ray.rollout.router_manager import start_session_server
+from miles.ray.rollout.server_cell import get_cell_indexer_of_id_map
 from miles.ray.rollout.train_data_conversion import convert_samples_to_train_data, split_train_data_by_dp
 from miles.ray.utils import Lock
 from miles.rollout.base_types import (
@@ -224,6 +225,17 @@ def _get_updatable_server(self) -> RolloutServer | None:
         )
         return updatable[0] if updatable else None

+    # -------------------------- external start/stop -----------------------------
+
+    # TODO
+    # async def start_cell(self):
+    #     pass
+
+    async def stop_cell(self, cell_id: int):
+        idx = get_cell_indexer_of_id_map(self.servers)[cell_id]
+        group = self.servers[idx.srv_key].server_groups[idx.group_index]
+        group.stop_engines(engine_indices=idx.engine_indices)
Comment on lines +234 to +237
Contributor
Recomputing the cell mapping on every call to
Suggested change
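The reviewer's concern is that `stop_cell` rebuilds the cell mapping on every call. The following is only a minimal sketch of one way to build it once and reuse it, not the reviewer's actual suggestion. `ManagerSketch`, `StubServer`, `StubGroup`, `cell_indexers`, and `build_count` are hypothetical names introduced for illustration; only `stop_cell`, `stop_engines`, `engines`, `nodes_per_engine`, and `CellIndexer` come from the diff. The sketch also assumes the set of servers does not change after the first lookup, since a cached mapping would otherwise go stale.

```python
import asyncio
from dataclasses import dataclass, field
from functools import cached_property
from typing import NamedTuple


class CellIndexer(NamedTuple):
    srv_key: str
    group_index: int
    engine_indices: list[int]


@dataclass
class StubGroup:
    """Hypothetical stand-in for a server group; records which indices were stopped."""
    engines: list[str]
    nodes_per_engine: int = 1
    stopped: list[int] = field(default_factory=list)

    def stop_engines(self, engine_indices: list[int]) -> None:
        self.stopped.extend(engine_indices)


@dataclass
class StubServer:
    """Hypothetical stand-in for RolloutServer."""
    server_groups: list[StubGroup]


class ManagerSketch:
    def __init__(self, servers: dict[str, StubServer]) -> None:
        self.servers = servers
        self.build_count = 0  # only here to show how often the mapping is built

    @cached_property
    def cell_indexers(self) -> list[CellIndexer]:
        # Built on first access, then stored on the instance by cached_property.
        self.build_count += 1
        return [
            CellIndexer(key, gi, list(range(c * g.nodes_per_engine, (c + 1) * g.nodes_per_engine)))
            for key, srv in self.servers.items()
            for gi, g in enumerate(srv.server_groups)
            for c in range(len(g.engines))
        ]

    async def stop_cell(self, cell_id: int) -> None:
        # Reject unknown ids explicitly instead of relying on IndexError
        # (or on negative indices silently wrapping around).
        if not 0 <= cell_id < len(self.cell_indexers):
            raise ValueError(f"unknown cell_id {cell_id}")
        idx = self.cell_indexers[cell_id]
        group = self.servers[idx.srv_key].server_groups[idx.group_index]
        group.stop_engines(engine_indices=idx.engine_indices)


async def main() -> ManagerSketch:
    manager = ManagerSketch({"rollout": StubServer([StubGroup(["e0", "e1"], nodes_per_engine=2)])})
    await manager.stop_cell(0)
    await manager.stop_cell(1)
    return manager


manager = asyncio.run(main())
print(manager.build_count, manager.servers["rollout"].server_groups[0].stopped)
```

If servers or groups can be added or removed at runtime, the cached value would need to be invalidated (for example with `del manager.cell_indexers`) whenever that happens.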
+
     # -------------------------- misc APIs -----------------------------

     def get_num_rollout_per_epoch(self):
@@ -0,0 +1,26 @@
+from typing import NamedTuple
+
+from miles.ray.rollout.rollout_server import RolloutServer
+
+
+class CellIndexer(NamedTuple):
+    srv_key: str
+    group_index: int
+    engine_indices: list[int]
+
+
+def get_cell_indexer_of_id_map(servers: dict[str, RolloutServer]) -> list[CellIndexer]:
Contributor
The function name
Suggested change
+    result: list[CellIndexer] = []
+    for srv_key, srv in servers.items():
+        for group_index, group in enumerate(srv.server_groups):
+            for local_cell in range(len(group.engines)):
+                result.append(
+                    CellIndexer(
+                        srv_key=srv_key,
+                        group_index=group_index,
+                        engine_indices=list(
+                            range(local_cell * group.nodes_per_engine, (local_cell + 1) * group.nodes_per_engine)
+                        ),
+                    )
+                )
+    return result
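To make the mapping concrete, here is a small self-contained sketch that runs the same loop against stub objects. `StubServer`, `StubGroup`, and the server keys `"rollout"` and `"eval"` are hypothetical; the real `RolloutServer` and its server groups are not shown in this diff. The sketch assumes `group.engines` holds one entry per cell and that `engine_indices` addresses per-node slots, so a cell spanning two nodes gets two consecutive indices.

```python
from dataclasses import dataclass
from typing import NamedTuple


class CellIndexer(NamedTuple):
    srv_key: str
    group_index: int
    engine_indices: list[int]


@dataclass
class StubGroup:
    """Hypothetical stand-in for a server group; only the attributes the loop reads."""
    engines: list[str]
    nodes_per_engine: int


@dataclass
class StubServer:
    """Hypothetical stand-in for RolloutServer."""
    server_groups: list[StubGroup]


def get_cell_indexer_of_id_map(servers: dict[str, StubServer]) -> list[CellIndexer]:
    # Same traversal order as the diff: servers, then groups, then cells.
    # The position of an entry in the returned list is its cell_id.
    result: list[CellIndexer] = []
    for srv_key, srv in servers.items():
        for group_index, group in enumerate(srv.server_groups):
            n = group.nodes_per_engine
            for local_cell in range(len(group.engines)):
                result.append(
                    CellIndexer(srv_key, group_index, list(range(local_cell * n, (local_cell + 1) * n)))
                )
    return result


servers = {
    "rollout": StubServer([StubGroup(engines=["e0", "e1"], nodes_per_engine=2)]),
    "eval": StubServer([StubGroup(engines=["e0"], nodes_per_engine=1)]),
}
cells = get_cell_indexer_of_id_map(servers)
for cell_id, idx in enumerate(cells):
    print(cell_id, idx)
# 0 CellIndexer(srv_key='rollout', group_index=0, engine_indices=[0, 1])
# 1 CellIndexer(srv_key='rollout', group_index=0, engine_indices=[2, 3])
# 2 CellIndexer(srv_key='eval', group_index=0, engine_indices=[0])
```

Because cell ids are assigned by traversal order, they depend on the insertion order of the `servers` dict and shift if a server or group is added ahead of existing ones.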
Updating the import to reflect the suggested function rename in server_cell.py.