Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New validation mode: via sending model #205

Open
wants to merge 40 commits into
base: main
Choose a base branch
from

Conversation

Parzival-05
Copy link
Collaborator

No description provided.

@Parzival-05 Parzival-05 force-pushed the david/model-validation branch 5 times, most recently from 04b5e54 to 1dfc4de Compare December 27, 2024 22:10
@Parzival-05 Parzival-05 force-pushed the david/model-validation branch 2 times, most recently from 7913201 to 9dc7cf6 Compare December 28, 2024 18:45
@Parzival-05 Parzival-05 force-pushed the david/new-validation-design branch from ae4f954 to 3080a6d Compare December 28, 2024 19:46
@Parzival-05 Parzival-05 force-pushed the david/model-validation branch from 9dc7cf6 to 0da98e9 Compare December 28, 2024 19:50
@Parzival-05 Parzival-05 force-pushed the david/new-validation-design branch from 3080a6d to 8d465dc Compare December 28, 2024 20:34
@Parzival-05 Parzival-05 force-pushed the david/model-validation branch from 0da98e9 to 2d61059 Compare December 28, 2024 21:00
@Parzival-05 Parzival-05 marked this pull request as draft December 28, 2024 22:07
@Parzival-05 Parzival-05 force-pushed the david/model-validation branch 2 times, most recently from 1087af2 to 2bdb53b Compare December 30, 2024 04:59
@Parzival-05 Parzival-05 force-pushed the david/new-validation-design branch from becaff2 to 843076a Compare December 30, 2024 05:01
@Parzival-05 Parzival-05 marked this pull request as ready for review December 30, 2024 05:34
@Parzival-05 Parzival-05 force-pushed the david/new-validation-design branch 2 times, most recently from b24850b to 9ef0dca Compare December 30, 2024 06:19
@Parzival-05 Parzival-05 force-pushed the david/model-validation branch from bb2a8ea to fc3ecf9 Compare December 30, 2024 06:21
@Parzival-05 Parzival-05 force-pushed the david/new-validation-design branch from 9ef0dca to 7657a7b Compare December 30, 2024 17:48
@Parzival-05 Parzival-05 force-pushed the david/model-validation branch from fc3ecf9 to cd56d10 Compare December 30, 2024 17:52
@Parzival-05 Parzival-05 marked this pull request as draft January 8, 2025 22:07
@Parzival-05 Parzival-05 force-pushed the david/model-validation branch from cd56d10 to 0a0a262 Compare January 31, 2025 21:24
@Parzival-05 Parzival-05 force-pushed the david/model-validation branch from 3b8cedc to d345340 Compare February 11, 2025 07:20
@Parzival-05 Parzival-05 force-pushed the david/model-validation branch from 576996c to 3fe73d1 Compare March 7, 2025 13:39
@Parzival-05 Parzival-05 marked this pull request as ready for review March 9, 2025 10:09
@Parzival-05 Parzival-05 requested review from emnigma and Anya497 March 9, 2025 11:29
@@ -492,6 +492,10 @@ def graphs_are_equal(new_g_v, old_g_v, new_s_v, old_s_v):
return np.array_equal(new_g_v, old_g_v) and np.array_equal(new_s_v, old_s_v)

def merge_distributions(old_distribution, new_distribution):
if old_distribution.device != new_distribution.device:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The data for manipulation must be on the same device

@dataclass(slots=True)
class ModelGameStep:
GameState: GameState
Output: list
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

List of what?

game_map_info = ModelGameMapInfo(
total_game_state=None,
total_steps=[],
proc=None, # type: ignore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too much problems with types -> design is wrong.

) -> Tuple[int, socket.socket]:
if attempts <= 0:
raise RuntimeError("Failed to occupy port")
logging.debug(f"Looking for port... Attempls left: {attempts}.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this logic is not in connection manager (utils?)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

game_map_info.proc.wait()
with open(steps, "br") as f:
steps_json = json.load(f)
steps = list(map(lambda v: ModelGameStep.from_dict(v), steps_json)) # type: ignore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such tricks with types hint at bad design.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pyright issue with dataclass_json methods lidatong/dataclasses-json#309
image

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

step = steps.pop(0)
game_state, nn_output = step.GameState, step.Output

def get_hetero_data(game_state: GameState, nn_output: list):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be moved to dataset?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -256,6 +257,13 @@ def objective(
torch.save(model.state_dict(), CURRENT_MODEL_PATH)
mlflow.log_artifact(CURRENT_MODEL_PATH, str(epoch))

model_kwargs = trial.params.copy()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Model args are required by the onnx converter

help="Path to .yaml-file with model kwargs",

@Parzival-05 Parzival-05 requested a review from gsvgit March 15, 2025 20:09
Copy link
Collaborator

@emnigma emnigma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of TODOs still. Will they be fixed? Or you want to merge them?

SVMConnectionInfo,
)
from ml.validation.coverage.game_managers.utils import set_timeout_if_needed
from onyx import entrypoint, load_gamestate, resolve_import_model
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
from onyx import entrypoint, load_gamestate, resolve_import_model
from onyx import entrypoint as save_torch_model_to_onnx_file, load_gamestate, resolve_import_model

Comment on lines +94 to +95
self._create_onnx_model()
self._clean_output_folder()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

swap please, if SVMS_OUTPUT_PATH == CURRENT_ONNX_MODEL_PATH.parent, then onnx model will be removed

game_map_info.proc.wait()
with open(steps, "br") as f:
steps_json = json.load(f)
steps = list(map(lambda v: ModelGameStep.from_dict(v), steps_json)) # type: ignore
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

steps = None
return steps

def are_steps_required(self, game_map: GameMap, required: bool):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is "pollute the class namespace"? If the issue is in duck typing hints, you can create static class and put these functions inside

I am suggesting this change because rn this function is unreadable to the reviewer. Also imagine you need to test convert_steps_to_hetero function. How would you do it?

@@ -57,12 +64,15 @@ def _evaluate_game_map(
)
map_name = game_map.MapName
if self.dataset.is_update_map_required(map_name, map_result):
self._game_manager.are_steps_required(game_map=game_map, required=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why something is run after alert_svm_about_step_saving? According to this func documentation, "symbolic execution environment" is "notified", why something else is happening?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants