
Implement TUNA in mlos_bench #926


Open
bpkroth opened this issue Jan 13, 2025 · 1 comment

bpkroth commented Jan 13, 2025

Placeholder issue to track implementing the TUNA noisy optimization algorithm in an mlos_bench scheduler now that the TrialRunner abstraction is available (arXiv citation to be added).

May also want to finish implementing a basic ParallelTrialScheduler first (#380).

Probably also need to finish making some singleton VM ARM templates to make use of that (e.g., parameterized by the $trial_runner_id).

@jsfreischuetz for awareness and tracking

See Also: http://aka.ms/mlos/tuna-eurosys-paper


bpkroth commented Apr 9, 2025

Adding a few of my notes from our discussion. @jsfreischuetz please adjust/correct as you see fit (I just wanted to get them down while they were fresh in my mind).

PRs needed for different features

  1. Basic ParallelTrialScheduler
    parallel trial execution #380
    WIP: Parallel trial scheduler stub using multiprocessing.Pool #939
    Contrary to our discussion, since MLOS already provides --trial-config-repeat-count to repeat a config some number of times, this would effectively already be the same as the null aggregation policy we discussed (i.e., simply re-register the score each time we finish a config trial).

  2. Extend ParallelTrialScheduler to support new aggregation policies.

    This is where we can include config syntax to select the config score aggregation policy (e.g., min, avg, avg - stddev, etc.); see the aggregation policy sketch after this list.

    May need some facilities to delay registering until after some number of results have returned.

    Later, as we increase the repeat budget, we may need to consider rebuilding the optimizer in order to "re-register" the new aggregated score at the higher budget.

    Come to think of it, we may need a separate PR on whether budget is incorporated from the Optimizer (and in turn mlos_core) or managed only in the Scheduler.

    This may also impact how --trial-config-repeat-count gets interpreted (e.g., "max budget repeats with a fixed increase step function?", or something more flexibly defined for that Scheduler type).

    This may also interact with the register_pending APIs in mlos_core in order to mark a config as "in progress".

  3. Add outlier detection capabilities to the aggregation policy.

    Make sure this has a configurable variance limit with a default of 30% (per the paper); see the outlier filtering sketch after this list.

  4. Configs for system stats collection.

    These should be relatively generic and applicable to any Linux OS.

    We could implement these only as a RemoteEnv config, but I think it will be better to implement them as a new OsStatsEnv (possibly a subclass of RemoteEnv with stock configs that tell it how to do it) plus a SupportsOsStatsService type that can manage this in a handful of ways.

    Doing it that way would allow us in later PRs to check that an OsStatsEnv was included in the target Environment and also allow us to enforce some details about the telemetry it outputs (e.g., OpenTelemetry conforming), which could be useful for mlos_viz analysis and visualization later on as well.

    (This is a nice thing to have independently of anything else btw)

  5. Extend ParallelTrialExecuter to add an optional "noise adjuster" component.

    Requires an OsStatsEnv to be present in the target Env somewhere.

    Uses the (OpenTelemetry) data from that to adjust the value produced by the aggregation policy prior to calling register on the Optimizer (see the noise adjuster sketch after this list).
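
To make item 2 above a bit more concrete, here is a minimal sketch of what a pluggable score aggregation policy could look like. All of the names here (`AGGREGATION_POLICIES`, `aggregate_scores`) are hypothetical placeholders, not existing mlos_bench APIs.

```python
# Hypothetical sketch of a pluggable config-score aggregation policy for the
# ParallelTrialScheduler -- names are illustrative, not actual mlos_bench APIs.
from statistics import mean, stdev
from typing import Callable, Dict, List

# Map config-file policy names to aggregation functions over repeated trial scores.
AGGREGATION_POLICIES: Dict[str, Callable[[List[float]], float]] = {
    "min": min,
    "avg": mean,
    # Penalize noisy configs by subtracting one standard deviation from the mean.
    "avg-stddev": lambda s: mean(s) - (stdev(s) if len(s) > 1 else 0.0),
}


def aggregate_scores(scores: List[float], policy: str = "avg") -> float:
    """Collapse the repeated trial scores for one config into a single value to register."""
    if not scores:
        raise ValueError("No trial scores to aggregate yet.")
    return AGGREGATION_POLICIES[policy](scores)
```

The "null" policy from item 1 is simply the degenerate case of skipping this step and re-registering each repeat's score as it arrives; the delayed-registration idea from item 2 amounts to only calling something like `aggregate_scores()` once enough repeats have completed.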
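
Similarly, for item 3, a rough sketch of variance-based outlier filtering ahead of aggregation, with a configurable limit defaulting to 30%. I'm assuming "variance limit" here means relative deviation from the median; the exact semantics should be taken from the paper.

```python
from statistics import median
from typing import List


def filter_outliers(scores: List[float], variance_limit: float = 0.30) -> List[float]:
    """
    Drop repeated trial scores that deviate from the median by more than
    ``variance_limit`` (relative) before handing the rest to the aggregation policy.
    """
    if len(scores) < 2:
        return scores
    center = median(scores)
    if center == 0:
        return scores
    return [s for s in scores if abs(s - center) / abs(center) <= variance_limit]
```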
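
And for item 5, a sketch of where an optional "noise adjuster" could sit between the OsStatsEnv telemetry and the Optimizer register call. The `os_stats` shape and the adjustment formula are pure placeholders; the real adjustment would be driven by the (OpenTelemetry) data from item 4.

```python
from statistics import mean
from typing import Dict, List, Optional


def noise_adjusted_score(scores: List[float], os_stats: Optional[Dict[str, float]] = None) -> float:
    """
    Aggregate repeated trial scores and optionally adjust for background system
    noise observed via the OsStatsEnv telemetry before registering with the Optimizer.
    """
    if not scores:
        raise ValueError("No trial scores to adjust yet.")
    score = mean(scores)  # stand-in for whatever aggregation policy is configured
    if os_stats:
        # Placeholder adjustment: scale by an observed background-noise factor.
        score *= 1.0 + os_stats.get("noise_factor", 0.0)
    return score
```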

Let's review here and flesh out some more details, but then make sub-issues for each of these before we start knocking them out.
