
Implement TUNA in mlos_bench #926


Open
bpkroth opened this issue Jan 13, 2025 · 1 comment

bpkroth commented Jan 13, 2025

Placeholder issue to track implementing the TUNA noisy optimization algorithm in an mlos_bench scheduler now that the TrialRunner abstraction is available (arXiv citation to be added).

May also want to finish implementing a basic ParallelTrialScheduler first (#380).

Probably also need to finish making some singleton VM ARM templates to make use of that (e.g., parameterized by the $trial_runner_id).

@jsfreischuetz for awareness and tracking

See Also: http://aka.ms/mlos/tuna-eurosys-paper


bpkroth commented Apr 9, 2025

Adding a few of my notes from our discussion. @jsfreischuetz please adjust/correct as you see fit (I just wanted to get them down while they were fresh in my mind).

PRs needed for different features

  1. Basic ParallelTrialScheduler
    parallel trial execution #380
    WIP: Parallel trial scheduler stub using multiprocessing.Pool #939
    Contrary to our discussion, since MLOS already provides --trial-config-repeat-count to repeat a config some number of times, this would effectively already be the same as the null aggregation policy we discussed (i.e., simply re-register the score each time we finish a config trial).

  2. Extend ParallelTrialScheduler to support new aggregation policies.

    This is where we can include config syntax to select the config score aggregation policy (e.g., min, avg, avg - stddev, etc.); see the aggregation policy sketch after this list.

    May need some facilities to delay registering until after some number of results have returned.

    Later, as we increase the repeat budget, we may need to consider rebuilding the optimizer in order to "re-register" the new aggregated score at the higher budget.

    Come to think of it, we may need a separate PR on whether budget is incorporated from the Optimizer (and in turn mlos_core) or managed only in the Scheduler.

    This may also impact how --trial-config-repeat-count gets interpreted (e.g., "max budget repeats with a fixed increase step function?", or something more flexibly defined for that Scheduler type).

    This may also interact with the register_pending APIs in mlos_core in order to mark a config as "in progress".

  3. Add outlier detection capabilities to the aggregation policy.

    Make sure this has a configurable variance limit with a default of 30% (per the paper); see the outlier filtering sketch after this list.

  4. Configs for system stats collection.

    These should be relatively generic and applicable to any Linux OS.

    We could implement these only as a RemoteEnv config, but I think it will be better to implement them as a new OsStatsEnv (possibly a subclass of RemoteEnv with stock configs that tell it how to do it) plus a SupportsOsStatsService type that can manage this in a handful of ways.

    Doing it that way would allow us in later PRs to check that an OsStatsEnv was included in the target Environment and also allow us to enforce some details about the telemetry it outputs (e.g., OpenTelemetry conforming), which could be useful for mlos_viz analysis and visualization later on as well.

    (This is a nice thing to have independently of anything else btw)

  5. Extend ParallelTrialExecuter to add an optional "noise adjuster" component.

    Requires an OsStatsEnv to be present in the target Env somewhere.

    Uses the (OpenTelemetry) data from that to adjust the value produced by the aggregation policy prior to calling register on the Optimizer (see the noise adjuster sketch after this list).
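
To make item 2 above a bit more concrete, here is a minimal sketch of what a pluggable score aggregation policy could look like. All of the names here (`AGGREGATION_POLICIES`, `aggregate_scores`) are hypothetical placeholders, not existing mlos_bench APIs.

```python
# Hypothetical sketch of a pluggable config-score aggregation policy for the
# ParallelTrialScheduler -- names are illustrative, not actual mlos_bench APIs.
from statistics import mean, stdev
from typing import Callable, Dict, List

# Map config-file policy names to aggregation functions over repeated trial scores.
AGGREGATION_POLICIES: Dict[str, Callable[[List[float]], float]] = {
    "min": min,
    "avg": mean,
    # Penalize noisy configs by subtracting one standard deviation from the mean.
    "avg-stddev": lambda s: mean(s) - (stdev(s) if len(s) > 1 else 0.0),
}


def aggregate_scores(scores: List[float], policy: str = "avg") -> float:
    """Collapse the repeated trial scores for one config into a single value to register."""
    if not scores:
        raise ValueError("No trial scores to aggregate yet.")
    return AGGREGATION_POLICIES[policy](scores)
```

The "null" policy from item 1 is simply the degenerate case of skipping this step and re-registering each repeat's score as it arrives; the delayed-registration idea from item 2 amounts to only calling something like `aggregate_scores()` once enough repeats have completed.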
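
Similarly, for item 3, a rough sketch of variance-based outlier filtering ahead of aggregation, with a configurable limit defaulting to 30%. I'm assuming "variance limit" here means relative deviation from the median; the exact semantics should be taken from the paper.

```python
from statistics import median
from typing import List


def filter_outliers(scores: List[float], variance_limit: float = 0.30) -> List[float]:
    """
    Drop repeated trial scores that deviate from the median by more than
    ``variance_limit`` (relative) before handing the rest to the aggregation policy.
    """
    if len(scores) < 2:
        return scores
    center = median(scores)
    if center == 0:
        return scores
    return [s for s in scores if abs(s - center) / abs(center) <= variance_limit]
```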
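
And for item 5, a sketch of where an optional "noise adjuster" could sit between the OsStatsEnv telemetry and the Optimizer register call. The `os_stats` shape and the adjustment formula are pure placeholders; the real adjustment would be driven by the (OpenTelemetry) data from item 4.

```python
from statistics import mean
from typing import Dict, List, Optional


def noise_adjusted_score(scores: List[float], os_stats: Optional[Dict[str, float]] = None) -> float:
    """
    Aggregate repeated trial scores and optionally adjust for background system
    noise observed via the OsStatsEnv telemetry before registering with the Optimizer.
    """
    if not scores:
        raise ValueError("No trial scores to adjust yet.")
    score = mean(scores)  # stand-in for whatever aggregation policy is configured
    if os_stats:
        # Placeholder adjustment: scale by an observed background-noise factor.
        score *= 1.0 + os_stats.get("noise_factor", 0.0)
    return score
```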

Let's review here and flesh out some more details, but then make sub-issues for each of these before we start knocking them out.
