Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More explicit logic for CONDA_LIBMAMBA_SOLVER_MAX_ATTEMPTS #394

Merged
merged 6 commits into from
Dec 5, 2023
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 31 additions & 5 deletions conda_libmamba_solver/solver.py
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,7 @@
from conda.core.prefix_data import PrefixData
from conda.core.solve import Solver
from conda.exceptions import (
CondaValueError,
InvalidMatchSpec,
InvalidSpec,
PackagesNotFoundError,
Expand Down Expand Up @@ -256,18 +257,43 @@ def _spinner_msg_solving(self):
return "Getting pinned dependencies"
return "Solving environment"

def _max_attempts(self, in_state: SolverInputState):
from_env_var = os.environ.get("CONDA_LIBMAMBA_SOLVER_MAX_ATTEMPTS")
n_installed = len(in_state.installed)
if from_env_var:
try:
max_attempts_from_env = int(from_env_var)
except ValueError:
raise CondaValueError(
f"CONDA_LIBMAMBA_SOLVER_MAX_ATTEMPTS='{from_env_var}'. Must be int."
)
if max_attempts_from_env < 1:
raise CondaValueError(
f"CONDA_LIBMAMBA_SOLVER_MAX_ATTEMPTS='{max_attempts_from_env}'. Must be >=1."
)
if max_attempts_from_env > n_installed:
log.warning(
"CONDA_LIBMAMBA_SOLVER_MAX_ATTEMPTS='%s' is higher than the number of "
"installed packages (%s) and will be ignored.",
max_attempts_from_env,
n_installed,
)
return n_installed
return max_attempts_from_env
if in_state.update_modifier.FREEZE_INSTALLED:
# this the default, but can be overriden with --update-specs
# TODO: should we cap this at a reasonable number? some base envs have 100s of pkgs
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the main point of this PR :) Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't messed with libmamba solver at all, so absolutely no idea/feeling about what ranges would be sensible.

If you don't want a hard limit, you could use some function that just increases slower for higher inputs.
If you want a hard limit you could either do min(n_installed, limit) or something that only approaches the limit gradually like math.ceil(n_installed / (1 + (n_installed / limit)**2)**.5) or something more/less sophisticated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each solve is usually sub-second, but some complex one that need some extra backtracking (like the one reported in your issue) take a 2-4s. If you end up with 100 installed packages then you need to wait a few minutes for the solver to give up.

By "giving up" I mean that we stop trying to unlock installed packages and we let the solver modify them as needed, just as mamba does. So maybe we can get by with 10 attempts. In other words, let the solver unfreeze up to 5-10 conflicting installed packages one by one until we just let everything float.

Alternatively, we could make the retry loop be based on time spent and not iterations, but that might be more complex than necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to try with 10 attempts and see if that is enough to pass all tests. Then we adjust as necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.
Could you add that as a constant variable (DEFAULT_LIBMAMBA_SOLVER_MAX_ATTEMPTS or something ) to better track that?


Alternatively, we could make the retry loop be based on time spent and not iterations, but that might be more complex than necessary.

Yes, complexity, but also determinism ;).

return n_installed
return 1

def _solving_loop(
self,
in_state: SolverInputState,
out_state: SolverOutputState,
index: LibMambaIndexHelper,
):
solved = False
max_attempts = max(
2,
int(os.environ.get("CONDA_LIBMAMBA_SOLVER_MAX_ATTEMPTS", len(in_state.installed))) + 1,
)
for attempt in range(1, max_attempts):
for attempt in range(1, self._max_attempts(in_state) + 1):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be useful to prepopulate the max attempts at the start of the solve as a instance attribute, instead of inline, so it's easier to debug?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this might break some tests. It shouldn't, but by saving it as instance attribute we are assuming we are going to have the same number of attempts for every call to solve_for_*() (for the lifetime of the same instance). In some instances, that might be tied to len(in_state.installed).

So I'd rather keep it like it is because it's technically more correct and does not assume that "one instantiation, one solve".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, thank you!

log.debug("Starting solver attempt %s", attempt)
try:
solved = self._solve_attempt(in_state, out_state, index, attempt=attempt)
Expand Down
Loading