diff --git a/docs/source/algorithms.md b/docs/source/algorithms.md index ad355a0af..d96de704f 100644 --- a/docs/source/algorithms.md +++ b/docs/source/algorithms.md @@ -4124,6 +4124,80 @@ package. To use it, you need to have - **n_restarts** (int): Number of times to restart the optimizer. Default is 1. ``` +```{eval-rst} +.. dropdown:: nevergrad_oneplusone + + .. code-block:: + + "nevergrad_oneplusone" + + Minimize a scalar function using the One Plus One evolutionary algorithm from Nevergrad. + + The One Plus One evolutionary algorithm iterates to find a set of parameters that minimizes the loss + function. It does this by perturbing, or mutating, the parameters from the last iteration (the + parent). If the new (child) parameters yield a better result, then the child becomes the new parent + whose parameters are perturbed, perhaps more aggressively. If the parent yields a better result, it + remains the parent and the next perturbation is less aggressive. Originally proposed by + :cite:`Rechenberg1973`. The implementation in Nevergrad is based on the one-fifth adaptation rule, + going back to :cite:`Schumer1968`. + + - **noise\_handling**: Method for handling the noise (Default: `None`). Options are: + - "random": A random point is reevaluated regularly using the one-fifth adaptation rule. + - "optimistic": The best optimistic point is reevaluated regularly, embracing optimism in the face of uncertainty. + - A float coefficient can be passed together with one of the above methods (as a tuple) to tune the regularity of these reevaluations (default is 0.05). For example, with 0.05 each evaluation has a 5% chance (i.e., 1 in 20) of being repeated, meaning the same candidate solution is reevaluated to better estimate its performance. + - **n\_cores**: Number of cores to use. + - **stopping.maxfun**: Maximum number of function evaluations. + - **mutation**: Type of mutation to apply. Available options are (Default: `"gaussian"`): + - "gaussian": Standard mutation by adding a Gaussian random variable (with progressive widening) to the best pessimistic point. + - "cauchy": Same as Gaussian but using a Cauchy distribution. + - "discrete": Mutates a randomly drawn variable (mutation occurs with probability 1/d in d dimensions, hence ~1 variable per mutation). + - "discreteBSO": Follows brainstorm optimization by gradually decreasing the mutation rate from 1 to 1/d. + - "fastga": Fast Genetic Algorithm mutations from the current best. + - "doublefastga": Double-FastGA mutations from the current best :cite:`doerr2017`. + - "rls": Randomized Local Search; mutates one and only one variable. + - "portfolio": Random number of mutated bits, known as uniform mixing :cite:`dang2016`. + - "lengler": Mutation rate is a function of dimension and iteration index. + - "lengler{2|3|half|fourth}": Variants of the Lengler mutation rate adaptation. + - Further values supported by Nevergrad, such as "adaptive", "coordinatewise_adaptive", "doerr", and the lognormal family ("lognormal", "xlognormal", "xsmalllognormal", "tinylognormal", "smalllognormal", "biglognormal", "hugelognormal"), are accepted and passed through to the underlying optimizer. + - **sparse**: Whether to apply random mutations that set variables to zero. Default is `False`. + - **smoother**: Whether to suggest smooth mutations. Default is `False`. + - **annealing**: Annealing schedule applied to the mutation amplitude or temperature-based control (Default: `"none"`). Options are: + - "none": No annealing is applied. + - "Exp0.9": Exponential decay with rate 0.9. + - "Exp0.99": Exponential decay with rate 0.99. + - "Exp0.9Auto": Exponential decay with rate 0.9, auto-scaled based on the problem horizon. + - "Lin100.0": Linear decay from 1 to 0 over 100 iterations. + - "Lin1.0": Linear decay from 1 to 0 over 1 iteration. + - "LinAuto": Linearly decaying annealing automatically scaled to the problem horizon.
+ - **super\_radii**: + Whether to apply extended radii beyond standard bounds for candidate generation, enabling broader + exploration. Default is `False`. + - **roulette\_size**: + Size of the roulette wheel used for selection in the evolutionary process. Affects the sampling + diversity from past candidates. (Default: `64`) + - **antismooth**: + Degree of anti-smoothing applied to prevent premature convergence in smooth landscapes. This alters + the landscape by penalizing overly smooth improvements. (Default: `4`) + - **crossover**: Whether to include a genetic crossover step every other iteration. Default is `False`. + - **crossover\_type**: + Method used for genetic crossover between individuals in the population. Available options (Default: `"none"`): + - "none": No crossover is applied. + - "rand": Randomized selection of crossover point. + - "max": Crossover at the point with maximum fitness gain. + - "min": Crossover at the point with minimum fitness gain. + - "onepoint": One-point crossover, splitting the genome at a single random point. + - "twopoint": Two-point crossover, splitting the genome at two points and exchanging the middle section. + - **tabu\_length**: + Length of the tabu list used to prevent revisiting recently evaluated candidates in local search + strategies. Helps in escaping local minima. (Default: `1000`) + - **rotation**: + Whether to apply rotational transformations to the search space, promoting invariance to axis-aligned + structures and enhancing search performance in rotated coordinate systems. (Default: + `False`) + - **seed**: Seed for the random number generator for reproducibility. +``` + ## References ```{eval-rst} diff --git a/docs/source/refs.bib b/docs/source/refs.bib index 4a1f60848..1edf9fb31 100644 --- a/docs/source/refs.bib +++ b/docs/source/refs.bib @@ -927,6 +927,48 @@ @InProceedings{Zambrano2013 doi = {10.1109/CEC.2013.6557848}, } +@book{Rechenberg1973, + author = {Rechenberg, Ingo}, + title = {Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution}, + publisher = {Frommann-Holzboog Verlag}, + year = {1973}, + url = {https://gwern.net/doc/reinforcement-learning/exploration/1973-rechenberg.pdf}, + address = {Stuttgart}, + note = {[Evolution Strategy: Optimization of Technical Systems According to the Principles of Biological Evolution]} +} + +@article{Schumer1968, + author={Schumer, M. 
and Steiglitz, K.}, + journal={IEEE Transactions on Automatic Control}, + title={Adaptive step size random search}, + year={1968}, + volume={13}, + number={3}, + pages={270-276}, + keywords={Minimization methods;Gradient methods;Search methods;Adaptive control;Communication systems;Q measurement;Cost function;Newton method;Military computing}, + doi={10.1109/TAC.1968.1098903} +} + +@misc{doerr2017, + title={Fast Genetic Algorithms}, + author={Benjamin Doerr and Huu Phuoc Le and Régis Makhmara and Ta Duy Nguyen}, + year={2017}, + eprint={1703.03334}, + archivePrefix={arXiv}, + primaryClass={cs.NE}, + url={https://arxiv.org/abs/1703.03334}, +} + +@misc{dang2016, + title={Self-adaptation of Mutation Rates in Non-elitist Populations}, + author={Duc-Cuong Dang and Per Kristian Lehre}, + year={2016}, + eprint={1606.05551}, + archivePrefix={arXiv}, + primaryClass={cs.NE}, + url={https://arxiv.org/abs/1606.05551}, +} + @Misc{Nogueira2014, author={Fernando Nogueira}, title={{Bayesian Optimization}: Open source constrained global optimization tool for {Python}}, diff --git a/src/optimagic/algorithms.py b/src/optimagic/algorithms.py index cbd65474b..588514e95 100644 --- a/src/optimagic/algorithms.py +++ b/src/optimagic/algorithms.py @@ -12,7 +12,6 @@ from typing import Type, cast from optimagic.optimization.algorithm import Algorithm -from optimagic.optimizers.bayesian_optimizer import BayesOpt from optimagic.optimizers.bhhh import BHHH from optimagic.optimizers.fides import Fides from optimagic.optimizers.iminuit_migrad import IminuitMigrad @@ -367,7 +366,6 @@ def Scalar(self) -> BoundedGlobalGradientFreeNonlinearConstrainedScalarAlgorithm @dataclass(frozen=True) class BoundedGlobalGradientFreeScalarAlgorithms(AlgoSelection): - bayes_opt: Type[BayesOpt] = BayesOpt nevergrad_pso: Type[NevergradPSO] = NevergradPSO nlopt_crs2_lm: Type[NloptCRS2LM] = NloptCRS2LM nlopt_direct: Type[NloptDirect] = NloptDirect @@ -1034,7 +1032,6 @@ def Local(self) -> GradientBasedLocalNonlinearConstrainedScalarAlgorithms: @dataclass(frozen=True) class BoundedGlobalGradientFreeAlgorithms(AlgoSelection): - bayes_opt: Type[BayesOpt] = BayesOpt nevergrad_pso: Type[NevergradPSO] = NevergradPSO nlopt_crs2_lm: Type[NloptCRS2LM] = NloptCRS2LM nlopt_direct: Type[NloptDirect] = NloptDirect @@ -1099,7 +1096,6 @@ def Scalar(self) -> GlobalGradientFreeNonlinearConstrainedScalarAlgorithms: @dataclass(frozen=True) class GlobalGradientFreeScalarAlgorithms(AlgoSelection): - bayes_opt: Type[BayesOpt] = BayesOpt nevergrad_pso: Type[NevergradPSO] = NevergradPSO nlopt_crs2_lm: Type[NloptCRS2LM] = NloptCRS2LM nlopt_direct: Type[NloptDirect] = NloptDirect @@ -1309,7 +1305,6 @@ def Scalar(self) -> BoundedGradientFreeNonlinearConstrainedScalarAlgorithms: @dataclass(frozen=True) class BoundedGradientFreeScalarAlgorithms(AlgoSelection): - bayes_opt: Type[BayesOpt] = BayesOpt nag_pybobyqa: Type[NagPyBOBYQA] = NagPyBOBYQA nevergrad_pso: Type[NevergradPSO] = NevergradPSO nlopt_bobyqa: Type[NloptBOBYQA] = NloptBOBYQA @@ -1534,7 +1529,6 @@ def Scalar(self) -> BoundedGlobalNonlinearConstrainedScalarAlgorithms: @dataclass(frozen=True) class BoundedGlobalScalarAlgorithms(AlgoSelection): - bayes_opt: Type[BayesOpt] = BayesOpt nevergrad_pso: Type[NevergradPSO] = NevergradPSO nlopt_crs2_lm: Type[NloptCRS2LM] = NloptCRS2LM nlopt_direct: Type[NloptDirect] = NloptDirect @@ -2147,7 +2141,6 @@ def Local(self) -> GradientBasedLikelihoodLocalAlgorithms: @dataclass(frozen=True) class GlobalGradientFreeAlgorithms(AlgoSelection): - bayes_opt: Type[BayesOpt] = BayesOpt 
nevergrad_pso: Type[NevergradPSO] = NevergradPSO nlopt_crs2_lm: Type[NloptCRS2LM] = NloptCRS2LM nlopt_direct: Type[NloptDirect] = NloptDirect @@ -2234,7 +2227,6 @@ def Scalar(self) -> GradientFreeLocalScalarAlgorithms: @dataclass(frozen=True) class BoundedGradientFreeAlgorithms(AlgoSelection): - bayes_opt: Type[BayesOpt] = BayesOpt nag_dfols: Type[NagDFOLS] = NagDFOLS nag_pybobyqa: Type[NagPyBOBYQA] = NagPyBOBYQA nevergrad_pso: Type[NevergradPSO] = NevergradPSO @@ -2332,7 +2324,6 @@ def Scalar(self) -> GradientFreeNonlinearConstrainedScalarAlgorithms: @dataclass(frozen=True) class GradientFreeScalarAlgorithms(AlgoSelection): - bayes_opt: Type[BayesOpt] = BayesOpt nag_pybobyqa: Type[NagPyBOBYQA] = NagPyBOBYQA neldermead_parallel: Type[NelderMeadParallel] = NelderMeadParallel nevergrad_pso: Type[NevergradPSO] = NevergradPSO @@ -2456,7 +2447,6 @@ def Scalar(self) -> GradientFreeParallelScalarAlgorithms: @dataclass(frozen=True) class BoundedGlobalAlgorithms(AlgoSelection): - bayes_opt: Type[BayesOpt] = BayesOpt nevergrad_pso: Type[NevergradPSO] = NevergradPSO nlopt_crs2_lm: Type[NloptCRS2LM] = NloptCRS2LM nlopt_direct: Type[NloptDirect] = NloptDirect @@ -2539,7 +2529,6 @@ def Scalar(self) -> GlobalNonlinearConstrainedScalarAlgorithms: @dataclass(frozen=True) class GlobalScalarAlgorithms(AlgoSelection): - bayes_opt: Type[BayesOpt] = BayesOpt nevergrad_pso: Type[NevergradPSO] = NevergradPSO nlopt_crs2_lm: Type[NloptCRS2LM] = NloptCRS2LM nlopt_direct: Type[NloptDirect] = NloptDirect @@ -2854,7 +2843,6 @@ def Scalar(self) -> BoundedNonlinearConstrainedScalarAlgorithms: @dataclass(frozen=True) class BoundedScalarAlgorithms(AlgoSelection): - bayes_opt: Type[BayesOpt] = BayesOpt fides: Type[Fides] = Fides iminuit_migrad: Type[IminuitMigrad] = IminuitMigrad ipopt: Type[Ipopt] = Ipopt @@ -3167,7 +3155,6 @@ def Scalar(self) -> GradientBasedScalarAlgorithms: @dataclass(frozen=True) class GradientFreeAlgorithms(AlgoSelection): - bayes_opt: Type[BayesOpt] = BayesOpt nag_dfols: Type[NagDFOLS] = NagDFOLS nag_pybobyqa: Type[NagPyBOBYQA] = NagPyBOBYQA neldermead_parallel: Type[NelderMeadParallel] = NelderMeadParallel @@ -3242,7 +3229,6 @@ def Scalar(self) -> GradientFreeScalarAlgorithms: @dataclass(frozen=True) class GlobalAlgorithms(AlgoSelection): - bayes_opt: Type[BayesOpt] = BayesOpt nevergrad_pso: Type[NevergradPSO] = NevergradPSO nlopt_crs2_lm: Type[NloptCRS2LM] = NloptCRS2LM nlopt_direct: Type[NloptDirect] = NloptDirect @@ -3372,7 +3358,6 @@ def Scalar(self) -> LocalScalarAlgorithms: @dataclass(frozen=True) class BoundedAlgorithms(AlgoSelection): - bayes_opt: Type[BayesOpt] = BayesOpt fides: Type[Fides] = Fides iminuit_migrad: Type[IminuitMigrad] = IminuitMigrad ipopt: Type[Ipopt] = Ipopt @@ -3510,7 +3495,6 @@ def Scalar(self) -> NonlinearConstrainedScalarAlgorithms: @dataclass(frozen=True) class ScalarAlgorithms(AlgoSelection): - bayes_opt: Type[BayesOpt] = BayesOpt fides: Type[Fides] = Fides iminuit_migrad: Type[IminuitMigrad] = IminuitMigrad ipopt: Type[Ipopt] = Ipopt @@ -3687,7 +3671,6 @@ def Scalar(self) -> ParallelScalarAlgorithms: @dataclass(frozen=True) class Algorithms(AlgoSelection): - bayes_opt: Type[BayesOpt] = BayesOpt bhhh: Type[BHHH] = BHHH fides: Type[Fides] = Fides iminuit_migrad: Type[IminuitMigrad] = IminuitMigrad diff --git a/src/optimagic/optimizers/nevergrad_optimizers.py b/src/optimagic/optimizers/nevergrad_optimizers.py index e8f6c239c..720115038 100644 --- a/src/optimagic/optimizers/nevergrad_optimizers.py +++ b/src/optimagic/optimizers/nevergrad_optimizers.py @@ -15,7 +15,7 @@ 
from optimagic.optimization.internal_optimization_problem import ( InternalOptimizationProblem, ) -from optimagic.typing import AggregationLevel, PositiveInt +from optimagic.typing import AggregationLevel, NonNegativeInt, PositiveInt if IS_NEVERGRAD_INSTALLED: import nevergrad as ng @@ -110,3 +110,137 @@ def _solve_internal_problem( ) return result + + +@mark.minimizer( + name="nevergrad_oneplusone", + solver_type=AggregationLevel.SCALAR, + is_available=IS_NEVERGRAD_INSTALLED, + is_global=True, + needs_jac=False, + needs_hess=False, + needs_bounds=False, + supports_parallelism=True, + supports_bounds=True, + supports_infinite_bounds=False, + supports_linear_constraints=False, + supports_nonlinear_constraints=False, + disable_history=False, +) +@dataclass(frozen=True) +class NevergradOnePlusOne(Algorithm): + noise_handling: ( + Literal["random", "optimistic"] + | tuple[Literal["random", "optimistic"], float] + | None + ) = None + mutation: Literal[ + "gaussian", + "cauchy", + "discrete", + "fastga", + "rls", + "doublefastga", + "adaptive", + "coordinatewise_adaptive", + "portfolio", + "discreteBSO", + "lengler", + "lengler2", + "lengler3", + "lenglerhalf", + "lenglerfourth", + "doerr", + "lognormal", + "xlognormal", + "xsmalllognormal", + "tinylognormal", + "smalllognormal", + "biglognormal", + "hugelognormal", + ] = "gaussian" + annealing: ( + Literal[ + "none", "Exp0.9", "Exp0.99", "Exp0.9Auto", "Lin100.0", "Lin1.0", "LinAuto" + ] + | None + ) = None + sparse: bool = False + super_radii: bool = False + smoother: bool = False + roulette_size: PositiveInt = 64 + antismooth: NonNegativeInt = 4 + crossover: bool = False + crossover_type: ( + Literal["none", "rand", "max", "min", "onepoint", "twopoint"] | None + ) = None + tabu_length: NonNegativeInt = 1000 + rotation: bool = False + seed: int | None = None + stopping_maxfun: PositiveInt = STOPPING_MAXFUN_GLOBAL + n_cores: PositiveInt = 1 + + def _solve_internal_problem( + self, problem: InternalOptimizationProblem, x0: NDArray[np.float64] + ) -> InternalOptimizeResult: + if not IS_NEVERGRAD_INSTALLED: + raise NotInstalledError( + "The nevergrad_oneplusone optimizer requires the 'nevergrad' package " + "to be installed. You can install it with `pip install nevergrad`. " + "Visit https://facebookresearch.github.io/nevergrad/getting_started.html" + " for more detailed installation instructions." 
+ ) + + instrum = ng.p.Array( + init=x0, lower=problem.bounds.lower, upper=problem.bounds.upper + ) + + instrum.specify_tabu_length(tabu_length=self.tabu_length) + instrum = ng.p.Instrumentation(instrum) + + if self.seed is not None: + instrum.random_state.seed(self.seed) + + optimizer = ng.optimizers.ParametrizedOnePlusOne( + noise_handling=self.noise_handling, + mutation=self.mutation, + crossover=self.crossover, + rotation=self.rotation, + annealing=self.annealing or "none", + sparse=self.sparse, + smoother=self.smoother, + super_radii=self.super_radii, + roulette_size=self.roulette_size, + antismooth=self.antismooth, + crossover_type=self.crossover_type or "none", + )( + parametrization=instrum, + budget=self.stopping_maxfun, + num_workers=self.n_cores, + ) + + while optimizer.num_ask < self.stopping_maxfun: + x_list = [ + optimizer.ask() + for _ in range( + min(self.n_cores, self.stopping_maxfun - optimizer.num_ask) + ) + ] + losses = problem.batch_fun( + [x.value[0][0] for x in x_list], n_cores=self.n_cores + ) + for x, loss in zip(x_list, losses, strict=True): + optimizer.tell(x, loss) + + recommendation = optimizer.provide_recommendation() + + result = InternalOptimizeResult( + x=recommendation.value[0][0], + fun=recommendation.loss, + success=True, + n_fun_evals=optimizer.num_ask, + n_jac_evals=0, + n_hess_evals=0, + ) + + return result
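
Below is a minimal usage sketch of the new optimizer through optimagic's public `minimize` interface. It is not part of the diff; it assumes `nevergrad` is installed and uses the algorithm name and option names defined in the `NevergradOnePlusOne` dataclass above.

```python
# Hedged usage sketch: assumes optimagic's standard `minimize` API and that the
# optimizer is registered under the name "nevergrad_oneplusone" as in the diff.
import numpy as np
import optimagic as om


def sphere(x):
    # Simple quadratic test criterion; any scalar objective works here.
    return float(np.sum(x**2))


res = om.minimize(
    fun=sphere,
    params=np.full(5, 2.0),
    algorithm="nevergrad_oneplusone",
    bounds=om.Bounds(lower=np.full(5, -10.0), upper=np.full(5, 10.0)),
    algo_options={
        "stopping_maxfun": 2_000,  # evaluation budget handed to Nevergrad
        "mutation": "cauchy",      # one of the Literal values listed above
        "seed": 0,                 # seeds the parametrization's random state
    },
)
print(res.params, res.fun)
```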