Refactor/benchmarks #98
Merged
Commits (11)
0543ed0 iohblade/problems -> iohblade/benchmarks: 1. (anantashahane)
1ee9534 iohblade/problems -> iohblade/benchmarks (anantashahane)
04715b5 Adding Documentation 1/3. (anantashahane)
cbd5e4f Benchmark Documentation 2/3 (anantashahane)
c52107f Benchmark Documentation 3/3 (anantashahane)
8c2d7a5 Photonics + BBO (anantashahane)
47eb91c Readme update beta 1. (anantashahane)
7d5ab34 BBOB docs done. (anantashahane)
3ba59d5 AutoML benchmark details in documentation (#100) (sofokliski8)
3ebe4f6 Documentation complete. (anantashahane)
7952281 Merge conflict. (anantashahane)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,23 +1,354 @@ | ||
| AlphaEvolve Benchmarks | ||
| ====================== | ||
| Benchmarks | ||
| ========== | ||
|
|
||
| BLADE includes benchmark instances inspired by the Google DeepMind | ||
| AlphaEvolve paper. These instances are available in two complementary forms: | ||
| BLADE includes a collection of benchmarks ranging from BBOB to | ||
| Google DeepMind–inspired tasks derived from the AlphaEvolve paper. | ||
| These instances are available in two complementary forms: | ||
|
|
||
| - ``run_benchmarks/`` provides standalone reference scripts for running each | ||
| task directly. | ||
| - ``iohblade/benchmarks`` packages the same tasks for programmatic use in | ||
| experiments and pipelines. | ||
| - ``run_benchmarks/`` provides standalone reference scripts for running | ||
| each task directly. | ||
| - ``iohblade/benchmarks`` packages the same tasks for programmatic use | ||
| in experiments and pipelines. | ||
|
|
||
| The packaged benchmarks are grouped by domain: | ||
| The packaged benchmarks are grouped by domain. | ||
|
|
||
| - Analysis (auto-correlation inequalities) | ||
| - Combinatorics (Erdos min-overlap) | ||
| - Geometry (Heilbronn problems, kissing number, and distance ratios) | ||
| - Matrix multiplication | ||
| - Number theory (sums vs differences) | ||
| - Packing (rectangle, hexagon, and unit square packing) | ||
| - Fourier (uncertainty inequalities) | ||
| .. _list_of_benchmarks: | ||
|
|
||
| Each domain folder contains a README with task-specific details and citations | ||
| to the original sources. | ||
| List of Benchmarks | ||
| ------------------ | ||
|
|
||
| Analysis | ||
| ^^^^^^^^ | ||
|
|
||
| Perform auto-correlation on a time series using the following configuration: | ||
|
|
||
| - The domain size ``N`` defines the discretisation of | ||
| :math:`[-\tfrac{1}{4}, \tfrac{1}{4}]`. | ||
| - Step size: | ||
| :math:`dx = \frac{0.5}{N}` | ||
| - Auto-convolution: | ||
| :math:`g = dx \cdot \mathrm{conv}(f, f, \text{mode='full'})` | ||
| - Riemann sum analogues: | ||
|
|
||
| - :math:`I = dx \sum_i f[i]` | ||
| - :math:`L_1 = dx \sum_j |g[j]|` | ||
| - :math:`L_{2}^{2} = dx \sum_j g[j]^2` | ||
| - :math:`L_{\infty} = \max_j |g[j]|` | ||
| - :math:`\max_g = \max_j g[j]` | ||
| - :math:`\max_{\text{abs}_g} = \max_j |g[j]|` | ||
|
|
||
| Fitness | ||
| ~~~~~~~ | ||
|
|
||
| Auto-Correlation 1 | ||
| """""""""""""""""" | ||
|
|
||
| - Score: | ||
| :math:`\frac{\max_g}{I^2}` | ||
| - Constraints: | ||
| :math:`f \ge 0`, :math:`I > 0` | ||
| - Optimisation direction: minimisation | ||
| - Default: | ||
| :math:`N = 600` | ||
|
|
||
| Auto-Correlation 2 | ||
| """""""""""""""""" | ||
|
|
||
| - Score: | ||
| :math:`\frac{L_{2}^{2}}{L_1 \cdot L_{\infty}}` | ||
| - Constraints: | ||
| :math:`f \ge 0` | ||
| - Optimisation direction: maximisation | ||
| - Default: | ||
| :math:`N = 50` | ||
|
|
||
| Auto-Correlation 3 | ||
| """""""""""""""""" | ||
|
|
||
| - Score: | ||
| :math:`\frac{\max_{\text{abs}_g}}{I^2}` | ||
| - Constraints: | ||
| :math:`f` real-valued, :math:`I \ne 0` | ||
| - Optimisation direction: minimisation | ||
| - Default: | ||
| :math:`N = 400` | ||
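The three auto-correlation scores above can be sketched in a few lines of NumPy. This is a minimal illustration of the listed formulas, not the packaged evaluator: the function name ``autocorrelation_scores`` and the returned dictionary keys are hypothetical, and the constraints (for example :math:`f \ge 0`) are assumed to be checked elsewhere.

```python
import numpy as np


def autocorrelation_scores(f):
    """Return the three scores for a discretised function f (hypothetical helper)."""
    f = np.asarray(f, dtype=float)
    n = f.size
    dx = 0.5 / n                               # step size on [-1/4, 1/4]
    g = dx * np.convolve(f, f, mode="full")    # auto-convolution
    integral = dx * f.sum()                    # I
    l1 = dx * np.abs(g).sum()                  # L_1
    l2_sq = dx * np.square(g).sum()            # L_2^2
    l_inf = np.abs(g).max()                    # L_inf (same as max_abs_g)
    return {
        "auto_correlation_1": g.max() / integral**2,   # minimise
        "auto_correlation_2": l2_sq / (l1 * l_inf),    # maximise
        "auto_correlation_3": l_inf / integral**2,     # minimise
    }


# A constant function on N = 600 cells, used only as a smoke test.
scores = autocorrelation_scores(np.ones(600))
```

For a constant function the auto-convolution is a triangle, so the first and third scores coincide.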
|
|
||
| AutoML | ||
| ^^^^^^ | ||
|
|
||
| - Used to generate a set of machine learning pipelines using the ``scikit-learn`` library. | ||
|
|
||
| Black Box Optimisation Benchmarking (BBOB) | ||
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
|
||
| - Contains a set of Black Box Optimisation problems and Multi-Affine Black Box Optimisation Problems from the ``ioh`` library. | ||
| - Includes a special benchmark ``SBOX-COST``, which is similar to BBOB but defined on the domain :math:`[-5, 5]^n`. | ||
|
|
||
|
Contributor
I would not call this whole section BBOB as this is the name of one specific benchmark. I would say: "Black Box Optimization (BBO)" Then stress the link with
|
||
| Combinatorics | ||
| ^^^^^^^^^^^^^ | ||
|
|
||
| Erdős Minimum-Overlap Problem | ||
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
|
||
| - The continuous Erdős minimum-overlap problem seeks to find measurable | ||
| functions :math:`f, g: [-1,1] \rightarrow [0,1]` that satisfy: | ||
|
|
||
| 1. **Complementarity**: :math:`f(x) + g(x) = 1` for all :math:`x \in [-1,1]` | ||
| 2. **Unit mass**: :math:`\int_{-1}^{1} f(x) \, dx = \int_{-1}^{1} g(x) \, dx = 1` | ||
| 3. **Bounds**: :math:`f(x), g(x) \in [0,1]` for all :math:`x \in [-1,1]` | ||
|
|
||
| - And minimize the **maximum overlap integral**: | ||
|
|
||
| .. math:: | ||
|
|
||
| C := \sup_{x \in [-2,2]} \int_{-1}^{1} f(t) g(x+t) \, dt | ||
|
|
||
| - where :math:`g` is extended by zero outside :math:`[-1,1]`. | ||
|
|
||
| Euclidean Steiner Tree Problem | ||
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
|
||
| - Given a set of points in the plane, the Euclidean Steiner tree problem asks for the shortest network connecting them, where extra nodes (Steiner points) may be added to the original points. | ||
| - Adding these nodes can make the minimum spanning tree (MST) shorter. | ||
| - This benchmark takes a set of points, runs MST on the original points and on the points combined with Steiner points, and returns their ratio. | ||
| - Optimisation goal: | ||
|
|
||
| :math:`\text{minimize} \ \frac{\text{mst(points + steiner\_points)}}{\text{mst(points)}}` | ||
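The ratio above can be illustrated with a small pure-Python sketch. It assumes the MST is taken over the complete Euclidean graph on the given points and uses Prim's algorithm; the names ``mst_length`` and ``steiner_ratio`` are hypothetical and are not taken from the package.

```python
import math


def mst_length(points):
    """Total edge length of the Euclidean MST (Prim's algorithm, O(n^2))."""
    n = len(points)
    if n < 2:
        return 0.0
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known connection cost per node
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        # Pick the cheapest node that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total


def steiner_ratio(points, steiner_points):
    """mst(points + steiner_points) / mst(points); lower is better."""
    return mst_length(list(points) + list(steiner_points)) / mst_length(points)


# Equilateral triangle with its centroid as the single Steiner point.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]
ratio = steiner_ratio(triangle, [(0.5, math.sqrt(3) / 6)])
```

In this example the ratio is :math:`\sqrt{3}/2`, since the three spokes to the centroid are shorter than two sides of the triangle.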
|
|
||
| Graph Colouring Problem | ||
| ~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
|
||
| - The Graph Colouring benchmark focuses on assigning colours to the vertices of a graph such that no two adjacent vertices share the same colour. | ||
| - Input: a graph :math:`G = (V, E)` with vertices :math:`V` and edges :math:`E`. | ||
| - Objective: minimise the number of colours used while ensuring a **valid colouring**. | ||
| - Constraints: | ||
|
|
||
| 1. Each vertex receives exactly one colour. | ||
| 2. Adjacent vertices must not share the same colour. | ||
|
|
||
| - Fitness: | ||
|
|
||
| .. math:: | ||
|
|
||
| \text{fitness} = |C| \quad \text{where } C \text{ is the set of colours used} | ||
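A minimal sketch of this fitness follows. It assumes the graph is given as an edge list and the candidate solution as a vertex-to-colour mapping. Returning infinity for an invalid colouring is an assumption made for illustration; the package may penalise violations differently.

```python
import math


def colouring_fitness(edges, colours):
    """Number of colours used, or math.inf if two adjacent vertices clash.

    edges   -- iterable of (u, v) vertex pairs
    colours -- dict mapping each vertex to exactly one colour
    """
    for u, v in edges:
        if colours[u] == colours[v]:
            return math.inf          # assumed penalty for an invalid colouring
    return len(set(colours.values()))


# A 4-cycle is 2-colourable.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
fitness = colouring_fitness(cycle_edges, {0: "a", 1: "b", 2: "a", 3: "b"})
```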
|
|
||
| Fourier | ||
| ^^^^^^^ | ||
|
|
||
| Fourier Uncertainty Inequality | ||
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
|
||
| - This benchmark minimises the bound in a Fourier uncertainty inequality over test functions of the form :math:`f(x) = P(x) e^{-\pi x^2}`, | ||
| where :math:`r_{\max}` is the largest positive root of :math:`P(x)`, after which :math:`P(x)` remains non-negative. | ||
|
|
||
| - Function Class: | ||
|
|
||
| .. math:: | ||
|
|
||
| f(x) = P(x) \, e^{-\pi x^2}, \quad | ||
| P(x) = \sum_{k=0}^{K-1} c[k] \, H_{4k}(x) | ||
|
|
||
| - :math:`H_n` are physicists' Hermite polynomials. | ||
| - Evenness holds by construction (degrees 0,4,8,…). | ||
|
|
||
| - Constraints: | ||
|
|
||
| 1. :math:`P(0) < 0` | ||
| 2. Leading coefficient :math:`c[K-1] > 0` (scale-invariant: any positive scaling leaves score unchanged) | ||
| 3. Tail nonnegativity: :math:`P(x) \ge 0 \quad \forall x \ge r_{\max}` | ||
| 4. Optional numeric sanity check: :math:`P(x_{\max}) \ge 0` for large :math:`x_{\max}` | ||
|
|
||
| Geometry | ||
| ^^^^^^^^ | ||
|
|
||
| Heilbronn Problem on a Unit-Area Triangle | ||
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
|
||
| - This benchmark finds :math:`n = 11` points inside a triangle of area 1 that **maximize the minimum area** of any triangle formed by these points. | ||
|
|
||
| - Score: | ||
|
|
||
| .. math:: | ||
|
|
||
| \text{Score} = \min_{\{p, q, r\} \subset \text{points}} \sqrt{s (s-a)(s-b)(s-c)} | ||
|
|
||
| where: | ||
|
|
||
| - :math:`p, q, r` are three distinct points from the point set | ||
| - :math:`a, b, c` are the side lengths of the triangle :math:`pqr` | ||
| - :math:`s = \frac{a+b+c}{2}` is its semi-perimeter | ||
| - :math:`\sqrt{s (s-a)(s-b)(s-c)}` is the area of that triangle by Heron's formula; the score is the smallest such area over all triples. | ||
|
|
||
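The smallest triangle area over all point triples can be sketched as below, using Heron's formula with the side lengths of each triangle. The helper names are hypothetical, and the check that all points lie inside the unit-area triangle is assumed to happen elsewhere.

```python
import itertools
import math


def heron_area(p, q, r):
    """Area of triangle pqr from its side lengths (Heron's formula)."""
    a = math.dist(q, r)
    b = math.dist(p, r)
    c = math.dist(p, q)
    s = (a + b + c) / 2                      # semi-perimeter
    # Clamp at zero to absorb rounding error for (nearly) collinear points.
    return math.sqrt(max(s * (s - a) * (s - b) * (s - c), 0.0))


def heilbronn_score(points):
    """Smallest triangle area over all triples of points; higher is better."""
    return min(heron_area(p, q, r)
               for p, q, r in itertools.combinations(points, 3))


# Four corners of a unit square: every triangle has area 1/2.
square_score = heilbronn_score([(0, 0), (1, 0), (0, 1), (1, 1)])
```

Three collinear points give a score of zero, which is why good configurations avoid near-collinear triples.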
| Heilbronn Problem on a Unit-Area Convex Region | ||
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
|
||
| - This benchmark finds :math:`n \in \{13, 14\}` points inside a convex region of area 1 that **maximize the minimum area** of any triangle formed by these points. | ||
| - The convex hull of the points defines the region, which is then rescaled to have area 1. | ||
| - Score: | ||
|
|
||
| .. math:: | ||
|
|
||
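| \text{Score} = \min_{\{p, q, r\} \subset \text{points}} \sqrt{s (s-a)(s-b)(s-c)} | ||
|
|
||
| where: | ||
|
|
||
| - :math:`p, q, r` are three distinct points from the point set | ||
| - :math:`a, b, c` are the side lengths of the triangle :math:`pqr`, and :math:`s = \frac{a+b+c}{2}` is its semi-perimeter | ||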
|
|
||
| Kissing Number in 11 Dimensions | ||
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
|
||
| - This benchmark maximises the size of a set of integer vectors | ||
|
|
||
| :math:`C \subset \mathbb{Z}^{11} \setminus \{0\}` | ||
|
|
||
| that satisfies the **kissing configuration constraint**: | ||
|
|
||
| .. math:: | ||
|
|
||
| \min_{x, y \in C,\ x \ne y} \|x - y\| \ge \max_{x \in C} \|x\| | ||
|
|
||
| - Score: | ||
|
|
||
| .. math:: | ||
|
|
||
| \text{Score} = |C| | ||
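The constraint and score can be checked exactly with integer arithmetic by comparing squared norms. The sketch below is illustrative only: ``kissing_score`` is a hypothetical name, and returning 0 for an invalid configuration is an assumption rather than the package's documented behaviour.

```python
import itertools


def kissing_score(vectors):
    """Return |C| if C is a valid configuration, else 0 (assumed penalty)."""
    vecs = [tuple(int(x) for x in v) for v in vectors]
    if not vecs or len(set(vecs)) != len(vecs):
        return 0                              # empty set or duplicate vectors
    if any(not any(v) for v in vecs):
        return 0                              # the zero vector is excluded
    max_norm_sq = max(sum(x * x for x in v) for v in vecs)
    for x, y in itertools.combinations(vecs, 2):
        dist_sq = sum((a - b) ** 2 for a, b in zip(x, y))
        if dist_sq < max_norm_sq:             # squared form of the constraint
            return 0
    return len(vecs)


# The 22 signed unit vectors in Z^11 form a (far from optimal) valid set.
unit_vectors = []
for i in range(11):
    for sign in (1, -1):
        v = [0] * 11
        v[i] = sign
        unit_vectors.append(tuple(v))
baseline = kissing_score(unit_vectors)
```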
|
|
||
| Min/Max Distance Ratio | ||
| ~~~~~~~~~~~~~~~~~~~~~~ | ||
|
|
||
| - This benchmark finds a configuration of points that **minimizes the squared ratio of the maximum to minimum pairwise distances**. | ||
|
|
||
| - Fitness function: | ||
|
|
||
| .. math:: | ||
|
|
||
| \text{Fitness} = \left( \frac{\max_{i<j} d(i,j)}{\min_{i<j} d(i,j)} \right)^2 | ||
|
|
||
| - Variants: | ||
|
|
||
| - **2D space** with :math:`n = 16` points | ||
| - **3D space** with :math:`n = 14` points | ||
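The fitness above translates directly into code. This is a minimal sketch with a hypothetical function name; it works for both the 2D and the 3D variant because ``math.dist`` accepts points of any dimension.

```python
import itertools
import math


def distance_ratio_fitness(points):
    """Squared ratio of the largest to the smallest pairwise distance."""
    dists = [math.dist(p, q) for p, q in itertools.combinations(points, 2)]
    return (max(dists) / min(dists)) ** 2


# Unit square: the diagonal is sqrt(2) and the side is 1, so the fitness is 2.
square_fitness = distance_ratio_fitness([(0, 0), (1, 0), (1, 1), (0, 1)])
```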
|
|
||
| Spherical Code | ||
| ~~~~~~~~~~~~~~ | ||
|
|
||
| - This benchmark is a **maximisation problem** where :math:`n=30` distinct points are arranged on the surface of a **unit sphere**. | ||
| - Objective: **maximize the minimum pairwise angle** between any two points. | ||
| - Score: | ||
|
|
||
| .. math:: | ||
|
|
||
| \text{f} = \min_{i \ne j} \theta(i,j) | ||
|
|
||
| where :math:`\theta(i,j)` is the angle between points :math:`i` and :math:`j`. | ||
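A possible NumPy sketch of this score is shown below. It assumes the candidate points are normalised onto the unit sphere before the angles are measured; that normalisation step and the function name are assumptions, not confirmed package behaviour.

```python
import numpy as np


def min_pairwise_angle(points):
    """Smallest angle (radians) between any two points on the unit sphere."""
    x = np.asarray(points, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)   # project onto the sphere
    cosines = np.clip(x @ x.T, -1.0, 1.0)              # guard arccos domain
    upper = np.triu_indices(len(x), k=1)               # each pair once, i < j
    return float(np.arccos(cosines[upper]).min())


# The six vertices of an octahedron are pairwise at least 90 degrees apart.
octahedron = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
angle = min_pairwise_angle(octahedron)
```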
|
|
||
| Kernel Tuner | ||
| ^^^^^^^^^^^^ | ||
|
|
||
| - This benchmark evaluates **metaheuristic algorithms for hardware kernel tuning** across integer and variable-dimensional search spaces with constraints. | ||
| - The algorithm is scored based on the performance of the kernel, using metrics such as: | ||
|
|
||
| - runtime | ||
| - throughput | ||
| - custom metrics. | ||
|
|
||
| Logistics | ||
| ^^^^^^^^^ | ||
|
|
||
| Travelling Salesman Problem | ||
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
|
||
| - Given a set of 2-D points :math:`(x, y)`, find the shortest path connecting those points. | ||
| - Optimisation Direction: Minimisation. | ||
| - Fitness: | ||
|
|
||
| .. math:: | ||
| \text{Minimise} \quad | ||
| \sum_{i \in V} \sum_{j \in V, j \ne i} d(i,j) \, x_{ij} | ||
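For a candidate solution given as a visiting order, the objective reduces to summing the lengths of the selected edges. The sketch below assumes a closed tour that returns to the starting point, which is the usual TSP convention but is not stated explicitly above; ``tour_length`` is a hypothetical name.

```python
import math


def tour_length(points, order):
    """Length of the closed tour visiting points in the given order."""
    n = len(order)
    return sum(
        math.dist(points[order[i]], points[order[(i + 1) % n]])
        for i in range(n)
    )


cities = [(0, 0), (1, 0), (1, 1), (0, 1)]
perimeter = tour_length(cities, [0, 1, 2, 3])   # walks around the square
crossed = tour_length(cities, [0, 2, 1, 3])     # uses both diagonals
```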
|
|
||
| Vehicle Routing Problem | ||
| ~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
|
||
| - Given a set of customers, each a 2-D point with a demand weight :math:`(x, y, w)`, a fleet of :math:`n` identical vehicles with capacity :math:`c`, and a depot at :math:`(x, y, 0)`, find the routes with the minimum total travel distance. | ||
| - Optimisation Direction: Minimisation | ||
| - Fitness: | ||
|
|
||
| .. math:: | ||
| \text{Minimise} \quad | ||
| \sum_{i \in V} \sum_{j \in V, j \ne i} d(i,j) \, x_{ij} | ||
|
|
||
| - Constraints: | ||
|
|
||
| - Each customer is only serviced once. | ||
| - For each vehicle: :math:`\sum_{i \in v_j} w_i \le c`, where :math:`v_j` is the set of customers served by vehicle :math:`j`, and :math:`j \in \{1,\ldots,n\}`. | ||
|
|
||
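The objective and both constraints can be combined into one evaluation routine, sketched below. It assumes a solution is a list of routes, one per vehicle, each a list of customer indices, and that every route starts and ends at the depot. Returning infinity for infeasible solutions is an assumption made for illustration.

```python
import math


def vrp_fitness(depot, customers, routes, capacity):
    """Total travel distance, or math.inf if a constraint is violated.

    depot     -- (x, y, 0)
    customers -- list of (x, y, w)
    routes    -- list of routes, each a list of customer indices
    """
    visited = sorted(i for route in routes for i in route)
    if visited != list(range(len(customers))):
        return math.inf                  # a customer is missed or served twice
    total = 0.0
    for route in routes:
        if sum(customers[i][2] for i in route) > capacity:
            return math.inf              # vehicle capacity exceeded
        stops = [depot[:2]] + [customers[i][:2] for i in route] + [depot[:2]]
        total += sum(math.dist(a, b) for a, b in zip(stops, stops[1:]))
    return total


depot = (0.0, 0.0, 0)
customers = [(1.0, 0.0, 1), (0.0, 1.0, 1)]
two_vehicles = vrp_fitness(depot, customers, [[0], [1]], capacity=1)
```

A check that the number of routes does not exceed :math:`n` is omitted here for brevity.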
| Matrix Multiplication via Tensor Decomposition | ||
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
|
||
|
|
||
| - Given matrix dimensions: | ||
|
|
||
| .. math:: | ||
|
|
||
| n \times m \ \cdot \ m \times p \ \longrightarrow \ n \times p | ||
|
|
||
| define a tensor | ||
|
|
||
| .. math:: | ||
|
|
||
| T \in \mathbb{R}^{(n \cdot m) \times (m \cdot p) \times (p \cdot n)} | ||
|
|
||
| that encodes ordinary matrix multiplication. | ||
|
|
||
| - Factorisation: | ||
|
|
||
| A rank-:math:`r` CP decomposition of :math:`T`: | ||
|
|
||
| .. math:: | ||
|
|
||
| T[i,j,k] = \sum_{\ell=1}^{r} F_1[i,\ell] \cdot F_2[j,\ell] \cdot F_3[k,\ell] | ||
|
|
||
| yields a matrix multiplication algorithm requiring only :math:`r` scalar multiplications. | ||
|
|
||
| - Objective: | ||
|
|
||
| - Find the **smallest rank** :math:`r` that allows exact reconstruction (zero error). | ||
|
|
||
| - Constraints: | ||
|
|
||
| - All entries of the factor matrices :math:`F_1, F_2, F_3` must lie on a **quantisation grid**: | ||
|
|
||
| .. code-block:: | ||
|
|
||
| grid = 0.5, 1.0, 1.5, ... | ||
|
|
||
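To make the tensor and the reconstruction check concrete, the sketch below builds :math:`T` and the trivial rank-:math:`n m p` decomposition that corresponds to schoolbook multiplication. The flattening convention (row-major indices, with the third mode indexing the transposed product) is an assumption chosen to match the stated shape; the package may use a different one. All function names are hypothetical.

```python
import numpy as np


def matmul_tensor(n, m, p):
    """Tensor encoding (n x m) times (m x p) under the assumed index layout."""
    T = np.zeros((n * m, m * p, p * n))
    for i in range(n):
        for j in range(m):
            for k in range(p):
                # A[i, j] * B[j, k] contributes to C[i, k].
                T[i * m + j, j * p + k, k * n + i] = 1.0
    return T


def naive_factors(n, m, p):
    """Rank n*m*p factors: one rank-one term per scalar multiplication."""
    r = n * m * p
    F1 = np.zeros((n * m, r))
    F2 = np.zeros((m * p, r))
    F3 = np.zeros((p * n, r))
    term = 0
    for i in range(n):
        for j in range(m):
            for k in range(p):
                F1[i * m + j, term] = 1.0
                F2[j * p + k, term] = 1.0
                F3[k * n + i, term] = 1.0
                term += 1
    return F1, F2, F3


def reconstruction_error(T, F1, F2, F3):
    """Largest absolute entry of T minus its CP reconstruction."""
    approx = np.einsum("il,jl,kl->ijk", F1, F2, F3)
    return float(np.abs(T - approx).max())


T = matmul_tensor(2, 2, 2)
F1, F2, F3 = naive_factors(2, 2, 2)
error = reconstruction_error(T, F1, F2, F3)
```

The benchmark's goal is to beat this baseline, that is, to find factors with fewer columns than :math:`n m p` whose reconstruction error is still zero.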
| Number Theory | ||
| ^^^^^^^^^^^^^ | ||
| Sums vs Differences (Single-Set Formulation) | ||
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
|
||
| - This benchmark searches for a finite set | ||
|
|
||
| :math:`U \subset \mathbb{Z}^+ \cup \{0\}` | ||
|
|
||
| that **maximizes** the following quantity: | ||
|
|
||
| .. math:: | ||
|
|
||
| c(U) = 1 + \frac{\log|U-U| - \log|U+U|}{\log(2 \max(U) + 1)} | ||
|
|
||
| - Purpose: | ||
|
|
||
| - :math:`c(U)` **lower-bounds the exponent** :math:`C_6` in the inequality | ||
|
|
||
| .. math:: | ||
|
|
||
| |A-B| \gtrsim |A+B|^{C_6} | ||
|
|
||
| - A larger :math:`c(U)` corresponds to a **better lower bound** on :math:`C_6`. | ||
|
|
||
| - Evaluation: | ||
|
|
||
| - The evaluator computes :math:`|U+U|` and :math:`|U-U|` exactly. | ||
| - Implementation uses **FFT convolution/correlation** on the indicator function of :math:`U` over the domain :math:`[0, \max(U)]`. | ||
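For small sets, :math:`c(U)` can be computed directly from the sum set and difference set. The sketch below uses plain Python sets instead of the FFT approach mentioned above, so it is only a reference for checking small cases; the function name is hypothetical.

```python
import math


def sum_difference_score(U):
    """c(U) = 1 + (log|U-U| - log|U+U|) / log(2 max(U) + 1)."""
    U = set(U)
    if not U or min(U) < 0 or max(U) < 1:
        raise ValueError("U must be a set of non-negative integers with max(U) >= 1")
    sums = {a + b for a in U for b in U}     # U + U
    diffs = {a - b for a in U for b in U}    # U - U
    return 1 + (math.log(len(diffs)) - math.log(len(sums))) / math.log(2 * max(U) + 1)


# U = {0, 1, 3} has 6 distinct sums and 7 distinct differences.
score = sum_difference_score({0, 1, 3})
```

An arithmetic progression such as ``{0, 1, 2, 3}`` has equally many sums and differences, so its score is exactly 1.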
We need to add more specifics here, emphasise the link to the OpenML platform, etc. For this PR it is OK; we can ask Sofoklis to write this.