diff --git a/CHANGELOG.md b/CHANGELOG.md index 672e80b4..0d1282f3 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Added - (sample/features) add_field: check field size consistency with geometrical support. +- (sample) add `set_trees` to `Sample` delegated methods: `sample.set_trees(...)` now works as a direct proxy to `SampleFeatures.set_trees`, consistent with other delegated tree methods. ### Changed diff --git a/Report.md b/Report.md new file mode 100644 index 00000000..b4ab0bff --- /dev/null +++ b/Report.md @@ -0,0 +1,186 @@ +# BasicCleaning → main Change Report (Full Update) + +## Scope + +- **Compared branches:** `main...BasicCleaning` +- **Current branch head:** `7e4d2c4` (`remove aliases.`) +- **Global diff size:** **265 files changed**, **5132 insertions**, **26744 deletions** +- **Change status split:** **40 added**, **80 modified**, **145 deleted** +- **General outcome:** this branch is a broad **cleanup + scope reduction** pass. It simplifies the public API, restructures storage around explicit backend contracts, removes multiple legacy subpackages, and significantly trims benchmark/tutorial legacy content. + +--- + +## 1) Architecture changes (high priority) + +### 1.1 Public package surface reduction + +The branch removes several legacy or peripheral modules from `src/plaid`: + +- Removed package: `plaid.bridges` +- Removed package: `plaid.pipelines` +- Removed package: `plaid.post` +- Removed legacy utility/type modules: + - `src/plaid/utils/deprecation.py` + - `src/plaid/utils/init_with_tabular.py` + - `src/plaid/utils/interpolation.py` + - `src/plaid/utils/split.py` + - `src/plaid/utils/stats.py` + - `src/plaid/types/feature_types.py` + - `src/plaid/types/sklearn_types.py` + - `src/plaid/containers/feature_identifier.py` + +**Impact:** API scope is now centered on containers, storage, types, and problem definition. This is cleaner for maintenance but **breaking** for users importing removed modules. + +--- + +### 1.2 Storage redesign around backend abstraction + +Storage is now organized around a backend contract + backend registry pattern: + +- **Added:** `src/plaid/storage/backend_api.py` + - introduces backend protocol contract (`BackendModule`). +- **Added:** `src/plaid/storage/in_memory/__init__.py` + - in-memory backend implementation. +- **Updated:** `src/plaid/storage/registry.py` + - centralized backend registration and lookup. +- **Updated:** `src/plaid/storage/reader.py`, `src/plaid/storage/writer.py` + - high-level I/O now uses backend dispatch. +- Backend modules updated coherently: + - `src/plaid/storage/hf_datasets/*` + - `src/plaid/storage/zarr/*` + - `src/plaid/storage/cgns/*` + - `src/plaid/storage/common/*` + +**Impact:** improved extensibility and backend isolation; reduced tight coupling between I/O entrypoints and backend-specific logic. + +--- + +### 1.3 Container and schema behavior tightening + +Core model modules were heavily updated: + +- `src/plaid/containers/dataset.py` +- `src/plaid/containers/sample.py` +- `src/plaid/problem_definition.py` +- `src/plaid/containers/features.py` + +Observed direction: + +- stricter validation/normalization behavior, +- clearer split/path handling, +- better alignment between container lifecycle and backend operations. + +**Impact:** better model consistency and reduced implicit behavior; possible behavior changes for downstream code depending on former permissive flows. 
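+For orientation, here is a minimal sketch of the backend-contract + registry pattern described in §1.2. Apart from `BackendModule` and the `registry`/`reader`/`writer` module roles named above, every identifier below is a hypothetical illustration, not the actual plaid API:
+
+```python
+from typing import Any, Dict, Protocol
+
+
+class BackendModule(Protocol):
+    """Contract a storage backend is expected to satisfy (illustrative names)."""
+
+    name: str
+
+    def read(self, path: str, **options: Any) -> Any: ...
+
+    def write(self, data: Any, path: str, **options: Any) -> None: ...
+
+
+# registry.py-style central lookup: backends register themselves by name,
+# and the high-level I/O entrypoints resolve them at call time.
+_BACKENDS: Dict[str, BackendModule] = {}
+
+
+def register_backend(backend: BackendModule) -> None:
+    """Make a backend available for dispatch by reader/writer entrypoints."""
+    _BACKENDS[backend.name] = backend
+
+
+def get_backend(name: str) -> BackendModule:
+    """Resolve a registered backend; fail loudly on unknown names."""
+    try:
+        return _BACKENDS[name]
+    except KeyError as exc:
+        raise ValueError(f"unknown storage backend: {name!r}") from exc
+```
+
+Under such a pattern, `storage.reader`/`storage.writer` dispatch through `get_backend(...)` instead of importing the `hf_datasets`, `zarr`, or `cgns` modules directly, which is what decouples the I/O entrypoints from backend-specific logic.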
+ +--- + +### 1.4 Versioning and package metadata cleanup + +- **Added:** `src/plaid/version.py` +- `src/plaid/__init__.py` updated to expose version through dedicated module. +- storage exports in `src/plaid/storage/__init__.py` were reorganized. + +**Impact:** cleaner metadata handling and clearer package-level imports. + +--- + +## 2) Cross-repository cleanup highlights + +### 2.1 Benchmark repository footprint reduction (major) + +The branch removes a large amount of benchmark-specific code: + +- Entire benchmark trees heavily deleted (notably FNO, MGN, MMGP, Vi-Transf subtrees). +- `benchmarks/README.md` updated to match reduced benchmark surface. + +**Impact:** leaner repository and maintenance scope; downstream users relying on old benchmark scripts need migration or archival references. + +--- + +### 2.2 Docs realignment + +Documentation updates align with the reduced API: + +- Updated: `docs/source/core_concepts/*`, `quickstart.md`, `tutorials/storage.md`, `examples_tutorials.rst` +- Removed: `docs/source/core_concepts/feature_identifiers.md` + +**Impact:** docs better match active architecture, but explicit migration notes for removed modules should be expanded before release. + +--- + +### 2.3 Examples and tests reshaping + +Examples: + +- updated many examples to current APIs, +- removed old pipeline/post/legacy util examples. + +Tests: + +- removed tests for deleted subpackages (`bridges`, `pipelines`, `post`, legacy utils), +- added/init-focused storage tests (`test_in_memory`, `test_hf_datasets_init`, `test_zarr_init`, `test_cgns_init`), +- added `tests/types/test_cgns_types.py`, +- modernized dataset fixture layout under `tests/containers/dataset/*`. + +**Impact:** tests now target the active code surface and new storage architecture. + +--- + +## 3) New modules and notable API additions + +### 3.1 Added modules + +1. `src/plaid/storage/backend_api.py` + - backend protocol for storage implementations. +2. `src/plaid/storage/in_memory/__init__.py` + - in-memory backend capabilities for runtime/testing use. +3. `src/plaid/version.py` + - dedicated package version source. +4. `src/plaid/utils/info.py` + - utility module introduced during cleanup. + +### 3.2 Notable updated API zones + +- `Dataset` / `Sample` container behaviors, +- `ProblemDefinition` normalization and split logic, +- `storage.reader` / `storage.writer` dispatch flow, +- backend init modules (`cgns`, `hf_datasets`, `zarr`) and registry integration. + +--- + +## 4) Breaking-change summary + +Treat this branch as a **breaking release line** due to: + +- full removal of `plaid.bridges`, `plaid.pipelines`, `plaid.post`, +- removal of multiple utility/type modules, +- changed behavior/entrypoints in storage and containers, +- removal of many benchmark and example paths. + +Recommended: publish with explicit migration guidance and compatibility notes. + +--- + +## 5) Suggested pre-release checklist (updated) + +- [ ] Finalize changelog entries specifically listing removed modules and migration replacements. +- [ ] Add a migration page (or release note section) mapping old imports/usages to new APIs. +- [ ] Run quality gates on current branch state: + - [ ] `uv run pytest tests -x` + - [ ] `uv run ruff check .` + - [ ] `uv run pyright` (or project-approved equivalent static typing gate) +- [ ] Ensure optional dependency policies are explicit for storage backends and examples. +- [ ] Validate docs links after removed pages/examples/benchmarks. 
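+As a companion to the backend-contract sketch in §1, the snippet below shows how the in-memory backend listed in §3.1 could plausibly satisfy that contract and why it is convenient for the new init-focused storage tests. It is an illustrative assumption, not the actual `plaid.storage.in_memory` code:
+
+```python
+from typing import Any, Dict
+
+
+class InMemoryBackend:
+    """Hypothetical backend matching the BackendModule contract sketched in §1."""
+
+    name = "in_memory"
+
+    def __init__(self) -> None:
+        self._store: Dict[str, Any] = {}
+
+    def read(self, path: str, **options: Any) -> Any:
+        # "path" is only a key into the store; nothing touches the filesystem.
+        return self._store[path]
+
+    def write(self, data: Any, path: str, **options: Any) -> None:
+        self._store[path] = data
+
+
+# Typical test-time usage: write then read back without any disk I/O.
+backend = InMemoryBackend()
+backend.write({"scalars": {"pressure": 1.0}}, "sample_000000000")
+assert backend.read("sample_000000000")["scalars"]["pressure"] == 1.0
+```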
+ +--- + +## 6) Release-note positioning + +`BasicCleaning` should be positioned as a **major simplification and maintainability release**: + +- smaller and clearer public API, +- storage architecture modernization through backend contracts, +- stricter/cleaner core model behavior, +- substantial repository decluttering (benchmarks/examples/tests/docs aligned to active scope). + +This framing sets correct expectations that the release improves long-term maintainability but may require migration for existing users. diff --git a/benchmarks/FNO/2D_ElPlDynamics/README.md b/benchmarks/FNO/2D_ElPlDynamics/README.md deleted file mode 100644 index 8ab8f004..00000000 --- a/benchmarks/FNO/2D_ElPlDynamics/README.md +++ /dev/null @@ -1,6 +0,0 @@ -To run this benchmark, you need to download and untar the dataset from Zenodo, then edit in the files "convert_to_rectilinear_grid.py", "train.py" and "build_pred.py" the following variables at the top of the files: - -- `dataset_path`: the location where the plaid dataset has been untarred -- `rect_dataset_path`: temp folder used by the scripts, for the dataset projected onto a regular grid - -Run in the order: "convert_to_rectilinear_grid.py", "train.py" and "build_pred.py" to generate the prediction on the testing set. diff --git a/benchmarks/FNO/2D_ElPlDynamics/build_pred.py b/benchmarks/FNO/2D_ElPlDynamics/build_pred.py deleted file mode 100644 index 8968fa6b..00000000 --- a/benchmarks/FNO/2D_ElPlDynamics/build_pred.py +++ /dev/null @@ -1,243 +0,0 @@ -from Muscat.Bridges.CGNSBridge import CGNSToMesh -import numpy as np -from Muscat.MeshTools.MeshFieldOperations import GetFieldTransferOp -from Muscat.FE.Fields.FEField import FEField -from Muscat.FE.Spaces.FESpaces import LagrangeSpaceGeo -from Muscat.FE.DofNumbering import ComputeDofNumbering -from Muscat.Bridges.CGNSBridge import CGNSToMesh -import Muscat.MeshContainers.ElementsDescription as ED -from Muscat.MeshTools.ConstantRectilinearMeshTools import CreateConstantRectilinearMesh -from Muscat.MeshTools.MeshTetrahedrization import Tetrahedrization - -from scipy.sparse import coo_matrix -import copy -import torch -from physicsnemo.models.fno.fno import FNO -import torch.nn as nn -from Muscat.MeshTools.MeshTools import ComputeSignedDistance -from plaid import Dataset as Plaid_Dataset -import pickle - - -dataset_path = # path to update to location where rectilinear dataset is created - - -def renormalize(values): - """ - Function to normalize input of the FNO - """ - input_values = copy.deepcopy(values) - input_values[:, [2]] = torch.sigmoid(input_values[:, [2]]/0.1) - xhi = input_values[:, [2]] - input_values[:, [0, 1]] /= 20 - return input_values, xhi - - -def denormalize(values): - """ - Inverse function of renormalise except the mask that we do not need - """ - input_values = copy.deepcopy(values) - xhi = input_values[:, [2]] - input_values[:, [0, 1]] *= 20 - return input_values, xhi - - -def preproccess_sample(sample): - """Function to project the sample to a rectilinear grid using Muscat - This function only return the initial timestep as it is the only information needed - in the autoregressive method (when testing we simply apply recusively the model) - - Parameters - ---------- - sample : plaid.Sample - plaid sample to be projected to the rectilinear grid - - Returns - ------- - input_fields: torch.Tensor - 2D fields (shape: (1,3,301,151)) ready to be used as a 2D image - xhi: torch.Tensor - 2D mask (shape: (1,1,301,151)) ready to be used as a 2D image - """ - old_mesh = 
CGNSToMesh(sample.get_mesh(time=0)) - # Switching to 2D mesh instead of 3D shell, need C order for cython - old_mesh.nodes = (old_mesh.nodes[:, [0, 1]]).copy(order='C') - size = 150 - ref_mesh = Tetrahedrization(CreateConstantRectilinearMesh( - [size*2+1, size+1], [0, 0], [100/size, 100/size])) - - indexes = np.zeros(ref_mesh.GetNumberOfNodes(), int) - values = np.zeros((ref_mesh.GetNumberOfNodes(), 15)) - for connect in ref_mesh.GetElementsOfType(ED.Triangle_3).connectivity: - values[connect, indexes[connect]] = 1 - indexes[connect] += 1 - average = 1/indexes - data, i, j = np.zeros(0), np.zeros(0).astype(int), np.zeros(0).astype(int) - for elem_index, connect in enumerate(ref_mesh.GetElementsOfType(ED.Triangle_3).connectivity): - data = np.concatenate((data, average[connect]), axis=0) - i = np.concatenate((i, connect), axis=0) - j = np.concatenate((j, [elem_index]*3), axis=0) - - # This operator is needed if one want to project the EROSION_STATUS element field to the nodes - operator_elem_to_node = coo_matrix((data, (i, j))) - - # Compute Field Transfer operator for node fields - space = LagrangeSpaceGeo - numbering = ComputeDofNumbering(old_mesh, space, fromConnectivity=True) - displacement_field = FEField("FakeField", old_mesh, space, numbering) - op, _, _ = GetFieldTransferOp( - displacement_field, ref_mesh.nodes, method="Interp/Clamp") - - triangle_centers = np.mean( - ref_mesh.nodes[ref_mesh.GetElementsOfType(ED.Triangle_3).connectivity], axis=-2) - _, _, entities = GetFieldTransferOp( - displacement_field, triangle_centers, method="Interp/Clamp") - data = np.ones_like(entities.squeeze()) - i = old_mesh.GetElementsOfType(ED.Triangle_3).GetNumberOfElements() - op_elem = coo_matrix( - (data, (np.arange(entities.shape[0]), entities.squeeze()))) - - # getting input values for the first timestep - ux = op@sample.get_field(name="U_x", zone_name="Zone", - base_name="Base_2_2", time=0) - uy = op@sample.get_field(name="U_y", zone_name="Zone", - base_name="Base_2_2", time=0) - signed_distance = ComputeSignedDistance( - copy.deepcopy(old_mesh), ref_mesh.nodes) - fields = np.stack((ux, uy, signed_distance), axis=1) - fields_pt = torch.permute(torch.tensor( - fields).view(150*2+1, 150+1, 3), (2, 0, 1)) - renormalised_fields, xhi = renormalize(fields_pt.unsqueeze(0)) - return renormalised_fields, xhi - - -def postprocess_sample(prediction, sample): - """Function to switch back from rectilinear mesh to the original mesh - - Parameters - ---------- - prediction : torch.Tensor - prediction of the neural network that needs to be projected back to the original mesh - sample : plaid.Sample - Plaid sample containing the mesh of the orginal geometry - - Returns - ------- - ux: np.array - displacements on the x axis - uy: np.array - displacements on the y axis - erosion_node_field: np.array - erosion field projected on the nodes - """ - old_mesh = CGNSToMesh(sample.get_mesh(time=0)) - # Switching to 2D mesh instead of 3D shell, need C order for cython - old_mesh.nodes = (old_mesh.nodes[:, [0, 1]]).copy(order='C') - size = 150 - ref_mesh = Tetrahedrization(CreateConstantRectilinearMesh( - [size*2+1, size+1], [0, 0], [100/size, 100/size])) - - # Compute Field Transfer operator for node fields - space = LagrangeSpaceGeo - numbering = ComputeDofNumbering(ref_mesh, space, fromConnectivity=True) - displacement_field = FEField("FakeField", ref_mesh, space, numbering) - op, _, _ = GetFieldTransferOp( - displacement_field, old_mesh.nodes, method="Interp/Clamp") - prediction, xhi = denormalize(prediction) - # 
getting input values for the first timestep by projecting back - ux = op@(prediction[0, 0].cpu().numpy().reshape(-1)) - uy = op@(prediction[0, 1].cpu().numpy().reshape(-1)) - erosion_node_field = op@(prediction[0, 2].cpu().numpy().reshape(-1)) - return ux, uy, erosion_node_field - - -def compute_predictions(dataset_path, ids, model, save_name, device, dtype): - """Function to compute the predictions from a list of ids in the dataset - ideally we compute the prediction on unseen samples - - Parameters - ---------- - dataset_path : str - path of the plaid dataset - ids : iterable - iterable of ids where to compute the prediction - model : torch.nn.model - Pytorch model trained on the dataset, we assume that it was trained for predicting variation between t and t+1, not directly t+1 - save_name : str - String where to save the prediction results - device : torch.device - device where to run the computation (needs to be the same than the model) - dtype : torch.dtype - dtype of the model (needs to be the same than the model) - """ - - processes_number = 32 - - dataset_size = Plaid_Dataset()._load_number_of_samples_(dataset_path) - dataset = Plaid_Dataset() - dataset._load_from_dir_(savedir=dataset_path, verbose=True, - processes_number=processes_number, ids=list(ids)) - - predictions = [] - # For all meshes in the test set - for i in ids: - sample = dataset[i] - print(sample.get_field_names()) - print(sample) - UXs = [] - UYs = [] - EROSIONS = [] - - # Making auto regression to predict the outcome on the test tensile - with torch.inference_mode(): - input_model, xhi = preproccess_sample(sample) - input_model = input_model.to(device=device, dtype=dtype) - out = input_model - xhi = xhi.to(device=device, dtype=dtype) - for _ in range(40): - out = (model(out) + out) - out[:, [2]] = torch.clamp(out[:, [2]], 0, 1) - ux_predicted, uy_predicted, erosions_predicted = postprocess_sample( - out, sample) - UXs.append(ux_predicted) - UYs.append(uy_predicted) - EROSIONS.append(erosions_predicted) - # Registering the results - predictions.append({ - "U_x": np.stack([np.zeros_like(UXs[0])]+UXs), - "U_y": np.stack([np.zeros_like(UYs[0])]+UYs), - "EROSION_STATUS": np.stack([np.ones_like(EROSIONS[0])]+EROSIONS) - }) - with open(save_name, 'wb') as file: - pickle.dump(predictions, file) - - - -if __name__ == "__main__": - - # Ids where to test the prediction - # Make sure it does not contain the training set - ids = range(900, 1000) - save_file = "saved_model.pt" - - device = torch.device("cuda") - dtype = torch.float - - model = FNO( - in_channels=3, - out_channels=3, - decoder_layers=1, - decoder_layer_size=64, - dimension=2, - latent_channels=64, - num_fno_layers=8, - num_fno_modes=20, - padding=8, - ) - - model.load_state_dict(torch.load(save_file)) - model.to(device=device, dtype=dtype) - - save_name = 'predictions.pkl' - compute_predictions(dataset_path, ids, model, save_name, device, dtype) diff --git a/benchmarks/FNO/2D_ElPlDynamics/convert_to_rectilinear_grid.py b/benchmarks/FNO/2D_ElPlDynamics/convert_to_rectilinear_grid.py deleted file mode 100644 index 52104d12..00000000 --- a/benchmarks/FNO/2D_ElPlDynamics/convert_to_rectilinear_grid.py +++ /dev/null @@ -1,155 +0,0 @@ -from plaid import Dataset -from plaid import Sample -from Muscat.Bridges.CGNSBridge import CGNSToMesh -import numpy as np -from Muscat.MeshTools.MeshFieldOperations import GetFieldTransferOp -from Muscat.FE.Fields.FEField import FEField -from Muscat.FE.Spaces.FESpaces import LagrangeSpaceGeo -from Muscat.FE.DofNumbering import 
ComputeDofNumbering -from Muscat.Bridges.CGNSBridge import MeshToCGNS, CGNSToMesh -import Muscat.MeshContainers.ElementsDescription as ED -from Muscat.MeshTools.ConstantRectilinearMeshTools import CreateConstantRectilinearMesh -from Muscat.MeshTools.MeshTetrahedrization import Tetrahedrization -from Muscat.MeshTools.MeshModificationTools import DeleteElements -from Muscat.MeshTools.MeshTools import ComputeSignedDistance -from scipy.sparse import coo_matrix -import copy -import os - - -dataset_path = "/path/to/plaid/" # path to update to input plaid dataset -rect_dataset_path = "/path/to/rect_dataset_path/" # path to update to location where rectilinear dataset is created - - -dataset = Dataset() -for sample_index in range(1000): - dataset._load_from_dir_( - savedir=dataset_path, - verbose=True, - processes_number=1, - ids=[sample_index]) - sample = dataset.get_samples([sample_index])[sample_index] - old_mesh = CGNSToMesh(sample.get_mesh(time=0)) - - old_mesh.nodes = (old_mesh.nodes[:, [0, 1]]).copy(order='C') - size = 150 - ref_mesh = Tetrahedrization(CreateConstantRectilinearMesh( - [size * 2 + 1, size + 1], [0, 0], [100 / size, 100 / size])) - - indexes = np.zeros(ref_mesh.GetNumberOfNodes(), int) - values = np.zeros((ref_mesh.GetNumberOfNodes(), 15)) - for connect in ref_mesh.GetElementsOfType(ED.Triangle_3).connectivity: - values[connect, indexes[connect]] = 1 - indexes[connect] += 1 - average = 1 / indexes - data, i, j = np.zeros(0), np.zeros( - 0).astype(int), np.zeros(0).astype(int) - for elem_index, connect in enumerate( - ref_mesh.GetElementsOfType(ED.Triangle_3).connectivity): - data = np.concatenate((data, average[connect]), axis=0) - i = np.concatenate((i, connect), axis=0) - j = np.concatenate((j, [elem_index] * 3), axis=0) - - operator_elem_to_node = coo_matrix((data, (i, j))) - - # Compute Field Transfer operator for node fields - space = LagrangeSpaceGeo - numbering = ComputeDofNumbering(old_mesh, space, fromConnectivity=True) - displacement_field = FEField("FakeField", old_mesh, space, numbering) - op, _, _ = GetFieldTransferOp( - displacement_field, ref_mesh.nodes, method="Interp/Clamp") - - # Compute elem to node operator - triangle_centers = np.mean( - ref_mesh.nodes[ref_mesh.GetElementsOfType(ED.Triangle_3).connectivity], axis=-2) - # entities contains the index of the element for each triangle_center - _, _, entities = GetFieldTransferOp( - displacement_field, triangle_centers, method="Interp/Clamp") - data = np.ones_like(entities.squeeze()) - i = old_mesh.GetElementsOfType(ED.Triangle_3).GetNumberOfElements() - op_elem = coo_matrix( - (data, (np.arange(entities.shape[0]), entities.squeeze()))) - # Building a new sample - new_sample = Sample() - tree = MeshToCGNS(ref_mesh) - new_sample.add_tree(tree, time=0) - - ux = sample.get_field(name="U_x", zone_name="Zone", - base_name="Base_2_3", time=0) - uy = sample.get_field(name="U_y", zone_name="Zone", - base_name="Base_2_3", time=0) - new_sample.add_field("U_x", op @ ux, zone_name="Zone", - base_name="Base_2_2", time=0) - new_sample.add_field("U_y", op @ uy, zone_name="Zone", - base_name="Base_2_2", time=0) - new_sample.add_field( - "Signed_Distance", - ComputeSignedDistance( - copy.deepcopy(old_mesh), - ref_mesh.nodes), - zone_name="Zone", - base_name="Base_2_2", - time=0) - dt = 0.001 - for i in range(1, 40): - # Casting CGNS to Muscat.mesh - old_mesh = CGNSToMesh( - sample.get_mesh( - time=dt * i, - apply_links=True)) - old_mesh.nodes = (old_mesh.nodes[:, [0, 1]]).copy(order='C') - - # Compute Field Transfer operator 
for node fields - space = LagrangeSpaceGeo - numbering = ComputeDofNumbering( - old_mesh, space, fromConnectivity=True) - displacement_field = FEField( - "FakeField", old_mesh, space, numbering) - op, _, _ = GetFieldTransferOp( - displacement_field, ref_mesh.nodes, method="Interp/ZeroFill") - - # Compute Field Transfer operator for elem fields - triangle_centers = np.mean( - ref_mesh.nodes[ref_mesh.GetElementsOfType(ED.Triangle_3).connectivity], axis=-2) - _, _, entities = GetFieldTransferOp( - displacement_field, triangle_centers, method="Interp/Clamp") - data = np.ones_like(entities.squeeze()) - list_i = old_mesh.GetElementsOfType( - ED.Triangle_3).GetNumberOfElements() - op_elem = coo_matrix( - (data, (np.arange(entities.shape[0]), entities.squeeze()))) - - # Removing broken elements of the mesh - mask = np.zeros(old_mesh.GetNumberOfElements()) - mask[old_mesh.elemFields['EROSION_STATUS'] == 0] = 1 - DeleteElements(old_mesh, mask, updateElementFields=True) - - ux = old_mesh.nodeFields["U_x"] - uy = old_mesh.nodeFields["U_y"] - path_linked_sample = os.path.join( - rect_dataset_path, f"dataset/samples/sample_{sample_index:09d}/meshes/mesh_{0:09d}.cgns") - new_sample.link_tree( - path_linked_sample, - linked_sample=new_sample, - linked_time=0, - time=dt * i) - new_sample.add_field("U_x", op @ ux, zone_name="Zone", - base_name="Base_2_2", time=dt * i) - new_sample.add_field("U_y", op @ uy, zone_name="Zone", - base_name="Base_2_2", time=dt * i) - # Compute signed distance of the mesh where broken elements were - # removed - new_sample.add_field( - "Signed_Distance", - ComputeSignedDistance( - copy.deepcopy(old_mesh), - ref_mesh.nodes), - zone_name="Zone", - base_name="Base_2_2", - time=dt * i) - old_mesh = CGNSToMesh( - sample.get_mesh( - time=dt * i, - apply_links=True)) - new_sample.save(os.path.join(rect_dataset_path, - "dataset/samples/sample_{:09d}".format(sample_index))) diff --git a/benchmarks/FNO/2D_ElPlDynamics/train.py b/benchmarks/FNO/2D_ElPlDynamics/train.py deleted file mode 100644 index 3c890c1e..00000000 --- a/benchmarks/FNO/2D_ElPlDynamics/train.py +++ /dev/null @@ -1,132 +0,0 @@ - -import torch -from torch.utils.data import DataLoader,SubsetRandomSampler -import torch.optim as optim -import torch.nn as nn -from tqdm import tqdm - -from physicsnemo.models.fno.fno import FNO -from utils import TemporalFractureReader - - - -rect_dataset_path = # path to update to plaid dataset - - - -def renormalize(values): - """ - Function to normalize input of the FNO - """ - # Computing the smooth mask using signed distance - values[:, [3]] = torch.sigmoid(values[:, [3]]/0.1) - # The smooth mask is used intead of xhi - xhi = values[:, [3]] - # We remove the EROSION_STATUS from the inputs (channel 2) - input_values = values[:, [0, 1, 3]] - # Scaling the displacements with respect to the max displacement in the simulation - input_values[:, [0, 1]] /= 20 - return input_values, xhi - - -def train(model, optimizer, dataloader, epochs, device, dtype): - """Training loop for the models - - Parameters - ---------- - model : nn.Module - pytorch model that takes the input fields and the mask and predict the variation of the input - optimizer : torch.optim.Optimizer - pytorch optimizer - dataloader : torch.data.DataLoader - Dataloader that return the input fields and output fields - epochs : int - number of iterations on the dataset - device : torch.device - device on which to run computations - dtype : torch.dtype - data type of the model - """ - for epoch in tqdm(range(epochs)): - for input_field, 
output_field in dataloader: - input_values, xhi = renormalize(input_field) - output_values, _ = renormalize(output_field) - input_values, output_values, xhi = input_values.to(device=device, dtype=dtype), output_values.to( - device=device, dtype=dtype), xhi.to(device=device, dtype=dtype) - # The model predicts the variation rather than the output field - loss = torch.mean(torch.mean( - (model(input_values) + input_values-output_values)**2, dim=(1, 2, 3))) - loss.backward() - optimizer.step() - optimizer.zero_grad() - - -def main(model, rect_dataset_path, num_workers, batch_size, epochs, learning_rate, pin_memory, save_file): - """Function for starting the training and saving a model - We advise using DDP to scale this function to multi-node multi-gpu to reduce computation time - - Parameters - ---------- - model : torch.nn.Model - model to train - rect_dataset_path : str - location of the dataset - num_workers : int - number of cpus - batch_size : int - batch size for computing the loss - epochs : int - number of epochs to perform - learning_rate : float - learning rate for the gradient descent - pin_memory : bool - pin_memory parameter for the dataloader - save_file : str - path where to save the trained model - """ - - device = torch.device("cuda") - dtype = torch.float - - temp_dataset = TemporalFractureReader( - rect_dataset_path, num_workers, step_definition=1) - - model.to(device=device, dtype=dtype) - - optimizer = optim.Adam(model.parameters(), lr=learning_rate) - ## making sure we are training on the training set only - sampler = SubsetRandomSampler(range(1000)) - dataloader = DataLoader(temp_dataset, batch_size=batch_size, - num_workers=num_workers, pin_memory=pin_memory,sampler=sampler) - - train(model, optimizer, dataloader, epochs, device, dtype) - - torch.save(model.state_dict(), save_file) - - -if __name__ == "__main__": - - num_workers = 32 - batch_size = 60 - epochs = 100 - learning_rate = 0.0003 - pin_memory = True - save_file = "saved_model.pt" - - # Selection of the model either FNO or DAFNO - - model = FNO( - in_channels=3, - out_channels=3, - decoder_layers=1, - decoder_layer_size=64, - dimension=2, - latent_channels=64, - num_fno_layers=8, - num_fno_modes=20, - padding=8, - ) - - - main(model, rect_dataset_path, num_workers, batch_size, - epochs, learning_rate, pin_memory, save_file) diff --git a/benchmarks/FNO/2D_ElPlDynamics/utils.py b/benchmarks/FNO/2D_ElPlDynamics/utils.py deleted file mode 100644 index ea3c8934..00000000 --- a/benchmarks/FNO/2D_ElPlDynamics/utils.py +++ /dev/null @@ -1,74 +0,0 @@ -import numpy as np -import torch -from torch.utils.data import Dataset - -from plaid import Dataset as Plaid_Dataset - - -class TemporalFractureReader(Dataset): - def __init__(self, data_path, processes_number, step_definition=1, subset=None): - """Class used for autoregressive methods, the goal is to predict the next timestep give the previous timestep - the time step of our simulation is very small thus one might want to predict between longer time intervals - The dataset projects all meshes on a reference rectangular mesh so that all samples in the dataset get the same size - THe reference mesh is taken rectangular so as to use the data with an FNO or a variant. 
- - Parameters - ---------- - data_path : str - path of the plaid dataset - processes_number : int - number of processes used to load the plaid dataset, usually taken equal to the number of processes of the job - step_definition : int, optional - number of timesteps between the input and the output, by default 20 - subset : list[int], optional - list containing a subset of indexes to load, it allows to load only a smaller part of the dataset in the , by default None - """ - self.subset = subset - self.processes_number = processes_number - dataset = Plaid_Dataset() - dataset._load_from_dir_( - savedir=data_path, - verbose=True, - processes_number=processes_number, - ids=subset, - ) - self.plaid_dataset = dataset - self.timedelta = 0.001 - self.step_definition = step_definition - - def get_fields(self, index_config, index_timestep): - ### Function to get a transfered field for a given config and given timestep. - sample = self.plaid_dataset.get_samples([index_config])[index_config] - fields = np.stack( - ( - sample.get_field(name="U_x", time=index_timestep * self.timedelta), - sample.get_field(name="U_y", time=index_timestep * self.timedelta), - sample.get_field( - name="EROSION_STATUS", time=index_timestep * self.timedelta - ), - sample.get_field( - name="Signed_Distance", time=index_timestep * self.timedelta - ), - ), - axis=1, - ) - return fields - - def __getitem__(self, index): - ### Function returning a transfered field as an initial step and a final timestep taken step_definition timesteps after - index_config = index // (40 - self.step_definition) - index_timestep = index % (40 - self.step_definition) - if self.subset is not None: - if index_config not in self.subset: - raise IndexError( - "Sample has not been loaded in this dataset, include it in the subset or " - ) - - input = self.get_fields(index_config, index_timestep) - output = self.get_fields(index_config, index_timestep + self.step_definition) - return torch.permute( - torch.tensor(input).view(150 * 2 + 1, 150 + 1, 4), (2, 0, 1) - ), torch.permute(torch.tensor(output).view(150 * 2 + 1, 150 + 1, 4), (2, 0, 1)) - - def __len__(self): - return len(self.plaid_dataset) * (40 - self.step_definition) diff --git a/benchmarks/FNO/2D_MultiScHypEl/README.md b/benchmarks/FNO/2D_MultiScHypEl/README.md deleted file mode 100644 index ba7dcc9d..00000000 --- a/benchmarks/FNO/2D_MultiScHypEl/README.md +++ /dev/null @@ -1,7 +0,0 @@ -To run this benchmark, you need to download and untar the dataset from Zenodo, then edit in the files "prepare_[dataset].py", "train.py" and "construct_prediction.py" the following variables at the top of the files: - -- `plaid_location`: the location where the plaid dataset has been untarred -- `prepared_data_dir`: temp folder used by the scripts, for the dataset projected onto a regular grid -- `predicted_data_dir`: temp folder used by the scripts, for the prediction of the test set onto the regular grid - -Run in the order: "prepare_[dataset].py", "train.py" and "construct_prediction.py" to generate the prediction on the testing set. 
\ No newline at end of file diff --git a/benchmarks/FNO/2D_MultiScHypEl/construct_prediction.py b/benchmarks/FNO/2D_MultiScHypEl/construct_prediction.py deleted file mode 100644 index f2e89d4c..00000000 --- a/benchmarks/FNO/2D_MultiScHypEl/construct_prediction.py +++ /dev/null @@ -1,76 +0,0 @@ -from plaid import Dataset -from plaid import ProblemDefinition -from Muscat.Bridges.CGNSBridge import CGNSToMesh -import numpy as np -from Muscat.Containers.Filters.FilterObjects import ElementFilter -from Muscat.Containers.MeshFieldOperations import GetFieldTransferOp -from Muscat.FE.Fields.FEField import FEField -from Muscat.Bridges.CGNSBridge import CGNSToMesh -from Muscat.FE.FETools import PrepareFEComputation -from tqdm import tqdm - -import os, pickle, time - -start = time.time() - - -plaid_location = # path to update -predicted_data_dir= # path to update - - -datapath=os.path.join(plaid_location, "dataset") -pb_defpath=os.path.join(plaid_location, "problem_definition") - - -dataset_pred = Dataset() -dataset_pred._load_from_dir_(predicted_data_dir, verbose=True, processes_number=4) - -problem = ProblemDefinition() -problem._load_from_dir_(pb_defpath) - -ids_train = problem.get_split('DOE_train') -ids_test = problem.get_split('DOE_test') - - -dataset = Dataset() -dataset._load_from_dir_(datapath, ids=ids_test, verbose=True, processes_number=4) - - -n_train = len(ids_train) -n_test = len(ids_test) - -out_fields_names = ['u1', 'u2', 'P11', 'P12', 'P22', 'P21', 'psi'] -out_scalars_names = ['effective_energy'] -nbe_features = len(out_fields_names) + len(out_scalars_names) - -rec_mesh = CGNSToMesh(dataset_pred[ids_test[0]].get_mesh()) - - -prediction = [] - -count = 0 -for sample_index in tqdm(ids_test): - - sample_pred = dataset_pred[sample_index] - sample = dataset[sample_index] - - input_mesh = CGNSToMesh(sample.get_mesh()) - - space, numberings,_,_ = PrepareFEComputation(rec_mesh, numberOfComponents=1) - field_mesh = FEField("", mesh=rec_mesh, space=space, numbering=numberings[0]) - efilter = ElementFilter(dimensionality=rec_mesh.GetElementsDimensionality()) - op, status, _ = GetFieldTransferOp(inputField= field_mesh, targetPoints= input_mesh.nodes, method="Interp/Clamp" , elementFilter=efilter , verbose=False) - - prediction.append({}) - for fn in out_fields_names: - prediction[count][fn] = op.dot(sample_pred.get_field(fn)) - for sn in out_scalars_names: - prediction[count][sn] = sample_pred.get_scalar(sn) - - count += 1 - -with open('prediction_2dMultiScHypEl.pkl', 'wb') as file: - pickle.dump(prediction, file) - -print("duration construct predictions =", time.time()-start) -# 28 seconds diff --git a/benchmarks/FNO/2D_MultiScHypEl/prepare_2D_MultiScHypEl.py b/benchmarks/FNO/2D_MultiScHypEl/prepare_2D_MultiScHypEl.py deleted file mode 100644 index 7556376a..00000000 --- a/benchmarks/FNO/2D_MultiScHypEl/prepare_2D_MultiScHypEl.py +++ /dev/null @@ -1,121 +0,0 @@ - -from plaid import Sample -from Muscat.Bridges.CGNSBridge import CGNSToMesh -import numpy as np -from Muscat.Containers.Filters.FilterObjects import ElementFilter -from Muscat.Containers.MeshFieldOperations import GetFieldTransferOp -from Muscat.FE.Fields.FEField import FEField -from Muscat.Bridges.CGNSBridge import MeshToCGNS,CGNSToMesh -from Muscat.Containers.ConstantRectilinearMeshTools import CreateConstantRectilinearMesh -from Muscat.Containers.MeshTetrahedrization import Tetrahedrization -from Muscat.Containers.MeshModificationTools import ComputeSkin -from Muscat.FE.FETools import PrepareFEComputation -from Muscat.FE.FETools 
import ComputeNormalsAtPoints -import copy -from tqdm import tqdm - -import os, shutil -import time - - -plaid_location = # path to update -prepared_data_dir= # path to update - - -start = time.time() - - -from plaid import ProblemDefinition - - -def compute_signed_distance(mesh,eval_points): - """Function to compute the signed distance from the border of the mesh - - Args: - mesh (Muscat.Mesh): mesh which needs the singed distance - eval_points (np.array): Points where to compute the signed distance. - - Returns: - np.array: returns the signed distance of the mesh at th eeval points - """ - ComputeSkin(mesh,inPlace=True) - space, numberings,_,_ = PrepareFEComputation(mesh,numberOfComponents=1) - field_mesh = FEField("", mesh=mesh, space=space, numbering=numberings[0]) - opSkin, statusSkin, _ = GetFieldTransferOp(inputField= field_mesh, targetPoints= eval_points, method="Interp/Clamp" , elementFilter= ElementFilter(dimensionality=mesh.GetElementsDimensionality()-1) , verbose=False) - normals = ComputeNormalsAtPoints(mesh) - skinpos = opSkin.dot(mesh.nodes) - normalspos = opSkin.dot(normals) - sign_distance = -1*np.sign(np.sum((eval_points - skinpos)*normalspos,axis=-1)) - distance = np.sqrt(np.sum((eval_points - skinpos)**2,axis=-1)) - return sign_distance*distance - - - -datapath=os.path.join(plaid_location, "dataset") -pb_defpath=os.path.join(plaid_location, "problem_definition") - - - -in_scalars_names = ["C11","C12","C22"] -out_fields_names = ["u1", "u2", "P11", "P12", "P22", "P21", "psi"] -out_scalars_names = ["effective_energy"] - - -problem = ProblemDefinition() -problem._load_from_dir_(pb_defpath) - -ids_train = problem.get_split('DOE_train') -ids_test = problem.get_split('DOE_test') - - - -size=200 -rec_mesh = Tetrahedrization(CreateConstantRectilinearMesh([size+1,size+1], [0,0], [1/size, 1/size])) -out_nodes = rec_mesh.nodes - - -nSamples = len(ids_train)+len(ids_test) - -for sample_index in tqdm(range(nSamples)): - - sample = Sample.load_from_dir(dir_path = os.path.join(datapath, "samples/sample_{:09d}".format(sample_index))) - - input_mesh = CGNSToMesh(sample.get_mesh(time=0)) - - input_mesh.nodes= (input_mesh.nodes[:,[0,1]]).copy(order='C') - - new_sample=Sample() - tree = MeshToCGNS(rec_mesh) - new_sample.add_tree(tree,time=0) - - - if sample_index in ids_train: - scalar_names = in_scalars_names + out_scalars_names - - space, numberings,_,_ = PrepareFEComputation(input_mesh,numberOfComponents=1) - field_mesh = FEField("", mesh=input_mesh, space=space, numbering=numberings[0]) - efilter = ElementFilter(dimensionality=input_mesh.GetElementsDimensionality()) - op, status, _ = GetFieldTransferOp(inputField= field_mesh, targetPoints= out_nodes, method="Interp/Clamp" , elementFilter=efilter , verbose=False) - - for field_name in out_fields_names: - old_field = sample.get_field( name=field_name) - new_sample.add_field(field_name, op.dot(old_field)) - - elif sample_index in ids_test: - scalar_names = in_scalars_names - else: - raise("unkown sample_index") - - - for scalar_name in scalar_names: - old_scalar= sample.get_scalar( name=scalar_name) - new_sample.add_scalar(scalar_name, old_scalar) - new_sample.add_field("Signed_Distance",compute_signed_distance(copy.deepcopy(input_mesh),rec_mesh.nodes)) - - path = os.path.join(prepared_data_dir,"dataset/samples/sample_{:09d}".format(sample_index)) - if os.path.exists(path) and os.path.isdir(path): - shutil.rmtree(path) - new_sample.save(path) - -print("duration prepare =", time.time()-start) -# 126 seconds diff --git 
a/benchmarks/FNO/2D_MultiScHypEl/train_and_predict.py b/benchmarks/FNO/2D_MultiScHypEl/train_and_predict.py deleted file mode 100644 index df5761bd..00000000 --- a/benchmarks/FNO/2D_MultiScHypEl/train_and_predict.py +++ /dev/null @@ -1,166 +0,0 @@ -from plaid import Dataset -from plaid import ProblemDefinition -import numpy as np - -import os, shutil -import time - -import torch -from physicsnemo.models.fno.fno import FNO - -start = time.time() - - -plaid_location = # path to update -prepared_data_dir = # path to update -predicted_data_dir = # path to update - - -pb_defpath=os.path.join(plaid_location, "problem_definition") - -dataset = Dataset() -dataset._load_from_dir_(os.path.join(prepared_data_dir, "dataset"), verbose=True, processes_number=4) - -problem = ProblemDefinition() -problem._load_from_dir_(pb_defpath) - -ids_train = problem.get_split('DOE_train') -ids_test = problem.get_split('DOE_test') - - -n_train = len(ids_train) -n_test = len(ids_test) - - -in_scalars_names = ["C11","C12","C22"] -out_fields_names = ["u1", "u2", "P11", "P12", "P22", "P21", "psi"] -out_scalars_names = ["effective_energy"] - - -size = 201 - -# TRAIN - -inputs = np.empty((n_train, len(in_scalars_names)+1, size, size)) -for i, id_sample in enumerate(ids_train): - for in_chan in range(len(in_scalars_names)+1): - inputs[i, in_chan, :, :] = dataset[id_sample].get_field("Signed_Distance").reshape((size, size)) - for k, sn in enumerate(in_scalars_names): - inputs[i, k+1, :, :] = dataset[id_sample].get_scalar(sn) - -outputs = np.empty((n_train, len(out_scalars_names)+len(out_fields_names), size, size)) -for i, id_sample in enumerate(ids_train): - for k, fn in enumerate(out_fields_names): - outputs[i, k, :, :] = dataset[id_sample].get_field(fn).reshape((size, size)) - for k, sn in enumerate(out_scalars_names): - outputs[i, k+len(out_fields_names), :, :] = dataset[id_sample].get_scalar(sn) - - -min_in = inputs.min(axis=(0, 2, 3), keepdims=True) -max_in = inputs.max(axis=(0, 2, 3), keepdims=True) -inputs = (inputs - min_in) / (max_in - min_in) - - -min_out = outputs.min(axis=(0, 2, 3), keepdims=True) -max_out = outputs.max(axis=(0, 2, 3), keepdims=True) -outputs = (outputs - min_out) / (max_out - min_out) - - -import torch -from torch.utils.data import Dataset, TensorDataset - -class GridDataset(Dataset): - def __init__(self, inputs, outputs): - self.inputs = torch.tensor(inputs, dtype=torch.float32) - self.outputs = torch.tensor(outputs, dtype=torch.float32) - - def __len__(self): - return self.inputs.shape[0] - - def __getitem__(self, idx): - return self.inputs[idx], self.outputs[idx] - -from torch.utils.data import DataLoader -dataset__ = GridDataset(inputs, outputs) -loader = DataLoader(dataset__, batch_size=64, shuffle=True) - - - -model = FNO( -in_channels=inputs.shape[1], -out_channels=outputs.shape[1], -decoder_layers=4, -decoder_layer_size=64, -dimension=2, -latent_channels=64, -num_fno_layers=4, -padding=0, -).cuda() - - -optimizer = torch.optim.Adam(model.parameters(), lr=1e-3) -loss_fn = torch.nn.MSELoss() - -n_epoch = 2000 -for epoch in range(n_epoch): - model.train() - total_loss = 0.0 - for x, y in loader: - x, y = x.cuda(), y.cuda() - pred = model(x) - loss = loss_fn(pred, y) - optimizer.zero_grad() - loss.backward() - optimizer.step() - total_loss += loss.item() - print(f"Epoch {epoch+1}, Loss: {total_loss/len(loader)}") - - - -# TEST - -inputs = np.empty((n_test, len(in_scalars_names)+1, size, size)) - -for i, id_sample in enumerate(ids_test): - for in_chan in range(len(in_scalars_names)+1): - 
inputs[i, in_chan, :, :] = dataset[id_sample].get_field("Signed_Distance").reshape((size, size)) - for k, sn in enumerate(in_scalars_names): - inputs[i, k+1, :, :] = dataset[id_sample].get_scalar(sn) - - -inputs = (inputs - min_in) / (max_in - min_in) - - - -test_dataset = TensorDataset(torch.tensor(inputs, dtype=torch.float32)) -test_loader = DataLoader(test_dataset, batch_size=1, shuffle=False) - -model.eval() -y_preds = [] - -with torch.no_grad(): - for batch in test_loader: - x_batch = batch[0].cuda() - y_batch_pred = model(x_batch).cpu() - y_preds.append(y_batch_pred) - -y_pred = torch.cat(y_preds, dim=0).numpy() - -outputs_pred = y_pred * (max_out - min_out) + min_out - - - -for i, id_sample in enumerate(ids_test): - for k, fn in enumerate(out_fields_names): - dataset[id_sample].add_field(fn, outputs_pred[i, k, :, :].flatten()) - for k, sn in enumerate(out_scalars_names): - dataset[id_sample].add_scalar(sn, np.mean(outputs_pred[i, k+len(out_fields_names), :, :].flatten())) - - -if os.path.exists(predicted_data_dir) and os.path.isdir(predicted_data_dir): - shutil.rmtree(predicted_data_dir) -dataset[ids_test]._save_to_dir_(predicted_data_dir) - - -print("duration train =", time.time()-start) -# GPUA30, 15130 seconds diff --git a/benchmarks/FNO/2D_profile/README.md b/benchmarks/FNO/2D_profile/README.md deleted file mode 100644 index ba7dcc9d..00000000 --- a/benchmarks/FNO/2D_profile/README.md +++ /dev/null @@ -1,7 +0,0 @@ -To run this benchmark, you need to download and untar the dataset from Zenodo, then edit in the files "prepare_[dataset].py", "train.py" and "construct_prediction.py" the following variables at the top of the files: - -- `plaid_location`: the location where the plaid dataset has been untarred -- `prepared_data_dir`: temp folder used by the scripts, for the dataset projected onto a regular grid -- `predicted_data_dir`: temp folder used by the scripts, for the prediction of the test set onto the regular grid - -Run in the order: "prepare_[dataset].py", "train.py" and "construct_prediction.py" to generate the prediction on the testing set. 
\ No newline at end of file diff --git a/benchmarks/FNO/2D_profile/construct_prediction.py b/benchmarks/FNO/2D_profile/construct_prediction.py deleted file mode 100644 index b9fcf576..00000000 --- a/benchmarks/FNO/2D_profile/construct_prediction.py +++ /dev/null @@ -1,74 +0,0 @@ -from plaid import Dataset -from plaid import ProblemDefinition -from Muscat.Bridges.CGNSBridge import CGNSToMesh -import numpy as np -from Muscat.Containers.Filters.FilterObjects import ElementFilter -from Muscat.Containers.MeshFieldOperations import GetFieldTransferOp -from Muscat.FE.Fields.FEField import FEField -from Muscat.Bridges.CGNSBridge import CGNSToMesh -from Muscat.FE.FETools import PrepareFEComputation -from tqdm import tqdm - -import os, pickle, time, copy, shutil - -start = time.time() - - -plaid_location = # path to update -predicted_data_dir = # path to update - - -datapath=os.path.join(plaid_location, "dataset") -pb_defpath=os.path.join(plaid_location, "problem_definition") - - - -dataset_pred = Dataset() -dataset_pred._load_from_dir_(predicted_data_dir, verbose=True, processes_number=4) - -problem = ProblemDefinition() -problem._load_from_dir_(pb_defpath) - -ids_train = problem.get_split('train') -ids_test = problem.get_split('test') - - -dataset = Dataset() -dataset._load_from_dir_(datapath, ids=ids_test, verbose=True, processes_number=4) - - -n_train = len(ids_train) -n_test = len(ids_test) - -out_fields_names = ['Mach', 'Pressure', 'Velocity-x', 'Velocity-y'] -nbe_features = len(out_fields_names) - -rec_mesh = CGNSToMesh(dataset_pred[ids_test[0]].get_mesh()) - - -prediction = [] - -count = 0 -for sample_index in tqdm(ids_test): - - sample_pred = dataset_pred[sample_index] - sample = dataset[sample_index] - - input_mesh = CGNSToMesh(sample.get_mesh()) - - space, numberings,_,_ = PrepareFEComputation(rec_mesh, numberOfComponents=1) - field_mesh = FEField("", mesh=rec_mesh, space=space, numbering=numberings[0]) - efilter = ElementFilter(dimensionality=rec_mesh.GetElementsDimensionality()) - op, status, _ = GetFieldTransferOp(inputField= field_mesh, targetPoints= input_mesh.nodes, method="Interp/Clamp" , elementFilter=efilter , verbose=False) - - prediction.append({}) - for fn in out_fields_names: - prediction[count][fn] = op.dot(sample_pred.get_field(fn)) - - count += 1 - -with open('prediction_2d_profile.pkl', 'wb') as file: - pickle.dump(prediction, file) - -print("duration construct predictions =", time.time()-start) -# 95 seconds diff --git a/benchmarks/FNO/2D_profile/prepare_2d_profile.py b/benchmarks/FNO/2D_profile/prepare_2d_profile.py deleted file mode 100644 index 33e20ac6..00000000 --- a/benchmarks/FNO/2D_profile/prepare_2d_profile.py +++ /dev/null @@ -1,110 +0,0 @@ - -from plaid import ProblemDefinition -from plaid import Sample -from Muscat.Bridges.CGNSBridge import CGNSToMesh -import numpy as np -from Muscat.Containers.Filters.FilterObjects import ElementFilter -from Muscat.Containers.MeshFieldOperations import GetFieldTransferOp -from Muscat.FE.Fields.FEField import FEField -from Muscat.Bridges.CGNSBridge import MeshToCGNS,CGNSToMesh -from Muscat.Containers.ConstantRectilinearMeshTools import CreateConstantRectilinearMesh -from Muscat.Containers.MeshTetrahedrization import Tetrahedrization -from Muscat.Containers.MeshModificationTools import ComputeSkin -from Muscat.FE.FETools import PrepareFEComputation -from Muscat.FE.FETools import ComputeNormalsAtPoints -import copy -from tqdm import tqdm - -import os, shutil, time - - -plaid_location = # path to update -prepared_data_dir = # 
path to update - - - -start = time.time() - -def compute_signed_distance(mesh,eval_points): - """Function to compute the signed distance from the border of the mesh - - Args: - mesh (Muscat.Mesh): mesh which needs the singed distance - eval_points (np.array): Points where to compute the signed distance. - - Returns: - np.array: returns the signed distance of the mesh at th eeval points - """ - ComputeSkin(mesh,inPlace=True) - space, numberings,_,_ = PrepareFEComputation(mesh,numberOfComponents=1) - field_mesh = FEField("", mesh=mesh, space=space, numbering=numberings[0]) - opSkin, statusSkin, _ = GetFieldTransferOp(inputField= field_mesh, targetPoints= eval_points, method="Interp/Clamp" , elementFilter= ElementFilter(dimensionality=mesh.GetElementsDimensionality()-1) , verbose=False) - normals = ComputeNormalsAtPoints(mesh) - skinpos = opSkin.dot(mesh.nodes) - normalspos = opSkin.dot(normals) - sign_distance = -1*np.sign(np.sum((eval_points - skinpos)*normalspos,axis=-1)) - distance = np.sqrt(np.sum((eval_points - skinpos)**2,axis=-1)) - return sign_distance*distance - - - -datapath=os.path.join(plaid_location, "dataset") -pb_defpath=os.path.join(plaid_location, "problem_definition") - - -out_fields_names = ["Mach", "Pressure", "Velocity-x", "Velocity-y"] - - -problem = ProblemDefinition() -problem._load_from_dir_(pb_defpath) - -ids_train = problem.get_split('train') -ids_test = problem.get_split('test') - - -size=200 -rec_mesh = Tetrahedrization(CreateConstantRectilinearMesh([3*size+1,4*size+1], [-1,-2], [1/size, 1/size])) -out_nodes = rec_mesh.nodes - - -nSamples = len(ids_train)+len(ids_test) - -for sample_index in tqdm(range(nSamples)): - - sample = Sample.load_from_dir(dir_path = os.path.join(datapath, "samples/sample_{:09d}".format(sample_index))) - - input_mesh = CGNSToMesh(sample.get_mesh(time=0)) - - input_mesh.nodes= (input_mesh.nodes[:,[0,1]]).copy(order='C') - - new_sample=Sample() - tree = MeshToCGNS(rec_mesh) - new_sample.add_tree(tree,time=0) - - - if sample_index in ids_train: - - space, numberings,_,_ = PrepareFEComputation(input_mesh,numberOfComponents=1) - field_mesh = FEField("", mesh=input_mesh, space=space, numbering=numberings[0]) - efilter = ElementFilter(dimensionality=input_mesh.GetElementsDimensionality()) - op, status, _ = GetFieldTransferOp(inputField= field_mesh, targetPoints= out_nodes, method="Interp/Clamp" , elementFilter=efilter , verbose=False) - - for field_name in out_fields_names: - old_field = sample.get_field( name=field_name) - new_sample.add_field(field_name, op.dot(old_field)) - - elif sample_index in ids_test: - pass - else: - raise("unkown sample_index") - - - new_sample.add_field("Signed_Distance",compute_signed_distance(copy.deepcopy(input_mesh),rec_mesh.nodes)) - - path = os.path.join(prepared_data_dir,"dataset/samples/sample_{:09d}".format(sample_index)) - if os.path.exists(path) and os.path.isdir(path): - shutil.rmtree(path) - new_sample.save(path) - -print("duration prepare =", time.time()-start) -# 408 seconds diff --git a/benchmarks/FNO/2D_profile/train_and_predict.py b/benchmarks/FNO/2D_profile/train_and_predict.py deleted file mode 100644 index fbf0751d..00000000 --- a/benchmarks/FNO/2D_profile/train_and_predict.py +++ /dev/null @@ -1,153 +0,0 @@ -from plaid import Dataset -from plaid import ProblemDefinition -import numpy as np - -import os, shutil -import time - -import torch -from physicsnemo.models.fno.fno import FNO - -start = time.time() - - -pb_defpath = # path to update -prepared_data_dir = # path to update -predicted_data_dir 
= # path to update - - -dataset = Dataset() -dataset._load_from_dir_(os.path.join(prepared_data_dir, "dataset"), verbose=True, processes_number=4) - -problem = ProblemDefinition() -problem._load_from_dir_(pb_defpath) - -ids_train = problem.get_split('train') -ids_test = problem.get_split('test') - - -n_train = len(ids_train) -n_test = len(ids_test) - - -out_fields_names = ["Mach", "Pressure", "Velocity-x", "Velocity-y"] - -size1 = 601 -size2 = 801 - - -# TRAIN - -inputs = np.empty((n_train, 1, size1, size2)) -for i, id_sample in enumerate(ids_train): - inputs[i, 0, :, :] = dataset[id_sample].get_field("Signed_Distance").reshape((size1, size2)) - -outputs = np.empty((n_train, len(out_fields_names), size1, size2)) -for i, id_sample in enumerate(ids_train): - for k, fn in enumerate(out_fields_names): - outputs[i, k, :, :] = dataset[id_sample].get_field(fn).reshape((size1, size2)) - - -min_in = inputs.min(axis=(0, 2, 3), keepdims=True) -max_in = inputs.max(axis=(0, 2, 3), keepdims=True) -inputs = (inputs - min_in) / (max_in - min_in) - - -min_out = outputs.min(axis=(0, 2, 3), keepdims=True) -max_out = outputs.max(axis=(0, 2, 3), keepdims=True) -outputs = (outputs - min_out) / (max_out - min_out) - - -import torch -from torch.utils.data import Dataset - -class GridDataset(Dataset): - def __init__(self, inputs, outputs): - self.inputs = torch.tensor(inputs, dtype=torch.float32) - self.outputs = torch.tensor(outputs, dtype=torch.float32) - - def __len__(self): - return self.inputs.shape[0] - - def __getitem__(self, idx): - return self.inputs[idx], self.outputs[idx] - -from torch.utils.data import DataLoader, TensorDataset -dataset__ = GridDataset(inputs, outputs) -loader = DataLoader(dataset__, batch_size=16, shuffle=True) - - - -model = FNO( -in_channels=inputs.shape[1], -out_channels=outputs.shape[1], -decoder_layers=2, -decoder_layer_size=32, -dimension=2, -latent_channels=32, -num_fno_layers=4, -num_fno_modes=32, -padding=0, -).cuda() - - -optimizer = torch.optim.Adam(model.parameters(), lr=1e-3) -loss_fn = torch.nn.MSELoss() - -n_epoch = 200 -for epoch in range(n_epoch): - model.train() - total_loss = 0.0 - for x, y in loader: - x, y = x.cuda(), y.cuda() - pred = model(x) - loss = loss_fn(pred, y) - optimizer.zero_grad() - loss.backward() - optimizer.step() - total_loss += loss.item() - print(f"Epoch {epoch+1}, Loss: {total_loss/len(loader)}") - - - -# TEST - -inputs = np.empty((n_test, 1, size1, size2)) -for i, id_sample in enumerate(ids_test): - inputs[i, 0, :, :] = dataset[id_sample].get_field("Signed_Distance").reshape((size1, size2)) - - -inputs = (inputs - min_in) / (max_in - min_in) - -test_dataset = TensorDataset(torch.tensor(inputs, dtype=torch.float32)) -test_loader = DataLoader(test_dataset, batch_size=16, shuffle=False) - - - -model.eval() -y_preds = [] - -with torch.no_grad(): - for batch in test_loader: - x_batch = batch[0].cuda() - y_batch_pred = model(x_batch).cpu() - y_preds.append(y_batch_pred) - -y_pred = torch.cat(y_preds, dim=0).numpy() - - -outputs_pred = y_pred * (max_out - min_out) + min_out - - -for i, id_sample in enumerate(ids_test): - for k, fn in enumerate(out_fields_names): - dataset[id_sample].add_field(fn, outputs_pred[i, k, :, :].flatten()) - - -if os.path.exists(predicted_data_dir) and os.path.isdir(predicted_data_dir): - shutil.rmtree(predicted_data_dir) -dataset[ids_test]._save_to_dir_(predicted_data_dir) - - -print("duration train =", time.time()-start) -# GPUA30, 8887 seconds diff --git a/benchmarks/FNO/README.md b/benchmarks/FNO/README.md deleted file 
mode 100644 index 3642e969..00000000 --- a/benchmarks/FNO/README.md +++ /dev/null @@ -1,7 +0,0 @@ -### List of dependencies: - -- [PLAID=0.1.6](https://github.com/PLAID-lib/plaid) -- [PhysicsNemo=1.1.1](https://github.com/NVIDIA/physicsnemo) -- Muscat: - - 2D_ElPlDynamics: [Muscat=87189dab](https://git.cloud.safran/safransa/safrantech/muscat/muscat/-/tree/87189dab624f058706786ce16dcfd7736ddddef5) - - Other datasets: [Muscat=2.4.1](https://gitlab.com/drti/muscat) \ No newline at end of file diff --git a/benchmarks/FNO/Rotor37/README.md b/benchmarks/FNO/Rotor37/README.md deleted file mode 100644 index ba7dcc9d..00000000 --- a/benchmarks/FNO/Rotor37/README.md +++ /dev/null @@ -1,7 +0,0 @@ -To run this benchmark, you need to download and untar the dataset from Zenodo, then edit in the files "prepare_[dataset].py", "train.py" and "construct_prediction.py" the following variables at the top of the files: - -- `plaid_location`: the location where the plaid dataset has been untarred -- `prepared_data_dir`: temp folder used by the scripts, for the dataset projected onto a regular grid -- `predicted_data_dir`: temp folder used by the scripts, for the prediction of the test set onto the regular grid - -Run in the order: "prepare_[dataset].py", "train.py" and "construct_prediction.py" to generate the prediction on the testing set. \ No newline at end of file diff --git a/benchmarks/FNO/Rotor37/construct_prediction.py b/benchmarks/FNO/Rotor37/construct_prediction.py deleted file mode 100644 index 8b2aa824..00000000 --- a/benchmarks/FNO/Rotor37/construct_prediction.py +++ /dev/null @@ -1,76 +0,0 @@ -from plaid import Dataset -from plaid import ProblemDefinition -from Muscat.Bridges.CGNSBridge import CGNSToMesh -import numpy as np -from Muscat.Containers.Filters.FilterObjects import ElementFilter -from Muscat.Containers.MeshFieldOperations import GetFieldTransferOp -from Muscat.FE.Fields.FEField import FEField -from Muscat.Bridges.CGNSBridge import CGNSToMesh -from Muscat.FE.FETools import PrepareFEComputation -from tqdm import tqdm - -import os, pickle, time, copy, shutil - -start = time.time() - - -plaid_location = # path to update -predicted_data_dir = # path to update - - - -datapath=os.path.join(plaid_location, "dataset") -pb_defpath=os.path.join(plaid_location, "problem_definition") - -dataset_pred = Dataset() -dataset_pred._load_from_dir_(predicted_data_dir, verbose=True, processes_number=4) - -problem = ProblemDefinition() -problem._load_from_dir_(pb_defpath) - -ids_train = problem.get_split('train_1000') -ids_test = problem.get_split('test') - - -dataset = Dataset() -dataset._load_from_dir_(datapath, ids=ids_test, verbose=True, processes_number=4) - - -n_train = len(ids_train) -n_test = len(ids_test) - -out_fields_names = ['Density', 'Pressure', 'Temperature'] -out_scalars_names = ['Massflow', 'Compression_ratio', 'Efficiency'] -nbe_features = len(out_fields_names) + len(out_scalars_names) - -rec_mesh = CGNSToMesh(dataset_pred[ids_test[0]].get_mesh()) - - -prediction = [] - -count = 0 -for sample_index in tqdm(ids_test): - - sample_pred = dataset_pred[sample_index] - sample = dataset[sample_index] - - input_mesh = CGNSToMesh(sample.get_mesh()) - - space, numberings,_,_ = PrepareFEComputation(rec_mesh, numberOfComponents=1) - field_mesh = FEField("", mesh=rec_mesh, space=space, numbering=numberings[0]) - efilter = ElementFilter(dimensionality=rec_mesh.GetElementsDimensionality()) - op, status, _ = GetFieldTransferOp(inputField= field_mesh, targetPoints= input_mesh.nodes, method="Interp/Clamp" , 
elementFilter=efilter , verbose=False) - - prediction.append({}) - for fn in out_fields_names: - prediction[count][fn] = op.dot(sample_pred.get_field(fn)) - for sn in out_scalars_names: - prediction[count][sn] = sample_pred.get_scalar(sn) - - count += 1 - -with open('prediction_rotor37.pkl', 'wb') as file: - pickle.dump(prediction, file) - -print("duration construct predictions =", time.time()-start) -# 114 seconds diff --git a/benchmarks/FNO/Rotor37/prepare_rotor37.py b/benchmarks/FNO/Rotor37/prepare_rotor37.py deleted file mode 100644 index 8bc6f8ae..00000000 --- a/benchmarks/FNO/Rotor37/prepare_rotor37.py +++ /dev/null @@ -1,126 +0,0 @@ -from plaid import Sample -from plaid import ProblemDefinition -from Muscat.Bridges.CGNSBridge import CGNSToMesh -import numpy as np -from Muscat.Containers.Filters.FilterObjects import ElementFilter -from Muscat.Containers.MeshFieldOperations import GetFieldTransferOp -from Muscat.FE.Fields.FEField import FEField -from Muscat.Bridges.CGNSBridge import MeshToCGNS,CGNSToMesh -from Muscat.Containers.ConstantRectilinearMeshTools import CreateConstantRectilinearMesh -from Muscat.Containers.MeshModificationTools import ComputeSkin -from Muscat.FE.FETools import PrepareFEComputation -from Muscat.FE.FETools import ComputeNormalsAtPoints - -import os, shutil, time -from tqdm import tqdm - - - - -plaid_location = # path to update -prepared_data_dir = # path to update - - - - -start = time.time() - - -def compute_signed_distance(mesh,eval_points): - """Function to compute the signed distance from the border of the mesh - - Args: - mesh (Muscat.Mesh): mesh which needs the singed distance - eval_points (np.array): Points where to compute the signed distance. - - Returns: - np.array: returns the signed distance of the mesh at th eeval points - """ - ComputeSkin(mesh,inPlace=True) - space, numberings,_,_ = PrepareFEComputation(mesh,numberOfComponents=1) - field_mesh = FEField("", mesh=mesh, space=space, numbering=numberings[0]) - opSkin, statusSkin, _ = GetFieldTransferOp(inputField= field_mesh, targetPoints= eval_points, method="Interp/Clamp" , elementFilter= ElementFilter(dimensionality=mesh.GetElementsDimensionality()-1) , verbose=False) - normals = ComputeNormalsAtPoints(mesh) - skinpos = opSkin.dot(mesh.nodes) - normalspos = opSkin.dot(normals) - sign_distance = -1*np.sign(np.sum((eval_points - skinpos)*normalspos,axis=-1)) - distance = np.sqrt(np.sum((eval_points - skinpos)**2,axis=-1)) - return sign_distance*distance - - - -datapath=os.path.join(plaid_location, "dataset") -pb_defpath=os.path.join(plaid_location, "problem_definition") - - -problem = ProblemDefinition() -problem._load_from_dir_(pb_defpath) - -ids_train = problem.get_split('train_1000') -ids_test = problem.get_split('test') - - - -in_scalars_names = ['Omega', 'P'] -out_fields_names = ['Density', 'Pressure', 'Temperature'] -out_scalars_names = ['Massflow', 'Compression_ratio', 'Efficiency'] - - -size = 64 - -length = [0.06/(size-1), 0.083/(size-1), 0.06/(size-1)] -origin = [-0.01, 0.172, -0.03] - -ref_mesh = CreateConstantRectilinearMesh([size,size,size], origin, length) - - -out_nodes = ref_mesh.nodes - -for sample_index in tqdm(range(len(ids_train)+len(ids_test))): - - sample = Sample.load_from_dir(dir_path = os.path.join(datapath, "samples/sample_{:09d}".format(sample_index))) - input_mesh = CGNSToMesh(sample.get_mesh(time=0)) - - - new_sample=Sample() - tree = MeshToCGNS(ref_mesh) - new_sample.add_tree(tree) - - - if sample_index in ids_train: - scalar_names = in_scalars_names + 
out_scalars_names - - space, numberings,_,_ = PrepareFEComputation(input_mesh,numberOfComponents=1) - field_mesh = FEField("", mesh=input_mesh, space=space, numbering=numberings[0]) - efilter = ElementFilter(dimensionality=input_mesh.GetElementsDimensionality()) - op, status, _ = GetFieldTransferOp(inputField= field_mesh, targetPoints= out_nodes, method="Interp/Clamp", elementFilter=efilter, verbose=False) - - for fn in out_fields_names: - projected_field = op.dot(sample.get_field(fn)) - new_sample.add_field(fn, projected_field) - - elif sample_index in ids_test: - scalar_names = in_scalars_names - else: - raise("unkown sample_index") - - - skinpos = op.dot(input_mesh.nodes) - sign_distance = np.linalg.norm(out_nodes - skinpos,axis=-1) - - new_sample.add_field("Signed_Distance", sign_distance) - - - for scalar_name in scalar_names: - old_scalar= sample.get_scalar( name=scalar_name) - new_sample.add_scalar(scalar_name, old_scalar) - - - path = os.path.join(prepared_data_dir,"dataset/samples/sample_{:09d}".format(sample_index)) - if os.path.exists(path) and os.path.isdir(path): - shutil.rmtree(path) - new_sample.save(path) - - -print("duration prepare =", time.time()-start) -# 1689 seconds diff --git a/benchmarks/FNO/Rotor37/train_and_predict.py b/benchmarks/FNO/Rotor37/train_and_predict.py deleted file mode 100644 index 5aaefe09..00000000 --- a/benchmarks/FNO/Rotor37/train_and_predict.py +++ /dev/null @@ -1,167 +0,0 @@ -from plaid import Dataset -from plaid import ProblemDefinition -import numpy as np - -import os, shutil -import time - -import torch -from physicsnemo.models.fno.fno import FNO - -start = time.time() - - -plaid_location = # path to update -prepared_data_dir = # path to update -predicted_data_dir = # path to update - - - -pb_defpath=os.path.join(plaid_location, "problem_definition") - -dataset = Dataset() -dataset._load_from_dir_(os.path.join(prepared_data_dir, "dataset"), verbose=True, processes_number=4) - -problem = ProblemDefinition() -problem._load_from_dir_(pb_defpath) - -ids_train = problem.get_split('train_1000') -ids_test = problem.get_split('test') - - -n_train = len(ids_train) -n_test = len(ids_test) - - -in_scalars_names = ['Omega', 'P'] -out_fields_names = ['Density', 'Pressure', 'Temperature'] -out_scalars_names = ['Massflow', 'Compression_ratio', 'Efficiency'] - - -size = 64 - - -# TRAIN -inputs = np.empty((n_train, len(in_scalars_names)+1, size, size, size)) -for i, id_sample in enumerate(ids_train): - for in_chan in range(len(in_scalars_names)+1): - inputs[i, in_chan, :, :, :] = dataset[id_sample].get_field("Signed_Distance").reshape((size, size, size)) - for k, sn in enumerate(in_scalars_names): - inputs[i, k+1, :, :, :] = dataset[id_sample].get_scalar(sn) - -outputs = np.empty((n_train, len(out_scalars_names)+len(out_fields_names), size, size, size)) -for i, id_sample in enumerate(ids_train): - for k, fn in enumerate(out_fields_names): - outputs[i, k, :, :, :] = dataset[id_sample].get_field(fn).reshape((size, size, size)) - for k, sn in enumerate(out_scalars_names): - outputs[i, k+len(out_fields_names), :, :, :] = dataset[id_sample].get_scalar(sn) - - -min_in = inputs.min(axis=(0, 2, 3, 4), keepdims=True) -max_in = inputs.max(axis=(0, 2, 3, 4), keepdims=True) -inputs = (inputs - min_in) / (max_in - min_in + 1e-8) - - -min_out = outputs.min(axis=(0, 2, 3, 4), keepdims=True) -max_out = outputs.max(axis=(0, 2, 3, 4), keepdims=True) -outputs = (outputs - min_out) / (max_out - min_out + 1e-8) - - -import torch -from torch.utils.data import Dataset, TensorDataset 
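The normalization above is the per-channel min-max scaling used throughout these FNO scripts: statistics are taken over the sample and spatial axes, the channel axis is kept, and the same statistics are reused later to de-normalize predictions. A minimal standalone sketch of that pattern (array shapes and variable names are illustrative, not the benchmark's actual dimensions):

```python
import numpy as np

# Illustrative shape: (n_samples, n_channels, *spatial_dims)
inputs = np.random.rand(8, 3, 16, 16, 16)

# Per-channel statistics over the sample and spatial axes; the channel axis is kept.
min_in = inputs.min(axis=(0, 2, 3, 4), keepdims=True)
max_in = inputs.max(axis=(0, 2, 3, 4), keepdims=True)

# Scale to [0, 1]; the small epsilon guards against constant channels.
scaled = (inputs - min_in) / (max_in - min_in + 1e-8)

# The inverse transform, applied to predictions at test time.
restored = scaled * (max_in - min_in + 1e-8) + min_in
assert np.allclose(inputs, restored)
```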
- -class GridDataset(Dataset): - def __init__(self, inputs, outputs): - self.inputs = torch.tensor(inputs, dtype=torch.float32) - self.outputs = torch.tensor(outputs, dtype=torch.float32) - - def __len__(self): - return self.inputs.shape[0] - - def __getitem__(self, idx): - return self.inputs[idx], self.outputs[idx] - -from torch.utils.data import DataLoader -dataset__ = GridDataset(inputs, outputs) -loader = DataLoader(dataset__, batch_size=1, shuffle=True) - - -model = FNO( -in_channels=inputs.shape[1], -out_channels=outputs.shape[1], -decoder_layers=4, -decoder_layer_size=32, -dimension=3, -latent_channels=32, -num_fno_layers=4, -num_fno_modes=32, -padding=2, -).cuda() - - -optimizer = torch.optim.Adam(model.parameters(), lr=1e-3) -loss_fn = torch.nn.MSELoss() - -n_epoch = 100 -for epoch in range(n_epoch): - model.train() - total_loss = 0.0 - for x, y in loader: - x, y = x.cuda(), y.cuda() - pred = model(x) - loss = loss_fn(pred, y) - optimizer.zero_grad() - loss.backward() - optimizer.step() - total_loss += loss.item() - print(f"Epoch {epoch+1}, Loss: {total_loss/len(loader)}") - - - -# TEST - -inputs = np.empty((n_test, len(in_scalars_names)+1, size, size, size)) - -for i, id_sample in enumerate(ids_test): - for in_chan in range(len(in_scalars_names)+1): - inputs[i, in_chan, :, :, :] = dataset[id_sample].get_field("Signed_Distance").reshape((size, size, size)) - for k, sn in enumerate(in_scalars_names): - inputs[i, k+1, :, :, :] = dataset[id_sample].get_scalar(sn) - -inputs = (inputs - min_in) / (max_in - min_in+ 1e-8) - - -test_dataset = TensorDataset(torch.tensor(inputs, dtype=torch.float32)) -test_loader = DataLoader(test_dataset, batch_size=1, shuffle=False) - - - -model.eval() -y_preds = [] - -with torch.no_grad(): - for batch in test_loader: - x_batch = batch[0].cuda() - y_batch_pred = model(x_batch).cpu() - y_preds.append(y_batch_pred) - -y_pred = torch.cat(y_preds, dim=0).numpy() - - -outputs_pred = y_pred * (max_out - min_out+ 1e-8) + min_out - - -for i, id_sample in enumerate(ids_test): - for k, fn in enumerate(out_fields_names): - dataset[id_sample].add_field(fn, outputs_pred[i, k, :, :, :].flatten()) - for k, sn in enumerate(out_scalars_names): - dataset[id_sample].add_scalar(sn, np.mean(outputs_pred[i, k+len(out_fields_names), :, :, :].flatten())) - - -if os.path.exists(predicted_data_dir) and os.path.isdir(predicted_data_dir): - shutil.rmtree(predicted_data_dir) -dataset[ids_test]._save_to_dir_(predicted_data_dir) - - -print("duration train =", time.time()-start) -# GPUA30, 26370 seconds diff --git a/benchmarks/FNO/Tensile2d/README.md b/benchmarks/FNO/Tensile2d/README.md deleted file mode 100644 index ba7dcc9d..00000000 --- a/benchmarks/FNO/Tensile2d/README.md +++ /dev/null @@ -1,7 +0,0 @@ -To run this benchmark, you need to download and untar the dataset from Zenodo, then edit in the files "prepare_[dataset].py", "train.py" and "construct_prediction.py" the following variables at the top of the files: - -- `plaid_location`: the location where the plaid dataset has been untarred -- `prepared_data_dir`: temp folder used by the scripts, for the dataset projected onto a regular grid -- `predicted_data_dir`: temp folder used by the scripts, for the prediction of the test set onto the regular grid - -Run in the order: "prepare_[dataset].py", "train.py" and "construct_prediction.py" to generate the prediction on the testing set. 
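For readers who still need to run these now-removed Tensile2d scripts from an archived checkout, the three variables to edit at the top of each file look like this; the paths below are hypothetical placeholders, not defaults shipped with the benchmark:

```python
# Hypothetical local paths; adjust to your own setup before running the scripts.
plaid_location = "/data/Tensile2d/plaid"         # where the Zenodo archive was untarred
prepared_data_dir = "/tmp/tensile2d_prepared"    # scratch folder for the regular-grid projection
predicted_data_dir = "/tmp/tensile2d_predicted"  # scratch folder for the test-set predictions
```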
\ No newline at end of file diff --git a/benchmarks/FNO/Tensile2d/construct_prediction.py b/benchmarks/FNO/Tensile2d/construct_prediction.py deleted file mode 100644 index ca18a367..00000000 --- a/benchmarks/FNO/Tensile2d/construct_prediction.py +++ /dev/null @@ -1,77 +0,0 @@ -from plaid import Dataset -from plaid import ProblemDefinition -from Muscat.Bridges.CGNSBridge import CGNSToMesh -import numpy as np -from Muscat.Containers.Filters.FilterObjects import ElementFilter -from Muscat.Containers.MeshFieldOperations import GetFieldTransferOp -from Muscat.FE.Fields.FEField import FEField -from Muscat.Bridges.CGNSBridge import CGNSToMesh -from Muscat.FE.FETools import PrepareFEComputation -from tqdm import tqdm - -import os, pickle, time - -start = time.time() - - -plaid_location = # path to update -predicted_data_dir = # path to update - - - -datapath=os.path.join(plaid_location, "dataset") -pb_defpath=os.path.join(plaid_location, "problem_definition") - - -dataset_pred = Dataset() -dataset_pred._load_from_dir_(predicted_data_dir, verbose=True, processes_number=4) - -problem = ProblemDefinition() -problem._load_from_dir_(pb_defpath) - -ids_train = problem.get_split('train_500') -ids_test = problem.get_split('test') - - -dataset = Dataset() -dataset._load_from_dir_(datapath, ids=ids_test, verbose=True, processes_number=4) - - -n_train = len(ids_train) -n_test = len(ids_test) - -out_fields_names = ['U1', 'U2', 'sig11', 'sig22', 'sig12'] -out_scalars_names = ['max_von_mises', 'max_U2_top', 'max_sig22_top'] -nbe_features = len(out_fields_names) + len(out_scalars_names) - -rec_mesh = CGNSToMesh(dataset_pred[ids_test[0]].get_mesh()) - - -prediction = [] - -count = 0 -for sample_index in tqdm(ids_test): - - sample_pred = dataset_pred[sample_index] - sample = dataset[sample_index] - - input_mesh = CGNSToMesh(sample.get_mesh()) - - space, numberings,_,_ = PrepareFEComputation(rec_mesh, numberOfComponents=1) - field_mesh = FEField("", mesh=rec_mesh, space=space, numbering=numberings[0]) - efilter = ElementFilter(dimensionality=rec_mesh.GetElementsDimensionality()) - op, status, _ = GetFieldTransferOp(inputField= field_mesh, targetPoints= input_mesh.nodes, method="Interp/Clamp" , elementFilter=efilter , verbose=False) - - prediction.append({}) - for fn in out_fields_names: - prediction[count][fn] = op.dot(sample_pred.get_field(fn)) - for sn in out_scalars_names: - prediction[count][sn] = sample_pred.get_scalar(sn) - - count += 1 - -with open('prediction_tensile2d.pkl', 'wb') as file: - pickle.dump(prediction, file) - -print("duration construct predictions =", time.time()-start) -# 15 seconds diff --git a/benchmarks/FNO/Tensile2d/prepare_tensile2d.py b/benchmarks/FNO/Tensile2d/prepare_tensile2d.py deleted file mode 100644 index d90d1d46..00000000 --- a/benchmarks/FNO/Tensile2d/prepare_tensile2d.py +++ /dev/null @@ -1,116 +0,0 @@ -from plaid import Sample -from plaid import ProblemDefinition -from Muscat.Bridges.CGNSBridge import CGNSToMesh -import numpy as np -from Muscat.Containers.Filters.FilterObjects import ElementFilter -from Muscat.Containers.MeshFieldOperations import GetFieldTransferOp -from Muscat.FE.Fields.FEField import FEField -from Muscat.Bridges.CGNSBridge import MeshToCGNS,CGNSToMesh -from Muscat.Containers.ConstantRectilinearMeshTools import CreateConstantRectilinearMesh -from Muscat.Containers.MeshTetrahedrization import Tetrahedrization -from Muscat.Containers.MeshModificationTools import ComputeSkin -from Muscat.FE.FETools import PrepareFEComputation -from Muscat.FE.FETools import 
ComputeNormalsAtPoints -import copy -from tqdm import tqdm - -import os, shutil -import time - - - -plaid_location = # path to update -prepared_data_dir= # path to update - - - -start = time.time() - -def compute_signed_distance(mesh,eval_points): - """Function to compute the signed distance from the border of the mesh - - Args: - mesh (Muscat.Mesh): mesh which needs the singed distance - eval_points (np.array): Points where to compute the signed distance. - - Returns: - np.array: returns the signed distance of the mesh at th eeval points - """ - ComputeSkin(mesh,inPlace=True) - space, numberings,_,_ = PrepareFEComputation(mesh,numberOfComponents=1) - field_mesh = FEField("", mesh=mesh, space=space, numbering=numberings[0]) - opSkin, statusSkin, _ = GetFieldTransferOp(inputField= field_mesh, targetPoints= eval_points, method="Interp/Clamp" , elementFilter= ElementFilter(dimensionality=mesh.GetElementsDimensionality()-1) , verbose=False) - normals = ComputeNormalsAtPoints(mesh) - skinpos = opSkin.dot(mesh.nodes) - normalspos = opSkin.dot(normals) - sign_distance = -1*np.sign(np.sum((eval_points - skinpos)*normalspos,axis=-1)) - distance = np.sqrt(np.sum((eval_points - skinpos)**2,axis=-1)) - return sign_distance*distance - - -datapath=os.path.join(plaid_location, "dataset") -pb_defpath=os.path.join(plaid_location, "problem_definition") - - -in_scalars_names = ["P","p1","p2","p3","p4","p5"] -out_fields_names = ["U1","U2","sig11","sig22","sig12"] -out_scalars_names = ["max_von_mises","max_U2_top",'max_sig22_top'] - - -problem = ProblemDefinition() -problem._load_from_dir_(pb_defpath) - -ids_train = problem.get_split('train_500') -ids_test = problem.get_split('test') - - -size=100 -rec_mesh = Tetrahedrization(CreateConstantRectilinearMesh([size*2+1,size*2+1], [-1,-1], [1/size, 1/size])) -out_nodes = rec_mesh.nodes - - -nSamples = len(ids_train)+len(ids_test) - -for sample_index in tqdm(range(nSamples)): - - sample = Sample.load_from_dir(dir_path = os.path.join(datapath, "samples/sample_{:09d}".format(sample_index))) - - input_mesh = CGNSToMesh(sample.get_mesh(time=0)) - - input_mesh.nodes= (input_mesh.nodes[:,[0,1]]).copy(order='C') - - new_sample=Sample() - tree = MeshToCGNS(rec_mesh) - new_sample.add_tree(tree,time=0) - - if sample_index in ids_train: - - scalar_names = in_scalars_names + out_scalars_names - - space, numberings,_,_ = PrepareFEComputation(input_mesh,numberOfComponents=1) - field_mesh = FEField("", mesh=input_mesh, space=space, numbering=numberings[0]) - efilter = ElementFilter(dimensionality=input_mesh.GetElementsDimensionality()) - op, status, _ = GetFieldTransferOp(inputField= field_mesh, targetPoints= out_nodes, method="Interp/Clamp" , elementFilter=efilter , verbose=False) - - for field_name in out_fields_names: - old_field = sample.get_field( name=field_name) - new_sample.add_field(field_name, op.dot(old_field)) - - elif sample_index in ids_test: - scalar_names = in_scalars_names - else: - raise("unkown sample_index") - - - for scalar_name in scalar_names: - old_scalar= sample.get_scalar( name=scalar_name) - new_sample.add_scalar(scalar_name, old_scalar) - new_sample.add_field("Signed_Distance",compute_signed_distance(copy.deepcopy(input_mesh),rec_mesh.nodes)) - - path = os.path.join(prepared_data_dir,"dataset/samples/sample_{:09d}".format(sample_index)) - if os.path.exists(path) and os.path.isdir(path): - shutil.rmtree(path) - new_sample.save(path) - -print("duration prepare =", time.time()-start) -# 87 seconds diff --git a/benchmarks/FNO/Tensile2d/train_and_predict.py 
b/benchmarks/FNO/Tensile2d/train_and_predict.py deleted file mode 100644 index 22ad1ec4..00000000 --- a/benchmarks/FNO/Tensile2d/train_and_predict.py +++ /dev/null @@ -1,164 +0,0 @@ -from plaid import Dataset -from plaid import ProblemDefinition -import numpy as np - -import os, shutil -import time - -import torch -from physicsnemo.models.fno.fno import FNO - -start = time.time() - - -plaid_location = # path to update -prepared_data_dir = # path to update -predicted_data_dir = # path to update - - - -pb_defpath=os.path.join(plaid_location, "problem_definition") - -dataset = Dataset() -dataset._load_from_dir_(os.path.join(prepared_data_dir, "dataset"), verbose=True, processes_number=4) - -problem = ProblemDefinition() -problem._load_from_dir_(pb_defpath) - -ids_train = problem.get_split('train_500') -ids_test = problem.get_split('test') - - -n_train = len(ids_train) -n_test = len(ids_test) - - -in_scalars_names = ["P","p1","p2","p3","p4","p5"] -out_fields_names = ["U1","U2","sig11","sig22","sig12"] -out_scalars_names = ["max_von_mises","max_U2_top",'max_sig22_top'] - -size = 201 - - -# TRAIN - -inputs = np.empty((n_train, len(in_scalars_names)+1, size, size)) -for i, id_sample in enumerate(ids_train): - for in_chan in range(len(in_scalars_names)+1): - inputs[i, in_chan, :, :] = dataset[id_sample].get_field("Signed_Distance").reshape((size, size)) - for k, sn in enumerate(in_scalars_names): - inputs[i, k+1, :, :] = dataset[id_sample].get_scalar(sn) - -outputs = np.empty((n_train, len(out_scalars_names)+len(out_fields_names), size, size)) -for i, id_sample in enumerate(ids_train): - for k, fn in enumerate(out_fields_names): - outputs[i, k, :, :] = dataset[id_sample].get_field(fn).reshape((size, size)) - for k, sn in enumerate(out_scalars_names): - outputs[i, k+len(out_fields_names), :, :] = dataset[id_sample].get_scalar(sn) - - -min_in = inputs.min(axis=(0, 2, 3), keepdims=True) -max_in = inputs.max(axis=(0, 2, 3), keepdims=True) -inputs = (inputs - min_in) / (max_in - min_in) - - -min_out = outputs.min(axis=(0, 2, 3), keepdims=True) -max_out = outputs.max(axis=(0, 2, 3), keepdims=True) -outputs = (outputs - min_out) / (max_out - min_out) - - -import torch -from torch.utils.data import Dataset, TensorDataset - -class GridDataset(Dataset): - def __init__(self, inputs, outputs): - self.inputs = torch.tensor(inputs, dtype=torch.float32) - self.outputs = torch.tensor(outputs, dtype=torch.float32) - - def __len__(self): - return self.inputs.shape[0] - - def __getitem__(self, idx): - return self.inputs[idx], self.outputs[idx] - -from torch.utils.data import DataLoader -dataset__ = GridDataset(inputs, outputs) -loader = DataLoader(dataset__, batch_size=64, shuffle=True) - - -model = FNO( -in_channels=inputs.shape[1], -out_channels=outputs.shape[1], -decoder_layers=4, -decoder_layer_size=64, -dimension=2, -latent_channels=64, -num_fno_layers=4, -padding=0, -).cuda() - -optimizer = torch.optim.Adam(model.parameters(), lr=1e-3) -loss_fn = torch.nn.MSELoss() - -n_epoch = 2000 -for epoch in range(n_epoch): - model.train() - total_loss = 0.0 - for x, y in loader: - x, y = x.cuda(), y.cuda() - pred = model(x) - loss = loss_fn(pred, y) - optimizer.zero_grad() - loss.backward() - optimizer.step() - total_loss += loss.item() - print(f"Epoch {epoch+1}, Loss: {total_loss/len(loader)}") - - - -# TEST - -inputs = np.empty((n_test, len(in_scalars_names)+1, size, size)) - -for i, id_sample in enumerate(ids_test): - for in_chan in range(len(in_scalars_names)+1): - inputs[i, in_chan, :, :] = 
dataset[id_sample].get_field("Signed_Distance").reshape((size, size)) - for k, sn in enumerate(in_scalars_names): - inputs[i, k+1, :, :] = dataset[id_sample].get_scalar(sn) - -inputs = (inputs - min_in) / (max_in - min_in) - - - -test_dataset = TensorDataset(torch.tensor(inputs, dtype=torch.float32)) -test_loader = DataLoader(test_dataset, batch_size=1, shuffle=False) - -model.eval() -y_preds = [] - -with torch.no_grad(): - for batch in test_loader: - x_batch = batch[0].cuda() - y_batch_pred = model(x_batch).cpu() - y_preds.append(y_batch_pred) - -y_pred = torch.cat(y_preds, dim=0).numpy() - -outputs_pred = y_pred * (max_out - min_out) + min_out - - - -for i, id_sample in enumerate(ids_test): - for k, fn in enumerate(out_fields_names): - dataset[id_sample].add_field(fn, outputs_pred[i, k, :, :].flatten()) - for k, sn in enumerate(out_scalars_names): - dataset[id_sample].add_scalar(sn, np.mean(outputs_pred[i, k+len(out_fields_names), :, :].flatten())) - - -if os.path.exists(predicted_data_dir) and os.path.isdir(predicted_data_dir): - shutil.rmtree(predicted_data_dir) -dataset[ids_test]._save_to_dir_(predicted_data_dir) - - -print("duration train =", time.time()-start) -# GPUA30, 9980 seconds diff --git a/benchmarks/FNO/VKI-LS59/README.md b/benchmarks/FNO/VKI-LS59/README.md deleted file mode 100644 index ba7dcc9d..00000000 --- a/benchmarks/FNO/VKI-LS59/README.md +++ /dev/null @@ -1,7 +0,0 @@ -To run this benchmark, you need to download and untar the dataset from Zenodo, then edit in the files "prepare_[dataset].py", "train.py" and "construct_prediction.py" the following variables at the top of the files: - -- `plaid_location`: the location where the plaid dataset has been untarred -- `prepared_data_dir`: temp folder used by the scripts, for the dataset projected onto a regular grid -- `predicted_data_dir`: temp folder used by the scripts, for the prediction of the test set onto the regular grid - -Run in the order: "prepare_[dataset].py", "train.py" and "construct_prediction.py" to generate the prediction on the testing set. 
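The VKI-LS59 scripts below read node fields from a specific CGNS base of each sample. As a reminder of the call pattern they rely on, here is a short sketch using the plaid 0.1.6 API these scripts target; the sample path is a hypothetical placeholder:

```python
from plaid import Sample

# Hypothetical sample directory inside an untarred VKI-LS59 plaid dataset.
sample = Sample.load_from_dir(dir_path="dataset/samples/sample_000000000")

# In this dataset the 2D fields are stored under the "Base_2_2" CGNS base.
mach = sample.get_field("mach", base_name="Base_2_2")
sdf = sample.get_field("sdf", base_name="Base_2_2")
```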
\ No newline at end of file diff --git a/benchmarks/FNO/VKI-LS59/construct_prediction.py b/benchmarks/FNO/VKI-LS59/construct_prediction.py deleted file mode 100644 index 24e2528a..00000000 --- a/benchmarks/FNO/VKI-LS59/construct_prediction.py +++ /dev/null @@ -1,74 +0,0 @@ -from plaid import Dataset -from plaid import Sample -from plaid import ProblemDefinition -from Muscat.Bridges.CGNSBridge import CGNSToMesh -import numpy as np -from Muscat.Containers.Filters.FilterObjects import ElementFilter -from Muscat.Containers.MeshFieldOperations import GetFieldTransferOp -from Muscat.FE.Fields.FEField import FEField -from Muscat.Bridges.CGNSBridge import CGNSToMesh -from Muscat.FE.FETools import PrepareFEComputation -from tqdm import tqdm - -import os, pickle, time, shutil, copy - -start = time.time() - - -plaid_location = # path to update -predicted_data_dir = # path to update - - - -datapath=os.path.join(plaid_location, "dataset") -pb_defpath=os.path.join(plaid_location, "problem_definition") - - - -dataset_pred = Dataset() -dataset_pred._load_from_dir_(predicted_data_dir, verbose=True, processes_number=4) - -problem = ProblemDefinition() -problem._load_from_dir_(pb_defpath) - -ids_train = problem.get_split('train') -ids_test = problem.get_split('test') - - -dataset = Dataset() -dataset._load_from_dir_(datapath, ids=ids_test, verbose=True, processes_number=4) - - -n_train = len(ids_train) -n_test = len(ids_test) - - -out_fields_names = ['mach', 'nut'] -out_scalars_names = ['Q', 'power', 'Pr', 'Tr', 'eth_is', 'angle_out'] -nbe_features = len(out_fields_names) + len(out_scalars_names) - -rec_mesh = CGNSToMesh(dataset_pred[ids_test[0]].get_mesh()) - - -prediction = [] - -count = 0 -for sample_index in tqdm(ids_test): - - sample_pred = dataset_pred[sample_index] - sample = dataset[sample_index] - - prediction.append({}) - for fn in out_fields_names: - prediction[count][fn] = sample_pred.get_field(fn, base_name="Base_2_2") - for sn in out_scalars_names: - prediction[count][sn] = sample_pred.get_scalar(sn) - - count += 1 - - -with open('prediction_vki.pkl', 'wb') as file: - pickle.dump(prediction, file) - -print("duration construct predictions =", time.time()-start) -# 3.5 seconds diff --git a/benchmarks/FNO/VKI-LS59/prepare_vki.py b/benchmarks/FNO/VKI-LS59/prepare_vki.py deleted file mode 100644 index d6f722e4..00000000 --- a/benchmarks/FNO/VKI-LS59/prepare_vki.py +++ /dev/null @@ -1,74 +0,0 @@ -from plaid import ProblemDefinition -from plaid import Sample -from Muscat.Bridges.CGNSBridge import MeshToCGNS -from Muscat.Containers.ConstantRectilinearMeshTools import CreateConstantRectilinearMesh -from Muscat.Containers.MeshTetrahedrization import Tetrahedrization -import os, time, shutil -from tqdm import tqdm - - -start = time.time() - -plaid_location = # path to update -prepared_data_dir = # path to update - - - - -datapath=os.path.join(plaid_location, "dataset") -pb_defpath=os.path.join(plaid_location, "problem_definition") - - -problem = ProblemDefinition() -problem._load_from_dir_(pb_defpath) - -ids_train = problem.get_split('train') -ids_test = problem.get_split('test') - - -in_scalars_names = ['angle_in', 'mach_out'] -out_fields_names = ['mach', 'nut'] -out_scalars_names = ['Q', 'power', 'Pr', 'Tr', 'eth_is', 'angle_out'] - - -nx = 301 -ny = 121 - - - -rec_mesh = Tetrahedrization(CreateConstantRectilinearMesh([nx,ny], [0,0], [1/(nx-1), 1/(ny-1)])) -out_nodes = rec_mesh.nodes - - -nSamples = len(ids_train)+len(ids_test) - -for sample_index in tqdm(range(nSamples)): - - sample = 
Sample.load_from_dir(dir_path = os.path.join(datapath, "samples/sample_{:09d}".format(sample_index))) - - tree = MeshToCGNS(rec_mesh) - - new_sample = Sample() - new_sample.add_tree(tree) - - if sample_index in ids_train: - scalar_names = in_scalars_names + out_scalars_names - for fn in out_fields_names: - new_sample.add_field(fn, sample.get_field(fn, base_name="Base_2_2")) - elif sample_index in ids_test: - scalar_names = in_scalars_names - else: - raise("unkown sample_index") - - for sn in scalar_names: - new_sample.add_scalar(sn, sample.get_scalar(sn)) - - new_sample.add_field("Signed_Distance", sample.get_field("sdf", base_name="Base_2_2")) - - path = os.path.join(prepared_data_dir,"dataset/samples/sample_{:09d}".format(sample_index)) - if os.path.exists(path) and os.path.isdir(path): - shutil.rmtree(path) - new_sample.save(path) - -print("duration prepare =", time.time()-start) -# 47 seconds diff --git a/benchmarks/FNO/VKI-LS59/train_and_predict.py b/benchmarks/FNO/VKI-LS59/train_and_predict.py deleted file mode 100644 index da5723ea..00000000 --- a/benchmarks/FNO/VKI-LS59/train_and_predict.py +++ /dev/null @@ -1,157 +0,0 @@ -from plaid import Dataset -from plaid import ProblemDefinition -import numpy as np - -import os, shutil -import time - -import torch -from physicsnemo.models.fno.fno import FNO - -start = time.time() - - -plaid_location = # path to update -prepared_data_dir = # path to update -predicted_data_dir= # path to update - - - -pb_defpath=os.path.join(plaid_location, "problem_definition") - -dataset = Dataset() -dataset._load_from_dir_(os.path.join(prepared_data_dir, "dataset"), verbose=True, processes_number=4) - -problem = ProblemDefinition() -problem._load_from_dir_(pb_defpath) - -ids_train = problem.get_split('train') -ids_test = problem.get_split('test') - - -n_train = len(ids_train) -n_test = len(ids_test) - - -in_scalars_names = ['angle_in', 'mach_out'] -out_fields_names = ['mach', 'nut'] -out_scalars_names = ['Q', 'power', 'Pr', 'Tr', 'eth_is', 'angle_out'] - - -size1 = 301 -size2 = 121 - - -# TRAIN - -inputs = np.empty((n_train, len(in_scalars_names)+1, size1, size2)) -for i, id_sample in enumerate(ids_train): - for in_chan in range(len(in_scalars_names)+1): - inputs[i, in_chan, :, :] = dataset[id_sample].get_field("Signed_Distance", base_name="Base_2_2").reshape((size1, size2)) - for k, sn in enumerate(in_scalars_names): - inputs[i, k+1, :, :] = dataset[id_sample].get_scalar(sn) - -outputs = np.empty((n_train, len(out_scalars_names)+len(out_fields_names), size1, size2)) -for i, id_sample in enumerate(ids_train): - for k, fn in enumerate(out_fields_names): - outputs[i, k, :, :] = dataset[id_sample].get_field(fn, base_name="Base_2_2").reshape((size1, size2)) - for k, sn in enumerate(out_scalars_names): - outputs[i, k+len(out_fields_names), :, :] = dataset[id_sample].get_scalar(sn) - - -min_in = inputs.min(axis=(0, 2, 3), keepdims=True) -max_in = inputs.max(axis=(0, 2, 3), keepdims=True) -inputs = (inputs - min_in) / (max_in - min_in) - - -min_out = outputs.min(axis=(0, 2, 3), keepdims=True) -max_out = outputs.max(axis=(0, 2, 3), keepdims=True) -outputs = (outputs - min_out) / (max_out - min_out) - - -import torch -from torch.utils.data import Dataset - -class GridDataset(Dataset): - def __init__(self, inputs, outputs): - self.inputs = torch.tensor(inputs, dtype=torch.float32) - self.outputs = torch.tensor(outputs, dtype=torch.float32) - - def __len__(self): - return self.inputs.shape[0] - - def __getitem__(self, idx): - return self.inputs[idx], self.outputs[idx] 
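The GridDataset defined above is essentially a float32 pairing of input and output grids; wrapping two tensors in a TensorDataset gives the same batching behavior. A self-contained sketch with random arrays (the sizes mirror the size1=301, size2=121 grid and channel counts of this script, but are otherwise illustrative):

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

rng = np.random.default_rng(0)
xs = torch.tensor(rng.random((10, 3, 301, 121)), dtype=torch.float32)
ys = torch.tensor(rng.random((10, 8, 301, 121)), dtype=torch.float32)

# Pair (input, output) tensors and batch them, as GridDataset does.
loader = DataLoader(TensorDataset(xs, ys), batch_size=4, shuffle=True)
x_batch, y_batch = next(iter(loader))
print(x_batch.shape, y_batch.shape)  # torch.Size([4, 3, 301, 121]) torch.Size([4, 8, 301, 121])
```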
- -from torch.utils.data import DataLoader -dataset__ = GridDataset(inputs, outputs) -loader = DataLoader(dataset__, batch_size=64, shuffle=True) - - - -model = FNO( -in_channels=inputs.shape[1], -out_channels=outputs.shape[1], -decoder_layers=4, -decoder_layer_size=64, -dimension=2, -latent_channels=64, -num_fno_layers=4, -padding=0, -).cuda() - - -optimizer = torch.optim.Adam(model.parameters(), lr=1e-3) -loss_fn = torch.nn.MSELoss() - -n_epoch = 2000 -for epoch in range(n_epoch): - model.train() - total_loss = 0.0 - for x, y in loader: - x, y = x.cuda(), y.cuda() - pred = model(x) - loss = loss_fn(pred, y) - optimizer.zero_grad() - loss.backward() - optimizer.step() - total_loss += loss.item() - print(f"Epoch {epoch+1}, Loss: {total_loss/len(loader)}") - - - -# TEST - -inputs = np.empty((n_test, len(in_scalars_names)+1, size1, size2)) - -for i, id_sample in enumerate(ids_test): - for in_chan in range(len(in_scalars_names)+1): - inputs[i, in_chan, :, :] = dataset[id_sample].get_field("Signed_Distance", base_name="Base_2_2").reshape((size1, size2)) - for k, sn in enumerate(in_scalars_names): - inputs[i, k+1, :, :] = dataset[id_sample].get_scalar(sn) - -inputs = (inputs - min_in) / (max_in - min_in) - - -model.eval() -with torch.no_grad(): - x_test = torch.tensor(inputs, dtype=torch.float32).cuda() - y_pred = model(x_test).cpu().numpy() - -outputs_pred = y_pred * (max_out - min_out) + min_out - - -for i, id_sample in enumerate(ids_test): - for k, fn in enumerate(out_fields_names): - dataset[id_sample].add_field(fn, outputs_pred[i, k, :, :].flatten()) - for k, sn in enumerate(out_scalars_names): - dataset[id_sample].add_scalar(sn, np.mean(outputs_pred[i, k+len(out_fields_names), :, :].flatten())) - - -if os.path.exists(predicted_data_dir) and os.path.isdir(predicted_data_dir): - shutil.rmtree(predicted_data_dir) -dataset[ids_test]._save_to_dir_(predicted_data_dir) - - -print("duration train =", time.time()-start) -# GPUA30, 9344 seconds diff --git a/benchmarks/MARIO/README.md b/benchmarks/MARIO/README.md deleted file mode 100644 index d5bac93f..00000000 --- a/benchmarks/MARIO/README.md +++ /dev/null @@ -1,7 +0,0 @@ -The codes to run the benchmark is located in the repo [MARIO](https://github.com/giovannicatalani/MARIO). Dedicated folders are: -- [`Tensile2d`](https://github.com/giovannicatalani/MARIO/tree/main/Tensile2d_task) -- [`2D_MultiScHypEl`](https://github.com/giovannicatalani/MARIO/tree/main/2D_MultiScHypEl_task) -- [`2D_ElPlDynamics`](https://github.com/giovannicatalani/MARIO/tree/main/2D_ElastoPlastoDynamics_task) -- [`Rotor37`](https://github.com/giovannicatalani/MARIO/tree/main/Rotor37_task) -- [`2D_profile`](https://github.com/giovannicatalani/MARIO/tree/main/2Dprofile_task) -- [`VKI-LS59`](https://github.com/giovannicatalani/MARIO/tree/main/VKILS59_task) diff --git a/benchmarks/MGN/README.md b/benchmarks/MGN/README.md deleted file mode 100644 index 8e257e9e..00000000 --- a/benchmarks/MGN/README.md +++ /dev/null @@ -1,87 +0,0 @@ -# MeshGraphNet - -This repository contains the implementation of the **MeshGraphNet (MGN)** used for predicting **steady** and **unsteady** flow problems. 
- ---- - -## 📁 Project Structure - -``` -mgn/ -├── configs/ # JSON configuration files for different datasets -├── data.py # Data loading and preprocessing -├── main.py # Entry point for training steady flow problems -├── main_2d_elasto_ddp.py # Entry point for training 2D Elasto-PlastoDynamics -├── model.py # Model architecture -├── train.py # Training logic -├── utils.py # Utilities -├── README.md # Project documentation -``` - ---- - -## 🚀 Launch Instructions - -### ✅ Steady Flow Problems - -Use `main.py` to train MGN on steady-state CFD datasets such as `2d_profile`, `tensile2d`, `rotor37`, `vkils59`, and `2d_multiscale`. - -#### Example: -```bash -# Set configuration and runtime options -CONFIG_PATH="configs/2d_profile.json" -TARGET="Pressure" -RUN_NAME="mgn_Pressure" -SAVE_PATH="output" - -# Launch training -python main.py \ - --config "$CONFIG_PATH" \ - --target_field "$TARGET" \ - --run_name "$RUN_NAME" \ - --save_path "$SAVE_PATH" -``` - -The `--target_field` argument defines the specific field (e.g., `U1`, `Pressure`, etc.) you want the model to predict. By default, the model will train on **all available fields** if this argument is omitted. - ---- - -### 🔄 Unsteady Case: 2D Elasto-PlastoDynamics - -Use `main_2d_elasto_ddp.py` to train on the 2D Elasto-PlastoDynamics problem using **Distributed Data Parallel (DDP)**. - -#### Example: -```bash -srun python main_2d_elasto_ddp.py \ - --data_dir "$DATA_DIR" \ - --problem_dir "$PROBLEM_DIR" \ - --batch_size 16 \ - --epochs 100 -``` - -Note: This script is compatible with Slurm-based HPC environments. If you're not using Slurm, replace `srun` with `python` as appropriate for your local setup. - ---- - -## 🧩 Configuration Files - -Located in the `configs/` folder, each `.json` file defines: -- Dataset paths -- Model hyperparameters -- Training/validation parameters - -Examples: -- `2d_profile.json` -- `2d_multiscale.json` -- `tensile2d.json` -- `vkils59.json` -- `rotor37.json` - -## ⚙️ Dependencies - -Install the required Python libraries before running the scripts: -- torch=2.6.0 -- dgl=2.4.0+cu124 -- muscat=2.4.1 -- plaid=0.1.6 -- scikit-learn=1.6.1 diff --git a/benchmarks/MGN/configs.py b/benchmarks/MGN/configs.py deleted file mode 100644 index f00a704c..00000000 --- a/benchmarks/MGN/configs.py +++ /dev/null @@ -1,62 +0,0 @@ -import argparse -import json - - -def parse_args(): - # Initialize the argument parser - parser = argparse.ArgumentParser() - parser.add_argument("--config", type=str, help="Path to JSON configuration file") - - # Add all the existing arguments - parser.add_argument("--dataset_name", type=str) - parser.add_argument("--dataset_path", type=str) - parser.add_argument("--split_train_name", type=str, default="train") - parser.add_argument("--split_test_name", type=str, default="test") - parser.add_argument("--bandwidth", type=float) - - parser.add_argument( - "--activation", type=str, choices=["relu", "elu", "leaky"], default="relu" - ) - parser.add_argument("--aggregation", type=str, default="sum") - parser.add_argument("--batch_size", type=int, default=1) - parser.add_argument("--num_epochs", type=int, default=1111) - parser.add_argument("--input_dim_nodes", type=int, default=2) - parser.add_argument("--input_dim_edges", type=int, default=1) - parser.add_argument("--output_dim", type=int, default=2) - parser.add_argument("--processor_size", type=int, default=10) - parser.add_argument("--num_layers_node_processor", type=int, default=2) - parser.add_argument("--num_layers_edge_processor", type=int, default=2) - 
parser.add_argument("--hidden_dim_node_encoder", type=int, default=32) - parser.add_argument("--num_layers_node_encoder", type=int, default=2) - parser.add_argument("--hidden_dim_edge_encoder", type=int, default=32) - parser.add_argument("--num_layers_edge_encoder", type=int, default=2) - parser.add_argument("--hidden_dim_node_decoder", type=int, default=32) - parser.add_argument("--num_layers_node_decoder", type=int, default=2) - parser.add_argument("--lr", type=float, default=1e-3) - - parser.add_argument("--error_n_levels", type=int, default=40) - parser.add_argument("--error_k_hop_levels", type=int, default=3) - parser.add_argument("--error_min_points", type=int, default=5) - parser.add_argument("--error_threshold", type=float, default=0.08) - parser.add_argument("--error_check_interval", type=int, default=10) - parser.add_argument("--method_error", type=str, default="relative_error") - - parser.add_argument("--mode", type=str, default="classic", choices=["classic"]) - parser.add_argument("--target_field", type=str, required=True) - parser.add_argument("--run_name", type=str, required=True) - parser.add_argument("--save_path", type=str, required=True) - - # Parse and return the arguments - args = parser.parse_args() - - # Load configurations from JSON file if specified - if args.config: - with open(args.config, "r") as f: - configs = json.load(f)[0] - - # Override defaults with JSON configurations - for arg, value_list in configs.items(): - if hasattr(args, arg): - setattr(args, arg, value_list[0]) - - return args diff --git a/benchmarks/MGN/configs/2d_multiscale.json b/benchmarks/MGN/configs/2d_multiscale.json deleted file mode 100644 index ba1d307a..00000000 --- a/benchmarks/MGN/configs/2d_multiscale.json +++ /dev/null @@ -1,70 +0,0 @@ -[ - { - "dataset_name": [ - "2D_Multiscale" - ], - "dataset_path": [ - "/2D_Multiscale_Hyperelasticity/plaid" - ], - "split_train_name": [ - "DOE_train" - ], - "split_test_name": [ - "DOE_test" - ], - "bandwidth": [ - 0.03692578713441289 - ], - "activation": [ - "leaky" - ], - "aggregation": [ - "sum" - ], - "batch_size": [ - 1 - ], - "hidden_dim_edge_encoder": [ - 32 - ], - "hidden_dim_node_decoder": [ - 32 - ], - "hidden_dim_node_encoder": [ - 32 - ], - "input_dim_edges": [ - 1 - ], - "input_dim_nodes": [ - 15 - ], - "lr": [ - 0.001 - ], - "num_epochs": [ - 1000 - ], - "num_layers_edge_encoder": [ - 2 - ], - "num_layers_edge_processor": [ - 2 - ], - "num_layers_node_decoder": [ - 2 - ], - "num_layers_node_encoder": [ - 2 - ], - "num_layers_node_processor": [ - 2 - ], - "output_dim": [ - 1 - ], - "processor_size": [ - 10 - ] - } -] \ No newline at end of file diff --git a/benchmarks/MGN/configs/2d_profile.json b/benchmarks/MGN/configs/2d_profile.json deleted file mode 100644 index d6bdd968..00000000 --- a/benchmarks/MGN/configs/2d_profile.json +++ /dev/null @@ -1,70 +0,0 @@ -[ - { - "dataset_name": [ - "2D_Profile" - ], - "dataset_path": [ - "/2D_Profile/plaid" - ], - "split_train_name": [ - "train" - ], - "split_test_name": [ - "test" - ], - "bandwidth": [ - 0.5670999324654937 - ], - "activation": [ - "leaky" - ], - "aggregation": [ - "sum" - ], - "batch_size": [ - 1 - ], - "hidden_dim_edge_encoder": [ - 128 - ], - "hidden_dim_node_decoder": [ - 128 - ], - "hidden_dim_node_encoder": [ - 128 - ], - "input_dim_edges": [ - 1 - ], - "input_dim_nodes": [ - 12 - ], - "lr": [ - 0.001 - ], - "num_epochs": [ - 1000 - ], - "num_layers_edge_encoder": [ - 2 - ], - "num_layers_edge_processor": [ - 2 - ], - "num_layers_node_decoder": [ - 2 - ], - 
"num_layers_node_encoder": [ - 2 - ], - "num_layers_node_processor": [ - 2 - ], - "output_dim": [ - 1 - ], - "processor_size": [ - 10 - ] - } -] \ No newline at end of file diff --git a/benchmarks/MGN/configs/rotor37.json b/benchmarks/MGN/configs/rotor37.json deleted file mode 100644 index 1b04a1a9..00000000 --- a/benchmarks/MGN/configs/rotor37.json +++ /dev/null @@ -1,70 +0,0 @@ -[ - { - "dataset_name": [ - "Rotor37" - ], - "dataset_path": [ - "/Rotor37/plaid" - ], - "split_train_name": [ - "train_1000" - ], - "split_test_name": [ - "test" - ], - "bandwidth": [ - 0.0024524890638974715 - ], - "activation": [ - "leaky" - ], - "aggregation": [ - "sum" - ], - "batch_size": [ - 1 - ], - "hidden_dim_edge_encoder": [ - 64 - ], - "hidden_dim_node_decoder": [ - 64 - ], - "hidden_dim_node_encoder": [ - 64 - ], - "input_dim_edges": [ - 1 - ], - "input_dim_nodes": [ - 8 - ], - "lr": [ - 0.001 - ], - "num_epochs": [ - 1000 - ], - "num_layers_edge_encoder": [ - 2 - ], - "num_layers_edge_processor": [ - 2 - ], - "num_layers_node_decoder": [ - 2 - ], - "num_layers_node_encoder": [ - 2 - ], - "num_layers_node_processor": [ - 2 - ], - "output_dim": [ - 3 - ], - "processor_size": [ - 10 - ] - } -] \ No newline at end of file diff --git a/benchmarks/MGN/configs/tensile2d.json b/benchmarks/MGN/configs/tensile2d.json deleted file mode 100644 index bf5c6864..00000000 --- a/benchmarks/MGN/configs/tensile2d.json +++ /dev/null @@ -1,70 +0,0 @@ -[ - { - "dataset_name": [ - "Tensile2D" - ], - "dataset_path": [ - "/Tensile2d/plaid" - ], - "split_train_name": [ - "train_500" - ], - "split_test_name": [ - "test" - ], - "bandwidth": [ - 0.02976729880552192 - ], - "activation": [ - "leaky" - ], - "aggregation": [ - "sum" - ], - "batch_size": [ - 1 - ], - "hidden_dim_edge_encoder": [ - 16 - ], - "hidden_dim_node_decoder": [ - 16 - ], - "hidden_dim_node_encoder": [ - 16 - ], - "input_dim_edges": [ - 1 - ], - "input_dim_nodes": [ - 18 - ], - "lr": [ - 0.001 - ], - "num_epochs": [ - 1000 - ], - "num_layers_edge_encoder": [ - 2 - ], - "num_layers_edge_processor": [ - 2 - ], - "num_layers_node_decoder": [ - 2 - ], - "num_layers_node_encoder": [ - 2 - ], - "num_layers_node_processor": [ - 2 - ], - "output_dim": [ - 5 - ], - "processor_size": [ - 10 - ] - } -] \ No newline at end of file diff --git a/benchmarks/MGN/configs/vkils59.json b/benchmarks/MGN/configs/vkils59.json deleted file mode 100644 index cc8c52e5..00000000 --- a/benchmarks/MGN/configs/vkils59.json +++ /dev/null @@ -1,70 +0,0 @@ -[ - { - "dataset_name": [ - "VKI_LS59" - ], - "dataset_path": [ - "/VKILS59/plaid" - ], - "split_train_name": [ - "train" - ], - "split_test_name": [ - "test" - ], - "bandwidth": [ - 0.015646763135580142 - ], - "activation": [ - "leaky" - ], - "aggregation": [ - "sum" - ], - "batch_size": [ - 1 - ], - "hidden_dim_edge_encoder": [ - 32 - ], - "hidden_dim_node_decoder": [ - 32 - ], - "hidden_dim_node_encoder": [ - 32 - ], - "input_dim_edges": [ - 1 - ], - "input_dim_nodes": [ - 14 - ], - "lr": [ - 0.001 - ], - "num_epochs": [ - 1000 - ], - "num_layers_edge_encoder": [ - 2 - ], - "num_layers_edge_processor": [ - 2 - ], - "num_layers_node_decoder": [ - 2 - ], - "num_layers_node_encoder": [ - 2 - ], - "num_layers_node_processor": [ - 2 - ], - "output_dim": [ - 1 - ], - "processor_size": [ - 10 - ] - } -] \ No newline at end of file diff --git a/benchmarks/MGN/data.py b/benchmarks/MGN/data.py deleted file mode 100644 index 37c864d4..00000000 --- a/benchmarks/MGN/data.py +++ /dev/null @@ -1,421 +0,0 @@ -import csv - -import dgl -import 
Muscat.Containers.ElementsDescription as ED -import numpy as np -import torch -from dgl.data import DGLDataset -from Muscat.Bridges.CGNSBridge import CGNSToMesh -from Muscat.Containers import MeshModificationTools as MMT -from Muscat.Containers.Filters import FilterObjects as FO -from Muscat.Containers.MeshFieldOperations import GetFieldTransferOp -from Muscat.FE.FETools import PrepareFEComputation -from Muscat.FE.Fields.FEField import FEField -from rich.progress import track -from sklearn.preprocessing import StandardScaler -from torch.nn.functional import one_hot - -from plaid import Sample - - -def tri_cells_to_edges(cells): - edges = torch.cat([cells[:, :2], cells[:, 1:], cells[:, ::2]], dim=0) - receivers, _ = torch.min(edges, dim=-1) - senders, _ = torch.max(edges, dim=-1) - - packed_edges = torch.stack([senders, receivers], dim=-1).int() - unique_edges = torch.unique(packed_edges, dim=0) - unique_edges = torch.cat([unique_edges, torch.flip(unique_edges, dims=[-1])], dim=0) - return unique_edges - - -def quad_cells_to_edges(cells): - edges = torch.cat( - [ - cells[:, 0:2], - cells[:, 1:3], - cells[:, 2:4], - torch.stack((cells[:, 3], cells[:, 0]), dim=-1), - ], - dim=0, - ) - receivers, _ = torch.min(edges, dim=-1) - senders, _ = torch.max(edges, dim=-1) - - packed_edges = torch.stack([senders, receivers], dim=-1).int() - unique_edges = torch.unique(packed_edges, dim=0) - unique_edges = torch.cat([unique_edges, torch.flip(unique_edges, dims=[-1])], dim=0) - return unique_edges - - -def distance_field(mesh, nTag=None): - MMT.ComputeSkin(mesh, md=None, inPlace=True, skinTagName="Skin") - dim = int(mesh.GetElementsDimensionality()) - Tspace, Tnumberings, _, _ = PrepareFEComputation(mesh, numberOfComponents=1) - field_mesh = FEField("", mesh=mesh, space=Tspace, numbering=Tnumberings[0]) - opSkin, _, _ = GetFieldTransferOp( - inputField=field_mesh, - targetPoints=mesh.nodes, - method="Interp/Clamp", - elementFilter=FO.ElementFilter(dimensionality=dim - 1, nTag=nTag), - verbose=False, - ) - skinpos = opSkin.dot(mesh.nodes) - distance = np.sqrt(np.sum((skinpos - mesh.nodes) ** 2, axis=1)) - return distance - - -def get_data(mesh, dataset_name=None, load_fields=True): - # Automatically retrieve node fields - if load_fields: - node_fields = {k: torch.tensor(v) for k, v in mesh.nodeFields.items()} - else: - node_fields = {} - - # Automatically retrieve nodetags - tag_label_map = { - "Airfoil": 2, - "Holes": 2, - "Top": 2, - "Inlet": 4, - "Bottom": 4, - "Inflow": 4, - "Ext_bound": 6, - "Outflow": 6, - "Intrado": 1, - "Extrado": 3, - "Periodic_1": 5, - "Periodic_2": 7, - } - - labels = np.zeros(mesh.GetNumberOfNodes(), dtype=int) - for tag_name in mesh.nodesTags: - tag_name_str = str(tag_name).split("(")[0].strip() - tag_ids = mesh.GetNodalTag(tag_name_str).GetIds() - label_value = tag_label_map.get(tag_name_str, 0) - labels[tag_ids] = label_value - node_type = torch.tensor(labels) - - # Calculate distance field - nTag = { - "2D_Profile": "Airfoil", - "2D_Multiscale": "Holes", - # "Tensile2D": "Top", - "AirfRANS": "Airfoil", - }.get(dataset_name, None) - - if dataset_name == "VKI_LS59": - distance = node_fields["sdf"] - else: - dst = distance_field(mesh, nTag=nTag) - distance = torch.tensor(dst) - - # Select element type and compute edges - element_type = ED.Quadrangle_4 if dataset_name == "VKI_LS59" else ED.Triangle_3 - cells = torch.tensor(mesh.elements[element_type].connectivity) - edge_function = ( - quad_cells_to_edges if dataset_name == "VKI_LS59" else tri_cells_to_edges - ) - edges = 
edge_function(cells) - - mesh_pos = torch.tensor(mesh.nodes) - - # Structure return values into distinct parts - node_fields_dict = node_fields - node_features_dict = { - "distance": distance, - "node_type": node_type, - "mesh_pos": mesh_pos, - } - - return node_fields_dict, node_features_dict, edges - - -def read_indices_from_csv(dataset_path, split_train_name=None, split_test_name=None): - train_indices = [] - test_indices = [] - with open(f"{dataset_path}/problem_definition/split.csv", mode="r") as file: - csv_reader = csv.reader(file) - for row in csv_reader: - if row[0] == split_train_name: - train_indices = list(map(int, row[1:])) - elif row[0] == split_test_name: - test_indices = list(map(int, row[1:])) - return train_indices, test_indices - - -def load_datasets( - dataset_name, - dataset_path, - split_train_name=None, - split_test_name=None, - target_field="all_fields", -): - # Read indices from CSV - train_indices, test_indices = read_indices_from_csv( - dataset_path, split_train_name, split_test_name - ) - - # Scalar input and output definitions - scalar_input_dict = { - "VKI_LS59": ["angle_in", "mach_out"], - "Tensile2D": ["P", "p1", "p2", "p3", "p4", "p5"], - "2D_Multiscale": ["C11", "C12", "C22"], - "Rotor37": ["Omega", "P"], - } - - scalar_output_dict = { - "VKI_LS59": ["Q", "power", "Pr", "Tr", "eth_is", "angle_out"], - "Tensile2D": ["max_von_mises", "max_U2_top", "max_sig22_top"], - "2D_Multiscale": ["effective_energy"], - "Rotor37": ["Massflow", "Compression_ratio", "Efficiency"], - } - - # Fields expected to be retrieved - field_names_dict = { - "VKI_LS59": ["mach", "nut"], - "Tensile2D": ["U1", "U2", "sig11", "sig22", "sig12"], - "2D_Multiscale": ["u1", "u2", "P11", "P12", "P22", "P21", "psi"], - "Rotor37": ["Density", "Pressure", "Temperature"], - "2D_Profile": ["Mach", "Pressure", "Velocity-x", "Velocity-y"], - } - - def process_samples(dataset_name, dataset_path, indices, field_names, process_type): - X_nodes, X_edges, X_node_tags, X_scalars, X_distances = [], [], [], [], [] - Y_fields, Y_scalars = [], [] - - description = f"✅ Processing {process_type} samples" - - for i in track( - range(len(indices)), total=len(indices), description=description - ): - id_sample = indices[i] - sample_path = f"{dataset_path}/dataset/samples/sample_{id_sample:09}/" - mesh_data = Sample.load_from_dir(sample_path) - tree = mesh_data.get_mesh() - - if dataset_name == "VKI_LS59": - mesh = CGNSToMesh(tree, baseNames=["Base_2_2"]) - else: - mesh = CGNSToMesh(tree) - - # Get data from mesh - load_fields = process_type == "train" - node_fields, node_features, edges = get_data( - mesh, dataset_name=dataset_name, load_fields=load_fields - ) - - nodes = node_features["mesh_pos"] - node_tags = node_features["node_type"] - distances = node_features["distance"] - - # Determine which fields to retrieve - if target_field == "all_fields": - effective_field_names = field_names - elif target_field and target_field in field_names: - effective_field_names = [target_field] - else: - effective_field_names = [] - - # Handle case where no valid target field is specified - if not effective_field_names: - raise ValueError( - "No valid field selected for processing. Please specify a valid target field or use 'all_fields'." 
- ) - - if process_type == "train": - fields = [node_fields[fn] for fn in effective_field_names] - fields = torch.column_stack(fields) - Y_fields += [fields] - - # Retrieve input scalars - in_scalars_names = scalar_input_dict.get(dataset_name, []) - X_scalars.append( - [mesh_data.get_scalar(fn) for fn in in_scalars_names] - if in_scalars_names - else [] - ) - - # Retrieve output scalars - out_scalars_names = scalar_output_dict.get(dataset_name, []) - Y_scalars.append( - [mesh_data.get_scalar(fn) for fn in out_scalars_names] - if out_scalars_names - else [] - ) - - X_nodes += [nodes] - X_edges += [edges] - X_node_tags += [node_tags] - X_distances += [distances] - - X_scalars = np.array(X_scalars) - Y_scalars = np.array(Y_scalars) - - # Processed data - data = { - "X_nodes": X_nodes, - "X_edges": X_edges, - "X_node_tags": X_node_tags, - "X_distances": X_distances, - "X_scalars": X_scalars, - "Y_fields": Y_fields, - "Y_scalars": Y_scalars, - } - - return data - - # Get the field names specific to the dataset - field_names = field_names_dict.get(dataset_name, []) - - # Process train and test samples - train_data = process_samples( - dataset_name, dataset_path, train_indices, field_names, "train" - ) - test_data = process_samples( - dataset_name, dataset_path, test_indices, field_names, "test" - ) - - return train_data, test_data, train_indices, test_indices - - -class GraphDataset(DGLDataset): - def __init__( - self, - args, - data, - data_type, - in_scaler=None, - out_scaler=None, - fields_min=None, - fields_max=None, - ): - super().__init__(name="graph_dataset") - - self.data = data - self.in_scaler = in_scaler - self.out_scaler = out_scaler - self.fields_min = fields_min - self.fields_max = fields_max - - self.num_samples = len(data["X_nodes"]) - self.graphs = [] - - if data_type == "train": - self.in_scaler = StandardScaler() - self.out_scaler = StandardScaler() - - self.input_globals = ( - torch.tensor( - self.in_scaler.fit_transform(data["X_scalars"]), dtype=torch.float32 - ) - if data["X_scalars"].size > 0 - else torch.tensor([]) - ) - self.output_globals = ( - torch.tensor( - self.out_scaler.fit_transform(data["Y_scalars"]), - dtype=torch.float32, - ) - if data["Y_scalars"].size > 0 - else torch.tensor([]) - ) - - self.fields_min = torch.min( - torch.cat(data["Y_fields"], dim=0), dim=0 - ).values - self.fields_max = torch.max( - torch.cat(data["Y_fields"], dim=0), dim=0 - ).values - - elif data_type == "test": - assert all( - v is not None for v in [in_scaler, out_scaler, fields_min, fields_max] - ) - - self.input_globals = ( - torch.tensor( - self.in_scaler.transform(data["X_scalars"]), dtype=torch.float32 - ) - if data["X_scalars"].size > 0 - else torch.tensor([]) - ) - self.output_globals = ( - torch.tensor( - self.out_scaler.transform(data["Y_scalars"]), dtype=torch.float32 - ) - if data["Y_scalars"].size > 0 - else torch.tensor([]) - ) - - description = f"🚀 Processing {data_type} graphs" - for i in track( - range(self.num_samples), total=self.num_samples, description=description - ): - pos = data["X_nodes"][i].to(torch.float32) - edge_index = data["X_edges"][i].t().long() - - tags = one_hot(data["X_node_tags"][i], num_classes=9).to(torch.float32) - dis = data["X_distances"][i].unsqueeze(1).to(torch.float32) - - num_sca = ( - self.input_globals[i].shape[-1] if self.input_globals.size(0) > 0 else 0 - ) - sca = ( - torch.ones((len(pos), num_sca), dtype=torch.float32) - * self.input_globals[i].unsqueeze(0) - if num_sca - else torch.tensor([]) - ) - - src, dst = edge_index[0], 
edge_index[1] - graph = dgl.graph((src, dst)) - - node_features = [pos, tags, dis] - if sca.size(0) > 0: - node_features.append(sca) - - graph.ndata["x"] = torch.cat(node_features, dim=1) - - if data_type == "train": - Y_fields = ( - ( - (data["Y_fields"][i] - self.fields_min) - / (self.fields_max - self.fields_min) - ) - .clone() - .detach() - .to(torch.float32) - ) - graph.ndata["y"] = Y_fields - - graph.ndata["pos"] = pos - - # Calculate squared distances - bandwidth = args.bandwidth * 10 - differences = ( - (data["X_nodes"][i][src] - data["X_nodes"][i][dst]).clone().detach() - ) - sqdists = torch.sum(differences**2, dim=1).unsqueeze(1) - sqdists = torch.exp(-0.5 * sqdists / bandwidth).to(torch.float32) - graph.edata["f"] = sqdists - - self.graphs.append(graph) - - def __getitem__(self, idx): - graph = self.graphs[idx] - input_globals = ( - self.input_globals[idx] - if self.input_globals.numel() > 0 - else torch.tensor([]) - ) - output_globals = ( - self.output_globals[idx] - if self.output_globals.numel() > 0 - else torch.tensor([]) - ) - - return graph, input_globals, output_globals - - def __len__(self): - return len(self.graphs) diff --git a/benchmarks/MGN/main.py b/benchmarks/MGN/main.py deleted file mode 100644 index 14203028..00000000 --- a/benchmarks/MGN/main.py +++ /dev/null @@ -1,19 +0,0 @@ -from configs import parse_args -from data import load_datasets -from model import create_model -from train import train - -if __name__ == "__main__": - args = parse_args() - print(args) - - train_data, test_data, train_indices, test_indices = load_datasets( - args.dataset_name, - args.dataset_path, - args.split_train_name, - args.split_test_name, - args.target_field, - ) - - model, optimizer, loss_fn = create_model(args) - train(args, model, optimizer, loss_fn, train_data, test_data) diff --git a/benchmarks/MGN/main_2d_elasto_ddp.py b/benchmarks/MGN/main_2d_elasto_ddp.py deleted file mode 100644 index a436bb54..00000000 --- a/benchmarks/MGN/main_2d_elasto_ddp.py +++ /dev/null @@ -1,522 +0,0 @@ -import argparse -import os -import pickle -import time - -import dgl -import numpy as np -import torch -import torch.distributed as dist -from dgl.dataloading import GraphDataLoader -from torch.utils.data import Dataset -from torch.utils.data.distributed import DistributedSampler - -from plaid import Dataset as PlaidDataset -from plaid import ProblemDefinition - -from data import * -from utils import * -from model import MeshGraphNet - - -class ElastoDataset(Dataset): - def __init__( - self, plaid_dataset, plaid_problem, split="train", fields=("U_x", "U_y") - ): - self.plaid_ds = plaid_dataset - self.ids = plaid_problem.get_split(split) - sample0 = self.plaid_ds[self.ids[0]] - self.time_steps = sample0.get_all_mesh_times() - self.n_steps = len(self.time_steps) - 1 - self.fields = fields - - # Precompute mesh-level features - self.mesh_list = [] - for sid in self.ids: - sample = self.plaid_ds[sid] - pos = torch.tensor(sample.get_nodes(), dtype=torch.float32) - cells = sample.get_elements()["TRI_3"] - edge_index = ( - tri_cells_to_edges(torch.tensor(cells, dtype=torch.long)) - .t() - .contiguous() - ) - - _, sdf = get_distances_to_borders(sample.get_nodes(), cells) - sdf = torch.tensor(sdf, dtype=torch.float32) - sdf_sine = sinusoidal_embedding( - sdf, num_basis=4, max_coord=4, spacing=0.001 - ) - angles = angles_to_planes(pos) - sph = torch.cat( - [spherical_harmonics(angles[:, i], l_max=4)[:, 1:] for i in range(4)], - dim=1, - ) - - src, dst = edge_index - disp = pos[src] - pos[dst] - sqd = 
torch.exp( - -0.5 * (disp**2).sum(1, keepdim=True) / (5.757861066563731 * 10) - ) - edge_attr = torch.cat([sqd, disp], dim=-1) - - self.mesh_list.append( - { - "pos": pos, - "edge_index": edge_index, - "edge_attr": edge_attr, - "sdf": sdf, - "sdf_sine": sdf_sine, - "sph": sph, - } - ) - - def __len__(self): - return len(self.ids) * self.n_steps - - def __getitem__(self, idx): - sim_idx = idx // self.n_steps - t = idx % self.n_steps - sid = self.ids[sim_idx] - sample = self.plaid_ds[sid] - mesh = self.mesh_list[sim_idx] - - # Velocities at t and t+1 - ux_t = torch.tensor( - sample.get_field(self.fields[0], time=self.time_steps[t]), - dtype=torch.float32, - ) - uy_t = torch.tensor( - sample.get_field(self.fields[1], time=self.time_steps[t]), - dtype=torch.float32, - ) - ux_tp = torch.tensor( - sample.get_field(self.fields[0], time=self.time_steps[t + 1]), - dtype=torch.float32, - ) - uy_tp = torch.tensor( - sample.get_field(self.fields[1], time=self.time_steps[t + 1]), - dtype=torch.float32, - ) - - u_t = torch.stack([ux_t, uy_t], dim=-1) - u_tp = torch.stack([ux_tp, uy_tp], dim=-1) - - # Input features: [u_t, pos, sdf, sph] - x = torch.cat([u_t, mesh["pos"], mesh["sdf"], mesh["sph"]], dim=-1) - # Target: Δu = u_{t+1}-u_t - y = u_tp - u_t - - # Build DGL graph - g = dgl.graph((mesh["edge_index"][0], mesh["edge_index"][1])) - g.ndata["x"] = x - g.ndata["y"] = y - g.edata["f"] = mesh["edge_attr"] - return g - - -def compute_minmax_scaler(train_ds, args, device): - loader = GraphDataLoader( - train_ds, - batch_size=args.batch_size, - shuffle=False, - num_workers=args.num_workers, - pin_memory=True, - ) - xs, ys = [], [] - for batch in loader: - xs.append(batch.ndata["x"][:, :2]) - ys.append(batch.ndata["y"]) - all_x = torch.cat(xs, dim=0) - all_y = torch.cat(ys, dim=0) - return { - "type": "minmax", - "min_x": all_x.min(0)[0].to(device), - "max_x": all_x.max(0)[0].to(device), - "min_y": all_y.min(0)[0].to(device), - "max_y": all_y.max(0)[0].to(device), - } - - -def compute_standard_scaler(train_ds, args, device): - loader = GraphDataLoader( - train_ds, - batch_size=args.batch_size, - shuffle=False, - num_workers=args.num_workers, - pin_memory=True, - ) - xs, ys = [], [] - for batch in loader: - xs.append(batch.ndata["x"][:, :2]) - ys.append(batch.ndata["y"]) - all_x = torch.cat(xs, dim=0) - all_y = torch.cat(ys, dim=0) - return { - "type": "standard", - "mean_x": all_x.mean(0).to(device), - "std_x": all_x.std(0).to(device), - "mean_y": all_y.mean(0).to(device), - "std_y": all_y.std(0).to(device), - } - - -def save_checkpoint(path, model, optimizer, epoch, rank): - if rank == 0: - state = { - "epoch": epoch, - "model_state": model.module.state_dict() - if isinstance(model, torch.nn.parallel.DistributedDataParallel) - else model.state_dict(), - "optim_state": optimizer.state_dict(), - } - torch.save(state, path) - print(f"→ [Rank {rank}] Saved checkpoint to {path}") - - -def load_checkpoint(path, model, optimizer=None, device="cpu"): - if not os.path.isfile(path): - return 0 - ckpt = torch.load(path, map_location=device) - model.load_state_dict(ckpt["model_state"]) - if optimizer and "optim_state" in ckpt: - optimizer.load_state_dict(ckpt["optim_state"]) - start_epoch = ckpt.get("epoch", 0) + 1 - print(f"→ Loaded checkpoint '{path}', resume at epoch {start_epoch}") - return start_epoch - - -def inference(model, device, test_ds, scaler, args, rank): - if rank != 0: - return None - print("🔍 Starting inference on test set") - t0 = time.perf_counter() - model.eval() - preds = [] - - with 
torch.no_grad(): - for sim_idx, sid in enumerate(test_ds.ids): - mesh = test_ds.mesh_list[sim_idx] - sample = test_ds.plaid_ds[sid] - - # Init u_pred at t0 - ux0 = torch.tensor(sample.get_field("U_x", time=test_ds.time_steps[0])).to( - device - ) - uy0 = torch.tensor(sample.get_field("U_y", time=test_ds.time_steps[0])).to( - device - ) - u_pred = torch.stack([ux0, uy0], dim=-1) - - pred_dict = {"U_x": [], "U_y": []} - - for i, fn in enumerate(["U_x", "U_y"]): - pred_dict[fn].append(u_pred[:, i].cpu().numpy()) - - for t in range(test_ds.n_steps): - # Prepare x0 - if scaler["type"] == "minmax": - x0 = (u_pred - scaler["min_x"]) / ( - scaler["max_x"] - scaler["min_x"] - ) - elif scaler["type"] == "standard": - x0 = (u_pred - scaler["mean_x"]) / scaler["std_x"] - else: - x0 = u_pred - - x = torch.cat( - [ - x0, - mesh["pos"].to(device), - mesh["sdf"].to(device), - mesh["sph"].to(device), - ], - dim=-1, - ) - - g = dgl.graph((mesh["edge_index"][0], mesh["edge_index"][1])).to(device) - g.ndata["x"], g.edata["f"] = x, mesh["edge_attr"].to(device) - - out = model(g.ndata["x"], g.edata["f"], g) - - if scaler["type"] == "minmax": - dv = out * (scaler["max_y"] - scaler["min_y"]) + scaler["min_y"] - elif scaler["type"] == "standard": - dv = out * scaler["std_y"] + scaler["mean_y"] - else: - dv = out - - u_pred = u_pred + dv - - # Ground truth at t+1 - ux_tp = torch.tensor( - sample.get_field("U_x", time=test_ds.time_steps[t + 1]) - ).to(device) - uy_tp = torch.tensor( - sample.get_field("U_y", time=test_ds.time_steps[t + 1]) - ).to(device) - torch.stack([ux_tp, uy_tp], dim=-1) - - for i, fn in enumerate(["U_x", "U_y"]): - pred_dict[fn].append(u_pred[:, i].cpu().numpy()) - - # Stack along time - for fn in pred_dict: - pred_dict[fn] = np.stack(pred_dict[fn], axis=1).T - preds.append(pred_dict) - - with open(args.submission_path, "wb") as f: - pickle.dump(preds, f) - - t1 = time.perf_counter() - print(f"🎯 Inference done in {t1 - t0:.2f}s — saving to {args.submission_path}") - - -def main_worker(args): - t_start = time.perf_counter() - - # Init DDP - # os.environ.setdefault("MASTER_ADDR", args.master_addr) - # os.environ.setdefault("MASTER_PORT", str(args.master_port)) - rank = int(os.environ.get("SLURM_PROCID")) - world_size = int(os.environ.get("SLURM_NTASKS")) - ngpus_per_node = torch.cuda.device_count() - local_rank = rank % ngpus_per_node - - t0 = time.perf_counter() - dist.init_process_group("nccl", world_size=world_size, rank=rank) - t1 = time.perf_counter() - if rank == 0: - print(f"🚀 [DDP init] {t1 - t0:.2f}s — world_size={world_size}") - - # Set device - torch.cuda.set_device(local_rank) - device = torch.device(f"cuda:{local_rank}") - - # Load data & problem - if rank == 0: - print("📂 Loading Plaid data") - t0 = time.perf_counter() - plaid_ds = PlaidDataset() - plaid_ds._load_from_dir_(args.data_dir, verbose=(rank == 0)) - problem = ProblemDefinition() - problem._load_from_dir_(args.problem_dir) - t1 = time.perf_counter() - if rank == 0: - print( - f"✔️ [Data load] {t1 - t0:.2f}s — train sims={len(problem.get_split('train'))}, test sims={len(problem.get_split('test'))}" - ) - - # Build datasets - if rank == 0: - print("🚀 Building datasets") - t0 = time.perf_counter() - train_ds = ElastoDataset(plaid_ds, problem, split="train") - test_ds = ElastoDataset(plaid_ds, problem, split="test") - t1 = time.perf_counter() - if rank == 0: - print( - f"✔️ [Dataset] {t1 - t0:.2f}s — train size={len(train_ds)}, test size={len(test_ds)}" - ) - - # Infer dims - sample0 = train_ds[0] - dim_x = 
sample0.ndata["x"].shape[1] - dim_f = sample0.edata["f"].shape[1] - dim_y = sample0.ndata["y"].shape[1] - if rank == 0: - print(f"🎯 Dimensions → node_in={dim_x}, edge_in={dim_f}, node_out={dim_y}") - - # Compute scaler - scaler = {"type": "none"} - if args.scaler == "minmax": - if rank == 0: - print("📏 Computing Min-Max scaler") - t0 = time.perf_counter() - tmp = compute_minmax_scaler(train_ds, args, device) - dist.broadcast_object_list([tmp], src=0) - scaler = tmp - t1 = time.perf_counter() - if rank == 0: - print(f"[Scaler minmax] {t1 - t0:.2f}s → {scaler}") - elif args.scaler == "standard": - if rank == 0: - print("📏 Computing Standard scaler") - t0 = time.perf_counter() - tmp = compute_standard_scaler(train_ds, args, device) - dist.broadcast_object_list([tmp], src=0) - scaler = tmp - t1 = time.perf_counter() - if rank == 0: - print(f"✔️ [Scaler std] {t1 - t0:.2f}s → {scaler}") - - # DataLoader + sampler - t0 = time.perf_counter() - train_sampler = DistributedSampler( - train_ds, num_replicas=world_size, rank=rank, shuffle=True - ) - train_loader = GraphDataLoader( - train_ds, - batch_size=args.batch_size, - sampler=train_sampler, - num_workers=args.num_workers, - pin_memory=False, - ) - t1 = time.perf_counter() - if rank == 0: - print(f"[Loader build] {t1 - t0:.2f}s — batches={len(train_loader)}") - - # Model, optimizer, scheduler - if rank == 0: - print("🚀 Building model") - t0 = time.perf_counter() - model = MeshGraphNet( - input_dim_nodes=dim_x, - input_dim_edges=dim_f, - output_dim=dim_y, - processor_size=args.processor_size, - num_layers_node_processor=args.num_layers_node_processor, - num_layers_edge_processor=args.num_layers_edge_processor, - hidden_dim_node_encoder=args.hidden_dim_node_encoder, - num_layers_node_encoder=args.num_layers_node_encoder, - hidden_dim_edge_encoder=args.hidden_dim_edge_encoder, - num_layers_edge_encoder=args.num_layers_edge_encoder, - hidden_dim_node_decoder=args.hidden_dim_node_decoder, - num_layers_node_decoder=args.num_layers_node_decoder, - aggregation=args.aggregation, - activation=args.activation, - ).to(device) - - model = torch.nn.parallel.DistributedDataParallel( - model, device_ids=[local_rank], output_device=local_rank - ) - criterion = torch.nn.MSELoss() - optimizer = torch.optim.Adam(model.parameters(), lr=args.lr) - scheduler = torch.optim.lr_scheduler.LambdaLR( - optimizer, lr_lambda=lambda e: args.lr_decay_rate**e - ) - use_amp = args.amp and device.type.startswith("cuda") - scaler_amp = torch.cuda.amp.GradScaler() if use_amp else None - t1 = time.perf_counter() - if rank == 0: - print(f"✔️ [Model build] {t1 - t0:.2f}s") - - start_epoch = 0 - best_loss = float("inf") - - # Training loop - for epoch in range(start_epoch, args.epochs): - ep0 = time.perf_counter() - model.train() - train_sampler.set_epoch(epoch) - running_loss = 0.0 - - for batch in train_loader: - g = batch.to(device) - - if scaler["type"] == "minmax": - g.ndata["x"][:, :2] = (g.ndata["x"][:, :2] - scaler["min_x"]) / ( - scaler["max_x"] - scaler["min_x"] - ) - g.ndata["y"] = (g.ndata["y"] - scaler["min_y"]) / ( - scaler["max_y"] - scaler["min_y"] - ) - elif scaler["type"] == "standard": - g.ndata["x"][:, :2] = (g.ndata["x"][:, :2] - scaler["mean_x"]) / scaler[ - "std_x" - ] - g.ndata["y"] = (g.ndata["y"] - scaler["mean_y"]) / scaler["std_y"] - - optimizer.zero_grad() - if use_amp: - with torch.cuda.amp.autocast(): - pred = model(g.ndata["x"], g.edata["f"], g) - loss = criterion(pred, g.ndata["y"]) - scaler_amp.scale(loss).backward() - scaler_amp.unscale_(optimizer) - 
torch.nn.utils.clip_grad_norm_(model.parameters(), args.clip_grad_norm) - scaler_amp.step(optimizer) - scaler_amp.update() - else: - pred = model(g.ndata["x"], g.edata["f"], g) - loss = criterion(pred, g.ndata["y"]) - loss.backward() - torch.nn.utils.clip_grad_norm_(model.parameters(), args.clip_grad_norm) - optimizer.step() - - running_loss += loss.item() - - scheduler.step() - - # Reduce & log - loss_tensor = torch.tensor([running_loss, len(train_loader)], device=device) - dist.reduce(loss_tensor, dst=0, op=dist.ReduceOp.SUM) - - if rank == 0: - tot, cnt = loss_tensor.tolist() - avg_loss = tot / (world_size * cnt) - ep1 = time.perf_counter() - print( - f"✅ [Epoch {epoch + 1}/{args.epochs}] loss={avg_loss:.9f} — time={ep1 - ep0:.2f}s" - ) - - os.makedirs(args.ckpt_path, exist_ok=True) - - if (epoch + 1) % args.save_interval == 0: - save_checkpoint( - f"{args.ckpt_path}/epoch_{epoch + 1}.pth", - model, - optimizer, - epoch, - rank, - ) - - if avg_loss < best_loss: - best_loss = avg_loss - save_checkpoint( - f"{args.ckpt_path}/best.pth", model, optimizer, epoch, rank - ) - inference(model, device, test_ds, scaler, args, rank) - - t_end = time.perf_counter() - if rank == 0: - print(f"All done — total time {(t_end - t_start) / 60:.1f} min") - - dist.destroy_process_group() - - -if __name__ == "__main__": - parser = argparse.ArgumentParser("DDP MeshGraphNet on Plaid CFD") - parser.add_argument("--data_dir", type=str, required=True) - parser.add_argument("--problem_dir", type=str, required=True) - parser.add_argument("--batch_size", type=int, default=8) - parser.add_argument("--num_workers", type=int, default=1) - parser.add_argument("--lr", type=float, default=1e-3) - parser.add_argument("--lr_decay_rate", type=float, default=0.99) - parser.add_argument("--epochs", type=int, default=200) - parser.add_argument("--amp", action="store_true") - parser.add_argument( - "--scaler", choices=["none", "minmax", "standard"], default="standard" - ) - parser.add_argument("--clip_grad_norm", type=float, default=1.0) - parser.add_argument("--ckpt_path", type=str, default="./ckpt_path") - parser.add_argument("--submission_path", type=str, default="./submission.pkl") - parser.add_argument("--save_interval", type=int, default=20) - parser.add_argument("--processor_size", type=int, default=15) - parser.add_argument("--num_layers_node_processor", type=int, default=2) - parser.add_argument("--num_layers_edge_processor", type=int, default=2) - parser.add_argument("--hidden_dim_node_encoder", type=int, default=64) - parser.add_argument("--num_layers_node_encoder", type=int, default=2) - parser.add_argument("--hidden_dim_edge_encoder", type=int, default=64) - parser.add_argument("--num_layers_edge_encoder", type=int, default=2) - parser.add_argument("--hidden_dim_node_decoder", type=int, default=64) - parser.add_argument("--num_layers_node_decoder", type=int, default=2) - parser.add_argument( - "--aggregation", type=str, choices=["sum", "mean", "max"], default="sum" - ) - parser.add_argument("--activation", type=str, default="leaky") - parser.add_argument("--master_addr", type=str, default="127.0.0.1") - parser.add_argument("--master_port", type=int, default=65325) - args = parser.parse_args() - - main_worker(args) diff --git a/benchmarks/MGN/model.py b/benchmarks/MGN/model.py deleted file mode 100644 index f34b08a2..00000000 --- a/benchmarks/MGN/model.py +++ /dev/null @@ -1,118 +0,0 @@ -from typing import Union - -import torch -from dgl import DGLGraph -from physicsnemo.models.gnn_layers.mesh_graph_mlp import 
MeshGraphMLP -from physicsnemo.models.meshgraphnet.meshgraphnet import MeshGraphNetProcessor, MetaData -from physicsnemo.models.module import Module -from torch import Tensor - - -class MeshGraphNet(Module): - def __init__( - self, - input_dim_nodes: int, - input_dim_edges: int, - output_dim: int, - processor_size: int = 15, - num_layers_node_processor: int = 2, - num_layers_edge_processor: int = 2, - hidden_dim_node_encoder: int = 128, - num_layers_node_encoder: int = 2, - hidden_dim_edge_encoder: int = 128, - num_layers_edge_encoder: int = 2, - hidden_dim_node_decoder: int = 128, - num_layers_node_decoder: int = 2, - aggregation: str = "sum", - do_concat_trick: bool = False, - num_processor_checkpoint_segments: int = 0, - activation: str = "relu", - ): - super().__init__(meta=MetaData()) - - if activation == "relu": - activation_fn = torch.nn.ReLU() - elif activation == "elu": - activation_fn = torch.nn.ELU() - elif activation == "leaky": - activation_fn = torch.nn.LeakyReLU(0.05) - else: - raise ValueError() - - self.edge_encoder = MeshGraphMLP( - input_dim_edges, - output_dim=hidden_dim_edge_encoder, - hidden_dim=hidden_dim_edge_encoder, - hidden_layers=num_layers_edge_encoder, - activation_fn=activation_fn, - norm_type="LayerNorm", - recompute_activation=False, - ) - self.node_encoder = MeshGraphMLP( - input_dim_nodes, - output_dim=hidden_dim_node_encoder, - hidden_dim=hidden_dim_node_encoder, - hidden_layers=num_layers_node_encoder, - activation_fn=activation_fn, - norm_type="LayerNorm", - recompute_activation=False, - ) - self.node_decoder = MeshGraphMLP( - hidden_dim_node_encoder, - output_dim=output_dim, - hidden_dim=hidden_dim_node_decoder, - hidden_layers=num_layers_node_decoder, - activation_fn=activation_fn, - norm_type=None, - recompute_activation=False, - ) - self.processor = MeshGraphNetProcessor( - processor_size=processor_size, - input_dim_node=hidden_dim_node_encoder, - input_dim_edge=hidden_dim_edge_encoder, - num_layers_node=num_layers_node_processor, - num_layers_edge=num_layers_edge_processor, - aggregation=aggregation, - norm_type="LayerNorm", - activation_fn=activation_fn, - do_concat_trick=do_concat_trick, - num_processor_checkpoint_segments=num_processor_checkpoint_segments, - ) - - def forward( - self, - node_features: Tensor, - edge_features: Tensor, - graph: Union[DGLGraph, list[DGLGraph]], - ) -> Tensor: - edge_features = self.edge_encoder(edge_features) - node_features = self.node_encoder(node_features) - x = self.processor(node_features, edge_features, graph) - x = self.node_decoder(x) - return x - - -def create_model(args): - model = MeshGraphNet( - input_dim_nodes=args.input_dim_nodes, - input_dim_edges=args.input_dim_edges, - output_dim=args.output_dim, - processor_size=args.processor_size, - num_layers_node_processor=args.num_layers_node_processor, - num_layers_edge_processor=args.num_layers_edge_processor, - hidden_dim_node_encoder=args.hidden_dim_node_encoder, - num_layers_node_encoder=args.num_layers_node_encoder, - hidden_dim_edge_encoder=args.hidden_dim_edge_encoder, - num_layers_edge_encoder=args.num_layers_edge_encoder, - hidden_dim_node_decoder=args.hidden_dim_node_decoder, - num_layers_node_decoder=args.num_layers_node_decoder, - aggregation=args.aggregation, - do_concat_trick=False, - num_processor_checkpoint_segments=0, - activation=args.activation, - ) - - optimizer = torch.optim.Adam(model.parameters(), lr=args.lr) - loss_fn = torch.nn.MSELoss() - - return model, optimizer, loss_fn diff --git a/benchmarks/MGN/train.py 
b/benchmarks/MGN/train.py deleted file mode 100644 index d685b83c..00000000 --- a/benchmarks/MGN/train.py +++ /dev/null @@ -1,146 +0,0 @@ -import os -import time - -import pandas as pd -import torch -from data import GraphDataset -from dgl.dataloading import GraphDataLoader -from utils import save_fields - -device = torch.device("cuda" if torch.cuda.is_available() else "cpu") - -for i in range(torch.cuda.device_count()): - print(f"💻 Using device {i}: {torch.cuda.get_device_properties(i).name}") - - -def train(args, model, optimizer, loss_fn, train_data, test_data): - model.to(device) - - # Create necessary directories - checkpoint_dir = os.path.join( - args.save_path, f"{args.dataset_name}/{args.run_name}/checkpoints" - ) - predictions_dir = os.path.join( - args.save_path, f"{args.dataset_name}/{args.run_name}/predictions" - ) - metrics_dir = os.path.join( - args.save_path, f"{args.dataset_name}/{args.run_name}/metrics" - ) - - os.makedirs(checkpoint_dir, exist_ok=True) - os.makedirs(predictions_dir, exist_ok=True) - os.makedirs(metrics_dir, exist_ok=True) - - # Dataset - train_dataset = GraphDataset(args, train_data, data_type="train") - test_dataset = GraphDataset( - args, - test_data, - data_type="test", - in_scaler=train_dataset.in_scaler, - out_scaler=train_dataset.out_scaler, - fields_min=train_dataset.fields_min, - fields_max=train_dataset.fields_max, - ) - - fields_min = train_dataset.fields_min.clone().detach().to(device) - fields_max = train_dataset.fields_max.clone().detach().to(device) - - # Dataloader - train_dataloader = GraphDataLoader( - train_dataset, - batch_size=args.batch_size, - shuffle=True, - drop_last=True, - ) - - test_dataloader = GraphDataLoader( - test_dataset, - batch_size=1, - shuffle=False, - drop_last=False, - ) - - # Calculate and print model parameters - total_params = sum(p.numel() for p in model.parameters() if p.requires_grad) - print(f"⚙️ Total number of model parameters: {total_params}") - - # Record the start time - start_time = time.time() - - # Data structure for storing metrics - metrics = [] - - # Training loop - num_epochs = args.num_epochs - - for epoch in range(num_epochs): - epoch_start_time = time.time() - - train_loss = 0.0 - model.train() - y_trains, y_train_preds = [], [] - - for idx, (graph, _, _) in enumerate(train_dataloader): - optimizer.zero_grad() - - graph = graph.to(device) - pred = model(graph.ndata["x"], graph.edata["f"], graph) - loss = loss_fn(graph.ndata["y"], pred) - - loss.backward() - optimizer.step() - train_loss += loss.item() - - y_trains.append(graph.ndata["y"]) - y_train_preds.append(pred) - - train_loss /= idx + 1 - - model.eval() - y_test_preds = [] - - with torch.no_grad(): - for idx, (graph, _, _) in enumerate(test_dataloader): - graph = graph.to(device) - pred = model(graph.ndata["x"], graph.edata["f"], graph) - - pred = pred * (fields_max - fields_min) + fields_min - y_test_preds.append(pred) - - if (epoch + 1) % 100 == 0: - torch.save( - model.state_dict(), - os.path.join(checkpoint_dir, f"state_epoch_{epoch + 1}.pt"), - ) - save_fields( - os.path.join(predictions_dir, f"predicted_fields_{epoch + 1}.h5"), - y_test_preds, - ) - - epoch_end_time = time.time() - epoch_duration = epoch_end_time - epoch_start_time - - # Collect metrics for this epoch - metrics.append( - {"epoch": epoch, "train_loss": train_loss, "duration": epoch_duration} - ) - - metrics_str = ( - f"🌟" - f"Epoch {epoch + 1} | " - f"Train Loss: {train_loss:.7f} | " - f"Duration: {epoch_duration:.2f} (s) " - ) - print(metrics_str) - - # Saving 
collected metrics to a CSV file - metrics_df = pd.DataFrame(metrics) - metrics_df.to_csv(os.path.join(metrics_dir, "metrics.csv"), index=False) - - # Record the end time - end_time = time.time() - - # Calculate the training duration - training_duration = end_time - start_time - print(f"⏰ Training took {training_duration:.7f} seconds") diff --git a/benchmarks/MGN/utils.py b/benchmarks/MGN/utils.py deleted file mode 100644 index 13018b32..00000000 --- a/benchmarks/MGN/utils.py +++ /dev/null @@ -1,134 +0,0 @@ -import math - -import h5py -import numpy as np -import torch -from Muscat.Containers.MeshInspectionTools import ComputeMeshMinMaxLengthScale -from sklearn.neighbors import KDTree - - -def get_bandwidth(mesh) -> float: - lengthscale = ComputeMeshMinMaxLengthScale(mesh) - return lengthscale - - -def relative_rmse_field( - y_true: list[torch.Tensor], y_pred: list[torch.Tensor], threshold: float = 0.0 -) -> torch.Tensor: - return torch.sqrt( - torch.mean( - torch.stack( - [ - torch.linalg.norm(y - y_hat, axis=0) ** 2 - / (len(y) * step_function_field(y, threshold) ** 2) - for y, y_hat in zip(y_true, y_pred) - ] - ), - dim=0, - ) - ) - - -def save_fields(filename: str, fields: list[torch.Tensor]) -> None: - with h5py.File(filename, "w", libver="latest") as f: - for idx, field in enumerate(fields): - f.create_dataset(str(idx), data=field.cpu().numpy()) - return None - - -def save_scalars(file_path, data_list): - with h5py.File(file_path, "w") as f: - for i, data_array in enumerate(data_list): - f.create_dataset(f"array_{i}", data=data_array) - - -def load_fields(filename: str) -> list[torch.Tensor]: - fields = [] - with h5py.File(filename, "r") as f: - # Sort the keys numerically - for name in sorted(f.keys(), key=int): - data = f[name][()] - tensor = torch.from_numpy(data) - fields.append(tensor) - return fields - - -def load_scalars(file_path): - data_list = [] - with h5py.File(file_path, "r") as f: - # Sort the keys numerically - for key in sorted(f.keys(), key=lambda x: int(x.split("_")[1])): - data_array = f[key][()] - data_list.append(data_array) - return data_list - - -def extract_border_edges(faces): - edge_dict = {} - - for face in faces: - for i in range(3): - edge = tuple(sorted((face[i], face[(i + 1) % 3]))) - if edge in edge_dict: - edge_dict[edge] += 1 - else: - edge_dict[edge] = 1 - - border_edges = [edge for edge, count in edge_dict.items() if count == 1] - return np.array(border_edges) - - -def get_distances_to_borders(pos, cells): - faces = np.array(cells) - points = np.array(pos) - bars = extract_border_edges(faces) - border_bars_node_ids = np.unique(np.ravel(bars)) - is_border = np.zeros(len(points), dtype=bool) - is_border[border_bars_node_ids] = True - search_index = KDTree(points[is_border]) - sdf, _ = search_index.query(points, return_distance=True) - return is_border, sdf - - -def sinusoidal_embedding( - x: torch.Tensor, num_basis: int = 8, max_coord: float = 2.0, spacing: float = 1.0 -) -> torch.Tensor: - # Normalize and compute frequencies - x = x / spacing - max_seq = max_coord / spacing - exponents = -math.log(max_seq * 4 / math.pi) / (num_basis - 1) - div_term = torch.exp(torch.arange(num_basis, device=x.device) * exponents) - emb = x.unsqueeze(-1) * div_term - sin_emb = emb.sin() - cos_emb = emb.cos() - # Concat and flatten the last two dims: -> [*, D * 2 * num_basis] - return torch.cat([sin_emb, cos_emb], dim=-1).flatten(-2, -1) - - -def angles_to_planes(coords): - x, y = coords[:, 0], coords[:, 1] - angles = torch.stack( - [ - torch.atan2(y, x), - 
torch.atan2(y, -x), - torch.atan2(-y, -x), - torch.atan2(-y, x), - ], - dim=1, - ) - return angles - - -def spherical_harmonics(angle: torch.Tensor, l_max: int = 3) -> torch.Tensor: - cos_t = torch.cos(angle).cpu().numpy() - harmonics = [] - for l in range(l_max + 1): - coeffs = np.zeros(l + 1) - coeffs[-1] = 1 - P_l = np.polynomial.legendre.Legendre(coeffs) - harmonics.append( - torch.tensor(P_l(cos_t), device=angle.device, dtype=angle.dtype).unsqueeze( - -1 - ) - ) - return torch.cat(harmonics, dim=-1) # [*, l_max+1] diff --git a/benchmarks/MMGP/2D_profile/README.md b/benchmarks/MMGP/2D_profile/README.md deleted file mode 100644 index 2c741b8e..00000000 --- a/benchmarks/MMGP/2D_profile/README.md +++ /dev/null @@ -1,14 +0,0 @@ -To run this benchmark, you need to download and untar the dataset from Zenodo, then edit in the files "morphing_script.py" and "train_and_predict.py" the following variables at the top of the files: - -- `plaid_location`: the location where the plaid dataset has been untarred - -Run in the order: "create_coarse_common_mesh.py", "launch_morphings.py", and "train_and_predict.py" to generate the prediction on the testing set. - -Remark: the loop in "launch_morphings.py" is embarrasingly parallel, and the calls to "morphing_script.py" can be efficiently handled by any job scheduler. - - -### List of dependencies - -- [PLAID=0.1.6](https://github.com/PLAID-lib/plaid) -- [Muscat=2.4.1](https://gitlab.com/drti/muscat) -- [GPy=1.13.2](https://github.com/SheffieldML/GPy) \ No newline at end of file diff --git a/benchmarks/MMGP/2D_profile/create_coarse_common_mesh.py b/benchmarks/MMGP/2D_profile/create_coarse_common_mesh.py deleted file mode 100644 index e3a25a48..00000000 --- a/benchmarks/MMGP/2D_profile/create_coarse_common_mesh.py +++ /dev/null @@ -1,73 +0,0 @@ -from Muscat.IO import XdmfWriter as XW -from Muscat.MeshTools.Remesh import Remesh -from Muscat.Containers import AnisotropicMetricComputation as AMC -import numpy as np -from Muscat.Containers.MeshModificationTools import ComputeSkin - - -from Muscat.Containers import ElementsDescription as ED -from Muscat.Containers.Filters import FilterObjects as FO -from Muscat.Containers.Filters import FilterOperators as FOp - -from Muscat.IO.CGNSReader import ReadCGNS - -import os - - -plaid_location = "/path/to/plaid/" # path to update to input plaid dataset - - -reference_mesh_index = 0 - -common_mesh_path = os.path.join(plaid_location, "dataset/samples/sample_00000000"+str(reference_mesh_index)+"/meshes/mesh_000000000.cgns") -mesh = ReadCGNS(fileName=common_mesh_path) - - -mesh.nodes = np.ascontiguousarray(mesh.nodes[:,:2]) - -m = mesh.nodeFields['Mach'] -p = mesh.nodeFields['Pressure'] -Ux = mesh.nodeFields['Velocity-x'] -Uy = mesh.nodeFields['Velocity-y'] - - -m_scaled = (1.0 / np.std(m)) * (m - np.mean(m)) -p_scaled = (1.0 / np.std(p)) * (p - np.mean(p)) -norm_U = np.sqrt(Ux**2 + Uy**2) -norm_U_scaled = (1.0 / np.std(norm_U)) * (norm_U - np.mean(norm_U)) -psi = 1.*p_scaled + 1.*m_scaled + 0.*norm_U_scaled - -metric = AMC.ComputeMetric(mesh, psi, err = 0.03, gradL2 = False) -# metric = AMC.ComputeMetric(mesh, psi, err = 0.03, hmin = 0.005, hmax = 1000, gradL2 = False) - -remeshed_mesh = Remesh(mesh = mesh, solution = None, metric = metric) - -remeshed_mesh = ComputeSkin(remeshed_mesh, inPlace=True) - -nf_skin = FO.NodeFilter(eTag =["Skin"]) -indices_skin = nf_skin.GetNodesIndices(remeshed_mesh) - -nf1 = FO.NodeFilter(eTag =["Skin"], zone=[lambda xyz: (xyz[:, 0] - 1.5)]) -nf2 = FO.NodeFilter(eTag =["Skin"], zone=[lambda 
xyz: (xyz[:, 1] - 0.5)]) -nf3 = FO.NodeFilter(eTag =["Skin"], zone=[lambda xyz: (-xyz[:, 0] - 0.5)]) -nf4 = FO.NodeFilter(eTag =["Skin"], zone=[lambda xyz: (-xyz[:, 1] - 0.5)]) - - -nf = FOp.IntersectionFilter(filters=[nf1, nf2, nf3, nf4]) -indices_airfoil = nf.GetNodesIndices(remeshed_mesh) -# print("indices_airfoil =", indices_airfoil) -remeshed_mesh.GetNodalTag("Airfoil").AddToTag(indices_airfoil) - -indices_ext_bound = np.setdiff1d(indices_skin, indices_airfoil) -remeshed_mesh.GetNodalTag("Ext_bound").AddToTag(indices_ext_bound) - -nf5 = FO.NodeFilter(eTag =["Skin"], zone=[lambda xyz: (xyz[:, 0] + 0.9999)]) -inlet_indices = nf5.GetNodesIndices(remeshed_mesh) -remeshed_mesh.GetNodalTag("Inlet").AddToTag(inlet_indices) - -remeshed_mesh.elements[ED.Bar_2].tags.DeleteTags(["Skin"]) -del remeshed_mesh.elements[ED.Bar_2] -remeshed_mesh.nodesTags.DeleteTags(['Corners', 'NTag_mmg_0']) - - -XW.WriteMeshToXdmf("coarse_common_mesh.xdmf", remeshed_mesh) diff --git a/benchmarks/MMGP/2D_profile/launch_morphings.py b/benchmarks/MMGP/2D_profile/launch_morphings.py deleted file mode 100644 index 1892db7d..00000000 --- a/benchmarks/MMGP/2D_profile/launch_morphings.py +++ /dev/null @@ -1,4 +0,0 @@ -import os - -for CASE in range(0, 399): - os.system(f"python morphing_script.py {CASE}") diff --git a/benchmarks/MMGP/2D_profile/model.py b/benchmarks/MMGP/2D_profile/model.py deleted file mode 100644 index 7efe257d..00000000 --- a/benchmarks/MMGP/2D_profile/model.py +++ /dev/null @@ -1,87 +0,0 @@ -import GPy -import numpy as np -from sklearn.base import BaseEstimator, RegressorMixin - - -class GPyRegressor(BaseEstimator, RegressorMixin): - """Custom Gaussian Process Regressor using GPy library. - - Args: - normalizer (bool): Whether to normalize the output. - constant_mean (bool): Whether to use a constant mean model. - kernel (str): Type of kernel to use in the Gaussian Process. - Options: 'Matern52', 'Matern32', 'Rbf'. - num_restarts (int): Number of restarts for kernel optimization. - """ - - def __init__( - self, - normalizer: bool = False, - constant_mean: bool = False, - kernel: str = "Matern52", - num_restarts: int = 5, - ): - self.normalizer = normalizer - self.constant_mean = constant_mean - self.kernel = kernel - self.num_restarts = num_restarts - - def fit(self, X, y): - """Fit the Gaussian Process model to the data. - - Args: - X (ndarray): Input features of shape (n_samples, n_features). - y (ndarray): Target values of shape (n_samples,) or (n_samples, 1). - - Returns: - self: Returns the instance of the fitted model. 
- """ - # Reshape y to have shape (n_samples, 1) if it's 1D - if y.ndim == 1: - y = y[:, None] - - # Define the kernel based on the specified kernel type - if self.kernel == "Matern52": - kernel = GPy.kern.Matern52(input_dim=X.shape[-1], ARD=True) - elif self.kernel == "Matern32": - kernel = GPy.kern.Matern32(input_dim=X.shape[-1], ARD=True) - elif self.kernel == "RBF": - kernel = GPy.kern.RBF(input_dim=X.shape[-1], ARD=True) - else: - raise ValueError("Kernel should be 'RBF', 'Matern32', or 'Matern52'") - - mean_function = None - if self.constant_mean: - mean_function = GPy.mappings.Constant( - input_dim=X.shape[-1], output_dim=y.shape[-1] - ) - - # Create and optimize the GP regression model - self.kmodel = GPy.models.GPRegression( - X=X, - Y=y, - kernel=kernel, - normalizer=self.normalizer, - mean_function=mean_function, - ) - self.kmodel.optimize_restarts(num_restarts=self.num_restarts, messages=False) - return self - - def predict(self, X, return_var: bool = False): - """Predict using the Gaussian Process model. - - Args: - X (ndarray): Input features of shape (n_samples, n_features). - return_var (bool): Return the predictive variance. - - Returns: - mean (ndarray): Predicted mean values for the input data - or - (mean, variance) (ndarray): Predicted mean and variance values if return_var is True. - """ - # Get the mean prediction from the GP model - mean, var = self.kmodel.predict(X) - if return_var: - return mean, np.tile(var, (1, mean.shape[-1])) - else: - return mean diff --git a/benchmarks/MMGP/2D_profile/morphing_script.py b/benchmarks/MMGP/2D_profile/morphing_script.py deleted file mode 100644 index 39e60c2c..00000000 --- a/benchmarks/MMGP/2D_profile/morphing_script.py +++ /dev/null @@ -1,156 +0,0 @@ - -import pickle -import copy -import time -import sys , os - -from Muscat.FE.Fields.FEField import FEField -from Muscat.FE.FETools import PrepareFEComputation -from Muscat.Containers.Filters.FilterObjects import ElementFilter -from Muscat.Containers.NativeTransfer import NativeTransfer -from Muscat.IO.CGNSReader import ReadCGNS -from Muscat.IO import XdmfReader as XR -import Muscat.Containers.MeshInspectionTools as UMIT -from Muscat.Containers.MeshModificationTools import CleanLonelyNodes , ComputeSkin - -from utils_2dprofile import ElasticProblem , VectorialDistance_Muscat_preprocessed,signedDistannce_Function_kokkos - - -plaid_location = "/path/to/plaid/" # path to update to input plaid dataset - - -def MatchTwoGeometries(mesh, Tmesh,TmeshIndex=250,max_iteration=200, tolerance= 1*10**(-3) ,YoungModulus=0.1 , nu=0.3 , alpha=200, gamma=5,beta=0,formulation="vect_distance",tags=[]) : - - space, numbering,_,_ = PrepareFEComputation(mesh,numberOfComponents=1) - - Tspace, Tnumberings,_,_ = PrepareFEComputation(Tmesh,numberOfComponents=1) - field_Tmesh = FEField("", mesh=Tmesh, space=Tspace, numbering=Tnumberings[0]) - - begin=time.time() - - - Tmesh_partition={} - for tag in tags: - Tmesh_partition[tag]={} - mesh_Filter= ElementFilter(nTag =tag) - Tmesh_tag=UMIT.ExtractElementsByElementFilter(Tmesh, mesh_Filter) - CleanLonelyNodes(Tmesh_tag) - Tspace, Tnumberings,_,_ = PrepareFEComputation(Tmesh_tag,numberOfComponents=1) - FE_Tmesh_tag = FEField("", mesh=Tmesh_tag, space=Tspace, numbering=Tnumberings[0]) - Tmesh_partition[tag]["mesh"]=Tmesh_tag - Tmesh_partition[tag]["FEField"]=FE_Tmesh_tag - - nt = NativeTransfer() - nt.SetVerbose(False) - nt.SetTransferMethod("Interp/Clamp") - nt.SetSourceFEField(FE_Tmesh_tag, elementFilter=None) - Tmesh_partition[tag]["TransferOperator"]=nt - - - 
space, numbering,_,_ = PrepareFEComputation(mesh,numberOfComponents=1) - - Tspace, Tnumberings,_,_ = PrepareFEComputation(Tmesh,numberOfComponents=1) - field_Tmesh = FEField("", mesh=Tmesh, space=Tspace, numbering=Tnumberings[0]) - - begin=time.time() - - - sol = mesh.nodes*0 - - - dist = signedDistannce_Function_kokkos(Tmesh, mesh,field_Tmesh=field_Tmesh) - - d= max(abs(dist)) - - begin=time.time() - i=0 - - - variable_E=True - tranfer_time=0 - - while d > tolerance: - - print(f"Iteration {i}") - print(d) - if i%40 ==0 and i<340: - #print(i) - variable_E =True - alpha=alpha*2 - gamma = gamma*2 - - vectDist=VectorialDistance_Muscat_preprocessed(mesh,tags=tags,Tmesh_partition=Tmesh_partition) - signedDistance = signedDistannce_Function_kokkos(Tmesh, mesh,field_Tmesh=field_Tmesh) - data_signedDistance= signedDistance* -0.25 - - if i ==600: - #print(i) - variable_E =False - alpha=50 - - vectDist = vectDist*0.0 - data_signedDistance= signedDistance* 2 - - gamma = gamma*0.007 - - - d= max(abs(signedDistance)) - extraFields = [FEField("vectDist_0",mesh=mesh, space=space, numbering=numbering[0], data= 1*vectDist[:,0]), - FEField("vectDist_1",mesh=mesh, space=space, numbering=numbering[0], data= 1*vectDist[:,1]), - FEField("signedDistance",mesh=Tmesh, space=space, numbering=numbering[0], data= data_signedDistance)] - - - sol = ElasticProblem(mesh, extraFields,YoungModulus,nu,alpha,formulation=formulation,variable_E=variable_E) - - - - mesh.nodes += sol*gamma - if i==max_iteration: - time_per_sample=time.time()-begin - print("Time=",time_per_sample) - - break - i+=1 - - time_per_sample=time.time()-begin - - print("number of iteration= ", i) - print("Time=",time_per_sample) - print("tranfer time = ", tranfer_time) - - return mesh - - - -tags=["Airfoil"] -sample=int(sys.argv[1]) -print("sample = ", sample) - -Tmesh_index=str(sample).zfill(3) - -Tmesh_path = os.path.join(plaid_location, "dataset/samples/sample_000000"+str(Tmesh_index)+"/meshes/mesh_000000000.cgns") - -# plaid_location_coarse = # path to update to plaid dataset containing a sample with the coarse common mesh -# reference_mesh_index=0 -# reference_mesh_path = os.path.join(plaid_location_coarse, "dataset/samples/sample_00000000"+str(reference_mesh_index)+"/meshes/mesh_000000000.cgns") -# reference_mesh = ReadCGNS(fileName=reference_mesh_path) -reference_mesh = XR.ReadXdmf("coarse_common_mesh.xdmf") -Tmesh = ReadCGNS(fileName=Tmesh_path) - -ComputeSkin(reference_mesh,inPlace=True) -ComputeSkin(Tmesh,inPlace=True) - -mesh =MatchTwoGeometries(copy.deepcopy(reference_mesh), Tmesh ,max_iteration=700 ,tolerance= 5*10**(-5) ,YoungModulus=1, nu=0.3 , alpha=100, gamma=15,beta=0, - formulation="signed_distance",tags=tags) - - -displacement_field=mesh.nodes-reference_mesh.nodes - - -folder_path = "displacement_field" - -if not os.path.exists(folder_path): - os.makedirs(folder_path) - -with open(folder_path+"/displacement_field"+Tmesh_index+".pkl", 'wb') as file: - pickle.dump(displacement_field, file) \ No newline at end of file diff --git a/benchmarks/MMGP/2D_profile/train_and_predict.py b/benchmarks/MMGP/2D_profile/train_and_predict.py deleted file mode 100644 index 3a693a1d..00000000 --- a/benchmarks/MMGP/2D_profile/train_and_predict.py +++ /dev/null @@ -1,222 +0,0 @@ -import numpy as np -import copy -import pickle -import time -from sklearn.preprocessing import StandardScaler, MinMaxScaler - -from Muscat.IO.CGNSReader import ReadCGNS -from Muscat.Containers.MeshFieldOperations import GetFieldTransferOp -from 
Muscat.Containers.Filters.FilterObjects import ElementFilter -from Muscat.FE.Fields.FEField import FEField -from Muscat.FE.FETools import PrepareFEComputation -from Muscat.Containers.Filters import FilterObjects as FO -from Muscat.IO import XdmfReader as XR -from scipy.sparse import identity - -from model import GPyRegressor - -from utils_2dprofile import POD -import os - -plaid_location = "/path/to/plaid/" # path to update to input plaid dataset - - - -start=time.time() - -# plaid_location_coarse = # path to update to plaid dataset containing a sample with the coarse common mesh -# reference_mesh_index=0 -# reference_mesh_path = os.path.join(plaid_location_coarse, "dataset/samples/sample_00000000"+str(reference_mesh_index)+"/meshes/mesh_000000000.cgns") -# reference_mesh = ReadCGNS(fileName=reference_mesh_path) -reference_mesh = XR.ReadXdmf("coarse_common_mesh.xdmf") - -reference_mesh_path = os.path.join(plaid_location, "dataset/samples/sample_00000000"+str(reference_mesh_index)+"/meshes/mesh_000000000.cgns") -reference_mesh_fin = ReadCGNS(fileName=reference_mesh_path) - -space, numbering,_,_ = PrepareFEComputation(reference_mesh,numberOfComponents=1) -fefield = FEField("", mesh=reference_mesh, space=space, numbering=numbering[0]) -op_ref_mesh=GetFieldTransferOp(inputField= fefield, targetPoints= reference_mesh_fin.nodes, method="Interp/Clamp" , elementFilter= ElementFilter(dimensionality=2) )[0] - - - -node_fields=['Mach', 'Pressure', 'Velocity-x', 'Velocity-y'] -n_train=300 -nNodes_referenceMesh=reference_mesh_fin.nodes.shape[0] - -flattend_data_morphing=np.zeros((n_train,2*reference_mesh.nodes.shape[0])) - - - -# transfer data on reference mesh by pull back - -data={} - -for field_name in node_fields: - data[field_name] = np.zeros((n_train,nNodes_referenceMesh)) - -morphing_displacement_fields=np.zeros((n_train,nNodes_referenceMesh,2)) - - -for sample in range(n_train): - print(sample) - Tmesh_index=str(sample).zfill(3) - - file = open("displacement_field/displacement_field"+Tmesh_index+".pkl", 'rb') - morphing_displacement_field_coarse = pickle.load(file) - file.close() - - morphing_displacement_fields[sample] = op_ref_mesh.dot(morphing_displacement_field_coarse) - - temp_mesh=copy.deepcopy(reference_mesh_fin) - - temp_mesh.nodes += morphing_displacement_fields[sample] - - - Tmesh_path = os.path.join(plaid_location, "dataset/samples/sample_00000000"+str(Tmesh_index)+"/meshes/mesh_000000000.cgns") - Tmesh = ReadCGNS(fileName=Tmesh_path) - - space, numbering,_,_ = PrepareFEComputation(Tmesh,numberOfComponents=1) - fefield = FEField("", mesh=Tmesh, space=space, numbering=numbering[0]) - OP=GetFieldTransferOp(inputField= fefield, targetPoints= temp_mesh.nodes, method="Interp/Clamp" , elementFilter= ElementFilter(dimensionality=2) )[0] - - for field_name in node_fields: - data[field_name][sample] = OP.dot(Tmesh.nodeFields[field_name]) - - -for sample in range(n_train): - flattend_data_morphing[sample,:]=morphing_displacement_fields[sample,:,:].flatten() - - -#dimensionality reduction -elementFilter = FO.ElementFilter() -elementFilter.SetDimensionality(2) - -correlationOperator1c = identity(reference_mesh_fin.nodes.shape[0]) - -correlationOperator2c = identity(2*reference_mesh.nodes.shape[0]) - -data_POD={} -reducedOrderBasis={} -generalizedCoordinates={} -eigenvalues={} - -n_output= 40 - -for field_name in node_fields: - reducedOrderBasis[field_name] , generalizedCoordinates[field_name] , eigenvalues[field_name] = 
POD(data[field_name],correlationOperator=correlationOperator1c,nmodes=n_output) - print("Energy for field "+field_name+"= ", np.sum(eigenvalues[field_name] [:n_output])/np.sum(eigenvalues[field_name] [:])) - - -n_modes_input=18 -reducedOrderBasis_u , generalizedCoordinates_u, eigenvalues_u = POD(data=flattend_data_morphing,correlationOperator=correlationOperator2c,nmodes=n_modes_input) -print("Energy for displacement fields = ", np.sum(eigenvalues_u[:n_modes_input])/np.sum(eigenvalues_u[:])) - - -# scaling inputs and outputs - - -scalerX= StandardScaler() -X=scalerX.fit_transform(generalizedCoordinates_u) - -y_scalers = [] -Y = [] - -for field_name in node_fields: - y_scaler = MinMaxScaler() - y_scalers.append(y_scaler) - Y.append(y_scaler.fit_transform(generalizedCoordinates[field_name])) - -# train GP - -def train_single_output(X, y,num_restarts=7): - gp = GPyRegressor(num_restarts=num_restarts) - gp.fit(X, y) - return gp - - - -n_outputs = Y[0].shape[-1] - -gp_models_fields=[] -print(f">> training field") - -for i,field_name in enumerate(node_fields): - - print(f">> training {node_fields[i]}") - - model_one_field=[] - for j in range(n_outputs): - print(f"Coord. {j}") - model=train_single_output(X,Y[i][:,j]) - model_one_field.append(model) - - gp_models_fields.append(model_one_field) - - - - -# predict - -n_test=100 - -alpha=np.zeros((n_test,n_modes_input)) -transfer_op_test=[] -for sample in range(n_test): - - - print(sample) - Tmesh_index=str(sample+n_train).zfill(3) - - file = open("displacement_field/displacement_field"+Tmesh_index+".pkl", 'rb') - displacement_field = pickle.load(file) - file.close() - alpha[sample] = np.dot(reducedOrderBasis_u,correlationOperator2c.dot(displacement_field.flatten())) - - temp_mesh=copy.deepcopy(reference_mesh_fin) - temp_mesh.nodes += op_ref_mesh.dot(displacement_field) - - - Tmesh_path = os.path.join(plaid_location, "dataset/samples/sample_00000000"+str(Tmesh_index)+"/meshes/mesh_000000000.cgns") - Tmesh = ReadCGNS(fileName=Tmesh_path) - - space, numbering,_,_ = PrepareFEComputation(temp_mesh,numberOfComponents=1) - fefield = FEField("", mesh=temp_mesh, space=space, numbering=numbering[0]) - - - OP=GetFieldTransferOp(inputField= fefield, targetPoints= Tmesh.nodes, method="Interp/Clamp" , elementFilter= ElementFilter(dimensionality=2) )[0] - transfer_op_test.append(OP) - - -input_test = scalerX.transform(alpha) - - -y_pred_common = [] - - - -for i,field_name in enumerate(node_fields): - output_dim = Y[i].shape[-1] - y_pred_i = np.empty((n_test, output_dim)) - - for j in range(output_dim): - y_pred_i[:,j] = gp_models_fields[i][j].predict(input_test)[:,0] - - - y_pred_i_inv = y_scalers[i].inverse_transform(y_pred_i) - y_pred_common_i = np.dot(y_pred_i_inv, reducedOrderBasis[field_name]) - y_pred_common.append(y_pred_common_i) - - - -prediction = [] - -for i in range(n_test): - prediction.append({}) - - - for j,field_name in enumerate(node_fields): - prediction[i][field_name] = transfer_op_test[i].dot(y_pred_common[j][i]) - -with open('prediction.pkl', 'wb') as file: - pickle.dump(prediction, file) - diff --git a/benchmarks/MMGP/2D_profile/utils_2dprofile.py b/benchmarks/MMGP/2D_profile/utils_2dprofile.py deleted file mode 100644 index fc9848e8..00000000 --- a/benchmarks/MMGP/2D_profile/utils_2dprofile.py +++ /dev/null @@ -1,320 +0,0 @@ -import Muscat.Containers.MeshInspectionTools as UMIT -import numpy as np -from Muscat.Containers import MeshInspectionTools as UMIP -from Muscat.Containers.Filters.FilterObjects import ElementFilter -from 
Muscat.Containers.MeshFieldOperations import GetFieldTransferOp -from Muscat.Containers.MeshModificationTools import CleanLonelyNodes -from Muscat.FE.DofNumbering import ComputeDofNumbering -from Muscat.FE.FETools import ComputeNormalsAtPoints, PrepareFEComputation -from Muscat.FE.Fields.FEField import FEField -from Muscat.FE.Fields.FieldTools import GetPointRepresentation -from Muscat.FE.Spaces.FESpaces import LagrangeSpaceP0 -from Muscat.FE.SymPhysics import MechPhysics -from Muscat.FE.SymWeakForm import GetField, GetNormal, GetScalarField -from Muscat.FE.UnstructuredFeaSym import UnstructuredFeaSym -from Muscat.Helpers.Timer import Timer - - -class MecaPhysics_ESM(MechPhysics): - # Add weak formulations to the MecaPhysics class - def __init__(self, dim=2, elasticModel="isotropic"): - super().__init__(dim) - self.spaceDimension = dim - - def WeakDirichlet(self, alpha): - u = self.primalUnknown - ut = self.primalTest - - a = GetScalarField(alpha) - return u.T * ut * a - - def WeakDirichletNormal(self, alpha): - u = self.primalUnknown - ut = self.primalTest - a = GetScalarField(alpha) - - # Normal = GetNormal(self.spaceDimension ) - Normal = GetField("normal_nodes", 2) - - return u.T * Normal * ut.T * Normal * a - - def vectDistanceFormulation(self): - Normal = GetNormal(self.spaceDimension) - ut = self.primalTest - vectDist_symb = GetField("vectDist", size=2) - - return (vectDist_symb.T) * ut - return (vectDist_symb.T * Normal) * ut.T * Normal - - def vectDistanceFormulation2(self): - GetNormal(self.spaceDimension) - ut = self.primalTest - vectDist_symb = GetField("vectDist", size=2) - return vectDist_symb.T * ut - - def Pressure_updated_normal(self, pressure): - ut = self.primalTest - - p = GetScalarField(pressure) - - Normal = GetField("normal_nodes", 2) - - return p * Normal.T * ut - - -def signedDistanceFunction(Tmesh, mesh, field_Tmesh, dim=1): - # extract the boundary nodes of mesh. - filter = ElementFilter(dimensionality=dim) - boundary_ids = filter.GetNodesIndices(mesh=mesh) - nNodes = mesh.GetNumberOfNodes() - - # calculate the distance. skinpos[i] is the projection of Mesh.nodes[i] on the boundary of Tmesh. - opSkin, statusSkin, _ = GetFieldTransferOp( - inputField=field_Tmesh, - targetPoints=mesh.nodes[boundary_ids], - method="Interp/Clamp", - elementFilter=ElementFilter(dimensionality=1), - verbose=False, - ) - skinpos = opSkin.dot(Tmesh.nodes) - - signed_distance = np.zeros(nNodes) - signed_distance[boundary_ids] = np.sqrt( - np.sum((skinpos - mesh.nodes[boundary_ids]) ** 2, axis=1) - ) - - # calculate the sign. If statusBulk[i]==1, then Mesh.nodes[i] is inside Tmesh. 
- _, statusBulk0, _ = GetFieldTransferOp( - inputField=field_Tmesh, - targetPoints=mesh.nodes[boundary_ids], - method="Interp/Clamp", - elementFilter=ElementFilter(dimensionality=2), - verbose=False, - ) - statusBulk = np.zeros(nNodes) - statusBulk[boundary_ids] = statusBulk0[:, 0] - - signed_distance[statusBulk[:] == 1] *= -1 - - return signed_distance - - -def ElasticProblem( - mesh, - extraFields, - E=5.0, - nu=0.3, - alpha=200, - formulation="vect_distance", - variable_E=False, -): - problem = UnstructuredFeaSym() - problem.fields = {f.name: f for f in extraFields} - - dim = 2 - - numberingP0 = ComputeDofNumbering( - mesh, LagrangeSpaceP0, elementFilter=ElementFilter(dimensionality=dim) - ) - volumes = abs( - UMIP.GetVolumePerElement(mesh, elementFilter=ElementFilter(dimensionality=dim)) - ) - YoungModulusField = np.zeros(len(volumes)) - YoungModulusField[:] = (1 + 1 * (max(volumes) - min(volumes)) / volumes[:]) * 0.5 - - # YoungModulusField[:]=(1*(ratio[:]/max(ratio)-min(ratio))) * 0.02 - EField = FEField( - "E", - mesh=mesh, - space=LagrangeSpaceP0, - numbering=numberingP0, - data=YoungModulusField, - ) - - problem.fields["E"] = EField - - normal_nodes = ComputeNormalsAtPoints(mesh) - space, numbering, _, _ = PrepareFEComputation(mesh, numberOfComponents=1) - a = normal_nodes[:, 0].copy(order="C") - b = normal_nodes[:, 1].copy(order="C") - normal_nodes_0 = FEField( - "normal_nodes_0", mesh=mesh, space=space, numbering=numbering[0], data=a - ) - normal_nodes_1 = FEField( - "normal_nodes_1", mesh=mesh, space=space, numbering=numbering[0], data=b - ) - - problem.fields["normal_nodes_0"] = normal_nodes_0 - problem.fields["normal_nodes_1"] = normal_nodes_1 - - P = 1 - - # the main class - - # the mecanical problem - mecaPhysics = MecaPhysics_ESM(dim=2) - mecaPhysics.SetSpaceToLagrange(P=P) - # mecaPhysics.integrationRule = "LagrangeP1Quadrature" - - # Left hand side operator - if variable_E: - mecaPhysics.AddBFormulation( - ElementFilter(dimensionality=2), mecaPhysics.GetBulkFormulation("E", nu) - ) - else: - mecaPhysics.AddBFormulation( - ElementFilter(dimensionality=2), mecaPhysics.GetBulkFormulation(E, nu) - ) - - mecaPhysics.AddBFormulation( - ElementFilter(dimensionality=1), data=mecaPhysics.WeakDirichletNormal(alpha) - ) - mecaPhysics.AddBFormulation( - ElementFilter(dimensionality=1, nTag="Ext_bound"), - data=mecaPhysics.WeakDirichletNormal(100000000), - ) - - if formulation == "signed_distance": - # term for the signed distance function formulation - - mecaPhysics.AddLFormulation( - ElementFilter(dimensionality=1), - mecaPhysics.GetPressureFormulation("signedDistance"), - ) - - mecaPhysics.AddLFormulation( - ElementFilter(dimensionality=1), data=mecaPhysics.vectDistanceFormulation() - ) - - elif formulation == "vect_distance": - # term for the vectorial distance function formulation - mecaPhysics.AddLFormulation( - ElementFilter(dimensionality=1), data=mecaPhysics.vectDistanceFormulation() - ) - mecaPhysics.AddLFormulation( - ElementFilter(dimensionality=1), - mecaPhysics.GetDistributedForceFormulation(["forceX", "forceY"]), - ) - else: - raise ValueError( - "The formulation should be 'vect_distance' or 'signed_distance' " - ) - - # set the mesh, assemble the stiffnes matrix, solve and return the solution - problem.physics.append(mecaPhysics) - mesh.ConvertDataForNativeTreatment() - problem.SetMesh(mesh) - problem.ComputeDofNumbering() - - with Timer("Assembly "): - k, f = problem.GetLinearProblem(computeK=True) - problem.solver.SetAlgo("Direct") - 
problem.ComputeConstraintsEquations() - - with Timer("Solve"): - problem.Solve(k, f) - - problem.PushSolutionVectorToUnknownFields() - - return GetPointRepresentation(problem.unknownFields) - - -def VectorialDistance_Muscat_preprocessed(mesh, tags, Tmesh_partition={}): - """Calculate for each node on the boundary of mesh with tag X, its closet point (projection) on the boundary of Tmesh with tag X. - - Args: - mesh (Mesh): current mesh. - tags (list, optional): list of strings (tag names), with tag name: string of the name of the name. - Tmesh_partition(dict): dict of dict. Tmesh_partition[tag] contains : - Tmesh_partition[tag]["mesh"]: subsampled mesh of Tmesh containing the elements of tag. - Tmesh_partition[tag]["FEField"]: FE Field on Tmesh_partition[tag]["mesh"]. - - Returns: - _numpy array_: _the vector distance function_ - """ - vectDist = np.zeros((len(mesh.nodes), 2)) - - for tag in tags: - # extract the IDs of the nodes on mesh with the current tag - mesh_Filter = ElementFilter(nTag=tag) - mesh_ids = mesh_Filter.GetNodesIndices(mesh=mesh) - - # calculate the closet point on Tmesh_tag - - Tmesh_partition[tag]["TransferOperator"].SetTargetPoints(mesh.nodes[mesh_ids]) - Tmesh_partition[tag]["TransferOperator"].Compute() - - operator = Tmesh_partition[tag]["TransferOperator"].GetOperator() - - skinpos = operator.dot(Tmesh_partition[tag]["mesh"].nodes) - - vectDist[mesh_ids] = skinpos - mesh.nodes[mesh_ids] - - return vectDist - - -def signedDistannce_Function_kokkos(Tmesh, mesh, field_Tmesh, dim=1): - # extract the boundary nodes of mesh. - filter = ElementFilter(dimensionality=dim, nTag=["Airfoil"]) - boundary_ids = filter.GetNodesIndices(mesh=mesh) - - TagMesh = UMIT.ExtractElementsByElementFilter(Tmesh, filter) - CleanLonelyNodes(TagMesh) - Tspace, Tnumberings, _, _ = PrepareFEComputation( - TagMesh, - numberOfComponents=1, - elementFilter=ElementFilter(dimensionality=1, nTag=["Airfoil"]), - ) - field_Tmesh_boundary = FEField( - "", mesh=TagMesh, space=Tspace, numbering=Tnumberings[0] - ) - - nNodes = mesh.GetNumberOfNodes() - - # calculate the distance. skinpos[i] is the projection of Mesh.nodes[i] on the boundary of Tmesh. - opSkin = GetFieldTransferOp( - inputField=field_Tmesh_boundary, - targetPoints=mesh.nodes[boundary_ids], - method="Interp/Clamp", - elementFilter=ElementFilter(dimensionality=1, nTag=["Airfoil"]), - )[0] - skinpos = opSkin.dot(TagMesh.nodes) - - signed_distance = np.zeros(nNodes) - signed_distance[boundary_ids] = np.sqrt( - np.sum((skinpos - mesh.nodes[boundary_ids]) ** 2, axis=1) - ) - - # calculate the sign. If statusBulk[i]==1, then Mesh.nodes[i] is inside Tmesh. 
- statusBulk0 = GetFieldTransferOp( - inputField=field_Tmesh, - targetPoints=mesh.nodes[boundary_ids], - method="Interp/Clamp", - elementFilter=ElementFilter(dimensionality=2), - )[1] - statusBulk = np.zeros(nNodes) - - statusBulk[boundary_ids] = statusBulk0[:, 0] - - signed_distance[statusBulk[:] == 1] *= -1 - - return signed_distance - - -def POD(data, correlationOperator, nmodes): - Nsamples = data.shape[0] - matVecProducts_x = correlationOperator.dot(data[:, :].T) - correlation_matrix_morphing = np.dot(matVecProducts_x.T, data[:, :].T) - eigenvalues_ux, eigenvectors_ux = np.linalg.eigh(correlation_matrix_morphing) - idx = eigenvalues_ux.argsort()[::-1] - eigenvalues_ux = eigenvalues_ux[idx] - eigenvectors_ux = eigenvectors_ux[:, idx] - - changeOfBasisMatrix_x = np.zeros((nmodes, Nsamples)) - for j in range(nmodes): - changeOfBasisMatrix_x[j, :] = eigenvectors_ux[:, j] / np.sqrt(eigenvalues_ux[j]) - - reducedOrderBasis_x = np.dot(changeOfBasisMatrix_x, data[:, :]) - generalizedCoordinates_x = np.dot(reducedOrderBasis_x, matVecProducts_x).T - - return reducedOrderBasis_x, generalizedCoordinates_x, eigenvalues_ux diff --git a/benchmarks/MMGP/Rotor37/README.md b/benchmarks/MMGP/Rotor37/README.md deleted file mode 100644 index 7d2e30ed..00000000 --- a/benchmarks/MMGP/Rotor37/README.md +++ /dev/null @@ -1,4 +0,0 @@ -### List of dependencies - -- [PLAID=0.1.6](https://github.com/PLAID-lib/plaid) -- [GPy=1.13.2](https://github.com/SheffieldML/GPy) diff --git a/benchmarks/MMGP/Rotor37/run_rotor37.py b/benchmarks/MMGP/Rotor37/run_rotor37.py deleted file mode 100644 index 160157fe..00000000 --- a/benchmarks/MMGP/Rotor37/run_rotor37.py +++ /dev/null @@ -1,288 +0,0 @@ -import pickle -import time - -import numpy as np -from datasets import load_dataset -from GPy.kern import RBF, Matern32, Matern52 -from GPy.models import GPRegression -from sklearn.base import BaseEstimator, RegressorMixin, clone -from sklearn.compose import ColumnTransformer, TransformedTargetRegressor -from sklearn.decomposition import PCA -from sklearn.pipeline import Pipeline -from sklearn.preprocessing import MinMaxScaler, StandardScaler - -from plaid import Sample - -dataset = load_dataset("PLAID-datasets/Rotor37", split="all_samples") - -ids_train = dataset.description["split"]["train_1000"] -ids_test = dataset.description["split"]["test"] - -out_fields_names = ["Density", "Pressure", "Temperature"] -out_scalars_names = ["Massflow", "Compression_ratio", "Efficiency"] - - -def convert_data( - ids: list[int], -) -> tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray, np.ndarray, np.ndarray]: - """Converts a list of sample IDs into structured numpy arrays containing input and output data. - - Parameters: - ---------- - ids : list[int] - list of sample indices to retrieve from the dataset. - - Returns: - ------- - nodes : np.ndarray - Flattened array of node features for each sample. - X_scalars : np.ndarray - Array containing input scalar values (Omega, Pressure) for each sample. - Y_scalars : np.ndarray - Array containing output scalar values (Massflow, Compression Ratio, Efficiency). - Y_density : np.ndarray - Array containing field values for Density across samples. - Y_pressure : np.ndarray - Array containing field values for Pressure across samples. - Y_temperature : np.ndarray - Array containing field values for Temperature across samples. 
- """ - X_scalars = [] - Y_scalars = [] - Y_density = [] - Y_pressure = [] - Y_temperature = [] - nodes = [] - - for id in ids: - sample = Sample.model_validate(pickle.loads(dataset[id]["sample"])) - - nodes.append(sample.get_nodes()) - omega = sample.get_scalar("Omega") - pressure = sample.get_scalar("P") - - density = sample.get_field("Density") - pressure_field = sample.get_field("Pressure") - temperature = sample.get_field("Temperature") - - massflow = sample.get_scalar("Massflow") - compression_ratio = sample.get_scalar("Compression_ratio") - efficiency = sample.get_scalar("Efficiency") - - X_scalars.append(np.array([omega, pressure])) - Y_scalars.append(np.array([massflow, compression_ratio, efficiency])) - Y_density.append(density) - Y_pressure.append(pressure_field) - Y_temperature.append(temperature) - - # Convert lists to numpy arrays - nodes = np.stack(nodes).reshape(len(nodes), -1) - X_scalars = np.stack(X_scalars) - Y_scalars = np.stack(Y_scalars) - Y_density = np.stack(Y_density) - Y_pressure = np.stack(Y_pressure) - Y_temperature = np.stack(Y_temperature) - - return nodes, X_scalars, Y_scalars, Y_density, Y_pressure, Y_temperature - - -class GPyRegressor(BaseEstimator, RegressorMixin): - """Custom Gaussian Process Regressor using GPy library. - - Args: - normalizer (bool): Whether to normalize the output. - kernel (str): Type of kernel to use in the Gaussian Process. - Options: 'Matern52', 'Matern32', 'Rbf'. - num_restarts (int): Number of restarts for kernel optimization. - """ - - def __init__( - self, normalizer: bool = False, kernel: str = "Matern52", num_restarts: int = 5 - ): - self.normalizer = normalizer - self.kernel = kernel - self.num_restarts = num_restarts - - def fit(self, X, y): - """Fit the Gaussian Process model to the data. - - Args: - X (ndarray): Input features of shape (n_samples, n_features). - y (ndarray): Target values of shape (n_samples,) or (n_samples, 1). - - Returns: - self: Returns the instance of the fitted model. - """ - # Reshape y to have shape (n_samples, 1) if it's 1D - if y.ndim == 1: - y = y[:, None] - - # Define the kernel based on the specified kernel type - if self.kernel == "Matern52": - kernel = Matern52(input_dim=X.shape[-1], ARD=True) - elif self.kernel == "Matern32": - kernel = Matern32(input_dim=X.shape[-1], ARD=True) - elif self.kernel == "RBF": - kernel = RBF(input_dim=X.shape[-1], ARD=True) - else: - raise ValueError("Kernel should be 'RBF', 'Matern32', or 'Matern52'") - - # Create and optimize the GP regression model - self.kmodel = GPRegression(X=X, Y=y, kernel=kernel, normalizer=self.normalizer) - self.kmodel.optimize_restarts(num_restarts=self.num_restarts, messages=False) - return self - - def predict(self, X, return_var: bool = False): - """Predict using the Gaussian Process model. - - Args: - X (ndarray): Input features of shape (n_samples, n_features). - return_var (bool): Return the predictive variance. - - Returns: - mean (ndarray): Predicted mean values for the input data - or - (mean, variance) (ndarray): Predicted mean and variance values if return_var is True. - """ - # Get the mean prediction from the GP model - mean, var = self.kmodel.predict(X) - if return_var: - return mean, np.tile(var, (1, mean.shape[-1])) - else: - return mean - - -def build_pipeline(apply_output_pca: bool = False) -> Pipeline: - """Constructs a regression pipeline that includes: - - PCA transformation on input features. - - Standard scaling of input features. - - Optional PCA transformation on the output. 
- - Gaussian Process regression using `GPyRegressor`. - - Parameters: - ---------- - apply_output_pca : bool, optional (default=False) - If True, applies PCA on the output variables in addition to scaling. - - Returns: - ------- - pipeline : Pipeline - A scikit-learn pipeline with preprocessing and regression steps. - """ - # define PCA transformation for input features - pca_transformer = [ - ( - "pca", - PCA(n_components=40), - np.arange(2, 2 + nodes_train.shape[-1]), - ) - ] - - input_preprocessor = ColumnTransformer(pca_transformer, remainder="passthrough") - - # define output preprocessing (scaling + optional PCA) - output_preprocessor = Pipeline( - steps=[ - ("scaler", MinMaxScaler()), - ("pca", PCA(n_components=80)) - if apply_output_pca - else ("identity", "passthrough"), - ] - ) - - # define regressor with output transformation - regressor = TransformedTargetRegressor( - regressor=clone(GPyRegressor()), - check_inverse=False, - transformer=output_preprocessor, - ) - - # full pipeline with input preprocessing, scaling, and regression - pipeline = Pipeline( - steps=[ - ("preprocessor", input_preprocessor), - ("scaler", StandardScaler()), - ("regressor", regressor), - ] - ) - - return pipeline - - -if __name__ == "__main__": - start = time.time() - - ( - nodes_train, - X_scalars_train, - Y_scalars_train, - Y_density_train, - Y_pressure_train, - Y_temperature_train, - ) = convert_data(ids_train) - - # Train - X_train = np.concatenate([X_scalars_train, nodes_train], axis=-1) - - # scalars - pipeline_scalars = build_pipeline(apply_output_pca=False) - pipeline_scalars.fit(X_train, Y_scalars_train) - # fields - pipeline_density = build_pipeline(apply_output_pca=True) - pipeline_density.fit(X_train, Y_density_train) - - pipeline_temperature = build_pipeline(apply_output_pca=True) - pipeline_temperature.fit(X_train, Y_temperature_train) - - pipeline_pressure = build_pipeline(apply_output_pca=True) - pipeline_pressure.fit(X_train, Y_pressure_train) - - print("duration train:", time.time() - start) - start = time.time() - - # Predict - - ( - nodes_test, - X_scalars_test, - Y_scalars_test, - Y_density_test, - Y_pressure_test, - Y_temperature_test, - ) = convert_data(ids_test) - - X_test = np.concatenate([X_scalars_test, nodes_test], axis=-1) - - predictions = {} - - y_pred = pipeline_scalars.predict(X_test) - predictions["Massflow"] = y_pred[:, 0] - predictions["Compression_ratio"] = y_pred[:, 1] - predictions["Efficiency"] = y_pred[:, 2] - - y_pred = pipeline_density.predict(X_test) - predictions["Density"] = y_pred - - y_pred = pipeline_temperature.predict(X_test) - predictions["Temperature"] = y_pred - - y_pred = pipeline_pressure.predict(X_test) - predictions["Pressure"] = y_pred - - print("duration test:", time.time() - start) - start = time.time() - - # dump - reference = [] - for i, id in enumerate(ids_test): - reference.append({}) - for fn in out_fields_names: - reference[i][fn] = predictions[fn][i] - for sn in out_scalars_names: - reference[i][sn] = predictions[sn][i] - - with open("prediction.pkl", "wb") as file: - pickle.dump(reference, file) - - # duration train: 416.71344685554504 - # duration test: 1.2284891605377197 diff --git a/benchmarks/MMGP/Tensile2d/README.md b/benchmarks/MMGP/Tensile2d/README.md deleted file mode 100644 index 317af5b9..00000000 --- a/benchmarks/MMGP/Tensile2d/README.md +++ /dev/null @@ -1,14 +0,0 @@ -The code to run this benchmark can be retrieved at https://gitlab.com/drti/mmgp. More precisely, it is located in examples/Tensile2d. 
- -The file `configuration.yml` must be eddited to indicate the location of the untarred dataset at `init_dataset_location`, and `number_Monte_Carlo_samples` can be lowered to 2, since predictive uncertainty is not of interest for this benchmark. Beside, both `number_of_modes` are set to 16. Then run: - -```python -python run.py --preprocess --train --infer --export_predictions -``` - -The prediction will be generated in a folder named `Tensile2d_predicted` located in a folder configurated under `generated_data_folder`. Finally, the prediction file can be generated using `construct_prediction.py` (locations at the top of the file must be set). - -### List of dependencies - -- [PLAID=0.1.6](https://github.com/PLAID-lib/plaid) -- [MMGP=0.0.9](https://gitlab.com/drti/mmgp) \ No newline at end of file diff --git a/benchmarks/MMGP/Tensile2d/construct_prediction.py b/benchmarks/MMGP/Tensile2d/construct_prediction.py deleted file mode 100644 index e1a26138..00000000 --- a/benchmarks/MMGP/Tensile2d/construct_prediction.py +++ /dev/null @@ -1,58 +0,0 @@ -from plaid import Dataset -from plaid import ProblemDefinition - -from tqdm import tqdm - -import os, pickle, time - -start = time.time() - - -plaid_location = # to update - -pb_defpath=os.path.join(plaid_location, "problem_definition") - -predicted_data_dir= "data/Tensile2d_predicted/dataset" - -dataset_pred = Dataset() -dataset_pred._load_from_dir_(predicted_data_dir, verbose=True, processes_number=4) - -problem = ProblemDefinition() -problem._load_from_dir_(pb_defpath) - -ids_train = problem.get_split('train_500') -ids_test = problem.get_split('test') - - -n_train = len(ids_train) -n_test = len(ids_test) - -out_fields_names = ['U1', 'U2', 'sig11', 'sig22', 'sig12'] -out_scalars_names = ['max_von_mises', 'max_U2_top', 'max_sig22_top'] -nbe_features = len(out_fields_names) + len(out_scalars_names) - - -prediction = [] - -count = 0 -for sample_index in tqdm(ids_test): - - sample_pred = dataset_pred[sample_index] - - prediction.append({}) - for fn in out_fields_names: - prediction[count][fn] = sample_pred.get_field(fn+"_predicted") - for sn in out_scalars_names: - prediction[count][sn] = sample_pred.get_scalar(sn+"_predicted") - - count += 1 - -with open('prediction_tensile2d.pkl', 'wb') as file: - pickle.dump(prediction, file) - -print("duration construct predictions =", time.time()-start) -# 4 seconds - -# preprocess done in 112.36455297470093 s -# train done in 1396.3768372535706 s -# inference done in 86.41910243034363 s diff --git a/benchmarks/MMGP/VKI-LS59/AM_POD.py b/benchmarks/MMGP/VKI-LS59/AM_POD.py deleted file mode 100644 index fe141462..00000000 --- a/benchmarks/MMGP/VKI-LS59/AM_POD.py +++ /dev/null @@ -1,552 +0,0 @@ -import time -from itertools import combinations_with_replacement - -import numpy as np -from joblib import Parallel, delayed -from scipy.optimize import least_squares - - -class PolynomialManifoldApproximation: - def __init__( - self, - polynomial_order, - r, - q=None, - podtype="am", - reg_ls=1e-3, - reg_nt=None, - tol_rot=1e-6, - tol_nt=1e-8, - max_iter_rot=100, - max_iter_nt=None, - n_jobs=-1, - verbose=False, - ): - """This is an implementation of the paper https://arxiv.org/pdf/2306.13748. - - Initialize the class for polynomial manifold approximation. - - Parameters: - polynomial_order (int): Maximum order of the polynomial (P). - r (int): Number of principal modes to retain in the POD base. - q (int or None): number of complements modes to retain in the POD base. 
- podtype (str): - pod: for pod reduction, - - poly: pod + polynomial correction reduction, - - am: for adaptative manifold reduction - reg_ls (float): Regularization parameter for the coefficients (L2 penalty) in least square problem. - reg_nt (float or None): Regularization parameter for the coefficients (L2 penalty) in Newthon solver. - tol_rot (float): Tolerance for iterative convergence for the rotation space. - tol_nt (float): Tolerance for iterative convergence for the Newton. - max_iter_rot (int): Maximum number of iterations for the optimization loop of the rotation matrix. - max_iter_nt (int): Maximum number of iterations for the Newthon loop. - n_jobs (int): number of cpu - verbose (bool): True to display optimization iterations. - """ - assert podtype in ["pod", "poly", "am"] - if polynomial_order == 1 and podtype != "pod": - if verbose: - print( - f"WARNING: podtype is set to <{podtype:s}> but polynomial_order is <{polynomial_order:1}>. A podtype will be used instead" - ) - podtype = "pod" - self.polynomial_order = polynomial_order - self.r = r - self.q = q if q is None else r + q - self.type = podtype - self.reg_ls = reg_ls - self.reg_nt = reg_nt - self.tol_rot = tol_rot - self.tol_nt = tol_nt if tol_nt > 0 or tol_nt is None else 1e-8 - self.max_iter_rot = max_iter_rot - self.max_iter_nt = max_iter_nt - self.n_jobs = n_jobs - self.mean = None - self.V = None # Principal modes (from SVD) - self.Vb = None # Extra principal modes (from SVD) - self.psi = None # polynomial correction - self.is_fitted = False - self.is_start = False - self.verbose = verbose - - def _center_data(self, S): - """Center the data by subtracting the mean from each feature. - - Parameters: - S (np.ndarray): Snapshot matrix (N, Ns). - - Returns: - np.ndarray: Centered snapshot matrix. - """ - if not self.is_fitted: - self.mean = np.mean(S, axis=1, keepdims=True) - return S - np.tile(self.mean, (1, S.shape[1])) - - def reduce(self, S): - """POD projection. - - Parameters: - S (np.ndarray): Snapshot matrix (N, Ns). - - Returns: - np.ndarray: reduced Snapshot matrix (r, Ns). - """ - return self.V.T @ S - - def _generate_combinations(self, coefficients): - """Generate the W matrix containing all polynomial combinations up to the specified order. - - Parameters: - coefficients (np.ndarray): Coefficients from the projection (r, Ns). - - Returns: - W (np.ndarray): Array of polynomial combinations. 
(p, Ns) - """ - coeff_dim = coefficients.shape[0] - num_snapshots = coefficients.shape[1] - - # Genera tutte le combinazioni di indici fino al grado polinomiale specificato - combs = list( - combinations_with_replacement(range(coeff_dim), 2) - ) # Polynomial degree 2 - - for deg in range(3, self.polynomial_order + 1): # Extend for higher degrees - combs.extend(combinations_with_replacement(range(coeff_dim), deg)) - - num_combinations = len(combs) - W = np.empty((num_combinations, num_snapshots)) - - # Funzione per calcolare la colonna idx di W - def compute_column(snapshot): - return np.array([np.prod(snapshot[list(combo)]) for combo in combs]) - - if num_snapshots > 1: - # Parallelizzazione solo se ci sono più di una colonna - results = Parallel(n_jobs=-1)( - delayed(compute_column)(snapshot) for snapshot in coefficients.T - ) - W[:, :] = np.column_stack(results) - else: - # Se c'è solo una colonna, esegui il calcolo sequenziale - W[:, 0] = compute_column(coefficients.T[0]) - - return W - - def score(self, S_test): - S_test = S_test.T - assert self.is_fitted - norm_S_test = np.linalg.norm(self._center_data(S_test), ord="fro") - S_reduced_test = self.reduce(self._center_data(S_test)) - W_test = self._generate_combinations(S_reduced_test) - return abs( - 1 - - np.linalg.norm( - self.V @ S_reduced_test + self.Vb @ self.psi @ W_test, ord="fro" - ) - / norm_S_test - ) - - def fit(self, S, S_test=None): - """Fit the model by iteratively solving the optimization loop. - - Parameters: - S (np.ndarray): Snapshot matrix (N, Ns). - """ - S = S.T - if S_test is not None: - S_test = S_test.T - assert self.r < S.shape[1] - if self.q is not None: - assert self.q < S.shape[1] - - S_centered = self._center_data(S) # (N x Ns) - norm_S = np.linalg.norm(S_centered, ord="fro") - if S_test is not None: - norm_S_test = np.linalg.norm(self._center_data(S_test), ord="fro") - # Initial SVD and initialization - U, sig, _ = np.linalg.svd(S_centered, full_matrices=False) - if self.verbose: - print(40 * "=") - if self.verbose: - print( - f"POD computed cumulated expected variance : {np.cumsum(sig**2)[self.r] / np.sum(sig**2):.4f}" - ) - if self.verbose: - print(40 * "=") - if not self.is_start: - # compute standard POD - self.V = U[:, : self.r] # (N x r) - self.Vb = U[:, self.r : self.q] # (N x q) - S_reduce = self.reduce(S_centered) # (r x Ns) - error_pod = 1 - np.linalg.norm(self.V @ S_reduce, ord="fro") / norm_S - - # compute polynomial correction - now = time.time() - W = self._generate_combinations(S_reduce) - self.psi = np.linalg.lstsq( - W @ W.T + self.reg_ls * np.eye(W.shape[0]), - (self.Vb.T @ S_centered @ W.T).T, - rcond=None, - )[0].T # (q x Ns) - polynomial_error = ( - 1 - - np.linalg.norm(self.V @ S_reduce + self.Vb @ self.psi @ W, ord="fro") - / norm_S - ) - SW = np.concatenate([S_reduce.T, (self.psi @ W).T], axis=1) - old_error = np.copy(polynomial_error) - if self.verbose: - print( - f"---> compute polynomial correction {(time.time() - now):.4f} seconds" - ) - if self.type == "pod": - self.psi *= 0 - self.is_fitted = True - return self - elif self.type == "poly": - self.is_fitted = True - return self - self.is_start = True - self.iter_step = 0 - else: - # compute standard POD - V = U[:, : self.r] # (N x r) - Vb = U[:, self.r : self.q] # (N x q) - S_reduce = self.reduce(S_centered) # (r x Ns) - error_pod = 1 - np.linalg.norm(V @ S_reduce, ord="fro") / norm_S - # compute polynomial correction - now = time.time() - W = self._generate_combinations(S_reduce) - psi = np.linalg.lstsq( - W @ W.T + self.reg_ls 
* np.eye(W.shape[0]), - (Vb.T @ S_centered @ W.T).T, - rcond=None, - )[0].T # (q x Ns) - polynomial_error = ( - 1 - np.linalg.norm(V @ S_reduce + Vb @ psi @ W, ord="fro") / norm_S - ) - if self.verbose: - print( - f"---> compute polynomial correction {(time.time() - now):.4f} seconds" - ) - # restore last AM iteration - now = time.time() - S_reduce = self.restart_reduce - W = self._generate_combinations(S_reduce) - SW = np.concatenate([S_reduce.T, (self.psi @ W).T], axis=1) - old_error = ( - 1 - - np.linalg.norm(self.V @ S_reduce + self.Vb @ self.psi @ W, ord="fro") - / norm_S - ) - if self.verbose: - print( - f"---> restore last AM iteration {(time.time() - now):.4f} seconds" - ) - - if S_test is not None: - S_reduced_test = self.reduce(self._center_data(S_test)) - W_test = self._generate_combinations(S_reduced_test) - old_error_test = abs( - 1 - - np.linalg.norm( - self.V @ S_reduced_test + self.Vb @ self.psi @ W_test, ord="fro" - ) - / norm_S_test - ) - if self.verbose: - print( - f"---> pod {error_pod:.4e} vs polynomial {polynomial_error:.4e} vs AM-polynomial {old_error:.4e} vs test {old_error_test:.4e}" - ) - else: - if self.verbose: - print( - f"---> pod {error_pod:.4e} vs polynomial {polynomial_error:.4e} vs AM-polynomial {old_error:.4e}" - ) - - if self.verbose: - print(40 * "=") - if self.verbose: - print("initialization completed") - if self.verbose: - print(40 * "=") - - for iter_loop in range(self.max_iter_rot): - if self.verbose: - print(f"AM iteration {self.iter_step}") - V_old = np.copy(self.V) - Vb_old = np.copy(self.Vb) - psi_old = np.copy(self.psi) - - # Step 1: Solve the Orthogonal Procrustes Problem - now = time.time() - Omega = self.solve_orthogonal_procrustes(S_centered, SW) # (N x [r+q]) - if self.verbose: - print( - f"---> solve procrustes problem in {(time.time() - now):.4f} seconds" - ) - - # Step 2: Update V and Vb - self.V = Omega[:, : self.r] - self.Vb = Omega[:, self.r :] - - # Step 3: Optimize SW using Levenberg-Marquardt - now = time.time() - S_reduce = self._levenberg_marquardt(S_centered, Omega) - self.restart_reduce = S_reduce - if self.verbose: - print( - f"---> solve levenberg marquardt problem in {(time.time() - now):.4f} seconds" - ) - - # Step 4: Compute W, and psi - W = self._generate_combinations(S_reduce) # (p x Ns) - - now = time.time() - self.psi = np.linalg.lstsq( - W @ W.T + self.reg_ls * np.eye(W.shape[0]), - (self.Vb.T @ S_centered @ W.T).T, - rcond=None, - )[0].T # (q x Ns) - if self.verbose: - print( - f"---> least square problem solved in {(time.time() - now):.4f} seconds" - ) - - # Step 5: Compute SW and approximation error - SW = np.concatenate([S_reduce.T, (self.psi @ W).T], axis=1) # (Ns x [r+q]) - new_error = ( - 1 - - np.linalg.norm(self.V @ S_reduce + self.Vb @ self.psi @ W, ord="fro") - / norm_S - ) - if S_test is None: - if self.verbose: - print( - f"---> pod {error_pod:.4e} vs polynomial {polynomial_error:.4e} vs AM-polynomial {new_error:.4e}" - ) - else: - S_reduced_test = self.reduce(self._center_data(S_test)) - W_test = self._generate_combinations(S_reduced_test) - new_error_test = abs( - 1 - - np.linalg.norm( - self.V @ S_reduced_test + self.Vb @ self.psi @ W_test, ord="fro" - ) - / norm_S_test - ) - if self.verbose: - print( - f"---> pod {error_pod:.4e} vs polynomial {polynomial_error:.4e} vs AM-polynomial {new_error:.4e} vs test {new_error_test:.4e}" - ) - - if S_test is not None and (new_error_test > old_error_test): - if self.verbose: - print("early stop criteria") - self.V = np.copy(V_old) - self.Vb = np.copy(Vb_old) - 
self.psi = np.copy(psi_old) - self.is_fitted = True - break - elif abs(old_error - new_error) < self.tol_rot: - if self.verbose: - print("converged criteria") - self.is_fitted = True - break - - old_error = np.copy(new_error) - if S_test is not None: - old_error_test = np.copy(new_error_test) - self.iter_step += 1 - - self.is_fitted = True - return self - - def solve_orthogonal_procrustes(self, S_centered, SW): - """Solve the Orthogonal Procrustes Problem: minimize ||S_centered - Omega SW|| - subject to Omega^T Omega = I. - - Parameters: - S_centered (np.ndarray): Centered snapshot matrix (N, Ns). - SW (np.ndarray): concatenated [S_reduced.T (psi@W).T] (Ns, Ns) - - Returns: - np.ndarray: Orthogonal matrix Omega of shape (N, Ns). - """ - A = np.dot(S_centered, SW) - U, _, Vt = np.linalg.svd(A, full_matrices=False) - Omega = np.dot(U, Vt) - - return Omega - - def _levenberg_marquardt(self, S, Omega): - """Solve the minimization problem ||S - Omega X(coefficients)|| using Levenberg-Marquardt, - parallelizing the computation over columns of S. - - Parameters: - S (np.ndarray): Snapshot vector of shape (N, Ns). - Omega (np.ndarray): Matrix of shape (N, r + p) such that Omega^T Omega = I. - - Returns: - S_reduced (np.ndarray): Optimized coefficients vector of shape (r, Ns). - """ - num_cols = S.shape[1] # Number of columns in S - S_reduced = self.reduce(S) # Reduce the dimension of S - - def optimize_column(i): - """Optimize a single column using Levenberg-Marquardt.""" - x0 = S_reduced[:, i].ravel() # Flatten initial guess - result = least_squares( - lambda x: self._residu(x, S[:, i], Omega), - x0, - jac=lambda x: self._jacobian(x, S[:, i], Omega), - method="trf", - max_nfev=self.max_iter_nt * S.shape[0] * self.polynomial_order - if self.max_iter_nt is not None - else None, - ftol=self.tol_nt, - xtol=self.tol_nt, - gtol=self.tol_nt, - ) - return result.x # Return optimized solution for this column - - # Run parallel optimization for each column - results = Parallel(n_jobs=self.n_jobs)( - delayed(optimize_column)(i) for i in range(num_cols) - ) - - # Convert list of results into an array - X_opt = np.column_stack(results) - - return X_opt # Return the optimized matrix - - def _residu(self, coefficients, S_col, Omega): - """Compute the residual for a single column of S in the Levenberg-Marquardt optimization. - - Parameters: - coefficients (np.ndarray): Parameters to be optimized (in the reduced space). - S_col (np.ndarray): A single column of the original data matrix S. - Omega (np.ndarray): Weight matrix. - - Returns: - np.ndarray: Residuals as a 1D array. - """ - coefficients = coefficients.reshape(-1, 1) - g = self._generate_combinations( - coefficients - ) # Generate polynomial combinations - X = np.concatenate([coefficients, self.psi @ g], axis=0) # Construct the matrix - - resid = ( - Omega.T @ S_col.reshape(-1, 1) - X - ) # Compute the residual for this column - return resid.ravel() # Return a 1D vector for least_squares - - def _jacobian(self, coefficients, S_col, Omega): - """Compute the Jacobian matrix of the residual function. - - Parameters: - coefficients (np.ndarray): Parameters to be optimized (in the reduced space). - S_col (np.ndarray): A single column of the original data matrix S. - Omega (np.ndarray): Weight matrix. - - Returns: - np.ndarray: Jacobian matrix of shape (num_residuals, num_parameters). 
- """ - coefficients = coefficients.reshape(-1, 1) # Ensure correct shape - - # Compute Jacobian with respect to coefficients - J_g = self._compute_combinations_jacobian( - coefficients - ) # Custom function to compute derivatives - - J_psi_g = self.psi @ J_g # Apply transformation psi - - # Construct the full Jacobian matrix - J = -np.concatenate([np.eye(coefficients.shape[0]), J_psi_g], axis=0) - - if self.reg_nt is None or self.reg_nt == 0.0: - return J - else: - JtJ = J.T @ J # Shape (r, r) - diag_JtJ = np.diag(JtJ) # Extract diagonal - - # Add regularization term - regularized_JtJ = JtJ + self.reg_nt * np.diag( - diag_JtJ - ) # Regularized J.T @ J - # Compute Cholesky decomposition - L = np.linalg.cholesky(regularized_JtJ).T # Shape (r, r) - - # Solve for a modified J with the same shape as the original J (N, r) - J = J @ np.linalg.inv(L) - return J - - def _compute_combinations_jacobian(self, coefficients): - """Compute the Jacobian of the polynomial combinations matrix W with respect to coefficients. - - Parameters: - coefficients (np.ndarray): Coefficients from the projection (r, Ns). - - Returns: - J (np.ndarray): Jacobian matrix of polynomial combinations. Shape: (p, r, Ns) - where p is the number of polynomial combinations. - """ - coeff_dim = coefficients.shape[0] # Number of coefficients (r) - combs = [] # store combinations - - # Generate all polynomial combinations - for deg in range(2, self.polynomial_order + 1): - combs.extend(combinations_with_replacement(range(coeff_dim), deg)) - - num_combinations = len(combs) # Total number of polynomial terms - J = np.zeros( - (num_combinations, coeff_dim, coefficients.shape[1]) - ) # Jacobian storage - - # Compute derivatives for each column (snapshot) - for idx, snapshot in enumerate(coefficients.T): # Iterate over snapshots - for row_idx, combo in enumerate( - combs - ): # Iterate over polynomial combinations - combo = list(combo) # Convert to list for indexing - - # Compute the polynomial term itself - w_i = np.prod(snapshot[combo]) - - # Compute derivatives w.r.t. each coefficient - for j in range(coeff_dim): # Iterate over coefficients - if j in combo: - # Compute derivative using the product rule - partial_derivative = ( - w_i / snapshot[j] * combo.count(j) - if snapshot[j] != 0 - else 0 - ) - J[row_idx, j, idx] = partial_derivative # Store in Jacobian - - return J.squeeze() # Shape: (p, r, Ns) - - def transform(self, S): - S = S.T - assert self.is_fitted - assert self.V.shape[0] == S.shape[0] - - S_centered = self._center_data(S) - S_reduce = self.reduce(S_centered) - return S_reduce.T - - def fit_transform(self, S_train, S_test): - self = self.fit(S_train, S_test) - return self.transform(S_train) - - def inverse_transform(self, S_reduce): - S_reduce = S_reduce.T - assert self.is_fitted - assert self.V.shape[1] == S_reduce.shape[0] - W = self._generate_combinations(S_reduce) - S = ( - self.V @ S_reduce - + self.Vb @ self.psi @ W - + np.tile(self.mean, (1, S_reduce.shape[1])) - ) - return S.T diff --git a/benchmarks/MMGP/VKI-LS59/README.md b/benchmarks/MMGP/VKI-LS59/README.md deleted file mode 100644 index 9da8f2e4..00000000 --- a/benchmarks/MMGP/VKI-LS59/README.md +++ /dev/null @@ -1,6 +0,0 @@ -The code to run this benchmark entry is documented in `VKI_study.ipynb`. 
- -### List of dependencies - -- [PLAID=0.1.6](https://github.com/PLAID-lib/plaid) -- [GPy=1.13.2](https://github.com/SheffieldML/GPy) \ No newline at end of file diff --git a/benchmarks/MMGP/VKI-LS59/VKI_study.py b/benchmarks/MMGP/VKI-LS59/VKI_study.py deleted file mode 100644 index 0ef0e2ff..00000000 --- a/benchmarks/MMGP/VKI-LS59/VKI_study.py +++ /dev/null @@ -1,231 +0,0 @@ -# --- -# jupyter: -# jupytext: -# text_representation: -# extension: .py -# format_name: percent -# format_version: '1.3' -# jupytext_version: 1.17.3 -# kernelspec: -# display_name: 2dblade -# language: python -# name: python3 -# --- - -# %% -# %reload_ext autoreload -# %autoreload 2 -import time - -import numpy as np -from AM_POD import PolynomialManifoldApproximation -from data import extract_split_data, make_kfold_splits -from tqdm import tqdm - -# 1) Load the train split and build 5 folds -inputs, outputs = extract_split_data("train") -folds = make_kfold_splits(inputs, outputs, n_splits=5, random_state=42) - -# %% -# 2) Define the grid of (polynomial_order, r) to search -param_grid = [(p, r) for p in [1, 2, 3] for r in [5, 10, 15, 20, 25, 30, 35, 40]] - -results_mach = {} -all_errors = [] -print("=== Searching best parameters for MACH ===") -for poly_order, r in param_grid: - print(f"Testing polynomial_order={poly_order}, r={r} for MACH") - t0 = time.time() - fold_errors = [] - - with tqdm( - total=len(folds), desc=f" MACH p={poly_order}, r={r}", leave=False - ) as pbar: - for fold_idx, (train_in, train_out, val_in, val_out) in enumerate(folds): - # prepare snapshots for mach - S_train = np.stack(train_out["mach"], axis=0) - S_val = np.stack(val_out["mach"], axis=0) - - pma = PolynomialManifoldApproximation( - polynomial_order=poly_order, - r=r, - ) - pma.fit(S_train, S_val) - err = pma.score(S_val) - fold_errors.append(err) - - # update the tqdm bar - avg_err = np.mean(fold_errors) - pbar.set_postfix({"last_err": f"{err:.4e}", "avg_err": f"{avg_err:.4e}"}) - pbar.update() - - all_errors.append( - {"poly_order": poly_order, "r": r, "fold": fold_idx, "error": err} - ) - - avg_err = np.mean(fold_errors) - elapsed = time.time() - t0 - results_mach[(poly_order, r)] = avg_err - print( - f"--> Done p={poly_order}, r={r}: avg_error={avg_err:.4f} (time {elapsed:.1f}s)" - ) - print("_" * 80) - -best_mach = min(results_mach, key=results_mach.get) -print( - f"\nBest for MACH → polynomial_order={best_mach[0]}, r={best_mach[1]} " - f"(avg_val_error={results_mach[best_mach]:.4f})" -) - -# %% -from utils import CVResults - -results_mach = CVResults(all_errors) -results_mach.print_summary() - -# %% -# 2) Define the grid of (polynomial_order, r) to search -param_grid = [(p, r) for p in [1, 2, 3] for r in [5, 10, 15, 20, 25, 30, 35, 40]] - -# 3b) Search best params for 'nut' -results_nut = {} -all_errors_nut = [] - -print("=== Searching best parameters for NUT ===") -for poly_order, r in param_grid: - print(f"\nTesting polynomial_order={poly_order}, r={r} for NUT") - t0 = time.time() - fold_errors = [] - - with tqdm( - total=len(folds), desc=f" NUT p={poly_order}, r={r}", leave=False - ) as pbar: - for fold_idx, (train_in, train_out, val_in, val_out) in enumerate(folds): - # prepare snapshots for nut - S_train = np.stack(train_out["nut"], axis=0) - S_val = np.stack(val_out["nut"], axis=0) - - pma = PolynomialManifoldApproximation( - polynomial_order=poly_order, - r=r, - reg_ls=0.001, - ) - pma.fit(S_train, S_val) - err = pma.score(S_val) - fold_errors.append(err) - - # update the tqdm bar - avg_err = np.mean(fold_errors) - 
pbar.set_postfix({"last_err": f"{err:.4e}", "avg_err": f"{avg_err:.4e}"}) - pbar.update() - - all_errors_nut.append( - {"poly_order": poly_order, "r": r, "fold": fold_idx, "error": err} - ) - - avg_err = np.mean(fold_errors) - elapsed = time.time() - t0 - results_nut[(poly_order, r)] = avg_err - print( - f"--> Done p={poly_order}, r={r}: avg_error={avg_err:.4f} (time {elapsed:.1f}s)" - ) - print("_" * 80) - -best_nut = min(results_nut, key=results_nut.get) -print( - f"\nBest for NUT → polynomial_order={best_nut[0]}, r={best_nut[1]} " - f"(avg_val_error={results_nut[best_nut]:.4f})" -) - -# %% -results_nut = CVResults(all_errors_nut) -results_nut.print_summary() - -# %% -# %reload_ext autoreload -# %autoreload 2 -from processor import InputProcessor, OutputProcessor - -inputprocessor = InputProcessor(explained_variance=0.99999) -X = inputprocessor.fit_transform(inputs) - -# %% -print(inputprocessor.n_components_) - -# %% -outputprocessor = OutputProcessor( - mach_params=(3, 5), nut_params=(1, 40), verbose=True, max_iter_rot=22 -) -Y = outputprocessor.fit_transform(outputs) - -# %% -from model import GPyRegressor - -# one single GP -gpmodel = GPyRegressor() -gpmodel.fit(X, Y) - -# %% -inputs_test, outputs_test = extract_split_data("test") - -# %% -X_test = inputprocessor.transform(inputs_test) -Y_test = gpmodel.predict(X_test) -outputs_test_pred = outputprocessor.inverse_transform(Y_test) - -# %% -import matplotlib.pyplot as plt -from utils import plot_mach_nut, plot_scalars_pred_vs_true - -fig, axs = plot_mach_nut(inputs_test, outputs_test, outputs_test_pred, idx=100) -plt.show() - -# %% -fig, axs = plot_scalars_pred_vs_true(outputs_test, outputs_test_pred) - -# %% -from data import dump_predictions - -dump_predictions(outputs_test_pred, "predictions.pkl") - -# %% -from joblib import Parallel, delayed -from model import GPyRegressor - - -def train_single_output(X, y): - gp = GPyRegressor() - gp.fit(X, y) - return gp - - -# X: (n_samples, n_features) -# Y: (n_samples, n_outputs) -n_outputs = Y.shape[1] - -# Parallel training: use all cores (n_jobs=-1), and print progress (verbose=10) -gp_models = Parallel(n_jobs=-1, verbose=10)( - delayed(train_single_output)(X, Y[:, j]) for j in range(n_outputs) -) - -# %% -len(gp_models) - -# %% -X_test = inputprocessor.transform(inputs_test) -Y_test = np.column_stack([gp.predict(X_test) for gp in gp_models]) -outputs_test_pred = outputprocessor.inverse_transform(Y_test) - -# %% -import matplotlib.pyplot as plt -from utils import plot_mach_nut, plot_scalars_pred_vs_true - -fig, axs = plot_mach_nut(inputs_test, outputs_test, outputs_test_pred, idx=100) - -# %% -fig, axs = plot_scalars_pred_vs_true(outputs_test, outputs_test_pred) - -# %% -from data import dump_predictions - -dump_predictions(outputs_test_pred, "predictions.pkl") diff --git a/benchmarks/MMGP/VKI-LS59/data.py b/benchmarks/MMGP/VKI-LS59/data.py deleted file mode 100644 index 4d2eb25b..00000000 --- a/benchmarks/MMGP/VKI-LS59/data.py +++ /dev/null @@ -1,168 +0,0 @@ -import pickle -from pathlib import Path -from typing import Any, Literal, Optional - -from datasets import load_dataset -from sklearn.model_selection import KFold - -from plaid.bridges.huggingface_bridge import huggingface_dataset_to_plaid - - -def extract_split_data( - split: Literal["train", "test", "traintest"], -) -> tuple[dict[str, list[Any]], dict[str, list[Any]]]: - """Extract input and output dictionaries for all samples in a given split of a Plaid dataset. 
- - Args: - dataset_dir (str): Path to the directory containing the 'huggingface' subfolder. - split (Literal["train", "test", "traintest"]): Which split to extract; - 'train', 'test', or 'traintest' (concatenation of train and test). - - Returns: - inputs (dict[str, list[Any]]): Keys are mesh/scalar names; values are lists of sample data. - outputs (dict[str, list[Any]]): Keys are selected field/scalar names; values are lists of sample data. - """ - # 1) Load the HuggingFace dataset from disk - hf_dataset = load_dataset("PLAID-datasets/VKI-LS59", split="all_samples") - - # 2) Convert to Plaid and retrieve the problem definition - plaid_dataset, problem_definition = huggingface_dataset_to_plaid(hf_dataset) - - # 3) Get sample indices for the requested split - if split == "traintest": - ids = problem_definition.get_split("train") + problem_definition.get_split( - "test" - ) - else: - ids = problem_definition.get_split(split) - - # 4) Determine the base mesh name from the first sample - sample0 = plaid_dataset[ids[0]] - base_name = sample0.get_base_names()[0] - - # 5) Retrieve the names of all input scalars - input_scalars = problem_definition.get_input_scalars_names() - - # Define exactly which outputs to extract - FIELD_OUTPUTS = ["mach", "nut"] - SCALAR_OUTPUTS = ["Q", "power", "Pr", "Tr", "eth_is", "angle_out"] - - inputs: dict[str, list[Any]] = {} - outputs: dict[str, list[Any]] = {} - - # --- INPUTS --- - # Mesh node coordinates - inputs["nodes"] = [plaid_dataset[i].get_nodes(base_name=base_name) for i in ids] - - # Input scalar values - for key in input_scalars: - inputs[key] = [plaid_dataset[i].get_scalar(key) for i in ids] - - # --- OUTPUTS --- - # Selected mesh field data - for field_name in FIELD_OUTPUTS: - outputs[field_name] = [ - plaid_dataset[i].get_field(field_name, base_name=base_name) for i in ids - ] - - # Selected output scalar values - for key in SCALAR_OUTPUTS: - outputs[key] = [plaid_dataset[i].get_scalar(key) for i in ids] - - return inputs, outputs - - -def make_kfold_splits( - inputs: dict[str, list[Any]], - outputs: dict[str, list[Any]], - n_splits: int = 5, - shuffle: bool = True, - random_state: Optional[int] = None, -) -> list[ - tuple[ - dict[str, list[Any]], # train inputs - dict[str, list[Any]], # train outputs - dict[str, list[Any]], # val inputs - dict[str, list[Any]], # val outputs - ] -]: - """Split inputs and outputs into K folds for cross‑validation. - - Args: - inputs (dict[str, list[Any]]): - Dictionary of input data where each key maps to a list of samples. - outputs (dict[str, list[Any]]): - Dictionary of output data where each key maps to a list of samples. - n_splits (int, optional): - Number of folds. Defaults to 5. - shuffle (bool, optional): - Whether to shuffle the data before splitting. Defaults to True. - random_state (Optional[int], optional): - Seed for reproducible shuffling. Defaults to None. 
- - Returns: - list of tuples, one per fold, each containing: - - train_inputs (dict) - - train_outputs (dict) - - val_inputs (dict) - - val_outputs (dict) - """ - # Number of samples inferred from the length of any input list - n_samples = len(next(iter(inputs.values()))) - kf = KFold(n_splits=n_splits, shuffle=shuffle, random_state=random_state) - - splits: list[ - tuple[ - dict[str, list[Any]], - dict[str, list[Any]], - dict[str, list[Any]], - dict[str, list[Any]], - ] - ] = [] - - for train_idx, val_idx in kf.split(range(n_samples)): - # Build train/val dicts by indexing into the lists - train_inputs = {k: [v[i] for i in train_idx] for k, v in inputs.items()} - val_inputs = {k: [v[i] for i in val_idx] for k, v in inputs.items()} - train_outputs = {k: [v[i] for i in train_idx] for k, v in outputs.items()} - val_outputs = {k: [v[i] for i in val_idx] for k, v in outputs.items()} - - splits.append((train_inputs, train_outputs, val_inputs, val_outputs)) - - return splits - - -def dump_predictions( - outputs_pred: dict[str, list[Any]], filename: str = "predictions.pkl" -) -> None: - """Dump predicted outputs to a pickle file with the same structure as the reference. - - Args: - outputs_pred (dict[str, list[Any]]): - Predicted outputs containing keys - 'nut', 'mach', 'Q', 'power', 'Pr', 'Tr', 'eth_is', 'angle_out'. - filename (str): Path to the output .pkl file. - """ - FIELD_OUTPUTS = ["nut", "mach"] - SCALAR_OUTPUTS = ["Q", "power", "Pr", "Tr", "eth_is", "angle_out"] - - # Build a list of dicts, one per sample - n_samples = len(outputs_pred[FIELD_OUTPUTS[0]]) - predictions = [] - for i in range(n_samples): - rec: dict[str, Any] = {} - for fn in FIELD_OUTPUTS: - rec[fn] = outputs_pred[fn][i] - for sn in SCALAR_OUTPUTS: - rec[sn] = outputs_pred[sn][i] - predictions.append(rec) - - # Ensure output directory exists - dump_path = Path(filename) - dump_path.parent.mkdir(parents=True, exist_ok=True) - - # Write to pickle - with dump_path.open("wb") as f: - pickle.dump(predictions, f) - - print(f"Predictions successfully dumped to '{dump_path}'") diff --git a/benchmarks/MMGP/VKI-LS59/model.py b/benchmarks/MMGP/VKI-LS59/model.py deleted file mode 100644 index ee94d318..00000000 --- a/benchmarks/MMGP/VKI-LS59/model.py +++ /dev/null @@ -1,88 +0,0 @@ - -import GPy -import numpy as np -from sklearn.base import BaseEstimator, RegressorMixin - - -class GPyRegressor(BaseEstimator, RegressorMixin): - """Custom Gaussian Process Regressor using GPy library. - - Args: - normalizer (bool): Whether to normalize the output. - constant_mean (bool): Whether to use a constant mean model. - kernel (str): Type of kernel to use in the Gaussian Process. - Options: 'Matern52', 'Matern32', 'Rbf'. - num_restarts (int): Number of restarts for kernel optimization. - """ - - def __init__( - self, - normalizer: bool = False, - constant_mean: bool = False, - kernel: str = "Matern52", - num_restarts: int = 5, - ): - self.normalizer = normalizer - self.constant_mean = constant_mean - self.kernel = kernel - self.num_restarts = num_restarts - - def fit(self, X, y): - """Fit the Gaussian Process model to the data. - - Args: - X (ndarray): Input features of shape (n_samples, n_features). - y (ndarray): Target values of shape (n_samples,) or (n_samples, 1). - - Returns: - self: Returns the instance of the fitted model. 
- """ - # Reshape y to have shape (n_samples, 1) if it's 1D - if y.ndim == 1: - y = y[:, None] - - # Define the kernel based on the specified kernel type - if self.kernel == "Matern52": - kernel = GPy.kern.Matern52(input_dim=X.shape[-1], ARD=True) - elif self.kernel == "Matern32": - kernel = GPy.kern.Matern32(input_dim=X.shape[-1], ARD=True) - elif self.kernel == "RBF": - kernel = GPy.kern.RBF(input_dim=X.shape[-1], ARD=True) - else: - raise ValueError("Kernel should be 'RBF', 'Matern32', or 'Matern52'") - - mean_function = None - if self.constant_mean: - mean_function = GPy.mappings.Constant( - input_dim=X.shape[-1], output_dim=y.shape[-1] - ) - - # Create and optimize the GP regression model - self.kmodel = GPy.models.GPRegression( - X=X, - Y=y, - kernel=kernel, - normalizer=self.normalizer, - mean_function=mean_function, - ) - self.kmodel.optimize_restarts(num_restarts=self.num_restarts, messages=False) - return self - - def predict(self, X, return_var: bool = False): - """Predict using the Gaussian Process model. - - Args: - X (ndarray): Input features of shape (n_samples, n_features). - return_var (bool): Return the predictive variance. - - Returns: - mean (ndarray): Predicted mean values for the input data - or - (mean, variance) (ndarray): Predicted mean and variance values if return_var is True. - """ - # Get the mean prediction from the GP model - mean, var = self.kmodel.predict(X) - if return_var: - return mean, np.tile(var, (1, mean.shape[-1])) - else: - return mean diff --git a/benchmarks/MMGP/VKI-LS59/processor.py b/benchmarks/MMGP/VKI-LS59/processor.py deleted file mode 100644 index 90852abe..00000000 --- a/benchmarks/MMGP/VKI-LS59/processor.py +++ /dev/null @@ -1,280 +0,0 @@ -# processor.py - -from typing import Any, Optional - -import numpy as np -from AM_POD import PolynomialManifoldApproximation -from sklearn.decomposition import PCA -from sklearn.preprocessing import StandardScaler - - -class InputProcessor: - """Preprocesses input data by applying PCA to the mesh node coordinates - and standard-scaling the resulting PCA components concatenated with - additional input scalars. - - Args: - explained_variance (float): The fraction of variance to preserve - in the PCA decomposition (e.g. 0.99999). - """ - - def __init__(self, explained_variance: float): - self.explained_variance = explained_variance - self.pca: PCA = None - self.scaler: StandardScaler = None - self.n_components_: int = None - self.node_shape_: tuple = None # to reconstruct original shape - - def fit(self, inputs: dict[str, list[np.ndarray]]) -> "InputProcessor": - """Fit the PCA on flattened node arrays and fit a StandardScaler - on the concatenation of PCA components + the two input scalars. - - Args: - inputs: dict with keys: - - 'nodes': list of np.ndarray of shape (n_nodes, 2) - - 'angle_in': list of floats - - 'mach_out': list of floats - - Returns: - self - """ - # Extract raw data - nodes_list = inputs["nodes"] - angle_list = inputs["angle_in"] - mach_list = inputs["mach_out"] - - # Remember original node shape for inverse transform - self.node_shape_ = nodes_list[0].shape # e.g. 
(36421, 2) - - # Flatten each nodes array to shape (n_nodes*2,) - X_nodes = np.stack([n.reshape(-1) for n in nodes_list], axis=0) - - # PCA to reduce dimensionality, preserving desired variance - self.pca = PCA(n_components=self.explained_variance, svd_solver="full") - X_nodes_pca = self.pca.fit_transform(X_nodes) - self.n_components_ = self.pca.n_components_ - - # Build combined feature matrix: [ PCA components | angle_in | mach_out ] - X_combined = np.hstack( - [ - X_nodes_pca, - np.array(angle_list).reshape(-1, 1), - np.array(mach_list).reshape(-1, 1), - ] - ) - - # Standard normalization - self.scaler = StandardScaler() - self.scaler.fit(X_combined) - - return self - - def transform(self, inputs: dict[str, list[np.ndarray]]) -> np.ndarray: - """Apply the fitted PCA and StandardScaler to new data. - - Args: - inputs: dict with same structure as in fit(). - - Returns: - A 2D numpy array of shape (n_samples, n_pca_components + 2). - """ - nodes_list = inputs["nodes"] - angle_list = inputs["angle_in"] - mach_list = inputs["mach_out"] - - X_nodes = np.stack([n.reshape(-1) for n in nodes_list], axis=0) - X_nodes_pca = self.pca.transform(X_nodes) - - X_combined = np.hstack( - [ - X_nodes_pca, - np.array(angle_list).reshape(-1, 1), - np.array(mach_list).reshape(-1, 1), - ] - ) - - return self.scaler.transform(X_combined) - - def fit_transform(self, inputs: dict[str, list[np.ndarray]]) -> np.ndarray: - """Fit PCA and scaler on inputs, then transform and return processed data in one step. - - Args: - inputs: dict with same structure as in fit(). - - Returns: - A 2D numpy array of shape (n_samples, n_pca_components + 2). - """ - self.fit(inputs) - return self.transform(inputs) - - def inverse_transform(self, X_transformed: np.ndarray) -> dict[str, list[Any]]: - """Reconstruct approximate original nodes and scalars from the processed data. - - Args: - X_transformed: Array of shape (n_samples, n_pca_components + 2) - as output by transform(). - - Returns: - dict with keys: - - 'nodes': list of np.ndarray of shape original (n_nodes, 2) - - 'angle_in': list of floats - - 'mach_out': list of floats - """ - # Undo standard scaling - X_combined = self.scaler.inverse_transform(X_transformed) - - # Split back into PCA components and scalars - X_nodes_pca = X_combined[:, : self.n_components_] - angle_arr = X_combined[:, self.n_components_] - mach_arr = X_combined[:, self.n_components_ + 1] - - # Inverse PCA to reconstruct flattened nodes - X_nodes_flat = self.pca.inverse_transform(X_nodes_pca) - - # Reshape each to original mesh shape - nodes_list = [flat.reshape(self.node_shape_) for flat in X_nodes_flat] - - return { - "nodes": nodes_list, - "angle_in": list(angle_arr), - "mach_out": list(mach_arr), - } - - -class OutputProcessor: - """Preprocesses outputs by reducing mesh fields 'mach' and 'nut' via - PolynomialManifoldApproximation and standard-scaling the concatenation - of reduced fields and additional scalar outputs. - - Args: - mach_params (tuple[int, int]): (polynomial_order, r) for 'mach' field. - nut_params (tuple[int, int]): (polynomial_order, r) for 'nut' field. - podtype (str): 'pod', 'poly', or 'am' reduction type. - reg_ls (float): L2 regularization for least squares. - reg_nt (Optional[float]): L2 regularization for Newton solver. - tol_rot (float): Tolerance for rotation convergence. - tol_nt (float): Tolerance for Newton solver. - max_iter_rot (int): Max iterations for rotation optimization. - max_iter_nt (Optional[int]): Max iter for Newton loop. - n_jobs (int): Number of parallel jobs. 
- verbose (bool): Verbosity flag. - """ - - def __init__( - self, - mach_params: tuple[int, int], - nut_params: tuple[int, int], - podtype: str = "am", - reg_ls: float = 0.01, - reg_nt: Optional[float] = None, - tol_rot: float = 1e-6, - tol_nt: float = 1e-8, - max_iter_rot: int = 100, - max_iter_nt: Optional[int] = None, - n_jobs: int = -1, - verbose: bool = False, - ): - polynomial_order_m, r_m = mach_params - polynomial_order_n, r_n = nut_params - self.pma_mach = PolynomialManifoldApproximation( - polynomial_order=polynomial_order_m, - r=r_m, - q=None, - podtype=podtype, - reg_ls=reg_ls, - reg_nt=reg_nt, - tol_rot=tol_rot, - tol_nt=tol_nt, - max_iter_rot=max_iter_rot, - max_iter_nt=max_iter_nt, - n_jobs=n_jobs, - verbose=verbose, - ) - self.pma_nut = PolynomialManifoldApproximation( - polynomial_order=polynomial_order_n, - r=r_n, - q=None, - podtype=podtype, - reg_ls=reg_ls, - reg_nt=reg_nt, - tol_rot=tol_rot, - tol_nt=tol_nt, - max_iter_rot=max_iter_rot, - max_iter_nt=max_iter_nt, - n_jobs=n_jobs, - verbose=verbose, - ) - self.scaler: StandardScaler = None - self.r_mach = r_m - self.r_nut = r_n - - def fit( - self, - outputs_train: dict[str, list[Any]], - outputs_test: dict[str, list[Any]] | None = None, - ) -> "OutputProcessor": - """Fit PMA models on train/test data and a StandardScaler on combined features.""" - # Prepare field snapshots (samples x features) - S_train_mach = np.stack(outputs_train["mach"], axis=0) - S_train_nut = np.stack(outputs_train["nut"], axis=0) - if outputs_test is not None: - S_test_mach = np.stack(outputs_test["mach"], axis=0) - S_test_nut = np.stack(outputs_test["nut"], axis=0) - else: - S_test_mach = None - S_test_nut = None - - # Fit manifold approximations - self.pma_mach.fit(S_train_mach, S_test_mach) - self.pma_nut.fit(S_train_nut, S_test_nut) - - # Transform fields - X_mach = self.pma_mach.transform(S_train_mach) - X_nut = self.pma_nut.transform(S_train_nut) - - # Stack scalar outputs - scalar_keys = ["Q", "power", "Pr", "Tr", "eth_is", "angle_out"] - X_scalars = np.vstack([np.array(outputs_train[k]) for k in scalar_keys]).T - - # Combine and scale - X = np.hstack([X_mach, X_nut, X_scalars]) - self.scaler = StandardScaler().fit(X) - return self - - def transform(self, outputs: dict[str, list[Any]]) -> np.ndarray: - """Transform outputs using fitted PMA and scaler.""" - S_mach = np.stack(outputs["mach"], axis=0) - S_nut = np.stack(outputs["nut"], axis=0) - X_mach = self.pma_mach.transform(S_mach) - X_nut = self.pma_nut.transform(S_nut) - scalar_keys = ["Q", "power", "Pr", "Tr", "eth_is", "angle_out"] - X_scalars = np.vstack([np.array(outputs[k]) for k in scalar_keys]).T - X = np.hstack([X_mach, X_nut, X_scalars]) - return self.scaler.transform(X) - - def fit_transform( - self, - outputs_train: dict[str, list[Any]], - outputs_test: dict[str, list[Any]] | None = None, - ) -> np.ndarray: - """Fit PMA and scaler then transform train outputs.""" - self.fit(outputs_train, outputs_test) - return self.transform(outputs_train) - - def inverse_transform(self, X_transformed: np.ndarray) -> dict[str, list[Any]]: - """Inverse transform to reconstruct approximate original outputs.""" - X_comb = self.scaler.inverse_transform(X_transformed) - n1 = self.r_mach - n2 = self.r_nut - X_mach = X_comb[:, :n1] - X_nut = X_comb[:, n1 : n1 + n2] - scalars = X_comb[:, n1 + n2 :] - - mach_arr = self.pma_mach.inverse_transform(X_mach) - nut_arr = self.pma_nut.inverse_transform(X_nut) - mach_list = [mach_arr[i] for i in range(mach_arr.shape[0])] - nut_list = [nut_arr[i] for i in 
range(nut_arr.shape[0])] - scalar_keys = ["Q", "power", "Pr", "Tr", "eth_is", "angle_out"] - scalar_dict = {k: list(scalars[:, idx]) for idx, k in enumerate(scalar_keys)} - - return {"mach": mach_list, "nut": nut_list, **scalar_dict} diff --git a/benchmarks/MMGP/VKI-LS59/utils.py b/benchmarks/MMGP/VKI-LS59/utils.py deleted file mode 100644 index d78d4b7c..00000000 --- a/benchmarks/MMGP/VKI-LS59/utils.py +++ /dev/null @@ -1,215 +0,0 @@ -import matplotlib.pyplot as plt -import numpy as np -import pandas as pd - - -class CVResults: - def __init__(self, records: list): - """records: list of dicts with keys 'poly_order','r','fold','error'.""" - import pandas as pd - - self.df = pd.DataFrame(records) - - def to_dataframe(self) -> "pd.DataFrame": - """Return the raw DataFrame.""" - return self.df - - def print_summary(self): - """Print per‑fold errors with scientific‑notation, plus avg and max.""" - import pandas as pd - - # pivot: rows = (poly_order, r), cols = fold - pivot = self.df.pivot_table( - index=["poly_order", "r"], columns="fold", values="error" - ) - # add avg and max - pivot["avg_error"] = pivot.mean(axis=1) - pivot["max_error"] = pivot.max(axis=1) - - # Set pandas to use scientific notation for floats - pd.set_option("display.float_format", "{:.4e}".format) - - print("\nPer‑fold errors with average and maximum (scientific notation):\n") - print(pivot) - - # Reset to default float format - pd.reset_option("display.float_format") - - -def plot_mach_nut( - inputs: dict, - outputs_true: dict, - outputs_pred: dict, - idx: int, - grid_shape: tuple = (301, 121), - figsize: tuple = (20, 20), - levels: int = 200, -) -> tuple: - """Plot side-by-side contour plots for 'mach' and 'nut' fields of a given sample, - and their prediction errors below with a symmetric red-blue colormap (white at zero). - - Args: - inputs (dict): Must contain key 'nodes' -> list of flattened arrays shape (nx*ny*2,). - outputs_true (dict): True outputs with keys 'mach', 'nut' -> lists of flattened arrays shape (nx*ny,). - outputs_pred (dict): Predicted outputs with keys 'mach', 'nut' -> lists of flattened arrays shape (nx*ny,). - idx (int): Index of the sample to plot. - grid_shape (tuple): (nx, ny) dimensions to reshape nodes and fields. - figsize (tuple): Size of the figure as (width, height). - levels (int): Number of contour levels for plotting. - - Returns: - fig (plt.Figure), axs (np.ndarray): Matplotlib figure and axes array. 
- """ - nx, ny = grid_shape - node_flat = inputs["nodes"][idx] - nodes = node_flat.reshape(nx, ny, 2) - - mach_pred = outputs_pred["mach"][idx].reshape(nx, ny) - nut_pred = outputs_pred["nut"][idx].reshape(nx, ny) - - mach_true = outputs_true["mach"][idx].reshape(nx, ny) - nut_true = outputs_true["nut"][idx].reshape(nx, ny) - - # Compute vertical shift between first and last rows - s = nodes[0, -1, 1] - nodes[0, 0, 1] - - # Create 2x2 subplots - fig, axs = plt.subplots(2, 2, figsize=figsize) - - # Row 1: Predicted fields - cf0 = axs[0, 0].contourf(nodes[:, :, 0], nodes[:, :, 1], mach_pred, levels) - axs[0, 0].contourf(nodes[:, :, 0], nodes[:, :, 1] - s, mach_pred, levels) - fig.colorbar(cf0, ax=axs[0, 0], orientation="vertical", aspect=10) - axs[0, 0].set_title(f"Sample {idx} - MACH Prediction") - axs[0, 0].set_aspect("equal") - - cf1 = axs[0, 1].contourf(nodes[:, :, 0], nodes[:, :, 1], nut_pred, levels) - axs[0, 1].contourf(nodes[:, :, 0], nodes[:, :, 1] - s, nut_pred, levels) - fig.colorbar(cf1, ax=axs[0, 1], orientation="vertical", aspect=10) - axs[0, 1].set_title(f"Sample {idx} - NUT Prediction") - axs[0, 1].set_aspect("equal") - - # Row 2: Prediction errors with symmetric colormap around zero - err_mach = mach_pred - mach_true - # Determine symmetric range - max_err_m = np.max(np.abs(err_mach)) - cf2 = axs[1, 0].contourf( - nodes[:, :, 0], - nodes[:, :, 1], - err_mach, - levels, - cmap="RdBu", - vmin=-max_err_m, - vmax=max_err_m, - ) - axs[1, 0].contourf( - nodes[:, :, 0], - nodes[:, :, 1] - s, - err_mach, - levels, - cmap="RdBu", - vmin=-max_err_m, - vmax=max_err_m, - ) - fig.colorbar(cf2, ax=axs[1, 0], orientation="vertical", aspect=10) - axs[1, 0].set_title(f"Sample {idx} - MACH Error") - axs[1, 0].set_aspect("equal") - - err_nut = nut_pred - nut_true - max_err_n = np.max(np.abs(err_nut)) - cf3 = axs[1, 1].contourf( - nodes[:, :, 0], - nodes[:, :, 1], - err_nut, - levels, - cmap="RdBu", - vmin=-max_err_n, - vmax=max_err_n, - ) - axs[1, 1].contourf( - nodes[:, :, 0], - nodes[:, :, 1] - s, - err_nut, - levels, - cmap="RdBu", - vmin=-max_err_n, - vmax=max_err_n, - ) - fig.colorbar(cf3, ax=axs[1, 1], orientation="vertical", aspect=10) - axs[1, 1].set_title(f"Sample {idx} - NUT Error") - axs[1, 1].set_aspect("equal") - - plt.tight_layout() - return fig, axs - - -def plot_scalars_pred_vs_true( - outputs_true: dict, - outputs_pred: dict, - scalar_keys: list = None, - figsize: tuple = (15, 10), - marker: str = "o", - diagonal_color: str = "r", -) -> tuple: - """Plot predicted vs true scatter for each scalar with a diagonal reference line. - - Args: - outputs_true (dict): True scalar values, keys = scalar names, values = lists of floats or scalars - outputs_pred (dict): Predicted scalar values, same structure as outputs_true - scalar_keys (list, optional): List of scalar keys to plot. If None, automatically - select only those keys whose values are lists of scalars. - figsize (tuple): Figure size (width, height). - marker (str): Marker style for scatter. - diagonal_color (str): Color of the y=x diagonal line. - - Returns: - fig (plt.Figure), axs (np.ndarray): Matplotlib figure and axes array. 
- """ - # Auto-select keys if not provided: only lists of scalar values - if scalar_keys is None: - scalar_keys = [ - key - for key, vals in outputs_true.items() - if isinstance(vals, list) and vals and all(np.isscalar(v) for v in vals) - ] - else: - # Filter provided keys - scalar_keys = [ - key - for key in scalar_keys - if key in outputs_true - and isinstance(outputs_true[key], list) - and outputs_true[key] - and all(np.isscalar(v) for v in outputs_true[key]) - ] - - n = len(scalar_keys) - ncols = min(3, n) - nrows = (n + ncols - 1) // ncols - fig, axs = plt.subplots(nrows, ncols, figsize=figsize, squeeze=False) - - for idx, key in enumerate(scalar_keys): - row = idx // ncols - col = idx % ncols - ax = axs[row][col] - - y_true = np.array(outputs_true[key]) - y_pred = np.array(outputs_pred[key]) - - ax.scatter(y_true, y_pred, marker=marker) - - # plot diagonal - vmin = min(y_true.min(), y_pred.min()) - vmax = max(y_true.max(), y_pred.max()) - ax.plot([vmin, vmax], [vmin, vmax], diagonal_color) - - ax.set_title(key) - ax.set_xlabel("True") - ax.set_ylabel("Predicted") - - # Hide unused axes - for idx in range(n, nrows * ncols): - fig.delaxes(axs.flatten()[idx]) - - plt.tight_layout() - return fig, axs diff --git a/benchmarks/README.md b/benchmarks/README.md index 22b6a2ef..a8f87d3c 100644 --- a/benchmarks/README.md +++ b/benchmarks/README.md @@ -1,22 +1,4 @@ ## PLAID Benchmarks -This folder contains the code used to generate the baselines for the [PLAID Benchmarks](https://huggingface.co/PLAIDcompetitions), with the exception of the Augur results, which are produced using a [commercial solution](https://augurco.fr/). These benchmarks are interactive: anyone can participate by submitting their own model predictions. Each benchmark includes the corresponding dataset, along with documentation on how to use it and how to submit a solution. +Theses benchmarks where mouved to a new repository. https://github.com/PLAID-lib/plaid-benchmarks. - -### Benchmark results, as of August 5, 2025: -| Dataset | MGN | MMGP | Vi-Transf. | Augur | FNO | MARIO | -|-------------------|-----|------|------------|-------|-------|-------| -| `Tensile2d` | 0.0673 | **0.0026** | 0.0116 | 0.0154 | 0.0123 | *0.0038* | -| `2D_MultiScHypEl` | 0.0437 | ❌ | 0.0325 | **0.0232** | *0.0302* | 0.0573 | -| `2D_ElPlDynamics` | 0.1202 | ❌ | *0.0227* | 0.0346 | **0.0215** | 0.0319 | -| `Rotor37` | 0.0074 | **0.0014** | 0.0029 | 0.0033 | 0.0313 | *0.0017* | -| `2D_profile` | 0.0593 | 0.0365 | *0.0312* | 0.0425 | 0.0972 | **0.0307** | -| `VKI-LS59` | 0.0684 | 0.0312 | *0.0193* | 0.0267 | 0.0215 | **0.0124** | - -❌: Not compatible with topology variation - -### Additional notes: -- **MMGP** does not support variable mesh topologies, which limits its applicability to certain datasets and often necessitates custom preprocessing for new cases. However, when morphing is either unnecessary or inexpensive, it offers a highly efficient solution, combining fast training with good accuracy (e.g., `Tensile2d` and `Rotor37`). -- **MARIO** is computationally expensive to train but achieves consistently a very strong performance across most datasets. Its result on `2D_MultiScHypEl` is slightly worse than other tested methods, which may reflect the challenge of capturing complex shape variability in these cases. -- **Vi-Transformer** and **Augur** perform well across all datasets, showing strong versatility and generalization capabilities. 
-- **FNO** suffers on datasets featuring unstructured meshes with pronounced anisotropies, due to the loss of accuracy introduced by projections to and from regular grids (e.g., `Rotor37` and `2D_profile`). Additionally, the use of a 3D regular grid on `Rotor37` results in substantial computational overhead. diff --git a/benchmarks/Vi-Transf/README.md b/benchmarks/Vi-Transf/README.md deleted file mode 100644 index 71a413af..00000000 --- a/benchmarks/Vi-Transf/README.md +++ /dev/null @@ -1,50 +0,0 @@ -# PLAID Vision Transformer benchmarks - -## Training and submission -The Vi-Transformer benchmark models can be ran on the corresponding datasets as follows: - -### Stationary datasets - -#### 2D_MultiScHypEl -``` -python main_stationary.py --config-name 2d_multiscale_hyperelasticity.yaml -``` - -#### 2D_profile -``` -python main_stationary.py --config-name 2d_profile.yaml -``` - -#### Rotor37 -``` -python main_stationary.py --config-name rotor37.yaml -``` - -#### Tensile2d -``` -python main_stationary.py --config-name tensile2d.yaml -``` - -#### VKI-LS59 -``` -python main_stationary.py --config-name vkils59.yaml -``` - -### Non-stationary datasets - -#### 2D_ElPlDynamics -``` -python main_elasto_plasto_dynamics.py -``` - - -## Dependencies -- [PLAID=0.1.6](https://github.com/PLAID-lib/plaid) -- [PyTorch=2.7.0](https://pytorch.org/) -- [Einops=0.8.1](https://pypi.org/project/einops/) -- [Muscat=2.4.1](https://gitlab.com/drti/muscat) -- [PyTorchGeometric=2.6.1](https://pytorch-geometric.readthedocs.io/en/latest/) -- [Hydra=1.3.2](https://hydra.cc/docs/intro/) -- [Pymetis=2025.0.1](https://github.com/inducer/pymetis) -- [Omegaconf=2.3.0](https://omegaconf.readthedocs.io/en/2.3_branch/) -- [TorchTbProfiler=0.4.3](https://pypi.org/project/torch-tb-profiler/) diff --git a/benchmarks/Vi-Transf/configs/2d_elasto_plasto_dynamics.yaml b/benchmarks/Vi-Transf/configs/2d_elasto_plasto_dynamics.yaml deleted file mode 100644 index 291727aa..00000000 --- a/benchmarks/Vi-Transf/configs/2d_elasto_plasto_dynamics.yaml +++ /dev/null @@ -1,64 +0,0 @@ -bridge: - function: - _target_: src.data.loader.bridges.elasto_plasto_dynamics.elasto_plasto_dynamics_sample_to_geometric - _partial_: true - data_info: - name: 2d_elasto_plasto_dynamics - geometry_dimension: 2 - input_field_dim: 7 - input_scalar_dim: 1 - output_field_dim: 2 - output_scalar_dim: 0 -loader: - _target_: src.data.loader.hub_loader.HubLoader - bridge: ${bridge.function} - dataset_name: PLAID-datasets/2D_ElastoPlastoDynamics - cache_dir: cached_datasets/2D_ElastoPlastoDynamics - task_split: - - train - - test - processes_number: 1 -scaler: - _target_: src.data.scaler.StandardScaler -model: - _target_: src.models.vits.flatformer_cls_less.FlatFormerCLSLess - n_vertices_per_subdomain: 100 - n_head: 16 - dim_ff: 128 - num_layers: 2 - activation: gelu - dropout: 0.1 - latent_dim: 256 - input_field_dim: ${bridge.data_info.input_field_dim} - input_scalar_dim: ${bridge.data_info.input_scalar_dim} - output_field_dim: ${bridge.data_info.output_field_dim} - output_scalar_dim: ${bridge.data_info.output_scalar_dim} - tokenizer: - _target_: src.models.vits.tokenization.temporal_flatten_tokenizer.TemporalFlattenTokenizer - partitioner: - _target_: src.models.vits.tokenization.partitioners.metis_partitioner.MetisPartitioner - processes_number: 1 - n_vertices_per_subdomain: ${model.n_vertices_per_subdomain} - absolute_tol: 1 - relative_tol: 0.1 - tokenization_type: morton - processes_number: 1 - output_field_dim: ${model.output_field_dim} - preprocessing: - train_seed: 
270 - val_seed: 92 - test_seed: 42 -score_fn: - _target_: src.evaluation.score.compute_score - _partial_: true - base_name: null -training: - lbda: 1 - test_size: 0.2 - batch_size: 500 - num_epochs: 50 - optimizer: - _target_: torch.optim.AdamW - lr: 0.0001 - weight_decay: 1.0e-05 -seed: 5 diff --git a/benchmarks/Vi-Transf/configs/2d_multiscale_hyperelasticity.yaml b/benchmarks/Vi-Transf/configs/2d_multiscale_hyperelasticity.yaml deleted file mode 100644 index 7a1d242c..00000000 --- a/benchmarks/Vi-Transf/configs/2d_multiscale_hyperelasticity.yaml +++ /dev/null @@ -1,63 +0,0 @@ -dataset: - name: 2d_multiscale_hyperelasticity - dimension: 2 -bridge: - function: - _target_: src.data.loader.bridges.multiscale_sample_to_geometric.multiscale_sample_to_geometric - _partial_: true - data_info: - name: 2d_multiscale_hyperelasticity - geometry_dimension: 2 - input_field_dim: 5 - input_scalar_dim: 3 - output_field_dim: 7 - output_scalar_dim: 1 -loader: - _target_: src.data.loader.hub_loader.HubLoader - bridge: ${bridge.function} - dataset_name: PLAID-datasets/2D_Multiscale_Hyperelasticity - cache_dir: cached_datasets/2D_Multiscale_Hyperelasticity - task_split: - - DOE_train - - DOE_test - processes_number: 1 -scaler: - _target_: src.data.scaler.StandardScaler -model: - _target_: src.models.vits.flatformer_cls_less.FlatFormerCLSLess - n_vertices_per_subdomain: 10 - n_head: 16 - dim_ff: 512 - num_layers: 5 - activation: gelu - dropout: 0.1 - latent_dim: 1024 - input_field_dim: ${bridge.data_info.input_field_dim} - input_scalar_dim: ${bridge.data_info.input_scalar_dim} - output_field_dim: ${bridge.data_info.output_field_dim} - output_scalar_dim: ${bridge.data_info.output_scalar_dim} - tokenizer: - _target_: src.models.vits.tokenization.flatten_tokenizer.FlattenTokenizer - partitioner: - _target_: src.models.vits.tokenization.partitioners.metis_partitioner.MetisPartitioner - processes_number: 1 - n_vertices_per_subdomain: ${model.n_vertices_per_subdomain} - absolute_tol: 1 - relative_tol: 0.1 - tokenization_type: morton - processes_number: 1 - output_field_dim: ${model.output_field_dim} - preprocessing: - train_seed: 27 - val_seed: 37 - test_seed: 66 -training: - lbda: 0.5 - test_size: 0.2 - batch_size: 50 - num_epochs: 500 - optimizer: - _target_: torch.optim.AdamW - lr: 0.0001 - weight_decay: 1.0e-05 -seed: 5 diff --git a/benchmarks/Vi-Transf/configs/2d_profile.yaml b/benchmarks/Vi-Transf/configs/2d_profile.yaml deleted file mode 100644 index 5aa5c256..00000000 --- a/benchmarks/Vi-Transf/configs/2d_profile.yaml +++ /dev/null @@ -1,60 +0,0 @@ -bridge: - function: - _target_: src.data.loader.bridges.profile_sample_to_geometric.profile_sample_to_geometric - _partial_: true - data_info: - name: 2d_profile - geometry_dimension: 2 - input_field_dim: 5 - input_scalar_dim: 0 - output_field_dim: 4 - output_scalar_dim: 0 -loader: - _target_: src.data.loader.hub_loader.HubLoader - bridge: ${bridge.function} - dataset_name: PLAID-datasets/2D_profile - cache_dir: cached_datasets/2D_profile - task_split: - - train - - test - processes_number: 1 -scaler: - _target_: src.data.scaler.StandardScaler -model: - _target_: src.models.vits.flatformer_cls_less.FlatFormerCLSLess - n_vertices_per_subdomain: 25 - n_head: 16 - dim_ff: 512 - num_layers: 5 - activation: gelu - dropout: 0.1 - latent_dim: 1024 - input_field_dim: ${bridge.data_info.input_field_dim} - input_scalar_dim: ${bridge.data_info.input_scalar_dim} - output_field_dim: ${bridge.data_info.output_field_dim} - output_scalar_dim: 
${bridge.data_info.output_scalar_dim} - tokenizer: - _target_: src.models.vits.tokenization.flatten_tokenizer.FlattenTokenizer - partitioner: - _target_: src.models.vits.tokenization.partitioners.metis_partitioner.MetisPartitioner - processes_number: 1 - n_vertices_per_subdomain: ${model.n_vertices_per_subdomain} - absolute_tol: 1 - relative_tol: 0.1 - tokenization_type: simple - processes_number: 1 - output_field_dim: ${model.output_field_dim} - preprocessing: - train_seed: 777 - val_seed: 523 - test_seed: 175 -training: - lbda: 0.5 - test_size: 0.2 - batch_size: 10 - num_epochs: 1500 - optimizer: - _target_: torch.optim.AdamW - lr: 0.0001 - weight_decay: 1.0e-05 -seed: 99 diff --git a/benchmarks/Vi-Transf/configs/rotor37.yaml b/benchmarks/Vi-Transf/configs/rotor37.yaml deleted file mode 100644 index ae20433d..00000000 --- a/benchmarks/Vi-Transf/configs/rotor37.yaml +++ /dev/null @@ -1,60 +0,0 @@ -bridge: - function: - _target_: src.data.loader.bridges.rotor_sample_to_geometric.rotor_sample_to_geometric - _partial_: true - data_info: - name: rotor37 - geometry_dimension: 3 - input_field_dim: 6 - input_scalar_dim: 2 - output_field_dim: 3 - output_scalar_dim: 3 -loader: - _target_: src.data.loader.hub_loader.HubLoader - bridge: ${bridge.function} - dataset_name: PLAID-datasets/Rotor37 - cache_dir: cached_datasets/Rotor37 - task_split: - - train_1000 - - test - processes_number: 1 -scaler: - _target_: src.data.scaler.StandardScaler -model: - _target_: src.models.vits.flatformer_cls_less.FlatFormerCLSLess - n_vertices_per_subdomain: 30 - n_head: 16 - dim_ff: 512 - num_layers: 5 - activation: gelu - dropout: 0.1 - latent_dim: 1024 - input_field_dim: ${bridge.data_info.input_field_dim} - input_scalar_dim: ${bridge.data_info.input_scalar_dim} - output_field_dim: ${bridge.data_info.output_field_dim} - output_scalar_dim: ${bridge.data_info.output_scalar_dim} - tokenizer: - _target_: src.models.vits.tokenization.flatten_tokenizer.FlattenTokenizer - partitioner: - _target_: src.models.vits.tokenization.partitioners.metis_partitioner.MetisPartitioner - processes_number: 1 - n_vertices_per_subdomain: ${model.n_vertices_per_subdomain} - absolute_tol: 1 - relative_tol: 0.1 - tokenization_type: simple - processes_number: 1 - output_field_dim: ${model.output_field_dim} - preprocessing: - train_seed: 27 - val_seed: 37 - test_seed: 66 -training: - lbda: 0.5 - test_size: 0.2 - batch_size: 15 - num_epochs: 460 - optimizer: - _target_: torch.optim.AdamW - lr: 5.0e-05 - weight_decay: 1.0e-05 -seed: 5 diff --git a/benchmarks/Vi-Transf/configs/tensile2d.yaml b/benchmarks/Vi-Transf/configs/tensile2d.yaml deleted file mode 100644 index aef14dd0..00000000 --- a/benchmarks/Vi-Transf/configs/tensile2d.yaml +++ /dev/null @@ -1,60 +0,0 @@ -bridge: - function: - _target_: src.data.loader.bridges.tensile_sample_to_geometric.tensile_sample_to_geometric - _partial_: true - data_info: - name: tensile2d - geometry_dimension: 2 - input_field_dim: 5 - input_scalar_dim: 6 - output_field_dim: 5 - output_scalar_dim: 3 -loader: - _target_: src.data.loader.hub_loader.HubLoader - bridge: ${bridge.function} - dataset_name: PLAID-datasets/Tensile2d - cache_dir: cached_datasets/Tensile2d - task_split: - - train_500 - - test - processes_number: 1 -scaler: - _target_: src.data.scaler.StandardScaler -model: - _target_: src.models.vits.flatformer_cls_less.FlatFormerCLSLess - n_vertices_per_subdomain: 10 - n_head: 16 - dim_ff: 512 - num_layers: 5 - activation: gelu - dropout: 0.1 - latent_dim: 1024 - input_field_dim: 
${bridge.data_info.input_field_dim} - input_scalar_dim: ${bridge.data_info.input_scalar_dim} - output_field_dim: ${bridge.data_info.output_field_dim} - output_scalar_dim: ${bridge.data_info.output_scalar_dim} - tokenizer: - _target_: src.models.vits.tokenization.flatten_tokenizer.FlattenTokenizer - partitioner: - _target_: src.models.vits.tokenization.partitioners.metis_partitioner.MetisPartitioner - processes_number: 1 - n_vertices_per_subdomain: ${model.n_vertices_per_subdomain} - absolute_tol: 1 - relative_tol: 0.1 - tokenization_type: morton - processes_number: 1 - output_field_dim: ${model.output_field_dim} - preprocessing: - train_seed: 777 - val_seed: 523 - test_seed: 175 -training: - lbda: 0.5 - test_size: 0.2 - batch_size: 30 - num_epochs: 500 - optimizer: - _target_: torch.optim.AdamW - lr: 0.0001 - weight_decay: 1.0e-05 -seed: 270 diff --git a/benchmarks/Vi-Transf/configs/vkils59.yaml b/benchmarks/Vi-Transf/configs/vkils59.yaml deleted file mode 100644 index 768085c5..00000000 --- a/benchmarks/Vi-Transf/configs/vkils59.yaml +++ /dev/null @@ -1,60 +0,0 @@ -bridge: - function: - _target_: src.data.loader.bridges.vki_sample_to_geometric.vki_sample_to_geometric - _partial_: true - data_info: - name: vkils59 - geometry_dimension: 2 - input_field_dim: 3 - input_scalar_dim: 2 - output_field_dim: 2 - output_scalar_dim: 6 -loader: - _target_: src.data.loader.hub_loader.HubLoader - bridge: ${bridge.function} - dataset_name: PLAID-datasets/VKI-LS59 - cache_dir: cached_datasets/VKI-LS59 - task_split: - - train - - test - processes_number: 1 -scaler: - _target_: src.data.scaler.StandardScaler -model: - _target_: src.models.vits.flatformer_cls_less.FlatFormerCLSLess - n_vertices_per_subdomain: 20 - n_head: 16 - dim_ff: 512 - num_layers: 5 - activation: gelu - dropout: 0.1 - latent_dim: 1024 - input_field_dim: ${bridge.data_info.input_field_dim} - input_scalar_dim: ${bridge.data_info.input_scalar_dim} - output_field_dim: ${bridge.data_info.output_field_dim} - output_scalar_dim: ${bridge.data_info.output_scalar_dim} - tokenizer: - _target_: src.models.vits.tokenization.flatten_tokenizer.FlattenTokenizer - partitioner: - _target_: src.models.vits.tokenization.partitioners.metis_partitioner.MetisPartitioner - processes_number: 1 - n_vertices_per_subdomain: ${model.n_vertices_per_subdomain} - absolute_tol: 1 - relative_tol: 0.1 - tokenization_type: morton - processes_number: 1 - output_field_dim: ${model.output_field_dim} - preprocessing: - train_seed: 777 - val_seed: 523 - test_seed: 175 -training: - lbda: 0.5 - test_size: 0.2 - batch_size: 10 - num_epochs: 500 - optimizer: - _target_: torch.optim.AdamW - lr: 0.0001 - weight_decay: 1.0e-05 -seed: 270 diff --git a/benchmarks/Vi-Transf/main_elasto_plasto_dynamics.py b/benchmarks/Vi-Transf/main_elasto_plasto_dynamics.py deleted file mode 100644 index 912cea7b..00000000 --- a/benchmarks/Vi-Transf/main_elasto_plasto_dynamics.py +++ /dev/null @@ -1,354 +0,0 @@ -import datetime -import logging -import os -import pickle -import time - -import hydra -import torch -from omegaconf import OmegaConf -from sklearn.model_selection import train_test_split -from torch.utils.tensorboard import SummaryWriter -from torch_geometric.data import Data -from torch_geometric.loader import DataLoader - -from src.data.utils import split_temporal_pyg_train_test - - -def seed_everything(seed: int): - import random - - import numpy as np - import torch - - random.seed(seed) - np.random.seed(seed) - torch.manual_seed(seed) - torch.cuda.manual_seed_all(seed) - - -class 
NoWarningFilter(logging.Filter): - def filter(self, record): - return record.levelno != logging.WARNING - - -def setup_logger(log_dir): - logger = logging.getLogger() - logger.setLevel(logging.INFO) - - # Remove all handlers - for handler in logger.handlers[:]: - logger.removeHandler(handler) - - file_handler = logging.FileHandler(f"{log_dir}/train.log") - file_handler.setLevel(logging.INFO) - file_handler.setFormatter( - logging.Formatter("%(asctime)s %(levelname)s %(message)s") - ) - file_handler.addFilter(NoWarningFilter()) - - console_handler = logging.StreamHandler() - console_handler.setLevel(logging.INFO) - console_handler.setFormatter( - logging.Formatter("%(asctime)s %(levelname)s %(message)s") - ) - console_handler.addFilter(NoWarningFilter()) - - logger.addHandler(file_handler) - logger.addHandler(console_handler) - - return logger - - -@hydra.main( - version_base=None, - config_path="configs", - config_name="2d_elasto_plasto_dynamics.yaml", -) -def main(cfg): - # Use Hydra's run directory for all outputs & logging - timestamp = datetime.datetime.now().strftime("%Y_%m_%d__%H_%M_%S") - log_dir = os.path.join("logs", cfg.bridge.data_info.name, f"run_{timestamp}") - os.makedirs(log_dir, exist_ok=False) - tb_logger = SummaryWriter(log_dir=log_dir) - logger = setup_logger(log_dir) - - # write config to log_dir: - with open(f"{log_dir}/config.yaml", "w") as f: - OmegaConf.save(cfg, f) - - # loading components - loader = hydra.utils.instantiate(cfg.loader) - scaler = hydra.utils.instantiate(cfg.scaler) - model = hydra.utils.instantiate(cfg.model) - optimizer = hydra.utils.instantiate( - cfg.training.optimizer, params=model.parameters() - ) - - seed = cfg.get("seed", None) - if seed is not None: - logger.info(f"Using seed {seed}") - seed_everything(seed) - - # loading datasets - logger.info("\n------\nData pre-processing\n------") - preprocessing_start = time.time() - problem_definition, plaid_train_dataset, plaid_test_dataset = loader.load_plaid() - test_ids = plaid_test_dataset.get_sample_ids() - train_ids = plaid_train_dataset.get_sample_ids() - del plaid_train_dataset, plaid_test_dataset - - problem_definition, train_dataset, test_dataset = loader.load() - - # splitting datasets into train/val splits - train_ids, val_ids = train_test_split(train_ids, test_size=cfg.training.test_size) - print(train_ids) - print(val_ids) - logger.info(f"Train_ids: {train_ids}") - logger.info(f"Val_ids: {val_ids}") - - split_train_dataset, split_val_dataset = split_temporal_pyg_train_test( - train_dataset, train_ids=train_ids, test_ids=val_ids - ) - train_dataset, val_dataset = split_train_dataset, split_val_dataset - - # unpacking - train_dataset = [sample for sample_list in train_dataset for sample in sample_list] - test_dataset = [sample for sample_list in test_dataset for sample in sample_list] - val_dataset = [sample for sample_list in val_dataset for sample in sample_list] - - # scaling the pyg dataset - train_dataset = scaler.fit_transform(train_dataset) - val_dataset = scaler.transform(val_dataset) - test_dataset = scaler.transform(test_dataset) - - # repacking by sample id - packed_train_dataset = [] - for id in train_ids: - sample_list = [sample for sample in train_dataset if sample.sample_id == id] - packed_train_dataset.append(sample_list) - packed_test_dataset = [] - for id in test_ids: - sample_list = [sample for sample in test_dataset if sample.sample_id == id] - packed_test_dataset.append(sample_list) - packed_val_dataset = [] - for id in val_ids: - sample_list = [sample for sample in 
val_dataset if sample.sample_id == id] - packed_val_dataset.append(sample_list) - - train_dataset, test_dataset, val_dataset = ( - packed_train_dataset, - packed_test_dataset, - packed_val_dataset, - ) - - # preprocessing the datasets - train_dataset = model.preprocess( - pyg_dataset=train_dataset, - plaid_dataset=None, - seed=cfg.model.preprocessing.get("train_seed", None), - type="train", - ) - - val_dataset = model.preprocess( - pyg_dataset=val_dataset, - plaid_dataset=None, - seed=cfg.model.preprocessing.get("val_seed", None), - type="val", - ) - preprocessing_end = time.time() - logger.info( - f"Preprocessing time: {str(datetime.timedelta(seconds=(preprocessing_end - preprocessing_start)))}" - ) - - # training - logger.info("\n--------\nTraining\n--------") - - train_loader = DataLoader( - train_dataset, batch_size=cfg.training.batch_size, shuffle=True - ) - val_loader = DataLoader( - val_dataset, batch_size=cfg.training.batch_size, shuffle=False - ) - - best_val_loss = float("inf") - best_model_epoch = 0 - best_model_path = os.path.join(log_dir, "best_model.pt") - - device = "cuda" if torch.cuda.is_available() else "cpu" - lbda = cfg.training.lbda - logger.info(f"Using lbda: {lbda}") - logger.info(f"Using device: {device}") - epochs = cfg.training.num_epochs - model.to(device) - - output_fields_names = train_dataset[0].output_fields_names - output_scalars_names = train_dataset[0].output_scalars_names - - train_start = time.time() - for epoch in range(epochs): - # train loop - epoch_train_loss = 0 - epoch_train_field_losses = torch.zeros(len(output_fields_names)) - epoch_train_scalar_losses = torch.zeros(len(output_scalars_names)) - - model.train() - for batch_id, batch in enumerate(train_loader): - local_batch_size = batch.num_graphs - field_losses, scalar_losses = model.compute_loss(batch) - field_loss, scalar_loss = field_losses.mean(), scalar_losses.mean() - loss = lbda * field_loss + (1 - lbda) * scalar_loss - - # recording losses - epoch_train_field_losses += field_losses.detach().cpu() * ( - local_batch_size / len(train_dataset) - ) - epoch_train_scalar_losses += scalar_losses.detach().cpu() * ( - local_batch_size / len(train_dataset) - ) - epoch_train_loss += loss.item() * (local_batch_size / len(train_dataset)) - - optimizer.zero_grad() - loss.backward() - optimizer.step() - - for n, fn in enumerate(output_fields_names): - tb_logger.add_scalar( - f"train/loss/{fn}", epoch_train_field_losses[n].item(), epoch - ) - for n, sn in enumerate(output_scalars_names): - tb_logger.add_scalar( - f"train/loss/{sn}", epoch_train_scalar_losses[n].item(), epoch - ) - tb_logger.add_scalar("train/loss", epoch_train_loss, epoch) - - # validation loop - epoch_val_loss = 0 - epoch_val_field_losses = torch.zeros(len(output_fields_names)) - epoch_val_scalar_losses = torch.zeros(len(output_scalars_names)) - - model.eval() - with torch.no_grad(): - for batch_id, batch in enumerate(val_loader): - local_batch_size = batch.num_graphs - field_losses, scalar_losses = model.compute_loss(batch) - field_loss, scalar_loss = field_losses.mean(), scalar_losses.mean() - loss = lbda * field_loss + (1 - lbda) * scalar_loss - - # recording losses - epoch_val_field_losses += field_losses.detach().cpu() * ( - local_batch_size / len(val_dataset) - ) - epoch_val_scalar_losses += scalar_losses.detach().cpu() * ( - local_batch_size / len(val_dataset) - ) - epoch_val_loss += loss.item() * (local_batch_size / len(val_dataset)) - - for n, fn in enumerate(output_fields_names): - tb_logger.add_scalar( - f"val/loss/{fn}", 
epoch_val_field_losses[n].item(), epoch - ) - for n, sn in enumerate(output_scalars_names): - tb_logger.add_scalar( - f"val/loss/{sn}", epoch_val_scalar_losses[n].item(), epoch - ) - tb_logger.add_scalar("val/loss", epoch_val_loss, epoch) - logger.info( - f"Epoch {epoch:>{len(str(epochs))}}: Train Loss: {epoch_train_loss:.5f} | Val Loss: {epoch_val_loss:.5f}" - ) - - # save best model - if epoch_val_loss < best_val_loss: - best_val_loss = epoch_val_loss - torch.save(model.state_dict(), best_model_path) - best_model_epoch = epoch - best_model_time = time.time() - train_end = time.time() - logger.info("") - logger.info( - f"Training time: {str(datetime.timedelta(seconds=(train_end - train_start)))}" - ) - logger.info( - f"Saved best model at epoch {best_model_epoch} with loss {best_val_loss:.5f}" - ) - logger.info( - f"Training time to reach the best model epoch: {str(datetime.timedelta(seconds=(best_model_time - train_start)))}" - ) - - # loading the best model - logger.info("Loading the best saved model") - model.load_state_dict(torch.load(best_model_path, weights_only=True)) - - del train_dataset, val_dataset - - buffer = test_dataset - # test dataset predictions - test_dataset = model.preprocess( - pyg_dataset=buffer, - plaid_dataset=None, - seed=cfg.model.preprocessing.get("test_seed", None), - type="test", - ) - - predictions_dict = {} - model.eval() - first_samples = {} - for id in test_ids: - for _data in test_dataset: - if _data.sample_id == id: - first_samples[id] = _data - - import copy - - for id in test_ids: - data = first_samples[id] - - with torch.no_grad(): - n_timesteps = len(data.timestep_list) - instantaneous_predictions = torch.zeros( - (data.x.shape[0], len(data.output_fields_names)) - ) - field_predictions_concat = torch.empty( - (n_timesteps, data.x.shape[0], len(data.output_fields_names)) - ) - field_predictions_concat[0] = instantaneous_predictions - buffer_list = [] - for n, ts in enumerate(data.timestep_list[:-1]): - input_data = copy.deepcopy(data) - input_data.x[:, -len(input_data.output_fields_names) :] = ( - instantaneous_predictions - ) - input_data.input_scalars = torch.tensor([[ts]], dtype=torch.float32) - input_data.time = ts - - instantaneous_predictions, _ = model.predict(input_data) - buffer = Data( - output_fields=instantaneous_predictions, - output_fields_names=output_fields_names, - ) - buffer_list.append(buffer) - unscaled_solutions = scaler.inverse_transform_prediction(buffer_list) - for n, ts in enumerate(data.timestep_list[:-1]): - field_predictions_concat[n + 1] = unscaled_solutions[n].output_fields - predictions_dict[data.sample_id] = field_predictions_concat - - # creating submission - reference = [] - for i, id in enumerate(test_ids): - reference.append({}) - for n, fn in enumerate(output_fields_names): - reference[i][fn] = predictions_dict[id][..., n].numpy() - - with open(os.path.join(log_dir, "reference.pkl"), "wb") as file: - pickle.dump(reference, file) - - logger.info( - "\n\ - ---------------------\n\ - ------Finished!------\n\ - ---------------------" - ) - - -if __name__ == "__main__": - main() diff --git a/benchmarks/Vi-Transf/main_stationary.py b/benchmarks/Vi-Transf/main_stationary.py deleted file mode 100644 index 0c39bc6c..00000000 --- a/benchmarks/Vi-Transf/main_stationary.py +++ /dev/null @@ -1,282 +0,0 @@ -import datetime -import logging -import os -import time - -import hydra -import torch -from omegaconf import OmegaConf -from sklearn.model_selection import train_test_split -from torch.utils.tensorboard import SummaryWriter 
-from torch_geometric.data import Data -from torch_geometric.loader import DataLoader -from tqdm import tqdm - -from src.data.utils import split_plaid_train_test, split_pyg_train_test -from src.evaluation.submission import create_submission - - -def seed_everything(seed: int): - import random - - import numpy as np - import torch - - random.seed(seed) - np.random.seed(seed) - torch.manual_seed(seed) - torch.cuda.manual_seed_all(seed) - - -def setup_logger(log_dir): - logger = logging.getLogger() - logger.setLevel(logging.INFO) - - # Remove all handlers associated with the root logger object (Hydra may add its own) - for handler in logger.handlers[:]: - logger.removeHandler(handler) - - # File handler - file_handler = logging.FileHandler(f"{log_dir}/train.log") - file_handler.setLevel(logging.INFO) - file_handler.setFormatter( - logging.Formatter("%(asctime)s %(levelname)s %(message)s") - ) - - # Console handler - console_handler = logging.StreamHandler() - console_handler.setLevel(logging.INFO) - console_handler.setFormatter( - logging.Formatter("%(asctime)s %(levelname)s %(message)s") - ) - - logger.addHandler(file_handler) - logger.addHandler(console_handler) - return logger - - -@hydra.main(version_base=None, config_path="configs", config_name="rotor37.yaml") -def main(cfg): - # Use Hydra's run directory for all outputs & logging - timestamp = datetime.datetime.now().strftime("%Y_%m_%d__%H_%M_%S") - log_dir = os.path.join("logs", cfg.bridge.data_info.name, f"run_{timestamp}") - os.makedirs(log_dir, exist_ok=False) - tb_logger = SummaryWriter(log_dir=log_dir) - logger = setup_logger(log_dir) - - # write config to log_dir: - with open(f"{log_dir}/config.yaml", "w") as f: - OmegaConf.save(cfg, f) - - # loading components - loader = hydra.utils.instantiate(cfg.loader) - scaler = hydra.utils.instantiate(cfg.scaler) - model = hydra.utils.instantiate(cfg.model) - optimizer = hydra.utils.instantiate( - cfg.training.optimizer, params=model.parameters() - ) - - seed = cfg.get("seed", None) - if seed is not None: - logger.info(f"Using seed {seed}") - seed_everything(seed) - - # loading datasets - logger.info("\n------\nData pre-processing\n------") - preprocessing_start = time.time() - problem_definition, plaid_train_dataset, plaid_test_dataset = loader.load_plaid() - problem_definition, train_dataset, test_dataset = loader.load() - - # splitting datasets into train/val splits - train_ids = plaid_train_dataset.get_sample_ids() - train_ids, val_ids = train_test_split(train_ids, test_size=cfg.training.test_size) - logger.info(f"Train_ids: {train_ids}") - logger.info(f"Val_ids: {val_ids}") - - plaid_train_dataset, plaid_val_dataset = split_plaid_train_test( - plaid_train_dataset, train_ids=train_ids, test_ids=val_ids - ) - train_dataset, val_dataset = split_pyg_train_test( - train_dataset, train_ids=train_ids, test_ids=val_ids - ) - - # scaling the pyg dataset - train_dataset = scaler.fit_transform(train_dataset) - val_dataset = scaler.transform(val_dataset) - test_dataset = scaler.transform(test_dataset) - - # preprocessing the datasets - train_dataset = model.preprocess( - pyg_dataset=train_dataset, - plaid_dataset=plaid_train_dataset, - seed=cfg.model.preprocessing.get("train_seed", None), - type="train", - ) - - val_dataset = model.preprocess( - pyg_dataset=val_dataset, - plaid_dataset=plaid_val_dataset, - seed=cfg.model.preprocessing.get("val_seed", None), - type="val", - ) - preprocessing_end = time.time() - logger.info( - f"Preprocessing time: {str(datetime.timedelta(seconds=(preprocessing_end 
- preprocessing_start)))}" - ) - - # training - logger.info("\n--------\nTraining\n--------") - - train_loader = DataLoader( - train_dataset, batch_size=cfg.training.batch_size, shuffle=True - ) - val_loader = DataLoader( - val_dataset, batch_size=cfg.training.batch_size, shuffle=False - ) - - best_val_loss = float("inf") - best_model_epoch = 0 - best_model_path = os.path.join(log_dir, "best_model.pt") - - device = "cuda" if torch.cuda.is_available() else "cpu" - lbda = cfg.training.lbda - logger.info(f"Using lbda: {lbda}") - logger.info(f"Using device: {device}") - epochs = cfg.training.num_epochs - model.to(device) - - output_fields_names = train_dataset[0].output_fields_names - output_scalars_names = train_dataset[0].output_scalars_names - - train_start = time.time() - for epoch in range(epochs): - # train loop - epoch_train_loss = 0 - epoch_train_field_losses = torch.zeros(len(output_fields_names)) - epoch_train_scalar_losses = torch.zeros(len(output_scalars_names)) - - model.train() - for batch_id, batch in enumerate(train_loader): - local_batch_size = batch.num_graphs - field_losses, scalar_losses = model.compute_loss(batch) - field_loss, scalar_loss = field_losses.mean(), scalar_losses.mean() - loss = lbda * field_loss + (1 - lbda) * scalar_loss - - # recording losses - epoch_train_field_losses += field_losses.detach().cpu() * ( - local_batch_size / len(train_dataset) - ) - epoch_train_scalar_losses += scalar_losses.detach().cpu() * ( - local_batch_size / len(train_dataset) - ) - epoch_train_loss += loss.item() * (local_batch_size / len(train_dataset)) - - optimizer.zero_grad() - loss.backward() - optimizer.step() - - for n, fn in enumerate(output_fields_names): - tb_logger.add_scalar( - f"train/loss/{fn}", epoch_train_field_losses[n].item(), epoch - ) - for n, sn in enumerate(output_scalars_names): - tb_logger.add_scalar( - f"train/loss/{sn}", epoch_train_scalar_losses[n].item(), epoch - ) - tb_logger.add_scalar("train/loss", epoch_train_loss, epoch) - - # validation loop - epoch_val_loss = 0 - epoch_val_field_losses = torch.zeros(len(output_fields_names)) - epoch_val_scalar_losses = torch.zeros(len(output_scalars_names)) - - model.eval() - with torch.no_grad(): - for batch_id, batch in enumerate(val_loader): - local_batch_size = batch.num_graphs - field_losses, scalar_losses = model.compute_loss(batch) - field_loss, scalar_loss = field_losses.mean(), scalar_losses.mean() - loss = lbda * field_loss + (1 - lbda) * scalar_loss - - # recording losses - epoch_val_field_losses += field_losses.detach().cpu() * ( - local_batch_size / len(val_dataset) - ) - epoch_val_scalar_losses += scalar_losses.detach().cpu() * ( - local_batch_size / len(val_dataset) - ) - epoch_val_loss += loss.item() * (local_batch_size / len(val_dataset)) - - for n, fn in enumerate(output_fields_names): - tb_logger.add_scalar( - f"val/loss/{fn}", epoch_val_field_losses[n].item(), epoch - ) - for n, sn in enumerate(output_scalars_names): - tb_logger.add_scalar( - f"val/loss/{sn}", epoch_val_scalar_losses[n].item(), epoch - ) - tb_logger.add_scalar("val/loss", epoch_val_loss, epoch) - logger.info( - f"Epoch {epoch:>{len(str(epochs))}}: Train Loss: {epoch_train_loss:.5f} | Val Loss: {epoch_val_loss:.5f}" - ) - - # save best model - if epoch_val_loss < best_val_loss: - best_val_loss = epoch_val_loss - torch.save(model.state_dict(), best_model_path) - best_model_epoch = epoch - best_model_time = time.time() - train_end = time.time() - logger.info("") - logger.info( - f"Training time: 
{str(datetime.timedelta(seconds=(train_end - train_start)))}" - ) - logger.info( - f"Saved best model at epoch {best_model_epoch} with loss {best_val_loss:.5f}" - ) - logger.info( - f"Training time to reach the best model epoch: {str(datetime.timedelta(seconds=(best_model_time - train_start)))}" - ) - - # loading the best model - logger.info("Loading the best saved model") - model.load_state_dict(torch.load(best_model_path, weights_only=True)) - - # test dataset predictions - test_dataset = model.preprocess( - pyg_dataset=test_dataset, - plaid_dataset=plaid_test_dataset, - seed=cfg.model.preprocessing.get("test_seed", None), - type="test", - ) - - prediction_dataset = [] - model.eval() - for data in tqdm(test_dataset, desc="Test", total=len(test_dataset)): - with torch.no_grad(): - fields_predictions, scalars_predictions = model.predict(data) - prediction_data = Data() - prediction_data.fields_prediction = fields_predictions - prediction_data.scalars_prediction = scalars_predictions - - prediction_data.output_fields_names = data.output_fields_names - prediction_data.output_scalars_names = data.output_scalars_names - prediction_data.sample_id = data.sample_id - prediction_dataset.append(prediction_data) - - post_processed_dataset = model.postprocess( - prediction_dataset, plaid_test_dataset, type="test" - ) - prediction_dataset_unscaled = scaler.inverse_transform_prediction( - post_processed_dataset - ) - - # creating submission - logger.info("\n------\nSubmission\n------") - create_submission(prediction_dataset_unscaled, plaid_test_dataset, save_dir=log_dir) - - logger.info("\n---------------------\n------Finished!------\n---------------------") - - -if __name__ == "__main__": - main() diff --git a/benchmarks/Vi-Transf/src/__init__.py b/benchmarks/Vi-Transf/src/__init__.py deleted file mode 100644 index e69de29b..00000000 diff --git a/benchmarks/Vi-Transf/src/data/__init__.py b/benchmarks/Vi-Transf/src/data/__init__.py deleted file mode 100644 index e69de29b..00000000 diff --git a/benchmarks/Vi-Transf/src/data/loader/__init__.py b/benchmarks/Vi-Transf/src/data/loader/__init__.py deleted file mode 100644 index e69de29b..00000000 diff --git a/benchmarks/Vi-Transf/src/data/loader/bridges/__init__.py b/benchmarks/Vi-Transf/src/data/loader/bridges/__init__.py deleted file mode 100644 index e69de29b..00000000 diff --git a/benchmarks/Vi-Transf/src/data/loader/bridges/airfrans_sample_to_geometric.py b/benchmarks/Vi-Transf/src/data/loader/bridges/airfrans_sample_to_geometric.py deleted file mode 100644 index 68ce8a0e..00000000 --- a/benchmarks/Vi-Transf/src/data/loader/bridges/airfrans_sample_to_geometric.py +++ /dev/null @@ -1,104 +0,0 @@ -import numpy as np -import torch -from torch_geometric.data import Data - -from plaid import Sample -from plaid import ProblemDefinition - -from .utils import faces_to_edges - - -def airfrans_sample_to_geometric( - sample: Sample, sample_id: int, problem_definition: ProblemDefinition = None -) -> Data: - """Converts a Plaid sample to PytorchGeometric Data object. 
- - Args: - sample (plaid.Sample): data sample - - Returns: - Data: the converted data sample - """ - vertices = sample.get_vertices() - faces = sample.get_elements()["QUAD_4"] - edge_index = faces_to_edges(faces, num_nodes=vertices.shape[0]) - - airfoil_ids = sample.get_nodal_tags()["Airfoil"] - - v1 = vertices[edge_index[:, 0]] - v2 = vertices[edge_index[:, 1]] - edge_weight = np.linalg.norm(v2 - v1, axis=1) - - # loading scalars - aoa = sample.get_scalar("angle_of_attack") - inlet_velocity = sample.get_scalar("inlet_velocity") - u_inlet = [np.cos(aoa) * inlet_velocity, np.sin(aoa) * inlet_velocity] - cl = sample.get_scalar("C_L") - cd = sample.get_scalar("C_D") - output_scalars = np.array([cl, cd]) - - # loading fields - nut = sample.get_field("nut") - ux = sample.get_field("Ux") - uy = sample.get_field("Uy") - p = sample.get_field("p") - implicit_distance = sample.get_field("implicit_distance") - - # inlet velocities - # u_inlet_field = np.array([np.cos(aoa), np.sin(aoa)]) * np.ones((nut.shape[0], 2)) - - # TODO: Normals - # normals = np.zeros_like(u_inlet) - - # converting to torch tensor - u_inlet = torch.tensor(u_inlet, dtype=torch.float32) - output_scalars = torch.tensor(output_scalars, dtype=torch.float32) - vertices = torch.tensor(vertices, dtype=torch.float32) - faces = torch.tensor(faces) - edge_index = torch.tensor(edge_index, dtype=torch.long) - edge_weight = torch.tensor(edge_weight, dtype=torch.float32) - nut = torch.tensor(nut, dtype=torch.float32) - ux = torch.tensor(ux, dtype=torch.float32) - uy = torch.tensor(uy, dtype=torch.float32) - p = torch.tensor(p, dtype=torch.float32) - implicit_distance = torch.tensor(implicit_distance, dtype=torch.float32) - - # u_inlet = torch.tensor(u_inlet) - # normals = torch.tensor(normals) - - # input / output features - input_scalars = u_inlet.reshape(1, -1) - - output_scalars = output_scalars.reshape(1, -1) - - input_fields = torch.concatenate( - [ - vertices, - implicit_distance.reshape(-1, 1), - ], - dim=1, - ) - - output_fields = torch.concatenate( - [nut.reshape(-1, 1), p.reshape(-1, 1), ux.reshape(-1, 1), uy.reshape(-1, 1)], - dim=1, - ) - - data = Data( - pos=vertices, - input_scalars=input_scalars, - x=input_fields, - output_scalars=output_scalars, - output_fields=output_fields, - edge_index=edge_index.T, - edge_weight=edge_weight, - faces=faces, - sample_id=sample_id, - airfoil_ids=airfoil_ids, - input_fields_names=["x", "y", "implicit_distance"], - output_fields_names=["nut", "p", "Ux", "Uy"], - input_scalars_names=["ux_inlet", "uy_inlet"], - output_scalars_names=["C_L", "C_D"], - ) - - return data diff --git a/benchmarks/Vi-Transf/src/data/loader/bridges/base_sample_to_geometric.py b/benchmarks/Vi-Transf/src/data/loader/bridges/base_sample_to_geometric.py deleted file mode 100644 index c1cf6529..00000000 --- a/benchmarks/Vi-Transf/src/data/loader/bridges/base_sample_to_geometric.py +++ /dev/null @@ -1,131 +0,0 @@ -import numpy as np -import torch -from torch_geometric.data import Data - -from plaid import Sample -from plaid import ProblemDefinition - -from .utils import faces_to_edges - - -def base_sample_to_geometric( - sample: Sample, sample_id: int, problem_definition: ProblemDefinition -) -> Data: - """Converts a Plaid sample to PytorchGeometric Data object. 
- - Args: - sample (plaid.Sample): data sample - - Returns: - Data: the converted data sample - """ - vertices = sample.get_vertices() - position_field_names = ["x", "y"] if vertices.shape[1] == 2 else ["x", "y", "z"] - edge_index = [] - n_elem_types = len(sample.get_elements()) - coalesce = True - if n_elem_types > 1: - coalesce = False - assert len(sample.get_elements()) == 1, "More than one element type" - for _, faces in sample.get_elements().items(): - edge_index.append( - faces_to_edges(faces, num_nodes=vertices.shape[0], coalesce=coalesce) - ) - edge_index = np.concatenate(edge_index, axis=0) - if not coalesce: - edge_index = coalesce(edge_index) - - v1 = vertices[edge_index[:, 0]] - v2 = vertices[edge_index[:, 1]] - edge_weight = np.linalg.norm(v2 - v1, axis=1) - - # loading scalars - input_scalars_names = problem_definition.get_input_scalars_names() - output_scalars_names = problem_definition.get_output_scalars_names() - - input_scalars = [] - output_scalars = [] - for name in input_scalars_names: - input_scalars.append(sample.get_scalar(name)) - for name in output_scalars_names: - output_scalars.append(sample.get_scalar(name)) - - # loading fields - input_fields_names = problem_definition.get_input_fields_names() - output_fields_names = problem_definition.get_output_fields_names() - - if len(input_fields_names) > 0: - if input_fields_names[0] == "cell_ids": - input_fields_names.pop(0) - - if len(input_fields_names) >= 1: - input_fields = [] - for field_name in input_fields_names: - input_fields.append(sample.get_field(field_name)) - input_fields = np.vstack(input_fields).T - input_fields = np.concatenate((vertices, input_fields), axis=1) - input_fields_names = [*position_field_names, *input_fields_names] - else: - input_fields = vertices - input_fields_names = position_field_names - - output_fields = [] - for field_name in output_fields_names: - output_fields.append(sample.get_field(field_name)) - output_fields = np.vstack(output_fields).T - - # torch tensor conversion - input_scalars = torch.tensor(input_scalars, dtype=torch.float32).reshape(1, -1) - input_fields = torch.tensor(input_fields, dtype=torch.float32) - - vertices = torch.tensor(vertices, dtype=torch.float32) - edge_weight = torch.tensor(edge_weight, dtype=torch.float32) - edge_index = torch.tensor(edge_index, dtype=torch.long) - faces = torch.tensor(faces, dtype=torch.long) - - # Extracting special nodal tags - nodal_tags = {} - for k, v in sample.get_nodal_tags().items(): - nodal_tags[k + "_id"] = torch.tensor(v, dtype=torch.long) - - if None not in output_scalars and None not in output_fields: - output_scalars = torch.tensor(output_scalars, dtype=torch.float32).reshape( - 1, -1 - ) - output_fields = torch.tensor(output_fields, dtype=torch.float32) - - data = Data( - pos=vertices, - input_scalars=input_scalars, - x=input_fields, - output_scalars=output_scalars, - output_fields=output_fields, - edge_index=edge_index.T, - edge_weight=edge_weight, - faces=faces, - sample_id=sample_id, - input_fields_names=input_fields_names, - output_fields_names=output_fields_names, - input_scalars_names=input_scalars_names, - output_scalars_names=output_scalars_names, - **nodal_tags, - ) - - return data - - data = Data( - pos=vertices, - input_scalars=input_scalars, - x=input_fields, - edge_index=edge_index.T, - edge_weight=edge_weight, - faces=faces, - sample_id=sample_id, - input_fields_names=input_fields_names, - output_fields_names=output_fields_names, - input_scalars_names=input_scalars_names, - 
output_scalars_names=output_scalars_names, - **nodal_tags, - ) - - return data diff --git a/benchmarks/Vi-Transf/src/data/loader/bridges/elasto_plasto_dynamics.py b/benchmarks/Vi-Transf/src/data/loader/bridges/elasto_plasto_dynamics.py deleted file mode 100644 index 58bdfde8..00000000 --- a/benchmarks/Vi-Transf/src/data/loader/bridges/elasto_plasto_dynamics.py +++ /dev/null @@ -1,135 +0,0 @@ -import warnings - -import numpy as np -import torch -from Muscat.Bridges.CGNSBridge import CGNSToMesh -from Muscat.Containers import MeshModificationTools as MMT -from Muscat.Containers.Filters import FilterObjects as FO -from torch_geometric.data import Data - -from plaid import Sample -from plaid import ProblemDefinition -from src.data.loader.bridges.multiscale_sample_to_geometric import get_distance_to_ids -from src.data.loader.bridges.utils import faces_to_edges - -# warnings.filterwarnings("ignore", module="Muscat") -# warnings.filterwarnings("ignore", category=UserWarning) -warnings.filterwarnings("ignore") - - -def elasto_plasto_dynamics_sample_to_geometric( - sample: Sample, sample_id: int, problem_definition: ProblemDefinition -) -> Data: - """Converts a Plaid sample to PytorchGeometric Data object. - - Args: - sample (plaid.Sample): data sample - - Returns: - Data: the converted data sample - """ - vertices = sample.get_vertices() - - edge_index = [] - coalesce = True - for _, faces in sample.get_elements().items(): - edge_index.append( - faces_to_edges(faces, num_nodes=vertices.shape[0], coalesce=coalesce) - ) - edge_index = np.concatenate(edge_index, axis=0) - - mesh = CGNSToMesh(sample.get_mesh()) - MMT.ComputeSkin(mesh, md=2, inPlace=True, skinTagName="Skin") - nfSkin = FO.NodeFilter(eTag="Skin") - nodeIndexSkin = nfSkin.GetNodesIndices(mesh) - mesh.GetNodalTag("Skin").AddToTag(nodeIndexSkin) - border_ids = mesh.GetNodalTag("Skin").GetIds() - - sdf, projection_vectors = get_distance_to_ids(vertices, border_ids) - sdf = sdf.reshape(-1, 1) - input_fields = np.concatenate((vertices, sdf, projection_vectors), axis=1) - input_fields_names = ["x", "y", "sdf", "dist_vect_x", "dist_vect_y", "U_x", "U_y"] - output_fields_names = ["U_x", "U_y"] - input_scalars_names = ["time"] - output_scalars_names = [] - - timestep_list = sample.get_all_mesh_times() - - output_fields = np.vstack( - ( - sample.get_field("U_x", time=timestep_list[0]), - sample.get_field("U_y", time=timestep_list[0]), - ) - ).T - data_list = [] - - vertices_torch = torch.from_numpy(vertices).to( - torch.float32 - ) # torch.tensor(vertices, dtype=torch.float32) - edge_index = torch.from_numpy(edge_index).to( - torch.long - ) # torch.tensor(edge_index, dtype=torch.long) - - for t0, t1 in zip(timestep_list[:-1], timestep_list[1:]): - output_fields_t1 = output_fields - if output_fields_t1[0, 0] is not None: - output_fields = np.vstack( - (sample.get_field("U_x", time=t1), sample.get_field("U_y", time=t1)) - ).T - - input_scalars = np.array([t0]) - input_fields = np.column_stack( - (vertices, sdf, projection_vectors, output_fields_t1) - ) - - # torch tensor conversion - input_scalars = ( - torch.from_numpy(input_scalars).to(torch.float32).reshape(1, -1) - ) # torch.tensor(input_scalars, dtype=torch.float32).reshape(1, -1) - input_fields = torch.from_numpy(input_fields).to( - torch.float32 - ) # torch.tensor(input_fields, dtype=torch.float32) - - # Extracting special nodal tags - nodal_tags = {} - for k, v in sample.get_nodal_tags().items(): - nodal_tags["border_id"] = torch.tensor(border_ids, dtype=torch.int) - - if None not in 
output_fields: - output_fields = torch.from_numpy(output_fields).to( - torch.float32 - ) # torch.tensor(output_fields, dtype=torch.float32) - - data = Data( - pos=vertices_torch, - input_scalars=input_scalars, - x=input_fields, - output_fields=output_fields, - edge_index=edge_index.T, - sample_id=sample_id, - input_fields_names=input_fields_names, - output_fields_names=output_fields_names, - input_scalars_names=input_scalars_names, - output_scalars_names=output_scalars_names, - time=t0, - timestep_list=timestep_list, - **nodal_tags, - ) - else: - data = Data( - pos=vertices_torch, - input_scalars=input_scalars, - x=input_fields, - edge_index=edge_index.T, - sample_id=sample_id, - input_fields_names=input_fields_names, - output_fields_names=output_fields_names, - input_scalars_names=input_scalars_names, - output_scalars_names=output_scalars_names, - time=t0, - timestep_list=timestep_list, - **nodal_tags, - ) - data_list.append(data) - - return data_list diff --git a/benchmarks/Vi-Transf/src/data/loader/bridges/multiscale_sample_to_geometric.py b/benchmarks/Vi-Transf/src/data/loader/bridges/multiscale_sample_to_geometric.py deleted file mode 100644 index 5097fe9f..00000000 --- a/benchmarks/Vi-Transf/src/data/loader/bridges/multiscale_sample_to_geometric.py +++ /dev/null @@ -1,126 +0,0 @@ -import numpy as np -import torch -from sklearn.neighbors import KDTree -from torch_geometric.data import Data - -from plaid import Sample -from plaid import ProblemDefinition - -from .utils import faces_to_edges - - -def get_distance_to_ids(vertices, boundary_ids): - boundary_vertices = vertices[boundary_ids, :] - search_index = KDTree(boundary_vertices) - sdf, projection_id = search_index.query(vertices, return_distance=True) - - projection_vertices = boundary_vertices[projection_id.ravel()] - projection_vectors = projection_vertices - vertices - projection_vectors_norm = np.linalg.norm(projection_vectors, axis=1) - projection_vectors_norm[projection_vectors_norm == 0] = 1 - projection_vectors = projection_vectors / projection_vectors_norm[:, None] - - return sdf, projection_vectors - - -def multiscale_sample_to_geometric( - sample: Sample, sample_id: int, problem_definition: ProblemDefinition -) -> Data: - """Converts a Plaid sample to PytorchGeometric Data object. 
- - Args: - sample (plaid.Sample): data sample - - Returns: - Data: the converted data sample - """ - vertices = sample.get_vertices() - edge_index = [] - faces = sample.get_elements()["TRI_3"] - edge_index = faces_to_edges(faces, num_nodes=vertices.shape[0], coalesce=True) - - v1 = vertices[edge_index[:, 0]] - v2 = vertices[edge_index[:, 1]] - edge_weight = np.linalg.norm(v2 - v1, axis=1) - - # data names - input_fields_names = [] - output_fields_names = ["u1", "u2", "P11", "P12", "P22", "P21", "psi"] - output_scalars_names = ["effective_energy"] - input_scalars_names = ["C11", "C12", "C22"] - - input_scalars = [] - output_scalars = [] - for name in input_scalars_names: - input_scalars.append(sample.get_scalar(name)) - for name in output_scalars_names: - output_scalars.append(sample.get_scalar(name)) - - input_fields = vertices - input_fields_names = ["x", "y"] - - output_fields = [] - for field_name in output_fields_names: - output_fields.append(sample.get_field(field_name)) - output_fields = np.vstack(output_fields).T - - # Extracting special nodal tags - nodal_tags = {} - for k, v in sample.get_nodal_tags().items(): - nodal_tags[k + "_id"] = torch.tensor(v, dtype=torch.long) - - holes_ids = sample.get_nodal_tags()["Holes"] - sdf, projection_vectors = get_distance_to_ids(vertices, holes_ids) - input_fields = np.concatenate((vertices, sdf, projection_vectors), axis=1) - input_fields_names = ["x", "y", "sdf", "dist_vect_x", "dist_vect_y"] - - # torch tensor conversion - input_scalars = torch.tensor(input_scalars, dtype=torch.float32).reshape(1, -1) - input_fields = torch.tensor(input_fields, dtype=torch.float32) - - vertices = torch.tensor(vertices, dtype=torch.float32) - edge_weight = torch.tensor(edge_weight, dtype=torch.float32) - edge_index = torch.tensor(edge_index, dtype=torch.long) - faces = torch.tensor(faces, dtype=torch.long) - - if None not in output_scalars and None not in output_fields: - output_scalars = torch.tensor(output_scalars, dtype=torch.float32).reshape( - 1, -1 - ) - output_fields = torch.tensor(output_fields, dtype=torch.float32) - - data = Data( - pos=vertices, - input_scalars=input_scalars, - x=input_fields, - output_scalars=output_scalars, - output_fields=output_fields, - edge_index=edge_index.T, - edge_weight=edge_weight, - faces=faces, - sample_id=sample_id, - input_fields_names=input_fields_names, - output_fields_names=output_fields_names, - input_scalars_names=input_scalars_names, - output_scalars_names=output_scalars_names, - **nodal_tags, - ) - - return data - - data = Data( - pos=vertices, - input_scalars=input_scalars, - x=input_fields, - edge_index=edge_index.T, - edge_weight=edge_weight, - faces=faces, - sample_id=sample_id, - input_fields_names=input_fields_names, - output_fields_names=output_fields_names, - input_scalars_names=input_scalars_names, - output_scalars_names=output_scalars_names, - **nodal_tags, - ) - - return data diff --git a/benchmarks/Vi-Transf/src/data/loader/bridges/no_bridge.py b/benchmarks/Vi-Transf/src/data/loader/bridges/no_bridge.py deleted file mode 100644 index aa8e622d..00000000 --- a/benchmarks/Vi-Transf/src/data/loader/bridges/no_bridge.py +++ /dev/null @@ -1,6 +0,0 @@ -from plaid import Sample -from plaid import ProblemDefinition - - -def no_bridge(sample: Sample, sample_id: int, problem_definition: ProblemDefinition): - return sample diff --git a/benchmarks/Vi-Transf/src/data/loader/bridges/profile_sample_to_geometric.py b/benchmarks/Vi-Transf/src/data/loader/bridges/profile_sample_to_geometric.py deleted file mode 
100644 index 8ecd5365..00000000 --- a/benchmarks/Vi-Transf/src/data/loader/bridges/profile_sample_to_geometric.py +++ /dev/null @@ -1,86 +0,0 @@ -import numpy as np -import torch -from torch_geometric.data import Data - -from plaid import Sample -from plaid import ProblemDefinition - -from .multiscale_sample_to_geometric import get_distance_to_ids -from .utils import faces_to_edges - - -def profile_sample_to_geometric( - sample: Sample, sample_id: int, problem_definition: ProblemDefinition -) -> Data: - vertices = sample.get_vertices() - edge_index = [] - faces = sample.get_elements()["TRI_3"] - edge_index = faces_to_edges(faces, num_nodes=vertices.shape[0], coalesce=True) - - v1 = vertices[edge_index[:, 0]] - v2 = vertices[edge_index[:, 1]] - edge_weight = np.linalg.norm(v2 - v1, axis=1) - - # data names - output_fields_names = ["Mach", "Pressure", "Velocity-x", "Velocity-y"] - - input_fields = vertices - - output_fields = [] - for field_name in output_fields_names: - output_fields.append(sample.get_field(field_name)) - output_fields = np.vstack(output_fields).T - - # Extracting special nodal tags - nodal_tags = {} - for k, v in sample.get_nodal_tags().items(): - nodal_tags[k + "_id"] = torch.tensor(v, dtype=torch.long) - - airfoil_ids = sample.get_nodal_tags()["Airfoil"] - sdf, projection_vectors = get_distance_to_ids(vertices, airfoil_ids) - input_fields = np.concatenate((vertices, sdf, projection_vectors), axis=1) - input_fields_names = ["x", "y", "sdf", "dist_vect_x", "dist_vect_y"] - - # torch tensor conversion - input_fields = torch.from_numpy(input_fields).to(torch.float32) - - vertices = torch.tensor(vertices, dtype=torch.float32) - edge_weight = torch.tensor(edge_weight, dtype=torch.float32) - edge_index = torch.tensor(edge_index, dtype=torch.long) - faces = torch.tensor(faces, dtype=torch.long) - - if None not in output_fields: - output_fields = torch.tensor(output_fields, dtype=torch.float32) - - data = Data( - pos=vertices, - x=input_fields, - output_fields=output_fields, - edge_index=edge_index.T, - edge_weight=edge_weight, - faces=faces, - sample_id=sample_id, - input_fields_names=input_fields_names, - output_fields_names=output_fields_names, - input_scalars_names=[], - output_scalars_names=[], - **nodal_tags, - ) - - return data - - data = Data( - pos=vertices, - x=input_fields, - edge_index=edge_index.T, - edge_weight=edge_weight, - faces=faces, - sample_id=sample_id, - input_fields_names=input_fields_names, - output_fields_names=output_fields_names, - input_scalars_names=[], - output_scalars_names=[], - **nodal_tags, - ) - - return data diff --git a/benchmarks/Vi-Transf/src/data/loader/bridges/rotor_sample_to_geometric.py b/benchmarks/Vi-Transf/src/data/loader/bridges/rotor_sample_to_geometric.py deleted file mode 100644 index f4488a5c..00000000 --- a/benchmarks/Vi-Transf/src/data/loader/bridges/rotor_sample_to_geometric.py +++ /dev/null @@ -1,3 +0,0 @@ -from .base_sample_to_geometric import base_sample_to_geometric - -rotor_sample_to_geometric = base_sample_to_geometric diff --git a/benchmarks/Vi-Transf/src/data/loader/bridges/tensile_sample_to_geometric.py b/benchmarks/Vi-Transf/src/data/loader/bridges/tensile_sample_to_geometric.py deleted file mode 100644 index f80dd88c..00000000 --- a/benchmarks/Vi-Transf/src/data/loader/bridges/tensile_sample_to_geometric.py +++ /dev/null @@ -1,142 +0,0 @@ -import numpy as np -import torch -from torch_geometric.data import Data - -from plaid import Sample -from plaid import ProblemDefinition - -from .multiscale_sample_to_geometric 
import get_distance_to_ids -from .utils import faces_to_edges - - -def extract_border_edges(faces): - edge_dict = {} - - for face in faces: - for i in range(3): - edge = tuple(sorted((face[i], face[(i + 1) % 3]))) - if edge in edge_dict: - edge_dict[edge] += 1 - else: - edge_dict[edge] = 1 - - border_edges = [edge for edge, count in edge_dict.items() if count == 1] - return np.array(border_edges) - - -def get_border_ids(vertices, faces): - bars = extract_border_edges(faces) - border_ids = np.unique(np.ravel(bars)) - return border_ids - - -def tensile_sample_to_geometric( - sample: Sample, sample_id: int, problem_definition: ProblemDefinition -): - vertices = sample.get_vertices() - edge_index = [] - n_elem_types = len(sample.get_elements()) - coalesce = True - if n_elem_types > 1: - coalesce = False - assert len(sample.get_elements()) == 1, "More than one element type" - for _, faces in sample.get_elements().items(): - edge_index.append( - faces_to_edges(faces, num_nodes=vertices.shape[0], coalesce=coalesce) - ) - edge_index = np.concatenate(edge_index, axis=0) - if not coalesce: - edge_index = coalesce(edge_index) - - v1 = vertices[edge_index[:, 0]] - v2 = vertices[edge_index[:, 1]] - edge_weight = np.linalg.norm(v2 - v1, axis=1) - - # loading scalars - input_scalars_names = ["P", "p1", "p2", "p3", "p4", "p5"] - output_scalars_names = ["max_von_mises", "max_U2_top", "max_sig22_top"] - input_fields_names = [] - output_fields_names = ["U1", "U2", "sig11", "sig22", "sig12"] - - input_scalars = [] - output_scalars = [] - for name in input_scalars_names: - input_scalars.append(sample.get_scalar(name)) - for name in output_scalars_names: - output_scalars.append(sample.get_scalar(name)) - - # sdf and one hot encoding - border_ids = get_border_ids(vertices, faces) - sdf, projection_vectors = get_distance_to_ids(vertices, border_ids) - # labels = np.array(list(map(int, is_border))).reshape(-1, 1) - # labels[labels == 1] = 6 - # labels = torch.tensor(labels) - # one_hot = one_hot(labels[:, 0].long(), num_classes=9) - - if len(input_fields_names) > 0: - if input_fields_names[0] == "cell_ids": - input_fields_names.pop(0) - - input_fields = np.concatenate((vertices, sdf, projection_vectors), axis=1) - input_fields_names = ["x", "y", "sdf", "dist_vect_x", "dist_vect_y"] - - output_fields = [] - for field_name in output_fields_names: - output_fields.append(sample.get_field(field_name)) - output_fields = np.vstack(output_fields).T - - # torch tensor conversion - input_scalars = torch.tensor(input_scalars, dtype=torch.float32).reshape(1, -1) - input_fields = torch.tensor(input_fields, dtype=torch.float32) - - vertices = torch.tensor(vertices, dtype=torch.float32) - edge_weight = torch.tensor(edge_weight, dtype=torch.float32) - edge_index = torch.tensor(edge_index, dtype=torch.long) - faces = torch.tensor(faces, dtype=torch.long) - - # Extracting special nodal tags - nodal_tags = {} - for k, v in sample.get_nodal_tags().items(): - nodal_tags[k + "_id"] = torch.tensor(v, dtype=torch.long) - nodal_tags["Border_id"] = torch.tensor(border_ids, dtype=torch.long) - - if None not in output_scalars and None not in output_fields: - output_scalars = torch.tensor(output_scalars, dtype=torch.float32).reshape( - 1, -1 - ) - output_fields = torch.tensor(output_fields, dtype=torch.float32) - - data = Data( - pos=vertices, - input_scalars=input_scalars, - x=input_fields, - output_scalars=output_scalars, - output_fields=output_fields, - edge_index=edge_index.T, - edge_weight=edge_weight, - faces=faces, - 
sample_id=sample_id, - input_fields_names=input_fields_names, - output_fields_names=output_fields_names, - input_scalars_names=input_scalars_names, - output_scalars_names=output_scalars_names, - **nodal_tags, - ) - - return data - - data = Data( - pos=vertices, - input_scalars=input_scalars, - x=input_fields, - edge_index=edge_index.T, - edge_weight=edge_weight, - faces=faces, - sample_id=sample_id, - input_fields_names=input_fields_names, - output_fields_names=output_fields_names, - input_scalars_names=input_scalars_names, - output_scalars_names=output_scalars_names, - **nodal_tags, - ) - return data diff --git a/benchmarks/Vi-Transf/src/data/loader/bridges/utils.py b/benchmarks/Vi-Transf/src/data/loader/bridges/utils.py deleted file mode 100644 index 034f08c0..00000000 --- a/benchmarks/Vi-Transf/src/data/loader/bridges/utils.py +++ /dev/null @@ -1,34 +0,0 @@ -import numpy as np -import torch -from torch_geometric.utils._coalesce import coalesce as geometric_coalesce - - -def my_coalesce(edges: torch.Tensor | np.ndarray, num_nodes: int, reduce="add"): - if isinstance(edges, np.ndarray): - edges = torch.tensor(edges).T - return geometric_coalesce(edges, num_nodes=num_nodes, reduce=reduce).T.numpy() - edges = geometric_coalesce(edges.T, num_nodes=num_nodes, reduce=reduce).T - return edges - - -def faces_to_edges(faces: np.ndarray, num_nodes: int, coalesce: bool = True): - """Creates a list of edges from a Faces array. - - Args: - faces (np.ndarray): Array of faces shape (n_faces, face_dim) - - Returns: - np.ndarray: the edge list of shape (n, 2) - """ - assert len(faces.shape) == 2, "Wrong shape for the faces, should be a 2D array" - - # Generate edges (without duplicates in one pass) - rolled = np.roll(faces, -1, axis=1) - edges = np.vstack((faces.ravel(), rolled.ravel())).T - edges = np.concatenate((edges, edges[:, ::-1]), axis=0) - - # Ensure unique edges by sorting each edge and using np.unique - if coalesce: - edges = my_coalesce(edges, num_nodes) - - return edges diff --git a/benchmarks/Vi-Transf/src/data/loader/bridges/vki_sample_to_geometric.py b/benchmarks/Vi-Transf/src/data/loader/bridges/vki_sample_to_geometric.py deleted file mode 100644 index a764be95..00000000 --- a/benchmarks/Vi-Transf/src/data/loader/bridges/vki_sample_to_geometric.py +++ /dev/null @@ -1,114 +0,0 @@ -import numpy as np -import torch -from torch_geometric.data import Data - -from plaid import Sample -from plaid import ProblemDefinition - -from .utils import faces_to_edges - - -def vki_sample_to_geometric( - sample: Sample, sample_id: int, problem_definition: ProblemDefinition -) -> Data: - """Converts a Plaid sample to PytorchGeometric Data object. 
- - Args: - sample (plaid.Sample): data sample - - Returns: - Data: the converted data sample - """ - vertices = sample.get_vertices(base_name="Base_2_2") - edge_index = [] - faces = sample.get_elements(base_name="Base_2_2")["QUAD_4"] - edge_index = faces_to_edges(faces, num_nodes=vertices.shape[0], coalesce=True) - - v1 = vertices[edge_index[:, 0]] - v2 = vertices[edge_index[:, 1]] - edge_weight = np.linalg.norm(v2 - v1, axis=1) - - # data names - input_fields_names = ["sdf"] - output_fields_names = ["mach", "nut"] - output_scalars_names = ["Q", "power", "Pr", "Tr", "eth_is", "angle_out"] - input_scalars_names = ["angle_in", "mach_out"] - - input_scalars = [] - output_scalars = [] - for name in input_scalars_names: - input_scalars.append(sample.get_scalar(name)) - for name in output_scalars_names: - output_scalars.append(sample.get_scalar(name)) - - if len(input_fields_names) >= 1: - input_fields = [] - for field_name in input_fields_names: - input_fields.append(sample.get_field(field_name, base_name="Base_2_2")) - input_fields = np.vstack(input_fields).T - input_fields = np.concatenate((vertices, input_fields), axis=1) - input_fields_names = ["x", "y", *input_fields_names] - else: - input_fields = vertices - input_fields_names = ["x", "y"] - - output_fields = [] - for field_name in output_fields_names: - output_fields.append(sample.get_field(field_name, base_name="Base_2_2")) - output_fields = np.vstack(output_fields).T - - # torch tensor conversion - input_scalars = torch.tensor(input_scalars, dtype=torch.float32).reshape(1, -1) - input_fields = torch.tensor(input_fields, dtype=torch.float32) - - vertices = torch.tensor(vertices, dtype=torch.float32) - edge_weight = torch.tensor(edge_weight, dtype=torch.float32) - edge_index = torch.tensor(edge_index, dtype=torch.long) - faces = torch.tensor(faces, dtype=torch.long) - - # Extracting special nodal tags - nodal_tags = {} - for k, v in sample.get_nodal_tags(base_name="Base_2_2").items(): - nodal_tags[k + "_id"] = torch.tensor(v, dtype=torch.long) - - if None not in output_scalars and None not in output_fields: - output_scalars = torch.tensor(output_scalars, dtype=torch.float32).reshape( - 1, -1 - ) - output_fields = torch.tensor(output_fields, dtype=torch.float32) - - data = Data( - pos=vertices, - input_scalars=input_scalars, - x=input_fields, - output_scalars=output_scalars, - output_fields=output_fields, - edge_index=edge_index.T, - edge_weight=edge_weight, - faces=faces, - sample_id=sample_id, - input_fields_names=input_fields_names, - output_fields_names=output_fields_names, - input_scalars_names=input_scalars_names, - output_scalars_names=output_scalars_names, - **nodal_tags, - ) - - return data - - data = Data( - pos=vertices, - input_scalars=input_scalars, - x=input_fields, - edge_index=edge_index.T, - edge_weight=edge_weight, - faces=faces, - sample_id=sample_id, - input_fields_names=input_fields_names, - output_fields_names=output_fields_names, - input_scalars_names=input_scalars_names, - output_scalars_names=output_scalars_names, - **nodal_tags, - ) - - return data diff --git a/benchmarks/Vi-Transf/src/data/loader/hub_loader.py b/benchmarks/Vi-Transf/src/data/loader/hub_loader.py deleted file mode 100644 index 3e679b7c..00000000 --- a/benchmarks/Vi-Transf/src/data/loader/hub_loader.py +++ /dev/null @@ -1,50 +0,0 @@ -from datasets import load_dataset - -from plaid.bridges.huggingface_bridge import huggingface_dataset_to_plaid -from plaid import Dataset as PlaidDataset -from plaid import ProblemDefinition - -from .loader import 
Loader - - -class HubLoader(Loader): - def __init__( - self, - bridge: callable, - dataset_name: str, - task_split: str, - cache_dir: str, - processes_number: int = 1, - ): - """Initializes the HubLoader with the given parameters. - - :param bridge: A callable that bridges the dataset loading process. - :param dataset_dir: The directory where the dataset is stored. - :param task_split: The split of the dataset to load (default is "all_samples"). - :param load_from: The source from which to load the dataset (default is "huggingface"). - """ - super().__init__(bridge, task_split, processes_number=processes_number) - self.dataset_name = dataset_name - self.cache_dir = cache_dir - - def load_plaid(self, **kwargs) -> tuple[ProblemDefinition, PlaidDataset, ...]: - hf_dataset = None - try: - hf_dataset = load_dataset( - self.dataset_name, split="all_samples", cache_dir=self.cache_dir - ) - except Exception as e: - print(f"Error loading dataset from Hugging Face: {e}") - print( - "Please refer to the documentation (https://huggingface.co/PLAID-datasets)" - ) - print( - "Provide a correct dataset_name with format 'PLAID-datasets/DATASET'." - ) - raise e - - plaid_dataset, problem_definition = huggingface_dataset_to_plaid(hf_dataset) - - return problem_definition, *self.get_dataset_split( - problem_definition, plaid_dataset - ) diff --git a/benchmarks/Vi-Transf/src/data/loader/loader.py b/benchmarks/Vi-Transf/src/data/loader/loader.py deleted file mode 100644 index 0ed530dc..00000000 --- a/benchmarks/Vi-Transf/src/data/loader/loader.py +++ /dev/null @@ -1,117 +0,0 @@ -import os -from multiprocessing import Pool -from typing import Union - -from torch_geometric.data import Data -from tqdm import tqdm - -from plaid import Dataset as PlaidDataset -from plaid import ProblemDefinition - - -class Loader: - def __init__( - self, - bridge: callable, - task_split: Union[list[str], str, list[Union[list[str], str]]], - processes_number: int = 1, - ): - """Initializes the Loader with the given parameters. - - :param bridge: A callable that bridges the dataset loading process. - :param task_split: The split(s) of the dataset to load. - """ - self.bridge = bridge - self.task_split: Union[list[str], str, list[Union[list[str], str]]] = task_split - self.processes_number = processes_number - - def load_plaid(self, verbose: bool) -> tuple[ProblemDefinition, list[PlaidDataset]]: - raise NotImplementedError("This method should be implemented in the subclass") - - def get_dataset_split( - self, problem_definition: ProblemDefinition, plaid_dataset: PlaidDataset - ) -> Union[PlaidDataset, tuple[PlaidDataset]]: - if not type(self.task_split) == str: - datasets = [] - for split in self.task_split: - if type(split) == str: - split_ids = problem_definition.get_split(split) - else: - split_ids = [ - split_id - for split_str in split - for split_id in problem_definition.get_split(split_str) - ] - dataset = PlaidDataset() - dataset.set_samples(plaid_dataset.get_samples(ids=split_ids)) - datasets.append(dataset) - - return tuple(datasets) - - ids = problem_definition.get_split(self.task_split) - dataset = PlaidDataset() - dataset.set_samples(plaid_dataset.get_samples(ids=ids)) - return (dataset,) - - def load(self, verbose=False) -> tuple[ProblemDefinition, tuple[list[Data], ...]]: - """Load and converts a plaid dataset to torch geometric format. 
- - Returns: - tuple[ProblemDefinition, Union[list[Data], tuple[list[Data], ...]]]: - A tuple containing the problem definition and either a single list of Data objects - or a tuple of multiple lists of Data objects. - """ - buffer = self.load_plaid(verbose=verbose) - problem_definition = buffer[0] - dataset_list = buffer[1:] - - processed_list = [] - for dataset in dataset_list: - processed_list.append( - self.plaid_to_bridge( - dataset, problem_definition=problem_definition, verbose=verbose - ) - ) - - return problem_definition, *processed_list - - def plaid_to_bridge( - self, dataset: PlaidDataset, problem_definition: ProblemDefinition, verbose=True - ) -> list[Data]: - """Converts a Plaid dataset to PytorchGeometric dataset. - - Args: - dataset (plaid.containers.dataset.Dataset): Plaid dataset - - Returns: - list[Data]: the converted dataset - """ - if verbose: - print("in bridge") - if self.processes_number == -1: - self.processes_number = os.cpu_count() - data_list = [] - sample_ids, samples = list(zip(*list(dataset.get_samples().items()))) - if self.processes_number == 0 or self.processes_number == 1: - if verbose: - iterator = tqdm(zip(samples, sample_ids), total=len(samples)) - else: - iterator = zip(samples, sample_ids) - for sample, sample_id in iterator: - new_data = self.bridge(sample, sample_id, problem_definition) - data_list.append(new_data) - return data_list - - with Pool(processes=self.processes_number) as p: - args_iter = zip(samples, sample_ids, [problem_definition] * len(samples)) - if verbose: - iterator = tqdm(p.starmap(self.bridge, args_iter), total=len(samples)) - else: - iterator = p.starmap(self.bridge, args_iter) - for new_data in iterator: - # if isinstance(new_data, list): - # data_list.extend(new_data) - # else: - data_list.append(new_data) - - return data_list diff --git a/benchmarks/Vi-Transf/src/data/scaler.py b/benchmarks/Vi-Transf/src/data/scaler.py deleted file mode 100644 index 6884e36b..00000000 --- a/benchmarks/Vi-Transf/src/data/scaler.py +++ /dev/null @@ -1,320 +0,0 @@ -from copy import deepcopy - -import torch -from sklearn.preprocessing import StandardScaler as SklearnStandardScaler -from torch_geometric.data import Data - - -class Scaler: - def __init__(self, *args, **kwargs): - pass - - def partial_fit(self, *args, **kwargs): - pass - - def fit(self, dataset: list[Data]) -> None: - pass - - def transform(self, dataset: list[Data]) -> list[Data]: - return dataset - - def fit_transform(self, dataset: list[Data]) -> list[Data]: - return dataset - - def inverse_transform(self, dataset: list[Data]) -> list[Data]: - return dataset - - -class NoScaler(Scaler): - def __init__(self, *args, **kwargs): - super().__init__() - - -class StandardScaler(Scaler): - """StandardScaler class for scaling and inverse scaling of node features, output scalars, and output fields in - PyTorch Geometric datasets. - - Args: - scalers (Union[str, list[str]], optional): A string representing a scaler name, or a list of three scalers - for node features, output scalars, and output fields. - Defaults to "StandardScaler". - """ - - def __init__(self): - self.scalers = [SklearnStandardScaler for _ in range(6)] - - def _reset(self): - """Resets the scalers for node features, output scalars, and output fields.""" - ( - self.xf_scaler, - self.xs_scaler, - self.yf_scaler, - self.ys_scaler, - self.edge_attr_scaler, - self.edge_weight_scaler, - ) = [s() for s in self.scalers] - - def fit(self, dataset: list[Data]): - """Fits the scalers to the provided dataset. 
Each Data object in the dataset must have - `x` and `input_scalars`. `output_scalars` and `output_fields` are optional. - - Args: - dataset (list[Data]): A list of PyTorch Geometric Data objects. - """ - self._reset() - - for data in dataset: - self.xf_scaler.partial_fit(data.x) - - if ( - hasattr(data, "input_scalars") - and isinstance(data.input_scalars, torch.Tensor) - and len(data.input_scalars.ravel()) > 0 - ): - self.xs_scaler.partial_fit(data.input_scalars) - - # Check if output_fields exists before fitting - if ( - hasattr(data, "output_fields") - and isinstance(data.output_fields, torch.Tensor) - and len(data.output_fields.ravel()) > 0 - ): - self.yf_scaler.partial_fit(data.output_fields) - - # Check if output_scalars exists before fitting - if ( - hasattr(data, "output_scalars") - and isinstance(data.output_scalars, torch.Tensor) - and len(data.output_scalars.ravel()) > 0 - ): - self.ys_scaler.partial_fit(data.output_scalars) - - if data.edge_attr is not None: - self.edge_attr_scaler.partial_fit(data.edge_attr) - if data.edge_weight is not None: - self.edge_weight_scaler.partial_fit(data.edge_weight.reshape(-1, 1)) - - def transform(self, dataset: list[Data]): - """Transforms the dataset based on the fitted scalers. - - Args: - dataset (list[Data]): A list of PyTorch Geometric Data objects to be transformed. - """ - dataset = deepcopy(dataset) - - for data in dataset: - xf_dtype = data.x.dtype - - data.x = torch.tensor( - self.xf_scaler.transform(data.x), - dtype=xf_dtype, - ) - if ( - hasattr(data, "input_scalars") - and isinstance(data.input_scalars, torch.Tensor) - and len(data.input_scalars.ravel()) > 0 - ): - xs_dtype = data.input_scalars.dtype - data.input_scalars = torch.tensor( - self.xs_scaler.transform(data.input_scalars), - dtype=xs_dtype, - ) - - # Transform output_fields if it exists - if ( - hasattr(data, "output_fields") - and isinstance(data.output_fields, torch.Tensor) - and len(data.output_fields.ravel()) > 0 - ): - yf_dtype = data.output_fields.dtype - data.output_fields = torch.tensor( - self.yf_scaler.transform(data.output_fields), - dtype=yf_dtype, - ) - - # Transform output_scalars if it exists - if ( - hasattr(data, "output_scalars") - and isinstance(data.output_scalars, torch.Tensor) - and len(data.output_scalars.ravel()) > 0 - ): - ys_dtype = data.output_scalars.dtype - data.output_scalars = torch.tensor( - self.ys_scaler.transform(data.output_scalars), - dtype=ys_dtype, - ) - - if ( - isinstance(data.edge_attr, torch.Tensor) - and len(data.edge_attr.ravel()) > 0 - ): - edge_attr_dtype = data.edge_attr.dtype - data.edge_attr = torch.as_tensor( - self.edge_attr_scaler.transform(data.edge_attr), - dtype=edge_attr_dtype, - ) - - if ( - isinstance(data.edge_weight, torch.Tensor) - and len(data.edge_weight.ravel()) > 0 - ): - edge_weight_dtype = data.edge_weight.dtype - data.edge_weight = torch.as_tensor( - self.edge_weight_scaler.transform(data.edge_weight.reshape(-1, 1)), - dtype=edge_weight_dtype, - ).reshape(-1) - - return dataset - - def fit_transform(self, dataset: list[Data]): - """Fits the scalers and then transforms the dataset. A combination of `fit` and `transform`. - - Args: - dataset (list[Data]): A list of PyTorch Geometric Data objects. - - Returns: - list[Data]: Transformed dataset. - """ - self.fit(dataset) - return self.transform(dataset) - - def inverse_transform(self, dataset: list[Data] | Data): - """Reverts the scaling transformation applied on the dataset, bringing the data back to its original scale. 
- - Args: - dataset (list[Data] | Data): A list or single PyTorch Geometric Data object to be inverse transformed. - - Returns: - list[Data] | Data: Inverse transformed dataset or data point. - """ - dataset = deepcopy(dataset) - is_data = False - if isinstance(dataset, Data): - dataset = [dataset] - is_data = True - - for data in dataset: - xf_dtype = data.x.dtype - - data.x = torch.tensor( - self.xf_scaler.inverse_transform(data.x), - dtype=xf_dtype, - ) - - if ( - hasattr(data, "input_scalars") - and isinstance(data.input_scalars, torch.Tensor) - and len(data.input_scalars.ravel()) > 0 - ): - xs_dtype = data.input_scalars.dtype - data.input_scalars = torch.tensor( - self.xs_scaler.inverse_transform(data.input_scalars), - dtype=xs_dtype, - ) - - # Inverse transform output_fields if it exists - if ( - hasattr(data, "output_fields") - and isinstance(data.output_fields, torch.Tensor) - and len(data.output_fields.ravel()) > 0 - ): - yf_dtype = data.output_fields.dtype - data.output_fields = torch.tensor( - self.yf_scaler.inverse_transform(data.output_fields), - dtype=yf_dtype, - ) - - # Inverse transform output_scalars if it exists - if ( - hasattr(data, "output_scalars") - and isinstance(data.output_scalars, torch.Tensor) - and len(data.output_scalars.ravel()) > 0 - ): - ys_dtype = data.output_scalars.dtype - data.output_scalars = torch.tensor( - self.ys_scaler.inverse_transform(data.output_scalars), - dtype=ys_dtype, - ) - - if ( - isinstance(data.edge_attr, torch.Tensor) - and len(data.edge_attr.ravel()) > 0 - ): - edge_attr_dtype = data.edge_attr.dtype - data.edge_attr = torch.as_tensor( - self.edge_attr_scaler.inverse_transform(data.edge_attr), - dtype=edge_attr_dtype, - ) - - if ( - isinstance(data.edge_weight, torch.Tensor) - and len(data.edge_weight.ravel()) > 0 - ): - edge_weight_dtype = data.edge_weight.dtype - data.edge_weight = torch.as_tensor( - self.edge_weight_scaler.inverse_transform( - data.edge_weight.reshape(-1, 1) - ), - dtype=edge_weight_dtype, - ).reshape(-1) - - if ( - hasattr(data, "fields_prediction") - and isinstance(data.fields_prediction, torch.Tensor) - and len(data.fields_prediction.ravel()) > 0 - ): - data.fields_prediction = torch.tensor( - self.yf_scaler.inverse_transform(data.fields_prediction), - dtype=data.fields_prediction.dtype, - ) - - if ( - hasattr(data, "scalars_prediction") - and isinstance(data.scalars_prediction, torch.Tensor) - and len(data.scalars_prediction.ravel()) > 0 - ): - data.scalars_prediction = torch.tensor( - self.ys_scaler.inverse_transform(data.scalars_prediction), - dtype=data.scalars_prediction.dtype, - ) - - if is_data: - return dataset[0] - - return dataset - - def inverse_transform_prediction(self, dataset: list[Data]): - """Inverse transforms the output_fields_prediction attribute of the dataset. - - Args: - dataset (list[Data]): A list of PyTorch Geometric Data objects. - - Returns: - list[Data]: Dataset with inverse transformed predictions. 
- """ - for data in dataset: - if ( - hasattr(data, "fields_prediction") - and isinstance(data.fields_prediction, torch.Tensor) - and len(data.fields_prediction.ravel()) > 0 - ): - data.fields_prediction = torch.tensor( - self.yf_scaler.inverse_transform( - data.fields_prediction.detach().cpu().numpy() - ), - dtype=data.fields_prediction.dtype, - ) - - if ( - hasattr(data, "scalars_prediction") - and isinstance(data.scalars_prediction, torch.Tensor) - and len(data.scalars_prediction.ravel()) > 0 - ): - data.scalars_prediction = torch.tensor( - self.ys_scaler.inverse_transform( - data.scalars_prediction.detach().cpu().numpy().reshape(1, -1) - ), - dtype=data.scalars_prediction.dtype, - ) - - return dataset diff --git a/benchmarks/Vi-Transf/src/data/utils.py b/benchmarks/Vi-Transf/src/data/utils.py deleted file mode 100644 index 2543c2a8..00000000 --- a/benchmarks/Vi-Transf/src/data/utils.py +++ /dev/null @@ -1,52 +0,0 @@ -import copy - -from torch_geometric.data import Data - -from plaid import Dataset as PlaidDataset - - -def extract_plaid_dataset(plaid_dataset: PlaidDataset, ids: list[int]) -> PlaidDataset: - extracted_dataset = copy.deepcopy(plaid_dataset) - sample_dict = plaid_dataset.get_samples(ids=ids) - extracted_dataset._samples = sample_dict - - return extracted_dataset - - -def split_plaid_train_test( - plaid_dataset: PlaidDataset, train_ids: list[int], test_ids: list[int] -) -> tuple[PlaidDataset, PlaidDataset]: - plaid_train_dataset = extract_plaid_dataset(plaid_dataset, ids=train_ids) - plaid_test_dataset = extract_plaid_dataset(plaid_dataset, ids=test_ids) - - return plaid_train_dataset, plaid_test_dataset - - -def split_pyg_train_test( - pyg_dataset: list[Data], train_ids: list[int], test_ids: list[int] -) -> tuple[list[Data], list[Data]]: - train_dataset = [] - test_dataset = [] - - for data in pyg_dataset: - if data.sample_id in train_ids: - train_dataset.append(data) - elif data.sample_id in test_ids: - test_dataset.append(data) - - return train_dataset, test_dataset - - -def split_temporal_pyg_train_test( - pyg_dataset: list[Data], train_ids: list[int], test_ids: list[int] -) -> tuple[list[Data], list[Data]]: - train_dataset = [] - test_dataset = [] - - for data_list in pyg_dataset: - if data_list[0].sample_id in train_ids: - train_dataset.append(data_list) - elif data_list[0].sample_id in test_ids: - test_dataset.append(data_list) - - return train_dataset, test_dataset diff --git a/benchmarks/Vi-Transf/src/evaluation/submission.py b/benchmarks/Vi-Transf/src/evaluation/submission.py deleted file mode 100644 index d6c29b92..00000000 --- a/benchmarks/Vi-Transf/src/evaluation/submission.py +++ /dev/null @@ -1,53 +0,0 @@ -import os -import pickle - -import tqdm as tqdm -from torch_geometric.data import Data - -from ..data.loader.loader import PlaidDataset - - -def create_submission( - prediction_dataset: list[Data], plaid_dataset: PlaidDataset, save_dir: str = "." -): - pyg_id_dico = {} - for n_data, data in enumerate(prediction_dataset): - pyg_id_dico[data.sample_id] = n_data - - ids_test = plaid_dataset.get_sample_ids() - pyg_sample_ids = pyg_id_dico.keys() - - assert set(ids_test) == set(pyg_sample_ids), "Dataset sample ids do not match!" 
- - output_fields_names = prediction_dataset[0].output_fields_names - output_scalars_names = ( - prediction_dataset[0].output_scalars_names - if hasattr(prediction_dataset[0], "output_scalars_names") - else [] - ) - - reference = [] - for i, id in enumerate(ids_test): - reference.append({}) - sample = prediction_dataset[pyg_id_dico[id]] - output_fields_prediction = ( - sample.fields_prediction.detach().cpu().numpy() - if hasattr(sample, "fields_prediction") - and sample.fields_prediction is not None - else None - ) - output_scalars_predictions = ( - sample.scalars_prediction.reshape(-1).detach().cpu().numpy() - if hasattr(sample, "scalars_prediction") - and sample.scalars_prediction is not None - else None - ) - - for n_field, fn in enumerate(output_fields_names): - reference[i][fn] = output_fields_prediction[:, n_field] - - for n_scalar, sn in enumerate(output_scalars_names): - reference[i][sn] = output_scalars_predictions[n_scalar] - - with open(os.path.join(save_dir, "reference.pkl"), "wb") as file: - pickle.dump(reference, file) diff --git a/benchmarks/Vi-Transf/src/models/__init__.py b/benchmarks/Vi-Transf/src/models/__init__.py deleted file mode 100644 index e69de29b..00000000 diff --git a/benchmarks/Vi-Transf/src/models/model.py b/benchmarks/Vi-Transf/src/models/model.py deleted file mode 100644 index 951e7743..00000000 --- a/benchmarks/Vi-Transf/src/models/model.py +++ /dev/null @@ -1,31 +0,0 @@ -import torch.nn as nn -from torch_geometric.data import Batch, Data - -from plaid import Dataset as PlaidDataset - - -class BaseModel(nn.Module): - def __init__(self): - super(BaseModel, self).__init__() - pass - - def preprocess( - self, - pyg_dataset: list[Data], - plaid_dataset: PlaidDataset, - seed: int, - type: str, - **kwargs, - ): - pass - - def forward(self, data: Batch): - pass - - def postprocess( - self, pyg_dataset: list[Data], plaid_dataset: PlaidDataset, type: str, **kwargs - ): - return pyg_dataset - - def evaluate(self, data: Data): - pass diff --git a/benchmarks/Vi-Transf/src/models/vits/__init__.py b/benchmarks/Vi-Transf/src/models/vits/__init__.py deleted file mode 100644 index e69de29b..00000000 diff --git a/benchmarks/Vi-Transf/src/models/vits/flatformer_cls_less.py b/benchmarks/Vi-Transf/src/models/vits/flatformer_cls_less.py deleted file mode 100644 index 5a7559ef..00000000 --- a/benchmarks/Vi-Transf/src/models/vits/flatformer_cls_less.py +++ /dev/null @@ -1,221 +0,0 @@ -import random -from typing import Callable, Optional, Union - -import torch -import torch.nn as nn -from einops import rearrange -from torch import Tensor -from torch.nn import TransformerEncoder, TransformerEncoderLayer -from torch_geometric.data import Batch, Data - -from ..model import BaseModel -from .tokenization.tokenizer import Tokenizer - -# B: Batch size -# T: Number of tokens -# L: Latent dimension of the tokens -# Sin: Number of input scalars -# Fin: Number of input fields -# Sout: Number of output scalars -# Fout: Number of output fields -# Nv: Number of vertices per subdomain - - -class FlatFormerCLSLess(BaseModel): - def __init__( - self, - n_vertices_per_subdomain: int, - n_head: int, - dim_ff: int, - num_layers: int, - tokenizer: Tokenizer, - input_field_dim: int = None, - input_scalar_dim: int = None, - output_scalar_dim: int = None, - output_field_dim: int = None, - activation: Union[str, Callable[[Tensor], Tensor]] = "relu", - norm_first: bool = True, - dropout: float = 0.1, - latent_dim: Optional[int] = None, - **kwargs, - ): - super().__init__() - self.device = 
torch.device("cuda" if torch.cuda.is_available() else "cpu") - self.output_scalar_dim = output_scalar_dim - self.output_field_dim = output_field_dim - self.input_scalar_dim = input_scalar_dim - self.input_field_dim = input_field_dim - - self.criterion = nn.MSELoss(reduction="none") - - # Tokenizer - token_dim = n_vertices_per_subdomain * input_field_dim - self.tokenizer = tokenizer - self.n_head = n_head - - # Linear projection of the flattened tokens onto the latent space - self.encoder = nn.Linear(token_dim, latent_dim - input_scalar_dim, bias=False) - - # CLS token for the transformer encoder - self.cls_token = nn.Parameter( - torch.zeros(1, 1, latent_dim), requires_grad=True - ) # 1 1 L - nn.init.xavier_uniform_(self.cls_token) - - # Transformer encoder - encoder_layers = TransformerEncoderLayer( - d_model=latent_dim, - nhead=n_head, - dim_feedforward=dim_ff, - dropout=dropout, - batch_first=True, - norm_first=norm_first, - dtype=torch.float32, - activation=activation, - ) - self.transformer_encoder = TransformerEncoder(encoder_layers, num_layers) - - # Linear decoder - self.decoder = nn.Linear( - latent_dim + input_scalar_dim, - n_vertices_per_subdomain * (self.output_field_dim + self.output_scalar_dim), - bias=False, - ) - - def preprocess( - self, pyg_dataset: list[Data], seed: Optional[int] = None, **kwargs - ) -> list[Data]: - if seed is None: - seed = random.randint(0, 2**32 - 1) - print(f"Using seed: {seed} for data preprocessing") - dataset = self.tokenizer.preprocess(pyg_dataset, seed=seed) - - return dataset - - def forward(self, data_batch: Data | Batch): - tokens, src_key_padding_mask = self.tokenizer(data_batch) # B T (Nv * F) - if self.input_scalar_dim > 0: - input_scalars = data_batch.input_scalars.to(self._device) - else: - input_scalars = None - - field_predictions, scalar_predictions = self.forward_batch( - input_scalars, tokens, src_key_padding_mask - ) - field_predictions = self.tokenizer.untokenize_predictions( - field_predictions, data_batch - ) - - return field_predictions, scalar_predictions - - def forward_batch(self, input_scalars, tokens, src_key_padding_mask): - # latent projection - latent_tokens = self.encoder(tokens) # B T (Nv * F) -> B T (L - Sin) - - # concatenating the scalars to each token - if self.input_scalar_dim > 0: - scalar_concat_tensor = input_scalars.unsqueeze(1).expand( - (-1, latent_tokens.shape[1], -1) - ) # B Sin -> B T Sin - latent_tokens = torch.concat( - (latent_tokens, scalar_concat_tensor), dim=2 - ) # B T (L - Sin) -> B T L - - # transformer encoder - encoded_tokens = self.transformer_encoder( - latent_tokens, src_key_padding_mask=src_key_padding_mask - ) # B T L - - # reinjecting the scalars - if self.input_scalar_dim > 0: - scalar_concat_tensor = input_scalars.unsqueeze(1).expand( - (-1, encoded_tokens.shape[1], -1) - ) # B Sin -> B T Sin - encoded_tokens = torch.concat( - (encoded_tokens, scalar_concat_tensor), dim=2 - ) # B T L -> B T (L + Sin) - - # field predictions - predictions = self.decoder(encoded_tokens) # B T (Fout * Nv) - predictions = rearrange( - predictions, - "b t (n f) -> b t n f", - f=(self.output_field_dim + self.output_scalar_dim), - ) # B T Nv (Fout + Sout) - - field_predictions = predictions[:, :, :, : self.output_field_dim] - scalar_predictions = predictions[:, :, :, self.output_field_dim :].mean( - dim=(1, 2) - ) - if self.output_scalar_dim == 0: - scalar_predictions = None - - return field_predictions, scalar_predictions - - def compute_loss(self, data_batch: Data | Batch) -> torch.Tensor: - tokens, 
src_key_padding_mask = self.tokenizer(data_batch) - tokens, src_key_padding_mask = ( - tokens.to(self._device), - src_key_padding_mask.to(self._device), - ) - - if self.input_scalar_dim > 0: - input_scalars = data_batch.input_scalars.to(self._device) - else: - input_scalars = None - - # Forward pass - field_predictions, scalar_predictions = self.forward_batch( - input_scalars, tokens, src_key_padding_mask - ) - - # Removing padding tokens - field_predictions = field_predictions[~src_key_padding_mask, ...] - expanded_output_fields = data_batch.expanded_output_fields.to(self._device)[ - ~src_key_padding_mask, ... - ].reshape(field_predictions.shape) - - # Computing losses. It is also computed for padding nodes thanks to the expanded_output_fields tensor - field_loss = self.criterion(field_predictions, expanded_output_fields).mean( - dim=(0, 1) - ) - if self.output_scalar_dim > 0: - scalar_loss = self.criterion( - scalar_predictions, data_batch.output_scalars.to(self._device) - ).mean(dim=0) - else: - scalar_loss = torch.tensor(0, dtype=torch.float32, requires_grad=False) - - return field_loss, scalar_loss - - def predict(self, data: Data): - tokens, src_key_padding_mask = self.tokenizer(data) - tokens, src_key_padding_mask = ( - tokens.to(self._device), - src_key_padding_mask.to(self._device), - ) - if self.input_scalar_dim > 0: - input_scalars = data.input_scalars.to(self._device) - else: - input_scalars = None - field_predictions, scalar_predictions = self.forward_batch( - input_scalars, tokens, src_key_padding_mask - ) - - # removing padding tokens - field_predictions = field_predictions[~src_key_padding_mask, ...] - - # reconstructing the output_field matrix for the data sample - field_predictions = self.tokenizer.untokenize( - field_predictions.cpu(), data, keep_list=True - ) - - field_predictions = field_predictions[0] - if self.output_scalar_dim > 0: - scalar_predictions = scalar_predictions[0].cpu() - - return field_predictions, scalar_predictions - - @property - def _device(self): - return self.encoder.weight.device diff --git a/benchmarks/Vi-Transf/src/models/vits/tokenization/__init__.py b/benchmarks/Vi-Transf/src/models/vits/tokenization/__init__.py deleted file mode 100644 index e69de29b..00000000 diff --git a/benchmarks/Vi-Transf/src/models/vits/tokenization/flatten_data_tokenizers.py b/benchmarks/Vi-Transf/src/models/vits/tokenization/flatten_data_tokenizers.py deleted file mode 100644 index a770212a..00000000 --- a/benchmarks/Vi-Transf/src/models/vits/tokenization/flatten_data_tokenizers.py +++ /dev/null @@ -1,77 +0,0 @@ -import numpy as np -import torch -from einops import rearrange -from torch_geometric.data import Data - -from .morton import compute_morton_order - - -def flatten_tokenizer( - data: Data, n_vertices_per_subdomain: int, tokenization_type="morton" -): - n_communities = data.n_communities - padded_community_orders = torch.empty( - (n_communities, n_vertices_per_subdomain), dtype=torch.int - ) - community_reverse_orders = torch.empty(data.x.shape[0], dtype=torch.int) - - for i in range(n_communities): - community_map = data.communities == i - community_size = torch.sum(community_map).item() - community_pos = data.pos[community_map, :] - - n_repeat = n_vertices_per_subdomain // community_size + 1 - if n_vertices_per_subdomain // community_size == 0: - raise ValueError( - f"n_vertices_per_subdomain ({n_vertices_per_subdomain}) is smaller than the number of features ({data.x.shape[1]})." 
- ) - - if tokenization_type == "morton": - local_community_order = compute_morton_order(community_pos) - else: - local_community_order = np.arange(community_size) - - padded_local_community_order = np.tile(local_community_order, n_repeat)[ - :n_vertices_per_subdomain - ] - local_reverse_community_order = np.zeros((community_size,), dtype=int) - local_reverse_community_order[local_community_order] = np.arange(community_size) - - local_to_global = np.where(community_map)[0] - padded_community_order = local_to_global[padded_local_community_order] - reverse_community_order = ( - local_reverse_community_order + i * n_vertices_per_subdomain - ) - - padded_community_orders[i, :] = torch.from_numpy(padded_community_order).to( - torch.int - ) - community_reverse_orders[community_map] = torch.from_numpy( - reverse_community_order - ).to(torch.int) - - data.padded_community_orders = padded_community_orders # T n_vertices_per_subdomain - data.community_reverse_orders = community_reverse_orders # N - data.tokens = rearrange( - data.x[padded_community_orders], "t n d -> t (n d)" - ) # T n_vertices_per_subdomain d -> T (n_vertices_per_subdomain d) - if hasattr(data, "output_fields"): - data.expanded_output_fields = rearrange( - data.output_fields[padded_community_orders], "t n d -> t (n d)" - ) - - return data - - -def flatten_simple(data: Data, n_vertices_per_subdomain: int): - return flatten_tokenizer(data, n_vertices_per_subdomain, "simple") - - -def flatten_morton_ordered(data: Data, n_vertices_per_subdomain: int): - return flatten_tokenizer(data, n_vertices_per_subdomain, "morton") - - -data_tokenizer_registry = { - "morton": flatten_morton_ordered, - "simple": flatten_simple, -} diff --git a/benchmarks/Vi-Transf/src/models/vits/tokenization/flatten_tokenizer.py b/benchmarks/Vi-Transf/src/models/vits/tokenization/flatten_tokenizer.py deleted file mode 100644 index 48f4cc73..00000000 --- a/benchmarks/Vi-Transf/src/models/vits/tokenization/flatten_tokenizer.py +++ /dev/null @@ -1,153 +0,0 @@ -import os -import random -from typing import Literal, Optional - -import torch -from einops import rearrange -from torch.multiprocessing import Pool -from torch_geometric.data import Batch, Data -from tqdm import tqdm - -from .flatten_data_tokenizers import data_tokenizer_registry -from .partitioners.partitioner import Partitioner -from .tokenizer import Tokenizer - - -class FlattenTokenizer(Tokenizer): - def __init__( - self, - partitioner: Partitioner, - output_field_dim: int, - tokenization_type: Literal["morton", "simple"] = "morton", - processes_number=1, - ): - super().__init__() - self.partitioner = partitioner - self.n_vertices_per_subdomain = partitioner.n_vertices_per_subdomain - self.data_tokenizer_fn_name: str = tokenization_type - self.data_tokenizer_fn: callable = data_tokenizer_registry[tokenization_type] - self.processes_number = ( - os.cpu_count() if processes_number == -1 else processes_number - ) - self.output_field_dim = output_field_dim - - def _tokenize(self, dataset: list[Data]) -> list[Data]: - """Tokenizes the dataset using the specified partitioner.""" - dataset[0].x.shape[1] * self.n_vertices_per_subdomain - n_tokens_per_sim = max([datapoint.n_communities for datapoint in dataset]) - - tokenized_dataset = [] - print( - f"Using {self.processes_number} processes for the tokenizer preprocessing." 
- ) - if self.processes_number == 0 or self.processes_number == 1: - for datapoint in tqdm(dataset): - data = process_data_tuple( - self.data_tokenizer_fn_name, - datapoint, - self.n_vertices_per_subdomain, - n_tokens_per_sim, - ) - tokenized_dataset.append(data) - else: - with Pool(self.processes_number) as p: - for processed_datapoint in tqdm( - p.starmap( - process_data_tuple, - zip( - [self.data_tokenizer_fn_name] * len(dataset), - dataset, - [self.n_vertices_per_subdomain] * len(dataset), - [n_tokens_per_sim] * len(dataset), - ), - ), - total=len(dataset), - ): - tokenized_dataset.append(processed_datapoint) - return tokenized_dataset - - def preprocess(self, dataset, seed: Optional[int] = None) -> list[Data]: - # partitioning each datapoint in the dataset - if seed is None: - seed = random.randint(0, 2**32 - 1) - print("Using random seed for partitioning:", seed) - dataset = self.partitioner.partition(dataset, seed=seed) - - # tokenizing each datapoint in the dataset - dataset = self._tokenize(dataset) - return dataset - - def forward(self, data): - # Flatten the input data - return data.tokens, data.attn_mask - - def untokenize( - self, full_predictions: torch.Tensor, data_batch: Batch | Data, keep_list=False - ) -> torch.tensor: - # full_predictions: B T D - result_list = [] - - if isinstance(data_batch, Batch): - data_batch = data_batch.to_data_list() - else: - data_batch = [data_batch] - full_predictions = full_predictions[None, ...] - - for i, data in enumerate(data_batch): - new_result = untokenize_prediction_data( - full_predictions[i], data, self.output_field_dim - ) - result_list.append(new_result) - - if keep_list: - return result_list - - return torch.vstack(result_list) - - -def untokenize_prediction_data(full_predictions, data, pred_dim): - """Unflattens and removes padding-associated outputs.""" - return rearrange(full_predictions, "t n d -> (t n) d")[ - data.community_reverse_orders - ] - - -def process_data_tuple( - data_tokenizer_fn_name, datapoint, n_vertices_per_subdomain, n_tokens_per_sim -): - data_tokenizer_fn = data_tokenizer_registry[data_tokenizer_fn_name] - data = data_tokenizer_fn(datapoint, n_vertices_per_subdomain) - - cross_domain_padded_token, attn_mask = pad_subdomains(data.tokens, n_tokens_per_sim) - data.tokens = cross_domain_padded_token.unsqueeze(0) - data.attn_mask = attn_mask.unsqueeze(0) - - if hasattr(data, "expanded_output_fields"): - cross_domain_padded_expanded_output_fields, _ = pad_subdomains( - data.expanded_output_fields, n_tokens_per_sim - ) - data.expanded_output_fields = ( - cross_domain_padded_expanded_output_fields.unsqueeze(0) - ) - - return data - - -def pad_subdomains(tokens, n_tokens_per_sim): - token_dim = tokens.shape[1] - n_sequence_tokens = tokens.shape[0] - pad_token = torch.zeros((1, token_dim)) - - if n_tokens_per_sim > n_sequence_tokens: - tokens = torch.cat( - [tokens, pad_token.tile((n_tokens_per_sim - n_sequence_tokens, 1))] - ) - else: - assert n_tokens_per_sim == n_sequence_tokens, ( - f"n_tokens_per_sim ({n_tokens_per_sim}) must be equal to the number of sequence tokens ({n_sequence_tokens}) or greater." 
- ) - mask = torch.ones(n_tokens_per_sim) - mask[n_sequence_tokens:] = 0 - mask = (mask == 0).to(tokens.device) - - return tokens, mask diff --git a/benchmarks/Vi-Transf/src/models/vits/tokenization/morton.py b/benchmarks/Vi-Transf/src/models/vits/tokenization/morton.py deleted file mode 100644 index d9d59662..00000000 --- a/benchmarks/Vi-Transf/src/models/vits/tokenization/morton.py +++ /dev/null @@ -1,88 +0,0 @@ -import torch - - -def splitBy2(n): - n = n & 0x00000000FFFFFFFF - n = (n | (n << 16)) & 0x0000FFFF0000FFFF - n = (n | (n << 8)) & 0x00FF00FF00FF00FF - n = (n | (n << 4)) & 0x0F0F0F0F0F0F0F0F - n = (n | (n << 2)) & 0x3333333333333333 - n = (n | (n << 1)) & 0x5555555555555555 - return n - - -def compute_2dmorton_order(point_cloud): - assert point_cloud.shape[1] == 2, "Point cloud must be 2D" - - min_xy = point_cloud.min(axis=0).values.to(torch.float64) - max_xy = point_cloud.max(axis=0).values.to(torch.float64) - - bounding_box_size = (max_xy - min_xy).max() - leaf_size = bounding_box_size / (2**32 - 1) - - origin = min_xy - 0.5 * leaf_size - - with torch.no_grad(): - quantized_point_cloud = torch.floor((point_cloud - origin) / leaf_size) - - ij_split_by_2 = splitBy2(quantized_point_cloud.to(torch.int64)) - morton_code = ij_split_by_2[:, 0] | ij_split_by_2[:, 1] << 1 - - morton_order = torch.argsort(morton_code.to(torch.uint64)) - - return morton_order - - -def splitBy3(n): - n = n & 0b0000000000000000000000000000000000000000000111111111111111111111 - n = ( - n | n << 32 - ) & 0b0000000000011111000000000000000000000000000000001111111111111111 - n = ( - n | n << 16 - ) & 0b0000000000011111000000000000000011111111000000000000000011111111 - n = ( - n | n << 8 - ) & 0b0001000000001111000000001111000000001111000000001111000000001111 - n = ( - n | n << 4 - ) & 0b0001000011000011000011000011000011000011000011000011000011000011 - n = ( - n | n << 2 - ) & 0b0001001001001001001001001001001001001001001001001001001001001001 - return n - - -def compute_3dmorton_order(point_cloud): - assert point_cloud.shape[1] == 3, "Point cloud must be 3D" - - min_xyz = point_cloud.min(axis=0).values.to(torch.float64) - max_xyz = point_cloud.max(axis=0).values.to(torch.float64) - - bounding_box_size = (max_xyz - min_xyz).max() - leaf_size = bounding_box_size / (2**21 - 1) - - origin = min_xyz - 0.5 * leaf_size - - with torch.no_grad(): - quantized_point_cloud = torch.floor((point_cloud - origin) / leaf_size) - - ijk_split_by_3 = splitBy3(quantized_point_cloud.to(torch.int64)) - morton_code = ( - ijk_split_by_3[:, 0] | ijk_split_by_3[:, 1] << 1 | ijk_split_by_3[:, 2] << 2 - ) - - morton_order = torch.argsort(morton_code) - - return morton_order - - -def compute_morton_order(point_cloud): - if point_cloud.shape[1] == 2: - return compute_2dmorton_order(point_cloud) - elif point_cloud.shape[1] == 3: - return compute_3dmorton_order(point_cloud) - else: - raise ValueError( - f"Only works for 2d or 3d pointclouds, this won't work in dimension {point_cloud.shape[1]}" - ) diff --git a/benchmarks/Vi-Transf/src/models/vits/tokenization/partitioners/__init__.py b/benchmarks/Vi-Transf/src/models/vits/tokenization/partitioners/__init__.py deleted file mode 100644 index e69de29b..00000000 diff --git a/benchmarks/Vi-Transf/src/models/vits/tokenization/partitioners/metis_partitioner.py b/benchmarks/Vi-Transf/src/models/vits/tokenization/partitioners/metis_partitioner.py deleted file mode 100644 index 0ad46c9d..00000000 --- a/benchmarks/Vi-Transf/src/models/vits/tokenization/partitioners/metis_partitioner.py +++ /dev/null @@ 
-1,147 +0,0 @@ -import random -from typing import Optional - -import numpy as np -import pymetis -import torch -from pymetis import Options -from torch.multiprocessing import Pool, cpu_count -from torch_geometric.data import Data -from tqdm import tqdm - -from .partitioner import Partitioner - - -def npoints_to_nparts( - npoints: int, n_sim_points: int, absolute_tol: int, relative_tol: float -) -> int: - """Computes the number of subdomains in a simulation, with a tolerance given the number of points per subdomains we want and the number of points in the simulation.""" - nparts = n_sim_points // npoints + 1 - nparts = int(np.ceil((1 + relative_tol) * nparts + absolute_tol)) - - return nparts - - -def torch_geometric_to_metis_format(data: Data): - """Convert torch-geometric graph data to a Pythonic adjacency list format suitable for METIS part_graph. - - Args: - data (Data): A torch-geometric Data object containing: - - `edge_index` (torch.Tensor): Tensor of shape (2, num_edges) representing edges. - - Returns: - list[list[int]]: A Pythonic adjacency list, where adjacency[i] is a list of - nodes adjacent to node i. - """ - num_nodes = data.num_nodes - adjacency = [[] for _ in range(num_nodes)] - - for source, target in zip( - data.edge_index[0, :].numpy(), data.edge_index[1, :].numpy() - ): - adjacency[source].append(target) - - return adjacency - - -class MetisPartitioner(Partitioner): - def __init__( - self, - n_vertices_per_subdomain: int = None, - processes_number: int = 1, - absolute_tol: int = 1, - relative_tol: float = 0.05, - ): - super().__init__(n_vertices_per_subdomain) - self.processes_number = processes_number - self.absolute_tol = absolute_tol - self.relative_tol = relative_tol - - def partition(self, dataset: list[Data], seed: Optional[int] = None) -> np.ndarray: - if seed is None: - seed = random.randint(0, 2**32 - 1) - if seed is not None: - rng_generator = random.Random(seed) - seed_vector = [ - rng_generator.randint(0, 2**30 - 1) for _ in range(len(dataset)) - ] # metis doesn't accept seeds that are too large - - if self.processes_number == -1: - self.processes_number = cpu_count() - - partitioned_dataset = [] - - if self.processes_number > 1: - with Pool(processes=self.processes_number) as pool: - partitioned_dataset = list( - tqdm( - pool.starmap( - _partition_single, - zip( - dataset, - [self.n_vertices_per_subdomain] * len(dataset), - seed_vector, - [self.absolute_tol] * len(dataset), - [self.relative_tol] * len(dataset), - ), - ), - desc="Partitioning dataset with METIS", - total=len(dataset), - ) - ) - else: - for i, data in enumerate( - tqdm( - dataset, desc="Partitioning dataset with METIS", total=len(dataset) - ) - ): - partitioned_data = _partition_single( - data, - self.n_vertices_per_subdomain, - seed_vector[i], - self.absolute_tol, - self.relative_tol, - ) - partitioned_dataset.append(partitioned_data) - - return partitioned_dataset - - -def _partition_single( - data: Data, npoints, seed=None, absolute_tol: int = 1, relative_tol: float = 0.05 -) -> torch.Tensor: - if seed is None: - seed = random.randint(0, 2**30 - 1) - options = Options(seed=seed) - - n_sim_points = data.x.shape[0] - n_subdomains = npoints_to_nparts( - npoints=npoints, - n_sim_points=n_sim_points, - absolute_tol=absolute_tol, - relative_tol=relative_tol, - ) - - adjacency = torch_geometric_to_metis_format(data) - - _, communities = pymetis.part_graph( - nparts=n_subdomains, - adjacency=adjacency, - xadj=None, - adjncy=None, - vweights=None, - eweights=None, - recursive=None, - 
contiguous=None, - options=options, - ) - communities = torch.tensor(communities, dtype=torch.long) - for community in torch.unique(communities): - assert torch.sum(communities == community) <= npoints, ( - f"community {community} is too large for sample {data.sample_id}" - ) - - data.communities = communities.to(data.x.device) - data.n_communities = n_subdomains - - return data diff --git a/benchmarks/Vi-Transf/src/models/vits/tokenization/partitioners/partitioner.py b/benchmarks/Vi-Transf/src/models/vits/tokenization/partitioners/partitioner.py deleted file mode 100644 index 3556c483..00000000 --- a/benchmarks/Vi-Transf/src/models/vits/tokenization/partitioners/partitioner.py +++ /dev/null @@ -1,9 +0,0 @@ -from torch_geometric.data import Data - - -class Partitioner: - def __init__(self, n_vertices_per_subdomain): - self.n_vertices_per_subdomain = n_vertices_per_subdomain - - def partition(self, dataset: list[Data]) -> list[Data]: - raise NotImplementedError("This method should be implemented in the subclass") diff --git a/benchmarks/Vi-Transf/src/models/vits/tokenization/temporal_flatten_tokenizer.py b/benchmarks/Vi-Transf/src/models/vits/tokenization/temporal_flatten_tokenizer.py deleted file mode 100644 index 8394a7e9..00000000 --- a/benchmarks/Vi-Transf/src/models/vits/tokenization/temporal_flatten_tokenizer.py +++ /dev/null @@ -1,75 +0,0 @@ -import random -from typing import Literal - -from einops import rearrange -from torch_geometric.data import Data - -from .flatten_tokenizer import FlattenTokenizer, pad_subdomains, process_data_tuple -from .partitioners.partitioner import Partitioner - - -class TemporalFlattenTokenizer(FlattenTokenizer): - def __init__( - self, - partitioner: Partitioner, - output_field_dim: int, - tokenization_type: Literal["morton", "simple"] = "morton", - processes_number=1, - ): - super().__init__( - partitioner, output_field_dim, tokenization_type, processes_number - ) - - def preprocess(self, dataset: list[list[Data]], seed: int): - if seed is None: - seed = random.randint(0, 2**32 - 1) - print("Using random seed for partitioning:", seed) - first_samples_dataset = [ds[0] for ds in dataset] - first_sample_dataset = self.partitioner.partition(first_samples_dataset) - n_tokens_per_sim = max([data.n_communities for data in first_sample_dataset]) - first_samples_dataset = [ - process_data_tuple( - self.data_tokenizer_fn_name, - sample, - n_vertices_per_subdomain=self.n_vertices_per_subdomain, - n_tokens_per_sim=n_tokens_per_sim, - ) - for sample in first_samples_dataset - ] - - tokenized_dataset = [] - for n, sample_list in enumerate(dataset): - for sample in sample_list: - sample.communities = first_samples_dataset[n].communities - sample.n_communities = first_samples_dataset[n].n_communities - sample.padded_community_orders = first_samples_dataset[ - n - ].padded_community_orders - sample.community_reverse_orders = first_samples_dataset[ - n - ].community_reverse_orders - - tokens = rearrange( - sample.x[sample.padded_community_orders], "t n d -> t (n d)" - ) - cross_domain_padded_token, attn_mask = pad_subdomains( - tokens, n_tokens_per_sim - ) - sample.tokens = cross_domain_padded_token.unsqueeze(0) - sample.attn_mask = attn_mask.unsqueeze(0) - - if hasattr(sample, "output_fields"): - expanded_output_fields = rearrange( - sample.output_fields[sample.padded_community_orders], - "t n d -> t (n d)", - ) - cross_domain_padded_expanded_output_fields, _ = pad_subdomains( - expanded_output_fields, n_tokens_per_sim - ) - sample.expanded_output_fields = ( - 
cross_domain_padded_expanded_output_fields.unsqueeze(0) - ) - - tokenized_dataset.append(sample) - - return tokenized_dataset diff --git a/benchmarks/Vi-Transf/src/models/vits/tokenization/tokenizer.py b/benchmarks/Vi-Transf/src/models/vits/tokenization/tokenizer.py deleted file mode 100644 index eada20da..00000000 --- a/benchmarks/Vi-Transf/src/models/vits/tokenization/tokenizer.py +++ /dev/null @@ -1,18 +0,0 @@ -from typing import Optional - -import torch.nn as nn -from torch_geometric.data import Batch, Data - - -class Tokenizer(nn.Module): - def __init__(self): - super().__init__() - - def preprocess(self, dataset: list[Data], seed: Optional[int] = None) -> list[Data]: - pass - - def forward(self, data: Batch): - pass - - def untokenize(self, data: Batch): - pass diff --git a/docs/conf.py b/docs/conf.py index 189983a3..5ef2da88 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -1,10 +1,3 @@ -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - # Configuration file for the Sphinx documentation builder. # # This file only contains a selection of the most common options. For a full @@ -36,6 +29,11 @@ root = basedir / "docs" / "source" / "notebooks" for file in root.rglob("*_example.py"): print(file) + if str(file).split("/")[-1] in ["downloadable_example.py", "pipeline_example.py", + "bisect_example.py", "metrics_example.py", + "init_with_tabular_example.py","interpolation_example.py", + "split_example.py", "stats_example.py" ]: + continue subprocess.run([ "jupytext", "--to", "ipynb", diff --git a/docs/fix_module.py b/docs/fix_module.py index dc04a4a8..835d80bc 100644 --- a/docs/fix_module.py +++ b/docs/fix_module.py @@ -1,10 +1,3 @@ -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - # %% Imports import os diff --git a/docs/source/core_concepts.rst b/docs/source/core_concepts.rst index 720a6c77..1d85b483 100644 --- a/docs/source/core_concepts.rst +++ b/docs/source/core_concepts.rst @@ -3,7 +3,7 @@ Core concepts PLAID is a datamodel and library for organizing physics datasets and defining learning problems on these datasets. -It provides high-level classes such as :py:class:`~plaid.containers.dataset.Dataset`, :py:class:`~plaid.containers.sample.Sample`, and :py:class:`~plaid.problem_definition.ProblemDefinition`, with features addressed via :doc:`core_concepts/feature_identifiers`. +It provides high-level classes such as :py:class:`~plaid.containers.dataset.Dataset`, :py:class:`~plaid.containers.sample.Sample`, and :py:class:`~plaid.problem_definition.ProblemDefinition`. PLAID relies on the CGNS standard for representing complex physics meshes and uses human-readable formats like `.yaml`, `.csv` for other features.
@@ -16,7 +16,6 @@ For more details and examples, see the :doc:`core_concepts` and :doc:`examples_t core_concepts/sample core_concepts/dataset core_concepts/problem_definition - core_concepts/feature_identifiers core_concepts/defaults core_concepts/disk_format core_concepts/interoperability diff --git a/docs/source/core_concepts/dataset.md b/docs/source/core_concepts/dataset.md index 941c461a..13befc1e 100644 --- a/docs/source/core_concepts/dataset.md +++ b/docs/source/core_concepts/dataset.md @@ -72,8 +72,6 @@ feat_ids = dataset.get_all_features_identifiers(ids=None) node_ids = dataset.get_all_features_identifiers_by_type("nodes") ``` -Learn more about identifiers: {doc}`feature_identifiers`. - ## Retrieve features by identifier(s) ```python diff --git a/docs/source/core_concepts/feature_identifiers.md b/docs/source/core_concepts/feature_identifiers.md deleted file mode 100644 index 02ded2a8..00000000 --- a/docs/source/core_concepts/feature_identifiers.md +++ /dev/null @@ -1,130 +0,0 @@ ---- -title: Feature identifiers ---- - -# Feature identifiers - -Feature identifiers are a concise, unambiguous way to point to any feature in PLAID. They replace legacy name-only APIs (now deprecated) and make it possible to uniquely address features across time steps, bases, zones and locations. - -- A feature is one of: scalar, field, nodes. -- A FeatureIdentifier is a small dictionary that encodes the feature type and, when relevant, its context (e.g., base, zone, location, time). - -Why this matters: -- Names alone can be ambiguous (e.g., a field called "pressure" may exist at several locations, times, or zones). Identifiers remove ambiguity and make operations deterministic and hashable. -- Identifiers are stable keys, so they can be used in sets, dicts, and sorting. See the underlying implementation in {py:class}`~plaid.types.feature_types.FeatureIdentifier`. -- Discussion and design notes are available in the project discussion: [Feature identifier concept](https://github.com/orgs/PLAID-lib/discussions/107). - -## Structure - -FeatureIdentifier is a `dict[str, str | float]` with a mandatory `type` key. Depending on the feature type, other keys are required or optional: - -- scalar: `{"type": "scalar", "name": }` -- field: `{"type": "field", "name": , "base_name": , "zone_name": , "location": , "time": }` - - `location` must be one of: `Vertex`, `EdgeCenter`, `FaceCenter`, `CellCenter`. - - `base_name`, `zone_name`, `location`, `time` are optional if default value mechanics apply (see {doc}`defaults`). -- nodes: `{"type": "nodes", "base_name": , "zone_name": , "time": }` - -Notes: -- Time must be a float when present. -- FeatureIdentifier is hashable and orderable (internally sorted), enabling deduplication and stable sorting. - -## Examples - -Minimal identifiers: - -```python -from plaid.types import FeatureIdentifier - -fid_scalar = FeatureIdentifier({"type": "scalar", "name": "Re"}) - -fid_field = FeatureIdentifier({ - "type": "field", - "name": "pressure", - "base_name": "Base", - "zone_name": "Zone", - "location": "Vertex", - "time": 0.0, -}) - -fid_nodes = FeatureIdentifier({ - "type": "nodes", - "base_name": "Base", - "zone_name": "Zone", - "time": 0.0, -}) -``` - -## Using identifiers with {py:class}`~plaid.containers.sample.Sample` - -The {py:class}`~plaid.containers.sample.Sample` container exposes helpers to retrieve, update, extract and merge features via identifiers. 
- -```python -from plaid.containers.sample import Sample - -sample = Sample(path) - -# 1) Retrieve one feature -u = sample.get_feature_from_identifier(fid_field) - -# 2) Retrieve several features -features = sample.get_features_from_identifiers([fid_scalar, fid_field]) - -# 3) Update one or several features -updated = sample.update_features_from_identifier(fid_scalar, 0.5) # scalar -updated = sample.update_features_from_identifier([fid_field], [u_new]) # field - -# 4) Extract a sub-sample containing only selected features -sub = sample.extract_sample_from_identifier([fid_field, fid_nodes]) - -# 5) Merge all features from another sample -merged = sample.merge_features(sub) -``` - -`Sample` also offers string identifiers for convenience: - -```python -u = sample.get_feature_from_string_identifier( - "field::pressure/Base/Zone/Vertex/0.0" -) - -Re = sample.get_feature_from_string_identifier("scalar::Re") -``` - -String format is: `:://...`. The order is fixed per type. If `time` is provided, it is parsed as float. - -## Using identifiers with {py:class}`~plaid.problem_definition.ProblemDefinition` - -{py:class}`~plaid.problem_definition.ProblemDefinition` stores learning inputs/outputs as lists of FeatureIdentifiers and offers utilities to add and filter them. - -```python -from plaid.problem_definition import ProblemDefinition - -problem = ProblemDefinition() - -problem.add_in_feature_identifier(fid_scalar) -problem.add_in_features_identifiers([fid_field]) - -problem.add_out_feature_identifier(fid_nodes) - -ins = problem.get_in_features_identifiers() -outs = problem.get_out_features_identifiers() - -# Filtering among registered identifiers -subset_in = problem.filter_in_features_identifiers([fid_scalar, fid_nodes]) -subset_out = problem.filter_out_features_identifiers([fid_nodes]) -``` - -Legacy name-based methods (e.g., `add_input_scalars_names`) are deprecated; prefer the identifier-based ones. - -## Best practices - -- Always include enough context to disambiguate a feature. For fields/nodes on multiple bases/zones/times, set all relevant keys. -- Use {py:meth}`~plaid.containers.sample.Sample.get_all_features_identifiers()` to introspect what identifiers exist in a sample. -- Use sets to deduplicate identifiers safely: `set(list_of_identifiers)`. -- When authoring problem definitions on disk, {py:meth}`~plaid.problem_definition.ProblemDefinition._save_to_dir_` persists identifiers under `problem_definition/problem_infos.yaml` (keys `input_features` and `output_features`). - -## See also - -- {doc}`../core_concepts` -- {doc}`defaults` -- {doc}`../examples_tutorials` diff --git a/docs/source/core_concepts/interoperability.md b/docs/source/core_concepts/interoperability.md index ce27f548..d54a2a05 100644 --- a/docs/source/core_concepts/interoperability.md +++ b/docs/source/core_concepts/interoperability.md @@ -6,6 +6,6 @@ title: Interoperability - [CGNS standard](https://cgns.org/): the mesh/field containers align with CGNS conventions for bases, zones, elements and locations. - Notebooks and pipelines: examples show data extraction to tabular arrays for scikit-learn blocks and more. 
- - {doc}`../notebooks/pipelines/pipeline_example` + - {py:class}`plaid.pipelines.plaid_blocks` - {py:class}`plaid.pipelines.sklearn_block_wrappers` diff --git a/docs/source/core_concepts/problem_definition.md b/docs/source/core_concepts/problem_definition.md index 2db59b19..0af99615 100644 --- a/docs/source/core_concepts/problem_definition.md +++ b/docs/source/core_concepts/problem_definition.md @@ -17,7 +17,7 @@ from plaid.problem_definition import ProblemDefinition from plaid.types import FeatureIdentifier pb = ProblemDefinition() -pb.set_task("regression") +pb.task = "regression" pb.add_in_feature_identifier(FeatureIdentifier({"type": "scalar", "name": "Re"})) pb.add_out_feature_identifier(FeatureIdentifier({ diff --git a/docs/source/core_concepts/sample.md b/docs/source/core_concepts/sample.md index eb0d6ef7..f5562b29 100644 --- a/docs/source/core_concepts/sample.md +++ b/docs/source/core_concepts/sample.md @@ -4,7 +4,7 @@ title: Sample # Sample -{py:class}`~plaid.containers.sample.Sample` represents one observation. It contains {doc}`feature_identifiers` among (all optional): +{py:class}`~plaid.containers.sample.Sample` represents one observation. It contains: - scalars: name → value - meshes containing: - nodes: mesh node coordinates, that can be located: diff --git a/docs/source/examples_tutorials.rst b/docs/source/examples_tutorials.rst index b4f807c6..9cb2f68d 100644 --- a/docs/source/examples_tutorials.rst +++ b/docs/source/examples_tutorials.rst @@ -17,13 +17,5 @@ You can find here detailed examples for different parts of plaid, explained main notebooks/containers/sample_example notebooks/containers/dataset_example notebooks/problem_definition_example - notebooks/utils/split_example - notebooks/utils/stats_example - notebooks/post/bisect_example - notebooks/post/metrics_example - notebooks/utils/init_with_tabular_example - notebooks/utils/interpolation_example notebooks/bridges/huggingface_example notebooks/pipelines/pipeline_example - notebooks/examples/downloadable_example - tutorials/storage \ No newline at end of file diff --git a/docs/source/quickstart.md b/docs/source/quickstart.md index f31898c6..d3f562d9 100644 --- a/docs/source/quickstart.md +++ b/docs/source/quickstart.md @@ -36,7 +36,6 @@ To use the library, the simplest way is to install it from the packages availabl - {doc}`core_concepts/sample` → API: {py:class}`plaid.containers.sample.Sample` - {doc}`core_concepts/dataset` → API: {py:class}`plaid.containers.dataset.Dataset` - {doc}`core_concepts/problem_definition` → API: {py:class}`plaid.problem_definition.ProblemDefinition` -- {doc}`core_concepts/feature_identifiers` → API: {py:class}`plaid.types.feature_types.FeatureIdentifier` - {doc}`core_concepts/defaults` - {doc}`core_concepts/disk_format` - {doc}`core_concepts/interoperability` diff --git a/docs/source/tutorials/storage.md b/docs/source/tutorials/storage.md index 71d2a4a8..ea24bebc 100644 --- a/docs/source/tutorials/storage.md +++ b/docs/source/tutorials/storage.md @@ -119,11 +119,11 @@ pb_def = ProblemDefinition() pb_def.add_in_features_identifiers(input_features) pb_def.add_out_features_identifiers(output_features) pb_def.add_constant_features_identifiers(constant_features) -pb_def.set_task("regression") +pb_def.task = "regression" pb_def.set_name("regression_1") pb_def.set_score_function("RRMSE") -pb_def.set_train_split({"train":"all"}) -pb_def.set_test_split({"test":"all"}) +pb_def.train_split = {"train":"all"} +pb_def.test_split = {"test":"all"} 
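For migration reference, a minimal sketch of the property-style `ProblemDefinition` configuration used in the hunk above; the attribute names (`task`, `train_split`, `test_split`) and the single-entry split dictionaries are taken directly from the diff, the other calls are kept as they appear there:

```python
from plaid.problem_definition import ProblemDefinition

pb_def = ProblemDefinition()
pb_def.task = "regression"             # was: pb_def.set_task("regression")
pb_def.set_name("regression_1")        # unchanged by this branch
pb_def.set_score_function("RRMSE")     # unchanged by this branch
pb_def.train_split = {"train": "all"}  # was: pb_def.set_train_split({"train": "all"})
pb_def.test_split = {"test": "all"}    # was: pb_def.set_test_split({"test": "all"})
```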
#--------------------------------------------------------------- # Define a simple function that takes a single identifier and returns a Sample. @@ -256,6 +256,17 @@ for backend in all_backends: plaid_sample = converter.to_plaid(dataset, i) print(f"duration {time.time()-start}") + # Optional: extract only selected indices inside specific variable features + # (currently supported for hf_datasets and zarr backends). + field_path = "Base_2_3/Zone/VertexFields/pressure" + selected_idx = [0, 10, 20, 30] + plaid_sample_sub = converter.to_plaid( + dataset, + 0, + features=[field_path], + indexers={field_path: selected_idx}, + ) + # instantiate the first sample, depends on the backend sample = dataset[0] # alternative way instantiate a plaid sample (much slower for hf_datasets) @@ -318,7 +329,7 @@ for backend in all_backends: # efficient plaid sample reconstruction plaid_sample = converter.to_plaid(dataset, idx) # generic way of retrieving features and send them to GPU - for time_ in plaid_sample.get_all_mesh_times(): + for time_ in plaid_sample.get_all_time_values(): torch_sample = {} for path in features: value = plaid_sample.get_feature_by_path(path=path, time=time_) @@ -329,6 +340,29 @@ for backend in all_backends: print(f"duration {time.time()-start}") print("----------") +``` + +### Indexed extraction with `indexers` + +`converter.to_dict(...)` and `converter.to_plaid(...)` accept an optional +`indexers` argument: + +```python +sample = converter.to_plaid( + dataset, + idx=0, + features=["Base/Zone/VertexFields/mach"], + indexers={"Base/Zone/VertexFields/mach": [1, 5, 9]}, +) +``` + +- `indexers` is a mapping `feature_path -> indexer` (list/array of indices or slice). +- Indexing is applied on the **last axis** of each indexed feature. +- This enables a “read less + one gathered output copy” behavior: + - **zarr**: partial chunk reads + gathered output + - **hf_datasets**: Arrow/NumPy best-effort gather + gathered output +- `cgns` backend does not use this mechanism. + print("----------------------------------------------------") print("-- Streaming test ----------------------------------") diff --git a/examples/__init__.py b/examples/__init__.py index a9efb940..e69de29b 100644 --- a/examples/__init__.py +++ b/examples/__init__.py @@ -1,6 +0,0 @@ -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# diff --git a/examples/bridges/__init__.py b/examples/bridges/__init__.py index a9efb940..e69de29b 100644 --- a/examples/bridges/__init__.py +++ b/examples/bridges/__init__.py @@ -1,6 +0,0 @@ -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. 
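The new `indexers` mechanism also accepts a slice, so contiguous partial reads follow the same pattern as the index-list example above. A sketch reusing the tutorial's `converter` and `dataset` objects; the feature path and slice bounds are illustrative:

```python
field_path = "Base/Zone/VertexFields/mach"  # illustrative path

# Read only the first 100 entries along the last axis of this field.
plaid_sample_head = converter.to_plaid(
    dataset,
    idx=0,
    features=[field_path],
    indexers={field_path: slice(0, 100)},
)
```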
-# -# diff --git a/examples/bridges/check_retrocomp_benchmarks.py b/examples/bridges/check_retrocomp_benchmarks.py index bc159529..070c638d 100644 --- a/examples/bridges/check_retrocomp_benchmarks.py +++ b/examples/bridges/check_retrocomp_benchmarks.py @@ -1,15 +1,29 @@ """This files serves to check if the main retrieval command in the PLAID Benchmarks is not returning an error.""" -from plaid.bridges import huggingface_bridge +from datasets import load_dataset -hf_dataset = huggingface_bridge.load_dataset_from_hub( - f"PLAID-datasets/Tensile2d", split="all_samples[:5]", num_proc=1 -) +from plaid import Dataset +from plaid.storage.common.bridge import to_plaid_sample, to_sample_dict +from plaid.storage.common.reader import load_metadata_from_hub, load_problem_definitions_from_hub +from plaid.storage.hf_datasets.bridge import sample_to_var_sample_dict -plaid_dataset, pb_def = huggingface_bridge.huggingface_dataset_to_plaid( - hf_dataset, processes_number=1, verbose=True -) +repo_id = "PhysArena/Tensile2d" +split_name = "train" -ids_train = pb_def.get_split('train_500') -sample_train_0 = plaid_dataset[ids_train[0]] \ No newline at end of file +hf_dataset = load_dataset(repo_id, split=split_name, num_proc=1) + +flat_cst, variable_schema, constant_schema, cgns_types = load_metadata_from_hub(repo_id) + +plaid_dataset = Dataset() +for hf_sample in hf_dataset: + var_sample_dict = sample_to_var_sample_dict(hf_sample) + sample_dict = to_sample_dict(var_sample_dict, flat_cst[split_name], cgns_types) + sample = to_plaid_sample(sample_dict, cgns_types) + plaid_dataset.get_backend().add_sample(sample) + +pb_defs = load_problem_definitions_from_hub(repo_id) +pb_def = next(iter(pb_defs.values())) + +ids_train = pb_def.get_train_split_indices() +sample_train_0 = plaid_dataset[0] \ No newline at end of file diff --git a/examples/bridges/huggingface_example.py b/examples/bridges/huggingface_example.py index 3eb5b16c..de6b9a48 100644 --- a/examples/bridges/huggingface_example.py +++ b/examples/bridges/huggingface_example.py @@ -1,4 +1,3 @@ -# -*- coding: utf-8 -*- # --- # jupyter: # jupytext: @@ -44,9 +43,9 @@ from Muscat.Bridges.CGNSBridge import MeshToCGNS from Muscat.MeshTools import MeshCreationTools as MCT -from plaid.bridges import huggingface_bridge +#from plaid.bridges import huggingface_bridge from plaid import Dataset, Sample, ProblemDefinition -from plaid.types import FeatureIdentifier +#from plaid.types import FeatureIdentifier # %% @@ -54,7 +53,7 @@ def show_sample(sample: Sample): print(f"sample = {sample}") sample.show_tree() - print(f"{sample.get_scalar_names() = }") + print(f"{sample.get_global_names() = }") print(f"{sample.get_field_names() = }") @@ -90,13 +89,9 @@ def get_mem(): dataset = Dataset() -scalar_feat_id = FeatureIdentifier({"type": "scalar", "name": "scalar"}) -node_field_feat_id = FeatureIdentifier( - {"type": "field", "name": "node_field", "location": "Vertex"} -) -cell_field_feat_id = FeatureIdentifier( - {"type": "field", "name": "cell_field", "location": "CellCenter"} -) +scalar_feat_id = "Global/scalar" +node_field_feat_id = "Base_2_2/Zone/VertexFields/node_field" +cell_field_feat_id = "Base_2_2/Zone/CellCenterFields/cell_field" print("Creating meshes dataset...") for _ in range(3): @@ -106,17 +101,17 @@ def get_mem(): sample.add_tree(MeshToCGNS(mesh, exportOriginalIDs=False)) - sample.update_features_from_identifier( + sample.update_features_by_path( scalar_feat_id, np.random.randn(), in_place=True ) - sample.update_features_from_identifier( + 
sample.update_features_by_path( node_field_feat_id, np.random.rand(len(points)), in_place=True ) - sample.update_features_from_identifier( + sample.update_features_by_path( cell_field_feat_id, np.random.rand(len(triangles)), in_place=True ) - dataset.add_sample(sample) + dataset.get_backend().add_sample(sample) infos = { "legal": {"owner": "Bob", "license": "my_license"}, @@ -128,12 +123,13 @@ def get_mem(): print(f" {dataset = }") print(f" {infos = }") -pb_def = ProblemDefinition() +pb_def = ProblemDefinition(name="test PD") pb_def.add_in_features_identifiers([scalar_feat_id, node_field_feat_id]) pb_def.add_out_features_identifiers([cell_field_feat_id]) -pb_def.set_task("regression") -pb_def.set_split({"train": [0, 1], "test": [2]}) +pb_def.task = "regression" +pb_def.train_split = {"train": [0, 1]} +pb_def.test_split = {"test": [2]} print(f" {pb_def = }") @@ -142,16 +138,43 @@ def get_mem(): # %% main_splits = { - split_name: pb_def.get_split(split_name) for split_name in ["train", "test"] + "train": [0, 1], + "test": [2], +} + +def make_generator(dataset, ids): + def _gen(): + for i in ids: + yield dataset[i] + return _gen + +generators = { + split: make_generator(dataset, ids) + for split, ids in main_splits.items() } +from plaid.storage.common.preprocessor import preprocess -hf_datasetdict, flat_cst, key_mappings = ( - huggingface_bridge.plaid_dataset_to_huggingface_datasetdict(dataset, main_splits) +flat_cst, variable_schema, constant_schema, num_samples, cgns_types = preprocess( + generators, + num_proc=1, + verbose=False, ) +from plaid.storage.hf_datasets.bridge import generator_to_datasetdict + +hf_datasetdict = generator_to_datasetdict( + generators=generators, + variable_schema=variable_schema, + cache_dir="/tmp/hf_cache", +) + +#hf_datasetdict, flat_cst, key_mappings = ( +# huggingface_bridge.plaid_dataset_to_huggingface_datasetdict(dataset, main_splits) +#) + print(f"{hf_datasetdict = }") -print(f"{flat_cst = }") -print(f"{key_mappings = }") +#print(f"{flat_cst = }") +#print(f"{key_mappings = }") # %% [markdown] # A partitioning of all the indices is provided in `main_splits`. The conversion outputs `flat_cst` and `key_mappings`, which are central to the Hugging Face support: @@ -183,14 +206,26 @@ def generator_(ids): generators[split_name] = partial(generator_, ids = split_ids[split_name]) -hf_datasetdict, flat_cst, key_mappings = ( - huggingface_bridge.plaid_generator_to_huggingface_datasetdict( - generators - ) +flat_cst, variable_schema, constant_schema, num_samples, cgns_types = preprocess( + generators, + num_proc=1, + verbose=False, ) + +hf_datasetdict = generator_to_datasetdict( + generators=generators, + variable_schema=variable_schema, + cache_dir="/tmp/hf_cache", +) + +#hf_datasetdict, flat_cst, key_mappings = ( +# huggingface_bridge.plaid_generator_to_huggingface_datasetdict( +# generators +# ) +#) print(f"{hf_datasetdict = }") print(f"{flat_cst = }") -print(f"{key_mappings = }") +#print(f"{key_mappings = }") # %% [markdown] # In this example, the generators are not very usefull since the plaid dataset is already loaded in memory. 
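The string paths above replace the dictionary-based `FeatureIdentifier` objects. Inferred from these examples only (not a documented grammar): globals live under `Global/<name>` and fields under `<base>/<zone>/<Location>Fields/<name>`. A hedged migration sketch, reusing the example's `sample` and `points` objects:

```python
import numpy as np

# Old (removed) identifier style:
#   FeatureIdentifier({"type": "field", "name": "node_field", "location": "Vertex"})
# New path style, as written in the updated example:
node_field_path = "Base_2_2/Zone/VertexFields/node_field"

sample.update_features_by_path(node_field_path, np.random.rand(len(points)), in_place=True)

# Retrieval by path, as used in tutorials/storage.md (the time value is illustrative).
value = sample.get_feature_by_path(path=node_field_path, time=0.0)
```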
In real settings, one can create generators in the following way to prevent loading all the data beforehand: @@ -209,11 +244,16 @@ def generator_(ids): # ## Section 3: Convert a Hugging Face dataset to plaid # %% -cgns_types = key_mappings["cgns_types"] +from plaid.storage.common.bridge import to_sample_dict, to_plaid_sample +from plaid.storage.hf_datasets.bridge import to_var_sample_dict + +dataset_2 = Dataset() +for i in range(len(hf_datasetdict["train"])): + var_sample_dict = to_var_sample_dict(hf_datasetdict["train"], i) + sample_dict = to_sample_dict(var_sample_dict, flat_cst["train"], cgns_types) + plaid_sample = to_plaid_sample(sample_dict, cgns_types) + dataset_2.get_backend().add_sample(plaid_sample) -dataset_2 = huggingface_bridge.to_plaid_dataset( - hf_datasetdict["train"], flat_cst["train"], cgns_types -) print() print(f"{dataset_2 = }") @@ -226,19 +266,37 @@ def generator_(ids): # %% with tempfile.TemporaryDirectory() as out_dir: - huggingface_bridge.save_dataset_dict_to_disk(out_dir, hf_datasetdict) - huggingface_bridge.save_infos_to_disk(out_dir, infos) - huggingface_bridge.save_tree_struct_to_disk(out_dir, flat_cst, key_mappings) - huggingface_bridge.save_problem_definition_to_disk(out_dir, "task_1", pb_def) - - loaded_hf_datasetdict = huggingface_bridge.load_dataset_from_disk(out_dir) - loaded_infos = huggingface_bridge.load_infos_from_disk(out_dir) - flat_cst, key_mappings = huggingface_bridge.load_tree_struct_from_disk(out_dir) - loaded_pb_def = huggingface_bridge.load_problem_definition_from_disk( - out_dir, "task_1" + from plaid.storage.hf_datasets.writer import save_datasetdict_to_disk + from plaid.storage.hf_datasets.reader import init_datasetdict_from_disk + from plaid.storage.common.writer import ( + save_infos_to_disk, + save_metadata_to_disk, + save_problem_definitions_to_disk, + ) + from plaid.storage.common.reader import ( + load_infos_from_disk, + load_metadata_from_disk, + load_problem_definitions_from_disk, ) - shutil.rmtree(out_dir) + save_datasetdict_to_disk(out_dir, hf_datasetdict) + save_infos_to_disk(out_dir, infos) + save_metadata_to_disk(out_dir, flat_cst, variable_schema, constant_schema, cgns_types) + save_problem_definitions_to_disk(out_dir, pb_def) + + loaded_hf_datasetdict = init_datasetdict_from_disk(out_dir) + loaded_infos = load_infos_from_disk(out_dir) + flat_cst, variable_schema, constant_schema, cgns_types = load_metadata_from_disk(out_dir) + loaded_pb_defs = load_problem_definitions_from_disk(out_dir) + loaded_pb_def = loaded_pb_defs[pb_def.name] + + key_mappings = { + "variable_features": list(variable_schema.keys()), + "constant_features": { + split: list(schema.keys()) for split, schema in constant_schema.items() + }, + "cgns_types": cgns_types, + } print(f"{loaded_hf_datasetdict = }") print(f"{loaded_infos = }") @@ -325,13 +383,13 @@ def generator_(ids): # We notice that ``hf_sample`` is not a plaid sample, but a dict containing the variable features of the datasets, with keys being the flattened path of the CGNS tree. contains a binary object efficiently handled by huggingface datasets. It can be converted into a plaid sample using a specific constructor relying on a pydantic validator, and the required `flat_cst` and `cgns_types`. 
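A raw Hugging Face row goes through three explicit steps before becoming a plaid `Sample` (variable sample dict, full sample dict, then the validated `Sample`). A small hypothetical wrapper makes the sequence reusable; the helper name is ours, the three calls and their arguments are exactly the ones used in this file:

```python
from plaid.storage.common.bridge import to_sample_dict, to_plaid_sample
from plaid.storage.hf_datasets.bridge import to_var_sample_dict


def hf_row_to_plaid_sample(hf_split, idx, flat_cst_split, cgns_types):
    """Hypothetical convenience wrapper around the three conversion steps."""
    var_sample_dict = to_var_sample_dict(hf_split, idx)
    sample_dict = to_sample_dict(var_sample_dict, flat_cst_split, cgns_types)
    return to_plaid_sample(sample_dict, cgns_types)


# e.g. plaid_sample = hf_row_to_plaid_sample(hf_datasetdict["train"], 0, flat_cst["train"], cgns_types)
```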
# %% -plaid_sample = huggingface_bridge.to_plaid_sample( - hf_datasetdict["train"], 0, flat_cst["train"], cgns_types -) +var_sample_dict = to_var_sample_dict(hf_datasetdict["train"], 0) +sample_dict = to_sample_dict(var_sample_dict, flat_cst["train"], cgns_types) +plaid_sample = to_plaid_sample(sample_dict, cgns_types) print("Variable features:") for t in plaid_sample.get_all_time_values(): - for path in key_mappings["variable_features"]: + for path in variable_schema.keys(): print(path, plaid_sample.get_feature_by_path(path=path, time=t)) print("-------") print("Sample and CGNS tree:") @@ -451,7 +509,7 @@ def generator_(ids): # cgns_types, # enforce_shapes=False, # ) -# for t in sample.get_all_mesh_times(): +# for t in sample.get_all_time_values(): # for path in pb_def.get_in_features_identifiers(): # sample.get_feature_by_path(path=path, time=t) # for path in pb_def.get_out_features_identifiers(): diff --git a/examples/containers/__init__.py b/examples/containers/__init__.py index a9efb940..e69de29b 100644 --- a/examples/containers/__init__.py +++ b/examples/containers/__init__.py @@ -1,6 +0,0 @@ -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# diff --git a/examples/containers/bench_parallel_load.py b/examples/containers/bench_parallel_load.py index 8390b3bc..42455eae 100644 --- a/examples/containers/bench_parallel_load.py +++ b/examples/containers/bench_parallel_load.py @@ -3,6 +3,7 @@ import tempfile import time from pathlib import Path +import gc import matplotlib.pyplot as plt import numpy as np @@ -67,36 +68,39 @@ dset = Dataset() for i in tqdm(range(args.number_of_samples), desc="Generate samples"): # ---# Read a sample provided in tests - sample_path = Path("../tests/containers/dataset/samples/sample_000000000") + sample_path = Path("../tests/containers/dataset/data/test/sample_000000000") if not (sample_path.is_dir()): sample_path = Path( - "../../tests/containers/dataset/samples/sample_000000000" + "../../tests/containers/dataset/data/test/sample_000000000" ) tmpsmp = Sample(path=sample_path) smp = tmpsmp # ---# Add some random data - smp.add_scalar("id", i) - smp.add_scalar("s0", np.random.randn()) + smp.add_global("id", i) + smp.add_global("s0", np.random.randn()) n_nodes = smp.get_nodes().shape[0] smp.add_field("f0", np.random.randn(n_nodes)) - dset.add_sample(smp) + dset._backend.add_sample(smp) dset.save_to_dir(out_dir, verbose=True) rich.print(f"create and save dataset took: {time.perf_counter() - t0:.3f} s") print() # ---# Measure loading durations depending on number of cores - all_durations = [] + all_durations: list[list[float]] = [] for nb_cores in tqdm(range(NB_CORES + 1), desc="Loop on nb_cores"): - durations = [] + durations: list[float] = [] for _ in tqdm(range(args.number_of_tests), desc=" Loop on nb_tests"): new_dset = Dataset() t0 = time.perf_counter() - new_dset.load(out_dir, processes_number=nb_cores) + print(out_dir) + new_dset.load(out_dir) #, processes_number=nb_cores) t1 = time.perf_counter() durations.append(t1 - t0) + del new_dset + all_durations.append(durations) # rich.print(f'<{nb_cores=:2d}> load took: {np.min(durations):.3f} s | {np.mean(durations):.3f} s | {np.max(durations):.3f} s | {np.std(durations):.3f} s') @@ -119,5 +123,6 @@ plt.savefig( f"bench_{args.number_of_samples}_{args.number_of_tests}_{NB_CORES}.png" ) + gc.collect() os.remove("bench_100_5_2.png") diff --git a/examples/containers/dataset_example.py 
b/examples/containers/dataset_example.py index fb69a246..89629087 100644 --- a/examples/containers/dataset_example.py +++ b/examples/containers/dataset_example.py @@ -49,10 +49,10 @@ # %% # Print dict util -def dprint(name: str, dictio: dict, end: str = "\n"): +def dprint(name: str, dictio: list, end: str = "\n"): print(name, "{") - for key, value in dictio.items(): - print(" ", key, ":", value) + for value in dictio: + print(value) print("}", end=end) @@ -112,7 +112,7 @@ def dprint(name: str, dictio: dict, end: str = "\n"): # %% # Add a scalar to the Sample -sample_01.add_scalar("rotation", np.random.randn()) +sample_01.add_global("rotation", np.random.randn()) print(f"{sample_01 = }") # %% [markdown] @@ -126,7 +126,7 @@ def dprint(name: str, dictio: dict, end: str = "\n"): # %% # Add a scalar to the second Sample -sample_02.add_scalar("rotation", np.random.randn()) +sample_02.add_global("rotation", np.random.randn()) print(f"{sample_02 = }") # %% [markdown] @@ -136,8 +136,8 @@ def dprint(name: str, dictio: dict, end: str = "\n"): # Initialize a third empty Sample print("#---# Empty Sample") sample_03 = Sample() -sample_03.add_scalar("speed", np.random.randn()) -sample_03.add_scalar("rotation", sample_01.get_scalar("rotation")) +sample_03.add_global("speed", np.random.randn()) +sample_03.add_global("rotation", sample_01.get_global("rotation")) sample_03.features.add_tree(copy.deepcopy(cgns_mesh)) # Show Sample CGNS content @@ -145,7 +145,7 @@ def dprint(name: str, dictio: dict, end: str = "\n"): # %% # Add a field to the third empty Sample -sample_03.add_field("temperature", np.random.rand(5), zone_name="Zone", base_name="Base_2_2") +sample_03.add_field("temperature", np.random.rand(5), zone="Zone", base="Base_2_2") sample_03.show_tree() # %% [markdown] @@ -156,9 +156,9 @@ def dprint(name: str, dictio: dict, end: str = "\n"): print(f"{sample_03 = }", end="\n\n") # Print sample scalar data -print(f"{sample_03.get_scalar_names() = }") -print(f"{sample_03.get_scalar('speed') = }") -print(f"{sample_03.get_scalar('rotation') = }", end="\n\n") +print(f"{sample_03.get_global_names() = }") +print(f"{sample_03.get_global('speed') = }") +print(f"{sample_03.get_global('rotation') = }", end="\n\n") # Print sample scalar data print(f"{sample_03.get_field_names() = }") @@ -174,11 +174,11 @@ def dprint(name: str, dictio: dict, end: str = "\n"): # %% # Add Samples by id in the Dataset -dataset.set_sample(id=0, sample=sample_01) -dataset.set_sample(1, sample_02) +dataset._backend.set_sample(id=0, sample=sample_01) +dataset._backend.set_sample(sample_02, 1) # Add unique Sample and automatically create its id -added_sample_id = dataset.add_sample(sample_03) +added_sample_id = dataset._backend.add_sample(sample_03) print(f"{added_sample_id = }") # %% [markdown] @@ -193,27 +193,27 @@ def dprint(name: str, dictio: dict, end: str = "\n"): # %% # Add node information to the Dataset -dataset.add_info("legal", "owner", "Safran") +#dataset.add_info("legal", "owner", "Safran") # Retrive dataset information import json -dataset_info = dataset.get_infos() -print("dataset info =", json.dumps(dataset_info, sort_keys=False, indent=4), end="\n\n") +#dataset_info = dataset.get_infos() +#print("dataset info =", json.dumps(dataset_info, sort_keys=False, indent=4), end="\n\n") # Overwrite information (logger will display warnings) infos = {"legal": {"owner": "Safran", "license": "CC0"}} dataset.set_infos(infos) # Retrive dataset information -dataset_info = dataset.get_infos() -print("dataset info =", json.dumps(dataset_info, 
sort_keys=False, indent=4), end="\n\n") +#dataset_info = dataset.get_infos() +#print("dataset info =", json.dumps(dataset_info, sort_keys=False, indent=4), end="\n\n") # Add tree information to the Dataset (logger will display warnings) -dataset.add_infos("data_description", {"number_of_samples": 0, "number_of_splits": 0}) +#dataset.add_infos("data_description", {"number_of_samples": 0, "number_of_splits": 0}) # Pretty print dataset information -dataset.print_infos() +#dataset.print_infos() # %% [markdown] # ### Get a list of specific Samples in a Dataset @@ -240,7 +240,8 @@ def dprint(name: str, dictio: dict, end: str = "\n"): # %% # Create a new dataset with the sample list and ids # (ids can be provided optionally) -new_dataset = Dataset(samples=[sample_01, sample_02, sample_03], sample_ids=[3, 5, 7]) +new_dataset = Dataset() +new_dataset._backend.set_sample([sample_01, sample_02, sample_03], id=[3, 5, 7]) print(f"{new_dataset = }") print("new dataset sample ids =", new_dataset.get_sample_ids()) @@ -251,7 +252,7 @@ def dprint(name: str, dictio: dict, end: str = "\n"): # Create a new Dataset and add multiple samples dataset = Dataset() samples = [sample_01, sample_02, sample_03] -added_ids = dataset.add_samples(samples) +added_ids = dataset._backend.add_sample(samples) print(f"{added_ids = }") print(f"{dataset = }") @@ -260,44 +261,48 @@ def dprint(name: str, dictio: dict, end: str = "\n"): # %% # Access Sample data with indexes through the Dataset -print(f"{dataset(0) = }") # call strategy +print(f"{dataset[0] = }") # getitem strategy print(f"{dataset[1] = }") # getitem strategy print(f"{dataset[2] = }", end="\n\n") -print("scalar of the first sample = ", dataset[0].get_scalar_names()) -print("scalar of the second sample = ", dataset[1].get_scalar_names()) -print("scalar of the third sample = ", dataset[2].get_scalar_names()) +print("scalar of the first sample = ", dataset[0].get_global_names()) +print("scalar of the second sample = ", dataset[1].get_global_names()) +print("scalar of the third sample = ", dataset[2].get_global_names()) # %% # Access dataset information -print(f"{dataset[0].get_scalar('rotation') = }") -print(f"{dataset[1].get_scalar('rotation') = }") -print(f"{dataset[2].get_scalar('rotation') = }") +print(f"{dataset[0].get_global('rotation') = }") +print(f"{dataset[1].get_global('rotation') = }") +print(f"{dataset[2].get_global('rotation') = }") # %% [markdown] # ### Get Dataset scalars to tabular # %% # Print scalars in tabular format -print(f"{dataset.get_scalar_names() = }", end="\n\n") +scalar_names = set() +for x in range(len(dataset)): + scalar_names.update(dataset[x].get_global_names()) + +print(f"{scalar_names = }", end="\n\n") -dprint("get rotation scalar = ", dataset.get_scalars_to_tabular(["rotation"])) -dprint("get speed scalar = ", dataset.get_scalars_to_tabular(["speed"]), end="\n\n") +#dprint("get rotation scalar = ", dataset.get_scalars_to_tabular(["rotation"])) +#dprint("get speed scalar = ", dataset.get_scalars_to_tabular(["speed"]), end="\n\n") # Get specific scalars in tabular format -dprint("get specific scalars =", dataset.get_scalars_to_tabular(["speed", "rotation"])) -dprint("get all scalars =", dataset.get_scalars_to_tabular()) +#dprint("get specific scalars =", dataset.get_scalars_to_tabular(["speed", "rotation"])) +#dprint("get all scalars =", dataset.get_scalars_to_tabular()) # %% # Get specific scalars np.array -print("get all scalar arrays = ", dataset.get_scalars_to_tabular(as_nparray=True)) +#print("get all scalar arrays = ", 
dataset.get_scalars_to_tabular(as_nparray=True)) # %% [markdown] # ### Get Dataset fields # %% # Print fields in the Dataset -print("fields in the dataset = ", dataset.get_field_names()) +#print("fields in the dataset = ", dataset.get_field_names()) # %% [markdown] # ## Section 3: Various operations on the Dataset @@ -314,12 +319,12 @@ def dprint(name: str, dictio: dict, end: str = "\n"): samples = [] for _ in range(nb_samples): sample = Sample() - sample.add_scalar("rotation", np.random.rand() + 1.0) - sample.add_scalar("random_name", np.random.rand() - 1.0) + sample.add_global("rotation", np.random.rand() + 1.0) + sample.add_global("random_name", np.random.rand() - 1.0) samples.append(sample) # Add a list of Samples -other_dataset.add_samples(samples) +other_dataset.get_backend().add_sample(samples) print(f"{other_dataset = }") # %% [markdown] @@ -328,10 +333,10 @@ def dprint(name: str, dictio: dict, end: str = "\n"): # %% # Merge the other dataset with the main dataset print(f"before merge: {dataset = }") -dataset.merge_dataset(other_dataset) +dataset._backend.merge_dataset(other_dataset) print(f"after merge: {dataset = }", end="\n\n") -dprint("dataset scalars = ", dataset.get_scalars_to_tabular()) +#dprint("dataset scalars = ", dataset.get_scalars_to_tabular()) # %% [markdown] # ### Add tabular scalars to a Dataset @@ -339,10 +344,10 @@ def dprint(name: str, dictio: dict, end: str = "\n"): # %% # Adding tabular scalars to the dataset new_scalars = np.random.rand(3, 2) -dataset.add_tabular_scalars(new_scalars, names=["Tu", "random_name"]) +#dataset.add_tabular_scalars(new_scalars, names=["Tu", "random_name"]) print(f"{dataset = }") -dprint("dataset scalars =", dataset.get_scalars_to_tabular()) +#dprint("dataset scalars =", dataset.get_scalars_to_tabular()) # %% [markdown] # ### Set additional information to a dataset @@ -353,7 +358,7 @@ def dprint(name: str, dictio: dict, end: str = "\n"): "data_production": {"type": "simulation", "simulator": "dummy"}, } dataset.set_infos(infos) -dataset.print_infos() +#dataset.print_infos() # %% [markdown] @@ -368,77 +373,77 @@ def dprint(name: str, dictio: dict, end: str = "\n"): tmpdir = f"/tmp/test_safe_to_delete_{np.random.randint(low=1, high=2_000_000_000)}" print(f"Save dataset in: {tmpdir}") -dataset.save_to_dir(tmpdir) +#dataset.save_to_dir(tmpdir) -# %% [markdown] -# ### Get the number of Samples that can be loaded from a directory +# # %% [markdown] +# # ### Get the number of Samples that can be loaded from a directory -# %% -nb_samples = plaid.get_number_of_samples(tmpdir) -print(f"{nb_samples = }") +# # %% +# nb_samples = plaid.get_number_of_samples(tmpdir) +# print(f"{nb_samples = }") -# %% [markdown] -# ### Load a Dataset from a directory via initialization +# # %% [markdown] +# # ### Load a Dataset from a directory via initialization -# %% -loaded_dataset_from_init = Dataset(tmpdir) -print(f"{loaded_dataset_from_init = }") +# # %% +# loaded_dataset_from_init = Dataset(tmpdir) +# print(f"{loaded_dataset_from_init = }") -if platform.system() == "Linux": - multi_process_loaded_dataset = Dataset(tmpdir, processes_number=3) - print(f"{multi_process_loaded_dataset = }") +# if platform.system() == "Linux": +# multi_process_loaded_dataset = Dataset(tmpdir, processes_number=3) +# print(f"{multi_process_loaded_dataset = }") -# %% [markdown] -# ### Load a Dataset from a directory via the Dataset class +# # %% [markdown] +# # ### Load a Dataset from a directory via the Dataset class -# %% -loaded_dataset_from_class = Dataset.load_from_dir(tmpdir) 
-print(f"{loaded_dataset_from_class = }") +# # %% +# loaded_dataset_from_class = Dataset.load_from_dir(tmpdir) +# print(f"{loaded_dataset_from_class = }") -if platform.system() == "Linux": - multi_process_loaded_dataset = Dataset.load_from_dir(tmpdir, processes_number=3) - print(f"{multi_process_loaded_dataset = }") +# if platform.system() == "Linux": +# multi_process_loaded_dataset = Dataset.load_from_dir(tmpdir, processes_number=3) +# print(f"{multi_process_loaded_dataset = }") -# %% [markdown] -# ### Load the dataset from a directory via a Dataset instance +# # %% [markdown] +# # ### Load the dataset from a directory via a Dataset instance -# %% -loaded_dataset_from_instance = Dataset() -loaded_dataset_from_instance.load(tmpdir) +# # %% +# loaded_dataset_from_instance = Dataset() +# loaded_dataset_from_instance.load(tmpdir) -print(f"{loaded_dataset_from_instance = }") +# print(f"{loaded_dataset_from_instance = }") -if platform.system() == "Linux": - multi_process_loaded_dataset = Dataset() - multi_process_loaded_dataset.load(tmpdir, processes_number=3) - print(f"{multi_process_loaded_dataset = }") +# if platform.system() == "Linux": +# multi_process_loaded_dataset = Dataset() +# multi_process_loaded_dataset.load(tmpdir, processes_number=3) +# print(f"{multi_process_loaded_dataset = }") -# %% [markdown] -# ### Save the dataset to a TAR (Tape Archive) file +# # %% [markdown] +# # ### Save the dataset to a TAR (Tape Archive) file -# %% -tmpdir = Path(f"/tmp/test_safe_to_delete_{np.random.randint(low=1, high=2_000_000_000)}") -tmpfile = tmpdir / "test_file.plaid" +# # %% +# tmpdir = Path(f"/tmp/test_safe_to_delete_{np.random.randint(low=1, high=2_000_000_000)}") +# tmpfile = tmpdir / "test_file.plaid" -print(f"Save dataset in: {tmpfile}") -dataset.save(tmpfile) +# print(f"Save dataset in: {tmpfile}") +# dataset.save(tmpfile) -# %% [markdown] -# ### Load the dataset from a TAR (Tape Archive) file via Dataset instance +# # %% [markdown] +# # ### Load the dataset from a TAR (Tape Archive) file via Dataset instance -# %% -new_dataset = Dataset() -new_dataset.load(tmpfile) +# # %% +# new_dataset = Dataset() +# new_dataset.load(tmpfile) -print(f"{dataset = }") -print(f"{new_dataset = }") +# print(f"{dataset = }") +# print(f"{new_dataset = }") -# %% [markdown] -# ### Load the dataset from a TAR (Tape Archive) file via initialization +# # %% [markdown] +# # ### Load the dataset from a TAR (Tape Archive) file via initialization -# %% -new_dataset = Dataset(tmpfile) +# # %% +# new_dataset = Dataset(tmpfile) -print(f"{dataset = }") -print(f"{new_dataset = }") +# print(f"{dataset = }") +# print(f"{new_dataset = }") diff --git a/examples/containers/sample_example.py b/examples/containers/sample_example.py index d5864261..3004da50 100644 --- a/examples/containers/sample_example.py +++ b/examples/containers/sample_example.py @@ -49,7 +49,7 @@ def show_sample(sample: Sample): print(f"sample = {sample}") sample.show_tree() - print(f"{sample.get_scalar_names() = }") + print(f"{sample.get_global_names() = }") print(f"{sample.get_field_names() = }") @@ -112,14 +112,14 @@ def show_sample(sample: Sample): # %% # Add a rotation scalar to this Sample -sample.add_scalar("rotation", np.random.randn()) +sample.add_global("rotation", np.random.randn()) show_sample(sample) # %% # Add a more scalars to this Sample -sample.add_scalar("speed", np.random.randn()) -sample.add_scalar("other", np.random.randn()) +sample.add_global("speed", np.random.randn()) +sample.add_global("other", np.random.randn()) show_sample(sample) 
@@ -192,8 +192,8 @@ def show_sample(sample: Sample): ) # Set the coordinates of nodes for a specified base and zone at a given time. -# set_points == set_nodes == set_vertices -sample.set_nodes(points, base_name="SurfaceMesh", zone_name="TestZoneName", time=0.0) +# set node coordinates +sample.set_nodes(points, base="SurfaceMesh", zone="TestZoneName", time=0.0) show_sample(sample) @@ -205,8 +205,8 @@ def show_sample(sample: Sample): sample.add_field( "Pressure", np.random.randn(len(points)), - base_name="SurfaceMesh", - zone_name="TestZoneName", + base="SurfaceMesh", + zone="TestZoneName", time=0.0, ) @@ -217,8 +217,8 @@ def show_sample(sample: Sample): sample.add_field( "Temperature", np.random.randn(len(points)), - base_name="SurfaceMesh", - zone_name="TestZoneName", + base="SurfaceMesh", + zone="TestZoneName", time=0.0, ) @@ -229,9 +229,9 @@ def show_sample(sample: Sample): # %% # It will look for a default base if no base and zone are given -print(f"{sample.get_scalar_names() = }") -print(f"{sample.get_scalar('omega') = }") -print(f"{sample.get_scalar('rotation') = }") +print(f"{sample.get_global_names() = }") +print(f"{sample.get_global('omega') = }") +print(f"{sample.get_global('rotation') = }") # %% [markdown] # ### Access fields data in Sample @@ -248,8 +248,7 @@ def show_sample(sample: Sample): # %% # It will look for a default base if no base and zone are given print(f"{sample.get_nodes() = }") -print(f"{sample.features.get_points() = }") # same as get_nodes -print(f"{sample.features.get_vertices() = }") # same as get_nodes +print(f"{sample.features.get_nodes() = }") # %% [markdown] # ### Retrieve element connectivity data @@ -533,7 +532,7 @@ def show_sample(sample: Sample): sample_save_fname = test_pth / "test" print(f"saving path: {sample_save_fname}") -sample.save(sample_save_fname) +sample.save_to_dir(sample_save_fname) # %% [markdown] # ### Load a Sample from a directory via initialization diff --git a/examples/convert_users_data_example.py b/examples/convert_users_data_example.py index 8cbc76f1..61ff908a 100644 --- a/examples/convert_users_data_example.py +++ b/examples/convert_users_data_example.py @@ -162,10 +162,10 @@ def in_notebook(): # Add random scalar values to the sample for sname in in_scalars_names: - sample.add_scalar(sname, np.random.randn()) + sample.add_global(sname, np.random.randn()) for sname in out_scalars_names: - sample.add_scalar(sname, np.random.randn()) + sample.add_global(sname, np.random.randn()) # Add random field values to the sample for j, sname in enumerate(out_fields_names): @@ -189,10 +189,10 @@ def in_notebook(): # Set information for the PLAID dataset dataset.set_infos(infos) -dataset.print_infos() +#dataset.print_infos() # %% # Add PLAID samples to the dataset -sample_ids = dataset.add_samples(samples) +sample_ids = dataset.get_backend().add_sample(samples) print(sample_ids) print(dataset) diff --git a/examples/examples/downloadable_example.py b/examples/examples/downloadable_example.py index c5ab3275..a804f2a8 100644 --- a/examples/examples/downloadable_example.py +++ b/examples/examples/downloadable_example.py @@ -63,7 +63,7 @@ end = time.perf_counter() print(f"First sample retrieval duration: {end - start:.6f} seconds") -assert(len(samples.vki_ls59.get_scalar_names())==8) +assert(len(samples.vki_ls59.get_global_names())==8) # %% from plaid.examples import samples @@ -73,4 +73,4 @@ end = time.perf_counter() print(f"The tensile2d dataset being already loaded: sample retrieval duration: {end - start:.6f} seconds") 
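Several examples above now append samples through the storage backend instead of `Dataset.add_sample` / `add_samples`; both the private `dataset._backend.add_sample(...)` and the accessor `dataset.get_backend().add_sample(...)` spellings appear in the diff. A sketch of the pattern, assuming `get_backend()` is the intended public entry point:

```python
from plaid import Dataset, Sample

dataset = Dataset()

# Single sample: the backend assigns and returns its id.
sample_id = dataset.get_backend().add_sample(Sample())

# A list is also accepted, as in convert_users_data_example.py.
sample_ids = dataset.get_backend().add_sample([Sample(), Sample()])
```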
-assert(len(samples.tensile2d.get_scalar_names())==10) \ No newline at end of file +assert(len(samples.tensile2d.get_global_names())==10) \ No newline at end of file diff --git a/examples/pipelines/config_pipeline.yml b/examples/pipelines/config_pipeline.yml deleted file mode 100644 index 98a2bb07..00000000 --- a/examples/pipelines/config_pipeline.yml +++ /dev/null @@ -1,35 +0,0 @@ -input_scalar_scaler: - in_features_identifiers: - - type: scalar - name: angle_in - - type: scalar - name: mach_out - -pca_nodes: - in_features_identifiers: - - type: nodes - base_name: Base_2_2 - out_features_identifiers: - - type: scalar - name: reduced_nodes_* - -pca_mach: - in_features_identifiers: - - type: field - name: mach - base_name: Base_2_2 - out_features_identifiers: - - type: scalar - name: reduced_mach_* - -regressor_mach: - in_features_identifiers: - - type: scalar - name: angle_in - - type: scalar - name: mach_out - - type: scalar - name: reduced_nodes_* - out_features_identifiers: - - type: scalar - name: reduced_mach_* \ No newline at end of file diff --git a/examples/pipelines/pipeline_example.py b/examples/pipelines/pipeline_example.py deleted file mode 100644 index b6466e65..00000000 --- a/examples/pipelines/pipeline_example.py +++ /dev/null @@ -1,345 +0,0 @@ -# --- -# jupyter: -# jupytext: -# formats: ipynb,py:percent -# text_representation: -# extension: .py -# format_name: percent -# format_version: '1.3' -# jupytext_version: 1.17.3 -# kernelspec: -# display_name: plaid-dev -# language: python -# name: python3 -# --- - -# %% [markdown] -# # Pipeline Examples -# -# This notebook demonstrates the end-to-end process of building a machine learning pipeline using PLAID datasets and PLAID’s scikit-learn-compatible blocks. - -# %% [markdown] -# ## PCA-GP for `mach` field prediction of `VKI-LS59` dataset -# -# Key steps covered: -# -# - **Loading the PLAID dataset** using Hugging Face integration and PLAID’s dataset classes -# - **Standardizing features** with PLAID-wrapped scikit-learn transformers for scalars -# - **Dimensionality reduction** of flow fields via Principal Component Analysis (PCA) to reduce output complexity -# - **Regression modeling** of PCA coefficients from scalar inputs using Gaussian Process regression -# - **Pipeline assembly** combining transformations and regressors into a single scikit-learn-compatible workflow -# - **Hyperparameter tuning** using Optuna and scikit-learn’s `GridSearchCV` -# - **Best practices** for working with PLAID datasets and pipelines in a reproducible and modular manner - -# %% [markdown] -# ### 📦 Imports - -# %% -import warnings -warnings.filterwarnings('ignore', module='sklearn') -warnings.filterwarnings("ignore", message=".*IProgress not found.*") - -import os -from pathlib import Path - -import yaml -import numpy as np -import optuna - -from datasets.utils.logging import disable_progress_bar - -from sklearn.base import clone -from sklearn.pipeline import Pipeline - -from sklearn.decomposition import PCA -from sklearn.preprocessing import MinMaxScaler -from sklearn.gaussian_process import GaussianProcessRegressor -from sklearn.gaussian_process.kernels import Matern -from sklearn.multioutput import MultiOutputRegressor - -from sklearn.model_selection import KFold, GridSearchCV - -from plaid.bridges.huggingface_bridge import huggingface_dataset_to_plaid, load_dataset_from_hub -from plaid.pipelines.sklearn_block_wrappers import WrappedSklearnTransformer, WrappedSklearnRegressor -from plaid.pipelines.plaid_blocks import 
TransformedTargetRegressor, ColumnTransformer - - -disable_progress_bar() -n_processes = min(max(1, os.cpu_count()), 6) - -# %% [markdown] -# ### 📥 Load Dataset -# -# We load the `VKI-LS59` dataset from Hugging Face and restrict ourselves to the first 24 samples of the training set. - -# %% -hf_dataset = load_dataset_from_hub("PLAID-datasets/VKI-LS59", split="all_samples[:24]") -dataset_train, pb_def = huggingface_dataset_to_plaid(hf_dataset, processes_number = n_processes, verbose = False) - - -# %% [markdown] -# We print the summary of dataset_train, which contains 24 samples, with 8 scalars and 8 fields, which is consistent with the `VKI-LS59` dataset: - -# %% -print(dataset_train) - - -# %% [markdown] -# ### ⚙️ Pipeline Configuration -# -# For convenience, the `in_features_identifiers` and `out_features_identifiers` for each pipeline block are defined in a `.yml` file. Here's an example of how the configuration might look: - -# %% [markdown] -# ```yaml -# pca_nodes: -# in_features_identifiers: -# - type: nodes -# base_name: Base_2_2 -# out_features_identifiers: -# - type: scalar -# name: reduced_nodes_* -# ``` - -# %% -try: - filename = Path(__file__).parent.parent.parent / "examples" / "pipelines" / "config_pipeline.yml" -except NameError: - filename = "config_pipeline.yml" - -with open(filename, 'r') as f: - config = yaml.safe_load(f) - -all_feature_id = config['input_scalar_scaler']['in_features_identifiers'] +\ - config['pca_nodes']['in_features_identifiers'] + config['pca_mach']['in_features_identifiers'] - -# %% [markdown] -# In this example, we aim to predict the ``mach`` field based on two input scalars ``angle_in`` and ``mach_out``, and the mesh node coordinates. To contain memory consumption, we restrict the dataset to the features required for this example: - -# %% -dataset_train = dataset_train.extract_dataset_from_identifier(all_feature_id) -print("dataset_train =", dataset_train) -print("scalar names =", dataset_train.get_scalar_names()) -print("field names =", dataset_train.get_field_names()) - -# %% [markdown] -# We notive that only the 2 scalars and the field of interest are kept after restriction. - -# %% [markdown] -# #### 1. Preprocessor -# -# We now define a preprocessor: a `MinMaxScaler` of the 2 input scalars and a `PCA` on the nodes coordinates of the meshes: - -# %% -preprocessor = ColumnTransformer( - [ - ('input_scalar_scaler', WrappedSklearnTransformer(MinMaxScaler(), **config['input_scalar_scaler'])), - ('pca_nodes', WrappedSklearnTransformer(PCA(), **config['pca_nodes'])), - ] -) -preprocessor - -# %% [markdown] -# We use a `PlaidColumnTransformer` to apply independent transformations to different feature groups. -# -# To verify this behavior, we apply the `preprocessor` to `dataset_train`: - -# %% -preprocessed_dataset = preprocessor.fit_transform(dataset_train) -print("preprocessed_dataset:", preprocessed_dataset) -print("scalar names =", preprocessed_dataset.get_scalar_names()) -print("field names =", preprocessed_dataset.get_field_names()) - - -# %% [markdown] -# Using `MinMaxScaler`, we scaled the `angle_in` and `mach_out` features, replacing their original values. In contrast, `PCA` compressed the node coordinates and produced new scalar features named `reduced_nodes_*`, representing the PCA components. Alternatively, we could have specified `out_features_identifiers` in the `.yml` file configuring the `MinMaxScaler` block to generate new scalars without overwriting the original inputs. - -# %% [markdown] -# #### 2. 
Postprocessor -# -# Next, we define the postprocessor, which applies PCA to the `mach` field: - -# %% -postprocessor = WrappedSklearnTransformer(PCA(), **config['pca_mach']) -postprocessor - -# %% [markdown] -# #### 3. TransformedTargetRegressor -# -# The Gaussian Process regressor takes the transformed `angle_in` and `mach_out` scalars, along with the PCA coefficients of the mesh node coordinates as inputs, and predicts the PCA coefficients of the `mach` field as outputs. This is facilitated by using a `PlaidTransformedTargetRegressor`. - -# %% -kernel = Matern(length_scale_bounds=(1e-8, 1e8), nu = 2.5) - -gpr = GaussianProcessRegressor( - kernel=kernel, - optimizer='fmin_l_bfgs_b', - n_restarts_optimizer=1, - random_state=42) - -reg = MultiOutputRegressor(gpr) - -regressor = WrappedSklearnRegressor(reg, **config['regressor_mach']) - -target_regressor = TransformedTargetRegressor( - regressor=regressor, - transformer=postprocessor -) -target_regressor - -# %% [markdown] -# `PlaidTransformedTargetRegressor` functions like scikit-learn’s `TransformedTargetRegressor` but operates directly on PLAID datasets. - -# %% [markdown] -# #### 4. Pipeline assembling -# -# We then define the complete pipeline as follows: - -# %% -pipeline = Pipeline( - steps=[ - ("preprocessor", preprocessor), - ("regressor", target_regressor), - ] -) -pipeline - - -# %% [markdown] -# ### 🎯 Optuna hyperparameter tuning -# -# We now use Optuna to optimize hyperparameters, specifically tuning the number of components for the two `PCA` blocks using three-fold cross-validation. - - -# %% -def objective(trial): - # Suggest hyperparameters - nodes_n_components = trial.suggest_int("preprocessor__pca_nodes__sklearn_block__n_components", 3, 4) - mach_n_components = trial.suggest_int("regressor__transformer__sklearn_block__n_components", 4, 5) - - # Clone and configure pipeline - pipeline_run = clone(pipeline) - pipeline_run.set_params( - preprocessor__pca_nodes__sklearn_block__n_components=nodes_n_components, - regressor__transformer__sklearn_block__n_components=mach_n_components, - regressor__regressor__sklearn_block__estimator__kernel=Matern( - length_scale_bounds=(1e-8, 1e8), nu=2.5, length_scale=np.ones(nodes_n_components + len(config['input_scalar_scaler']['in_features_identifiers'])) - ) - ) - - cv = KFold(n_splits=3, shuffle=True, random_state=42) - - scores = [] - - indices = np.arange(len(dataset_train)) - - for train_idx, val_idx in cv.split(indices): - - dataset_cv_train_ = dataset_train[train_idx] - dataset_cv_val_ = dataset_train[val_idx] - - pipeline_run.fit(dataset_cv_train_) - - score = pipeline_run.score(dataset_cv_val_) - - scores.append(score) - - return np.mean(scores) - -# %% [markdown] -# We maximize the defined objective function over 4 trials selected by Optuna. - - -# %% -preprocessed_dataset = preprocessor.fit_transform(dataset_train) -print("preprocessed_dataset:", preprocessed_dataset) -print("scalar names =", preprocessed_dataset.get_scalar_names()) -print("field names =", preprocessed_dataset.get_field_names()) - -# %% -study = optuna.create_study(direction='maximize') -study.optimize(objective, n_trials=4) -print("best_params =", study.best_params) - -# %% [markdown] -# We retrieve the best hyperparameters found by Optuna and use them to define the `optimized_pipeline`. 
- -# %% -optimized_pipeline = clone(pipeline).set_params(**study.best_params) -optimized_pipeline.set_params(regressor__regressor__sklearn_block__estimator__kernel=Matern( - length_scale_bounds=(1e-8, 1e8), nu=2.5, length_scale=np.ones(study.best_params['preprocessor__pca_nodes__sklearn_block__n_components'] + len(config['input_scalar_scaler']['in_features_identifiers'])) - ) -) - -optimized_pipeline.fit(dataset_train) - - -# %% [markdown] -# Next, we fit the `optimized_pipeline` to the `dataset_train` dataset and evaluate its performance on the same data. - -# %% -dataset_pred = optimized_pipeline.predict(dataset_train) -score = optimized_pipeline.score(dataset_train) -print("score =", score, ", error =", 1. - score) - -# %% [markdown] -# We use an anisotropic kernel in the Gaussian Process. Its optimized `length_scale` is a vector with dimensions equal to 2 plus the number of PCA components from `preprocessor__pca_nodes__sklearn_block__n_components`, accounting for the two input scalars. - -# %% -print(optimized_pipeline.named_steps["regressor"].regressor_.sklearn_block_.estimators_[0].kernel_.get_params()['length_scale']) - -# %% -print("Dimension GP kernel length_scale =", len(optimized_pipeline.named_steps["regressor"].regressor_.sklearn_block_.estimators_[0].kernel_.get_params()['length_scale'])) -print("Expected dimension =", 2 + study.best_params['preprocessor__pca_nodes__sklearn_block__n_components']) - -# %% [markdown] -# The error remains non-zero due to the approximation introduced by PCA. Since the Gaussian Process regressor interpolates, the error is expected to vanish on the training set if all PCA modes are retained. - -# %% -exact_pipeline = clone(pipeline).set_params( - preprocessor__pca_nodes__sklearn_block__n_components = 24, - regressor__transformer__sklearn_block__n_components = 24 -) -exact_pipeline.fit(dataset_train) -dataset_pred = exact_pipeline.predict(dataset_train) -score = exact_pipeline.score(dataset_train) -print("score =", score, ", error =", 1. - score) - - -# %% [markdown] -# ### 🔍 GridSearchCV hyperparameter tuning -# -# Since our pipeline nodes conform to the scikit-learn API, the constructed pipeline can be used directly with `GridSearchCV`. - -# %% -pca_n_components = [3, 4] -regressor_n_components = [4, 5] - -param_grid = [] -for n, m in zip(pca_n_components, regressor_n_components): - param_grid.append( - { - "preprocessor__pca_nodes__sklearn_block__n_components": [n], - "regressor__transformer__sklearn_block__n_components": [m], - "regressor__regressor__sklearn_block__estimator__kernel": [ - Matern( - length_scale_bounds=(1e-8, 1e8), nu=2.5, length_scale=np.ones(n + 2) - ) - ], - } - ) - -cv = KFold(n_splits=3, shuffle=True, random_state=42) -search = GridSearchCV(pipeline, param_grid=param_grid, cv=cv, verbose=3, error_score='raise') - -search.fit(dataset_train) - -# %% [markdown] -# We evaluate the performance of the optimized pipeline by computing its score on the training set. - -# %% -print("best_params =", search.best_params_) -optimized_pipeline = clone(pipeline).set_params(**search.best_params_) -optimized_pipeline.fit(dataset_train) -dataset_pred = optimized_pipeline.predict(dataset_train) -score = optimized_pipeline.score(dataset_train) -print("score =", score, ", error =", 1. 
- score) diff --git a/examples/post/bisect_example.py b/examples/post/bisect_example.py deleted file mode 100644 index 8ed21b62..00000000 --- a/examples/post/bisect_example.py +++ /dev/null @@ -1,132 +0,0 @@ -# --- -# jupyter: -# jupytext: -# formats: ipynb,py:percent -# text_representation: -# extension: .py -# format_name: percent -# format_version: '1.3' -# jupytext_version: 1.17.3 -# kernelspec: -# display_name: Python 3 -# language: python -# name: python3 -# --- - -# %% [markdown] -# # Bisect Plot Examples -# -# ## Introduction -# This notebook explains the use case of the `prepare_datasets`, and `plot_bisect` functions from the Plaid library. The function is used to generate bisect plots for different scenarios using file paths and PLAID objects. -# - -# %% -# Importing Required Libraries -from pathlib import Path -import os - -from plaid import Dataset -from plaid.post.bisect import plot_bisect, prepare_datasets -from plaid import ProblemDefinition - - -# %% -# Setting up Directories -try: - dataset_directory = Path(__file__).parent.parent.parent / "tests" / "post" -except NameError: - dataset_directory = Path("..") / ".." / ".." / ".." / "tests" / "post" - -# %% [markdown] -# ## Prepare Datasets for comparision -# -# Assuming you have reference and predicted datasets, and a problem definition, The `prepare_datasets` function is used to obtain output scalars for subsequent analysis. -# - -# %% -# Load PLAID datasets and problem metadata objects -ref_ds = Dataset(dataset_directory / "dataset_ref") -pred_ds = Dataset(dataset_directory / "dataset_near_pred") -problem = ProblemDefinition(dataset_directory / "problem_definition") - -# Get output scalars from reference and prediction dataset -ref_out_scalars, pred_out_scalars, out_scalars_names = prepare_datasets( - ref_ds, pred_ds, problem, verbose=True -) - -print(f"{out_scalars_names = }\n") - -# %% -# Get output scalar -key = out_scalars_names[0] - -print(f"KEY '{key}':\n") -print(f"ID{' ' * 5}--REF_out_scalars--{' ' * 7}--PRED_out_scalars--") - -# Print output scalar values for both datasets -index = 0 -for item1, item2 in zip(ref_out_scalars[key], pred_out_scalars[key]): - print( - f"{str(index).ljust(2)} | {str(item1).ljust(20)} | {str(item2).ljust(20)}" - ) - index += 1 - -# %% [markdown] -# ## Plotting with File Paths -# -# Here, we load the datasets and problem metadata from file paths and use the `plot_bisect` function to generate a bisect plot for a specific scalar, in this case, "scalar_2." - -# %% -print("=== Plot with file paths ===") - -# Load PLAID datasets and problem metadata from files -ref_path = dataset_directory / "dataset_ref" -pred_path = dataset_directory / "dataset_pred" -problem_path = dataset_directory / "problem_definition" - -# Using file paths to generate bisect plot on feature_2 -plot_bisect(ref_path, pred_path, problem_path, "feature_2", "differ_bisect_plot") - -# %% [markdown] -# ## Plotting with PLAID -# -# In this section, we demonstrate how to use PLAID objects directly to generate a bisect plot. This can be advantageous when working with PLAID datasets in memory. 
- -# %% -print("=== Plot with PLAID objects ===") - -# Load PLAID datasets and problem metadata objects -ref_path = Dataset(dataset_directory / "dataset_ref") -pred_path = Dataset(dataset_directory / "dataset_pred") -problem_path = ProblemDefinition(dataset_directory / "problem_definition") - -# Using PLAID objects to generate bisect plot on feature_2 -plot_bisect(ref_path, pred_path, problem_path, "feature_2", "equal_bisect_plot") - -# %% [markdown] -# ## Mixing with Scalar Index and Verbose -# -# In this final section, we showcase a mix of file paths and PLAID objects, incorporating a scalar index and enabling the verbose option when generating a bisect plot. This can provide more detailed information during the plotting process. - -# %% -print("=== Mix with scalar index and verbose ===") - -# Mix -ref_path = dataset_directory / "dataset_ref" -pred_path = dataset_directory / "dataset_near_pred" -problem_path = ProblemDefinition(dataset_directory / "problem_definition") - -# Using scalar index and verbose option to generate bisect plot -scalar_index = 0 -plot_bisect( - ref_path, - pred_path, - problem_path, - scalar_index, - "converge_bisect_plot", - verbose=True, -) - -os.remove("converge_bisect_plot.png") -os.remove("differ_bisect_plot.png") -os.remove("equal_bisect_plot.png") \ No newline at end of file diff --git a/examples/post/metrics_example.py b/examples/post/metrics_example.py deleted file mode 100644 index 70ea4bd6..00000000 --- a/examples/post/metrics_example.py +++ /dev/null @@ -1,126 +0,0 @@ -# --- -# jupyter: -# jupytext: -# formats: ipynb,py:percent -# text_representation: -# extension: .py -# format_name: percent -# format_version: '1.3' -# jupytext_version: 1.17.3 -# kernelspec: -# display_name: Python 3 -# language: python -# name: python3 -# --- - -# %% [markdown] -# # Metrics Examples -# -# ## Introduction -# This notebook demonstrates the use case of the `prepare_datasets`, `compute_metrics`, and `pretty_metrics` functions from the PLAID library. The function is used to compute metrics for comparing reference and predicted datasets based on a given problem definition. -# - -# %% -# Importing Required Libraries -from pathlib import Path -import os - -from plaid import Dataset -from plaid.post.metrics import compute_metrics, prepare_datasets, pretty_metrics -from plaid import ProblemDefinition - - -# %% -# Setting up Directories -try: - dataset_directory = Path(__file__).parent.parent.parent / "tests" / "post" -except NameError: - dataset_directory = Path("..") / ".." / ".." / ".." / "tests" / "post" - -# %% [markdown] -# ## Prepare Datasets for comparision -# -# Assuming you have reference and predicted datasets, and a problem definition, The `prepare_datasets` function is used to obtain output scalars for subsequent analysis. 
- -# %% -# Load PLAID datasets and problem metadata objects -ref_ds = Dataset(dataset_directory / "dataset_ref") -pred_ds = Dataset(dataset_directory / "dataset_near_pred") -problem = ProblemDefinition(dataset_directory / "problem_definition") - -# Get output scalars from reference and prediction dataset -ref_out_scalars, pred_out_scalars, out_scalars_names = prepare_datasets( - ref_ds, pred_ds, problem, verbose=True -) - -print(f"{out_scalars_names = }\n") - -# %% -# Get output scalar -key = out_scalars_names[0] - -print(f"KEY '{key}':\n") -print(f"ID{' ' * 5}--REF_out_scalars--{' ' * 7}--PRED_out_scalars--") - -# Print output scalar values for both datasets -index = 0 -for item1, item2 in zip(ref_out_scalars[key], pred_out_scalars[key]): - print( - f"{str(index).ljust(2)} | {str(item1).ljust(20)} | {str(item2).ljust(20)}" - ) - index += 1 - -# %% [markdown] -# ## Metrics with File Paths -# -# Here, we load the datasets and problem metadata from file paths and use the `compute_metrics` function to generate metrics for comparison. The resulting metrics are then printed in a structured dictionary format. - -# %% -print("=== Metrics with file paths ===") - -# Load PLAID datasets and problem metadata file paths -ref_ds = dataset_directory / "dataset_ref" -pred_ds = dataset_directory / "dataset_near_pred" -problem = dataset_directory / "problem_definition" - -# Using file paths to generate metrics -metrics = compute_metrics(ref_ds, pred_ds, problem, "first_metrics") - -import json - -# Print the resulting metrics -print("output dictionary =", json.dumps(metrics, indent=4)) - -# %% [markdown] -# ## Metrics with PLAID Objects and Verbose -# -# In this section, we demonstrate how to use PLAID objects directly to generate metrics, and the verbose option is enabled to provide more detailed information during the computation. - -# %% -print("=== Metrics with PLAID objects and verbose ===") - -# Load PLAID datasets and problem metadata objects -ref_ds = Dataset(dataset_directory / "dataset_ref") -pred_ds = Dataset(dataset_directory / "dataset_pred") -problem = ProblemDefinition(dataset_directory / "problem_definition") - -# Pretty print activated with verbose mode -metrics = compute_metrics(ref_ds, pred_ds, problem, "second_metrics", verbose=True) - -# %% [markdown] -# ## Print metrics in a beautiful way -# -# Finally, in this last section, we showcase a way to print metrics in a more aesthetically pleasing format using the `pretty_metrics` function. The provided dictionary is an example structure for representing metrics, and the function enhances the readability of the metrics presentation. 
(it is used by `compute_metrics` when verbose mode is activated) - -# %% -dictionary: dict = { - "RMSE:": { - "train": {"scalar_1": 0.12345, "scalar_2": 0.54321}, - "test": {"scalar_1": 0.56789, "scalar_2": 0.98765}, - } -} - -pretty_metrics(dictionary) - -os.remove("first_metrics.yaml") -os.remove("second_metrics.yaml") \ No newline at end of file diff --git a/examples/problem_definition_example.py b/examples/problem_definition_example.py index dccd4245..ab87b8c3 100644 --- a/examples/problem_definition_example.py +++ b/examples/problem_definition_example.py @@ -34,15 +34,13 @@ # %% # Import necessary libraries and functions -from plaid import Dataset, Sample from plaid import ProblemDefinition -from plaid.utils.split import split_dataset -from plaid.types import FeatureIdentifier # %% [markdown] # ## Section 1: Initializing an Empty ProblemDefinition # -# This section demonstrates how to initialize a Problem Definition and add inputs / outputs. +# This section demonstrates how to initialize a ProblemDefinition and add +# input/output feature identifiers with the current API. # %% [markdown] # ### Initialize and print ProblemDefinition @@ -54,27 +52,27 @@ # %% # ### Initialize some feature identifiers -scalar_1_feat_id = FeatureIdentifier({"type":"scalar", "name":"scalar_1"}) -scalar_2_feat_id = FeatureIdentifier({"type":"scalar", "name":"scalar_2"}) -scalar_3_feat_id = FeatureIdentifier({"type":"scalar", "name":"scalar_3"}) -field_1_feat_id = FeatureIdentifier({"type":"field", "name":"field_1", "base_name":"Base_2_2"}) -field_2_feat_id = FeatureIdentifier({"type":"field", "name":"field_2", "base_name":"Base_2_2", "location":"Vertex"}) +scalar_1_feat_id = "Global/scalar_1" +scalar_2_feat_id = "Global/scalar_2" +scalar_3_feat_id = "Global/scalar_3" +field_1_feat_id = "Base_2_2/Zone/CellCenterFields/field_1" +field_2_feat_id = "Base_2_2/Zone/VertexFields/field_2" # %% [markdown] # ### Add inputs / outputs to a Problem Definition # %% # Add unique input and output feature identifiers -problem.add_in_feature_identifier(scalar_1_feat_id) -problem.add_out_feature_identifier(scalar_2_feat_id) +problem.add_in_features_identifiers(scalar_1_feat_id) +problem.add_out_features_identifiers(scalar_2_feat_id) # Add list of input and output feature identifiers problem.add_in_features_identifiers([scalar_3_feat_id, field_1_feat_id]) problem.add_out_features_identifiers([field_2_feat_id]) -print(f"{problem.get_in_features_identifiers() = }") +print(f"{problem.input_features = }") print( - f"{problem.get_out_features_identifiers() = }", + f"{problem.output_features = }", ) # %% [markdown] @@ -87,52 +85,35 @@ # %% # Set the task type (e.g., regression) -problem.set_task("regression") -print(f"{problem.get_task() = }") +problem.task = "regression" +print(f"{problem.task = }") # %% [markdown] # ### Set Problem Definition split # %% -# Init an empty Dataset -dataset = Dataset() -print(f"{dataset = }") - -# Add Samples -dataset.add_samples([Sample(), Sample(), Sample(), Sample()]) -print(f"{dataset = }") +# Current API uses `train_split` and `test_split` fields. +# Note: each split field currently expects a dictionary with a single entry. 
+problem.train_split = {"train": [0, 1]} +problem.test_split = {"test": [2, 3]} +print(f"{problem.train_split = }") +print(f"{problem.test_split = }") # %% -# Set startegy options for the split -options = { - "shuffle": False, - "split_sizes": { - "train": 2, - "val": 1, - }, +split_names = [problem.get_train_split_name(), problem.get_test_split_name()] +split_indices = { + problem.get_train_split_name(): problem.get_train_split_indices(), + problem.get_test_split_name(): problem.get_test_split_indices(), } - -split = split_dataset(dataset, options) -print(f"{split = }") - -# %% -problem.set_split(split) -print(f"{problem.get_split() = }") - -# %% [markdown] -# ### Retrieves Problem Definition split indices - -# %% -# Get all split indices -print(f"{problem.get_all_indices() = }") +print(f"{split_names = }") +print(f"{split_indices = }") # %% [markdown] -# ### Filter Problem Definition inputs / outputs by feature identifiers +# ### Show inputs / outputs # %% -all_feature_ids = [scalar_1_feat_id, scalar_2_feat_id, scalar_3_feat_id, field_1_feat_id, field_2_feat_id] -print(f"{problem.filter_in_features_identifiers(all_feature_ids) = }") -print(f"{problem.filter_out_features_identifiers(all_feature_ids) = }") +print(f"{problem.input_features = }") +print(f"{problem.output_features = }") # %% [markdown] # ## Section 3: Saving and Loading Problem Definitions @@ -140,34 +121,22 @@ # This section demonstrates how to save and load a Problem Definition from a directory. # %% [markdown] -# ### Save a Problem Definition to a directory +# ### Save a Problem Definition to a YAML file # %% -test_pth = Path(f"/tmp/test_safe_to_delete_{np.random.randint(low=1, high=2_000_000_000)}") -pb_def_save_fname = test_pth / "test" +test_pth = Path( + f"/tmp/test_safe_to_delete_{np.random.randint(low=1, high=2_000_000_000)}" +) +pb_def_save_fname = test_pth / "test_problem_definition.yaml" test_pth.mkdir(parents=True, exist_ok=True) print(f"saving path: {pb_def_save_fname}") -problem.save_to_dir(pb_def_save_fname) - -# %% [markdown] -# ### Load a ProblemDefinition from a directory via initialization - -# %% -problem = ProblemDefinition(pb_def_save_fname) -print(problem) - -# %% [markdown] -# ### Load from a directory via the ProblemDefinition class - -# %% -problem = ProblemDefinition.load(pb_def_save_fname) -print(problem) +problem.save_to_file(pb_def_save_fname) # %% [markdown] -# ### Load from a directory via a Dataset instance +# ### Load a ProblemDefinition from a YAML file # %% problem = ProblemDefinition() -problem.load(pb_def_save_fname) +problem._load_from_file_(pb_def_save_fname) print(problem) diff --git a/examples/run_examples.sh b/examples/run_examples.sh index 8853bbd1..48898667 100755 --- a/examples/run_examples.sh +++ b/examples/run_examples.sh @@ -1,9 +1,9 @@ #!/bin/bash if [[ "$(uname)" == "Linux" ]]; then - FILES="*.py examples/*.py bridges/*.py utils/*.py containers/*.py post/*.py pipelines/*.py" + FILES="*.py examples/*.py utils/*.py containers/*.py bridges/*.py" else - FILES="*.py examples/*.py utils/*.py containers/*.py post/*.py" + FILES="*.py examples/*.py utils/*.py containers/*.py" fi for file in $FILES diff --git a/examples/utils/__init__.py b/examples/utils/__init__.py index a9efb940..e69de29b 100644 --- a/examples/utils/__init__.py +++ b/examples/utils/__init__.py @@ -1,6 +0,0 @@ -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. 
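For scripts migrating from the old `ProblemDefinition` API, the hunk above can be condensed into the following sketch. It only uses calls and identifier strings that appear in the updated `examples/problem_definition_example.py` diff (string feature identifiers, the `task` attribute, single-entry `train_split`/`test_split` dictionaries, and the YAML save/load round-trip); signatures should be double-checked against the current sources, and `_load_from_file_` is the private loader the example itself relies on. The file path below is illustrative only.

```python
# Condensed sketch of the ProblemDefinition workflow shown in the example diff
# above; the file path and feature identifiers are illustrative only.
from pathlib import Path

from plaid import ProblemDefinition

problem = ProblemDefinition()

# Feature identifiers are plain path-like strings in the current API.
problem.add_in_features_identifiers("Global/scalar_1")                        # single identifier
problem.add_out_features_identifiers(["Base_2_2/Zone/VertexFields/field_2"])  # list of identifiers

# Task and splits are attributes; each split field holds a single-entry dict.
problem.task = "regression"
problem.train_split = {"train": [0, 1]}
problem.test_split = {"test": [2, 3]}

print(f"{problem.input_features = }")
print(f"{problem.output_features = }")
print(f"{problem.get_train_split_name() = }")
print(f"{problem.get_train_split_indices() = }")

# YAML round-trip, mirroring the example above.
pb_def_file = Path("/tmp/problem_definition_sketch.yaml")
problem.save_to_file(pb_def_file)

reloaded = ProblemDefinition()
reloaded._load_from_file_(pb_def_file)
print(reloaded)
```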
-# -# diff --git a/examples/utils/init_with_tabular_example.py b/examples/utils/init_with_tabular_example.py deleted file mode 100644 index a2511dd6..00000000 --- a/examples/utils/init_with_tabular_example.py +++ /dev/null @@ -1,87 +0,0 @@ -# --- -# jupyter: -# jupytext: -# formats: ipynb,py:percent -# text_representation: -# extension: .py -# format_name: percent -# format_version: '1.3' -# jupytext_version: 1.17.3 -# kernelspec: -# display_name: Python 3 -# language: python -# name: python3 -# --- - -# %% [markdown] -# # Initializing a Dataset with Tabular Data -# -# 1. Initializing a Dataset with Tabular Data: -# - Generate random tabular data for multiple scalars. -# - Initialize a dataset with the tabular data. -# -# 2. Accessing and Manipulating Data in the Dataset: -# - Retrieve and print the dataset and specific samples. -# - Access and display the value of a particular scalar within a sample. -# - Retrieve tabular data from the dataset based on scalar names. -# -# This example demonstrates how to initialize a dataset with tabular data, access specific samples, retrieve scalar values, and extract tabular data based on scalar names. - -# %% -# Import required libraries -import numpy as np - -# %% -# Import necessary libraries and functions -from plaid.utils.init_with_tabular import initialize_dataset_with_tabular_data - - -# %% -# Print dict util -def dprint(name: str, dictio: dict): - print(name, "{") - for key, value in dictio.items(): - print(" ", key, ":", value) - - print("}") - - -# %% [markdown] -# ## Section 1: Initializing a Dataset with Tabular Data - -# %% -# Generate random tabular data for multiple scalars -nb_scalars = 7 -nb_samples = 10 -names = [f"scalar_{j}" for j in range(nb_scalars)] - -tabular_data = {} -for name in names: - tabular_data[name] = np.random.randn(nb_samples) - -dprint("tabular_data", tabular_data) - -# %% -# Initialize a dataset with the tabular data -dataset = initialize_dataset_with_tabular_data(tabular_data) -print("Initialized Dataset: ", dataset) - -# %% [markdown] -# ## Section 2: Accessing and Manipulating Data in the Dataset - -# %% -# Retrieve and print the dataset and specific samples -sample_1 = dataset[1] -print(f"{sample_1 = }") - -# %% -# Access and display the value of a particular scalar within a sample -scalar_value = sample_1.get_scalar("scalar_0") -print("Scalar 'scalar_0' in Sample 1:", scalar_value) - -# %% -# Retrieve tabular data from the dataset based on scalar names -scalar_names = ["scalar_1", "scalar_3", "scalar_5"] -tabular_data_subset = dataset.get_scalars_to_tabular(scalar_names) -print("Tabular Data Subset for Scalars 1, 3, and 5:") -dprint("tabular_data_subset", tabular_data_subset) diff --git a/examples/utils/interpolation_example.py b/examples/utils/interpolation_example.py deleted file mode 100644 index 5cf9c07a..00000000 --- a/examples/utils/interpolation_example.py +++ /dev/null @@ -1,245 +0,0 @@ -# --- -# jupyter: -# jupytext: -# formats: ipynb,py:percent -# text_representation: -# extension: .py -# format_name: percent -# format_version: '1.3' -# jupytext_version: 1.17.3 -# kernelspec: -# display_name: Python 3 -# language: python -# name: python3 -# --- - -# %% [markdown] -# # Interpolation Examples -# -# This Jupyter Notebook demonstrates the usage and functionality of interpolation functions in the PLAID library. It includes examples of: -# -# 1. Piece-wise linear interpolation -# 2. Piece-wise linear interpolation with mapping -# 3. Vectorized interpolation with mapping -# 4. Vectorized interpolation -# 5. 
Binary Search -# -# This function provides comprehensive examples and tests for interpolation functions, including piece-wise linear interpolation, interpolation with mapping, vectorized interpolation, and binary search. - -# %% -# Import required libraries -import numpy as np - -# %% -# Import necessary libraries and functions -from plaid.utils.interpolation import ( - binary_search, - binary_search_vectorized, - piece_wise_linear_interpolation, - piece_wise_linear_interpolation_vectorized, - piece_wise_linear_interpolation_vectorized_with_map, - piece_wise_linear_interpolation_with_map, -) - -# %% [markdown] -# ## Section 1: Piece-wise Linear Interpolation - -# %% -# Init example variables -time_indices = np.array([0.0, 1.0, 2.5]) -vectors = np.array([np.ones(5), 2.0 * np.ones(5), 3.0 * np.ones(5)]) - -print(f"{time_indices = }") -print(f"{vectors = }") - -# %% -# Test piece-wise linear interpolation for various inputs -result = piece_wise_linear_interpolation(-1.0, time_indices, vectors) -print(f"{result = }") - -np.testing.assert_almost_equal(result, [1.0, 1.0, 1.0, 1.0, 1.0]) - -# %% -result = piece_wise_linear_interpolation(1.0, time_indices, vectors) -print(f"{result = }") - -np.testing.assert_almost_equal(result, [2.0, 2.0, 2.0, 2.0, 2.0]) - -# %% -result = piece_wise_linear_interpolation(0.4, time_indices, vectors) -print(f"{result = }") - -np.testing.assert_almost_equal(result, [1.4, 1.4, 1.4, 1.4, 1.4]) - -# %% [markdown] -# ## Section 2: Piece-wise Linear Interpolation with Mapping - -# %% -# Init vectors variables -vectors_map = ["vec1", "vec2", "vec1"] -vectors_dict = {"vec1": np.ones(5), "vec2": 2.0 * np.ones(5)} - -print(f"{vectors_map = }") -print(f"{vectors_dict = }") - -# %% -# Test interpolation with mapping to named vectors -result = piece_wise_linear_interpolation_with_map( - 3.0, time_indices, vectors_dict, vectors_map -) -print(f"{result = }") - -np.testing.assert_almost_equal(result, [1.0, 1.0, 1.0, 1.0, 1.0]) - -# %% -result = piece_wise_linear_interpolation_with_map( - 1.0, time_indices, vectors_dict, vectors_map -) -print(f"{result = }") - -np.testing.assert_almost_equal(result, [2.0, 2.0, 2.0, 2.0, 2.0]) - -# %% -result = piece_wise_linear_interpolation_with_map( - 0.6, time_indices, vectors_dict, vectors_map -) -print(f"{result = }") - -np.testing.assert_almost_equal(result, [1.6, 1.6, 1.6, 1.6, 1.6]) - -# %% [markdown] -# ## Section 3: Vectorized Interpolation with Mapping - -# %% -# Init input values -input_values = np.array([-0.1, 2.0, 3.0]) - -# %% -result = piece_wise_linear_interpolation_vectorized_with_map( - input_values, time_indices, vectors_dict, vectors_map -) -print(f"{result = }") - -expected_result = [ - np.array([1.0, 1.0, 1.0, 1.0, 1.0]), - np.array([1.33333333, 1.33333333, 1.33333333, 1.33333333, 1.33333333]), - np.array([1.0, 1.0, 1.0, 1.0, 1.0]), -] - -np.testing.assert_almost_equal(result, expected_result) - -# %% -""" -Checks the accuracy of a piecewise linear interpolation function -by comparing its output for a set of input values to a set of precomputed -expected values. 
-""" - -time_indices = np.array( - [0.0, 100.0, 200.0, 300.0, 400.0, 500.0, 600.0, 700.0, 800.0, 900.0, 1000.0, 2000.0] -) - -coefficients = np.array( - [ - 2000000.0, - 2200000.0, - 2400000.0, - 2000000.0, - 2400000.0, - 3000000.0, - 2500000.0, - 2400000.0, - 2100000.0, - 2800000.0, - 4000000.0, - 3000000.0, - ] -) - -vals = np.array( - [ - -10.0, - 0.0, - 100.0, - 150.0, - 200.0, - 300.0, - 400.0, - 500.0, - 600.0, - 700.0, - 800.0, - 900.0, - 1000.0, - 3000.0, - 701.4752695491923, - ] -) - -res = np.array( - [ - 2000000.0, - 2000000.0, - 2200000.0, - 2300000.0, - 2400000.0, - 2000000.0, - 2400000.0, - 3000000.0, - 2500000.0, - 2400000.0, - 2100000.0, - 2800000.0, - 4000000.0, - 3000000.0, - 2395574.19135242, - ] -) - -for i in range(vals.shape[0]): - assert ( - piece_wise_linear_interpolation(vals[i], time_indices, coefficients) - res[i] - ) / res[i] < 1.0e-10 - -# %% [markdown] -# ## Section 4: Vectorized Interpolation - -# %% -result = piece_wise_linear_interpolation_vectorized( - np.array(vals), time_indices, coefficients -) - -expected_result = [ - 2000000.0, - 2000000.0, - 2200000.0, - 2300000.0, - 2400000.0, - 2000000.0, - 2400000.0, - 3000000.0, - 2500000.0, - 2400000.0, - 2100000.0, - 2800000.0, - 4000000.0, - 3000000.0, - 2395574.1913524233, -] - -np.testing.assert_almost_equal(result, expected_result) - -# %% [markdown] -# ## Section 5: Binary Search - -# %% -test_list = np.array([0.0, 1.0, 2.5, 10.0]) -val_list = np.array([-1.0, 11.0, 0.6, 2.0, 2.6, 9.9, 1.0]) - -# Apply binary search to find indices for given values within a reference list -ref = np.array([0, 3, 0, 1, 2, 2, 1], dtype=int) -result = binary_search_vectorized(test_list, val_list) - -for i, val in enumerate(val_list): - assert binary_search(test_list, val) == ref[i] - assert result[i] == ref[i] diff --git a/examples/utils/split_example.py b/examples/utils/split_example.py deleted file mode 100644 index 3230b271..00000000 --- a/examples/utils/split_example.py +++ /dev/null @@ -1,137 +0,0 @@ -# --- -# jupyter: -# jupytext: -# formats: ipynb,py:percent -# text_representation: -# extension: .py -# format_name: percent -# format_version: '1.3' -# jupytext_version: 1.17.3 -# kernelspec: -# display_name: Python 3 -# language: python -# name: python3 -# --- - -# %% [markdown] -# # Dataset Splitting Examples -# -# This Jupyter Notebook demonstrates the usage of the split module using the PLAID library. It includes examples of: -# -# 1. Initializing a Dataset -# 2. Splitting a Dataset with ratios -# 3. Splitting a Dataset with fixed sizes -# 4. Splitting a Dataset with ratio and fixed Sizes -# 5. Splitting a Dataset with custom split IDs -# -# This example demonstrates the usage of dataset splitting functions to divide a dataset into training, validation, and test sets. It provides examples of splitting the dataset using different methods and configurations. -# -# **Each section is documented and explained.** - -# %% -# Import required libraries -import numpy as np - -# %% -# Import necessary libraries and functions -from plaid.utils.init_with_tabular import initialize_dataset_with_tabular_data -from plaid.utils.split import split_dataset - - -# %% -# Print dict util -def dprint(name: str, dictio: dict): - print(name, "{") - for key, value in dictio.items(): - print(" ", key, ":", value) - - print("}") - - -# %% [markdown] -# ## Section 1: Initialize Dataset -# -# In this section, we create a dataset with random tabular data for testing purposes. The dataset will be used for subsequent splitting. 
- -# %% -# Create a dataset with random tabular data for testing purposes -nb_scalars = 7 -nb_samples = 70 -tabular_data = {f"scalar_{j}": np.random.randn(nb_samples) for j in range(nb_scalars)} -dataset = initialize_dataset_with_tabular_data(tabular_data) - -print(f"{dataset = }") - -# %% [markdown] -# ## Section 2: Splitting a Dataset with Ratios -# -# In this section, we split the dataset into training, validation, and test sets using specified ratios. We also have the option to shuffle the dataset during the split process. - -# %% -print("# First split") -options = { - "shuffle": True, - "split_ratios": { - "train": 0.8, - "val": 0.1, - }, -} - -split = split_dataset(dataset, options) -dprint("split =", split) - -# %% [markdown] -# ## Section 3: Splitting a Dataset with Fixed Sizes -# -# In this section, we split the dataset into training, validation, and test sets with fixed sample counts for each set. We can also choose to shuffle the dataset during the split. - -# %% -print("# Second split") -options = { - "shuffle": True, - "split_sizes": { - "train": 14, - "val": 8, - "test": 5, - }, -} - -split = split_dataset(dataset, options) -dprint("split =", split) - -# %% [markdown] -# ## Section 4: Splitting a Dataset with Ratios and Fixed Sizes -# -# In this section, we split the dataset into training, validation, and test sets with fixed sample counts and sample ratios for each set. We can also choose to shuffle the dataset during the split. - -# %% -print("# Third split") -options = { - "shuffle": True, - "split_ratios": { - "train": 0.7, - "test": 0.1, - }, - "split_sizes": {"val": 7}, -} - -split = split_dataset(dataset, options) -dprint("split =", split) - -# %% [markdown] -# ## Section 5: Splitting a Dataset with Custom Split IDs -# -# In this section, we split the dataset based on custom sample IDs for each set. We can specify the sample IDs for training, validation, and prediction sets. - -# %% -print("# Fourth split") -options = { - "split_ids": { - "train": np.arange(20), - "val": np.arange(30, 60), - "predict": np.arange(25, 35), - }, -} - -split = split_dataset(dataset, options) -dprint("split =", split) diff --git a/examples/utils/stats_example.py b/examples/utils/stats_example.py deleted file mode 100644 index f0a65fcf..00000000 --- a/examples/utils/stats_example.py +++ /dev/null @@ -1,174 +0,0 @@ -# --- -# jupyter: -# jupytext: -# formats: ipynb,py:percent -# text_representation: -# extension: .py -# format_name: percent -# format_version: '1.3' -# jupytext_version: 1.17.3 -# kernelspec: -# display_name: Python 3 -# language: python -# name: python3 -# --- - -# %% [markdown] -# # Statistics Calculation Examples -# -# 1. OnlineStatistics Class: -# - Initialize an OnlineStatistics object. -# - Calculate statistics for an empty dataset. -# - Add the first batch of samples and update statistics. -# - Add the second batch of samples and update statistics. -# - Combine and recompute statistics for all samples. -# -# 2. Stats Class: -# - Initialize a Stats object to collect statistics. -# - Create and add samples with scalar and field data. -# - Retrieve and display the calculated statistics. -# - Add more samples with varying field sizes and update statistics. -# - Retrieve and display the updated statistics. -# -# This notebook provides examples of using the OnlineStatistics and Stats classes to compute statistics from sample data, including scalars and fields. It demonstrates the functionality and usage of these classes. 
-# -# **Each section is documented and explained.** - -# %% -# Import required libraries -import numpy as np -import rich - -# %% -# Import necessary libraries and functions -from plaid import Sample -from plaid.utils.stats import OnlineStatistics, Stats - - -# %% -def sprint(stats: dict): - print("Stats:") - for k in stats: - print(" - {} -> {}".format(k, stats[k])) - - -# %% [markdown] -# ## Section 1: OnlineStatistics Class -# -# In this section, we demonstrate the usage of the OnlineStatistics class. We initialize an OnlineStatistics object and calculate statistics for an empty dataset. Then, we add the first and second batches of samples and update the statistics. Finally, we combine and recompute statistics for all samples. - -# %% [markdown] -# ### Initialize and empty OnlineStatistics - -# %% -print("#---# Initialize OnlineStatistics") -stats_computer = OnlineStatistics() -stats = stats_computer.get_stats() - -sprint(stats) - -# %% [markdown] -# ### Add sample batches - -# %% -# First batch of samples -first_batch_samples = 3.0 * np.random.randn(100, 3) + 10.0 -print(f"{first_batch_samples.shape = }") - -stats_computer.add_samples(first_batch_samples) -stats = stats_computer.get_stats() - -sprint(stats) - -# %% -second_batch_samples = 10.0 * np.random.randn(1000, 3) - 1.0 -print(f"{second_batch_samples.shape = }") - -stats_computer.add_samples(second_batch_samples) -stats = stats_computer.get_stats() - -sprint(stats) - -# %% [markdown] -# ### Combine and recompute statistics - -# %% -total_samples = np.concatenate((first_batch_samples, second_batch_samples), axis=0) -print(f"{total_samples.shape = }") - -new_stats_computer = OnlineStatistics() -new_stats_computer.add_samples(total_samples) -stats = new_stats_computer.get_stats() - -sprint(stats) - -# %% [markdown] -# ## Section 2: Stats Class -# -# In this section, we explore the Stats class. We initialize a Stats object to collect statistics, create and add samples with scalar and field data. We retrieve and display the calculated statistics. We also add more samples with varying field sizes and update the statistics, followed by retrieving and displaying the updated statistics. 
- -# %% [markdown] -# ### Initalize an empty Stats object - -# %% -print("#---# Initialize Stats") -stats = Stats() -print(f"{stats.get_stats() = }") - -# %% [markdown] -# ### Feed Stats with Samples - -# %% -print("#---# Feed Stats with samples") - -# Init 11 samples -nb_samples = 11 -samples = [Sample() for _ in range(nb_samples)] - -spatial_shape_max = 20 -# -for sample in samples: - sample.add_scalar("test_scalar", np.random.randn()) - sample.init_base(2, 3, "test_base") - zone_shape = np.array([[spatial_shape_max, 0, 0]]) - sample.init_zone(zone_shape, zone_name="test_zone") - sample.set_nodes(np.zeros((spatial_shape_max, 3))) - sample.add_field("test_field", np.random.randn(spatial_shape_max)) - -stats.add_samples(samples) - -# %% [markdown] -# ### Get and print stats - -# %% -rich.print("stats.get_stats():") -rich.print(stats.get_stats()) - -# %% [markdown] -# ### Feed Stats with more Samples - -# %% -nb_samples = 11 -spatial_shape_max = 20 -samples = [Sample() for _ in range(nb_samples)] - -for sample in samples: - sample.add_scalar("test_scalar", np.random.randn()) - sample.init_base(2, 3, "test_base") - zone_shape = np.array([[spatial_shape_max, 0, 0]]) - sample.init_zone(zone_shape, zone_name="test_zone") - sample.set_nodes(np.zeros((spatial_shape_max, 3))) - sample.add_field("test_field_same_size", np.random.randn(spatial_shape_max)) - sample.add_field( - "test_field", - np.random.randn(spatial_shape_max), - ) - -stats.add_samples(samples) - -# %% [markdown] -# ### Get and print stats - -# %% -rich.print("stats.get_stats():") -rich.print(stats.get_stats()) diff --git a/src/plaid/__init__.py b/src/plaid/__init__.py index c415ad42..dbe927c1 100644 --- a/src/plaid/__init__.py +++ b/src/plaid/__init__.py @@ -1,21 +1,9 @@ """PLAID package public API.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - -try: - from ._version import __version__ -except ImportError: # pragma: no cover - __version__ = "None" - from .containers.dataset import Dataset from .containers.sample import Sample from .containers.utils import get_number_of_samples, get_sample_ids from .problem_definition import ProblemDefinition +from .version import __version__ __all__ = [ "__version__", diff --git a/src/plaid/bridges/__init__.py b/src/plaid/bridges/__init__.py deleted file mode 100644 index 262b1c42..00000000 --- a/src/plaid/bridges/__init__.py +++ /dev/null @@ -1,7 +0,0 @@ -"""Package that implements the PLAID bridges.""" -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# diff --git a/src/plaid/bridges/huggingface_bridge.py b/src/plaid/bridges/huggingface_bridge.py deleted file mode 100644 index 06c83dcc..00000000 --- a/src/plaid/bridges/huggingface_bridge.py +++ /dev/null @@ -1,2178 +0,0 @@ -"""Hugging Face bridge for PLAID datasets.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. 
-# -# -import os -import pickle -import shutil -import sys -from multiprocessing import Pool -from pathlib import Path -from typing import Any, Optional, Union - -import numpy as np -import pyarrow as pa -import yaml -from tqdm import tqdm - -if sys.version_info >= (3, 11): - from typing import Self -else: # pragma: no cover - from typing import TypeVar - - Self = TypeVar("Self") - -# ------------------------------------------------------------------------------ -# imports for the saved functions at the end of the file -import hashlib -import io -import logging -import multiprocessing as mp -import traceback -from functools import partial -from queue import Empty -from typing import Callable - -import datasets -from datasets import Features, Sequence, Value, load_dataset, load_from_disk -from huggingface_hub import HfApi, hf_hub_download, snapshot_download -from pydantic import ValidationError - -from plaid import Dataset, ProblemDefinition, Sample -from plaid.containers.features import SampleFeatures -from plaid.types import IndexType -from plaid.utils.cgns_helper import ( - flatten_cgns_tree, - unflatten_cgns_tree, -) - -# ------------------------------------------------------------------------------ - -logger = logging.getLogger(__name__) -pa.set_memory_pool(pa.system_memory_pool()) - - -# ------------------------------------------------------------------------------ -# HUGGING FACE BRIDGE (with tree flattening and pyarrow tables) -# ------------------------------------------------------------------------------ - - -def to_plaid_dataset( - hf_dataset: datasets.Dataset, - flat_cst: dict[str, Any], - cgns_types: dict[str, str], - enforce_shapes: bool = True, -) -> Dataset: - """Convert a Hugging Face dataset into a PLAID dataset. - - Iterates over all samples in a Hugging Face `Dataset` and converts each one - into a PLAID-compatible sample using `to_plaid_sample`. The resulting - samples are then collected into a single PLAID `Dataset`. - - Args: - hf_dataset (datasets.Dataset): The Hugging Face dataset split to convert. - flat_cst (dict[str, Any]): Flattened representation of the CGNS tree structure constants. - cgns_types (dict[str, str]): Mapping of CGNS paths to their expected types. - enforce_shapes (bool, optional): If True, ensures all arrays strictly follow the reference shapes. Defaults to True. - - Returns: - Dataset: A PLAID `Dataset` object containing the converted samples. - """ - sample_list = [] - for i in range(len(hf_dataset)): - sample_list.append( - to_plaid_sample(hf_dataset, i, flat_cst, cgns_types, enforce_shapes) - ) - - return Dataset(samples=sample_list) - - -def to_plaid_sample( - ds: datasets.Dataset, - i: int, - flat_cst: dict[str, Any], - cgns_types: dict[str, str], - enforce_shapes: bool = True, -) -> Sample: - """Convert a Hugging Face dataset row to a PLAID Sample object. - - Extracts a single row from a Hugging Face dataset and converts it - into a PLAID Sample by unflattening the CGNS tree structure. Constant features - from flat_cst are merged with the variable features from the row. - - Args: - ds (datasets.Dataset): The Hugging Face dataset containing the sample data. - i (int): The index of the row to convert. - flat_cst (dict[str, Any]): Dictionary of constant features to add to each sample. - cgns_types (dict[str, str]): Dictionary mapping paths to CGNS types for reconstruction. - enforce_shapes (bool, optional): If True, ensures consistent array shapes during conversion. Defaults to True. 
- - Returns: - Sample: A validated PLAID Sample object reconstructed from the Hugging Face dataset row. - - Note: - - Uses the dataset's pyarrow table data for efficient access. - - Handles array shapes and types according to enforce_shapes. - - Constant features from flat_cst are merged with the variable features from the row. - """ - assert not isinstance(flat_cst[next(iter(flat_cst))], dict), ( - "did you provide the complete `flat_cst` instead of the one for the considered split?" - ) - - table = ds.data - row = {} - if not enforce_shapes: - for name in table.column_names: - value = table[name][i].values - if value is None: - row[name] = None # pragma: no cover - else: - row[name] = value.to_numpy(zero_copy_only=False) - else: - for name in table.column_names: - if isinstance(table[name][i], pa.NullScalar): - row[name] = None # pragma: no cover - else: - value = table[name][i].values - if value is None: - row[name] = None # pragma: no cover - else: - if isinstance(value, pa.ListArray): - row[name] = np.stack(value.to_numpy(zero_copy_only=False)) - elif isinstance(value, pa.StringArray): # pragma: no cover - row[name] = value.to_numpy(zero_copy_only=False) - else: - row[name] = value.to_numpy(zero_copy_only=True) - - flat_cst_val = {k: v for k, v in flat_cst.items() if not k.endswith("_times")} - flat_cst_times = {k[:-6]: v for k, v in flat_cst.items() if k.endswith("_times")} - - row_val = {k: v for k, v in row.items() if not k.endswith("_times")} - row_tim = {k[:-6]: v for k, v in row.items() if k.endswith("_times")} - - row_val.update(flat_cst_val) - row_tim.update(flat_cst_times) - - row_val = {p: row_val[p] for p in sorted(row_val)} - row_tim = {p: row_tim[p] for p in sorted(row_tim)} - - sample_flat_trees = {} - paths_none = {} - for (path_t, times_struc), (path_v, val) in zip(row_tim.items(), row_val.items()): - assert path_t == path_v - if val is None: - assert times_struc is None - if path_v not in paths_none and cgns_types[path_v] not in [ - "DataArray_t", - "IndexArray_t", - ]: - paths_none[path_v] = None - else: - times_struc = times_struc.reshape((-1, 3)) - for i, time in enumerate(times_struc[:, 0]): - start = int(times_struc[i, 1]) - end = int(times_struc[i, 2]) - if end == -1: - end = None - if val.ndim > 1: - values = val[:, start:end] - else: - values = val[start:end] - if isinstance(values[0], str): - values = np.frombuffer( - values[0].encode("ascii", "strict"), dtype="|S1" - ) - if time in sample_flat_trees: - sample_flat_trees[time][path_v] = values - else: - sample_flat_trees[time] = {path_v: values} - - for time, tree in sample_flat_trees.items(): - bases = list(set([k.split("/")[0] for k in tree.keys()])) - for base in bases: - tree[f"{base}/Time"] = np.array([1], dtype=np.int32) - tree[f"{base}/Time/IterationValues"] = np.array([1], dtype=np.int32) - tree[f"{base}/Time/TimeValues"] = np.array([time], dtype=np.float64) - tree["CGNSLibraryVersion"] = np.array([4.0], dtype=np.float32) - - sample_data = {} - for time, flat_tree in sample_flat_trees.items(): - flat_tree.update(paths_none) - sample_data[time] = unflatten_cgns_tree(flat_tree, cgns_types) - - return Sample(path=None, features=SampleFeatures(sample_data)) - - -# ------------------------------------------------------------------------------ -# HUGGING FACE HUB INTERACTIONS -# ------------------------------------------------------------------------------ - - -def instantiate_plaid_datasetdict_from_hub( - repo_id: str, - enforce_shapes: bool = True, -) -> dict[str, Dataset]: # pragma: no cover (not tested 
in unit tests) - """Load a Hugging Face dataset from the Hub and instantiate it as a dictionary of PLAID datasets. - - This function retrieves a dataset dictionary from the Hugging Face Hub, - along with its associated CGNS tree structure and type information. Each - split of the Hugging Face dataset is then converted into a PLAID dataset. - - Args: - repo_id (str): - The Hugging Face repository identifier (e.g. `"user/dataset"`). - enforce_shapes (bool, optional): - If True, enforce strict array shapes when converting to PLAID - datasets. Defaults to True. - - Returns: - dict[str, Dataset]: - A dictionary mapping split names (e.g. `"train"`, `"test"`) to - PLAID `Dataset` objects. - - """ - hf_dataset_dict = load_dataset_from_hub(repo_id) - - flat_cst, key_mappings = load_tree_struct_from_hub(repo_id) - cgns_types = key_mappings["cgns_types"] - - datasetdict = {} - for split_name, hf_dataset in hf_dataset_dict.items(): - datasetdict[split_name] = to_plaid_dataset( - hf_dataset, flat_cst, cgns_types, enforce_shapes - ) - - return datasetdict - - -def load_dataset_from_hub( - repo_id: str, streaming: bool = False, *args, **kwargs -) -> Union[ - datasets.Dataset, - datasets.DatasetDict, - datasets.IterableDataset, - datasets.IterableDatasetDict, -]: # pragma: no cover (not tested in unit tests) - """Loads a Hugging Face dataset from the public hub, a private mirror, or local cache, with automatic handling of streaming and download modes. - - Behavior: - - - If the environment variable `HF_ENDPOINT` is set, uses a private Hugging Face mirror. - - - Streaming is disabled. - - The dataset is downloaded locally via `snapshot_download` and loaded from disk. - - - If `HF_ENDPOINT` is not set, attempts to load from the public Hugging Face hub. - - - If the dataset is already cached locally, loads from disk. - - Otherwise, loads from the hub, optionally using streaming mode. - - Args: - repo_id (str): The Hugging Face dataset repository ID (e.g., 'username/dataset'). - streaming (bool, optional): If True, attempts to stream the dataset (only supported on the public hub). - *args: - Positional arguments forwarded to - [`datasets.load_dataset`](https://huggingface.co/docs/datasets/main/en/package_reference/loading_methods#datasets.load_dataset). - **kwargs: - Keyword arguments forwarded to - [`datasets.load_dataset`](https://huggingface.co/docs/datasets/main/en/package_reference/loading_methods#datasets.load_dataset). - - Returns: - Union[datasets.Dataset, datasets.DatasetDict]: The loaded Hugging Face dataset object. - - Raises: - Exception: Propagates any exceptions raised by `datasets.load_dataset`, `datasets.load_from_disk`, or `huggingface_hub.snapshot_download` if loading fails. - - Note: - - Streaming mode is not supported when using a private mirror. - - If the dataset is found in the local cache, loads from disk instead of streaming. - - To use behind a proxy or with a private mirror, you may need to set: - - HF_ENDPOINT to your private mirror address - - CURL_CA_BUNDLE to your trusted CA certificates - - HF_HOME to a shared cache directory if needed - """ - hf_endpoint = os.getenv("HF_ENDPOINT", "").strip() - - # Helper to check if dataset repo is already cached - def _get_cached_path(repo_id_): - try: - return snapshot_download( - repo_id=repo_id_, repo_type="dataset", local_files_only=True - ) - except FileNotFoundError: - return None - - # Private mirror case - if hf_endpoint: - if streaming: - logger.warning( - "Streaming mode not compatible with private mirror. 
Falling back to download mode." - ) - local_path = snapshot_download(repo_id=repo_id, repo_type="dataset") - return load_dataset(local_path, *args, **kwargs) - - # Public case - local_path = _get_cached_path(repo_id) - if local_path is not None and streaming is True: - # Even though streaming mode: rely on local files if already downloaded - logger.info("Dataset found in cache. Loading from disk instead of streaming.") - return load_dataset(local_path, *args, **kwargs) - - return load_dataset(repo_id, streaming=streaming, *args, **kwargs) - - -def load_infos_from_hub( - repo_id: str, -) -> dict[str, dict[str, str]]: # pragma: no cover (not tested in unit tests) - """Load dataset infos from the Hugging Face Hub. - - Downloads the infos.yaml file from the specified repository and parses it as a dictionary. - - Args: - repo_id (str): The repository ID on the Hugging Face Hub. - - Returns: - dict[str, dict[str, str]]: Dictionary containing dataset infos. - """ - # Download infos.yaml - yaml_path = hf_hub_download( - repo_id=repo_id, filename="infos.yaml", repo_type="dataset" - ) - with open(yaml_path, "r", encoding="utf-8") as f: - infos = yaml.safe_load(f) - - return infos - - -def load_problem_definition_from_hub( - repo_id: str, name: str -) -> ProblemDefinition: # pragma: no cover (not tested in unit tests) - """Load a ProblemDefinition from the Hugging Face Hub. - - Downloads the problem infos YAML and split JSON files from the specified repository and location, - then initializes a ProblemDefinition object with this information. - - Args: - repo_id (str): The repository ID on the Hugging Face Hub. - name (str): The name of the problem_definition stored in the repo. - - Returns: - ProblemDefinition: The loaded problem definition. - """ - if not name.endswith(".yaml"): - name = f"{name}.yaml" - - # Download problem_infos.yaml - yaml_path = hf_hub_download( - repo_id=repo_id, - filename=f"problem_definitions/{name}", - repo_type="dataset", - ) - with open(yaml_path, "r", encoding="utf-8") as f: - yaml_data = yaml.safe_load(f) - - prob_def = ProblemDefinition() - prob_def._initialize_from_problem_infos_dict(yaml_data) - - return prob_def - - -def load_tree_struct_from_hub( - repo_id: str, -) -> tuple[dict, dict]: # pragma: no cover (not tested in unit tests) - """Load the tree structure metadata of a PLAID dataset from the Hugging Face Hub. - - This function retrieves two artifacts previously uploaded alongside a dataset: - - **tree_constant_part.pkl**: a pickled dictionary of constant feature values - (features that are identical across all samples). - - **key_mappings.yaml**: a YAML file containing metadata about the dataset - feature structure, including variable features, constant features, and CGNS types. - - Args: - repo_id (str): - The repository ID on the Hugging Face Hub - (e.g., `"username/dataset_name"`). - - Returns: - tuple[dict, dict]: - - **flat_cst (dict)**: constant features dictionary (path → value). - - **key_mappings (dict)**: metadata dictionary containing keys such as: - - `"variable_features"`: list of paths for non-constant features. - - `"constant_features"`: list of paths for constant features. - - `"cgns_types"`: mapping from paths to CGNS types. 
- """ - # constant part of the tree - flat_cst_path = hf_hub_download( - repo_id=repo_id, - filename="tree_constant_part.pkl", - repo_type="dataset", - ) - - with open(flat_cst_path, "rb") as f: - flat_cst = pickle.load(f) - - # key mappings - yaml_path = hf_hub_download( - repo_id=repo_id, - filename="key_mappings.yaml", - repo_type="dataset", - ) - with open(yaml_path, "r", encoding="utf-8") as f: - key_mappings = yaml.safe_load(f) - - return flat_cst, key_mappings - - -# ------------------------------------------------------------------------------ -# HUGGING FACE INTERACTIONS ON DISK -# ------------------------------------------------------------------------------ - - -def load_dataset_from_disk( - path: Union[str, Path], *args, **kwargs -) -> Union[datasets.Dataset, datasets.DatasetDict]: - """Load a Hugging Face dataset or dataset dictionary from disk. - - This function wraps `datasets.load_from_disk` to accept either a string path or a - `Path` object and returns the loaded dataset object. - - Args: - path (Union[str, Path]): Path to the directory containing the saved dataset. - *args: - Positional arguments forwarded to - [`datasets.load_from_disk`](https://huggingface.co/docs/datasets/main/en/package_reference/loading_methods#datasets.load_from_disk). - **kwargs: - Keyword arguments forwarded to - [`datasets.load_from_disk`](https://huggingface.co/docs/datasets/main/en/package_reference/loading_methods#datasets.load_from_disk). - - Returns: - Union[datasets.Dataset, datasets.DatasetDict]: The loaded Hugging Face dataset - object, which may be a single `Dataset` or a `DatasetDict` depending on what - was saved on disk. - """ - return load_from_disk(str(path), *args, **kwargs) - - -def load_infos_from_disk(path: Union[str, Path]) -> dict[str, dict[str, str]]: - """Load dataset information from a YAML file stored on disk. - - Args: - path (Union[str, Path]): Directory path containing the `infos.yaml` file. - - Returns: - dict[str, dict[str, str]]: Dictionary containing dataset infos. - """ - infos_fname = Path(path) / "infos.yaml" - with infos_fname.open("r") as file: - infos = yaml.safe_load(file) - return infos - - -def load_problem_definition_from_disk( - path: Union[str, Path], name: Union[str, Path] -) -> ProblemDefinition: - """Load a ProblemDefinition and its split information from disk. - - Args: - path (Union[str, Path]): The root directory path for loading. - name (str): The name of the problem_definition stored in the disk directory. - - Returns: - ProblemDefinition: The loaded problem definition. - """ - pb_def = ProblemDefinition() - pb_def._load_from_file_(Path(path) / Path("problem_definitions") / Path(name)) - return pb_def - - -def load_tree_struct_from_disk( - path: Union[str, Path], -) -> tuple[dict[str, Any], dict[str, Any]]: - """Load a tree structure for a dataset from disk. - - This function loads two components from the specified directory: - 1. `tree_constant_part.pkl`: a pickled dictionary containing the constant parts of the tree. - 2. `key_mappings.yaml`: a YAML file containing key mappings and metadata. - - Args: - path (Union[str, Path]): Directory path containing the `tree_constant_part.pkl` - and `key_mappings.yaml` files. - - Returns: - tuple[dict, dict]: A tuple containing: - - `flat_cst` (dict): Dictionary of constant tree values. - - `key_mappings` (dict): Dictionary of key mappings and metadata. 
- """ - with open(Path(path) / Path("key_mappings.yaml"), "r", encoding="utf-8") as f: - key_mappings = yaml.safe_load(f) - - with open(Path(path) / "tree_constant_part.pkl", "rb") as f: - flat_cst = pickle.load(f) - - return flat_cst, key_mappings - - -# ------------------------------------------------------------------------------ -# HUGGING FACE BINARY BRIDGE -# ------------------------------------------------------------------------------ - - -def binary_to_plaid_sample(hf_sample: dict[str, bytes]) -> Sample: - """Convert a Hugging Face dataset sample in binary format to a Plaid `Sample`. - - The input `hf_sample` is expected to contain a pickled representation of a sample - under the key `"sample"`. This function attempts to validate the unpickled sample - as a Plaid `Sample`. If validation fails, it reconstructs the sample from its - components (`meshes`, `path`, and optional `scalars`) before validating it. - - Args: - hf_sample (dict[str, bytes]): A dictionary representing a Hugging Face sample, - with the pickled sample stored under the key `"sample"`. - - Returns: - Sample: A validated Plaid `Sample` object. - - Raises: - KeyError: If required keys (`"sample"`, `"meshes"`, `"path"`) are missing - and the sample cannot be reconstructed. - ValidationError: If the reconstructed sample still fails Plaid validation. - """ - pickled_hf_sample = pickle.loads(hf_sample["sample"]) - - try: - # Try to validate the sample - return Sample.model_validate(pickled_hf_sample) - - except ValidationError: - features = SampleFeatures( - data=pickled_hf_sample.get("meshes"), - ) - - sample = Sample( - path=pickled_hf_sample.get("path"), - features=features, - ) - - scalars = pickled_hf_sample.get("scalars") - if scalars: - for sn, val in scalars.items(): - sample.add_scalar(sn, val) - - return Sample.model_validate(sample) - - -def huggingface_dataset_to_plaid( - ds: datasets.Dataset, - ids: Optional[list[int]] = None, - processes_number: int = 1, - large_dataset: bool = False, - verbose: bool = True, -) -> tuple[Union[Dataset, ProblemDefinition], ProblemDefinition]: - """Use this function for converting a plaid dataset from a Hugging Face dataset. - - A Hugging Face dataset can be read from disk or the hub. From the hub, the - split = "all_samples" options is important to get a dataset and not a datasetdict. - Many options from loading are available (caching, streaming, etc...) - - Args: - ds (datasets.Dataset): the dataset in Hugging Face format to be converted - ids (list, optional): The specific sample IDs to load from the dataset. Defaults to None. - processes_number (int, optional): The number of processes used to generate the plaid dataset - large_dataset (bool): if True, uses a variant where parallel worker do not each load the complete dataset. Default: False. - verbose (bool, optional): if True, prints progress using tdqm - - Returns: - dataset (Dataset): the converted dataset. - problem_definition (ProblemDefinition): the problem definition generated from the Hugging Face dataset - - Example: - .. 
code-block:: python - - from datasets import load_dataset, load_from_disk - - dataset = load_dataset("path/to/dir", split = "all_samples") - dataset = load_from_disk("chanel/dataset") - plaid_dataset, plaid_problem = huggingface_dataset_to_plaid(dataset) - """ - from plaid.bridges.huggingface_helpers import ( - _HFShardToPlaidSampleConverter, - _HFToPlaidSampleConverter, - ) - - assert processes_number <= len(ds), ( - "Trying to parallelize with more processes than samples in dataset" - ) - if ids: - assert processes_number <= len(ids), ( - "Trying to parallelize with more processes than selected samples in dataset" - ) - - description = "Converting Hugging Face binary dataset to plaid" - - dataset = Dataset() - - if large_dataset: - if ids: - raise NotImplementedError( - "ids selection not implemented with large_dataset option" - ) - for i in range(processes_number): - shard = ds.shard(num_shards=processes_number, index=i) - shard.save_to_disk(f"./shards/dataset_shard_{i}") - - def parallel_convert(shard_path, n_workers): - converter = _HFShardToPlaidSampleConverter(shard_path) - with Pool(processes=n_workers) as pool: - return list( - tqdm( - pool.imap(converter, range(len(converter.hf_ds))), - total=len(converter.hf_ds), - disable=not verbose, - desc=description, - ) - ) - - samples = [] - - for i in range(processes_number): - shard_path = Path(".") / "shards" / f"dataset_shard_{i}" - shard_samples = parallel_convert(shard_path, n_workers=processes_number) - samples.extend(shard_samples) - - dataset.add_samples(samples, ids) - - shards_dir = Path(".") / "shards" - if shards_dir.exists() and shards_dir.is_dir(): - shutil.rmtree(shards_dir) - - else: - if ids: - indices = ids - else: - indices = range(len(ds)) - - if processes_number == 1: - for idx in tqdm( - indices, total=len(indices), disable=not verbose, desc=description - ): - sample = _HFToPlaidSampleConverter(ds)(idx) - dataset.add_sample(sample, id=idx) - - else: - with Pool(processes=processes_number) as pool: - for idx, sample in enumerate( - tqdm( - pool.imap(_HFToPlaidSampleConverter(ds), indices), - total=len(indices), - disable=not verbose, - desc=description, - ) - ): - dataset.add_sample(sample, id=indices[idx]) - - infos = huggingface_description_to_infos(ds.description) - - dataset.set_infos(infos, warn=False) - - problem_definition = huggingface_description_to_problem_definition(ds.description) - - return dataset, problem_definition - - -def huggingface_description_to_problem_definition( - description: dict, -) -> ProblemDefinition: - """Converts a Hugging Face dataset description to a plaid problem definition. 
- - Args: - description (dict): the description field of a Hugging Face dataset, containing the problem definition - - Returns: - problem_definition (ProblemDefinition): the plaid problem definition initialized from the Hugging Face dataset description - """ - description = {} if description == "" else description - problem_definition = ProblemDefinition() - for func, key in [ - (problem_definition.set_task, "task"), - (problem_definition.set_split, "split"), - (problem_definition.add_input_scalars_names, "in_scalars_names"), - (problem_definition.add_output_scalars_names, "out_scalars_names"), - (problem_definition.add_input_fields_names, "in_fields_names"), - (problem_definition.add_output_fields_names, "out_fields_names"), - (problem_definition.add_input_meshes_names, "in_meshes_names"), - (problem_definition.add_output_meshes_names, "out_meshes_names"), - ]: - try: - func(description[key]) - except KeyError: - logger.error(f"Could not retrieve key:'{key}' from description") - pass - - return problem_definition - - -def huggingface_description_to_infos( - description: dict, -) -> dict[str, dict[str, str]]: - """Convert a Hugging Face dataset description dictionary to a PLAID infos dictionary. - - Extracts the "legal" and "data_production" sections from the Hugging Face description - and returns them in a format compatible with PLAID dataset infos. - - Args: - description (dict): The Hugging Face dataset description dictionary. - - Returns: - dict[str, dict[str, str]]: Dictionary containing "legal" and "data_production" infos if present. - """ - infos = {} - if "legal" in description: - infos["legal"] = description["legal"] - if "data_production" in description: - infos["data_production"] = description["data_production"] - return infos - - -######################################################################################### -#################################### SAVED FUNCTIONS #################################### -######################################################################################### -################### kept temporarily in case of lost functionalities #################### - - -def infer_hf_features_from_value(value: Any) -> Union[Value, Sequence]: - """Infer Hugging Face dataset feature type from a given value. - - This function analyzes the input value and determines the appropriate Hugging Face - feature type representation. It handles None values, scalars, and arrays/lists - of various dimensions, mapping them to corresponding Hugging Face Value or Sequence types. - - Args: - value (Any): The value to infer the feature type from. Can be None, scalar, - list, tuple, or numpy array. - - Returns: - datasets.Feature: A Hugging Face feature type (Value or Sequence) that corresponds - to the input value's structure and data type. - - Raises: - TypeError: If the value type is not supported. - TypeError: If the array dimensionality exceeds 3D for arrays/lists. 
- - Note: - - For scalar values, maps numpy dtypes to appropriate Hugging Face Value types: - float types to "float32", int32 to "int32", int64 to "int64", others to "string" - - For arrays/lists, creates nested Sequence structures based on dimensionality: - 1D → Sequence(base_type), 2D → Sequence(Sequence(base_type)), - 3D → Sequence(Sequence(Sequence(base_type))) - - All float values are enforced to "float32" to limit data size - - All int64 values are preserved as "int64" to satisfy CGNS standards - """ - if value is None: - return Value("null") # pragma: no cover - - # Scalars - if np.isscalar(value): - dtype = np.array(value).dtype - if np.issubdtype( - dtype, np.floating - ): # enforcing float32 for all floats, to be updated in case we want to keep float64 - return Value("float32") - elif np.issubdtype(dtype, np.int32): - return Value("int32") - elif np.issubdtype( - dtype, np.int64 - ): # very important to satisfy the CGNS standard - return Value("int64") - elif np.issubdtype(dtype, np.dtype("|S1")) or np.issubdtype( - dtype, np.dtype(" 0 else None) - if arr.ndim == 1: - return Sequence(base_type) - elif arr.ndim == 2: - return Sequence(Sequence(base_type)) - elif arr.ndim == 3: - return Sequence(Sequence(Sequence(base_type))) - else: - raise TypeError(f"Unsupported ndim: {arr.ndim}") # pragma: no cover - raise TypeError(f"Unsupported type: {type(value)}") # pragma: no cover - - -def build_hf_sample(sample: Sample) -> tuple[dict[str, Any], list[str], dict[str, str]]: - """Flatten a PLAID Sample's CGNS trees into Hugging Face–compatible arrays and metadata. - - The function traverses every CGNS tree stored in sample.features.data (keyed by time), - produces a flattened mapping path -> primitive value for each time, and then builds - compact numpy arrays suitable for storage in a Hugging Face Dataset. Repeated value - blocks that are identical across times are deduplicated and referenced by start/end - indices; companion "_times" arrays describe, per time, the slice indices into - the concatenated arrays. - - Args: - sample (Sample): A PLAID Sample whose features contain one or more CGNS trees - (sample.features.data maps time -> CGNSTree). - - Returns: - tuple: - - hf_sample (dict[str, Any]): Mapping of flattened CGNS paths to either a - numpy array (concatenation of per-time blocks) or None. For each path - there is also an entry "_times" containing a flattened numpy array - of triplets [time, start, end] (end == -1 indicates the block extends to - the end of the array). - - all_paths (list[str]): Sorted list of all considered variable feature paths - (excluding Time-related nodes and CGNSLibraryVersion). - - sample_cgns_types (dict[str, str]): Mapping from path to CGNS node type - (metadata produced by flatten_cgns_tree). - - Note: - - Byte-array encoded strings (dtype ``"|S1"``) are handled by reassembling and - storing the string as a single-element numpy array; a sha256 hash is used - for deduplication. - - Deduplication reduces storage when identical blocks recur across times. - - Paths containing "/Time" or "CGNSLibraryVersion" are ignored for variable features. 
- """ - sample_flat_trees = {} - sample_cgns_types = {} - all_paths = set() - - # --- Flatten CGNS trees --- - for time, tree in sample.features.data.items(): - flat, cgns_types = flatten_cgns_tree(tree) - sample_flat_trees[time] = flat - - all_paths.update( - k for k in flat.keys() if "/Time" not in k and "CGNSLibraryVersion" not in k - ) - - sample_cgns_types.update(cgns_types) - - hf_sample = {} - - for path in all_paths: - hf_sample[path] = None - hf_sample[path + "_times"] = None - - known_values = {} - values_acc, times_acc = [], [] - current_length = 0 - - for time, flat in sample_flat_trees.items(): - if path not in flat: - continue # pragma: no cover - value = flat[path] - - # Handle byte-array encoded strings - if ( - isinstance(value, np.ndarray) - and value.dtype == np.dtype("|S1") - and value.ndim == 1 - ): - value_str = b"".join(value).decode("ascii") - value_np = np.array([value_str]) - key = hashlib.sha256(value_str.encode("ascii")).hexdigest() - size = 1 - elif value is not None: - value_np = value - key = hashlib.sha256(value.tobytes()).hexdigest() - size = ( - value.shape[-1] - if isinstance(value, np.ndarray) and value.ndim >= 1 - else 1 - ) - else: - continue - - # Deduplicate identical arrays - if key in known_values: - start, end = known_values[key] # pragma: no cover - else: - start, end = current_length, current_length + size - known_values[key] = (start, end) - values_acc.append(value_np) - current_length = end - - times_acc.append([time, start, end]) - - # Build arrays - if values_acc: - try: - hf_sample[path] = np.hstack(values_acc) - except Exception: # pragma: no cover - hf_sample[path] = np.concatenate([np.atleast_1d(x) for x in values_acc]) - - if len(known_values) == 1: - for t in times_acc: - t[-1] = -1 - hf_sample[path + "_times"] = np.array(times_acc).flatten() - else: - hf_sample[path] = None - hf_sample[path + "_times"] = None - - # Convert lists to numpy arrays - for k, v in hf_sample.items(): - if isinstance(v, list): - hf_sample[k] = np.array(v) # pragma: no cover - - return hf_sample, all_paths, sample_cgns_types - - -def _hash_value(value): - """Compute a hash for a value (np.ndarray or basic types).""" - if isinstance(value, np.ndarray): - return hashlib.md5(value.view(np.uint8)).hexdigest() - return hashlib.md5(str(value).encode("utf-8")).hexdigest() - - -def process_shard( - generator_fn: Callable[..., Any], - progress: Any, - n_proc: int, - shard_ids: Optional[list[IndexType]] = None, -) -> tuple[ - set[str], - dict[str, str], - dict[str, Union[Value, Sequence]], - dict[str, dict[str, Union[str, bool, int]]], - int, -]: - """Process a single shard of sample ids and collect per-shard metadata. - - This function drives a shard-level pass over samples produced by `generator_fn`. - For each sample it: - - flattens the sample into Hugging Face friendly arrays (build_hf_sample), - - collects observed flattened paths, - - aggregates CGNS type metadata, - - infers Hugging Face feature types for each path, - - detects per-path constants using a content hash, - - updates progress (either a multiprocessing.Queue or a tqdm progress bar). - - Args: - shard_ids (list[IndexType]): Sequence of sample ids (a single shard) to process. - generator_fn (Callable): Generator function accepting a list of shard id sequences - and yielding Sample objects for those ids. - progress (Any): Progress reporter; either a multiprocessing.Queue (for parallel - execution) or a tqdm progress bar object (for sequential execution). 
- n_proc (int): Number of worker processes used by the caller (used to decide - how to report progress). - - Returns: - tuple: - - split_all_paths (set[str]): Set of all flattened feature paths observed in the shard. - - shard_global_cgns_types (dict[str, str]): Mapping path -> CGNS node type observed in the shard. - - shard_global_feature_types (dict[str, Union[Value, Sequence]]): Inferred HF feature types per path. - - split_constant_leaves (dict[str, dict]): Per-path metadata for constant detection. Each entry - is a dict with keys "hash" (str), "constant" (bool) and "count" (int). - - n_samples_processed (int): Number of samples processed in this shard. - - Raises: - ValueError: If inconsistent feature types are detected for the same path within the shard. - """ - split_constant_leaves = {} - split_all_paths = set() - shard_global_cgns_types = {} - shard_global_feature_types = {} - - if shard_ids is not None: - generator = generator_fn([shard_ids]) - else: - generator = generator_fn() - - n_samples = 0 - for sample in generator: - hf_sample, all_paths, sample_cgns_types = build_hf_sample(sample) - - split_all_paths.update(hf_sample.keys()) - shard_global_cgns_types.update(sample_cgns_types) - - # Feature type inference - for path in all_paths: - value = hf_sample[path] - if value is None: - continue - inferred = infer_hf_features_from_value(value) - if path not in shard_global_feature_types: - shard_global_feature_types[path] = inferred - elif repr(shard_global_feature_types[path]) != repr(inferred): - raise ValueError( - f"Feature type mismatch for {path} in shard" - ) # pragma: no cover - - # Constant detection using **hash only** - for path, value in hf_sample.items(): - h = _hash_value(value) - if path not in split_constant_leaves: - split_constant_leaves[path] = {"hashes": {h}, "count": 1} - else: - entry = split_constant_leaves[path] - entry["hashes"].add(h) - entry["count"] += 1 - - # Progress - if n_proc > 1: - progress.put(1) # pragma: no cover - else: - progress.update(1) - - n_samples += 1 - - return ( - split_all_paths, - shard_global_cgns_types, - shard_global_feature_types, - split_constant_leaves, - n_samples, - ) - - -def _process_shard_debug( - generator_fn, progress_queue, n_proc, shard_ids -): # pragma: no cover - try: - return process_shard(generator_fn, progress_queue, n_proc, shard_ids) - except Exception as e: - print(f"Exception in worker for shards {shard_ids}: {e}", file=sys.stderr) - traceback.print_exc() - raise # re-raise to propagate to main process - - -def preprocess_splits( - generators: dict[str, Callable], - gen_kwargs: Optional[dict[str, dict[str, list[IndexType]]]] = None, - processes_number: int = 1, - verbose: bool = True, -) -> tuple[ - dict[str, set[str]], - dict[str, dict[str, Any]], - dict[str, set[str]], - dict[str, str], - dict[str, Union[Value, Sequence]], -]: - """Pre-process dataset splits: inspect samples to infer features, constants and CGNS metadata. - - This function iterates over the provided split generators (optionally in parallel), - flattens each PLAID sample into Hugging Face friendly arrays, detects constant - CGNS leaves (features identical across all samples in a split), infers global - Hugging Face feature types, and aggregates CGNS type metadata. - - The work is sharded per-split and each shard is processed by `process_shard`. - In parallel mode, progress is updated via a multiprocessing.Queue; otherwise a - tqdm progress bar is used. 
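A leaf is treated as constant only when a single content hash is observed and that hash was seen in every sample of the split. The following self-contained sketch illustrates this criterion on toy data; the `_hash` helper simply mirrors the `_hash_value` helper above, and the paths and values are made up:

```python
import hashlib
import numpy as np

def _hash(value) -> str:
    # Same idea as _hash_value: hash the raw bytes of arrays, str() otherwise.
    if isinstance(value, np.ndarray):
        return hashlib.md5(value.view(np.uint8)).hexdigest()
    return hashlib.md5(str(value).encode("utf-8")).hexdigest()

# Toy flattened samples: path -> value
samples = [
    {"Base/Zone/GridCoordinates": np.array([0.0, 0.5, 1.0]),
     "Base/Zone/Pressure": np.array([1.0, 2.0])},
    {"Base/Zone/GridCoordinates": np.array([0.0, 0.5, 1.0]),
     "Base/Zone/Pressure": np.array([3.0, 4.0])},
]

seen: dict[str, dict] = {}
for flat in samples:
    for path, value in flat.items():
        entry = seen.setdefault(path, {"hashes": set(), "count": 0})
        entry["hashes"].add(_hash(value))
        entry["count"] += 1

constant_paths = [
    p for p, e in seen.items()
    if len(e["hashes"]) == 1 and e["count"] == len(samples)
]
print(constant_paths)  # ['Base/Zone/GridCoordinates']
```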
- - Args: - generators (dict[str, Callable]): - Mapping from split name to a generator function. Each generator must - accept a single argument (a sequence of shard ids) and yield PLAID samples. - gen_kwargs (dict[str, dict[str, list[IndexType]]]): - Per-split kwargs used to drive generator invocation (e.g. {"train": {"shards_ids": [...]}}). - processes_number (int, optional): - Number of worker processes to use for shard-level parallelism. Defaults to 1. - verbose (bool, optional): - If True, displays progress bars. Defaults to True. - - Returns: - tuple: - - split_all_paths (dict[str, set[str]]): - For each split, the set of all observed flattened feature paths (including "_times" keys). - - split_flat_cst (dict[str, dict[str, Any]]): - For each split, a mapping of constant feature path -> value (constant parts of the tree). - - split_var_path (dict[str, set[str]]): - For each split, the set of variable feature paths (non-constant). - - global_cgns_types (dict[str, str]): - Aggregated mapping from flattened path -> CGNS node type. - - global_feature_types (dict[str, Union[Value, Sequence]]): - Aggregated inferred Hugging Face feature types for each variable path. - - Raises: - ValueError: If inconsistent feature types or CGNS types are detected across shards/splits. - """ - global_cgns_types = {} - global_feature_types = {} - split_flat_cst = {} - split_var_path = {} - split_all_paths = {} - - gen_kwargs_ = gen_kwargs or {split_name: {} for split_name in generators.keys()} - - for split_name, generator_fn in generators.items(): - shards_ids_list = gen_kwargs_[split_name].get("shards_ids", [None]) - n_proc = max(1, processes_number or len(shards_ids_list)) - - shards_data = [] - - if n_proc == 1: - with tqdm( - disable=not verbose, - desc=f"Pre-process split {split_name}", - ) as pbar: - for shard_ids in shards_ids_list: - shards_data.append( - process_shard(generator_fn, pbar, n_proc=1, shard_ids=shard_ids) - ) - - else: # pragma: no cover - # Parallel execution - manager = mp.Manager() - progress_queue = manager.Queue() - shards_data = [] - - try: - with mp.Pool(n_proc) as pool: - results = [ - pool.apply_async( - _process_shard_debug, - args=(generator_fn, progress_queue, n_proc, shard_ids), - ) - for shard_ids in shards_ids_list - ] - - total_samples = sum(len(shard) for shard in shards_ids_list) - completed = 0 - - with tqdm( - total=total_samples, - disable=not verbose, - desc=f"Pre-process split {split_name}", - ) as pbar: - while completed < total_samples: - try: - increment = progress_queue.get(timeout=0.5) - pbar.update(increment) - completed += increment - except Empty: - # Check for any crashed workers - for r in results: - if r.ready(): - try: - r.get( - timeout=0.1 - ) # will raise worker exception if any - except Exception as e: - raise RuntimeError(f"Worker crashed: {e}") - - # Collect all results - for r in results: - shards_data.append(r.get()) - - finally: - manager.shutdown() - - # Merge shard results - split_all_paths[split_name] = set() - split_constant_hashes = {} - n_samples_total = 0 - - for ( - all_paths, - shard_cgns, - shard_features, - shard_constants, - n_samples, - ) in shards_data: - split_all_paths[split_name].update(all_paths) - global_cgns_types.update(shard_cgns) - - for path, inferred in shard_features.items(): - if path not in global_feature_types: - global_feature_types[path] = inferred - elif repr(global_feature_types[path]) != repr(inferred): - raise ValueError( # pragma: no cover - f"Feature type mismatch for {path} in split {split_name}" - ) - - 
for path, entry in shard_constants.items(): - if path not in split_constant_hashes: - split_constant_hashes[path] = entry - else: - existing = split_constant_hashes[path] - existing["hashes"].update(entry["hashes"]) - existing["count"] += entry["count"] - - n_samples_total += n_samples - - # Determine truly constant paths (same hash across all samples) - constant_paths = [ - p - for p, entry in split_constant_hashes.items() - if len(entry["hashes"]) == 1 and entry["count"] == n_samples_total - ] - - # Retrieve **values** only for constant paths from first sample - if gen_kwargs: - first_sample = next(generator_fn([shards_ids_list[0]])) - else: - first_sample = next(generator_fn()) - hf_sample, _, _ = build_hf_sample(first_sample) - - split_flat_cst[split_name] = {p: hf_sample[p] for p in sorted(constant_paths)} - split_var_path[split_name] = { - p - for p in split_all_paths[split_name] - if p not in split_flat_cst[split_name] - } - - global_feature_types = { - p: global_feature_types[p] for p in sorted(global_feature_types) - } - - return ( - split_all_paths, - split_flat_cst, - split_var_path, - global_cgns_types, - global_feature_types, - ) - - -def _generator_prepare_for_huggingface( - generators: dict[str, Callable], - gen_kwargs: Optional[dict[str, dict[str, list[IndexType]]]] = None, - processes_number: int = 1, - verbose: bool = True, -): - ( - split_all_paths, - split_flat_cst, - split_var_path, - global_cgns_types, - global_feature_types, - ) = preprocess_splits(generators, gen_kwargs, processes_number, verbose) - - # --- build HF features --- - var_features = sorted(list(set().union(*split_var_path.values()))) - if len(var_features) == 0: # pragma: no cover - raise ValueError( - "no variable feature found, is your dataset variable through samples?" - ) - - for split_name in split_flat_cst.keys(): - for path in var_features: - if not path.endswith("_times") and path not in split_all_paths[split_name]: - split_flat_cst[split_name][path + "_times"] = None # pragma: no cover - if path in split_flat_cst[split_name]: - split_flat_cst[split_name].pop(path) # pragma: no cover - - cst_features = { - split_name: sorted(list(cst.keys())) - for split_name, cst in split_flat_cst.items() - } - first_split, first_value = next(iter(cst_features.items()), (None, None)) - for split, value in cst_features.items(): - assert value == first_value, ( - f"cst_features differ for split '{split}' (vs '{first_split}')" - ) - cst_features = first_value - - hf_features_map = {} - for k in var_features: - if k.endswith("_times"): - hf_features_map[k] = Sequence(Value("float64")) # pragma: no cover - else: - hf_features_map[k] = global_feature_types[k] - hf_features = Features(hf_features_map) - - var_features = [path for path in var_features if not path.endswith("_times")] - cst_features = [path for path in cst_features if not path.endswith("_times")] - - key_mappings = { - "variable_features": var_features, - "constant_features": cst_features, - "cgns_types": global_cgns_types, - } - - return split_flat_cst, key_mappings, hf_features - - -# # ------------------------------------------- -# # --------- Sequential version -# def _generator_prepare_for_huggingface( -# generators: dict[str, Callable], -# gen_kwargs: dict, -# processes_number: int = 1, -# verbose: bool = True, -# ) -> tuple[dict[str, dict[str, Any]], dict[str, Any], Features]: -# """Inspect PLAID dataset generators and infer Hugging Face feature schema. - -# Iterates over all samples in all provided split generators to: -# 1. 
Flatten each CGNS tree into a dictionary of paths → values. -# 2. Infer Hugging Face `Features` types for all variable leaves. -# 3. Detect constant leaves (values that never change across all samples). -# 4. Collect global CGNS type metadata. - -# Args: -# generators (dict[str, Callable]): -# Mapping from split names to callables returning sample generators. -# Each sample must have `sample.features.data[0.0]` compatible with `flatten_cgns_tree`. -# gen_kwargs (dict, optional, default=None): -# Optional mapping from split names to dictionaries of keyword arguments -# to be passed to each generator function, used for parallelization. -# processes_number (int, optional): Number of parallel processes to use. -# verbose (bool, optional): If True, displays progress bars while processing splits. - -# Returns: -# tuple: -# - flat_cst (dict[str, Any]): Mapping from feature path to constant values detected across all splits. -# - key_mappings (dict[str, Any]): Metadata dictionary with: -# - "variable_features" (list[str]): paths of non-constant features. -# - "constant_features" (list[str]): paths of constant features. -# - "cgns_types" (dict[str, Any]): CGNS type information for all paths. -# - hf_features (datasets.Features): Hugging Face feature specification for variable features. - -# Raises: -# ValueError: If inconsistent CGNS types or feature types are found for the same path. -# """ -# processes_number - -# def values_equal(v1, v2): -# if isinstance(v1, np.ndarray) and isinstance(v2, np.ndarray): -# return np.array_equal(v1, v2) -# return v1 == v2 - -# global_cgns_types = {} -# global_feature_types = {} - -# split_flat_cst = {} -# split_var_path = {} -# split_all_paths = {} - -# # ---- Single pass over all splits and samples ---- -# for split_name, generator in generators.items(): -# split_constant_leaves = {} - -# split_all_paths[split_name] = set() - -# n_samples = 0 -# for sample in tqdm( -# generator(**gen_kwargs[split_name]), -# disable=not verbose, -# desc=f"Pre-process split {split_name}", -# ): -# # --- Build Hugging Face–compatible sample --- -# hf_sample, all_paths, sample_cgns_types = build_hf_sample(sample) - -# split_all_paths[split_name].update(hf_sample.keys()) -# # split_all_paths[split_name].update(all_paths) -# global_cgns_types.update(sample_cgns_types) - -# # --- Infer global HF feature types --- -# for path in all_paths: -# value = hf_sample[path] -# if value is None: -# continue - -# # if isinstance(value, np.ndarray) and value.dtype.type is np.str_: -# # inferred = Value("string") -# # else: -# # inferred = infer_hf_features_from_value(value) - -# inferred = infer_hf_features_from_value(value) - -# if path not in global_feature_types: -# global_feature_types[path] = inferred -# elif repr(global_feature_types[path]) != repr(inferred): -# raise ValueError( # pragma: no cover -# f"Feature type mismatch for {path} in split {split_name}" -# ) - -# # --- Update per-split constant detection --- -# for path, value in hf_sample.items(): -# if path not in split_constant_leaves: -# split_constant_leaves[path] = { -# "value": value, -# "constant": True, -# "count": 1, -# } -# else: -# entry = split_constant_leaves[path] -# entry["count"] += 1 -# if entry["constant"] and not values_equal(entry["value"], value): -# entry["constant"] = False - -# n_samples += 1 - -# # --- Record per-split constants --- -# for p, e in split_constant_leaves.items(): -# if e["count"] < n_samples: -# split_constant_leaves[p]["constant"] = False - -# split_flat_cst[split_name] = dict( -# sorted( -# ( -# 
(p, e["value"]) -# for p, e in split_constant_leaves.items() -# if e["constant"] -# ), -# key=lambda x: x[0], -# ) -# ) - -# split_var_path[split_name] = { -# p -# for p in split_all_paths[split_name] -# if p not in split_flat_cst[split_name] -# } - -# global_feature_types = { -# p: global_feature_types[p] for p in sorted(global_feature_types) -# } -# var_features = sorted(list(set().union(*split_var_path.values()))) - -# if len(var_features) == 0: -# raise ValueError( # pragma: no cover -# "no variable feature found, is your dataset variable through samples?" -# ) - -# # --------------------------------------------------- -# # for test-like splits, some var_features are all None (e.g.: outputs): need to add '_times' counterparts to corresponding constant trees -# for split_name in split_flat_cst.keys(): -# for path in var_features: -# if not path.endswith("_times") and path not in split_all_paths[split_name]: -# split_flat_cst[split_name][path + "_times"] = None # pragma: no cover -# if ( -# path in split_flat_cst[split_name] -# ): # remove for flat_cst the path that will be forcely included in the arrow tables -# split_flat_cst[split_name].pop(path) # pragma: no cover - -# # ---- Constant features sanity check -# cst_features = { -# split_name: sorted(list(cst.keys())) -# for split_name, cst in split_flat_cst.items() -# } - -# first_split, first_value = next(iter(cst_features.items()), (None, None)) -# for split, value in cst_features.items(): -# assert value == first_value, ( -# f"cst_features differ for split '{split}' (vs '{first_split}'): something went wrong in _generator_prepare_for_huggingface." -# ) - -# cst_features = first_value - -# # ---- Build global HF Features (only variable) ---- -# hf_features_map = {} -# for k in var_features: -# if k.endswith("_times"): -# hf_features_map[k] = Sequence(Value("float64")) # pragma: no cover -# else: -# hf_features_map[k] = global_feature_types[k] - -# hf_features = Features(hf_features_map) - -# var_features = [path for path in var_features if not path.endswith("_times")] -# cst_features = [path for path in cst_features if not path.endswith("_times")] - -# key_mappings = { -# "variable_features": var_features, -# "constant_features": cst_features, -# "cgns_types": global_cgns_types, -# } - -# return split_flat_cst, key_mappings, hf_features - - -def plaid_dataset_to_huggingface_datasetdict( - dataset: Dataset, - main_splits: dict[str, IndexType], - processes_number: int = 1, - writer_batch_size: int = 1, - verbose: bool = False, -) -> tuple[datasets.DatasetDict, dict[str, Any], dict[str, Any]]: - """Convert a PLAID dataset into a Hugging Face `datasets.DatasetDict`. - - This is a thin wrapper that creates per-split generators from a PLAID dataset - and delegates the actual dataset construction to - `plaid_generator_to_huggingface_datasetdict`. - - Args: - dataset (plaid.Dataset): - The PLAID dataset to be converted. Must support indexing with - a list of IDs (from `main_splits`). - main_splits (dict[str, IndexType]): - Mapping from split names (e.g. "train", "test") to the subset of - sample indices belonging to that split. - processes_number (int, optional, default=1): - Number of parallel processes to use when writing the Hugging Face dataset. - writer_batch_size (int, optional, default=1): - Batch size used when writing samples to disk in Hugging Face format. - verbose (bool, optional, default=False): - If True, print progress and debug information. 
- - Returns: - datasets.DatasetDict: - A Hugging Face `DatasetDict` containing one dataset per split. - - Example: - >>> ds_dict = plaid_dataset_to_huggingface_datasetdict( - ... dataset=my_plaid_dataset, - ... main_splits={"train": [0, 1, 2], "test": [3]}, - ... processes_number=4, - ... writer_batch_size=3 - ... ) - >>> print(ds_dict) - DatasetDict({ - train: Dataset({ - features: ... - }), - test: Dataset({ - features: ... - }) - }) - """ - - def generator(dataset): - for sample in dataset: - yield sample - - generators = { - split_name: partial(generator, dataset[ids]) - for split_name, ids in main_splits.items() - } - - # gen_kwargs = { - # split_name: {"shards_ids": [ids]} for split_name, ids in main_splits.items() - # } - - return plaid_generator_to_huggingface_datasetdict( - generators, - processes_number=processes_number, - writer_batch_size=writer_batch_size, - verbose=verbose, - ) - - -def plaid_generator_to_huggingface_datasetdict( - generators: dict[str, Callable], - gen_kwargs: Optional[dict[str, dict[str, list[IndexType]]]] = None, - processes_number: int = 1, - writer_batch_size: int = 1, - verbose: bool = False, -) -> tuple[datasets.DatasetDict, dict[str, Any], dict[str, Any]]: - """Convert PLAID dataset generators into a Hugging Face `datasets.DatasetDict`. - - This function inspects samples produced by the given generators, flattens their - CGNS tree structure, infers Hugging Face feature types, and builds one - `datasets.Dataset` per split. Constant features (identical across all samples) - are separated out from variable features. - - Args: - generators (dict[str, Callable]): - Mapping from split names (e.g., "train", "test") to generator functions. - Each generator function must return an iterable of PLAID samples, where - each sample provides `sample.features.data[0.0]` for flattening. - processes_number (int, optional, default=1): - Number of processes used internally by Hugging Face when materializing - the dataset from the generators. - writer_batch_size (int, optional, default=1): - Batch size used when writing samples to disk in Hugging Face format. - gen_kwargs (dict, optional, default=None): - Optional mapping from split names to dictionaries of keyword arguments - to be passed to each generator function, used for parallelization. - verbose (bool, optional, default=False): - If True, displays progress bars and diagnostic messages. - - Returns: - tuple: - - **DatasetDict** (`datasets.DatasetDict`): - A Hugging Face dataset dictionary with one dataset per split. - - **flat_cst** (`dict[str, Any]`): - Dictionary of constant features detected across all splits. - - **key_mappings** (`dict[str, Any]`): - Metadata dictionary containing: - - `"variable_features"`: list of paths for non-constant features. - - `"constant_features"`: list of paths for constant features. - - `"cgns_types"`: inferred CGNS types for all features. - - Example: - >>> ds_dict, flat_cst, key_mappings = plaid_generator_to_huggingface_datasetdict( - ... {"train": lambda: iter(train_samples), - ... "test": lambda: iter(test_samples)}, - ... processes_number=4, - ... writer_batch_size=2, - ... verbose=True - ... ) - >>> print(ds_dict) - DatasetDict({ - train: Dataset({ - features: ... - }), - test: Dataset({ - features: ... - }) - }) - >>> print(flat_cst) - {'Zone1/GridCoordinates': array([0., 0.1, 0.2])} - >>> print(key_mappings["variable_features"][:3]) - ['Zone1/FlowSolution/VelocityX', 'Zone1/FlowSolution/VelocityY', ...] 
- """ - flat_cst, key_mappings, hf_features = _generator_prepare_for_huggingface( - generators, gen_kwargs, processes_number, verbose - ) - - all_features_keys = list(hf_features.keys()) - - def generator_fn(gen_func, all_features_keys, **kwargs): - for sample in gen_func(**kwargs): - hf_sample, _, _ = build_hf_sample(sample) - yield {path: hf_sample.get(path, None) for path in all_features_keys} - - _dict = {} - for split_name, gen_func in generators.items(): - gen = partial(generator_fn, all_features_keys=all_features_keys) - gen_kwargs_ = gen_kwargs or {split_name: {} for split_name in generators.keys()} - _dict[split_name] = datasets.Dataset.from_generator( - generator=gen, - gen_kwargs={"gen_func": gen_func, **gen_kwargs_[split_name]}, - features=hf_features, - num_proc=processes_number, - writer_batch_size=writer_batch_size, - split=datasets.splits.NamedSplit(split_name), - ) - - return datasets.DatasetDict(_dict), flat_cst, key_mappings - - -def _compute_num_shards(hf_dataset_dict: datasets.DatasetDict) -> dict[str, int]: - target_shard_size_mb = 500 - - num_shards = {} - for split_name, ds in hf_dataset_dict.items(): - n_samples = len(ds) - assert n_samples > 0, f"split {split_name} has no sample" - - dataset_size_bytes = ds.data.nbytes - target_shard_size_bytes = target_shard_size_mb * 1024 * 1024 - - n_shards = max( - 1, - (dataset_size_bytes + target_shard_size_bytes - 1) - // target_shard_size_bytes, - ) - num_shards[split_name] = min(n_samples, int(n_shards)) - return num_shards - - -def push_dataset_dict_to_hub( - repo_id: str, hf_dataset_dict: datasets.DatasetDict, **kwargs -) -> None: # pragma: no cover (not tested in unit tests) - """Push a Hugging Face `DatasetDict` to the Hugging Face Hub. - - This is a thin wrapper around `datasets.DatasetDict.push_to_hub`, allowing - you to upload a dataset dictionary (with one or more splits such as - `"train"`, `"validation"`, `"test"`) to the Hugging Face Hub. - - Note: - The function automatically handles sharding of the dataset by setting `num_shards` - for each split. For each split, the number of shards is set to the minimum between - the number of samples in that split and such that shards are targetted to approx. 500 MB. - This ensures efficient chunking while preventing excessive fragmentation. Empty splits - will raise an assertion error. - - Args: - repo_id (str): - The repository ID on the Hugging Face Hub - (e.g. `"username/dataset_name"`). - hf_dataset_dict (datasets.DatasetDict): - The Hugging Face dataset dictionary to push. - **kwargs: - Keyword arguments forwarded to - [`DatasetDict.push_to_hub`](https://huggingface.co/docs/datasets/main/en/package_reference/main_classes#datasets.DatasetDict.push_to_hub). - - Returns: - None - """ - num_shards = _compute_num_shards(hf_dataset_dict) - num_proc = kwargs.get("num_proc", None) - if num_proc is not None: # pragma: no cover - min_num_shards = min(num_shards.values()) - if min_num_shards < num_proc: - logger.warning( - f"num_proc chaged from {num_proc} to 1 to safely adapt for num_shards={num_shards}" - ) - num_proc = 1 - del kwargs["num_proc"] - - hf_dataset_dict.push_to_hub( - repo_id, num_shards=num_shards, num_proc=num_proc, **kwargs - ) - - -def push_infos_to_hub( - repo_id: str, infos: dict[str, dict[str, str]] -) -> None: # pragma: no cover (not tested in unit tests) - """Upload dataset infos to the Hugging Face Hub. - - Serializes the infos dictionary to YAML and uploads it to the specified repository as infos.yaml. 
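For reference, the ~500 MB sharding rule implemented by `_compute_num_shards` above amounts to a ceiling division capped by the sample count, so no shard is ever empty. A small worked sketch with illustrative sizes only:

```python
TARGET = 500 * 1024 * 1024  # 500 MB target shard size

def num_shards(split_size_bytes: int, n_samples: int) -> int:
    n = max(1, (split_size_bytes + TARGET - 1) // TARGET)  # ceil(size / target)
    return min(n_samples, int(n))

print(num_shards(2_300 * 1024 * 1024, 10))   # 2.3 GB, 10 samples  -> 5 shards
print(num_shards(80 * 1024 * 1024, 1_000))   # 80 MB, 1000 samples -> 1 shard
print(num_shards(10 * 1024**3, 4))           # 10 GB, 4 samples    -> 4 shards (capped)
```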
- - Args: - repo_id (str): The repository ID on the Hugging Face Hub. - infos (dict[str, dict[str, str]]): Dictionary containing dataset infos to upload. - - Raises: - ValueError: If the infos dictionary is empty. - """ - if len(infos) > 0: - api = HfApi() - yaml_str = yaml.dump(infos) - yaml_buffer = io.BytesIO(yaml_str.encode("utf-8")) - api.upload_file( - path_or_fileobj=yaml_buffer, - path_in_repo="infos.yaml", - repo_id=repo_id, - repo_type="dataset", - commit_message="Upload infos.yaml", - ) - else: - raise ValueError("'infos' must not be empty") - - -def push_problem_definition_to_hub( - repo_id: str, name: str, pb_def: ProblemDefinition -) -> None: # pragma: no cover (not tested in unit tests) - """Upload a ProblemDefinition and its split information to the Hugging Face Hub. - - Args: - repo_id (str): The repository ID on the Hugging Face Hub. - name (str): The name of the problem_definition to store in the repo. - pb_def (ProblemDefinition): The problem definition to upload. - """ - api = HfApi() - data = pb_def._generate_problem_infos_dict() - for k, v in list(data.items()): - if not v: - data.pop(k) - if data is not None: - yaml_str = yaml.dump(data) - yaml_buffer = io.BytesIO(yaml_str.encode("utf-8")) - - if not name.endswith(".yaml"): - name = f"{name}.yaml" - - api.upload_file( - path_or_fileobj=yaml_buffer, - path_in_repo=f"problem_definitions/{name}", - repo_id=repo_id, - repo_type="dataset", - commit_message=f"Upload problem_definitions/{name}", - ) - - -def push_tree_struct_to_hub( - repo_id: str, - flat_cst: dict[str, Any], - key_mappings: dict[str, Any], -) -> None: # pragma: no cover (not tested in unit tests) - """Upload a dataset's tree structure to a Hugging Face dataset repository. - - This function pushes two components of a dataset tree structure to the specified - Hugging Face Hub repository: - - 1. `flat_cst`: the constant parts of the dataset tree, serialized as a pickle file - (`tree_constant_part.pkl`). - 2. `key_mappings`: the dictionary of key mappings and metadata for the dataset tree, - serialized as a YAML file (`key_mappings.yaml`). - - Both files are uploaded using the Hugging Face `HfApi().upload_file` method. - - Args: - repo_id (str): The Hugging Face dataset repository ID where files will be uploaded. - flat_cst (dict[str, Any]): Dictionary containing constant values in the dataset tree. - key_mappings (dict[str, Any]): Dictionary containing key mappings and additional metadata. - - Returns: - None - - Note: - - Each upload includes a commit message indicating the filename. - - This function is not covered by unit tests (`pragma: no cover`). - """ - api = HfApi() - - # constant part of the tree - api.upload_file( - path_or_fileobj=io.BytesIO(pickle.dumps(flat_cst)), - path_in_repo="tree_constant_part.pkl", - repo_id=repo_id, - repo_type="dataset", - commit_message="Upload tree_constant_part.pkl", - ) - - # key mappings - yaml_str = yaml.dump(key_mappings, sort_keys=False) - yaml_buffer = io.BytesIO(yaml_str.encode("utf-8")) - - api.upload_file( - path_or_fileobj=yaml_buffer, - path_in_repo="key_mappings.yaml", - repo_id=repo_id, - repo_type="dataset", - commit_message="Upload key_mappings.yaml", - ) - - -def save_dataset_dict_to_disk( - path: Union[str, Path], hf_dataset_dict: datasets.DatasetDict, **kwargs -) -> None: - """Save a Hugging Face DatasetDict to disk. - - This function serializes the provided DatasetDict and writes it to the specified - directory, preserving its features, splits, and data for later loading. 
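Taken together, the `push_*` helpers above cover the whole publication flow for one repository. The sketch below is illustrative only: it assumes a pre-cleanup plaid environment where `plaid.bridges.huggingface_bridge` is importable, and `my_plaid_dataset`, `train_ids`, `test_ids`, `my_pb_def` and the repository id are placeholders:

```python
from plaid.bridges.huggingface_bridge import (
    plaid_dataset_to_huggingface_datasetdict,
    push_dataset_dict_to_hub,
    push_infos_to_hub,
    push_problem_definition_to_hub,
    push_tree_struct_to_hub,
)

repo_id = "username/my_dataset"  # placeholder repository id

# Convert, then push the arrow tables plus the side files they rely on.
hf_dict, flat_cst, key_mappings = plaid_dataset_to_huggingface_datasetdict(
    dataset=my_plaid_dataset,                        # an existing plaid Dataset (assumed)
    main_splits={"train": train_ids, "test": test_ids},
)
push_dataset_dict_to_hub(repo_id, hf_dict)
push_tree_struct_to_hub(repo_id, flat_cst, key_mappings)
push_infos_to_hub(repo_id, {"legal": {"license": "cc-by-4.0"}})  # example infos content
push_problem_definition_to_hub(repo_id, "my_task", my_pb_def)    # a ProblemDefinition (assumed)
```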
- - Args: - path (Union[str, Path]): Directory path where the DatasetDict will be saved. - hf_dataset_dict (datasets.DatasetDict): The Hugging Face DatasetDict to save. - **kwargs: - Keyword arguments forwarded to - [`DatasetDict.save_to_disk`](https://huggingface.co/docs/datasets/main/en/package_reference/main_classes#datasets.DatasetDict.save_to_disk). - - Returns: - None - """ - num_shards = _compute_num_shards(hf_dataset_dict) - num_proc = kwargs.get("num_proc", None) - if num_proc is not None: # pragma: no cover - min_num_shards = min(num_shards.values()) - if min_num_shards < num_proc: - logger.warning( - f"num_proc chaged from {num_proc} to 1 to safely adapt for num_shards={num_shards}" - ) - num_proc = 1 - del kwargs["num_proc"] - - hf_dataset_dict.save_to_disk( - str(path), num_shards=num_shards, num_proc=num_proc, **kwargs - ) - - -def save_infos_to_disk( - path: Union[str, Path], infos: dict[str, dict[str, str]] -) -> None: - """Save dataset infos as a YAML file to disk. - - Args: - path (Union[str, Path]): The directory path where the infos file will be saved. - infos (dict[str, dict[str, str]]): Dictionary containing dataset infos. - """ - infos_fname = Path(path) / "infos.yaml" - infos_fname.parent.mkdir(parents=True, exist_ok=True) - with open(infos_fname, "w") as file: - yaml.dump(infos, file, default_flow_style=False, sort_keys=False) - - -def save_problem_definition_to_disk( - path: Union[str, Path], name: Union[str, Path], pb_def: ProblemDefinition -) -> None: - """Save a ProblemDefinition and its split information to disk. - - Args: - path (Union[str, Path]): The root directory path for saving. - name (str): The name of the problem_definition to store in the disk directory. - pb_def (ProblemDefinition): The problem definition to save. - """ - pb_def.save_to_file(Path(path) / Path("problem_definitions") / Path(name)) - - -def save_tree_struct_to_disk( - path: Union[str, Path], - flat_cst: dict[str, Any], - key_mappings: dict[str, Any], -) -> None: - """Save the structure of a dataset tree to disk. - - This function writes the constant part of the tree and its key mappings to files - in the specified directory. The constant part is serialized as a pickle file, - while the key mappings are saved in YAML format. - - Args: - path (Union[str, Path]): Directory path where the tree structure files will be saved. - flat_cst (dict): Dictionary containing the constant part of the tree. - key_mappings (dict): Dictionary containing key mappings for the tree structure. - - Returns: - None - """ - Path(path).mkdir(parents=True, exist_ok=True) - - with open(Path(path) / "tree_constant_part.pkl", "wb") as f: - pickle.dump(flat_cst, f) - - with open(Path(path) / "key_mappings.yaml", "w", encoding="utf-8") as f: - yaml.dump(key_mappings, f, sort_keys=False) - - -def plaid_dataset_to_huggingface_binary( - dataset: Dataset, - ids: Optional[list[IndexType]] = None, - split_name: str = "all_samples", - processes_number: int = 1, -) -> datasets.Dataset: - """Use this function for converting a Hugging Face dataset from a plaid dataset. - - The dataset can then be saved to disk, or pushed to the Hugging Face hub. - - Args: - dataset (Dataset): the plaid dataset to be converted in Hugging Face format - ids (list, optional): The specific sample IDs to convert the dataset. Defaults to None. - split_name (str): The name of the split. Default: "all_samples". - processes_number (int): The number of processes used to generate the Hugging Face dataset. Default: 1. 
- - Returns: - datasets.Dataset: dataset in Hugging Face format - - Example: - .. code-block:: python - - dataset = plaid_dataset_to_huggingface_binary(dataset, problem_definition, split) - dataset.save_to_disk("path/to/dir) - dataset.push_to_hub("chanel/dataset") - """ - if ids is None: - ids = dataset.get_sample_ids() - - def generator(): - for sample in dataset[ids]: - yield { - "sample": pickle.dumps(sample.model_dump()), - } - - return plaid_generator_to_huggingface_binary( - generator=generator, - split_name=split_name, - processes_number=processes_number, - ) - - -def plaid_generator_to_huggingface_binary( - generator: Callable, - split_name: str = "all_samples", - processes_number: int = 1, -) -> datasets.Dataset: - """Use this function for creating a Hugging Face dataset from a sample generator function. - - This function can be used when the plaid dataset cannot be loaded in RAM all at once due to its size. - The generator enables loading samples one by one. - - Args: - generator (Callable): a function yielding a dict {"sample" : sample}, where sample is of type 'bytes' - split_name (str): The name of the split. Default: "all_samples". - processes_number (int): The number of processes used to generate the Hugging Face dataset. Default: 1. - - Returns: - datasets.Dataset: dataset in Hugging Face format - - Example: - .. code-block:: python - - dataset = plaid_generator_to_huggingface_binary(generator, infos, split) - """ - ds: datasets.Dataset = datasets.Dataset.from_generator( # pyright: ignore[reportAssignmentType] - generator=generator, - features=datasets.Features({"sample": datasets.Value("binary")}), - num_proc=processes_number, - writer_batch_size=1, - split=datasets.splits.NamedSplit(split_name), - ) - - return ds - - -def plaid_dataset_to_huggingface_datasetdict_binary( - dataset: Dataset, - main_splits: dict[str, IndexType], - processes_number: int = 1, -) -> datasets.DatasetDict: - """Use this function for converting a Hugging Face dataset dict from a plaid dataset. - - The dataset can then be saved to disk, or pushed to the Hugging Face hub. - - Args: - dataset (Dataset): the plaid dataset to be converted in Hugging Face format. - main_splits (list[str]): The name of the main splits: defining a partitioning of the sample ids. - processes_number (int): The number of processes used to generate the Hugging Face dataset. Default: 1. - - Returns: - datasets.Dataset: dataset in Hugging Face format - - Example: - .. code-block:: python - - dataset = plaid_dataset_to_huggingface_datasetdict_binary(dataset, problem_definition, split) - dataset.save_to_disk("path/to/dir) - dataset.push_to_hub("chanel/dataset") - """ - _dict = {} - for split_name, ids in main_splits.items(): - ds = plaid_dataset_to_huggingface_binary( - dataset=dataset, - ids=ids, - processes_number=processes_number, - ) - _dict[split_name] = ds - - return datasets.DatasetDict(_dict) - - -def plaid_generator_to_huggingface_datasetdict_binary( - generators: dict[str, Callable], - processes_number: int = 1, -) -> datasets.DatasetDict: - """Use this function for creating a Hugging Face dataset dict (containing multiple splits) from a sample generator function. - - This function can be used when the plaid dataset cannot be loaded in RAM all at once due to its size. - The generator enables loading samples one by one. - The dataset dict can then be saved to disk, or pushed to the Hugging Face hub. - - Note: - Only the first split will contain the decription. 
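As an illustration of the generator contract used throughout the binary path (each generator yields `{"sample": <bytes>}`, exactly as `plaid_dataset_to_huggingface_binary` does above), a minimal sketch assuming the pre-cleanup plaid API; `train_sample_paths` and `test_sample_paths` are hypothetical lists of on-disk sample directories:

```python
import pickle

from plaid import Sample  # pre-cleanup Sample API assumed
from plaid.bridges.huggingface_bridge import plaid_generator_to_huggingface_datasetdict_binary

def make_generator(paths):
    def generator():
        for path in paths:
            sample = Sample(path=path)  # load one sample at a time to keep memory low
            yield {"sample": pickle.dumps(sample.model_dump())}
    return generator

generators = {
    "train": make_generator(train_sample_paths),
    "test": make_generator(test_sample_paths),
}
hf_dict = plaid_generator_to_huggingface_datasetdict_binary(generators, processes_number=1)
```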
- - Args: - generators (dict[str, Callable]): a dict of functions yielding a dict {"sample" : sample}, where sample is of type 'bytes' - processes_number (int): The number of processes used to generate the Hugging Face dataset. Default: 1. - - Returns: - datasets.DatasetDict: dataset dict in Hugging Face format - - Example: - .. code-block:: python - - hf_dataset_dict = plaid_generator_to_huggingface_datasetdict(generator, infos, problem_definition, main_splits) - push_dataset_dict_to_hub("chanel/dataset", hf_dataset_dict) - hf_dataset_dict.save_to_disk("path/to/dir") - """ - _dict = {} - for split_name, generator in generators.items(): - ds = plaid_generator_to_huggingface_binary( - generator=generator, - processes_number=processes_number, - split_name=split_name, - ) - _dict[split_name] = ds - - return datasets.DatasetDict(_dict) - - -def update_dataset_card( - dataset_card: str, - infos: Optional[dict[str, dict[str, str]]] = None, - pretty_name: Optional[str] = None, - dataset_long_description: Optional[str] = None, - illustration_urls: Optional[list[str]] = None, - arxiv_paper_urls: Optional[list[str]] = None, -) -> str: - r"""Update a dataset card with PLAID-specific metadata and documentation. - - Args: - dataset_card (str): The original dataset card content to update. - infos (dict[str, dict[str, str]]): Dictionary containing dataset information - with "legal" and "data_production" sections. Defaults to None. - pretty_name (str, optional): A human-readable name for the dataset. Defaults to None. - dataset_long_description (str, optional): Detailed description of the dataset's content, - purpose, and characteristics. Defaults to None. - illustration_urls (list[str], optional): List of URLs to images illustrating the dataset. - Defaults to None. - arxiv_paper_urls (list[str], optional): List of URLs to related arXiv papers. - Defaults to None. - - Returns: - str: The updated dataset card content as a string. - """ - lines = dataset_card.splitlines() - lines = [s for s in lines if not s.startswith("license")] - - indices = [i for i, line in enumerate(lines) if line.strip() == "---"] - - assert len(indices) >= 2, ( - "Cannot find two instances of '---', you should try to update a correct dataset_card." - ) - lines = lines[: indices[1] + 1] - - count = 1 - lines.insert(count, f"license: {infos['legal']['license']}") - count += 1 - lines.insert(count, "task_categories:") - count += 1 - lines.insert(count, "- graph-ml") - count += 1 - if pretty_name: - lines.insert(count, f"pretty_name: {pretty_name}") - count += 1 - lines.insert(count, "tags:") - count += 1 - lines.insert(count, "- physics learning") - count += 1 - lines.insert(count, "- geometry learning") - count += 1 - - str__ = "\n".join(lines) + "\n" - - if illustration_urls: - str__ += "
\n" - for url in illustration_urls: - str__ += f"{url}\n" - str__ += "
\n\n" - - if infos: - str__ += ( - f"```yaml\n{yaml.dump(infos, sort_keys=False, allow_unicode=True)}\n```" - ) - - str__ += """ -Example of commands: -```python -from datasets import load_dataset -from plaid.bridges import huggingface_bridge - -repo_id = "chanel/dataset" -pb_def_name = "pb_def_name" #`pb_def_name` is to choose from the repo `problem_definitions` folder - -# Load the dataset -hf_datasetdict = load_dataset(repo_id) - -# Load addition required data -flat_cst, key_mappings = huggingface_bridge.load_tree_struct_from_hub(repo_id) -pb_def = huggingface_bridge.load_problem_definition_from_hub(repo_id, pb_def_name) - -# Efficient reconstruction of plaid samples -for split_name, hf_dataset in hf_datasetdict.items(): - for i in range(len(hf_dataset)): - sample = huggingface_bridge.to_plaid_sample( - hf_dataset, - i, - flat_cst[split_name], - key_mappings["cgns_types"], - ) - -# Extract input and output features from samples: -for t in sample.get_all_mesh_times(): - for path in pb_def.get_in_features_identifiers(): - sample.get_feature_by_path(path=path, time=t) - for path in pb_def.get_out_features_identifiers(): - sample.get_feature_by_path(path=path, time=t) -``` -""" - str__ += "This dataset was generated in [PLAID](https://plaid-lib.readthedocs.io/), we refer to this documentation for additional details on how to extract data from `sample` objects.\n" - - if dataset_long_description: - str__ += f""" -### Dataset Description -{dataset_long_description} -""" - - if arxiv_paper_urls: - str__ += """ -### Dataset Sources - -- **Papers:** -""" - for url in arxiv_paper_urls: - str__ += f" - [arxiv]({url})\n" - - return str__ diff --git a/src/plaid/bridges/huggingface_helpers.py b/src/plaid/bridges/huggingface_helpers.py deleted file mode 100644 index cabcd728..00000000 --- a/src/plaid/bridges/huggingface_helpers.py +++ /dev/null @@ -1,47 +0,0 @@ -"""Huggingface private helpers.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - -from pathlib import Path -from typing import Union - -import datasets -from datasets import load_from_disk - -from plaid import Sample -from plaid.bridges.huggingface_bridge import binary_to_plaid_sample - - -class _HFToPlaidSampleConverter: - """Class to convert a Hugging Face dataset sample to a plaid :class:`Sample `.""" - - def __init__(self, hf_ds: Union[datasets.Dataset, datasets.DatasetDict]): - self.hf_ds = hf_ds - - def __call__(self, sample_id: int) -> Sample: # pragma: no cover - return binary_to_plaid_sample(self.hf_ds[sample_id]) - - -class _HFShardToPlaidSampleConverter: - """Class to convert a huggingface dataset sample shard to a plaid :class:`Sample `.""" - - def __init__(self, shard_path: Path): - """Initialization. - - Args: - shard_path (Path): path of the shard. 
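For context, a minimal usage sketch of the shard converter above, assuming the pre-cleanup `plaid.bridges.huggingface_helpers` module and a placeholder shard directory previously written with `save_to_disk`:

```python
from pathlib import Path

from plaid.bridges.huggingface_helpers import _HFShardToPlaidSampleConverter

convert = _HFShardToPlaidSampleConverter(Path("path/to/shard"))  # placeholder path
first_sample = convert(0)  # plaid Sample rebuilt from the pickled "sample" column
```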
- """ - self.hf_ds: Union[datasets.Dataset, datasets.DatasetDict] = load_from_disk( - shard_path.as_posix() - ) - - def __call__( - self, sample_id: int - ) -> Sample: # pragma: no cover (not reported with multiprocessing) - """Convert a sample shard from the huggingface dataset to a plaid :class:`Sample `.""" - return binary_to_plaid_sample(self.hf_ds[sample_id]) diff --git a/src/plaid/constants.py b/src/plaid/constants.py index ded3e3bb..8b8753a3 100644 --- a/src/plaid/constants.py +++ b/src/plaid/constants.py @@ -1,9 +1,3 @@ -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# """This module defines common constants used throughout the PLAID library. It includes: @@ -17,16 +11,22 @@ These constants help standardize metadata, task types, and mesh element references across the PLAID codebase. """ -AUTHORIZED_TASKS = ["regression", "classification"] +from typing import Literal, get_args -AUTHORIZED_SCORE_FUNCTIONS = ["RRMSE"] +AUTHORIZED_TASKS_T = Literal["regression", "classification"] +AUTHORIZED_TASKS = get_args(AUTHORIZED_TASKS_T) -AUTHORIZED_FEATURE_TYPES = ["scalar", "field", "nodes"] + +AUTHORIZED_SCORE_FUNCTIONS_T = Literal["RRMSE"] +AUTHORIZED_SCORE_FUNCTIONS = get_args(AUTHORIZED_SCORE_FUNCTIONS_T) + +AUTHORIZED_FEATURE_TYPES_T = Literal["scalar", "field", "nodes"] +AUTHORIZED_FEATURE_TYPES = get_args(AUTHORIZED_FEATURE_TYPES_T) AUTHORIZED_FEATURE_INFOS = { "scalar": ["name"], - "field": ["name", "location", "zone_name", "base_name", "time"], - "nodes": ["zone_name", "base_name", "time"], + "field": ["name", "location", "zone", "base", "time"], + "nodes": ["zone", "base", "time"], } # Information keys for dataset metadata diff --git a/src/plaid/containers/__init__.py b/src/plaid/containers/__init__.py index ad726aa9..c6135f45 100644 --- a/src/plaid/containers/__init__.py +++ b/src/plaid/containers/__init__.py @@ -1,18 +1,8 @@ """Package for PLAID containers such as `Dataset` and `Sample`.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - from .dataset import Dataset -from .feature_identifier import FeatureIdentifier from .sample import Sample __all__ = [ "Dataset", - "FeatureIdentifier", "Sample", ] diff --git a/src/plaid/containers/dataset.py b/src/plaid/containers/dataset.py index d4833b2e..ed7b1eb9 100644 --- a/src/plaid/containers/dataset.py +++ b/src/plaid/containers/dataset.py @@ -1,940 +1,111 @@ -"""Implementation of the `Dataset` container.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. 
-# -# - -# %% Imports -import copy -import sys - -if sys.version_info >= (3, 11): - from typing import Self -else: # pragma: no cover - from typing import TypeVar - - Self = TypeVar("Self") +"""Dataset container for PLAID.""" import logging -import os -import shutil -import subprocess -from multiprocessing import Pool from pathlib import Path -from typing import Iterator, Literal, Optional, Union +from typing import Any, Literal, Optional, Union import numpy as np -import yaml from packaging.version import Version -from tqdm import tqdm - -import plaid -from plaid.constants import AUTHORIZED_INFO_KEYS, CGNS_FIELD_LOCATIONS -from plaid.containers.feature_identifier import FeatureIdentifier -from plaid.containers.sample import Sample -from plaid.containers.utils import check_features_size_homogeneity -from plaid.types import Array, Feature -from plaid.utils.base import DeprecatedError, ShapeError, generate_random_ASCII -from plaid.utils.deprecation import deprecated, deprecated_argument +from pydantic import ( + BaseModel, + ConfigDict, + Field, + PrivateAttr, +) + +from ..problem_definition import ProblemDefinition +from ..storage.registry import BackendModule, get_backend +from ..types.common import NDArrayInt +from ..utils.info import normalize_infos +from ..version import __version__ +from .sample import Sample logger = logging.getLogger(__name__) -# %% Functions - - -def _process_sample(path: Union[str, Path]) -> tuple: # pragma: no cover - """Load Sample from path. - - Args: - path (Union[str, Path]): The path to the Sample. - - Returns: - tuple: The loaded Sample and its ID. - """ - path = Path(path) - id = int(path.stem.split("_")[-1]) - return id, Sample(path=path) - - -# %% Classes - - -class Dataset(object): - """A set of samples, and optionnaly some other informations about the Dataset.""" - - @deprecated_argument("directory_path", "path", version="0.1.8", removal="0.2.0") - def __init__( - self, - path: Optional[Union[str, Path]] = None, - verbose: bool = False, - processes_number: int = 0, - samples: Optional[list[Sample]] = None, - sample_ids: Optional[list[int]] = None, - ) -> None: - """Initialize a :class:`Dataset `. - - If `path` is not specified it initializes an empty :class:`Dataset ` that should be fed with :class:`Samples `. - - Use :meth:`add_sample ` or :meth:`add_samples ` to feed the :class:`Dataset` - - Args: - path (Union[str, Path], optional): The path from which to load PLAID dataset files. - verbose (bool, optional): Explicitly displays the operations performed. Defaults to False. - processes_number (int, optional): Number of processes used to load files (-1 to use all available ressources, 0 to disable multiprocessing). Defaults to 0. - samples (list[Sample], optional): A list of :class:`Samples ` to initialize the :class:`Dataset `. Defaults to None. - sample_ids (list[int], optional): An optional list of IDs for the new samples. If not provided, the IDs will be automatically generated based on the current number of samples in the dataset. - - Example: - .. code-block:: python - - from plaid import Dataset - from plaid import Sample - - # 1. Create empty instance of Dataset - dataset = Dataset() - print(dataset) - >>> Dataset(0 samples, 0 scalars, 0 fields) - print(len(dataset)) - >>> 0 - - # 2. 
Load dataset and create Dataset instance - dataset = Dataset("path_to_plaid_dataset") # .plaid or directory - print(dataset) - >>> Dataset(3 samples, 2 scalars, 5 fields) - print(len(dataset)) - >>> 3 - for sample in dataset: - print(sample) - >>> Sample(1 scalar, 1 timestamp, 2 fields) - Sample(1 scalar, 0 timestamps, 0 fields) - Sample(2 scalars, 1 timestamp, 2 fields) - - # 3. Create Dataset instance from a list of Samples - dataset = Dataset(samples=[sample1, sample2, sample3]) - print(dataset) - >>> Dataset(3 samples, 0 scalars, 2 fields) - - # 4. Create Dataset instance from a list of Samples with specific ids - dataset = Dataset(samples=[sample1, sample2, sample3], sample_ids=[3, 5, 7]) - print(dataset) - >>> Dataset(3 samples, 0 scalars, 2 fields) - - - Caution: - It is assumed that you provided a compatible PLAID dataset. - """ - self._samples: dict[int, Sample] = {} # sample_id -> sample - # info_name -> description - self._infos: dict[str, dict[str, Union[str, Version]]] = { - "plaid": {"version": Version(plaid.__version__)} - } - - if samples is not None and (path is not None): - raise ValueError("'samples' and 'path' are mutually exclusive") - - if path is not None: - path = Path(path) - - if path.suffix == ".plaid": - self.load(path, verbose=verbose, processes_number=processes_number) - else: - self._load_from_dir_( - path, verbose=verbose, processes_number=processes_number - ) - elif samples is not None: - if sample_ids is None: - self.add_samples(samples) - else: - self.add_samples(samples, sample_ids) - - def copy(self) -> Self: - """Create a deep copy of the dataset. - - Returns: - A new `Dataset` instance with all internal data (samples, infos) - deeply copied to ensure full isolation from the original. - - Note: - This operation may be memory-intensive for large datasets. - """ - return copy.deepcopy(self) - - # -------------------------------------------------------------------------# - def get_samples( - self, ids: Optional[list[int]] = None, as_list: bool = False - ) -> Union[list[Sample], dict[int, Sample]]: - """Return dictionnary of samples with ids corresponding to :code:`ids` if specified, else all samples. - - Args: - ids (list[int], optional): If None, take all samples. Defaults to None. - as_list (bool, optional): If False, return a dict ``id -> sample``, else return a list on ``Sample`` in the same order as ``ids``. Defaults to False. - - Returns: - dict[int,Sample]: Samples with corresponding ids. - """ - if ids is None: - ids = sorted(list(self._samples.keys())) - if as_list: - return [self._samples[id] for id in ids] - else: - return {id: self._samples[id] for id in ids} - - def add_sample(self, sample: Sample, id: Optional[int] = None) -> int: - """Add a new :class:`Sample ` to the :class:`Dataset .`. - - Args: - sample (Sample): The sample to add. - id (int, optional): An optional ID for the new sample. If not provided, the ID will be automatically generated based on the current number of samples in the dataset. - - Raises: - TypeError: If ``sample`` is not a :class:`Sample `. - - Returns: - int: Id of the new added :class:`Sample `. - - Example: - .. 
code-block:: python - - from plaid import Dataset - dataset = Dataset() - dataset.add_sample(sample) - print(dataset) - >>> Dataset(3 samples, 0 scalars, 2 fields) - """ - if not (isinstance(sample, Sample)): - raise TypeError(f"sample should be of type Sample but {type(sample)=}") - - if id is None: - id = len(self) - self.set_sample(id=id, sample=sample) - return id - - def del_sample(self, sample_id: int) -> None: - """Delete a :class:`Sample ` from the :class:`Dataset ` and reorganize the remaining sample IDs to eliminate gaps. - - Args: - sample_id (int): The ID of the sample to delete. - - Raises: - ValueError: If the provided sample ID is not present in the dataset. - - Returns: - list[int]: The new list of sample ids. - - Example: - .. code-block:: python - - from plaid import Dataset - dataset = Dataset() - dataset.add_samples(samples) - print(dataset) - >>> Dataset(1 samples, y scalars, x fields) - dataset.del_sample(0) - print(dataset) - >>> Dataset(0 samples, 0 scalars, 0 fields) - """ - if sample_id < 0 or sample_id >= len(self._samples): - raise ValueError( - f"Invalid ID {sample_id}, it must be within [0, len(dataset)]" - ) - - if sample_id == len(self) - 1: - return self._samples.pop(sample_id) - - deleted_sample = self._samples[sample_id] - keys_to_move = np.arange(sample_id + 1, len(self._samples)) - - # Move each key one position back - for key in keys_to_move: - self._samples[key - 1] = self._samples.pop(key) - - return deleted_sample - - def add_samples( - self, samples: list[Sample], ids: Optional[list[int]] = None - ) -> list[int]: - """Add new :class:`Samples ` to the :class:`Dataset `. - - Args: - samples (list[Sample]): The list of samples to add. - ids (list[int], optional): An optional list of IDs for the new samples. If not provided, the IDs will be automatically generated based on the current number of samples in the dataset. - - Raises: - TypeError: If ``samples`` is not a list or if one of the ``samples`` is not a :class:`Sample `. - ValueError: If samples list is empty. - ValueError: If the length of ids list (if provided) is not equal to the length of samples list. - ValueError: If provided ids are not unique. - - Returns: - list[int]: Ids of added :class:`Samples `. - - Example: - .. code-block:: python - - from plaid import Dataset - dataset = Dataset() - dataset.add_samples(samples) - print(len(samples)) - >>> n - print(dataset) - >>> Dataset(n samples, 0 scalars, x fields) - """ - if not (isinstance(samples, list)): - raise TypeError(f"samples should be of type list but {type(samples)=}") - if samples == []: - raise ValueError("The list of samples to add is empty") - - for i_sample, sample in enumerate(samples): - if not (isinstance(sample, Sample)): - raise TypeError( - f"element {i_sample} of samples should be of type Sample but {type(sample)=}" - ) - - if ids is None: - ids = list(range(len(self), len(self) + len(samples))) - else: - if len(samples) != len(ids): - raise ValueError( - "The length of the list of samples to add and the list of IDs are different" - ) - if len(set(ids)) != len(ids): - raise ValueError("IDS must be unique") - - self._samples.update(dict(zip(ids, samples))) - return ids - - def del_samples(self, sample_ids: list[int]) -> None: - """Delete :class:`Sample ` from the :class:`Dataset ` and reorganize the remaining sample IDs to eliminate gaps. - - Args: - sample_ids (list[int]): The list of IDs of samples to delete. - - Raises: - TypeError: If ``sample_ids`` is not a list. - ValueError: If sample_ids list is empty. 
- ValueError: If any of the sample_ids does not exist in the dataset. - ValueError: If the provided IDs are not unique. - - Returns: - list[int]: The new list of sample ids. - - Example: - .. code-block:: python - - from plaid import Dataset - dataset = Dataset() - # Assume samples are already added to the dataset - print(dataset) - >>> Dataset(6 samples, y scalars, x fields) - dataset.del_samples([1, 3, 5]) - print(dataset) - >>> Dataset(3 samples, y scalars, x fields) - """ - if not isinstance(sample_ids, list): - raise TypeError( - f"sample_ids should be of type list but {type(sample_ids)=}" - ) - - if sample_ids == []: - raise ValueError("The list of sample IDs to delete is empty") - - for id in sample_ids: - if id < 0 or id >= len(self._samples): - raise ValueError( - f"Invalid ID {id}, it must be within [0, len(dataset)]" - ) - - if len(set(sample_ids)) != len(sample_ids): - raise ValueError("Sample with IDs must be unique") - - # Delete samples - deleted_samples = [] - for id in sample_ids: - deleted_samples.append(self._samples[id]) - del self._samples[id] - - # Reorganize remaining sample IDs to eliminate gaps - # from the min index of sample_ids to delete - del_idx_min = min(sample_ids) - remaining_ids = list(self._samples.keys()) - for new_id, old_id in enumerate(remaining_ids[del_idx_min:], start=del_idx_min): - if new_id != old_id: - self._samples[new_id] = self._samples.pop(old_id) - - return deleted_samples - - # -------------------------------------------------------------------------# - def get_sample_ids(self) -> list[int]: - """Return list of sample ids. - - Returns: - list[int]: List of sample ids. - """ - return list(self._samples.keys()) +class Dataset(BaseModel): + """A lazy-loading dataset that reads samples from disk on demand.""" - # -------------------------------------------------------------------------# - def get_scalar_names(self, ids: Optional[list[int]] = None) -> list[str]: - """Return union of scalars names in all samples with id in ids. - - Args: - ids (list[int], optional): Select scalars depending on sample id. If None, take all samples. Defaults to None. - - Returns: - list[str]: List of all scalars names - """ - if ids is not None and len(set(ids)) != len(ids): - logger.warning("Provided ids are not unique") - - scalars_names = [] - for sample in self.get_samples(ids, as_list=True): - s_names = sample.get_scalar_names() - for s_name in s_names: - if s_name not in scalars_names: - scalars_names.append(s_name) - scalars_names.sort() - return scalars_names - - # -------------------------------------------------------------------------# - def get_field_names( - self, - ids: Optional[list[int]] = None, - location: Optional[str] = None, - zone_name: Optional[str] = None, - base_name: Optional[str] = None, - time: Optional[float] = None, - ) -> list[str]: - """Return union of fields names in all samples with id in ids. - - Args: - ids (list[int], optional): Select fields depending on sample id. If None, take all samples. Defaults to None. - location (str, optional): If provided, only field names from this location will be included. Defaults to None. - zone_name (str, optional): If provided, only field names from this zone will be included. Defaults to None. - base_name (str, optional): If provided, only field names containing this base name will be included. Defaults to None. - time (float, optional): If provided, only field names from this time will be included. Defaults to None. - - Returns: - list[str]: List of all fields names. 
- """ - if ids is not None and len(set(ids)) != len(ids): # pragma: no cover - logger.warning("Provided ids are not unique") - - fields_names = [] - for sample in self.get_samples(ids, as_list=True): - times = [time] if time else sample.features.get_all_time_values() - for time in times: - base_names = ( - [base_name] - if base_name - else sample.features.get_base_names(time=time) - ) - for base_name in base_names: - zone_names = ( - [zone_name] - if zone_name - else sample.features.get_zone_names( - time=time, base_name=base_name - ) - ) - for zone_name in zone_names: - locations = [location] if location else CGNS_FIELD_LOCATIONS - for location in locations: - f_names = sample.get_field_names( - zone_name=zone_name, - base_name=base_name, - location=location, - time=time, - ) - for f_name in f_names: - if f_name not in fields_names: - fields_names.append(f_name) - fields_names.sort() - return fields_names - - # -------------------------------------------------------------------------# - - def get_all_features_identifiers( - self, ids: Optional[list[int]] = None - ) -> list[FeatureIdentifier]: - """Get all features identifiers from the dataset. - - Args: - ids (list[int], optional): Sample id from which returning feature identifiers. If None, take all samples. Defaults to None. - - Returns: - list[FeatureIdentifier]: A list of dictionaries containing the identifiers of all features in the dataset. - """ - if ids is not None and len(set(ids)) != len(ids): - logger.warning("Provided ids are not unique") - - all_features_identifiers = [] - for sample in self.get_samples(ids, as_list=True): - features_identifiers = sample.get_all_features_identifiers() - for feat_id in features_identifiers: - if feat_id not in all_features_identifiers: - all_features_identifiers.append(feat_id) - all_features_identifiers - return all_features_identifiers - - def get_all_features_identifiers_by_type( - self, - feature_type: Literal["scalar", "nodes", "field"], - ids: list[int] = None, - ) -> list[FeatureIdentifier]: - """Get all features identifiers from the dataset. - - Args: - feature_type (str): Type of features to return - ids (list[int], optional): Sample id from which returning feature identifiers. If None, take all samples. Defaults to None. - - Returns: - list[FeatureIdentifier]: A list of dictionaries containing the identifiers of all features of a given type in the dataset. - """ - all_features_identifiers = self.get_all_features_identifiers(ids) - return [ - feat_id - for feat_id in all_features_identifiers - if feat_id["type"] == feature_type - ] - - # -------------------------------------------------------------------------# - - def add_tabular_scalars( - self, tabular: np.ndarray, names: Optional[list[str]] = None - ) -> None: - """Add tabular scalar data to the summary. - - Args: - tabular (np.ndarray): A 2D NumPy array containing tabular scalar data. - names (list[str], optional): A list of column names for the tabular data. Defaults to None. - - Raises: - ShapeError: Raised if the input tabular array does not have the correct shape (2D). - ShapeError: Raised if the number of columns in the tabular data does not match the number of names provided. 
- - Note: - If no names are provided, it will automatically create names based on the pattern 'X{number}' - """ - nb_samples = len(tabular) - - if tabular.ndim != 2: - raise ShapeError(f"{tabular.ndim=}!=2, should be == 2") - if names is None: - names = [f"X{i}" for i in range(tabular.shape[1])] - if tabular.shape[1] != len(names): - raise ShapeError( - f"tabular should have as many columns as there are names, but {tabular.shape[1]=} and {len(names)=}" - ) - - # ---# For efficiency, first add values to storage - name_to_ids = {} - for col, name in zip(tabular.T, names): - name_to_ids[name] = col - - # ---# Then add data in sample - for i_samp in range(nb_samples): - sample = Sample() - for name in names: - sample.add_scalar(name, name_to_ids[name][i_samp]) - self.add_sample(sample) - - def get_scalars_to_tabular( - self, - scalar_names: Optional[list[str]] = None, - sample_ids: Optional[list[int]] = None, - as_nparray=False, - ) -> Union[dict[str, np.ndarray], np.ndarray]: - """Return a dict containing scalar values as tabulars/arrays. - - Args: - scalar_names (str, optional): Scalars to work on. If None, all scalars will be returned. Defaults to None. - sample_ids (list[int], optional): Filter by sample id. If None, take all samples. Defaults to None. - as_nparray (bool, optional): If True, return the data as a single numpy ndarray. If False, return a dictionary mapping scalar names to their respective tabular values. Defaults to False. - - Returns: - np.ndarray: if as_nparray is True. - dict[str,np.ndarray]: if as_nparray is False, scalar name -> tabular values. - """ - if scalar_names is None: - scalar_names = self.get_scalar_names(sample_ids) - elif len(set(scalar_names)) != len(scalar_names): - logger.warning( - "Provided scalar names are not unique, this will lead to duplicate columns in output array" - ) - - if sample_ids is None: - sample_ids = self.get_sample_ids() - elif len(set(sample_ids)) != len(sample_ids): - logger.warning( - "Provided sample ids are not unique, this will lead to duplicate rows in output array" - ) - nb_samples = len(sample_ids) - - named_tabular = {} - for s_name in scalar_names: - tmp = self[sample_ids[0]].get_scalar(s_name) - res = np.empty(nb_samples) - if isinstance(tmp, np.ndarray) and tmp.size > 1: - assert tmp.ndim < 3 - res = np.empty((nb_samples, tmp.size)) - res.fill(None) - for i_, id in enumerate(sample_ids): - val = self[id].get_scalar(s_name) - if val is not None: - res[i_] = val.reshape((-1,)) if isinstance(val, np.ndarray) else val - named_tabular[s_name] = res - - if as_nparray: - named_tabular = np.concatenate( - [v.reshape((nb_samples, -1)) for v in named_tabular.values()], axis=1 - ) - return named_tabular - - # -------------------------------------------------------------------------# - def get_feature_from_string_identifier( - self, feature_string_identifier: str - ) -> dict[int, Feature]: - """Get a list of features from the dataset based on the provided feature string identifier. - - Args: - feature_string_identifier (str): A string identifier for the feature. - - Returns: - dict[int, Feature]: A list of features matching the provided string identifier. - """ - return { - id: self[id].get_feature_from_string_identifier(feature_string_identifier) - for id in self.get_sample_ids() - } - - def get_feature_from_identifier( - self, feature_identifier: FeatureIdentifier - ) -> dict[int, Feature]: - """Get a list of features from the dataset based on the provided feature identifier. 
- - Args: - feature_identifier (FeatureIdentifier): A dictionary containing the feature identifier. - - Returns: - dict[int, Feature]: A list of features matching the provided identifier. - """ - return { - id: self[id].get_feature_from_identifier(feature_identifier) - for id in self.get_sample_ids() - } - - def get_features_from_identifiers( - self, feature_identifiers: list[FeatureIdentifier] - ) -> dict[int, list[Feature]]: - """Get a list of features from the dataset based on the provided feature identifiers. - - Args: - feature_identifiers (FeatureIdentifier): A dictionary containing the feature identifier. - - Returns: - dict[int, list[Feature]]: A list of features matching the provided identifier. - """ - return { - id: self[id].get_features_from_identifiers(feature_identifiers) - for id in self.get_sample_ids() - } - - def update_features_from_identifier( - self, - feature_identifiers: Union[FeatureIdentifier, list[FeatureIdentifier]], - features: dict[int, Union[Feature, list[Feature]]], - in_place: bool = False, - ) -> Self: - """Update one or several features of the dataset by their identifier(s). - - This method applies updates to scalars, fields, or nodes - using feature identifiers, and corresponding feature data. When `in_place=False`, a deep copy of the dataset is created - before applying updates, ensuring full isolation from the original. - - Args: - feature_identifiers (dict or list of dict): one or more feature identifiers. - features (dict): dict with sample index as keys and one or more features as values. - in_place (bool, optional): If True, modifies the current dataset in place. - If False, returns a deep copy with updated features. - - Returns: - Self: The updated dataset (either the current instance or a new copy). - - Raises: - AssertionError: If types are inconsistent or identifiers contain unexpected keys. - """ - assert set(features.keys()) == set(self.get_sample_ids()), ( - "Must provide the same sample indices in features as in the dataset" - ) - - dataset = self if in_place else self.copy() - - for id in self.get_sample_ids(): - dataset[id].update_features_from_identifier( - feature_identifiers, features[id], in_place=True - ) - return dataset - - def extract_dataset_from_identifier( - self, - feature_identifiers: Union[FeatureIdentifier, list[FeatureIdentifier]], - ) -> Self: - """Extract features of the dataset by their identifier(s) and return a new dataset containing these features. - - This method applies updates to scalars, fields, or nodes - using feature identifiers - - Args: - feature_identifiers (dict or list of dict): One or more feature identifiers. - - Returns: - Self: New dataset containing the provided feature identifiers - - Raises: - AssertionError: If types are inconsistent or identifiers contain unexpected keys. 
- """ - dataset = Dataset() - dataset.set_infos(copy.deepcopy(self.get_infos())) - - for id in self.get_sample_ids(): - extracted_sample = self[id].extract_sample_from_identifier( - feature_identifiers - ) - dataset.add_sample(sample=extracted_sample, id=id) - return dataset - - @deprecated( - "`Dataset.from_features_identifier(...)` is deprecated, use instead `Dataset.extract_dataset_from_identifier(...)`", - version="0.1.8", - removal="0.2", + model_config = ConfigDict( + revalidate_instances="always", validate_assignment=True, extra="forbid" ) - def from_features_identifier( - self, - feature_identifiers: Union[FeatureIdentifier, list[FeatureIdentifier]], - ) -> Self: - """DEPRECATED: Use :meth:`Dataset.extract_dataset_from_identifier` instead.""" - return self.extract_dataset_from_identifier( - feature_identifiers - ) # pragma: no cover - def get_tabular_from_homogeneous_identifiers( - self, - feature_identifiers: list[FeatureIdentifier], - ) -> Array: - """Extract features of the dataset by their identifier(s) and return an array containing these features. - - Features must have identic sizes to be casted in an array. The first dimension of the array is the number of samples in the dataset. - This method applies updates to scalars, fields, or nodes using feature identifiers. + path: Optional[Union[str, Path]] = Field( + default=None, description="Path to the PLAID dataset directory on disk." + ) + stage: Optional[Literal["training", "evaluating"]] = Field( + default=None, description="Dataset stage ('training' or 'evaluating')" + ) + split: Optional[str] = Field( + default=None, + description="Actual data source split (immutable after construction).", + ) + problem_definition: ProblemDefinition = Field( + default_factory=ProblemDefinition, + description="Problem definition for this dataset.", + ) + indices: NDArrayInt | Literal["all"] = Field( + default="all", + description="""Optional array of sample indices to restrict the dataset view. + Can be "all" to include all samples, a sequence of indices, or None + to use all samples from the split.""", + ) + infos: dict[str, dict[str, str]] = Field(default_factory=dict) + _conv: Any = PrivateAttr(default=None) + _ids: Any = PrivateAttr(default=None) + _backend: BackendModule = PrivateAttr( + default_factory=lambda: get_backend("in_memory")() + ) + label: str = Field(default="") + + # to set the name, task only once + def __setattr__(self, name: str, value: Any) -> None: + if name in ["split", "path"]: + current_value = getattr(self, name, None) + if ( + current_value is not None + and value is not None + and current_value != value + ): + raise AttributeError(f"'{name}' is already set and cannot be changed.") + if current_value == value: + return + super().__setattr__(name, value) + + def get_backend(self) -> BackendModule: + """Return the backend instance used to store and retrieve samples.""" + return self._backend - Args: - feature_identifiers (list of dict): Feature identifiers. + @staticmethod + def from_path( + path: str | Path, + *, + split: Any = None, + stage: Literal["training", "evaluating"] | None = None, + problem_definition: ProblemDefinition | None = None, + indices: NDArrayInt | Literal["all"] = "all", + ) -> "Dataset": + """Create and load a dataset from a local path. + + Args: + path: Dataset path on disk. + split: Optional split name to load. + stage: Optional stage label. + problem_definition: Optional problem definition to attach. + indices: Optional subset indices or ``"all"``. 
Returns: - Array: An containing the provided feature identifiers, size (nb_sample, nb_features, dim_features) - - Raises: - AssertionError: If feature sizes are inconsistent. - """ - features = self.get_features_from_identifiers(feature_identifiers) - dim_features = check_features_size_homogeneity(feature_identifiers, features) - - tabular = np.stack(list(features.values())) - if dim_features == 0: - tabular = np.expand_dims(tabular, axis=-1) - assert tabular.ndim == 3, ( - f"tabular must be constructed to have 3 dimensions: (nb_sample, nb_features, dim_features), but {tabular.ndim=} | {tabular.shape=}" + A loaded :class:`Dataset` instance. + """ + dataset = Dataset( + path=path, + split=split, + stage=stage, + problem_definition=problem_definition or ProblemDefinition(), + indices=indices, ) - - return tabular - - def get_tabular_from_stacked_identifiers( - self, - feature_identifiers: list[FeatureIdentifier], - ) -> tuple[Array, Array]: - """Extract features of the dataset by their identifier(s), stack them and return an array containing these features. - - After stacking, each sample has one feature of dimension dim_stacked_features - - Args: - feature_identifiers (list of dict): Feature identifiers. - - Returns: - Array: An array containing the provided feature identifiers, size (nb_sample, dim_stacked_features) - Array: An array containing the cumulated feature dimensions, starts with 0, size (len(feature_identifiers)+1, ) - """ - features = self.get_features_from_identifiers(feature_identifiers) - - tabular = [] - for local_feats in features.values(): - tabular.append(np.hstack([np.atleast_1d(e) for e in local_feats])) - tabular = np.stack(tabular) - - feat_dims = [0] - feat_dims.extend([len(np.atleast_1d(e)) for e in local_feats]) - cumulated_feat_dims = np.cumsum(feat_dims) - - return tabular, cumulated_feat_dims - - def add_features_from_tabular( - self, - tabular: Array, - feature_identifiers: list[FeatureIdentifier], - restrict_to_features: bool = True, - ) -> Self: - """Add or update features in the dataset from tabular data using feature identifiers. - - This method takes tabular data and applies it to the dataset, either by updating existing features - or adding new ones based on the provided feature identifiers. The method can either: - 1. Extract only the specified features and return a new dataset with just those features (if restrict_to_features=True) - 2. Update the specified features in the current dataset while keeping all other existing features (if restrict_to_features=False) - - Parameters: - tabular (Array): of size (nb_sample, nb_features) or (nb_sample, nb_features, dim_feature) if dim_feature>1 - feature_identifiers (list of dict): One or more feature identifiers specifying which features to update/add. - restrict_to_features (bool, optional): If True, only returns the features from feature identifiers, otherwise keep the other features as well. Defaults to True. - - Returns: - Self: A new dataset with features updated/added from the tabular data. If restrict_to_features=True, - contains only the specified features. If restrict_to_features=False, contains all original - features plus the updated/added ones. - - Raises: - AssertionError - If the number of rows in `tabular` does not match the number of samples in the dataset, - or if the number of feature identifiers does not match the number of columns in `tabular`. 
- """ - for i_id, feat_id in enumerate(feature_identifiers): - feature_identifiers[i_id] = FeatureIdentifier(feat_id) - - assert tabular.shape[0] == len(self) - # assert tabular.shape[1] == len(feature_identifiers) - - features = {id: tabular[i] for i, id in enumerate(self.get_sample_ids())} - - if restrict_to_features: - dataset = self.extract_dataset_from_identifier(feature_identifiers) - dataset.update_features_from_identifier( - feature_identifiers=feature_identifiers, - features=features, - in_place=True, - ) - else: - dataset = self.update_features_from_identifier( - feature_identifiers=feature_identifiers, - features=features, - in_place=False, - ) - + dataset.load(path=path, split=split) return dataset - @deprecated( - "`Dataset.from_tabular(...)` is deprecated, use instead `Dataset.add_features_from_tabular(...)`", - version="0.1.8", - removal="0.2", - ) - def from_tabular( - self, - tabular: Array, - feature_identifiers: Union[FeatureIdentifier, list[FeatureIdentifier]], - restrict_to_features: bool = True, - ) -> Self: - """DEPRECATED: Use :meth:`Dataset.add_features_from_tabular` instead.""" - return self.add_features_from_tabular( - tabular, feature_identifiers, restrict_to_features - ) # pragma: no cover - - # -------------------------------------------------------------------------# - def add_info(self, cat_key: str, info_key: str, info: str) -> None: - """Add information to the :class:`Dataset `, overwriting existing information if there's a conflict. - - Args: - cat_key (str): Category key, choose among "legal," "data_production," and "data_description". - info_key (str): Information key, depending on the chosen category key, choose among "owner", "license", "type", "physics", "simulator", "hardware", "computation_duration", "script", "contact", "location", "number_of_samples", "number_of_splits", "DOE", "inputs" and "outputs". - info (str): Information content. - - Raises: - KeyError: Invalid category key. - KeyError: Invalid info key. - - Example: - .. code-block:: python - - from plaid import Dataset - dataset = Dataset() - infos = {"legal":{"owner":"CompX", "license":"li_X"}} - dataset.set_infos(infos) - print(dataset.get_infos()) - >>> {'legal': {'owner': 'CompX', 'license': 'li_X'}} - dataset.add_info("data_production", "type", "simulation") - print(dataset.get_infos()) - >>> {'legal': {'owner': 'CompX', 'license': 'li_X'}, 'data_production': {'type': 'simulation'}} - - """ - if cat_key not in AUTHORIZED_INFO_KEYS: - raise KeyError( - f"{cat_key=} not among authorized keys. Maybe you want to try among these keys {list(AUTHORIZED_INFO_KEYS.keys())}" - ) - if info_key not in AUTHORIZED_INFO_KEYS[cat_key]: - raise KeyError( - f"{info_key=} not among authorized keys. Maybe you want to try among these keys {AUTHORIZED_INFO_KEYS[cat_key]}" - ) - - if cat_key not in self._infos: - self._infos[cat_key] = {} - elif info_key in self._infos[cat_key]: - logger.warning( - f"{cat_key=} and {info_key=} already set, replacing it anyway" - ) - self._infos[cat_key][info_key] = info - - def add_infos(self, cat_key: str, infos: dict[str, str]) -> None: - """Add information to the :class:`Dataset `, overwriting existing information if there's a conflict. - - Args: - cat_key (str): Category key, choose among "legal," "data_production," and "data_description". - infos (str): Information key with its related content. - - Raises: - KeyError: Invalid category key. - KeyError: Invalid info key. - - Example: - .. 
code-block:: python - - from plaid import Dataset - dataset = Dataset() - infos = {"legal":{"owner":"CompX", "license":"li_X"}} - dataset.set_infos(infos) - print(dataset.get_infos()) - >>> {'legal': {'owner': 'CompX', 'license': 'li_X'}} - new_info = {"type":"simulation", "simulator":"Z-set"} - dataset.add_infos("data_production", new_info) - print(dataset.get_infos()) - >>> {'legal': {'owner': 'CompX', 'license': 'li_X'}, 'data_production': {'type': 'simulation', 'simulator': 'Z-set'}} - - """ - if cat_key not in AUTHORIZED_INFO_KEYS: # Format checking on "infos" - raise KeyError( - f"{cat_key=} not among authorized keys. Maybe you want to try among these keys {list(AUTHORIZED_INFO_KEYS.keys())}" - ) - for info_key in infos.keys(): - if info_key not in AUTHORIZED_INFO_KEYS[cat_key]: - raise KeyError( - f"{info_key=} not among authorized keys. Maybe you want to try among these keys {AUTHORIZED_INFO_KEYS[cat_key]}" - ) - - if cat_key not in self._infos: - self._infos[cat_key] = {} - elif info_key in self._infos[cat_key]: - logger.warning( - f"{cat_key=} and {info_key=} already set, replacing it anyway" - ) - - for key, value in infos.items(): - self._infos[cat_key][key] = value - def set_infos(self, infos: dict[str, dict[str, str]], warn: bool = True) -> None: """Set information to the :class:`Dataset `, overwriting the existing one. @@ -956,888 +127,196 @@ def set_infos(self, infos: dict[str, dict[str, str]], warn: bool = True) -> None print(dataset.get_infos()) >>> {'legal': {'owner': 'CompX', 'license': 'li_X'}} """ - for cat_key in infos.keys(): # Format checking on "infos" - if cat_key != "plaid": - if cat_key not in AUTHORIZED_INFO_KEYS: - raise KeyError( - f"{cat_key=} not among authorized keys. Maybe you want to try among these keys {list(AUTHORIZED_INFO_KEYS.keys())}" - ) - for info_key in infos[cat_key].keys(): - if info_key not in AUTHORIZED_INFO_KEYS[cat_key]: - raise KeyError( - f"{info_key=} not among authorized keys. Maybe you want to try among these keys {AUTHORIZED_INFO_KEYS[cat_key]}" - ) - # Check if there are any non-plaid infos being replaced - has_user_infos = any(key != "plaid" for key in self._infos.keys()) + has_user_infos = any(key != "plaid" for key in self.infos.keys()) if has_user_infos and warn: logger.warning("infos not empty, replacing it anyway") - self._infos = copy.deepcopy(infos) - - if "plaid" not in self._infos: - self._infos["plaid"] = {} - if "version" not in self._infos["plaid"]: - self._infos["plaid"]["version"] = Version(plaid.__version__) - def get_infos(self) -> dict[str, dict[str, str]]: - """Get information from an instance of :class:`Dataset `. + self.infos = normalize_infos(infos) + if "version" not in self.infos["plaid"]: + self.infos["plaid"]["version"] = Version(__version__) - Returns: - dict[str,dict[str,str]]: Information associated with this data set (Dataset). + # load data from disk if path and split are given + def model_post_init(self, __context): + """Automatically load data when both ``path`` and ``split`` are set.""" + if self.path is not None and self.split is not None: + self.load() - Example: - .. 
code-block:: python - - from plaid import Dataset - dataset = Dataset() - infos = {"legal":{"owner":"CompX", "license":"li_X"}} - dataset.set_infos(infos) - print(dataset.get_infos()) - >>> {'legal': {'owner': 'CompX', 'license': 'li_X'}} - """ - return { - k: {kk: str(vv) for kk, vv in v.items()} for k, v in self._infos.items() - } - - def print_infos(self) -> None: - """Prints information in a readable format (pretty print).""" - infos_cats = list(self._infos.keys()) - tf = "*********************** \x1b[34;1mdataset infos\x1b[0m **********************\n" - for cat in infos_cats: - tf += "\x1b[33;1m" + str(cat) + "\x1b[0m\n" - infos = list(self._infos[cat].keys()) - for info in infos: - tf += ( - " \x1b[32;1m" - + str(cat) - + "\x1b[0m:" - + str(self._infos[cat][info]) - + "\n" - ) - tf += "************************************************************\n" - print(tf) - - # -------------------------------------------------------------------------# - def merge_dataset(self, dataset: Self) -> list[int]: - """Merges samples of another dataset into this one. - - Args: - dataset (Dataset): The data set to be merged into this one (self). - in_place (bool, option): If True, modifies the current dataset in place. - - Returns: - list[int]: ids of added :class:`Samples ` from input :class:`Dataset `. - - Raises: - ValueError: If the provided dataset value is not an instance of Dataset - """ - if dataset is None: - return - if not isinstance(dataset, Dataset): - raise ValueError("dataset must be an instance of Dataset") - return self.add_samples(dataset.get_samples(as_list=True)) - - def merge_features(self, dataset: Self, in_place: bool = False) -> Self: - """Merge features of another dataset into this one. - - Args: - dataset (Dataset): The dataset to be merged into this one (self). - in_place (bool, option): If True, modifies the current dataset in place. - If False, returns a deep copy with the merged features. - - Returns: - Dataset: A dataset containing all samples from the input datasets. - """ - if dataset is None: - return - if not isinstance(dataset, Dataset): - raise ValueError("dataset must be an instance of Dataset") - - merged_dataset = self if in_place else self.copy() - assert dataset.get_sample_ids() == merged_dataset.get_sample_ids(), ( - "Cannot merge features of datasets with different sample ids. " - ) - for id in dataset.get_sample_ids(): - merged_sample = merged_dataset[id].merge_features( - dataset[id], in_place=in_place - ) - merged_dataset.set_sample( - id=id, sample=merged_sample, warning_overwrite=False - ) - - return merged_dataset - - @classmethod - def merge_dataset_by_features(cls, datasets_list: list[Self]) -> Self: - """Merge features a list of datasets. - - Args: - datasets_list (list[Dataset]): The list of datasets to be merged. - - Returns: - Dataset: A new dataset containing all samples from the input datasets. 
- """ - if len(datasets_list) == 1: - return datasets_list[0] - - merged_dataset = datasets_list[0] - for dataset in datasets_list[1:]: - merged_dataset = merged_dataset.merge_features(dataset, in_place=False) - return merged_dataset - - @deprecated_argument("directory_path", "path", version="0.1.8", removal="0.2.0") - @deprecated( - "`Dataset.save(...)` is deprecated, use instead `Dataset.save_to_file(...)`", - version="0.1.10", - removal="0.2.0", - ) - def save(self, path: Union[str, Path]) -> None: - """DEPRECATED: use :meth:`Dataset.save_to_file` instead.""" - self.save_to_file(path) - - def save_to_file(self, path: Union[str, Path]) -> None: - """Saves the data set to a TAR (Tape Archive) file. - - It creates a temporary intermediate directory to store temporary files during the loading process. + def load( + self, path: Optional[Union[str, Path]] = None, split: Optional[Any] = None + ): + """Load dataset content from a directory or archive. Args: - path (Union[str, Path]): The path to which the data set will be saved. + path: Directory or archive path. If ``None``, uses ``self.path``. + split: Split name to load. If ``None``, uses ``self.split`` or ``"train"``. Raises: - ValueError: If the randomly generated temporary dir name is already used (extremely unlikely!). - """ - path = Path(path) - - # First : creates a directory to save everything in an - # arborescence on disk - tmp_dir = path.parent / f"tmpsavedir_{generate_random_ASCII()}" - if tmp_dir.is_dir(): # pragma: no cover - raise ValueError( - f"temporary intermediate directory <{tmp_dir}> already exits" - ) - tmp_dir.mkdir(parents=True) - - self.save_to_dir(tmp_dir) - - # Then : tar dir in file - # TODO: avoid using subprocess by using lib tarfile - ARGUMENTS = ["tar", "-cf", path, "-C", tmp_dir, "."] - subprocess.call(ARGUMENTS) - - # Finally : removes directory - shutil.rmtree(tmp_dir) - - def save_to_dir(self, path: Union[str, Path], verbose: bool = False) -> None: - """Saves the dataset into a sub-directory `samples` and creates an 'infos.yaml' file to store additional information about the dataset. - - Args: - path (Union[str, Path]): The path in which to save the files. - verbose (bool, optional): Explicitly displays the operations performed. Defaults to False. + RuntimeError: If no path is provided. + FileNotFoundError: If the path does not exist. 
""" - path = Path(path) - if not (path.is_dir()): - path.mkdir(parents=True) - - self.path = path - - if verbose: # pragma: no cover - print(f"Saving database to: {path}") - - # Save infos - assert "plaid" in self._infos, f"{self._infos.keys()=} should contain 'plaid'" - assert "version" in self._infos["plaid"], ( - f"{self._infos['plaid'].keys()=} should contain 'version'" - ) - plaid_version = Version(plaid.__version__) - if self._infos["plaid"]["version"] != plaid_version: # pragma: no cover - logger.warning( - f"Version mismatch: Dataset was loaded from version {self._infos['plaid']['version'] if self._infos['plaid']['version'] is not None else 'anterior to 0.1.10'}, and will be saved with version: {plaid_version}" - ) - self._infos["plaid"]["version"] = str(plaid_version) - infos_fname = path / "infos.yaml" - with open(infos_fname, "w") as file: - yaml.dump(self._infos, file, default_flow_style=False, sort_keys=False) - # Save samples - samples_dir = path / "samples" - if not (samples_dir.is_dir()): - samples_dir.mkdir(parents=True) + if path is None: + path = self.path + if path is None: + raise RuntimeError("must supply a path ") - for i_sample, sample in tqdm(self._samples.items(), disable=not (verbose)): - sample_dname = samples_dir / f"sample_{i_sample:09d}" - sample.save_to_dir(sample_dname) - - def summarize_features(self) -> str: - """Show the name of each feature and the number of samples containing it. - - Returns: - str: A summary of features across the dataset. - - Example: - .. code-block:: bash - - Dataset Feature Summary: - ================================================== - Scalars (8 unique): - - Pr: 30/32 samples (93.8%) - - Q: 30/32 samples (93.8%) - - Tr: 30/32 samples (93.8%) - - angle_in: 32/32 samples (100.0%) - - angle_out: 30/32 samples (93.8%) - - eth_is: 30/32 samples (93.8%) - - mach_out: 32/32 samples (100.0%) - - power: 30/32 samples (93.8%) - - Fields (8 unique): - - M_iso: 30/32 samples (93.8%) - - mach: 30/32 samples (93.8%) - - nut: 30/32 samples (93.8%) - - ro: 30/32 samples (93.8%) - - roe: 30/32 samples (93.8%) - - rou: 30/32 samples (93.8%) - - rov: 30/32 samples (93.8%) - - sdf: 32/32 samples (100.0%) - """ - summary = "Dataset Feature Summary:\n" - summary += "=" * 50 + "\n" + if split is None: + split = self.split - if len(self._samples) == 0: - return summary + "No samples in dataset.\n" - - # Collect all feature names across all samples - all_scalar_names = set() - all_field_names = set() - - # Count occurrences of each feature - scalar_counts = {} - field_counts = {} - - for _, sample in self._samples.items(): - # Scalars - scalar_names = sample.get_scalar_names() - all_scalar_names.update(scalar_names) - for name in scalar_names: - scalar_counts[name] = scalar_counts.get(name, 0) + 1 - - # Fields - times = sample.features.get_all_time_values() - for time in times: - base_names = sample.features.get_base_names(time=time) - for base_name in base_names: - zone_names = sample.features.get_zone_names( - base_name=base_name, time=time - ) - for zone_name in zone_names: - field_names = sample.get_field_names( - zone_name=zone_name, base_name=base_name, time=time - ) - all_field_names.update(field_names) - for name in field_names: - field_counts[name] = field_counts.get(name, 0) + 1 - - total_samples = len(self._samples) - - # Scalars summary - summary += f"Scalars ({len(all_scalar_names)} unique):\n" - if all_scalar_names: - for name in sorted(all_scalar_names): - count = scalar_counts.get(name, 0) - summary += f" - {name}: 
{count}/{total_samples} samples ({count / total_samples * 100:.1f}%)\n" - else: - summary += " None\n" - summary += "\n" - - # Fields summary - summary += f"Fields ({len(all_field_names)} unique):\n" - if all_field_names: - for name in sorted(all_field_names): - count = field_counts.get(name, 0) - summary += f" - {name}: {count}/{total_samples} samples ({count / total_samples * 100:.1f}%)\n" + if self.split is not None: + split = self.split else: - summary += " None\n" - - return summary - - def check_feature_completeness(self) -> str: - """Detect and notify if some Samples don't contain all features. - - Returns: - str: A report on feature completeness across the dataset. - - Example: - .. code-block:: bash - - Dataset Feature Completeness Check: - ======================================== - Complete samples: 30/32 (93.8%) - Incomplete samples: 2/32 (6.2%) + split = "train" - Samples with missing features: - Sample 671: missing 13 features - - scalar:Tr - - scalar:angle_out - - scalar:power - - scalar:Pr - - scalar:Q - ... and 8 more - Sample 672: missing 13 features - - scalar:Tr - - scalar:angle_out - - scalar:power - - scalar:Pr - - scalar:Q - ... and 8 more - """ - report = "Dataset Feature Completeness Check:\n" - report += "=" * 40 + "\n" - - if len(self._samples) == 0: - return report + "No samples in dataset.\n" - - # Collect all possible features across all samples - all_scalar_names = set() - all_field_names = set() - - for sample in self._samples.values(): - all_scalar_names.update(sample.get_scalar_names()) - - times = sample.features.get_all_time_values() - for time in times: - base_names = sample.features.get_base_names(time=time) - for base_name in base_names: - zone_names = sample.features.get_zone_names( - base_name=base_name, time=time - ) - for zone_name in zone_names: - all_field_names.update( - sample.get_field_names( - zone_name=zone_name, base_name=base_name, time=time - ) - ) + path = Path(path) + if path.is_file(): + # inputdir = path.parent / f"tmploaddir_{generate_random_ASCII()}" + import tempfile - total_samples = len(self._samples) - incomplete_samples = [] + inputdir = Path(tempfile.mkdtemp(prefix="temp_plaid_load")) - # Check each sample for missing features - for sample_id, sample in self._samples.items(): - missing_features = [] + try: + # First : untar file to a directory + # TODO: avoid using subprocess by using a lib tarfile + arguments = ["tar", "-xf", path, "-C", inputdir] + import subprocess - # Check scalars - sample_scalars = set(sample.get_scalar_names()) - missing_scalars = all_scalar_names - sample_scalars - if missing_scalars: - missing_features.extend([f"scalar:{name}" for name in missing_scalars]) + subprocess.call(arguments) - # Check fields - sample_fields = set() - times = sample.features.get_all_time_values() - for time in times: - base_names = sample.features.get_base_names(time=time) - for base_name in base_names: - zone_names = sample.features.get_zone_names( - base_name=base_name, time=time - ) - for zone_name in zone_names: - sample_fields.update( - sample.get_field_names( - zone_name=zone_name, base_name=base_name, time=time - ) - ) + # Then : load data from directory + from plaid.storage import init_from_disk - missing_fields = all_field_names - sample_fields - if missing_fields: - missing_features.extend([f"field:{name}" for name in missing_fields]) + datasetdict, converterdict = init_from_disk(inputdir) + self._ds = datasetdict[split] + self._conv = converterdict[split] + self.indices = np.arange(len(self._ds)) + finally: + # 
shutil.rmtree(inputdir) + # register deletion at exit + import atexit + import shutil - if missing_features: - incomplete_samples.append((sample_id, missing_features)) + atexit.register(shutil.rmtree, inputdir) - # Generate report - complete_samples = total_samples - len(incomplete_samples) - report += f"Complete samples: {complete_samples}/{total_samples} ({complete_samples / total_samples * 100:.1f}%)\n" - report += f"Incomplete samples: {len(incomplete_samples)}/{total_samples} ({len(incomplete_samples) / total_samples * 100:.1f}%)\n\n" + elif path.is_dir(): + from plaid.storage import init_from_disk - if incomplete_samples: - report += "Samples with missing features:\n" - for sample_id, missing_features in incomplete_samples: - report += ( - f" Sample {sample_id}: missing {len(missing_features)} features\n" - ) - for feature in missing_features[:5]: # Show first 5 missing features - report += f" - {feature}\n" - if len(missing_features) > 5: - report += f" ... and {len(missing_features) - 5} more\n" + datasetdict, converterdict = init_from_disk(path) + self._ds = datasetdict[split] + self._conv = converterdict[split] + self.indices = np.arange(len(self._ds)) else: - report += "All samples contain all features! ✓\n" - - return report + raise FileNotFoundError(f"Error! path '{path}' does not exist") @classmethod - @deprecated( - "`Dataset.from_list_of_samples(samples)` is deprecated, use instead `Dataset(samples=samples)`", - version="0.1.8", - removal="0.2.0", - ) - def from_list_of_samples( - cls, list_of_samples: list[Sample], ids: Optional[list[int]] = None - ) -> Self: - """DEPRECATED: use `Dataset(samples=..., sample_ids=...)` instead.""" - return cls(samples=list_of_samples, sample_ids=ids) - - @classmethod - @deprecated_argument("fname", "path", version="0.1.8", removal="0.2.0") - def load_from_file( - cls, path: Union[str, Path], verbose: bool = False, processes_number: int = 0 - ) -> Self: - """Load data from a specified TAR (Tape Archive) file. - - Args: - path (Union[str, Path]): The path to the data file to be loaded. - verbose (bool, optional): Explicitly displays the operations performed. Defaults to False. - processes_number (int, optional): Number of processes used to load files (-1 to use all available ressources, 0 to disable multiprocessing). Defaults to 0. - - Returns: - Self: The loaded dataset (Dataset). - """ - path = Path(path) - instance = cls() - instance.load(path, verbose, processes_number) - return instance - - @classmethod - @deprecated_argument("fname", "path", version="0.1.8", removal="0.2.0") - def load_from_dir( + def from_train_split( cls, - path: Union[str, Path], - ids: Optional[list[int]] = None, - verbose: bool = False, - processes_number: int = 0, - ) -> Self: - """Load data from a specified directory. + path: Path, + pb_def_name: str = "PLAID_benchmark", + ) -> "Dataset": + """Create a Dataset from the training split defined in the problem definition. Args: - path (Union[str, Path]): The path from which to load files. - ids (list, optional): The specific sample IDs to load from the dataset. Defaults to None. - verbose (bool, optional): Explicitly displays the operations performed. Defaults to False. - processes_number (int, optional): Number of processes used to load files (-1 to use all available ressources, 0 to disable multiprocessing). Defaults to 0. + path: Path to the PLAID dataset directory on disk. + pb_def_name: Name of the problem definition to load. Returns: - Self: The loaded dataset (Dataset). 
- """ - path = Path(path) - instance = cls() - instance._load_from_dir_( - path, ids=ids, verbose=verbose, processes_number=processes_number - ) - return instance - - @deprecated_argument("fname", "path", version="0.1.8", removal="0.2.0") - def load( - self, path: Union[str, Path], verbose: bool = False, processes_number: int = 0 - ) -> None: - """Load data from a specified file or directory. - - Note: - If path is a file, it creates a temporary intermediate directory to extract the files from the archive during the loading process. - - Note: - This method overwrites the content of the calling instance. - - Args: - path (Union[str, Path]): The path to the data file to be loaded. - verbose (bool, optional): Explicitly displays the operations performed. Defaults to False. - processes_number (int, optional): Number of processes used to load files (-1 to use all available ressources, 0 to disable multiprocessing). Defaults to 0. - - Raises: - ValueError: If a randomly generated temporary directory already exists, - indicating a potential conflict during the loading process (extremely unlikely). - """ - path = Path(path) - - if path.is_file(): - inputdir = path.parent / f"tmploaddir_{generate_random_ASCII()}" - if inputdir.is_dir(): # pragma: no cover - raise ValueError( - f"temporary intermediate directory <{inputdir}> already exits" - ) - inputdir.mkdir(parents=True) - - # First : untar file to a directory - # TODO: avoid using subprocess by using a lib tarfile - arguments = ["tar", "-xf", path, "-C", inputdir] - subprocess.call(arguments) - - # Then : load data from directory - self._load_from_dir_( - inputdir, verbose=verbose, processes_number=processes_number - ) - - # Finally : removes directory - shutil.rmtree(inputdir) - else: - self._load_from_dir_( - path, verbose=verbose, processes_number=processes_number - ) - - # -------------------------------------------------------------------------# - @deprecated_argument("save_dir", "path", version="0.1.8", removal="0.2.0") - def add_to_dir( - self, - sample: Sample, - path: Optional[Union[str, Path]] = None, - verbose: bool = False, - ) -> None: - """Add a sample to the dataset and save it to the specified directory. - - Note: - If `path` is None, will look for `self.path` which will be retrieved from last previous call to load or save. - `path` given in argument will take precedence over `self.path` and overwrite it. - - Args: - sample (Sample): The sample to add. - path (Union[str, Path], optional): The directory in which to save the sample. Defaults to None. - verbose (bool, optional): If True, will print additional information. Defaults to False. - - Raises: - ValueError: If both self.path and path are None. + Dataset instance loaded from the training split. 
""" - if path is not None: - path = Path(path) - self.path = path - else: - if not hasattr(self, "path") or self.path is None: - raise ValueError( - "self.path and path are None, we don't know where to save, specify one of them before" - ) - - # --- sample is not only saved to dir, but also added to the dataset - # self.add_sample(sample) - # --- if dataset already contains other Samples, they will all be saved to path - # self._save_to_dir_(self.path) - - if not self.path.is_dir(): - self.path.mkdir(parents=True) - - if verbose: - print(f"Saving database to: {self.path}") - - samples_dir = self.path / "samples" - if not samples_dir.is_dir(): - samples_dir.mkdir(parents=True) - - # find i_sample - # if there are already samples in the instance, we should not take an already existing id - # if there are already samples in the path, we should not take an already existing id - sample_ids_in_path = [ - int(d.name.split("_")[-1]) - for d in samples_dir.glob("sample_*") - if d.is_dir() - ] - i_sample = max(sample_ids_in_path) + 1 if len(sample_ids_in_path) > 0 else 0 - i_sample = max(len(self), i_sample) - - sample_dname = samples_dir / f"sample_{i_sample:09d}" - sample.save_to_dir(sample_dname) - - @deprecated( - "`Dataset._save_to_dir_(path)` is deprecated, use instead `Dataset.save_to_dir(path)`", - version="0.1.10", - removal="0.2.0", - ) - def _save_to_dir_(self, path: Union[str, Path], verbose: bool = False) -> None: - """DEPRECATED: use :meth:`Dataset.save_to_dir` instead.""" - self.save_to_dir(path, verbose=verbose) - - def _load_from_dir_( - self, - path: Union[str, Path], - ids: Optional[list[int]] = None, - verbose: bool = False, - processes_number: int = 0, - ) -> None: - """DEPRECATED: use :meth:`Dataset.load` instead.""" - path = Path(path) - if not path.is_dir(): - raise FileNotFoundError( - f'"{path}" is not a directory or does not exist. 
Abort' - ) - - if processes_number < -1: - raise ValueError("Number of processes cannot be < -1") - - self.path = path - - if verbose: # pragma: no cover - print(f"Reading database located at: {path}") - - # Load infos - infos_fname = path / "infos.yaml" - if infos_fname.is_file(): - with open(infos_fname, "r") as file: - self._infos = yaml.safe_load(file) - if ( - "plaid" not in self._infos or "version" not in self._infos["plaid"] - ): # pragma: no cover - self._infos.setdefault("plaid", {}).setdefault("version", None) - else: - if not isinstance(self._infos["plaid"]["version"], Version): - self._infos["plaid"]["version"] = Version( - self._infos["plaid"]["version"] - ) - - # Load samples - sample_paths = sorted( - [path for path in (path / "samples").glob("sample_*") if path.is_dir()] - ) + problem_definition = ProblemDefinition.from_path(path=path, name=pb_def_name) + split, indices = next(iter(problem_definition.train_split.items())) - if ids is not None: - filtered_sample_paths = [] - for sample_path in sample_paths: - id = int(sample_path.stem.split("_")[-1]) - if id in ids: - filtered_sample_paths.append(sample_path) - sample_paths = filtered_sample_paths - - if len(sample_paths) != len(set(ids)): # pragma: no cover - raise ValueError( - "The length of the list of samples to add and the list of IDs are different" - ) - - if processes_number == -1: - logger.info( - f"Number of processes set to maximum available: {os.cpu_count()}" - ) - processes_number = os.cpu_count() - - if processes_number == 0 or processes_number == 1: - for sample_path in tqdm(sample_paths, disable=not (verbose)): - id = int(sample_path.stem.split("_")[-1]) - sample = Sample(path=sample_path) - self.add_sample(sample, id) + # Convert indices to numpy array if needed + indices_array: np.ndarray | Literal["all"] | None + if indices == "all": + indices_array = "all" + elif indices is not None: + indices_array = np.array(indices) else: - with Pool(processes_number) as p: - for id, sample in list( - tqdm( - p.imap(_process_sample, sample_paths), - total=len(sample_paths), - disable=not (verbose), - ) - ): - self.set_sample(id, sample) - - """ - samples_pool = Pool(processes_number) - pbar = tqdm(total=len(sample_paths), disable=not (verbose)) - - def update(self, *a): - pbar.update() - - samples = [ - samples_pool.apply_async( - _process_sample, - args=( - sample_paths[i], - i), - callback=update) for i in range( - len(sample_paths))] - - samples_pool.close() - samples_pool.join() - - for s in samples: - id, sample = s.get() - self.set_sample(id, sample) - """ - - if len(self) == 0: # pragma: no cover - print("Warning: dataset contains no sample") - - @staticmethod - def _load_number_of_samples_(_path: Union[str, Path]) -> int: - """DEPRECATED: use :meth:`plaid.get_number_of_samples ` instead.""" - raise DeprecatedError( - 'use instead: plaid.get_number_of_samples("path-to-my-dataset")' + indices_array = None + + return cls( + path=path, + stage="training", + split=split, + # problem_definition=problem_definition, + indices=indices_array, ) - # -------------------------------------------------------------------------# - def set_samples(self, samples: dict[int, Sample]) -> None: - """Set the samples of the data set, overwriting the existing ones. + def __getitem__(self, idx: int): + """Return a single converted sample from the current dataset view. Args: - samples (dict[int,Sample]): A dictionary of samples to set inside the dataset. - - Raises: - TypeError: If the 'samples' parameter is not of type dict[int, Sample]. 
- TypeError: If the 'id' inside a sample is not of type int. - ValueError: If the 'id' inside a sample is negative (id >= 0 is required). - TypeError: If the values inside the 'samples' dictionary are not of type Sample. - """ - if not (isinstance(samples, dict)): - raise TypeError( - f"samples should be of type dict[int,Sample] but is {type(samples)=}" - ) - - ids = list(samples.keys()) - for id in ids: - if not (isinstance(id, int)): - raise TypeError(f"id should be of type {int.__class__} but {type(id)=}") - if not (id >= 0): - raise ValueError(f"id should be positive (id>=0) but {id=}") - if not (isinstance(samples[id], Sample)): - raise TypeError( - f"samples[{id=}] should be of type {Sample.__class__} but {type(samples[id])=}" - ) - - if len(self._samples) > 0: - logger.warning( - f"{len(self._samples)} samples are already present in dataset, replacing them anyway" - ) - self._samples = samples + idx: Position inside the current ``_ids`` view. - # TODO: on veut vraiment faire ça ? - # laisser l’utilisateur faire joujou avec les id des samples ? - # le laisser placer des samples n’importe où ? - # - avec des ids potentiellement négatifs, - # - potentiellement loin après le dernier id déjà présent... - def set_sample( - self, id: int, sample: Sample, warning_overwrite: bool = True - ) -> None: - """Set a :class:`sample` with :code:`id` in the Dataset, overwriting existing samples if there's a conflict. - - Args: - id (int): The choosen id of the sample. - sample (Sample): The sample to set inside the dataset. - warning_overwrite (bool, optional): Show warning if an preexisting field is being overwritten - - Raises: - TypeError: If the 'id' inside the sample is not of type int. - ValueError: If the 'id' inside a sample is negative (id >= 0 is required). - TypeError: If 'sample' parameter is not of type Sample. - - Caution: - In case of conflict, the existing samples will be overwritten. + Returns: + Sample converted to PLAID format by the split converter. """ - if not (isinstance(id, int)): - raise TypeError(f"id should be of type {int.__class__} but {type(id)=}") - if not (id >= 0): - raise ValueError(f"id should be positive (id>=0) but {id=}") - if not (isinstance(sample, Sample)): - raise TypeError( - f"sample should be of type {Sample.__class__} but {type(sample)=}" - ) + return self._backend[idx] - if warning_overwrite: - if id in self._samples: - logger.warning( - f"sample with {id=} already present in dataset, replacing it anyway" - ) - self._samples[id] = sample - - # -------------------------------------------------------------------------# - def __len__(self) -> int: - """Return the number of samples in the dataset. + def __len__(self): + """Return the number of samples currently exposed by this dataset. Returns: - int: The number of samples in the dataset. - - Example: - .. code-block:: python - - from plaid import Dataset - dataset = Dataset() - len(dataset) - >>> 10 # Assuming there are 10 samples in the dataset + Number of indices currently stored in ``_ids``. """ - return len(self._samples) - - def __iter__(self) -> Iterator[Sample]: - """Iterate over the samples in the dataset. - - Yields: - Iterator[Sample]: An iterator over the Sample objects stored in the dataset. - - Example: - >>> for sample in dataset: - ... process(sample) + if isinstance(self.indices, str): + if self.indices == "all": + return len(self._backend) + else: + raise RuntimeError(f"'{self.indices}' not a valid value") - Note: - The samples are yielded in ascending order of their IDs. 
- Only samples that have been explicitly added to the dataset are returned. - """ - return (self._samples[k] for k in sorted(list(self._samples.keys()))) + return len(self.indices) - def __getitem__( - self, id: Union[int, slice, range, list[int], np.ndarray] - ) -> Union[Sample, Self]: - """Retrieve a specific sample by its ID int this dataset. + def get_samples(self, ids: Optional[list[int]] = None) -> list[Sample]: + """Return a list of samples corresponding to the given IDs. Args: - id (Union[int, slice, list[int], np.ndarray]): The ID(s) of the sample to retrieve. - - Raises: - IndexError: If the provided ID is out of bounds or does not exist in the dataset. + ids: Optional list of sample IDs to retrieve. If None, retrieves all samples in the dataset. Returns: - Union[Sample, Dataset]: The sample with the specified ID or a dataset in the specified IDs. - - Example: - .. code-block:: python - - from plaid import Dataset - dataset = Dataset() - sample = dataset[3] # Retrieve the sample with ID 3 - - Seealso: - This function can also be called using `__call__()`. + List of Sample objects corresponding to the specified IDs. """ - if isinstance(id, (slice, range, list, np.ndarray)): - if isinstance(id, slice): - id = list(range(*id.indices(len(self)))) - dataset = Dataset() - for i in id: - dataset.add_sample(self[int(i)], int(i)) - dataset.set_infos(self.get_infos()) - return dataset - else: - if id in self._samples: - return self._samples[id] + if ids is None: + if self.indices == "all": + return [self[i] for i in range(len(self))] else: - raise IndexError( - f"sample with {id=} not set -> use 'Dataset.add_sample' or 'Dataset.add_samples'" - ) - - __call__ = __getitem__ - - def __str__(self) -> str: - """Return a string representation of the dataset. + return [self[i] for i in self.indices] + else: + return [self[i] for i in ids] - Returns: - str: A string representation of the overview of dataset content. + def get_sample_ids(self): + """Return the effective sample identifiers for the current dataset view.""" + if self.indices == "all": + return range(len(self)) + else: + return self.indices - Example: - .. code-block:: python + def save_to_dir(self, output_folder: Union[str, Path], verbose: bool = False): + """Save the current dataset view to disk. - from plaid import Dataset - dataset = Dataset() - print(dataset) - >>> Dataset(0 samples, 0 scalars, 0 fields) + Args: + output_folder: Destination folder. + verbose: If ``True``, enables verbose storage output. 
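The accessors above (`__getitem__`, `__len__`, `get_samples`, `get_sample_ids`) treat the dataset as an index-based view: `indices` is either `"all"` or an explicit array, and lookups resolve against that view. A short sketch, reusing the placeholder dataset from the earlier examples:

```python
from plaid.containers.dataset import Dataset

ds = Dataset(path="./my_dataset", split="train")  # placeholder dataset, assumed to exist on disk

n = len(ds)                       # size of the current view
first = ds[0]                     # a single sample converted to PLAID format
some_samples = ds.get_samples(ids=[0, 1])
all_samples = ds.get_samples()    # every sample in the current view
ids = ds.get_sample_ids()         # range(len(ds)) when indices == "all", else the index array
```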
""" - str_repr = "Dataset(" - - # samples - nb_samples = len(self._samples) - str_repr += f"{nb_samples} sample{'' if nb_samples == 1 else 's'}, " - - # scalars - nb_scalars = len(self.get_scalar_names()) - str_repr += f"{nb_scalars} scalar{'' if nb_scalars == 1 else 's'}, " - - # fields - nb_fields = len(self.get_field_names()) - str_repr += f"{nb_fields} field{'' if nb_fields == 1 else 's'}, " + from plaid.storage import save_to_disk - if str_repr[-2:] == ", ": - str_repr = str_repr[:-2] - str_repr = str_repr + ")" - return str_repr - - __repr__ = __str__ + if self.split is not None: + split = self.split + else: + split = "train" + + save_to_disk( + output_folder=output_folder, + sample_constructor=lambda x: self[x], + ids={split: self.get_sample_ids()}, + backend="cgns" + if self._backend.name not in ["zarr"] + else "zarr", # or "cgns" / "zarr" + # infos=infos, + # pb_defs=pb_def, + num_proc=1, + overwrite=True, + verbose=verbose, + ) diff --git a/src/plaid/containers/feature_identifier.py b/src/plaid/containers/feature_identifier.py deleted file mode 100644 index f76820f9..00000000 --- a/src/plaid/containers/feature_identifier.py +++ /dev/null @@ -1,37 +0,0 @@ -"""Feature identifier class for PLAID containers.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - -from typing import Union - - -class FeatureIdentifier(dict[str, Union[str, float]]): - """Feature identifier for a specific feature.""" - - def __init__(self, *args, **kwargs) -> None: - return super().__init__(*args, **kwargs) - - def __hash__(self) -> int: # pyright: ignore[reportIncompatibleVariableOverride] - """Compute a hash for the feature identifier. - - Returns: - int: The hash value. - """ - return hash(frozenset(sorted(self.items()))) - # return hash(tuple(sorted(self.items()))) - - def __lt__(self, other: "FeatureIdentifier") -> bool: - """Compare two feature identifiers for ordering. - - Args: - other (FeatureIdentifier): The other feature identifier to compare against. - - Returns: - bool: True if this feature identifier is less than the other, False otherwise. 
- """ - return sorted(self.items()) < sorted(other.items()) diff --git a/src/plaid/containers/features.py b/src/plaid/containers/features.py index 86fd54ac..ff2a173c 100644 --- a/src/plaid/containers/features.py +++ b/src/plaid/containers/features.py @@ -10,19 +10,18 @@ import numpy as np from CGNS.PAT.cgnsutils import __CHILDREN__, __NAME__ -from plaid.constants import ( +from ..constants import ( CGNS_ELEMENT_NAMES, CGNS_FIELD_LOCATIONS, ) -from plaid.containers.managers.default_manager import DefaultManager -from plaid.containers.utils import ( +from ..containers.managers.default_manager import DefaultManager +from ..containers.utils import ( _check_names, _read_index, get_feature_details_from_path, ) -from plaid.types import Array, CGNSNode, CGNSTree, Field -from plaid.utils import cgns_helper as CGH -from plaid.utils.deprecation import deprecated +from ..types import Array, CGNSNode, CGNSTree +from ..utils import cgns_helper as CGH logger = logging.getLogger(__name__) @@ -129,15 +128,6 @@ def get_all_time_values(self) -> list[float]: """ return list(self.data.keys()) - @deprecated( - "`get_all_mesh_times()` is deprecated, use instead `get_all_time_values()`", - version="0.1.11", - removal="0.2.0", - ) - def get_all_mesh_times(self) -> list[float]: - """DEPRECATED: Use :meth:`get_all_time_values` instead.""" - return self.get_all_time_values() # pragma: no cover - def init_tree(self, time: Optional[float] = None) -> CGNSTree: """Initialize a CGNS tree structure at a specified time step or create a new one if it doesn't exist. @@ -323,12 +313,12 @@ def get_topological_dim( return base_node[1][0] def get_physical_dim( - self, base_name: Optional[str] = None, time: Optional[float] = None + self, base: Optional[str] = None, time: Optional[float] = None ) -> int: """Get the physical dimension of a base node at a specific time. Args: - base_name (str, optional): The name of the base node for which to retrieve the topological dimension. Defaults to None. + base (str, optional): The name of the base node for which to retrieve the topological dimension. Defaults to None. time (float, optional): The time at which to retrieve the topological dimension. Defaults to None. Raises: @@ -337,10 +327,10 @@ def get_physical_dim( Returns: int: The topological dimension of the specified base node at the given time. """ - base_node = self.get_base(base_name, time) + base_node = self.get_base(base, time) if base_node is None: # pragma: no cover raise ValueError( - f"there is no base called {base_name} at the time {time} in this sample" + f"there is no base called {base} at the time {time} in this sample" ) return base_node[1][1] @@ -554,7 +544,7 @@ def del_zone(self, zone_name: str, base_name: str, time: float) -> CGNSTree: if time not in self.data: raise KeyError(f"There is no CGNS tree for time {time}.") - zone_node = self.get_zone(zone_name=zone_name, base_name=base_name, time=time) + zone_node = self.get_zone(zone=zone_name, base=base_name, time=time) mesh_tree = self.data[time] if zone_node is None: @@ -623,8 +613,8 @@ def has_zone( def get_zone( self, - zone_name: Optional[str] = None, - base_name: Optional[str] = None, + zone: Optional[str] = None, + base: Optional[str] = None, time: Optional[float] = None, ) -> CGNSNode: """Retrieve a CGNS Zone node by its name within a specific Base and time. @@ -638,23 +628,23 @@ def get_zone( CGNSNode: Returns a CGNS Zone node if found; otherwise, returns None. 
""" # get_base will look for default base_name and time - base_node = self.get_base(base_name, time) + base_node = self.get_base(base, time) if base_node is None: # logger.warning(f"No base with name {base_name} in this tree") return None # _zone_attribution will look for default base_name - zone_name = self.resolve_zone(zone_name, base_name, time) - if zone_name is None: + zone = self.resolve_zone(zone, base, time) + if zone is None: # logger.warning(f"No zone with name {zone_name} in this base ({base_name})") return None - return CGU.getNodeByPath(base_node, zone_name) + return CGU.getNodeByPath(base_node, zone) def get_zone_type( self, - zone_name: Optional[str] = None, - base_name: Optional[str] = None, + zone: Optional[str] = None, + base: Optional[str] = None, time: Optional[float] = None, ) -> str: """Get the type of a specific zone at a specified time. @@ -671,19 +661,19 @@ def get_zone_type( str: The type of the specified zone as a string. """ # get_zone will look for default base_name, zone_name and time - zone_node = self.get_zone(zone_name=zone_name, base_name=base_name, time=time) + zone_node = self.get_zone(zone=zone, base=base, time=time) if zone_node is None: raise KeyError( - f"there is no base/zone <{base_name}/{zone_name}>, you should first create one with `Sample.init_zone({zone_name=},{base_name=})`" + f"there is no base/zone <{base}/{zone}>, you should first create one with `Sample.init_zone({zone=},{base=})`" ) return CGU.getValueByPath(zone_node, "ZoneType").tobytes().decode() # -------------------------------------------------------------------------# def get_nodal_tags( self, - zone_name: Optional[str] = None, - base_name: Optional[str] = None, + zone: Optional[str] = None, + base: Optional[str] = None, time: Optional[float] = None, ) -> dict[str, Array]: """Get the nodal tags for a specified base and zone at a given time. @@ -698,7 +688,7 @@ def get_nodal_tags( The NumPy arrays have shape (num_nodal_tags). """ # get_zone will look for default base_name, zone_name and time - zone_node = self.get_zone(zone_name=zone_name, base_name=base_name, time=time) + zone_node = self.get_zone(zone=zone, base=base, time=time) if zone_node is None: return {} @@ -867,9 +857,10 @@ def get_global_names(self, time: Optional[float] = None) -> list[str]: # -------------------------------------------------------------------------# def get_nodes( self, - zone_name: Optional[str] = None, - base_name: Optional[str] = None, + zone: Optional[str] = None, + base: Optional[str] = None, time: Optional[float] = None, + name: Optional[str] = None, ) -> Optional[Array]: """Get grid node coordinates from a specified base, zone, and time. @@ -884,12 +875,9 @@ def get_nodes( Returns: Optional[Array]: A NumPy array containing the grid node coordinates. If no matching zone or grid coordinates are found, None is returned. - - Seealso: - This function can also be called using `get_points()` or `get_vertices()`. 
""" # get_zone will look for default base_name, zone_name and time - search_node = self.get_zone(zone_name=zone_name, base_name=base_name, time=time) + search_node = self.get_zone(zone=zone, base=base, time=time) if search_node is None: return None @@ -897,6 +885,16 @@ def get_nodes( grid_paths = CGU.getAllNodesByTypeSet(search_node, ["GridCoordinates_t"]) if len(grid_paths) == 1: grid_node = CGU.getNodeByPath(search_node, grid_paths[0]) + + if name == "CoordinateX": + return CGU.getValueByPath(grid_node, "GridCoordinates/CoordinateX") + elif name == "CoordinateY": + return CGU.getValueByPath(grid_node, "GridCoordinates/CoordinateY") + elif name == "CoordinateZ": + return CGU.getValueByPath(grid_node, "GridCoordinates/CoordinateZ") + elif name is not None: + raise ValueError(f"Unknown coordinate name: {name}") + array_x = CGU.getValueByPath(grid_node, "GridCoordinates/CoordinateX") array_y = CGU.getValueByPath(grid_node, "GridCoordinates/CoordinateY") array_z = CGU.getValueByPath(grid_node, "GridCoordinates/CoordinateZ") @@ -919,14 +917,12 @@ def get_nodes( f"Found {len(grid_paths)} nodes, should find only one" ) - get_points = get_nodes - get_vertices = get_nodes def set_nodes( self, nodes: Array, - zone_name: Optional[str] = None, - base_name: Optional[str] = None, + zone: Optional[str] = None, + base: Optional[str] = None, time: Optional[float] = None, ) -> None: """Set the coordinates of nodes for a specified base and zone at a given time. @@ -940,16 +936,13 @@ def set_nodes( Raises: KeyError: Raised if the specified base or zone do not exist. You should first create the base and zone using the `Sample.init_zone(zone_name,base_name)` method. - - Seealso: - This function can also be called using `set_points()` or `set_vertices()` """ # get_zone will look for default base_name, zone_name and time - zone_node = self.get_zone(zone_name=zone_name, base_name=base_name, time=time) + zone_node = self.get_zone(zone=zone, base=base, time=time) if zone_node is None: raise KeyError( - f"there is no base/zone <{base_name}/{zone_name}>, you should first create one with `Sample.init_zone({zone_name=},{base_name=})`" + f"there is no base/zone <{base}/{zone}>, you should first create one with `Sample.init_zone({zone=},{base=})`" ) # Check if GridCoordinates_t node exists @@ -972,29 +965,30 @@ def set_nodes( # Create new coordinate CGL.newCoordinates(zone_node, name, np.asfortranarray(nodes[..., i_dim])) - set_points = set_nodes - set_vertices = set_nodes - # -------------------------------------------------------------------------# def get_elements( self, - zone_name: Optional[str] = None, - base_name: Optional[str] = None, + zone: Optional[str] = None, + base: Optional[str] = None, time: Optional[float] = None, ) -> dict[str, Array]: """Retrieve element connectivity data for a specified zone, base, and time. Args: - zone_name (str, optional): The name of the zone for which element connectivity data is requested. Defaults to None, indicating the default zone. - base_name (str, optional): The name of the base for which element connectivity data is requested. Defaults to None, indicating the default base. + zone (str, optional): The name of the zone for which element + connectivity data is requested. Defaults to None, indicating the + default zone. + base (str, optional): The name of the base for which element + connectivity data is requested. Defaults to None, indicating the + default base. time (float, optional): The time at which element connectivity data is requested. 
If a specific time is not provided, the method will display the tree structure for the default time step. Returns: dict[str, Array]: A dictionary where keys are element type names and values are NumPy arrays representing the element connectivity data. The NumPy arrays have shape (num_elements, num_nodes_per_element), and element indices are 0-based. """ - # get_zone will look for default base_name, zone_name and time - zone_node = self.get_zone(zone_name=zone_name, base_name=base_name, time=time) + # get_zone will look for default base, zone and time + zone_node = self.get_zone(zone=zone, base=base, time=time) if zone_node is None: return {} @@ -1024,8 +1018,8 @@ def get_elements( def get_field_names( self, location: str = None, - zone_name: Optional[str] = None, - base_name: Optional[str] = None, + zone: Optional[str] = None, + base: Optional[str] = None, time: Optional[float] = None, ) -> list[str]: """Get a set of field names associated with a specified zone, base, location, and/or time. @@ -1035,8 +1029,10 @@ def get_field_names( Args: location (str, optional): The desired grid location where to search for. Defaults to None. Possible values : :py:const:`plaid.constants.CGNS_FIELD_LOCATIONS` - zone_name (str, optional): The name of the zone to search for. Defaults to None. - base_name (str, optional): The name of the base to search for. Defaults to None. + zone (str, optional): The name of the zone to search for. + Defaults to None. + base (str, optional): The name of the base to search for. + Defaults to None. time (float, optional): The specific time at which to search for. Defaults to None. Returns: @@ -1044,12 +1040,10 @@ def get_field_names( """ def get_field_names_one_time_base_zone_location( - location: str, zone_name: str, base_name: str, time: float + location: str, zone: str, base: str, time: float ) -> list[str]: - # get_zone will look for default zone_name, base_name, time - search_node = self.get_zone( - zone_name=zone_name, base_name=base_name, time=time - ) + # get_zone will look for default zone, base, time + search_node = self.get_zone(zone=zone, base=base, time=time) if search_node is None: # pragma: no cover return [] @@ -1073,26 +1067,22 @@ def get_field_names_one_time_base_zone_location( field_names = [] times = [time] if time is not None else self.get_all_time_values() for _time in times: - base_names = ( - [base_name] - if base_name is not None - else self.get_base_names(time=_time) - ) - for _base_name in base_names: - zone_names = ( - [zone_name] - if zone_name is not None - else self.get_zone_names(base_name=_base_name, time=_time) + bases = [base] if base is not None else self.get_base_names(time=_time) + for _base in bases: + zones = ( + [zone] + if zone is not None + else self.get_zone_names(base_name=_base, time=_time) ) - for _zone_name in zone_names: + for _zone in zones: locations = ( [location] if location is not None else CGNS_FIELD_LOCATIONS ) for _location in locations: field_names += get_field_names_one_time_base_zone_location( location=_location, - zone_name=_zone_name, - base_name=_base_name, + zone=_zone, + base=_base, time=_time, ) @@ -1104,10 +1094,10 @@ def get_field( self, name: str, location: str = "Vertex", - zone_name: Optional[str] = None, - base_name: Optional[str] = None, + zone: Optional[str] = None, + base: Optional[str] = None, time: Optional[float] = None, - ) -> Field: + ) -> np.ndarray: """Retrieve a field with a specified name from a given zone, base, location, and time. 
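The accessor renames above (`zone_name`/`base_name` → `zone`/`base`) and the new `name` argument of `get_nodes` can be exercised as in this sketch; `sample` is assumed to already hold a populated CGNS tree, and `"Base"`/`"Zone"` are placeholder names:

```python
xyz = sample.get_nodes(zone="Zone", base="Base", time=0.0)                   # (n_nodes, dim) coordinates
x = sample.features.get_nodes(zone="Zone", base="Base", name="CoordinateX")  # a single coordinate axis
names = sample.get_field_names(location="Vertex", zone="Zone", base="Base")
pressure = sample.get_field("pressure", zone="Zone", base="Base")
```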
Args: @@ -1122,7 +1112,7 @@ def get_field( Field: A set containing the names of the fields that match the specified criteria. """ # get_zone will look for default time - search_node = self.get_zone(zone_name=zone_name, base_name=base_name, time=time) + search_node = self.get_zone(zone=zone, base=base, time=time) if search_node is None: return None @@ -1149,10 +1139,10 @@ def get_field( def add_field( self, name: str, - field: Field, + field: np.ndarray, location: str = "Vertex", - zone_name: Optional[str] = None, - base_name: Optional[str] = None, + zone: Optional[str] = None, + base: Optional[str] = None, time: Optional[float] = None, warning_overwrite: bool = True, ) -> None: @@ -1182,11 +1172,11 @@ def add_field( # init_tree will look for default time self.init_tree(time) # get_zone will look for default zone_name, base_name and time - zone_node = self.get_zone(zone_name=zone_name, base_name=base_name, time=time) + zone_node = self.get_zone(zone=zone, base=base, time=time) if zone_node is None: raise KeyError( - f"there is no Zone with name {zone_name} in base {base_name}. Did you check topological and physical dimensions ?" + f"there is no Zone with name {zone} in base {base}. Did you check topological and physical dimensions ?" ) # Check field size consistency with its geometrical support @@ -1255,8 +1245,8 @@ def del_field( self, name: str, location: str = "Vertex", - zone_name: Optional[str] = None, - base_name: Optional[str] = None, + zone: Optional[str] = None, + base: Optional[str] = None, time: Optional[float] = None, ) -> CGNSTree: """Delete a field with specified name in the mesh. @@ -1265,8 +1255,8 @@ def del_field( name (str): The name of the field to be deleted. location (str, optional): The grid location where the field is stored. Defaults to 'Vertex'. Possible values : :py:const:`plaid.constants.CGNS_FIELD_LOCATIONS` - zone_name (str, optional): The name of the zone from which the field will be deleted. Defaults to None. - base_name (str, optional): The name of the base where the zone is located. Defaults to None. + zone (str, optional): The name of the zone from which the field will be deleted. Defaults to None. + base (str, optional): The name of the base where the zone is located. Defaults to None. time (float, optional): The time associated with the field. Defaults to None. Raises: @@ -1276,14 +1266,12 @@ def del_field( CGNSTree: The tree at the provided time (without the deleted node) """ # get_zone will look for default zone_name, base_name, and time - zone_node = self.get_zone(zone_name=zone_name, base_name=base_name, time=time) + zone_node = self.get_zone(zone=zone, base=base, time=time) time = self.resolve_time(time) mesh_tree = self.data[time] if zone_node is None: - raise KeyError( - f"There is no Zone with name {zone_name} in base {base_name}." - ) + raise KeyError(f"There is no Zone with name {zone} in base {base}.") solution_paths = CGU.getPathsByTypeSet(zone_node, [CGK.FlowSolution_t]) diff --git a/src/plaid/containers/managers/default_manager.py b/src/plaid/containers/managers/default_manager.py index e2d6bd39..427e6280 100644 --- a/src/plaid/containers/managers/default_manager.py +++ b/src/plaid/containers/managers/default_manager.py @@ -76,7 +76,7 @@ def set_default_time(self, time: float) -> None: Note: - Setting the default time is important for synchronizing operations with a specific time point in the system's data. - - The available mesh times can be obtained using the `get_all_mesh_times` method. 
+ - The available mesh times can be obtained using the `get_all_time_values` method. Example: .. code-block:: python diff --git a/src/plaid/containers/sample.py b/src/plaid/containers/sample.py index c3913ce3..21799ce9 100644 --- a/src/plaid/containers/sample.py +++ b/src/plaid/containers/sample.py @@ -1,14 +1,9 @@ """Implementation of the `Sample` container.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - # %% Imports import sys +from typing import TYPE_CHECKING + +from ..types.common import ArrayDType if sys.version_info >= (3, 11): from typing import Self @@ -17,7 +12,6 @@ Self = TypeVar("Self") -import copy import logging import pickle import shutil @@ -31,21 +25,14 @@ from pydantic import BaseModel, ConfigDict, PrivateAttr from pydantic import Field as PydanticField -from plaid.constants import ( - AUTHORIZED_FEATURE_INFOS, +from ..constants import ( AUTHORIZED_FEATURE_TYPES, + AUTHORIZED_FEATURE_TYPES_T, CGNS_FIELD_LOCATIONS, ) -from plaid.containers.feature_identifier import FeatureIdentifier -from plaid.containers.features import SampleFeatures -from plaid.containers.utils import get_feature_type_and_details_from -from plaid.types import ( - Feature, - Scalar, -) -from plaid.utils import cgns_helper as CGH -from plaid.utils.base import delegate_methods, safe_len -from plaid.utils.deprecation import deprecated +from ..utils import cgns_helper as CGH +from ..utils.base import delegate_methods, safe_len +from .features import SampleFeatures logger = logging.getLogger(__name__) @@ -58,7 +45,6 @@ "resolve_zone", "set_default_time", "get_all_time_values", - "get_all_mesh_times", "get_tree", "get_base_names", "get_zone_names", @@ -81,6 +67,7 @@ "init_tree", "add_tree", "del_tree", + "set_trees", ] @@ -96,6 +83,7 @@ class Sample(BaseModel): """ # Pydantic configuration + # TODO(FB) check why arbitrary_types_allowed is needed, and if it can be removed model_config = ConfigDict( arbitrary_types_allowed=True, revalidate_instances="always", extra="forbid" ) @@ -106,8 +94,8 @@ class Sample(BaseModel): description="Path to the folder containing the sample data. If provided, the sample will be loaded from this path during initialization. Defaults to None.", ) - features: Optional[SampleFeatures] = PydanticField( - default_factory=lambda _: SampleFeatures(data=None), + features: SampleFeatures = PydanticField( + default_factory=lambda: SampleFeatures(data=None), description="An instance of SampleFeatures containing mesh data. Defaults to an empty `SampleFeatures` object.", ) @@ -135,117 +123,9 @@ def copy(self) -> Self: # pyright: ignore[reportIncompatibleMethodOverride] """ return self.model_copy(deep=True) - def get_scalar(self, name: str) -> Optional[Scalar]: - """Retrieve a scalar value associated with the given name. - - Args: - name (str): The name of the scalar value to retrieve. - - Returns: - Scalar or None: The scalar value associated with the given name, or None if the name is not found. - """ - return self.features.get_global(name) - - def add_scalar(self, name: str, value: Scalar) -> None: - """Add a scalar value to a dictionary. - - Args: - name (str): The name of the scalar value. - value (Scalar): The scalar value to add or update in the dictionary. - """ - self.features.add_global(name, value) - - def del_scalar(self, name: str) -> Scalar: - """Delete a scalar value from the dictionary. - - Args: - name (str): The name of the scalar value to be deleted. 
- - Raises: - KeyError: Raised when there is no scalar / there is no scalar with the provided name. - - Returns: - Scalar: The value of the deleted scalar. - """ - return self.features.del_global(name) - - def get_scalar_names(self) -> list[str]: - """Get a set of scalar names available in the object. - - Returns: - list[str]: A set containing the names of the available scalars. - """ - return self.features.get_global_names() - - # -------------------------------------------------------------------------# - - def del_all_fields( - self, - ) -> Self: - """Delete alls field from sample, while keeping geometrical info. - - Returns: - Sample: The sample with deleted fields - """ - all_features_identifiers = self.get_all_features_identifiers() - # Delete all fields in the sample - for feat_id in all_features_identifiers: - if feat_id["type"] == "field": - self.del_field( - name=feat_id["name"], - location=feat_id["location"], - zone_name=feat_id["zone_name"], - base_name=feat_id["base_name"], - time=feat_id["time"], - ) - return self - - # -------------------------------------------------------------------------# - def get_all_features_identifiers( - self, - ) -> list[FeatureIdentifier]: - """Get all features identifiers from the sample. - - Returns: - list[FeatureIdentifier]: A list of dictionaries containing the identifiers of all features in the sample. - """ - all_features_identifiers = [] - for sn in self.get_scalar_names(): - all_features_identifiers.append({"type": "scalar", "name": sn}) - for t in self.features.get_all_time_values(): - for bn in self.features.get_base_names(time=t): - for zn in self.features.get_zone_names(base_name=bn, time=t): - if ( - self.features.get_nodes(base_name=bn, zone_name=zn, time=t) - is not None - ): - all_features_identifiers.append( - { - "type": "nodes", - "base_name": bn, - "zone_name": zn, - "time": t, - } - ) - for loc in CGNS_FIELD_LOCATIONS: - for fn in self.features.get_field_names( - location=loc, zone_name=zn, base_name=bn, time=t - ): - all_features_identifiers.append( - { - "type": "field", - "name": fn, - "base_name": bn, - "zone_name": zn, - "location": loc, - "time": t, - } - ) - return all_features_identifiers - def get_all_features_identifiers_by_type( - self, feature_type: str - ) -> list[FeatureIdentifier]: + self, feature_type: AUTHORIZED_FEATURE_TYPES_T + ) -> list[str]: """Get all features identifiers of a given type from the sample. Args: @@ -255,15 +135,22 @@ def get_all_features_identifiers_by_type( list[FeatureIdentifier]: A list of dictionaries containing the identifiers of a given type of all features in the sample. """ assert feature_type in AUTHORIZED_FEATURE_TYPES, "feature_type not known" - all_features_identifiers = self.get_all_features_identifiers() - return [ - feat_id - for feat_id in all_features_identifiers - if feat_id["type"] == feature_type - ] + if feature_type == "scalar": + return self.get_global_names() + elif feature_type == "field": + return self.get_field_names() + elif feature_type == "nodes": + return [ + "Coordinate" + n + for _, n in zip( + range(self.features.get_physical_dim()), ["X", "Y", "Z"] + ) + ] - def get_feature_by_path(self, path: str, time: Optional[int] = None) -> Feature: - """Retrieve a feature value from the sample's CGNS mesh using a CGNS-style path. + def get_feature_by_path( + self, path: str, time: Optional[int] = None + ) -> np.number | np.ndarray | None: + """Retrieve a feature value from the sample's CGNS mesh using a CGNS-style url. 
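A short sketch of the trimmed introspection above, assuming the delegated global accessors (`add_global`, `get_global`) behave as they are used elsewhere in this diff; the scalar helpers they replace have been removed:

```python
sample.add_global("Mach", 0.3)            # replaces the removed add_scalar
mach = sample.get_global("Mach")          # replaces the removed get_scalar
scalars = sample.get_all_features_identifiers_by_type("scalar")  # now a plain list of names
coords = sample.get_all_features_identifiers_by_type("nodes")    # e.g. ["CoordinateX", "CoordinateY"]
```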
Args: path (str): CGNS node path relative to the mesh root (for example @@ -284,149 +171,28 @@ def get_feature_by_path(self, path: str, time: Optional[int] = None) -> Feature: time = self.features.resolve_time(time) return CGU.getValueByPath(self.get_tree(time), path) - def get_feature_from_string_identifier( - self, feature_string_identifier: str - ) -> Feature: - """Retrieve a specific feature from its encoded string identifier. - - The `feature_string_identifier` must follow the format: - ":://.../" - - Supported feature types: - - "scalar": expects 1 detail → `scalars.get(name)` - - "field": up to 5 details → `get_field(name, base_name, zone_name, location, time)` - - "nodes": up to 3 details → `get_nodes(base_name, zone_name, time)` - - Args: - feature_string_identifier (str): Structured identifier of the feature. - - Returns: - Feature: The retrieved feature object. - - Raises: - AssertionError: If `feature_type` is unknown. - - Warnings: - - If "time" is present in a field/nodes identifier, it is cast to float. - - `name` is required for scalar and field features. - - The order of the details must be respected. One cannot specify a detail in the feature_string_identifier string without specified the previous ones. - """ - splitted_identifier = feature_string_identifier.split("::") - - feature_type = splitted_identifier[0] - feature_details = [detail for detail in splitted_identifier[1].split("/")] - - assert feature_type in AUTHORIZED_FEATURE_TYPES, "feature_type not known" - - arg_names = AUTHORIZED_FEATURE_INFOS[feature_type] - assert len(arg_names) >= len(feature_details), "Too much details provided" - - if feature_type == "scalar": - val = self.get_scalar(feature_details[0]) - if val is None: - raise KeyError( - f"Unknown scalar {feature_details[0]}" - ) # pragma: no cover - return val - elif feature_type == "field": - kwargs = {arg_names[i]: detail for i, detail in enumerate(feature_details)} - for k in kwargs: - if kwargs[k] == "": - kwargs[k] = None - if "time" in kwargs: - kwargs["time"] = float(kwargs["time"]) - return self.get_field(**kwargs) - elif feature_type == "nodes": - kwargs = {arg_names[i]: detail for i, detail in enumerate(feature_details)} - for k in kwargs: - if kwargs[k] == "": - kwargs[k] = None - if "time" in kwargs: - kwargs["time"] = float(kwargs["time"]) - return self.get_nodes(**kwargs).flatten() - - def get_feature_from_identifier( - self, feature_identifier: FeatureIdentifier - ) -> Feature: - """Retrieve a feature object based on a structured identifier dictionary. 
- - The `feature_identifier` must include a `"type"` key specifying the feature kind: - - `"scalar"` → calls `scalars.get(name)` - - `"field"` → calls `get_field(name, base_name, zone_name, location, time)` - - `"nodes"` → calls `get_nodes(base_name, zone_name, time)` + # def get_feature_from_details(self, feature_details): + # #TODO (FB) dont know if this is useful - Required keys: - - `"type"`: one of `"scalar"`, `"field"`, or `"nodes"` - - `"name"`: required for all types except `"nodes"` + # my_type = feature_details.pop("type", None) + # my_sub_type = feature_details.pop("sub_type", None) + # if my_type == 'coordinate': + # print(feature_details) + # return self.get_nodes(**feature_details).flatten() + # elif my_type == 'field': + # print(feature_details) + # return self.get_field(**feature_details) + # elif my_type == 'global': + # return self.get_global(**feature_details) - Optional keys depending on type: - - `"base_name"`, `"zone_name"`, `"location"`, `"time"`: used in `"field"` and `"nodes"` - - Any omitted optional keys will rely on the default values mechanics of the class. - - Args: - feature_identifier ( dict[str:Union[str, float]]): - A dictionary encoding the feature type and its relevant parameters. - - Returns: - Feature: The corresponding feature instance retrieved via the appropriate accessor. - """ - feature_type, feature_details = get_feature_type_and_details_from( - feature_identifier - ) - - if feature_type == "scalar": - return self.get_scalar(**feature_details) - elif feature_type == "field": - return self.get_field(**feature_details) - elif feature_type == "nodes": - return self.get_nodes(**feature_details).flatten() - - def get_features_from_identifiers( - self, feature_identifiers: list[FeatureIdentifier] - ) -> list[Feature]: - """Retrieve features based on a list of structured identifier dictionaries. - - Elements of `feature_identifiers` must include a `"type"` key specifying the feature kind: - - `"scalar"` → calls `scalars.get(name)` - - `"field"` → calls `get_field(name, base_name, zone_name, location, time)` - - `"nodes"` → calls `get_nodes(base_name, zone_name, time)` - - Required keys: - - `"type"`: one of `"scalar"`, `"field"`, or `"nodes"` - - `"name"`: required for all types except `"nodes"` - - Optional keys depending on type: - - `"base_name"`, `"zone_name"`, `"location"`, `"time"`: used in `"field"` and `"nodes"` - - Any omitted optional keys will rely on the default values mechanics of the class. - - Args: - feature_identifiers (list[FeatureIdentifier]): - A dictionary encoding the feature type and its relevant parameters. - - Returns: - list[Feature]: List of corresponding feature instance retrieved via the appropriate accessor. - """ - all_features_info = [ - get_feature_type_and_details_from(feat_id) - for feat_id in feature_identifiers - ] - - features = [] - for feature_type, feature_details in all_features_info: - if feature_type == "scalar": - features.append(self.get_scalar(**feature_details)) - elif feature_type == "field": - features.append(self.get_field(**feature_details)) - elif feature_type == "nodes": - features.append(self.get_nodes(**feature_details).flatten()) - return features + # print(feature_details) + # print(my_type) + # raise RuntimeError(f"Cant find a featurn for the given details") def add_feature( self, - feature_identifier: FeatureIdentifier, - feature: Feature, + feature_path: str, + feature: ArrayDType, ) -> Self: """Add a feature to current sample. 
@@ -443,61 +209,77 @@ def add_feature( Raises: AssertionError: If types are inconsistent or identifiers contain unexpected keys. """ - feature_type, feature_details = get_feature_type_and_details_from( - feature_identifier - ) + # feature_type, feature_details = get_feature_type_and_details_from( + # feature_identifier + # ) - if feature_type == "scalar": + from .utils import get_feature_details_from_path + + feature_details = get_feature_details_from_path(feature_path) + + feature_type = feature_details.pop("type") + _ = feature_details.pop("sub_type", None) + + if feature_type == "global": if safe_len(feature) == 1: feature = feature[0] - self.add_scalar(**feature_details, value=feature) + self.add_global(**feature_details, global_array=feature) elif feature_type == "field": self.add_field(**feature_details, field=feature, warning_overwrite=False) - elif feature_type == "nodes": + elif feature_type == "coordinate": + if feature_details.get("name", None) is not None: # pragma: no cover + raise ValueError("Must set the 3 coordinate at the same time") physical_dim_arg = { - k: v for k, v in feature_details.items() if k in ["base_name", "time"] + k: v for k, v in feature_details.items() if k in ["base", "time"] } phys_dim = self.features.get_physical_dim(**physical_dim_arg) self.set_nodes(**feature_details, nodes=feature.reshape((-1, phys_dim))) + else: # pragma: no cover + print(feature_details) + raise RuntimeError(f"feature_type not allowed : {feature_type}") return self - def del_feature( - self, - feature_identifier: FeatureIdentifier, - ) -> Self: - """Remove a feature from current sample. - - This method applies updates to scalars, time series, fields, or nodes using feature identifiers. + def del_feature_by_path( + self, path: str, time: Optional[float] = None + ) -> np.number | np.ndarray | None: + """Retrieve a feature value from the sample's CGNS mesh using a CGNS-style url. Args: - feature_identifier (dict): A feature identifier. + path (str): CGNS node path relative to the mesh root (for example + "BaseName/ZoneName/GridCoordinates/CoordinateX" or + "BaseName/ZoneName/Solution/FieldName"). + time (Optional[int], optional): Time selection for the mesh. If an integer, + it is interpreted via the SampleFeatures time-assignment logic + (see SampleFeatures.resolve_time). If None, the default time + assignment is used. Defaults to None. Returns: - Self: The updated sample + Feature: The value stored at the given CGNS path. This may be a numpy array, a scalar, or None if the node has no value. - Raises: - AssertionError: If types are inconsistent or identifiers contain unexpected keys. + Note: + - This is a thin wrapper around CGNS.PAT.cgnsutils.getValueByPath and Sample.get_tree(time). Callers should handle a returned None when the path or value does not exist. + - For field-like features, prefer using Sample.get_field which applies additional validation and selection logic. 
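The path-based feature API above could be used as sketched below; the CGNS-style paths follow the docstring examples in this diff, while `field_array` and the exact path layout accepted by `add_feature`'s parser are assumptions:

```python
import numpy as np

field_array = np.zeros(100)  # hypothetical field values

x = sample.get_feature_by_path("BaseName/ZoneName/GridCoordinates/CoordinateX")
sample.add_feature("BaseName/ZoneName/Solution/FieldName", field_array)
tree = sample.del_feature_by_path("BaseName/ZoneName/Solution/FieldName")  # returns the pruned CGNS tree
```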
""" - feature_type, feature_details = get_feature_type_and_details_from( - feature_identifier - ) + time = self.features.resolve_time(time) +# return CGU.getValueByPath(self.get_tree(time), path) - if feature_type == "scalar": - self.del_scalar(**feature_details) - elif feature_type == "field": - self.del_field(**feature_details) - elif feature_type == "nodes": - raise NotImplementedError("Deleting node features is not implemented.") + updated_tree = None + node = CGU.getNodeByPath(self.get_tree(time), path) + if node is not None: + updated_tree = CGU.nodeDelete(self.get_tree(time), node) + + # If the function reaches here, the field was not found + if updated_tree is None: + raise KeyError(f"There is no field with name {name} in the specified zone.") + + return updated_tree - return self - def update_features_from_identifier( + def update_features_by_path( self, - feature_identifiers: dict[ - int, Union[FeatureIdentifier, list[FeatureIdentifier]] - ], - features: Union[Feature, list[Feature]], + feature_identifiers: dict[int, Union[str, list[str]]], + features: Union[ArrayDType, list[ArrayDType]], in_place: bool = False, ) -> Self: """Update one or several features of the sample by their identifier(s). @@ -524,7 +306,7 @@ def update_features_from_identifier( features = [features] assert len(feature_identifiers) == len(features) for i_id, feat_id in enumerate(feature_identifiers): - feature_identifiers[i_id] = FeatureIdentifier(feat_id) + feature_identifiers[i_id] = str(feat_id) sample = self if in_place else self.copy() @@ -533,123 +315,97 @@ def update_features_from_identifier( return sample - def extract_sample_from_identifier( - self, - feature_identifiers: Union[FeatureIdentifier, list[FeatureIdentifier]], - ) -> Self: - """Extract features of the sample by their identifier(s) and return a new sample containing these features. + # def extract_sample_by_path( + # self, + # feature_identifiers: Union[str, list[str]], + # ) -> Self: + # """Extract features of the sample by their identifier(s) and return a new sample containing these features. - This method applies updates to scalars, fields, or nodes - using feature identifiers + # This method applies updates to scalars, fields, or nodes + # using feature identifiers - Args: - feature_identifiers (dict or list of dict): One or more feature identifiers. + # Args: + # feature_identifiers (dict or list of dict): One or more feature identifiers. - Returns: - Self: New sample containing the provided feature identifiers + # Returns: + # Self: New sample containing the provided feature identifiers - Raises: - AssertionError: If types are inconsistent or identifiers contain unexpected keys. - """ - assert isinstance(feature_identifiers, dict) or isinstance( - feature_identifiers, list - ), "Check types of feature_identifiers argument" - if isinstance(feature_identifiers, dict): - feature_identifiers = [feature_identifiers] + # Raises: + # AssertionError: If types are inconsistent or identifiers contain unexpected keys. 
+ # """ + # assert isinstance(feature_identifiers, dict) or isinstance( + # feature_identifiers, list + # ), "Check types of feature_identifiers argument" + # if isinstance(feature_identifiers, dict): + # feature_identifiers = [feature_identifiers] - source_sample = self.copy() - source_sample.del_all_fields() + # source_sample = self.copy() + # source_sample.del_all_fields() - sample = Sample() + # sample = Sample() - for feat_id in feature_identifiers: - feature = self.get_feature_from_identifier(feat_id) + # for feat_id in feature_identifiers: + # feature = self.get_feature_from_identifier(feat_id) - if feature is not None: - # get time of current feature - time = self.features.resolve_time(time=feat_id.get("time")) + # if feature is not None: + # # get time of current feature + # time = self.features.resolve_time(time=feat_id.get("time")) - # if the constructed sample does not have a tree, add the one from the source sample, with no field - if len(sample.features.get_base_names(time=time)) == 0: - sample.features.add_tree(source_sample.features.get_tree(time)) - for name in sample.features.get_global_names(time=time): - sample.features.del_global(name, time) + # # if the constructed sample does not have a tree, add the one from the source sample, with no field + # if len(sample.features.get_base_names(time=time)) == 0: + # sample.features.add_tree(source_sample.features.get_tree(time)) + # for name in sample.features.get_global_names(time=time): + # sample.features.del_global(name, time) - sample.add_feature(feat_id, feature) + # sample.add_feature(feat_id, feature) - sample._extra_data = copy.deepcopy(self._extra_data) + # sample._extra_data = copy.deepcopy(self._extra_data) - return sample + # return sample - @deprecated( - "`Dataset.from_features_identifier(...)` is deprecated, use instead `Dataset.extract_sample_from_identifier(...)`", - version="0.1.8", - removal="0.2", - ) - def from_features_identifier( - self, - feature_identifiers: Union[FeatureIdentifier, list[FeatureIdentifier]], - ) -> Self: - """DEPRECATED: Use :meth:`Dataset.extract_sample_from_identifier` instead.""" - return self.extract_sample_from_identifier( - feature_identifiers - ) # pragma: no cover - - def merge_features(self, sample: Self, in_place: bool = False) -> Self: - """Merge features from another sample into the current sample. - - This method applies updates to scalars, fields, or nodes - using features from another sample. When `in_place=False`, a deep copy of the sample is created - before applying updates, ensuring full isolation from the original. + # def merge_features(self, sample: Self, in_place: bool = False) -> Self: + # """Merge features from another sample into the current sample. - Args: - sample (Sample): The sample from which features will be merged. - in_place (bool, optional): If True, modifies the current sample in place. - If False, returns a deep copy with updated features. + # This method applies updates to scalars, fields, or nodes + # using features from another sample. When `in_place=False`, a deep copy of the sample is created + # before applying updates, ensuring full isolation from the original. - Returns: - Self: The updated sample (either the current instance or a new copy). - """ - merged_dataset = self if in_place else self.copy() + # Args: + # sample (Sample): The sample from which features will be merged. + # in_place (bool, optional): If True, modifies the current sample in place. + # If False, returns a deep copy with updated features. 
- all_features_identifiers = sample.get_all_features_identifiers() - all_features = sample.get_features_from_identifiers(all_features_identifiers) + # Returns: + # Self: The updated sample (either the current instance or a new copy). + # """ + # merged_dataset = self if in_place else self.copy() - feature_types = set([feat_id["type"] for feat_id in all_features_identifiers]) + # all_features_identifiers = sample.get_all_features_identifiers() + # all_features = sample.get_features_from_identifiers(all_features_identifiers) - # if field or node features are to extract, copy the source sample and delete all fields - if "field" in feature_types or "nodes" in feature_types: - source_sample = sample.copy() - source_sample.del_all_fields() + # feature_types = set([feat_id["type"] for feat_id in all_features_identifiers]) - # DELETE LATER IF CONFIRMED THIS IS NOT NEEDED (WITH GLOBAL, THERE IS ALWAYS A TREE) - # for feat_id in all_features_identifiers: - # # if trying to add a field or nodes, must check if the corresponding tree exists, and add it if not - # if feat_id["type"] in ["field", "nodes"]: - # # get time of current feature - # time = sample.features.resolve_time(time=feat_id.get("time")) + # # if field or node features are to extract, copy the source sample and delete all fields + # if "field" in feature_types or "nodes" in feature_types: + # source_sample = sample.copy() + # source_sample.del_all_fields() - # # if the constructed sample does not have a tree, add the one from the source sample, with no field - # if not merged_dataset.features.get_tree(time): - # merged_dataset.features.add_tree(source_sample.get_tree(time)) + # # DELETE LATER IF CONFIRMED THIS IS NOT NEEDED (WITH GLOBAL, THERE IS ALWAYS A TREE) + # # for feat_id in all_features_identifiers: + # # # if trying to add a field or nodes, must check if the corresponding tree exists, and add it if not + # # if feat_id["type"] in ["field", "nodes"]: + # # # get time of current feature + # # time = sample.features.resolve_time(time=feat_id.get("time")) - return merged_dataset.update_features_from_identifier( - feature_identifiers=all_features_identifiers, - features=all_features, - in_place=in_place, - ) + # # # if the constructed sample does not have a tree, add the one from the source sample, with no field + # # if not merged_dataset.features.get_tree(time): + # # merged_dataset.features.add_tree(source_sample.get_tree(time)) - # -------------------------------------------------------------------------# - @deprecated( - "`Sample.save(...)` is deprecated, use instead `Sample.save_to_dir(...)`", - version="0.1.8", - removal="0.2", - ) - def save( - self, path: Union[str, Path], overwrite: bool = False, memory_safe: bool = False - ) -> None: - """DEPRECATED: use :meth:`Sample.save_to_dir` instead.""" - self.save_to_dir(path, overwrite=overwrite, memory_safe=memory_safe) + # return merged_dataset.update_features_by_path( + # feature_identifiers=all_features_identifiers, + # features=all_features, + # in_place=in_place, + # ) # -------------------------------------------------------------------------# def save_to_dir( @@ -760,48 +516,6 @@ def load(self, path: Union[str, Path]) -> None: (self.features.data[time],) = (tree,) - old_scalars_file = path / "scalars.csv" - if old_scalars_file.is_file(): - self._load_old_scalars(old_scalars_file) - - old_time_series_files = list(path.glob("time_series_*.csv")) - if len(old_time_series_files) > 0: - self._load_old_time_series(old_time_series_files) - - @deprecated( - reason="This Sample was 
written with plaid<=0.1.9, save it with plaid>=0.1.10 to have all features embedded in the CGNS tree", - version="0.1.10", - removal="0.2.0", - ) - def _load_old_scalars(self, scalars_file: Path): - names = np.loadtxt(scalars_file, dtype=str, max_rows=1, delimiter=",").reshape( - (-1,) - ) - scalars = np.loadtxt( - scalars_file, dtype=float, skiprows=1, delimiter="," - ).reshape((-1,)) - for name, value in zip(names, scalars): - self.add_scalar(name, value) - - @deprecated( - reason="This Sample was written with plaid<=0.1.9, save it with plaid>=0.1.10 to have all features embedded in the CGNS tree", - version="0.1.10", - removal="0.2.0", - ) - def _load_old_time_series(self, time_series_files: list[Path]): - for ts_fname in time_series_files: - names = np.loadtxt(ts_fname, dtype=str, max_rows=1, delimiter=",").reshape( - (-1,) - ) - assert names[0] == "t" - times_and_val = np.loadtxt(ts_fname, dtype=float, skiprows=1, delimiter=",") - for i in range(times_and_val.shape[0]): - self.add_global( - name=names[1], - global_array=times_and_val[i, 1], - time=times_and_val[i, 0], - ) - # # -------------------------------------------------------------------------# def __str__(self) -> str: """Return a string representation of the sample. @@ -831,7 +545,7 @@ def __str__(self) -> str: for location in CGNS_FIELD_LOCATIONS: field_names = field_names.union( self.features.get_field_names( - location=location, zone_name=zn, base_name=bn, time=time + location=location, zone=zn, base=bn, time=time ) ) nb_fields = len(field_names) @@ -887,11 +601,11 @@ def summarize(self) -> str: summary += "=" * 50 + "\n" # Scalars with names - scalar_names = self.get_scalar_names() + scalar_names = self.get_global_names() if scalar_names: summary += f"Scalars ({len(scalar_names)}):\n" for name in scalar_names: - value = self.get_scalar(name) + value = self.get_global(name) summary += f" - {name}: {value}\n" summary += "\n" @@ -911,12 +625,12 @@ def summarize(self) -> str: summary += f" Zone: {zone_name}\n" # Nodes, nodal tags and fields at verticies nodes = self.get_nodes( - zone_name=zone_name, base_name=base_name, time=time + zone=zone_name, base=base_name, time=time ) if nodes is not None: nb_nodes = nodes.shape[0] nodal_tags = self.features.get_nodal_tags( - zone_name=zone_name, base_name=base_name, time=time + zone=zone_name, base=base_name, time=time ) summary += f" Nodes ({nb_nodes})\n" if len(nodal_tags) > 0: @@ -925,8 +639,8 @@ def summarize(self) -> str: for location in CGNS_FIELD_LOCATIONS: field_names = self.get_field_names( location=location, - zone_name=zone_name, - base_name=base_name, + zone=zone_name, + base=base_name, time=time, ) if field_names: @@ -934,7 +648,7 @@ def summarize(self) -> str: # Elements and fields at elements elements = self.features.get_elements( - zone_name=zone_name, base_name=base_name, time=time + zone=zone_name, base=base_name, time=time ) summary += f" Elements ({sum([v.shape[0] for v in elements.values()])})\n" if len(elements) > 0: @@ -962,7 +676,7 @@ def check_completeness(self) -> str: report += "=" * 30 + "\n" # Check if sample has basic features - has_scalars = len(self.get_scalar_names()) > 0 + has_scalars = len(self.get_global_names()) > 0 has_meshes = len(self.features.get_all_time_values()) > 0 report += f"Has scalars: {has_scalars}\n" @@ -981,8 +695,8 @@ def check_completeness(self) -> str: for location in CGNS_FIELD_LOCATIONS: field_names = self.get_field_names( location=location, - zone_name=zone_name, - base_name=base_name, + zone=zone_name, + base=base_name, 
time=time, ) total_fields.update(field_names) @@ -992,3 +706,9 @@ def check_completeness(self) -> str: report += f"Field names: {', '.join(sorted(total_fields))}\n" return report + + +if TYPE_CHECKING: + # Inheriting from the Protocol (SampleFeatures) inside TYPE_CHECKING + # automatically adds all its methods to Sample's autocomplete. + class Sample(Sample, SampleFeatures): ... diff --git a/src/plaid/containers/utils.py b/src/plaid/containers/utils.py index 3e77e685..c8b1f5d4 100644 --- a/src/plaid/containers/utils.py +++ b/src/plaid/containers/utils.py @@ -1,29 +1,15 @@ """Utility functions for PLAID containers.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - # %% Imports from pathlib import Path -from typing import Any, Optional, Union +from typing import Optional, Union import CGNS.PAT.cgnsutils as CGU import numpy as np -from plaid.constants import ( - AUTHORIZED_FEATURE_INFOS, - AUTHORIZED_FEATURE_TYPES, +from ..constants import ( CGNS_FIELD_LOCATIONS, - REQUIRED_INFOS_KEYS, ) -from plaid.containers.feature_identifier import FeatureIdentifier -from plaid.types import Feature -from plaid.utils.base import safe_len path_to_location = {f"{loc}Fields": loc for loc in CGNS_FIELD_LOCATIONS} retrocompatibility = { @@ -50,7 +36,7 @@ def _check_names(names: Union[str, list[Optional[str]], None]): for name in names: if (name is not None) and ("/" in name): raise ValueError( - f"feature_names containing `/` are not allowed, but {name=}, you should first replace any occurence of `/` with something else, for example: `name.replace('/','__')`" + f"feature_names containing `/` are not allowed, but {name=}, you should first replace any occurrence of `/` with something else, for example: `name.replace('/','__')`" ) if (name is not None) and (len(name) > 32): raise ValueError( @@ -143,11 +129,7 @@ def get_sample_ids(savedir: Union[str, Path]) -> list[int]: """ savedir = Path(savedir) return sorted( - [ - int(d.stem.split("_")[-1]) - for d in (savedir / "samples").glob("sample_*") - if d.is_dir() - ] + [int(d.stem.split("_")[-1]) for d in (savedir).glob("sample_*") if d.is_dir()] ) @@ -163,144 +145,6 @@ def get_number_of_samples(savedir: Union[str, Path]) -> int: return len(get_sample_ids(savedir)) -def get_feature_type_and_details_from( - feature_identifier: FeatureIdentifier, -) -> tuple[str, FeatureIdentifier]: - """Extract and validate the feature type and its associated metadata from a feature identifier. - - This utility function ensures that the `feature_identifier` dictionary contains a valid - "type" key (e.g., "scalar", "field", "node") and returns the type along - with the remaining identifier keys, which are specific to the feature type. - - Args: - feature_identifier (dict): A dictionary with a "type" key, and - other keys (some optional) depending on the feature type. For example: - - {"type": "scalar", "name": "Mach"} - - {"type": "field", "name": "pressure"} - - {"type": "field", "name": "pressure", "time":0.} - - {"type": "nodes", "base_name": "Base_2_2"} - - Returns: - tuple[str, dict]: A tuple `(feature_type, feature_details)` where: - - `feature_type` is the value of the "type" key (e.g., "scalar"). - - `feature_details` is a dictionary of the remaining keys. - - Raises: - AssertionError: - - If "type" is missing. - - If the type is not in `AUTHORIZED_FEATURE_TYPES`. - - If any unexpected keys are present for the given type. 
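The relaxed glob in `get_sample_ids` above changes the expected on-disk layout: `sample_*` folders now sit directly under the save directory instead of under a `samples/` subfolder. A minimal sketch (the path is illustrative):

```python
from plaid.containers.utils import get_number_of_samples, get_sample_ids

ids = get_sample_ids("out/my_dataset/train")        # e.g. [0, 1, 2, ...]
n = get_number_of_samples("out/my_dataset/train")   # == len(ids)
```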
- """ - assert "type" in feature_identifier, ( - "feature type not specified in feature_identifier" - ) - feature_type = feature_identifier["type"] - feature_details = feature_identifier.copy() - feature_type = feature_details.pop("type") - - assert feature_type in AUTHORIZED_FEATURE_TYPES, ( - f"feature type {feature_type} not known" - ) - - assert all( - key in AUTHORIZED_FEATURE_INFOS[feature_type] for key in feature_details - ), ( - f"Unexpected key(s) in feature_identifier {feature_details=} | {feature_type=} -> {AUTHORIZED_FEATURE_INFOS[feature_type]}" - ) - - return feature_type, feature_details - - -def check_features_type_homogeneity( - feature_identifiers: list[FeatureIdentifier], -) -> None: - """Check type homogeneity of features, for tabular conversion. - - Args: - feature_identifiers (list[dict]): dict with a "type" key, and - other keys (some optional) depending on the feature type. For example: - - {"type": "scalar", "name": "Mach"} - - {"type": "field", "name": "pressure"} - - Raises: - AssertionError: if types are not consistent - """ - assert feature_identifiers and isinstance(feature_identifiers, list), ( - "feature_identifiers must be a non-empty list" - ) - feat_type = feature_identifiers[0]["type"] - for i, feat_id in enumerate(feature_identifiers): - assert feat_id["type"] in AUTHORIZED_FEATURE_TYPES, "feature type not known" - assert feat_id["type"] == feat_type, ( - f"Inconsistent feature types: {i}-th feature type is {feat_id['type']}, while the first one is {feat_type}" - ) - - -def check_features_size_homogeneity( - feature_identifiers: list[FeatureIdentifier], - features: dict[int, list[Feature]], -) -> int: - """Check size homogeneity of features, for tabular conversion. - - Size homogeneity is check through samples for each feature, and through features for each sample. - To be converted to tabular data, each sample must have the same number of features and each feature - must have the same dimension - - Args: - feature_identifiers (list[dict]): dict with a "type" key, and - other keys (some optional) depending on the feature type. For example: - - {"type": "scalar", "name": "Mach"} - - {"type": "field", "name": "pressure"} - features (dict): dict with sample index as keys and one or more features as values. - - Returns: - int: the common feature dimension - - Raises: - AssertionError: if sizes are not consistent - """ - features_values = list(features.values()) - nb_samples = len(features_values) - nb_features = len(feature_identifiers) - for i in range(nb_features): - name_feature = feature_identifiers[i].get("name", "nodes") - size = safe_len(features_values[0][i]) - for j in range(nb_samples): - size_j = safe_len(features_values[j][i]) - assert size_j == size, ( - f"Inconsistent feature sizes for feature {i} (name {name_feature}): has size {size_j} in sample {j}, while having size {size} in sample 0" - ) - - for j in range(nb_samples): - size = safe_len(features_values[j][0]) - for i in range(nb_features): - name_feature = feature_identifiers[i].get("name", "nodes") - size_i = safe_len(features_values[j][i]) - assert size_i == size, ( - f"Inconsistent feature sizes in sample {j}: feature {i} (name {name_feature}) size {size_i}, while feature 0 (name {feature_identifiers[0]['name']}) is of size {size}" - ) - return size - - -def has_duplicates_feature_ids(feature_identifiers: list[FeatureIdentifier]): - """Check whether a list of feature identifier contains duplicates. 
- - Args: - feature_identifiers (list[FeatureIdentifier]): - A list of dictionaries representing feature identifiers. - - Returns: - bool: True if a duplicate is found in the list, False otherwise. - """ - seen = set() - for d in feature_identifiers: - frozen = frozenset(d.items()) - if frozen in seen: - return True - seen.add(frozen) - return False - - def get_feature_details_from_path(path: str) -> dict[str, str]: """Retrieve semantic details from a CGNS-style path.""" split_path = path.split("/") @@ -356,10 +200,11 @@ def get_feature_details_from_path(path: str) -> dict[str, str]: # ---------------------- # Grid coordinates # ---------------------- - if node == "GridCoordinates" and len(split_path) == 4: + if node == "GridCoordinates" and len(split_path) >= 3 and len(split_path) <= 4: feat["type"] = "coordinate" feat["sub_type"] = "node" - feat["name"] = split_path[3] + if len(split_path) == 4: + feat["name"] = split_path[3] return feat # ---------------------- @@ -411,30 +256,3 @@ def get_feature_details_from_path(path: str) -> dict[str, str]: return feat -def validate_required_infos(infos: dict[str, Any]) -> None: - """Validate that required infos categories and keys are present. - - Args: - infos: Dataset infos dictionary loaded from disk. - - Raises: - ValueError: If a required infos category or key is missing. - """ - assert isinstance(infos, dict) - - missing_entries = [] - - for category, required_keys in REQUIRED_INFOS_KEYS.items(): - category_infos = infos.get(category) - assert isinstance(category_infos, dict) - - for key in required_keys: - if key not in category_infos: - missing_entries.append(f"{category}.{key}") - - if missing_entries: - raise ValueError( - "Missing required infos entries: " - + ", ".join(sorted(missing_entries)) - + f". Required entries are defined by {REQUIRED_INFOS_KEYS!r}." - ) diff --git a/src/plaid/examples/__init__.py b/src/plaid/examples/__init__.py index 1f8291fc..706f7b3f 100644 --- a/src/plaid/examples/__init__.py +++ b/src/plaid/examples/__init__.py @@ -1,12 +1,4 @@ """Examples for PLAID objects.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - from plaid.examples.config import _HF_REPOS AVAILABLE_EXAMPLES = list(_HF_REPOS.keys()) diff --git a/src/plaid/examples/config.py b/src/plaid/examples/config.py index 80f86563..99038dde 100644 --- a/src/plaid/examples/config.py +++ b/src/plaid/examples/config.py @@ -1,19 +1,11 @@ """Config for PLAID examples.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. 
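With the relaxed length check in `get_feature_details_from_path` above, a `GridCoordinates` path may omit the trailing coordinate name; the sketch below shows both forms (any keys besides `type`, `sub_type`, and `name` come from parts of the parser not shown here):

```python
from plaid.containers.utils import get_feature_details_from_path

get_feature_details_from_path("Base/Zone/GridCoordinates")              # type="coordinate", no name
get_feature_details_from_path("Base/Zone/GridCoordinates/CoordinateX")  # type="coordinate", name="CoordinateX"
```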
-# -# - _HF_REPOS = { -'vki_ls59':'PLAID-datasets/VKI-LS59', -'elastoplastodynamics':'PLAID-datasets/2D_ElastoPlastoDynamics', -'multiscale_hyperelasticity':'PLAID-datasets/2D_Multiscale_Hyperelasticity', -'tensile2d':'PLAID-datasets/Tensile2d', -'rotor37':'PLAID-datasets/Rotor37', -'profile2d':'PLAID-datasets/2D_profile', +'vki_ls59':'PhysArena/VKI-LS59', +'elastoplastodynamics':'PhysArena/2D_ElastoPlastoDynamics', +'multiscale_hyperelasticity':'PhysArena/2D_Multiscale_Hyperelasticity', +'tensile2d':'PhysArena/Tensile2d', +'rotor37':'PhysArena/Rotor37', +'profile2d':'2D_profile/2D_profile', 'airfrans_clipped':'PLAID-datasets/AirfRANS_clipped', 'airfrans_original':'PLAID-datasets/AirfRANS_original', 'airfrans_remeshed':'PLAID-datasets/AirfRANS_remeshed', diff --git a/src/plaid/examples/dataset.py b/src/plaid/examples/dataset.py index 55f9f7fa..38ec83de 100644 --- a/src/plaid/examples/dataset.py +++ b/src/plaid/examples/dataset.py @@ -1,13 +1,5 @@ """Examples for PLAID `Dataset` objects.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# from plaid import Dataset -from plaid.bridges.huggingface_bridge import load_dataset_from_hub, binary_to_plaid_sample from plaid.examples.config import _HF_REPOS @@ -40,12 +32,15 @@ def _load_dataset( return self._cache[ex_name] try: - ds_stream = load_dataset_from_hub(hf_repo, split="all_samples", streaming=True) + from plaid.storage import init_streaming_from_hub + datasetdict, converterdict = init_streaming_from_hub(hf_repo) samples = [] - for _ in range(2): - hf_sample = next(iter(ds_stream)) - samples.append(binary_to_plaid_sample(hf_sample)) - dataset = Dataset(samples=samples) + for k in converterdict.keys(): + hf_sample = next(iter(datasetdict[k])) + plaid_sample = converterdict[k].sample_to_plaid(hf_sample) + samples.append(plaid_sample) + dataset = Dataset() + dataset.get_backend().set_sample(samples) self._cache[ex_name] = dataset return dataset except Exception as e: # pragma: no cover diff --git a/src/plaid/examples/sample.py b/src/plaid/examples/sample.py index 0d4b775e..6311e2bd 100644 --- a/src/plaid/examples/sample.py +++ b/src/plaid/examples/sample.py @@ -1,11 +1,4 @@ """Examples for PLAID `Sample` objects.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# from plaid import Sample from plaid.examples.dataset import datasets from plaid.examples.config import _HF_REPOS diff --git a/src/plaid/pipelines/__init__.py b/src/plaid/pipelines/__init__.py deleted file mode 100644 index c16a04bf..00000000 --- a/src/plaid/pipelines/__init__.py +++ /dev/null @@ -1,8 +0,0 @@ -"""Package for PLAID containers such as `Dataset` and `Sample`.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# diff --git a/src/plaid/pipelines/plaid_blocks.py b/src/plaid/pipelines/plaid_blocks.py deleted file mode 100644 index 29e3602b..00000000 --- a/src/plaid/pipelines/plaid_blocks.py +++ /dev/null @@ -1,325 +0,0 @@ -"""Custom meta-estimators for applying feature-wise and target-wise transformations. - -Includes: - -- PlaidTransformedTargetRegressor: transforms the target before fitting. - -- PlaidColumnTransformer: applies transformers to feature subsets like ColumnTransformer. 
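A sketch of the streaming-based example loading now used in `plaid/examples/dataset.py` above; the repository id mirrors the updated `_HF_REPOS` mapping, and the exact return types of `init_streaming_from_hub` are inferred from this diff:

```python
from plaid import Dataset
from plaid.storage import init_streaming_from_hub

datasetdict, converterdict = init_streaming_from_hub("PhysArena/VKI-LS59")

samples = []
for split, converter in converterdict.items():
    hf_sample = next(iter(datasetdict[split]))        # first streamed record of this split
    samples.append(converter.sample_to_plaid(hf_sample))

dataset = Dataset()
dataset.get_backend().set_sample(samples)             # populate the in-memory view
```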
-""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - -import copy -import sys -from typing import Union - -if sys.version_info >= (3, 11): - from typing import Self -else: # pragma: no cover - from typing import TypeVar - - Self = TypeVar("Self") - - -import numpy as np -from sklearn.base import BaseEstimator, RegressorMixin, TransformerMixin, clone -from sklearn.compose import ColumnTransformer as SklearnColumnTransformer -from sklearn.pipeline import Pipeline -from sklearn.utils.validation import check_is_fitted - -from plaid import Dataset -from plaid.containers.utils import has_duplicates_feature_ids - - -class ColumnTransformer(SklearnColumnTransformer): - """Custom column-wise transformer for PLAID-style datasets. - - Similar to scikit-learn's `ColumnTransformer`, this class applies a list - of transformer blocks to subsets of features, defined by their feature - identifiers. Additionally, it preserves a set of remainder features that - bypass transformation. - - Args: - plaid_transformers: A list of tuples - (name, transformer), where each `transformer` is a TransformerMixin. - - Note: - At fit, it is checked that `plaid_transformers` share no in_features_identifiers and no out_features_identifiers. - """ - - def __init__( - self, - plaid_transformers: list[tuple[str, Union[TransformerMixin, Pipeline]]], - ): - self.plaid_transformers = plaid_transformers - - super().__init__( - [(name, transformer, "_") for name, transformer in plaid_transformers] - ) - - def fit(self, dataset: Dataset, _y=None) -> Self: - """Fits all transformers on their corresponding feature subsets. - - Args: - dataset: A `Dataset` object or a list of samples. - y: Ignored. Present for API compatibility. - - Returns: - self: The fitted PlaidColumnTransformer. - """ - if isinstance(dataset, list): - dataset = Dataset(samples=dataset) - - self.in_features_identifiers_ = [] - for _, transformer in self.plaid_transformers: - in_feat_id = ( - transformer[0].in_features_identifiers - if isinstance(transformer, Pipeline) - else transformer.in_features_identifiers - ) - self.in_features_identifiers_ += copy.deepcopy(in_feat_id) - - assert not has_duplicates_feature_ids(self.in_features_identifiers_), ( - "Identical in_features_identifiers found among provided transformer: not compatible with PlaidColumnTransformer." - ) - - self.plaid_transformers_ = [ - (copy.deepcopy(name), clone(transformer)) - for name, transformer in self.plaid_transformers - ] - - self.transformers_ = [] - for name, transformer in self.plaid_transformers_: - in_feat_id = ( - transformer[0].in_features_identifiers - if isinstance(transformer, Pipeline) - else transformer.in_features_identifiers - ) - sub_dataset = dataset.extract_dataset_from_identifier(in_feat_id) - transformer_ = clone(transformer).fit(sub_dataset) - self.transformers_.append((name, transformer_, "_")) - - self.out_features_identifiers_ = [] - for _, transformer, _ in self.transformers_: - out_feat_id = ( - transformer[-1].out_features_identifiers_ - if isinstance(transformer, Pipeline) - else transformer.out_features_identifiers_ - ) - self.out_features_identifiers_ += copy.deepcopy(out_feat_id) - - assert not has_duplicates_feature_ids(self.out_features_identifiers_), ( - "Identical out_features_identifiers found among provided transformer: not compatible with PlaidColumnTransformer." 
- ) - - return self - - def transform(self, dataset: Dataset) -> Dataset: - """Applies fitted transformers to feature subsets and merges results. - - Args: - dataset: A `Dataset` object or a list of samples. - - Returns: - Dataset: A new `Dataset` with transformed feature blocks, including - untransformed remainder features. - """ - check_is_fitted(self, "transformers_") - if isinstance(dataset, list): - dataset = Dataset(samples=dataset) - - transformed_datasets = [dataset.copy()] - for _, transformer_, _ in self.transformers_: - in_feat_id = ( - transformer_[0].in_features_identifiers_ - if isinstance(transformer_, Pipeline) - else transformer_.in_features_identifiers_ - ) - sub_dataset = dataset.extract_dataset_from_identifier(in_feat_id) - transformed = transformer_.transform(sub_dataset) - transformed_datasets.append(transformed) - return Dataset.merge_dataset_by_features(transformed_datasets) - - def fit_transform(self, dataset: Dataset, y=None) -> Dataset: - """Fits all transformers and returns the combined transformed dataset. - - Args: - dataset: A `Dataset` object or a list of samples. - y: Ignored. Present for API compatibility. - - Returns: - Dataset: A new `Dataset` with transformed features. - """ - return self.fit(dataset, y).transform(dataset) - - def inverse_transform(self, dataset: Dataset) -> Dataset: - """Applies fitted inverse transformers to feature subsets and merges results. - - Args: - dataset: A `Dataset` object or a list of samples. - - Returns: - Dataset: A new `Dataset` with inverse transformed feature blocks, including - untransformed remainder features. - """ - check_is_fitted(self, "transformers_") - if isinstance(dataset, list): - dataset = Dataset(samples=dataset) - - transformed_datasets = [dataset.copy()] - for _, transformer_, _ in self.transformers_: - in_feat_id = ( - transformer_[-1].out_features_identifiers_ - if isinstance(transformer_, Pipeline) - else transformer_.out_features_identifiers_ - ) - sub_dataset = dataset.extract_dataset_from_identifier(in_feat_id) - transformed = transformer_.inverse_transform(sub_dataset) - transformed_datasets.append(transformed) - return Dataset.merge_dataset_by_features(transformed_datasets) - - -class TransformedTargetRegressor(RegressorMixin, BaseEstimator): - """Meta-estimator that transforms the target before fit and inverses it at predict. - - This regressor is compatible with custom `Dataset` objects and supports - complex targets, including scalars and fields. It wraps a base regressor - and a transformer that is responsible for preprocessing the target space. - - Args: - regressor: A regressor implementing `fit` and `predict`, following the scikit-learn API. - transformer: A transformer implementing `fit`, `transform`, and `inverse_transform`. - Applied to the dataset before fitting the regressor. - """ - - def __init__( - self, - regressor: Union[RegressorMixin, Pipeline], - transformer: Union[TransformerMixin, Pipeline], - ): - self.regressor = regressor - self.transformer = transformer - - def fit(self, dataset: Dataset, _y=None) -> Self: - """Fits the transformer and the regressor on the transformed dataset. - - Args: - dataset: A `Dataset` object or a list of sample dictionaries. - Input training data. - y: Ignored. Present for API compatibility. - - Returns: - self: The fitted estimator. 
- """ - if isinstance(dataset, list): - dataset = Dataset(samples=dataset) - - self.transformer_ = clone(self.transformer).fit(dataset) - - transformed_dataset = self.transformer_.transform(dataset) - self.regressor_ = clone(self.regressor).fit(transformed_dataset) - - in_feat_id = ( - self.regressor_[0].in_features_identifiers_ - if isinstance(self.regressor_, Pipeline) - else self.regressor_.in_features_identifiers_ - ) - self.in_features_identifiers_ = copy.deepcopy(in_feat_id) - - out_feat_id = ( - self.transformer_[0].in_features_identifiers_ - if isinstance(self.transformer_, Pipeline) - else self.transformer_.in_features_identifiers_ - ) - self.out_features_identifiers_ = copy.deepcopy(out_feat_id) - - return self - - def predict(self, dataset: Dataset) -> Dataset: - """Predicts target values using the fitted regressor, then applies the inverse transformation. - - Args: - dataset: A `Dataset` object or a list of sample dictionaries. - Input data to predict on. - - Returns: - Dataset: A `Dataset` containing the inverse-transformed predictions. - """ - check_is_fitted(self, "regressor_") - if isinstance(dataset, list): - dataset = Dataset(samples=dataset) - dataset_pred_transformed = self.regressor_.predict(dataset) - return self.transformer_.inverse_transform(dataset_pred_transformed) - - def score(self, dataset_X: Dataset, dataset_y: Dataset = None) -> float: - """Computes a normalized root mean squared error (RMSE) score on the transformed targets. - - The score is defined as `1 - avg(relative RMSE)` over all target features in the - `transformer` input features identifiers. The error computation depends on the feature type: - - For "scalar" features: RMSE normalized by squared reference value. - - For "field" features: RMSE normalized by field size and max-norm of the reference. - - Args: - dataset_X: A `Dataset` object or a list of samples. - Input features used for prediction. - dataset_y: A `Dataset` object or list, optional. - Ground-truth targets. If `None`, `dataset_X` is used for both input and reference. - - Returns: - float: A score between `-inf` and `1`. A perfect prediction yields a score of `1.0`. - - Raises: - ValueError: If an unknown feature type is encountered. 
- """ - check_is_fitted(self, "regressor_") - if dataset_y is None: - dataset_y = dataset_X - if isinstance(dataset_X, list): - dataset_X = Dataset(samples=dataset_X) - if isinstance(dataset_y, list): - dataset_y = Dataset(samples=dataset_y) - - dataset_y_pred = self.predict(dataset_X) - - sample_ids = dataset_X.get_sample_ids() - - assert dataset_y.get_sample_ids() == sample_ids - - all_errors = [] - - for feat_id in self.out_features_identifiers_: - feature_type = feat_id["type"] - - reference = dataset_y.get_feature_from_identifier(feat_id) - prediction = dataset_y_pred.get_feature_from_identifier(feat_id) - - if feature_type == "scalar": - errors = 0.0 - for id in sample_ids: - if reference[id] != 0: - error = ((prediction[id] - reference[id]) ** 2) / ( - reference[id] ** 2 - ) - else: - error = (prediction[id] - reference[id]) ** 2 - errors += error - elif feature_type == "field": # pragma: no cover - errors = 0.0 - for id in sample_ids: - errors += (np.linalg.norm(prediction[id] - reference[id]) ** 2) / ( - reference[id].shape[0] - * np.linalg.norm(reference[id], ord=np.inf) ** 2 - ) - else: # pragma: no cover - raise ( - f"No score function implemented for feature type {feat_id['type']}" - ) - - all_errors.append(np.sqrt(errors / len(sample_ids))) - - return 1.0 - sum(all_errors) / len(self.out_features_identifiers_) diff --git a/src/plaid/pipelines/sklearn_block_wrappers.py b/src/plaid/pipelines/sklearn_block_wrappers.py deleted file mode 100644 index 3b5920a9..00000000 --- a/src/plaid/pipelines/sklearn_block_wrappers.py +++ /dev/null @@ -1,270 +0,0 @@ -"""Wrapped scikit-learn transformers and regressors for PLAID Dataset compatibility. - -Provides adapters to use scikit-learn estimators within the PLAID feature/block system: - -- WrappedPlaidSklearnTransformer: wraps a TransformerMixin - -- WrappedPlaidSklearnRegressor: wraps a RegressorMixin -""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - -import copy -import sys -from typing import Optional - -if sys.version_info >= (3, 11): - from typing import Self -else: # pragma: no cover - from typing import TypeVar - - Self = TypeVar("Self") - - -from sklearn.base import ( - BaseEstimator, - RegressorMixin, - TransformerMixin, - clone, -) -from sklearn.utils.validation import check_is_fitted - -from plaid import Dataset -from plaid.containers import FeatureIdentifier -from plaid.containers.utils import check_features_type_homogeneity -from plaid.types import Array, SklearnBlock - - -def get_2Darray_from_homogeneous_identifiers( - dataset: Dataset, features_identifiers: list[FeatureIdentifier] -) -> Array: - """Returns a 2D array from a Dataset and a feature id. - - The function calls `dataset.get_tabular_from_homogeneous_identifiers(...)`, then removes - either the second or third dimension if it has size 1, so that the output is 2D. - - Args: - dataset (Dataset): A Dataset object exposing `get_tabular_from_homogeneous_identifiers`. - features_identifiers (list[FeatureIdentifier]): a list of input feature identifiers. - - Returns: - A NumPy array of shape (n_samples, n_features). - - Raises: - AssertionError: If the number of features in the output does not match the identifiers. - ValueError: If both the second and third dimensions have size greater than 1. 
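The removed `TransformedTargetRegressor.score` shown above reduces to a closed form that is worth keeping on record if an equivalent score has to be reimplemented. With N samples, F output features, predictions ŷ and references y, it computes:

$$
\text{score} = 1 - \frac{1}{F}\sum_{f=1}^{F}\sqrt{\frac{1}{N}\sum_{i=1}^{N} e_{i,f}},
\qquad
e_{i,f} =
\begin{cases}
\dfrac{(\hat{y}_{i,f}-y_{i,f})^{2}}{y_{i,f}^{2}} & \text{scalar, } y_{i,f}\neq 0,\\[4pt]
(\hat{y}_{i,f}-y_{i,f})^{2} & \text{scalar, } y_{i,f}=0,\\[4pt]
\dfrac{\lVert \hat{y}_{i,f}-y_{i,f}\rVert_{2}^{2}}{n_{i}\,\lVert y_{i,f}\rVert_{\infty}^{2}} & \text{field,}
\end{cases}
$$

where \(n_i\) is the number of entries of the reference field for sample \(i\); a perfect prediction yields a score of 1.0.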
- """ - X = dataset.get_tabular_from_homogeneous_identifiers(features_identifiers) - # X is of size (nb_sample, nb_features, dim_features), either nb_features or dim_features should be 1 to be compatible with scikit-learn blocks - if X.shape[1] == 1: - X = X[:, 0, :] - elif X.shape[2] == 1: - X = X[:, :, 0] - else: - raise ValueError( - "X (generate by dataset.get_tabular_from_homogeneous_identifiers) is expected to have its second or third dimension equal to 1" - ) - - return X - - -class WrappedSklearnTransformer(TransformerMixin, BaseEstimator): - """Adapter for using a scikit-learn transformer on PLAID Datasets. - - Transforms tabular data extracted from homogeneous feature identifiers, - and returns results as a `Dataset`. Supports forward and inverse transforms. - - Args: - sklearn_block (SklearnBlock): A scikit-learn Transformer implementing fit/transform APIs. - in_features_identifiers (list[FeatureIdentifier]): List of feature identifiers to extract input data from. - out_features_identifiers (list[FeatureIdentifier], optional): List of feature identifiers used for outputs. If None, - defaults to `in_features_identifiers`. - """ - - # TODO: check if restrict_to_features=True can be used to reduce further memory consumption - def __init__( - self, - sklearn_block: SklearnBlock, - in_features_identifiers: list[FeatureIdentifier], - out_features_identifiers: Optional[list[FeatureIdentifier]] = None, - ): - self.sklearn_block = sklearn_block - self.in_features_identifiers = in_features_identifiers - self.out_features_identifiers = out_features_identifiers - - def fit(self, dataset: Dataset, _y=None) -> Self: - """Fits the underlying scikit-learn transformer on selected input features. - - Args: - dataset: A `Dataset` object containing the features to transform. - _y: Ignored. - - Returns: - self: The fitted transformer. - """ - self.in_features_identifiers_ = copy.deepcopy(self.in_features_identifiers) - check_features_type_homogeneity(self.in_features_identifiers_) - - if self.out_features_identifiers: - self.out_features_identifiers_ = copy.deepcopy( - self.out_features_identifiers - ) - check_features_type_homogeneity(self.out_features_identifiers_) - else: - self.out_features_identifiers_ = copy.deepcopy(self.in_features_identifiers) - - X = get_2Darray_from_homogeneous_identifiers( - dataset, self.in_features_identifiers_ - ) - - self.sklearn_block_ = clone(self.sklearn_block).fit(X, _y) - - return self - - def transform(self, dataset: Dataset) -> Dataset: - """Applies the fitted transformer to the selected input features. - - Args: - dataset: A `Dataset` object to transform. - - Returns: - Dataset: Transformed features wrapped as a new `Dataset`. - """ - check_is_fitted(self, "sklearn_block_") - - X = get_2Darray_from_homogeneous_identifiers( - dataset, self.in_features_identifiers_ - ) - - X_transformed = self.sklearn_block_.transform(X) - X_transformed = X_transformed.reshape( - (len(dataset), len(self.out_features_identifiers_), -1) - ) - - dataset_transformed = dataset.add_features_from_tabular( - X_transformed, self.out_features_identifiers_, restrict_to_features=False - ) - - return dataset_transformed - - def inverse_transform(self, dataset: Dataset) -> Dataset: - """Applies inverse transformation to the output features. - - Args: - dataset: A `Dataset` object with transformed output features. - - Returns: - Dataset: Dataset with inverse-transformed features. 
- """ - check_is_fitted(self, "sklearn_block_") - - X = get_2Darray_from_homogeneous_identifiers( - dataset, self.out_features_identifiers_ - ) - - X_inv_transformed = self.sklearn_block_.inverse_transform(X) - - X_inv_transformed = X_inv_transformed.reshape( - (len(dataset), len(self.in_features_identifiers_), -1) - ) - - dataset_inv_transformed = dataset.add_features_from_tabular( - X_inv_transformed, self.in_features_identifiers_, restrict_to_features=False - ) - - return dataset_inv_transformed - - -class WrappedSklearnRegressor(RegressorMixin, BaseEstimator): - """Adapter for using a scikit-learn regressor with PLAID Dataset. - - Fits and predicts on tabular arrays extracted from stacked features, - while preserving the feature/block structure expected by PLAID. - - Args: - sklearn_block: A scikit-learn regressor with fit/predict API. - in_features_identifiers: List of feature identifiers for inputs. - out_features_identifiers: List of feature identifiers for outputs. - """ - - # TODO: remove transform and inv tranf - - def __init__( - self, - sklearn_block: SklearnBlock, - in_features_identifiers: list[FeatureIdentifier], - out_features_identifiers: list[FeatureIdentifier], - ): - self.sklearn_block = sklearn_block - self.in_features_identifiers = in_features_identifiers - self.out_features_identifiers = out_features_identifiers - - def fit(self, dataset: Dataset, _y=None) -> Self: - """Fits the wrapped scikit-learn regressor on the stacked input/output data. - - Args: - dataset: A `Dataset` containing both input and output features. - _y: Ignored. - - Returns: - self: The fitted regressor. - """ - self.sklearn_block_ = clone(self.sklearn_block) - self.in_features_identifiers_ = self.in_features_identifiers.copy() - self.out_features_identifiers_ = self.out_features_identifiers.copy() - - X, _ = dataset.get_tabular_from_stacked_identifiers( - self.in_features_identifiers_ - ) - y, self.cumulated_feat_dims = dataset.get_tabular_from_stacked_identifiers( - self.out_features_identifiers_ - ) - - self.sklearn_block_.fit(X, y) - - return self - - def predict(self, dataset: Dataset) -> Dataset: - """Predicts target values using the fitted regressor. - - Args: - dataset: A `Dataset` with input features. - - Returns: - Dataset: A new `Dataset` containing predicted target features. - """ - check_is_fitted(self, "sklearn_block_") - - X, _ = dataset.get_tabular_from_stacked_identifiers( - self.in_features_identifiers_ - ) - - y = self.sklearn_block_.predict(X) - y = y.reshape((len(dataset), -1)) - - dataset_predicted = Dataset.merge_dataset_by_features( - [ - dataset.from_tabular( - y[ - :, - None, - self.cumulated_feat_dims[i_feat] : self.cumulated_feat_dims[ - i_feat + 1 - ], - ], - feature_identifiers=[feat_ids], - ) - for i_feat, feat_ids in enumerate(self.out_features_identifiers_) - ] - ) - # dataset_predicted = dataset.add_features_from_tabular( - # y, self.out_features_identifiers_, restrict_to_features=False - # ) - dataset_predicted = dataset.merge_features(dataset_predicted) - - return dataset_predicted diff --git a/src/plaid/post/__init__.py b/src/plaid/post/__init__.py deleted file mode 100644 index 4f44cacd..00000000 --- a/src/plaid/post/__init__.py +++ /dev/null @@ -1,8 +0,0 @@ -"""Package that implement post-processing utilities for PLAID.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. 
-# -# diff --git a/src/plaid/post/bisect.py b/src/plaid/post/bisect.py deleted file mode 100644 index 0832514d..00000000 --- a/src/plaid/post/bisect.py +++ /dev/null @@ -1,182 +0,0 @@ -"""Utiliy function to plot bisect graphs comparing predictions vs. targets dataset.""" - -import subprocess -from pathlib import Path -from typing import Union - -import matplotlib as mpl -import matplotlib.pyplot as plt -import numpy as np -from tqdm import tqdm - -from plaid import Dataset, ProblemDefinition - - -def prepare_datasets( - ref_dataset: Dataset, - pred_dataset: Dataset, - problem_definition: ProblemDefinition, - verbose: bool = False, -) -> tuple[dict, dict, list[str]]: - """Prepare datasets for comparison. - - Args: - ref_dataset (Dataset): The reference dataset. - pred_dataset (Dataset): The predicted dataset. - problem_definition (ProblemDefinition): The common problem for the reference and predicted dataset - verbose (bool, optional): Verbose mode. Defaults to False. - - Returns: - tuple[dict[str, list[float]], dict[str, list[float]], list[str]]: A tuple containing dictionaries of reference and predicted scalar values, and a list of scalar names. - """ - assert len(ref_dataset) == len(pred_dataset), ( - "Reference and predicted dataset lengths differ" - ) - ref_problem = ref_dataset.get_scalar_names() - pred_problem = pred_dataset.get_scalar_names() - assert ref_problem == pred_problem, "Reference and predicted dataset scalars differ" - - n_samples = len(ref_dataset) - out_scalars_names = problem_definition.get_output_scalars_names() - - ref_out_scalars = {} - pred_out_scalars = {} - - ref_out_scalars = {sname: [] for sname in out_scalars_names} - pred_out_scalars = {sname: [] for sname in out_scalars_names} - - for i_sample in tqdm(range(n_samples), disable=not (verbose)): - for sname in out_scalars_names: - ref = ref_dataset[i_sample].get_scalar(sname) - ref_out_scalars[sname].append(ref) - - pred = pred_dataset[i_sample].get_scalar(sname) - pred_out_scalars[sname].append(pred) - - return ref_out_scalars, pred_out_scalars, out_scalars_names - - -def is_dvipng_available(verbose: bool) -> bool: - """Check if dvipng is available on the system for matplotlib figures. - - Returns: - bool: True if dvipng is available, False otherwise. - """ - try: - subprocess.run( - ["dvipng", "--version"], - check=True, - stdout=subprocess.PIPE, - stderr=subprocess.PIPE, - ) - return True # pragma: no cover - except FileNotFoundError: # pragma: no cover - print( - "dvipng module not installed. Using the default matplotlib options instead" - ) if verbose else None - return False - - -def plot_bisect( - ref_dataset: Union[Dataset, str, Path], - pred_dataset: Union[Dataset, str, Path], - problem_def: Union[ProblemDefinition, str, Path], - scalar: Union[str, int], - save_file_name: str = "bissec_plots", - verbose: bool = False, -) -> None: - """Plot a bisect graph comparing predictions vs. targets dataset. - - Args: - ref_dataset (Dataset | str | Path): The reference dataset or its file path. - pred_dataset (Dataset | str | Path): The predicted dataset or its file path. - problem_def (ProblemDefinition | str | Path): The common problem for the reference and predicted dataset - scalar (str | int): The name of the scalar to study or its index. - save_file_name (str, optional): Figure name when saving to PNG format. Defaults to "bissec_plots". - verbose (bool, optional): Verbose mode. Defaults to False. - - Raises: - KeyError: If the provided scalar name is not part of the dataset. 
- """ - ### Transform path to Dataset object ### - if isinstance(ref_dataset, (str, Path)): - ref_dataset: Dataset = Dataset(ref_dataset) - if isinstance(pred_dataset, (str, Path)): - pred_dataset: Dataset = Dataset(pred_dataset) - if isinstance(problem_def, (str, Path)): - problem_def: ProblemDefinition = ProblemDefinition(problem_def) - - # Load the testing_set - # testing_set = problem_def.get_split("test") - - print("Data preprocessing...") if verbose else None - ref_out_scalars, pred_out_scalars, out_scalars_names = prepare_datasets( - ref_dataset, pred_dataset, problem_def, verbose - ) - - ### Transform string to index ### - if isinstance(scalar, str): - if scalar in out_scalars_names: - scalar: int = out_scalars_names.index(scalar) - else: - raise KeyError( - f"The scalar name provided ({scalar}) is not part of '{out_scalars_names = }'" - ) - - # Matplotlib plotting options - if is_dvipng_available(verbose): # pragma: no cover - plt.rcParams.update( - { - "text.usetex": True, - "font.family": "sans-serif", - "font.sans-serif": ["Helvetica"], - } - ) - mpl.style.use("seaborn-v0_8") - else: # pragma: no cover - mpl.rcParams.update(mpl.rcParamsDefault) - - fontsize = 32 - labelsize = 32 - markersize = 24 - markeredgewidth = 1 - - #### Bisect graph plot #### - print("Bisect graph construction...") if verbose else None - label = r"$\mathrm{Predictions~vs~Targets~for~" + out_scalars_names[scalar] + "}$" - fig, ax = plt.subplots(figsize=(2 * 6, 2 * 5.5)) - - ### Matplotlib instructions ### - y_true_dataset = np.array( - ref_out_scalars[out_scalars_names[scalar]] - ) # [testing_set] - y_pred_dataset = np.array( - pred_out_scalars[out_scalars_names[scalar]] - ) # [testing_set] - - m, M = np.min(y_true_dataset), np.max(y_true_dataset) - ax.plot(np.array([m, M]), np.array([m, M]), color="k") - - ax.plot( - y_true_dataset, - y_pred_dataset, - linestyle="", - color="b", - markerfacecolor="r", - markeredgecolor="b", - markeredgewidth=markeredgewidth, - marker=".", - markersize=markersize, - ) - - ax.tick_params(labelsize=labelsize) - ax.set_title(label, fontsize=fontsize) - - ax.set_ylabel(r"$\mathrm{Predictions}$", fontsize=fontsize) - ax.set_xlabel(r"$\mathrm{Targets}$", fontsize=fontsize) - - plt.tight_layout() - - print("Bisect graph saving...") if verbose else None - fig.savefig(f"{save_file_name}.png", dpi=300, format="png", bbox_inches="tight") - print("...Bisect plot done") if verbose else None diff --git a/src/plaid/post/metrics.py b/src/plaid/post/metrics.py deleted file mode 100644 index 6e3fee7e..00000000 --- a/src/plaid/post/metrics.py +++ /dev/null @@ -1,200 +0,0 @@ -"""Utility functions for computing and printing metrics for regression problems in PLAID.""" - -from pathlib import Path -from typing import Union - -import numpy as np -import yaml -from sklearn.metrics import r2_score -from tqdm import tqdm - -from plaid import Dataset, ProblemDefinition -from plaid.post.bisect import prepare_datasets - - -def compute_rRMSE_RMSE( - metrics: dict, - rel_SE_out_scalars: dict, - abs_SE_out_scalars: dict, - problem_split: dict, - out_scalars_names: list[str], -) -> None: - """Compute and print the relative Root Mean Square Error (rRMSE) for scalar outputs. - - Args: - metrics (dict): Dictionary to store the computed metrics. - rel_SE_out_scalars (dict): Dictionary containing relative squared errors for scalar outputs. - abs_SE_out_scalars (dict): Dictionary containing absolute squared errors for scalar outputs. - problem_split (dict): Dictionary specifying how the problem is split. 
- out_scalars_names (list[str]): List of names of scalar outputs. - """ - metrics["rRMSE for scalars"] = {} - metrics["RMSE for scalars"] = {} - - for split_name, _ in problem_split.items(): - metrics["rRMSE for scalars"][split_name] = {} - metrics["RMSE for scalars"][split_name] = {} - - for sname in out_scalars_names: - rRMSE_value = np.sqrt( - np.mean(rel_SE_out_scalars[split_name][sname], axis=0) - ) - out_string_rRMSE = "{:#.6g}".format(rRMSE_value) - - RMSE_value = np.sqrt(np.mean(abs_SE_out_scalars[split_name][sname], axis=0)) - out_string_RMSE = "{:#.6g}".format(RMSE_value) - - metrics["rRMSE for scalars"][split_name][sname] = float(out_string_rRMSE) - metrics["RMSE for scalars"][split_name][sname] = float(out_string_RMSE) - - -def compute_R2( - metrics: dict, - r2_out_scalars: dict, - problem_split: dict, - out_scalars_names: list[str], -) -> None: - """Compute and print the R-squared (R2) score for scalar outputs. - - Args: - metrics (dict): Dictionary to store the computed metrics. - r2_out_scalars (dict): Dictionary containing R2 scores for scalar outputs. - problem_split (dict): Dictionary specifying how the problem is split. - out_scalars_names (list[str]): List of names of scalar outputs. - """ - metrics["R2 for scalars"] = {} - - for split_name, _ in problem_split.items(): - metrics["R2 for scalars"][split_name] = {} - - for sname in out_scalars_names: - out_string = "{:#.6g}".format(r2_out_scalars[split_name][sname]) - metrics["R2 for scalars"][split_name][sname] = float(out_string) - - -def prepare_metrics_for_split( - ref_out_specific_scalars: np.ndarray, - pred_out_specific_scalars: np.ndarray, - split_indices: list[int], - rel_SE_out_specific_scalars: np.ndarray, - abs_SE_out_specific_scalars: np.ndarray, -) -> float: - """Prepare metrics for a specific split and compute the R-squared (R2) score. - - Args: - ref_out_specific_scalars (np.ndarray): Array of reference scalar outputs. - pred_out_specific_scalars (np.ndarray): Array of predicted scalar outputs. - split_indices (list[int]): List of indices specifying the split. - rel_SE_out_specific_scalars (np.ndarray): Array to store relative squared errors for scalar outputs. - abs_SE_out_specific_scalars (np.ndarray): Array to store absolute squared errors for scalar outputs. - - Returns: - float: R-squared (R2) score for the specific split. - """ - ref_scal = np.array([ref_out_specific_scalars[i] for i in split_indices]) - predict_scal = np.array([pred_out_specific_scalars[i] for i in split_indices]) - - diff = predict_scal - ref_scal - rel_SE_out_specific_scalars[:] = (diff / ref_scal) ** 2 - abs_SE_out_specific_scalars[:] = diff**2 - return r2_score(ref_scal, predict_scal) - - -def pretty_metrics(metrics: dict) -> None: - """Prints metrics information in a readable format (pretty print). - - Args: - metrics (dict): The metrics dictionary to print. 
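The deleted metrics helpers above (`prepare_metrics_for_split`, `compute_rRMSE_RMSE`, `compute_R2`) amount to three per-split, per-scalar quantities over the indices \(S\) of each split:

$$
\mathrm{rRMSE} = \sqrt{\frac{1}{|S|}\sum_{i\in S}\left(\frac{\hat{y}_i-y_i}{y_i}\right)^{2}},
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{|S|}\sum_{i\in S}\left(\hat{y}_i-y_i\right)^{2}},
$$

together with scikit-learn's `r2_score` on the same pairs; each value is formatted to six significant digits before being written to the `<save_file_name>.yaml` report.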
- """ - metrics_keys = list(metrics.keys()) - tf = "******************** \x1b[34;1mcomparision metrics\x1b[0m *******************\n" - for metric_key in metrics_keys: - tf += "\x1b[33;1m" + str(metric_key) + "\x1b[0m\n" - splits = list(metrics[metric_key].keys()) - for split in splits: - tf += " \x1b[32;1m" + str(split) + "\x1b[0m\n" - scalars = list(metrics[metric_key][split].keys()) - for scalar in scalars: - tf += ( - " \x1b[34;1m" - + str(scalar) - + "\x1b[0m: " - + str(metrics[metric_key][split][scalar]) - + "\n" - ) - tf += "************************************************************\n" - print(tf) - - -def compute_metrics( - ref_dataset: Union[Dataset, str, Path], - pred_dataset: Union[Dataset, str, Path], - problem: Union[ProblemDefinition, str, Path], - save_file_name: str = "test_metrics", - verbose: bool = False, -) -> None: - """Compute and save evaluation metrics for a given regression problem. - - Args: - ref_dataset (Dataset | str | Path): Reference dataset or path to a reference dataset. - pred_dataset (Dataset | str | Path): Predicted dataset or path to a predicted dataset. - problem (ProblemDefinition | str | Path): Problem definition or path to a problem definition. - save_file_name (str, optional): Name of the file to save the metrics. Defaults to "test_metrics". - verbose (bool, optional): If True, print detailed information during computation. - """ - ### Transform path to Dataset object ### - if isinstance(ref_dataset, (str, Path)): - ref_dataset: Dataset = Dataset(ref_dataset) - if isinstance(pred_dataset, (str, Path)): - pred_dataset: Dataset = Dataset(pred_dataset) - if isinstance(problem, (str, Path)): - problem: ProblemDefinition = ProblemDefinition(problem) - - ### Get important formated values ### - problem_split = problem.get_split() - ref_out_scalars, pred_out_scalars, out_scalars_names = prepare_datasets( - ref_dataset, pred_dataset, problem, verbose - ) - - rel_SE_out_scalars = {} - abs_SE_out_scalars = {} - r2_out_scalars = {} - - for split_name, split_indices in problem_split.items(): - rel_SE_out_scalars[split_name] = {} - abs_SE_out_scalars[split_name] = {} - r2_out_scalars[split_name] = {} - for sname in out_scalars_names: - rel_SE_out_scalars[split_name][sname] = np.empty(len(split_indices)) - abs_SE_out_scalars[split_name][sname] = np.empty(len(split_indices)) - r2_out_scalars[split_name][sname] = np.empty(1) - - print("Compute metrics for each regressor:") if verbose else None - for split_name, split_indices in tqdm(problem_split.items(), disable=not (verbose)): - for sname in out_scalars_names: - r2_out_scalars[split_name][sname] = prepare_metrics_for_split( - ref_out_scalars[sname], - pred_out_scalars[sname], - split_indices, - rel_SE_out_scalars[split_name][sname], - abs_SE_out_scalars[split_name][sname], - ) - - metrics = {} - compute_rRMSE_RMSE( - metrics, - rel_SE_out_scalars, - abs_SE_out_scalars, - problem_split, - out_scalars_names, - ) - - compute_R2(metrics, r2_out_scalars, problem_split, out_scalars_names) - - with open(f"{save_file_name}.yaml", "w") as file: - yaml.dump(metrics, file, default_flow_style=False, sort_keys=False) - - if verbose: - pretty_metrics(metrics) - - return metrics diff --git a/src/plaid/problem_definition.py b/src/plaid/problem_definition.py index 78d3c713..d15c78fc 100644 --- a/src/plaid/problem_definition.py +++ b/src/plaid/problem_definition.py @@ -1,1546 +1,277 @@ """Implementation of the `ProblemDefinition` class.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined 
in -# file 'LICENSE.txt', which is part of this source code package. -# -# - # %% Imports -import sys - -if sys.version_info >= (3, 11): - from typing import Self -else: # pragma: no cover - from typing import TypeVar - - Self = TypeVar("Self") - -import csv -import json -import logging -from pathlib import Path -from typing import Optional, Sequence, Union - -import yaml -from packaging.version import Version - -import plaid -from plaid.constants import AUTHORIZED_SCORE_FUNCTIONS, AUTHORIZED_TASKS -from plaid.containers import FeatureIdentifier -from plaid.types import IndexType -from plaid.utils.deprecation import deprecated - -# %% Globals - -logger = logging.getLogger(__name__) - -# %% Functions - -# %% Classes - - -class ProblemDefinition(object): - """Gathers all necessary informations to define a learning problem.""" - - def __init__( - self, - path: Optional[Union[str, Path]] = None, - directory_path: Optional[Union[str, Path]] = None, - ) -> None: - """Initialize an empty :class:`ProblemDefinition `. - - Use :meth:`add_inputs ` or :meth:`add_output_scalars_names ` to feed the :class:`ProblemDefinition` - - Args: - path (Union[str,Path], optional): The path from which to load PLAID problem definition files. - directory_path (Union[str,Path], optional): Deprecated, use `path` instead. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - - # 1. Create empty instance of ProblemDefinition - problem_definition = ProblemDefinition() - print(problem_definition) - >>> ProblemDefinition() - - # 2. Load problem definition and create ProblemDefinition instance - problem_definition = ProblemDefinition("path_to_plaid_prob_def") - print(problem_definition) - >>> ProblemDefinition(input_scalars_names=['s_1'], output_scalars_names=['s_2'], input_meshes_names=['mesh'], task='regression') - """ - self._name: str = None - self._version: Union[Version] = Version(plaid.__version__) - self._task: str = None - self._score_function: str = None - self.in_features_identifiers: Sequence[Union[str, FeatureIdentifier]] = [] - self.out_features_identifiers: Sequence[Union[str, FeatureIdentifier]] = [] - self.constant_features_identifiers: list[str] = [] - self.in_scalars_names: list[str] = [] - self.out_scalars_names: list[str] = [] - self.in_timeseries_names: list[str] = [] - self.out_timeseries_names: list[str] = [] - self.in_fields_names: list[str] = [] - self.out_fields_names: list[str] = [] - self.in_meshes_names: list[str] = [] - self.out_meshes_names: list[str] = [] - self._split: Optional[dict[str, IndexType]] = None - self._train_split: Optional[dict[str, dict[str, IndexType]]] = None - self._test_split: Optional[dict[str, dict[str, IndexType]]] = None - - if directory_path is not None: - if path is not None: - raise ValueError( - "Arguments `path` and `directory_path` cannot be both set. Use only `path` as `directory_path` is deprecated." - ) - else: - path = directory_path - logger.warning( - "DeprecationWarning: 'directory_path' is deprecated, use 'path' instead." - ) - - if path is not None: - path = Path(path) - self._load_from_dir_(path) - - # -------------------------------------------------------------------------# - def get_name(self) -> str: - """Get the name. None if not defined. - - Returns: - str: The name, such as "regression_1". - """ - return self._name - - def set_name(self, name: str) -> None: - """Set the name. - - Args: - name (str): The name, such as "regression_1". 
- """ - if self._name is not None: - raise ValueError(f"A name is already in self._name: (`{self._name}`)") - else: - self._name = name - - # -------------------------------------------------------------------------# - def get_version(self) -> Version: - """Get the version. None if not defined. - - Returns: - Version: The version, such as "0.1.0". - """ - return self._version - - # -------------------------------------------------------------------------# - def get_task(self) -> str: - """Get the authorized task. None if not defined. - - Returns: - str: The authorized task, such as "regression" or "classification". - """ - return self._task - - def set_task(self, task: str) -> None: - """Set the authorized task. - - Args: - task (str): The authorized task to be set, such as "regression" or "classification". - """ - if self._task is not None: - raise ValueError(f"A task is already in self._task: (`{self._task}`)") - elif task in AUTHORIZED_TASKS: - self._task = task - else: - raise TypeError( - f"{task} not among authorized tasks. Maybe you want to try among: {AUTHORIZED_TASKS}" - ) - - # -------------------------------------------------------------------------# - def get_score_function(self) -> str: - """Get the authorized score function. None if not defined. - - Returns: - str: The authorized score function, such as "RRMSE". - """ - return self._score_function - - def set_score_function(self, score_function: str) -> None: - """Set the authorized score function. - - Args: - score_function (str): The authorized score function, such as "RRMSE". - """ - if self._score_function is not None: - raise ValueError( - f"A score function is already in self._task: (`{self._score_function}`)" - ) - elif score_function in AUTHORIZED_SCORE_FUNCTIONS: - self._score_function = score_function - else: - raise TypeError( - f"{score_function} not among authorized tasks. Maybe you want to try among: {AUTHORIZED_SCORE_FUNCTIONS}" - ) - - # -------------------------------------------------------------------------# - - def get_split( - self, indices_name: Optional[str] = None - ) -> Union[IndexType, dict[str, IndexType]]: - """Get the split indices. This function returns the split indices, either for a specific split with the provided `indices_name` or all split indices if `indices_name` is not specified. - - Args: - indices_name (str, optional): The name of the split for which indices are requested. Defaults to None. - - Raises: - KeyError: If `indices_name` is specified but not found among split names. - - Returns: - Union[IndexType,dict[str,IndexType]]: If `indices_name` is provided, it returns - the indices for that split (IndexType). If `indices_name` is not provided, it - returns a dictionary mapping split names (str) to their respective indices - (IndexType). - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - # [...] - split_indices = problem.get_split() - print(split_indices) - >>> {'train': [0, 1, 2, ...], 'test': [100, 101, ...]} - - test_indices = problem.get_split('test') - print(test_indices) - >>> [100, 101, ...] - """ - if indices_name is None: - return self._split - else: - assert indices_name in self._split, ( - indices_name + " not among split indices names" - ) - return self._split[indices_name] - - def set_split(self, split: dict[str, IndexType]) -> None: - """Set the split indices. This function allows you to set the split indices by providing a dictionary mapping split names (str) to their respective indices (IndexType). 
- - Args: - split (dict[str,IndexType]): A dictionary containing split names and their indices. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - new_split = {'train': [0, 1, 2], 'test': [3, 4]} - problem.set_split(new_split) - """ - if self._split is not None: # pragma: no cover - logger.warning("split already exists -> data will be replaced") - self._split = split - - def get_train_split( - self, indices_name: Optional[str] = None - ) -> Union[dict[str, IndexType], dict[str, dict[str, IndexType]]]: - """Get the train split indices for different subsets of the dataset. - - Args: - indices_name (str, optional): The name of the specific train split subset - for which indices are requested. Defaults to None. - - Returns: - Union[dict[str, IndexType], dict[str, dict[str, IndexType]]]: - If indices_name is provided: - - Returns a dictionary mapping split names to their indices for the specified subset. - If indices_name is None: - - Returns the complete train split dictionary containing all subsets and their indices. - - Raises: - AssertionError: If indices_name is provided but not found in the train split. - """ - if indices_name is None: - return self._train_split - else: - assert indices_name in self._train_split, ( - indices_name + " not among split indices names" - ) - return self._train_split[indices_name] - - def set_train_split(self, split: dict[str, dict[str, Optional[IndexType]]]) -> None: - """Set the train split dictionary containing subsets and their indices. - - Args: - split (dict[str, dict[str, IndexType]]): Dictionary mapping train subset names - to their split dictionaries. Each split dictionary maps split names (e.g., 'train', 'val') - to their indices. - - Note: - If a train split already exists, it will be replaced and a warning will be logged. - """ - if self._train_split is not None: # pragma: no cover - logger.warning("split already exists -> data will be replaced") - self._train_split = split - - def get_test_split( - self, indices_name: Optional[str] = None - ) -> Union[dict[str, IndexType], dict[str, dict[str, IndexType]]]: - """Get the test split indices for different subsets of the dataset. - - Args: - indices_name (str, optional): The name of the specific test split subset - for which indices are requested. Defaults to None. - - Returns: - Union[dict[str, IndexType], dict[str, dict[str, IndexType]]]: - If indices_name is provided: - - Returns a dictionary mapping split names to their indices for the specified subset. - If indices_name is None: - - Returns the complete test split dictionary containing all subsets and their indices. - - Raises: - AssertionError: If indices_name is provided but not found in the test split. - """ - if indices_name is None: - return self._test_split - else: - assert indices_name in self._test_split, ( - indices_name + " not among split indices names" - ) - return self._test_split[indices_name] - - def set_test_split(self, split: dict[str, dict[str, Optional[IndexType]]]) -> None: - """Set the test split dictionary containing subsets and their indices. - - Args: - split (dict[str, dict[str, IndexType]]): Dictionary mapping test subset names - to their split dictionaries. Each split dictionary maps split names (e.g., 'test', 'test_ood') - to their indices. - - Note: - If a test split already exists, it will be replaced and a warning will be logged. 
- """ - if self._test_split is not None: # pragma: no cover - logger.warning("split already exists -> data will be replaced") - self._test_split = split - - # -------------------------------------------------------------------------# - @staticmethod - def _feature_sort_key(feat: Union[str, FeatureIdentifier]) -> tuple[str, str]: - if isinstance(feat, str): - # Strings first, sorted lexicographically - return ("a_string", feat) - else: - assert isinstance(feat, FeatureIdentifier) - # Then FeatureIdentifiers, sorted by their "type" field - return ("b_feature", feat["type"]) - - def get_in_features_identifiers(self) -> Sequence[Union[str, FeatureIdentifier]]: - """Get the input features identifiers of the problem. - - Returns: - Sequence[Union[str, FeatureIdentifier]]: A list of input feature identifiers. - - Example: - .. code-block:: python - - from plaid.problem_definition import ProblemDefinition - problem = ProblemDefinition() - # [...] - in_features_identifiers = problem.get_in_features_identifiers() - print(in_features_identifiers) - >>> ['omega', 'pressure'] - """ - return self.in_features_identifiers - - def set_in_features_identifiers( - self, features_identifiers: Sequence[Union[str, FeatureIdentifier]] - ) -> None: - """Set the input features identifiers of the problem. - - Example: - .. code-block:: python - - from plaid.problem_definition import ProblemDefinition - problem = ProblemDefinition() - # [...] - problem.set_in_features_identifiers(in_features_identifiers) - """ - self.in_features_identifiers = features_identifiers - - def add_in_features_identifiers( - self, inputs: Sequence[Union[str, FeatureIdentifier]] - ) -> None: - """Add input features identifiers to the problem. - - Args: - inputs (Sequence[Union[str, FeatureIdentifier]]): A list of input feature identifiers to add. - - Raises: - ValueError: If some :code:`inputs` are redondant. - - Example: - .. code-block:: python - - from plaid.problem_definition import ProblemDefinition - problem = ProblemDefinition() - in_features_identifiers = ['omega', 'pressure'] - problem.add_in_features_identifiers(in_features_identifiers) - """ - if not (len(set(inputs)) == len(inputs)): - raise ValueError("Some inputs have same identifiers") - for input in inputs: - self.add_in_feature_identifier(input) - - def add_in_feature_identifier(self, input: Union[str, FeatureIdentifier]) -> None: - """Add an input feature identifier or identifier to the problem. - - Args: - input (FeatureIdentifier): The identifier or identifier of the input feature to add. - - Raises: - ValueError: If the specified input feature is already in the list of inputs. - - Example: - .. code-block:: python - - from plaid.problem_definition import ProblemDefinition - problem = ProblemDefinition() - input_identifier = 'pressure' - problem.add_in_feature_identifier(input_identifier) - """ - if input in self.in_features_identifiers: - raise ValueError(f"{input} is already in self.in_features_identifiers") - self.in_features_identifiers.append(input) - self.in_features_identifiers.sort(key=self._feature_sort_key) - - def filter_in_features_identifiers( - self, identifiers: Sequence[Union[str, FeatureIdentifier]] - ) -> Sequence[Union[str, FeatureIdentifier]]: - """Filter and get input features features corresponding to a sorted list of identifiers. - - Args: - identifiers (Sequence[Union[str, FeatureIdentifier]]): A list of identifiers for which to retrieve corresponding input features. 
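The `_feature_sort_key` helper above keeps mixed identifier lists in a deterministic order: plain strings first, sorted lexicographically, then `FeatureIdentifier` entries sorted by their `"type"` field. A small illustration, with plain dicts standing in for `FeatureIdentifier` (the feature names and types are hypothetical):

```python
# Ordering imposed by _feature_sort_key; plain dicts stand in for
# FeatureIdentifier here, and the feature names are hypothetical.
def feature_sort_key(feat):
    if isinstance(feat, str):
        return ("a_string", feat)        # strings first, lexicographic
    return ("b_feature", feat["type"])   # then identifiers, by "type"

features = ["pressure", {"type": "scalar", "name": "omega"},
            "angle", {"type": "field", "name": "mach"}]
print(sorted(features, key=feature_sort_key))
# ['angle', 'pressure', {'type': 'field', 'name': 'mach'}, {'type': 'scalar', 'name': 'omega'}]
```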
- - Returns: - Sequence[Union[str, FeatureIdentifier]]: A sorted list of input feature identifiers or categories corresponding to the provided identifiers. - - Example: - .. code-block:: python - - from plaid.problem_definition import ProblemDefinition - problem = ProblemDefinition() - # [...] - features_identifiers = ['omega', 'pressure', 'temperature'] - input_features = problem.filter_in_features_identifiers(features_identifiers) - print(input_features) - >>> ['omega', 'pressure'] - """ - return sorted(set(identifiers).intersection(self.get_in_features_identifiers())) - - # -------------------------------------------------------------------------# - def get_out_features_identifiers(self) -> Sequence[Union[str, FeatureIdentifier]]: - """Get the output features identifiers of the problem. - - Returns: - Sequence[Union[str, FeatureIdentifier]]: A list of output feature identifiers. - - Example: - .. code-block:: python - - from plaid.problem_definition import ProblemDefinition - problem = ProblemDefinition() - # [...] - outputs_identifiers = problem.get_out_features_identifiers() - print(outputs_identifiers) - >>> ['compression_rate', 'in_massflow', 'isentropic_efficiency'] - """ - return self.out_features_identifiers - - def set_out_features_identifiers( - self, features_identifiers: Sequence[Union[str, FeatureIdentifier]] - ) -> None: - """Set the output features identifiers of the problem. - - Example: - .. code-block:: python - - from plaid.problem_definition import ProblemDefinition - problem = ProblemDefinition() - # [...] - problem.set_out_features_identifiers(out_features_identifiers) - """ - self.out_features_identifiers = features_identifiers - - def add_out_features_identifiers( - self, outputs: Sequence[Union[str, FeatureIdentifier]] - ) -> None: - """Add output features identifiers to the problem. - - Args: - outputs (Sequence[Union[str, FeatureIdentifier]]): A list of output feature identifiers to add. - - Raises: - ValueError: if some :code:`outputs` are redondant. - - Example: - .. code-block:: python - - from plaid.problem_definition import ProblemDefinition - problem = ProblemDefinition() - out_features_identifiers = ['compression_rate', 'in_massflow', 'isentropic_efficiency'] - problem.add_out_features_identifiers(out_features_identifiers) - """ - if not (len(set(outputs)) == len(outputs)): - raise ValueError("Some outputs have same identifiers") - for output in outputs: - self.add_out_feature_identifier(output) - - def add_out_feature_identifier(self, output: Union[str, FeatureIdentifier]) -> None: - """Add an output feature identifier or identifier to the problem. - - Args: - output (FeatureIdentifier): The identifier or identifier of the output feature to add. - - Raises: - ValueError: If the specified output feature is already in the list of outputs. - - Example: - .. code-block:: python - - from plaid.problem_definition import ProblemDefinition - problem = ProblemDefinition() - out_features_identifiers = 'pressure' - problem.add_out_feature_identifier(out_features_identifiers) - """ - if output in self.out_features_identifiers: - raise ValueError(f"{output} is already in self.out_features_identifiers") - self.out_features_identifiers.append(output) - self.out_features_identifiers.sort(key=self._feature_sort_key) - - def filter_out_features_identifiers( - self, identifiers: Sequence[Union[str, FeatureIdentifier]] - ) -> Sequence[Union[str, FeatureIdentifier]]: - """Filter and get output features corresponding to a sorted list of identifiers. 
- - Args: - identifiers (Sequence[Union[str, FeatureIdentifier]]): A list of identifiers for which to retrieve corresponding output features. - - Returns: - Sequence[Union[str, FeatureIdentifier]]: A sorted list of output feature identifiers or categories corresponding to the provided identifiers. - - Example: - .. code-block:: python - - from plaid.problem_definition import ProblemDefinition - problem = ProblemDefinition() - # [...] - features_identifiers = ['compression_rate', 'in_massflow', 'isentropic_efficiency'] - output_features = problem.filter_out_features_identifiers(features_identifiers) - print(output_features) - >>> ['in_massflow'] - """ - return sorted( - set(identifiers).intersection(self.get_out_features_identifiers()) - ) - - # -------------------------------------------------------------------------# - def get_constant_features_identifiers(self) -> list[str]: - """Get the constant features identifiers of the problem. - - Returns: - list[str]: A list of constant feature identifiers. - - Example: - .. code-block:: python - - from plaid.problem_definition import ProblemDefinition - problem = ProblemDefinition() - # [...] - constant_features_identifiers = problem.get_constant_features_identifiers() - print(constant_features_identifiers) - >>> ['Global/P', 'Base_2_2/Zone/GridCoordinates'] - """ - return self.constant_features_identifiers - - def set_constant_features_identifiers( - self, features_identifiers: list[str] - ) -> None: - """Set the constant features identifiers of the problem. - - Example: - .. code-block:: python - - from plaid.problem_definition import ProblemDefinition - problem = ProblemDefinition() - # [...] - problem.set_constant_features_identifiers(constant_features_identifiers) - """ - self.constant_features_identifiers = features_identifiers - - def add_constant_features_identifiers(self, inputs: list[str]) -> None: - """Add input features identifiers to the problem. - - Args: - inputs (list[str]): A list of constant feature identifiers to add. - - Raises: - ValueError: If some :code:`inputs` are redondant. - - Example: - .. code-block:: python - - from plaid.problem_definition import ProblemDefinition - problem = ProblemDefinition() - constant_features_identifiers = ['Global/P', 'Base_2_2/Zone/GridCoordinates'] - problem.add_constant_features_identifiers(constant_features_identifiers) - """ - if not (len(set(inputs)) == len(inputs)): - raise ValueError("Some inputs have same identifiers") - for input in inputs: - self.add_constant_feature_identifier(input) - - def add_constant_feature_identifier(self, input: str) -> None: - """Add an constant feature identifier to the problem. - - Args: - input (str): The identifier of the constant feature to add. - - Raises: - ValueError: If the specified input feature is already in the list of constant features. - - Example: - .. code-block:: python - - from plaid.problem_definition import ProblemDefinition - problem = ProblemDefinition() - constant_identifier = 'Global/P' - problem.add_constant_feature_identifier(constant_identifier) - """ - if input in self.constant_features_identifiers: - raise ValueError(f"{input} is already in self.in_features_identifiers") - self.constant_features_identifiers.append(input) - self.constant_features_identifiers.sort(key=self._feature_sort_key) - - def filter_constant_features_identifiers(self, identifiers: list[str]) -> list[str]: - """Filter and get input features features corresponding to a sorted list of identifiers. 
- - Args: - identifiers (list[str]): A list of identifiers for which to retrieve corresponding constant features. - - Returns: - list[str]: A sorted list of constant feature identifiers corresponding to the provided identifiers. - - Example: - .. code-block:: python - - from plaid.problem_definition import ProblemDefinition - problem = ProblemDefinition() - # [...] - features_identifiers = ['Global/P', 'Base_2_2/Zone/GridCoordinates'] - constant_features = problem.filter_constant_features_identifiers(features_identifiers) - print(constant_features) - >>> ['Global/P'] - """ - return sorted( - set(identifiers).intersection(self.get_constant_features_identifiers()) - ) - - # -------------------------------------------------------------------------# - @deprecated( - "use `get_in_features_identifiers` instead", version="0.1.8", removal="0.2.0" - ) - def get_input_scalars_names(self) -> list[str]: - """DEPRECATED: use :meth:`ProblemDefinition.get_in_features_identifiers` instead. - - Get the input scalars names of the problem. - - Returns: - list[str]: A list of input feature names. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - # [...] - input_scalars_names = problem.get_input_scalars_names() - print(input_scalars_names) - >>> ['omega', 'pressure'] - """ - return self.in_scalars_names - - @deprecated( - "use `add_in_features_identifiers` instead", version="0.1.8", removal="0.2.0" - ) - def add_input_scalars_names(self, inputs: list[str]) -> None: - """DEPRECATED: use :meth:`ProblemDefinition.add_in_features_identifiers` instead. - - Add input scalars names to the problem. - - Args: - inputs (list[str]): A list of input feature names to add. - - Raises: - ValueError: If some :code:`inputs` are redondant. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - input_scalars_names = ['omega', 'pressure'] - problem.add_input_scalars_names(input_scalars_names) - """ - if not (len(set(inputs)) == len(inputs)): - raise ValueError("Some inputs have same names") - for input in inputs: - self.add_input_scalar_name(input) - - @deprecated( - "use `add_in_feature_identifier` instead", version="0.1.8", removal="0.2.0" - ) - def add_input_scalar_name(self, input: str) -> None: - """DEPRECATED: use :meth:`ProblemDefinition.add_in_feature_identifier` instead. - - Add an input scalar name to the problem. - - Args: - input (str): The name of the input feature to add. - - Raises: - ValueError: If the specified input feature is already in the list of inputs. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - input_name = 'pressure' - problem.add_input_scalar_name(input_name) - """ - if input in self.in_scalars_names: - raise ValueError(f"{input} is already in self.in_scalars_names") - self.in_scalars_names.append(input) - self.in_scalars_names.sort() - - @deprecated( - "use `filter_in_features_identifiers` instead", version="0.1.8", removal="0.2.0" - ) - def filter_input_scalars_names(self, names: list[str]) -> list[str]: - """DEPRECATED: use :meth:`ProblemDefinition.filter_in_features_identifiers` instead. - - Filter and get input scalars features corresponding to a list of names. - - Args: - names (list[str]): A list of names for which to retrieve corresponding input features. - - Returns: - list[str]: A sorted list of input feature names or categories corresponding to the provided names. - - Example: - .. 
code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - # [...] - scalars_names = ['omega', 'pressure', 'temperature'] - input_features = problem.filter_input_scalars_names(scalars_names) - print(input_features) - >>> ['omega', 'pressure'] - """ - return sorted(set(names).intersection(self.get_input_scalars_names())) - - # -------------------------------------------------------------------------# - @deprecated( - "use `get_out_features_identifiers` instead", version="0.1.8", removal="0.2.0" - ) - def get_output_scalars_names(self) -> list[str]: - """DEPRECATED: use :meth:`ProblemDefinition.get_out_features_identifiers` instead. - - Get the output scalars names of the problem. - - Returns: - list[str]: A list of output feature names. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - # [...] - outputs_names = problem.get_output_scalars_names() - print(outputs_names) - >>> ['compression_rate', 'in_massflow', 'isentropic_efficiency'] - """ - return self.out_scalars_names - - @deprecated( - "use `add_out_features_identifiers` instead", version="0.1.8", removal="0.2.0" - ) - def add_output_scalars_names(self, outputs: list[str]) -> None: - """DEPRECATED: use :meth:`ProblemDefinition.add_out_features_identifiers` instead. - - Add output scalars names to the problem. - - Args: - outputs (list[str]): A list of output feature names to add. - - Raises: - ValueError: if some :code:`outputs` are redondant. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - output_scalars_names = ['compression_rate', 'in_massflow', 'isentropic_efficiency'] - problem.add_output_scalars_names(output_scalars_names) - """ - if not (len(set(outputs)) == len(outputs)): - raise ValueError("Some outputs have same names") - for output in outputs: - self.add_output_scalar_name(output) - - @deprecated( - "use `add_out_feature_identifier` instead", version="0.1.8", removal="0.2.0" - ) - def add_output_scalar_name(self, output: str) -> None: - """DEPRECATED: use :meth:`ProblemDefinition.add_out_feature_identifier` instead. - - Add an output scalar name to the problem. - - Args: - output (str): The name of the output feature to add. - - Raises: - ValueError: If the specified output feature is already in the list of outputs. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - output_scalars_names = 'pressure' - problem.add_output_scalar_name(output_scalars_names) - """ - if output in self.out_scalars_names: - raise ValueError(f"{output} is already in self.out_scalars_names") - self.out_scalars_names.append(output) - self.in_scalars_names.sort() - - def filter_output_scalars_names(self, names: list[str]) -> list[str]: - """Filter and get output features corresponding to a list of names. - - Args: - names (list[str]): A list of names for which to retrieve corresponding output features. - - Returns: - list[str]: A sorted list of output feature names or categories corresponding to the provided names. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - # [...] 
- scalars_names = ['compression_rate', 'in_massflow', 'isentropic_efficiency'] - output_features = problem.filter_output_scalars_names(scalars_names) - print(output_features) - >>> ['in_massflow'] - """ - return sorted(set(names).intersection(self.get_output_scalars_names())) - - # -------------------------------------------------------------------------# - @deprecated( - "use `get_in_features_identifiers` instead", version="0.1.8", removal="0.2.0" - ) - def get_input_fields_names(self) -> list[str]: - """DEPRECATED: use :meth:`ProblemDefinition.get_in_features_identifiers` instead. - - Get the input fields names of the problem. - - Returns: - list[str]: A list of input feature names. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - # [...] - input_fields_names = problem.get_input_fields_names() - print(input_fields_names) - >>> ['omega', 'pressure'] - """ - return self.in_fields_names - - @deprecated( - "use `add_in_features_identifiers` instead", version="0.1.8", removal="0.2.0" - ) - def add_input_fields_names(self, inputs: list[str]) -> None: - """DEPRECATED: use :meth:`ProblemDefinition.add_in_features_identifiers` instead. - - Add input fields names to the problem. - - Args: - inputs (list[str]): A list of input feature names to add. - - Raises: - ValueError: If some :code:`inputs` are redondant. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - input_fields_names = ['omega', 'pressure'] - problem.add_input_fields_names(input_fields_names) - """ - if not (len(set(inputs)) == len(inputs)): - raise ValueError("Some inputs have same names") - for input in inputs: - self.add_input_field_name(input) - - @deprecated( - "use `add_in_feature_identifier` instead", version="0.1.8", removal="0.2.0" - ) - def add_input_field_name(self, input: str) -> None: - """DEPRECATED: use :meth:`ProblemDefinition.add_in_feature_identifier` instead. - - Add an input field name to the problem. - - Args: - input (str): The name of the input feature to add. - - Raises: - ValueError: If the specified input feature is already in the list of inputs. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - input_name = 'pressure' - problem.add_input_field_name(input_name) - """ - if input in self.in_fields_names: - raise ValueError(f"{input} is already in self.in_fields_names") - self.in_fields_names.append(input) - self.in_fields_names.sort() - - def filter_input_fields_names(self, names: list[str]) -> list[str]: - """Filter and get input fields features corresponding to a list of names. - - Args: - names (list[str]): A list of names for which to retrieve corresponding input features. - - Returns: - list[str]: A sorted list of input feature names or categories corresponding to the provided names. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - # [...] 
- input_fields_names = ['omega', 'pressure', 'temperature'] - input_features = problem.filter_input_fields_names(input_fields_names) - print(input_features) - >>> ['omega', 'pressure'] - """ - return sorted(set(names).intersection(self.get_input_fields_names())) - - # -------------------------------------------------------------------------# - @deprecated( - "use `get_out_features_identifiers` instead", version="0.1.8", removal="0.2.0" - ) - def get_output_fields_names(self) -> list[str]: - """DEPRECATED: use :meth:`ProblemDefinition.get_out_features_identifiers` instead. - - Get the output fields names of the problem. - - Returns: - list[str]: A list of output feature names. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - # [...] - outputs_names = problem.get_output_fields_names() - print(outputs_names) - >>> ['compression_rate', 'in_massflow', 'isentropic_efficiency'] - """ - return self.out_fields_names - - @deprecated( - "use `add_out_features_identifiers` instead", version="0.1.8", removal="0.2.0" - ) - def add_output_fields_names(self, outputs: list[str]) -> None: - """DEPRECATED: use :meth:`ProblemDefinition.add_out_features_identifiers` instead. - - Add output fields names to the problem. - - Args: - outputs (list[str]): A list of output feature names to add. - - Raises: - ValueError: if some :code:`outputs` are redondant. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - output_fields_names = ['compression_rate', 'in_massflow', 'isentropic_efficiency'] - problem.add_output_fields_names(output_fields_names) - """ - if not (len(set(outputs)) == len(outputs)): - raise ValueError("Some outputs have same names") - for output in outputs: - self.add_output_field_name(output) - - @deprecated( - "use `add_out_feature_identifier` instead", version="0.1.8", removal="0.2.0" - ) - def add_output_field_name(self, output: str) -> None: - """DEPRECATED: use :meth:`ProblemDefinition.add_out_feature_identifier` instead. - - Add an output field name to the problem. - - Args: - output (str): The name of the output feature to add. - - Raises: - ValueError: If the specified output feature is already in the list of outputs. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - output_fields_names = 'pressure' - problem.add_output_field_name(output_fields_names) - """ - if output in self.out_fields_names: - raise ValueError(f"{output} is already in self.out_fields_names") - self.out_fields_names.append(output) - self.out_fields_names.sort() - - def filter_output_fields_names(self, names: list[str]) -> list[str]: - """Filter and get output features corresponding to a list of names. - - Args: - names (list[str]): A list of names for which to retrieve corresponding output features. - - Returns: - list[str]: A sorted list of output feature names or categories corresponding to the provided names. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - # [...] 
- output_fields_names = ['compression_rate', 'in_massflow', 'isentropic_efficiency'] - output_features = problem.filter_output_fields_names(output_fields_names) - print(output_features) - >>> ['in_massflow'] - """ - return sorted(set(names).intersection(self.get_output_fields_names())) - - # -------------------------------------------------------------------------# - @deprecated( - "use `get_in_features_identifiers` instead", version="0.1.8", removal="0.2.0" - ) - def get_input_timeseries_names(self) -> list[str]: - """DEPRECATED: use :meth:`ProblemDefinition.get_in_features_identifiers` instead. - - Get the input timeseries names of the problem. - - Returns: - list[str]: A list of input feature names. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - # [...] - input_timeseries_names = problem.get_input_timeseries_names() - print(input_timeseries_names) - >>> ['omega', 'pressure'] - """ - return self.in_timeseries_names - - @deprecated( - "use `add_in_features_identifiers` instead", version="0.1.8", removal="0.2.0" - ) - def add_input_timeseries_names(self, inputs: list[str]) -> None: - """DEPRECATED: use :meth:`ProblemDefinition.add_in_features_identifiers` instead. - - Add input timeseries names to the problem. - - Args: - inputs (list[str]): A list of input feature names to add. - - Raises: - ValueError: If some :code:`inputs` are redondant. +import sys - Example: - .. code-block:: python +if sys.version_info >= (3, 11): + from typing import Self +else: # pragma: no cover + from typing import TypeVar - from plaid import ProblemDefinition - problem = ProblemDefinition() - input_timeseries_names = ['omega', 'pressure'] - problem.add_input_timeseries_names(input_timeseries_names) - """ - if not (len(set(inputs)) == len(inputs)): - raise ValueError("Some inputs have same names") - for input in inputs: - self.add_input_timeseries_name(input) + Self = TypeVar("Self") - @deprecated( - "use `add_in_feature_identifier` instead", version="0.1.8", removal="0.2.0" - ) - def add_input_timeseries_name(self, input: str) -> None: - """DEPRECATED: use :meth:`ProblemDefinition.add_in_feature_identifier` instead. +import logging +from pathlib import Path +from typing import Any, Literal, Optional, Sequence, Union, cast - Add an input timeseries name to the problem. +import yaml +from pydantic import BaseModel, ConfigDict, Field, field_validator - Args: - input (str): The name of the input feature to add. +from .constants import ( + AUTHORIZED_SCORE_FUNCTIONS_T, + AUTHORIZED_TASKS_T, +) +from .types import IndexType - Raises: - ValueError: If the specified input feature is already in the list of inputs. +# %% Globals - Example: - .. code-block:: python +logger = logging.getLogger(__name__) - from plaid import ProblemDefinition - problem = ProblemDefinition() - input_name = 'pressure' - problem.add_input_timeseries_name(input_name) - """ - if input in self.in_timeseries_names: - raise ValueError(f"{input} is already in self.in_timeseries_names") - self.in_timeseries_names.append(input) - self.in_timeseries_names.sort() +# %% Functions - def filter_input_timeseries_names(self, names: list[str]) -> list[str]: - """Filter and get input timeseries features corresponding to a list of names. +# %% Classes - Args: - names (list[str]): A list of names for which to retrieve corresponding input features. - Returns: - list[str]: A sorted list of input feature names or categories corresponding to the provided names. 
+def _normalize_list(v):
+    return sorted(map(str, v))

-        Example:
-            .. code-block:: python
-                from plaid import ProblemDefinition
-                problem = ProblemDefinition()
-                # [...]
-                input_timeseries_names = ['omega', 'pressure', 'temperature']
-                input_features = problem.filter_input_timeseries_names(input_timeseries_names)
-                print(input_features)
-                >>> ['omega', 'pressure']
-        """
-        return sorted(set(names).intersection(self.get_input_timeseries_names()))

+class ProblemDefinition(BaseModel):
+    """Defines the input and output features for a machine learning problem."""

-    # -------------------------------------------------------------------------#
-    @deprecated(
-        "use `get_out_features_identifiers` instead", version="0.1.8", removal="0.2.0"
+    model_config = ConfigDict(
+        revalidate_instances="always", validate_assignment=True, extra="forbid"
     )
-    def get_output_timeseries_names(self) -> list[str]:
-        """DEPRECATED: use :meth:`ProblemDefinition.get_out_features_identifiers` instead.
-
-        Get the output timeseries names of the problem.
-
-        Returns:
-            list[str]: A list of output feature names.
-
-        Example:
-            .. code-block:: python
-
-                from plaid import ProblemDefinition
-                problem = ProblemDefinition()
-                # [...]
-                outputs_names = problem.get_output_timeseries_names()
-                print(outputs_names)
-                >>> ['compression_rate', 'in_massflow', 'isentropic_efficiency']
-        """
-        return self.out_timeseries_names
-    @deprecated(
-        "use `add_out_features_identifiers` instead", version="0.1.8", removal="0.2.0"
+    name: Optional[str] = Field(default=None)
+    task: Optional[AUTHORIZED_TASKS_T] = Field(default=None)
+    input_features: list[str] = Field(default_factory=list)
+    output_features: list[str] = Field(default_factory=list)
+    score_function: Optional[AUTHORIZED_SCORE_FUNCTIONS_T] = Field(default=None)
+    train_split: Optional[dict[str, Sequence[int] | Literal["all"]]] = Field(
+        default=None
     )
-    def add_output_timeseries_names(self, outputs: list[str]) -> None:
-        """DEPRECATED: use :meth:`ProblemDefinition.add_out_features_identifiers` instead.
-
-        Add output timeseries names to the problem.
-
-        Args:
-            outputs (list[str]): A list of output feature names to add.
-
-        Raises:
-            ValueError: if some :code:`outputs` are redondant.
-
-        Example:
-            .. code-block:: python
+    test_split: Optional[dict[str, Sequence[int] | Literal["all"]]] = Field(
+        default=None
+    )
+    constant_features: list[str] = Field(default_factory=list)
+    version: str = ""

-                from plaid import ProblemDefinition
-                problem = ProblemDefinition()
-                output_timeseries_names = ['compression_rate', 'in_massflow', 'isentropic_efficiency']
-                problem.add_output_timeseries_names(output_timeseries_names)
-        """
-        if not (len(set(outputs)) == len(outputs)):
-            raise ValueError("Some outputs have same names")
-        for output in outputs:
-            self.add_output_timeseries_name(output)
+    # check that tab autocompletion works properly in vscode

-    @deprecated(
-        "use `add_out_feature_identifier` instead", version="0.1.8", removal="0.2.0"
-    )
-    def add_output_timeseries_name(self, output: str) -> None:
-        """DEPRECATED: use :meth:`ProblemDefinition.add_out_feature_identifier` instead.
+    @staticmethod
+    def from_path(
+        path: str | Path, name: str | None = None, **overrides
+    ) -> "ProblemDefinition":
+        """Load a problem definition from a YAML file located at the specified path.

-        Add an output timeseries name to the problem.
+        The YAML file should contain one or more problem definitions, and the desired definition can be selected by its name.

         Args:
-            output (str): The name of the output feature to add.
+            path (str | Path): The file path to the YAML file containing problem definitions.
+            name (str | None, optional): The name of the problem definition to load. If None, it will attempt to load the
+                only problem definition available in the file. Defaults to None.
+            **overrides: Additional keyword arguments to override specific fields in the loaded problem definition.

         Raises:
-            ValueError: If the specified output feature is already in the list of outputs.
-
-        Example:
-            .. code-block:: python
-
-                from plaid import ProblemDefinition
-                problem = ProblemDefinition()
-                output_timeseries_names = 'pressure'
-                problem.add_output_timeseries_name(output_timeseries_names)
-        """
-        if output in self.out_timeseries_names:
-            raise ValueError(f"{output} is already in self.out_timeseries_names")
-        self.out_timeseries_names.append(output)
-        self.in_timeseries_names.sort()
-
-    def filter_output_timeseries_names(self, names: list[str]) -> list[str]:
-        """Filter and get output features corresponding to a list of names.
-
-        Args:
-            names (list[str]): A list of names for which to retrieve corresponding output features.
+            ValueError: If the specified name is not found in the YAML file or if multiple problem definitions are present
+                without a specified name.

         Returns:
-            list[str]: A sorted list of output feature names or categories corresponding to the provided names.
-
-        Example:
-            .. code-block:: python
-
-                from plaid import ProblemDefinition
-                problem = ProblemDefinition()
-                # [...]
-                output_timeseries_names = ['compression_rate', 'in_massflow', 'isentropic_efficiency']
-                output_features = problem.filter_output_timeseries_names(output_timeseries_names)
-                print(output_features)
-                >>> ['in_massflow']
+            ProblemDefinition: The loaded problem definition.
         """
-        return sorted(set(names).intersection(self.get_output_timeseries_names()))
-
-    # -------------------------------------------------------------------------#
-    @deprecated(
-        "use `get_in_features_identifiers` instead", version="0.1.8", removal="0.2.0"
-    )
-    def get_input_meshes_names(self) -> list[str]:
-        """DEPRECATED: use :meth:`ProblemDefinition.get_in_features_identifiers` instead.
-
-        Get the input meshes names of the problem.
-
-        Returns:
-            list[str]: A list of input feature names.
+        from plaid.storage import load_problem_definitions_from_disk

-        Example:
-            .. code-block:: python
+        all_pb_def = load_problem_definitions_from_disk(path=Path(path))
+        available = ", ".join(sorted(all_pb_def))
+        if name is not None:
+            if name not in all_pb_def:
+                raise ValueError(
+                    f"Problem definition '{name}' not found in {path}. "
+                    f"Available definitions: {available}"
+                )
+            data2 = all_pb_def[name].model_dump()
+            data2.update(overrides)
+            data = data2
+        else:
+            if len(all_pb_def) > 1:
+                raise RuntimeError(
+                    f"No name specified, but more than one problem definition is available. Available definitions: {available}"
+                )
+            else:
+                data2 = next(iter(all_pb_def.values())).model_dump()
+                data2.update(overrides)
+                data = data2

-                from plaid import ProblemDefinition
-                problem = ProblemDefinition()
-                # [...]
- input_meshes_names = problem.get_input_meshes_names() - print(input_meshes_names) - >>> ['omega', 'pressure'] - """ - return self.in_meshes_names + # return data + return ProblemDefinition(**data) - @deprecated( - "use `add_in_features_identifiers` instead", version="0.1.8", removal="0.2.0" - ) - def add_input_meshes_names(self, inputs: list[str]) -> None: - """DEPRECATED: use :meth:`ProblemDefinition.add_in_features_identifiers` instead. + @field_validator("input_features", mode="before") + @classmethod + def normalize_in_features_identifiers(cls, v): + """Normalize input features identifiers by ensuring they are unique and sorted.""" + if len(set(v)) != len(v): + raise ValueError("duplicated values in input_features") + return _normalize_list(v) - Add input meshes names to the problem. + @field_validator("train_split", "test_split", mode="after") + @classmethod + def check_split_has_only_one_obj(cls, v): + """Ensure that the split dictionaries contain only one key-value pair.""" + if len(v) > 1: + raise ValueError( + "Splits only support one element (dict with only one object)" + ) + return v - Args: - inputs (list[str]): A list of input feature names to add. + @field_validator("output_features", mode="before") + @classmethod + def normalize_out_features_identifiers(cls, v): + """Normalize output features identifiers by ensuring they are unique and sorted.""" + if len(set(v)) != len(v): + raise ValueError("duplicated values in output_features") + return _normalize_list(v) + + def __setattr__(self, name: str, value: Any) -> None: + """Override the default attribute setting behavior to enforce immutability for certain fields and log warnings for others.""" + # to set the name, task, score_function only once and oly once + if name in ["name", "task", "score_function"]: + current_value = getattr(self, name, None) + if ( + current_value is not None + and value is not None + and current_value != value + ): + raise AttributeError(f"'{name}' is already set and cannot be changed.") + # warning if set + if name in ["train_split", "test_split"]: + current_value = getattr(self, name, None) + if ( + current_value is not None + and value is not None + and current_value != value + ): + logger.warning("'%s' already exists -> data will be replaced", name) + + super().__setattr__(name, value) + + # def get_name(self) -> str | None: + # return self.name + + # # -------------------------------------------------------------------------# + def get_train_split_name(self) -> str: + """Return the name of the train split.""" + if self.train_split is None: + raise ValueError("train_split is not defined.") + return list(self.train_split.keys())[0] + + def get_train_split_indices(self) -> IndexType | Literal["all"]: + """Return the indices associated with the train split. Raises: - ValueError: If some :code:`inputs` are redondant. - - Example: - .. code-block:: python + ValueError: If `train_split` is not defined. - from plaid import ProblemDefinition - problem = ProblemDefinition() - input_meshes_names = ['omega', 'pressure'] - problem.add_input_meshes_names(input_meshes_names) + Returns: + IndexType | Literal["all"]: The indices associated with the train split. 
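For orientation, here is a minimal usage sketch of the new pydantic-based `ProblemDefinition` assembled from the hunks above. The task value, feature names, and split indices are illustrative, not taken from the repository.

```python
# Sketch of the pydantic ProblemDefinition shown in this diff; values are illustrative.
from plaid.problem_definition import ProblemDefinition

problem = ProblemDefinition(
    name="demo_problem",
    task="regression",                     # assumed to be an authorized task value
    input_features=["omega", "pressure"],  # validators deduplicate-check and sort
    output_features=["compression_rate"],
    train_split={"train": [0, 1, 2, 3]},   # exactly one key allowed per split dict
    test_split={"test": "all"},
)

print(problem.get_train_split_name())     # -> "train"
print(problem.get_train_split_indices())  # -> [0, 1, 2, 3]

# name/task/score_function are write-once: re-assigning a different value raises.
try:
    problem.task = "classification"
except AttributeError as err:
    print(err)

# Duplicated identifiers are rejected by the field validators.
try:
    ProblemDefinition(input_features=["omega", "omega"])
except ValueError as err:
    print(err)
```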
""" - if not (len(set(inputs)) == len(inputs)): - raise ValueError("Some inputs have same names") - for input in inputs: - self.add_input_mesh_name(input) - - @deprecated( - "use `add_in_feature_identifier` instead", version="0.1.8", removal="0.2.0" - ) - def add_input_mesh_name(self, input: str) -> None: - """DEPRECATED: use :meth:`ProblemDefinition.add_in_feature_identifier` instead. + if self.train_split is None: + raise ValueError("train_split is not defined.") + return cast(IndexType | Literal["all"], next(iter(self.train_split.values()))) - Add an input mesh name to the problem. + def get_test_split_name(self) -> str: + """Return the name of the test split.""" + if self.test_split is None: + raise ValueError("test_split is not defined.") + return list(self.test_split.keys())[0] - Args: - input (str): The name of the input feature to add. + def get_test_split_indices(self) -> IndexType | Literal["all"]: + """Return the indices associated with the test split. Raises: - ValueError: If the specified input feature is already in the list of inputs. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - input_name = 'pressure' - problem.add_input_mesh_name(input_name) - """ - if input in self.in_meshes_names: - raise ValueError(f"{input} is already in self.in_meshes_names") - self.in_meshes_names.append(input) - self.in_meshes_names.sort() - - def filter_input_meshes_names(self, names: list[str]) -> list[str]: - """Filter and get input meshes features corresponding to a list of names. - - Args: - names (list[str]): A list of names for which to retrieve corresponding input features. - - Returns: - list[str]: A sorted list of input feature names or categories corresponding to the provided names. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - # [...] - input_meshes_names = ['omega', 'pressure', 'temperature'] - input_features = problem.filter_input_meshes_names(input_meshes_names) - print(input_features) - >>> ['omega', 'pressure'] - """ - return sorted(set(names).intersection(self.get_input_meshes_names())) - - # -------------------------------------------------------------------------# - @deprecated( - "use `get_out_features_identifiers` instead", version="0.1.8", removal="0.2.0" - ) - def get_output_meshes_names(self) -> list[str]: - """DEPRECATED: use :meth:`ProblemDefinition.get_out_features_identifiers` instead. - - Get the output meshes names of the problem. + ValueError: If `test_split` is not defined. Returns: - list[str]: A list of output feature names. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - # [...] - outputs_names = problem.get_output_meshes_names() - print(outputs_names) - >>> ['compression_rate', 'in_massflow', 'isentropic_efficiency'] + IndexType | Literal["all"]: The indices associated with the test split. """ - return self.out_meshes_names + if self.test_split is None: + raise ValueError("test_split is not defined.") + return cast(IndexType | Literal["all"], next(iter(self.test_split.values()))) - @deprecated( - "use `add_out_features_identifiers` instead", version="0.1.8", removal="0.2.0" - ) - def add_output_meshes_names(self, outputs: list[str]) -> None: - """DEPRECATED: use :meth:`ProblemDefinition.add_out_features_identifiers` instead. - - Add output meshes names to the problem. 
+ def add_in_features_identifiers(self, inputs: Union[str, Sequence[str]]) -> None: + """Add input features identifiers to the problem. Args: - outputs (list[str]): A list of output feature names to add. + inputs (Sequence[str] or str ): A list of or a single input feature identifier to add. Raises: - ValueError: if some :code:`outputs` are redondant. + ValueError: If some :code:`inputs` are duplicated. Example: .. code-block:: python - from plaid import ProblemDefinition + from plaid.problem_definition import ProblemDefinition problem = ProblemDefinition() - output_meshes_names = ['compression_rate', 'in_massflow', 'isentropic_efficiency'] - problem.add_output_meshes_names(output_meshes_names) - """ - if not (len(set(outputs)) == len(outputs)): - raise ValueError("Some outputs have same names") - for output in outputs: - self.add_output_mesh_name(output) - - @deprecated( - "use `add_out_feature_identifier` instead", version="0.1.8", removal="0.2.0" - ) - def add_output_mesh_name(self, output: str) -> None: - """DEPRECATED: use :meth:`ProblemDefinition.add_out_feature_identifier` instead. + in_features_identifiers = ['omega', 'pressure'] + problem.add_in_features_identifiers(in_features_identifiers) - Add an output mesh name to the problem. + # or for a single feature - Args: - output (str): The name of the output feature to add. + problem.add_in_features_identifiers("angle") + """ + if isinstance(inputs, str): + input_feature = inputs + if input_feature in self.input_features: + raise ValueError( + f"{input_feature} is already in self.input_features" + ) - Raises: - ValueError: If the specified output feature is already in the list of outputs. + self.input_features.append(input_feature) + self.input_features.sort() + return - Example: - .. code-block:: python + if not (len(set(inputs)) == len(inputs)): + raise ValueError("Some input features share the same identifier") - from plaid import ProblemDefinition - problem = ProblemDefinition() - output_meshes_names = 'pressure' - problem.add_output_mesh_name(output_meshes_names) - """ - if output in self.out_meshes_names: - raise ValueError(f"{output} is already in self.out_meshes_names") - self.out_meshes_names.append(output) - self.in_meshes_names.sort() + for input_feature in inputs: + self.add_in_features_identifiers(input_feature) - def filter_output_meshes_names(self, names: list[str]) -> list[str]: - """Filter and get output features corresponding to a list of names. + def add_out_features_identifiers(self, outputs: Union[str, Sequence[str]]) -> None: + """Add output features identifiers to the problem. Args: - names (list[str]): A list of names for which to retrieve corresponding output features. + outputs (Sequence[str] or str ): A list of or a single input feature identifier to add. - Returns: - list[str]: A sorted list of output feature names or categories corresponding to the provided names. + Raises: + ValueError: If some :code:`outputs` are duplicated. Example: .. code-block:: python - from plaid import ProblemDefinition + from plaid.problem_definition import ProblemDefinition problem = ProblemDefinition() - # [...] 
- output_meshes_names = ['compression_rate', 'in_massflow', 'isentropic_efficiency'] - output_features = problem.filter_output_meshes_names(output_meshes_names) - print(output_features) - >>> ['in_massflow'] - """ - return sorted(set(names).intersection(self.get_output_meshes_names())) - - # -------------------------------------------------------------------------# - def get_all_indices(self) -> list[int]: - """Get all indices from splits. - - Returns: - list[int]: list containing all unique indices. - """ - all_indices = [] - for indices in self.get_split().values(): - all_indices += list(indices) - return list(set(all_indices)) + out_features_identifiers = ['omega', 'pressure'] + problem.add_out_features_identifiers(out_features_identifiers) - # -------------------------------------------------------------------------# - def _generate_problem_infos_dict(self) -> dict[str, Union[str, list]]: - """Generate a dictionary containing all relevant problem definition data. + # or for a single feature - Returns: - dict[str, Union[str, list]]: A dictionary with keys for task, input/output features, scalars, fields, timeseries, and meshes. + problem.add_out_features_identifiers("angle") """ - data = { - "task": self._task, - "score_function": self._score_function, - "constant_features": [], - "input_features": [], - "output_features": [], - } - for tup in self.in_features_identifiers: - if isinstance(tup, FeatureIdentifier): - data["input_features"].append(dict(**tup)) - else: - data["input_features"].append(tup) - for tup in self.out_features_identifiers: - if isinstance(tup, FeatureIdentifier): - data["output_features"].append(dict(**tup)) - else: - data["output_features"].append(tup) - for tup in self.constant_features_identifiers: - data["constant_features"].append(tup) - if self._train_split is not None: - data["train_split"] = self._train_split - if self._test_split is not None: - data["test_split"] = self._test_split - if self._name is not None: - data["name"] = self._name - if Version(plaid.__version__) < Version("0.2.0"): - data.update( - { - k: v - for k, v in { - "input_scalars": self.in_scalars_names, - "output_scalars": self.out_scalars_names, - "input_fields": self.in_fields_names, - "output_fields": self.out_fields_names, - "input_timeseries": self.in_timeseries_names, - "output_timeseries": self.out_timeseries_names, - "input_meshes": self.in_meshes_names, - "output_meshes": self.out_meshes_names, - }.items() - if v # keeps only truthy (non-empty, non-None) lists - } - ) - - # Handle version - plaid_version = Version(plaid.__version__) - if self._version != plaid_version: # pragma: no cover - logger.warning( - f"Version mismatch: ProblemDefinition was loaded from version {self._version if self._version is not None else 'anterior to 0.1.10'}, and will be saved with version: {plaid_version}" - ) - data["version"] = str(plaid_version) - else: - data["version"] = str(self._version) + if isinstance(outputs, str): + output_feature = outputs + if output_feature in self.output_features: + raise ValueError( + f"{output_feature} is already in self.output_features" + ) - return data + self.output_features.append(output_feature) + self.output_features.sort() + return - # Handle version - plaid_version = Version(plaid.__version__) - if self._version != plaid_version: # pragma: no cover - logger.warning( - f"Version mismatch: ProblemDefinition was loaded from version {self._version if self._version is not None else 'anterior to 0.1.10'}, and will be saved with version: {plaid_version}" - ) - 
data["version"] = str(plaid_version) - else: - data["version"] = str(self._version) + if not (len(set(outputs)) == len(outputs)): + raise ValueError("Some output features share the same identifier") - # Save infos + for output_feature in outputs: + self.add_out_features_identifiers(output_feature) def save_to_file(self, path: Union[str, Path]) -> None: """Save problem information, inputs, outputs, and split to the specified file in YAML format. @@ -1555,8 +286,6 @@ def save_to_file(self, path: Union[str, Path]) -> None: problem = ProblemDefinition() problem.save_to_file("/path/to/save_file") """ - problem_infos_dict = self._generate_problem_infos_dict() - path = Path(path) path.parent.mkdir(parents=True, exist_ok=True) @@ -1566,136 +295,9 @@ def save_to_file(self, path: Union[str, Path]) -> None: # Save infos with path.open("w") as file: yaml.dump( - problem_infos_dict, file, default_flow_style=False, sort_keys=True - ) - - @deprecated( - "`ProblemDefinition._save_to_dir_(...)` is deprecated. Use `ProblemDefinition.save_to_dir(...)` instead.", - version="0.1.10", - removal="0.2.0", - ) - def _save_to_dir_(self, path: Union[str, Path]) -> None: - """DEPRECATED: use :meth:`ProblemDefinition.save_to_dir` instead.""" - self.save_to_dir(path) - - def save_to_dir(self, path: Union[str, Path]) -> None: - """Save problem information, inputs, outputs, and split to the specified directory in YAML and CSV formats. - - Args: - path (Union[str,Path]): The directory where the problem information will be saved. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - problem.save_to_dir("/path/to/save_directory") - """ - path = Path(path) - - if not (path.is_dir()): - path.mkdir(parents=True) - - problem_infos_dict = self._generate_problem_infos_dict() - - # Save infos - pbdef_fname = path / "problem_infos.yaml" - with pbdef_fname.open("w") as file: - yaml.dump( - problem_infos_dict, file, default_flow_style=False, sort_keys=True + self.model_dump(), file, default_flow_style=False, sort_keys=True ) - # Save split - split_fname = path / "split.json" - if self.get_split() is not None: - with split_fname.open("w") as file: - json.dump(self.get_split(), file) - - # # Save split - # split_fname = path / "train_split.json" - # if self.get_train_split() is not None: - # with split_fname.open("w") as file: - # json.dump(self.get_train_split(), file) - - # split_fname = path / "test_split.json" - # if self.get_test_split() is not None: - # with split_fname.open("w") as file: - # json.dump(self.get_test_split(), file) - - @classmethod - def load(cls, path: Union[str, Path]) -> Self: # pragma: no cover - """Load data from a specified directory. - - Args: - path (Union[str,Path]): The path from which to load files. - - Returns: - Self: The loaded dataset (Dataset). 
- """ - instance = cls() - instance._load_from_dir_(path) - return instance - - def _initialize_from_problem_infos_dict( - self, data: dict[str, Union[str, list]] - ) -> None: - if "version" not in data: - self._version = None - else: - self._version = Version(data["version"]) - self._task = data["task"] - self.in_features_identifiers = [] - if "input_features" in data: - for tup in data["input_features"]: - if isinstance(tup, dict): - self.in_features_identifiers.append(FeatureIdentifier(**tup)) - else: - self.in_features_identifiers.append(tup) - self.out_features_identifiers = [] - if "output_features" in data: - for tup in data["output_features"]: - if isinstance(tup, dict): - self.out_features_identifiers.append(FeatureIdentifier(**tup)) - else: - self.out_features_identifiers.append(tup) - self.constant_features_identifiers = [] - if "constant_features" in data: - for tup in data["constant_features"]: - self.constant_features_identifiers.append(tup) - if "version" not in data or Version(data["version"]) < Version("0.2.0"): - self.in_scalars_names = data.get("input_scalars", []) - self.out_scalars_names = data.get("output_scalars", []) - self.in_fields_names = data.get("input_fields", []) - self.out_fields_names = data.get("output_fields", []) - self.in_timeseries_names = data.get("input_timeseries", []) - self.out_timeseries_names = data.get("output_timeseries", []) - self.in_meshes_names = data.get("input_meshes", []) - self.out_meshes_names = data.get("output_meshes", []) - else: # pragma: no cover - old_keys = [ - "input_scalars", - "input_fields", - "input_timeseries", - "input_meshes", - "output_scalars", - "output_fields", - "output_timeseries", - "output_meshes", - ] - for k in old_keys: - if k in data: - logger.warning( - f"Key '{k}' is deprecated and will be ignored. You should convert your ProblemDefinition using FeatureIdentifiers to identify features instead of names." - ) - if "score_function" in data: - self._score_function = data["score_function"] - if "train_split" in data: - self._train_split = data["train_split"] - if "test_split" in data: - self._test_split = data["test_split"] - if "name" in data: - self._name = data["name"] - def _load_from_file_(self, path: Union[str, Path]) -> None: """Load problem information, inputs, outputs, and split from the specified file in YAML format. @@ -1723,161 +325,10 @@ def _load_from_file_(self, path: Union[str, Path]) -> None: with path.open("r") as file: data = yaml.safe_load(file) - self._initialize_from_problem_infos_dict(data) - - def _load_from_dir_(self, path: Union[str, Path]) -> None: - """Load problem information, inputs, outputs, and split from the specified directory in YAML and CSV formats. - - Args: - path (Union[str,Path]): The directory from which to load the problem information. - - Raises: - FileNotFoundError: Triggered if the provided directory or file problem_infos.yaml does not exist - FileExistsError: Triggered if the provided path is a file instead of a directory. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - problem._load_from_dir_("/path/to/load_directory") - """ - path = Path(path) - - if not path.exists(): - raise FileNotFoundError(f'Directory "{path}" does not exist. Abort') - - if not path.is_dir(): - raise FileExistsError(f'"{path}" is not a directory. 
Abort') - - pbdef_fname = path / "problem_infos.yaml" - data = {} # To avoid crash if pbdef_fname does not exist - if pbdef_fname.is_file(): - with pbdef_fname.open("r") as file: - data = yaml.safe_load(file) - else: - raise FileNotFoundError( - f"file with path `{pbdef_fname}` does not exist. Abort" - ) - - self._initialize_from_problem_infos_dict(data) - - # if it was saved with version <=0.1.7 it is a .csv else it is .json - split = {} - split_fname_csv = path / "split.csv" - split_fname_json = path / "split.json" - if split_fname_json.is_file(): - with split_fname_json.open("r") as file: - split = json.load(file) - if split_fname_csv.is_file(): # pragma: no cover - logger.warning( - f"Both files with path `{split_fname_csv}` and `{split_fname_json}` exist. JSON file is the standard from 0.1.7 -> CSV file will be ignored" - ) - elif split_fname_csv.is_file(): # pragma: no cover - with split_fname_csv.open("r") as file: - reader = csv.reader(file, delimiter=",") - for row in reader: - split[row[0]] = [int(i) for i in row[1:]] - else: # pragma: no cover - logger.warning( - f"file with path `{split_fname_csv}` or `{split_fname_json}` does not exist. Splits will not be set" - ) - self.set_split(split) - - def extract_problem_definition_from_identifiers( - self, identifiers: Sequence[Union[str, FeatureIdentifier]] - ) -> Self: - """Create a new ProblemDefinition restricted to a subset of feature identifiers. - - Args: - identifiers (Sequence[Union[str, FeatureIdentifier]]): List of identifiers to keep. - - Returns: - ProblemDefinition: A new :class:`ProblemDefinition` instance. - """ - new_problem_definition = ProblemDefinition() - if self._task is not None: - new_problem_definition.set_task(self.get_task()) - if self._name is not None: - new_problem_definition.set_name(self.get_name()) - - in_features = self.filter_in_features_identifiers(identifiers) - if len(in_features) > 0: - new_problem_definition.add_in_features_identifiers(in_features) - - out_features = self.filter_out_features_identifiers(identifiers) - if len(out_features) > 0: - new_problem_definition.add_out_features_identifiers(out_features) - - if self.get_split() is not None: - new_problem_definition.set_split(self.get_split()) - - return new_problem_definition - - # -------------------------------------------------------------------------# - def __repr__(self) -> str: - """Return a string representation of the problem. - - Returns: - str: A string representation of the overview of problem content. - - Example: - .. code-block:: python - - from plaid import ProblemDefinition - problem = ProblemDefinition() - # [...] 
- print(problem) - >>> ProblemDefinition(input_scalars_names=['s_1'], output_scalars_names=['s_2'], input_meshes_names=['mesh'], task='regression', split_names=['train', 'val']) - """ - str_repr = "ProblemDefinition(" - - # ---# features - if len(self.in_features_identifiers) > 0: - in_features_identifiers = self.in_features_identifiers - str_repr += f"{in_features_identifiers=}, " - if len(self.out_features_identifiers) > 0: - out_features_identifiers = self.out_features_identifiers - str_repr += f"{out_features_identifiers=}, " - - # ---# scalars - if len(self.in_scalars_names) > 0: - input_scalars_names = self.in_scalars_names - str_repr += f"{input_scalars_names=}, " - if len(self.out_scalars_names) > 0: - output_scalars_names = self.out_scalars_names - str_repr += f"{output_scalars_names=}, " - # ---# fields - if len(self.in_fields_names) > 0: - input_fields_names = self.in_fields_names - str_repr += f"{input_fields_names=}, " - if len(self.out_fields_names) > 0: - output_fields_names = self.out_fields_names - str_repr += f"{output_fields_names=}, " - # ---# timeseries - if len(self.in_timeseries_names) > 0: - input_timeseries_names = self.in_timeseries_names - str_repr += f"{input_timeseries_names=}, " - if len(self.out_timeseries_names) > 0: - output_timeseries_names = self.out_timeseries_names - str_repr += f"{output_timeseries_names=}, " - # ---# meshes - if len(self.in_meshes_names) > 0: - input_meshes_names = self.in_meshes_names - str_repr += f"{input_meshes_names=}, " - if len(self.out_meshes_names) > 0: - output_meshes_names = self.out_meshes_names - str_repr += f"{output_meshes_names=}, " - # ---# task - if self._task is not None: - task = self._task - str_repr += f"{task=}, " - # ---# split - if self._split is not None: - split_names = list(self._split.keys()) - str_repr += f"{split_names=}, " - - if str_repr[-2:] == ", ": - str_repr = str_repr[:-2] - str_repr += ")" - return str_repr + model_fields = type(self).model_fields.keys() + for key, value in data.items(): + if key in model_fields: + setattr(self, key, value) + else: + logger.warning(f" Data ignored! : {key}: {value}") + raise diff --git a/src/plaid/storage/__init__.py b/src/plaid/storage/__init__.py index 60344be9..ad09f49f 100644 --- a/src/plaid/storage/__init__.py +++ b/src/plaid/storage/__init__.py @@ -1,26 +1,19 @@ """Public API for plaid.storage.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. 
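The persistence flow in the problem_definition.py hunks above reduces to a YAML dump of `model_dump()`. A hedged round-trip sketch, assuming `save_to_file` writes to the given path as the hunk indicates; `_load_from_file_` is the private helper used by the storage readers, and the paths are illustrative.

```python
# Round-trip sketch: save_to_file() dumps the pydantic model as YAML,
# _load_from_file_() assigns recognized fields back onto an instance.
from pathlib import Path

from plaid.problem_definition import ProblemDefinition

problem = ProblemDefinition(name="demo_problem", input_features=["omega", "pressure"])
out_file = Path("/tmp/demo_problem.yaml")
problem.save_to_file(out_file)

restored = ProblemDefinition()
restored._load_from_file_(out_file)
print(restored.input_features)  # -> ['omega', 'pressure']
```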
-# -# - -from plaid.storage.common.reader import ( +from .common.reader import ( load_problem_definitions_from_disk, load_problem_definitions_from_hub, ) -from plaid.storage.common.writer import ( +from .common.writer import ( push_local_problem_definitions_to_hub, save_problem_definitions_to_disk, ) -from plaid.storage.reader import ( +from .reader import ( download_from_hub, init_from_disk, init_streaming_from_hub, ) -from plaid.storage.writer import ( +from .registry import get_backend +from .writer import ( push_to_hub, save_to_disk, ) @@ -35,4 +28,5 @@ "load_problem_definitions_from_hub", "push_local_problem_definitions_to_hub", "save_problem_definitions_to_disk", + "get_backend", ] diff --git a/src/plaid/storage/backend_api.py b/src/plaid/storage/backend_api.py new file mode 100644 index 00000000..74ded837 --- /dev/null +++ b/src/plaid/storage/backend_api.py @@ -0,0 +1,104 @@ +"""Protocol definition for storage backend modules.""" + +from pathlib import Path +from typing import ( + TYPE_CHECKING, + Any, + Callable, + Generator, + Iterable, + Mapping, + Optional, + Protocol, + Union, +) + +import numpy as np +from datasets import IterableDataset + +from ..types import IndexType + +if TYPE_CHECKING: + from ..containers.dataset import Dataset + from ..containers.sample import Sample + + +class BackendModule(Protocol): + """Protocol describing required methods for storage backend plugins.""" + + name: str + + @staticmethod + def init_from_disk(path: Union[str, Path]) -> Mapping[str, Any]: + """Load a dataset dictionary from local storage.""" + ... + + @staticmethod + def download_from_hub( + repo_id: str, + local_dir: Union[str, Path], + split_ids: Optional[dict[str, Iterable[int]]] = None, + features: Optional[list[str]] = None, # noqa: ARG001 + overwrite: bool = False, + ) -> str: + """Download a dataset dictionary from a remote hub into a local folder.""" + ... + + @staticmethod + def init_datasetdict_streaming_from_hub( + repo_id: str, + split_ids: Optional[dict[str, Iterable[int]]] = None, + features: Optional[list[str]] = None, # noqa: ARG001 + ) -> dict[str, IterableDataset]: + """Initialize a streaming dataset dictionary from a remote hub.""" + ... + + @staticmethod + def generate_to_disk( + output_folder: Union[str, Path], + generators: dict[str, Callable[..., Generator["Sample", None, None]]], + variable_schema: Optional[dict[str, dict]] = None, # noqa: ARG001 + gen_kwargs: Optional[dict[str, dict[str, list[IndexType]]]] = None, + num_proc: int = 1, + verbose: bool = False, + ) -> None: + """Generate and save a dataset dictionary to local storage.""" + ... + + @staticmethod + def push_local_to_hub( + repo_id: str, local_dir: Union[str, Path], num_workers: int = 1 + ) -> None: + """Push a local dataset dictionary to a remote hub repository.""" + ... + + @staticmethod + def configure_dataset_card( + repo_id: str, + infos: dict[str, Any], + local_dir: Optional[Union[str, Path]] = None, + viewer: bool = False, + pretty_name: Optional[str] = None, + dataset_long_description: Optional[str] = None, + illustration_urls: Optional[list[str]] = None, + arxiv_paper_urls: Optional[list[str]] = None, + ) -> None: # pragma: no cover + """Configure metadata for a dataset card associated with a repository.""" + ... 
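The reorganized `plaid.storage` entry point now exposes `get_backend` alongside the reader/writer helpers. Below is a hedged dispatch sketch, assuming `get_backend` looks a backend up by its `name` attribute and returns an object satisfying the `BackendModule` protocol; the registry module itself is not part of this hunk, and the local path is illustrative.

```python
# Backend-agnostic dispatch sketch (get_backend signature assumed).
from plaid.storage import get_backend

backend = get_backend("hf_datasets")

# Every backend exposes the same static surface, so callers stay backend-agnostic.
dataset_dict = backend.init_from_disk("/data/my_dataset")
for split_name, split_data in dataset_dict.items():
    print(split_name, type(split_data))
```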
+ + @staticmethod + def to_var_sample_dict( + dataset: "Dataset", + idx: int, + features: Optional[list[str]] = None, + indexers: Optional[dict[str, Any]] = None, + ) -> dict[str, Optional[np.ndarray]]: + """Convert a backend sample to PLAID variable-sample dictionary representation.""" + ... + + @staticmethod + def sample_to_var_sample_dict( + sample: dict[str, Any], + ) -> dict[str, Any]: + """Convert a backend-native sample object to a variable-sample dictionary.""" + ... diff --git a/src/plaid/storage/cgns/__init__.py b/src/plaid/storage/cgns/__init__.py index 225b3b94..f5c86aa3 100644 --- a/src/plaid/storage/cgns/__init__.py +++ b/src/plaid/storage/cgns/__init__.py @@ -1,23 +1,120 @@ """Package for CGNS storage.""" +from collections.abc import Iterable +from pathlib import Path +from typing import Any, Mapping, Optional, Union -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# +from datasets import IterableDataset -from plaid.storage.cgns.reader import ( +from .reader import ( download_datasetdict_from_hub, init_datasetdict_from_disk, init_datasetdict_streaming_from_hub, ) -from plaid.storage.cgns.writer import ( +from .writer import ( configure_dataset_card, generate_datasetdict_to_disk, push_local_datasetdict_to_hub, ) + +class CgnsBackend: + name = "cgns" + + @staticmethod + def init_from_disk(path: Union[str, Path]) -> Mapping[str, Any]: + return init_datasetdict_from_disk(path=path) + + @staticmethod + def download_from_hub( + repo_id: str, + local_dir: Union[str, Path], + split_ids: Optional[dict[str, Iterable[int]]] = None, + features: Optional[list[str]] = None, + overwrite: bool = False, + ) -> str: + return download_datasetdict_from_hub( + repo_id=repo_id, + local_dir=local_dir, + split_ids=split_ids, + features=features, + overwrite=overwrite, + ) + + @staticmethod + def init_datasetdict_streaming_from_hub( + repo_id: str, + split_ids: Optional[dict[str, Iterable[int]]] = None, + features: Optional[list[str]] = None, + ) -> dict[str, IterableDataset]: + return init_datasetdict_streaming_from_hub( + repo_id=repo_id, split_ids=split_ids, features=features + ) + + @staticmethod + def generate_to_disk( + output_folder: Union[str, Path], + generators: dict, + variable_schema: Optional[dict[str, dict]] = None, + gen_kwargs: Optional[dict[str, dict[str, list]]] = None, + num_proc: int = 1, + verbose: bool = False, + ) -> None: + return generate_datasetdict_to_disk( + output_folder=output_folder, + generators=generators, + variable_schema=variable_schema, + gen_kwargs=gen_kwargs, + num_proc=num_proc, + verbose=verbose, + ) + + @staticmethod + def push_local_to_hub( + repo_id: str, local_dir: Union[str, Path], num_workers: int = 1 + ) -> None: + return push_local_datasetdict_to_hub( + repo_id=repo_id, local_dir=local_dir, num_workers=num_workers + ) + + @staticmethod + def configure_dataset_card( + repo_id: str, + infos: dict, + local_dir: Optional[Union[str, Path]] = None, + viewer: bool = False, + pretty_name: Optional[str] = None, + dataset_long_description: Optional[str] = None, + illustration_urls: Optional[list[str]] = None, + arxiv_paper_urls: Optional[list[str]] = None, + ) -> None: + if local_dir is None: + raise ValueError("local_dir must be provided for cgns backend") + return configure_dataset_card( + repo_id=repo_id, + infos=infos, + local_dir=local_dir, + viewer=viewer, + pretty_name=pretty_name, + dataset_long_description=dataset_long_description, + 
illustration_urls=illustration_urls, + arxiv_paper_urls=arxiv_paper_urls, + ) + + @staticmethod + def to_var_sample_dict( + dataset: object, + idx: int, + features: Optional[list[str]] = None, + indexers: Optional[dict[str, Any]] = None, + ) -> dict: + _ = dataset, idx, features, indexers + raise ValueError(f"to_dict not available for 'cgns' backend") + + @staticmethod + def sample_to_var_sample_dict(sample: dict) -> dict: + raise ValueError(f"sample_to_var_sample_dict not available for 'cgns' backend") + + __all__ = [ "configure_dataset_card", "download_datasetdict_from_hub", diff --git a/src/plaid/storage/cgns/reader.py b/src/plaid/storage/cgns/reader.py index 0c47f37d..1a89b771 100644 --- a/src/plaid/storage/cgns/reader.py +++ b/src/plaid/storage/cgns/reader.py @@ -10,14 +10,6 @@ - Selective loading of splits and sample IDs - Integration with PLAID Sample objects """ - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - import logging import os import shutil @@ -31,7 +23,7 @@ from datasets.splits import NamedSplit from huggingface_hub import hf_hub_download, snapshot_download -from plaid import Sample +from ...containers.sample import Sample logger = logging.getLogger(__name__) @@ -54,11 +46,11 @@ def __init__(self, path: Union[str, Path]) -> None: Args: path: Path to the dataset directory. """ - self.path = path + self.path = Path(path) ids = sorted( int(p.name.removeprefix("sample_")) - for p in path.iterdir() + for p in self.path.iterdir() if p.is_dir() and p.name.startswith("sample_") ) self.ids = np.asarray(ids, dtype=int) @@ -222,7 +214,7 @@ def download_datasetdict_from_hub( if overwrite: shutil.rmtree(local_dir) logger.warning(f"Existing {local_dir} directory has been reset.") - elif any(local_dir.iterdir()): + elif any(output_folder.iterdir()): raise ValueError( f"directory {local_dir} already exists and is not empty. Set `overwrite` to True if needed." ) diff --git a/src/plaid/storage/cgns/writer.py b/src/plaid/storage/cgns/writer.py index 44b4e064..434da302 100644 --- a/src/plaid/storage/cgns/writer.py +++ b/src/plaid/storage/cgns/writer.py @@ -4,14 +4,6 @@ It includes utilities for generating datasets from sample generators, saving to disk, uploading to Hugging Face Hub, and configuring dataset cards. """ - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - import logging import multiprocessing as mp from pathlib import Path @@ -21,8 +13,8 @@ from huggingface_hub import DatasetCard, HfApi from tqdm import tqdm -from plaid import Sample -from plaid.types import IndexType +from ...containers.sample import Sample +from ...types import IndexType logger = logging.getLogger(__name__) diff --git a/src/plaid/storage/common/__init__.py b/src/plaid/storage/common/__init__.py index b1ed4bab..4d1b1649 100644 --- a/src/plaid/storage/common/__init__.py +++ b/src/plaid/storage/common/__init__.py @@ -1,19 +1,11 @@ """Package for common functions of the storage backends.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. 
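Because `BackendModule` is a `Protocol`, backends such as `CgnsBackend` above conform structurally rather than by inheritance. The toy sketch below shows the shape of a partial third-party backend; the class, its file format, and any registration mechanism are hypothetical, since `registry.py` is not shown in this diff and a real backend would have to provide the full static surface.

```python
# Hypothetical partial backend written against the BackendModule protocol.
import json
from pathlib import Path
from typing import Any, Mapping, Union


class JsonBackend:
    """Toy backend that reads one JSON file per split from a local directory."""

    name = "json"

    @staticmethod
    def init_from_disk(path: Union[str, Path]) -> Mapping[str, Any]:
        # One *.json file per split, keyed by file stem.
        root = Path(path)
        return {p.stem: json.loads(p.read_text()) for p in root.glob("*.json")}

    @staticmethod
    def sample_to_var_sample_dict(sample: dict[str, Any]) -> dict[str, Any]:
        # Toy samples are already flat dictionaries.
        return dict(sample)
```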
-# -# - -from plaid.storage.common.bridge import ( +from .bridge import ( plaid_to_sample_dict, to_plaid_sample, to_sample_dict, ) -from plaid.storage.common.preprocessor import preprocess -from plaid.storage.common.reader import ( +from .preprocessor import preprocess +from .reader import ( load_infos_from_disk, load_infos_from_hub, load_metadata_from_disk, @@ -21,7 +13,7 @@ load_problem_definitions_from_disk, load_problem_definitions_from_hub, ) -from plaid.storage.common.writer import ( +from .writer import ( push_infos_to_hub, save_infos_to_disk, save_metadata_to_disk, diff --git a/src/plaid/storage/common/bridge.py b/src/plaid/storage/common/bridge.py index 02b35139..cd4ecef5 100644 --- a/src/plaid/storage/common/bridge.py +++ b/src/plaid/storage/common/bridge.py @@ -3,23 +3,16 @@ This module provides bridge functions for converting between PLAID samples and storage formats, including flattening/unflattening and sample reconstruction. """ - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - -from typing import Any, Optional +from typing import Any, Iterable, Optional import numpy as np from plaid.containers.features import SampleFeatures from plaid.containers.sample import Sample -from plaid.storage.common.preprocessor import build_sample_dict from plaid.utils.cgns_helper import unflatten_cgns_tree +from .preprocessor import build_sample_dict + def unflatten_path(key: str) -> str: """Unflattens a Zarr key by replacing underscores with slashes. @@ -219,7 +212,9 @@ def to_plaid_sample( def plaid_to_sample_dict( - sample: Sample, variable_features: list[str], constant_features: list[str] + sample: Sample, + variable_features: Iterable[str], + constant_features: Iterable[str], ) -> dict[str, Any]: """Convert PLAID Sample to sample dict. diff --git a/src/plaid/storage/common/preprocessor.py b/src/plaid/storage/common/preprocessor.py index d9442f2f..620282dd 100644 --- a/src/plaid/storage/common/preprocessor.py +++ b/src/plaid/storage/common/preprocessor.py @@ -4,28 +4,22 @@ for storage, including flattening CGNS trees, inferring data types, and handling parallel processing of sample shards. """ - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. 
-# -# - import hashlib +import logging import multiprocessing as mp -import sys -import traceback from queue import Empty from typing import Any, Callable, Generator, Optional, Union import numpy as np from tqdm import tqdm -from plaid import Sample from plaid.types import IndexType from plaid.utils.cgns_helper import flatten_cgns_tree +from ...containers.sample import Sample + +logger = logging.getLogger(__name__) + def infer_dtype(value: Any) -> dict[str, int | str]: """Infer canonical dtype schema from a value.""" @@ -314,8 +308,7 @@ def _process_shard_debug( try: return process_shard(generator_fn, progress_queue, n_proc, shard_ids) except Exception as e: - print(f"Exception in worker for shards {shard_ids}: {e}", file=sys.stderr) - traceback.print_exc() + logger.exception("Exception in worker for shards %s: %s", shard_ids, e) raise # re-raise to propagate to main process diff --git a/src/plaid/storage/common/reader.py b/src/plaid/storage/common/reader.py index 320a9d39..d6f0f409 100644 --- a/src/plaid/storage/common/reader.py +++ b/src/plaid/storage/common/reader.py @@ -3,14 +3,6 @@ This module provides common utilities for reading dataset metadata, problem definitions, and other auxiliary files from disk or downloading them from Hugging Face Hub. """ - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - import json import logging import tempfile @@ -21,7 +13,7 @@ import yaml from huggingface_hub import hf_hub_download, snapshot_download -from plaid import ProblemDefinition +from ...problem_definition import ProblemDefinition logger = logging.getLogger(__name__) @@ -93,7 +85,10 @@ def load_problem_definitions_from_disk( ValueError: If the ``problem_definitions/`` directory does not exist. """ - pb_def_dir = Path(path) / Path("problem_definitions") + + pb_def_dir = Path(path).absolute() + if pb_def_dir.name != "problem_definitions": + pb_def_dir /= Path("problem_definitions") if pb_def_dir.is_dir(): pb_defs = {} @@ -101,32 +96,33 @@ def load_problem_definitions_from_disk( if p.is_file(): pb_def = ProblemDefinition() pb_def._load_from_file_(pb_def_dir / Path(p.name)) - pb_defs[pb_def.get_name()] = pb_def + pb_defs[pb_def.name] = pb_def return pb_defs else: - raise ValueError("No problem definitions found on disk.") # pragma: no cover + raise ValueError( + f"No problem definitions found on disk. path '{pb_def_dir}'" + ) # pragma: no cover def load_constants_from_disk(path): """Load constant features stored under a dataset's "constants" directory. - The function expects the following layout under /constants/: - - one folder per split (e.g. "train", "test", ...) - each containing: - * layout.json : mapping constant_name -> {'offset': int, 'shape': [..]} or None - * constant_schema.yaml : YAML describing dtype for each constant (dtype string or "string") - * data.mmap : raw bytes memory-mapped file containing packed constant data + The function expects the following layout under /constants/. One folder per split (e.g. "train", "test", ...) + each containing: + - layout.json : mapping constant_name -> {'offset': int, 'shape': [..]} or None + - constant_schema.yaml : YAML describing dtype for each constant (dtype string or "string") + - data.mmap : raw bytes memory-mapped file containing packed constant data Args: path (str | Path): Root dataset directory that contains the "constants" folder. 
Returns: tuple: - flat_cst (dict[str, dict[str, Any]]): Mapping split -> {constant_name: numpy array | None}. - - Numeric constants are returned as ``np.memmap`` arrays backed by - ``data.mmap`` in the dataset directory. - - String constants are returned as 1-element numpy arrays of Python str decoded using ASCII. - - If layout entry for a key is None, the value is returned as None. + flat_cst (dict[str, dict[str, Any]]): Mapping from split names to dictionaries of constant + values. Numeric constants are returned as + ``np.memmap`` arrays backed by ``data.mmap``. String constants are + returned as 1-element numpy arrays of Python strings decoded using + ASCII. If a layout entry is ``None``, the returned value is ``None`` constant_schema (dict[str, dict[str, Any]]): Mapping split -> loaded constant schema (from YAML). Raises: diff --git a/src/plaid/storage/common/writer.py b/src/plaid/storage/common/writer.py index 30a52b45..0d67aa58 100644 --- a/src/plaid/storage/common/writer.py +++ b/src/plaid/storage/common/writer.py @@ -4,14 +4,6 @@ and other auxiliary files to disk or uploading them to Hugging Face Hub. It handles serialization of infos, problem definitions, and dataset tree structures. """ - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - import io import json import logging @@ -22,7 +14,7 @@ import yaml from huggingface_hub import HfApi -from plaid import ProblemDefinition +from ...problem_definition import ProblemDefinition logger = logging.getLogger(__name__) @@ -58,7 +50,7 @@ def save_problem_definitions_to_disk( pb_defs (Union[dict[str, ProblemDefinition], ProblemDefinition]): The problem definitions to save. """ if isinstance(pb_defs, ProblemDefinition): - pb_defs = {pb_defs.get_name(): pb_defs} + pb_defs = {pb_defs.name: pb_defs} target_dir = Path(path) / "problem_definitions" target_dir.mkdir(parents=True, exist_ok=True) @@ -179,7 +171,8 @@ def save_constants_to_disk(path, constant_schema, flat_cst): offset += nbytes - json.dump(layout, open(cst_path / "layout.json", "w"), indent=2) + with open(cst_path / "layout.json", "w") as f: + json.dump(layout, f, indent=2) with open(cst_path / "constant_schema.yaml", "w", encoding="utf-8") as f: yaml.dump(constant_schema[split], f, sort_keys=False) @@ -271,6 +264,7 @@ def push_local_problem_definitions_to_hub( ``HfApi.upload_folder``. Expected local layout: + / problem_definitions/ @@ -337,7 +331,8 @@ def push_local_metadata_to_hub( (e.g. via ``save_metadata_to_disk``). This function performs no validation, transformation, or serialization; it strictly uploads existing files. - Expected local layout: + Expected local layout:: + / constants/ / diff --git a/src/plaid/storage/hf_datasets/__init__.py b/src/plaid/storage/hf_datasets/__init__.py index 5dfa06f7..4deac76b 100644 --- a/src/plaid/storage/hf_datasets/__init__.py +++ b/src/plaid/storage/hf_datasets/__init__.py @@ -1,27 +1,128 @@ """Package for HF_datasets storage.""" +from collections.abc import Iterable, Mapping +from pathlib import Path +from typing import Any, Optional, Union -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. 
-# -# +import numpy as np +from datasets import Dataset, IterableDatasetDict -from plaid.storage.hf_datasets.bridge import ( +from .bridge import ( sample_to_var_sample_dict, to_var_sample_dict, ) -from plaid.storage.hf_datasets.reader import ( +from .reader import ( download_datasetdict_from_hub, init_datasetdict_from_disk, init_datasetdict_streaming_from_hub, ) -from plaid.storage.hf_datasets.writer import ( +from .writer import ( configure_dataset_card, generate_datasetdict_to_disk, push_local_datasetdict_to_hub, ) + +class HFBackend: + name = "hf_datasets" + + @staticmethod + def init_from_disk(path: Union[str, Path]) -> Mapping[str, Any]: + return init_datasetdict_from_disk(path=path) + + @staticmethod + def download_from_hub( + repo_id: str, + local_dir: Union[str, Path], + split_ids: Optional[dict[str, Iterable[int]]] = None, + features: Optional[list[str]] = None, # noqa: ARG001 + overwrite: bool = False, + ) -> str: + return download_datasetdict_from_hub( + repo_id=repo_id, + local_dir=local_dir, + split_ids=split_ids, + features=features, + overwrite=overwrite, + ) + + @staticmethod + def init_datasetdict_streaming_from_hub( + repo_id: str, + split_ids: Optional[dict[str, Iterable[int]]] = None, + features: Optional[list[str]] = None, + ) -> IterableDatasetDict: + return init_datasetdict_streaming_from_hub( + repo_id=repo_id, split_ids=split_ids, features=features + ) + + @staticmethod + def generate_to_disk( + output_folder: Union[str, Path], + generators: dict, + variable_schema: dict[str, dict], + gen_kwargs: Optional[dict[str, dict[str, list]]] = None, + num_proc: int = 1, + verbose: bool = False, + ) -> None: + return generate_datasetdict_to_disk( + output_folder=output_folder, + generators=generators, + variable_schema=variable_schema, + gen_kwargs=gen_kwargs, + num_proc=num_proc, + verbose=verbose, + ) + + @staticmethod + def push_local_to_hub( + repo_id: str, local_dir: Union[str, Path], num_workers: int = 1 + ) -> None: + return push_local_datasetdict_to_hub( + repo_id=repo_id, local_dir=local_dir, num_workers=num_workers + ) + + @staticmethod + def configure_dataset_card( + repo_id: str, + infos: dict, + local_dir: Optional[Union[str, Path]] = None, + viewer: bool = False, + pretty_name: Optional[str] = None, + dataset_long_description: Optional[str] = None, + illustration_urls: Optional[list[str]] = None, + arxiv_paper_urls: Optional[list[str]] = None, + ) -> None: + return configure_dataset_card( + repo_id=repo_id, + infos=infos, + local_dir=local_dir, + viewer=viewer, + pretty_name=pretty_name, + dataset_long_description=dataset_long_description, + illustration_urls=illustration_urls, + arxiv_paper_urls=arxiv_paper_urls, + ) + + @staticmethod + def to_var_sample_dict( + dataset: Dataset, + idx: int, + features: Optional[list[str]] = None, + indexers: Optional[dict[str, Any]] = None + ) -> dict[str, Optional[np.ndarray]]: + return to_var_sample_dict( + ds=dataset, + i=idx, + features=features, + indexers= indexers + + ) + + @staticmethod + def sample_to_var_sample_dict(sample: dict[str, Any]) -> dict[str, Any]: + return sample_to_var_sample_dict(hf_sample=sample) + + __all__ = [ "configure_dataset_card", "download_datasetdict_from_hub", diff --git a/src/plaid/storage/hf_datasets/bridge.py b/src/plaid/storage/hf_datasets/bridge.py index 317c9824..ba393292 100644 --- a/src/plaid/storage/hf_datasets/bridge.py +++ b/src/plaid/storage/hf_datasets/bridge.py @@ -4,14 +4,6 @@ and Hugging Face Datasets format. 
It includes utilities for feature type conversion, dataset generation from PLAID objects, and sample reconstruction. """ - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - from functools import partial from typing import Any, Callable, Generator, Optional @@ -20,10 +12,11 @@ import pyarrow as pa from datasets import Features, Sequence, Value -from plaid import Sample from plaid.storage.common.preprocessor import build_sample_dict from plaid.types import IndexType +from ...containers.sample import Sample + def convert_dtype_to_hf_feature(feature_type: dict[str, Any]): """Convert a PLAID feature type dict to Hugging Face Feature. @@ -159,7 +152,7 @@ def to_var_sample_dict( ds: datasets.Dataset, i: int, features: Optional[list[str]] = None, - enforce_shapes: bool = True, + indexers: Optional[dict[str, Any]] = None, ) -> dict[str, Optional[np.ndarray]]: """Convert a Hugging Face dataset row to a variable sample dict containing the features that vary in the dataset. @@ -167,7 +160,8 @@ def to_var_sample_dict( ds (datasets.Dataset): The Hugging Face dataset. i (int): The row index. features: Iterable of feature names (keys) to extract from the dataset. - enforce_shapes (bool): Whether to enforce consistent shapes. + indexers: Optional mapping ``feature_path -> indexer`` used to select + feature values along the last axis. Returns: dict[str, Optional[np.ndarray]]: The variable sample dictionary. @@ -175,41 +169,87 @@ def to_var_sample_dict( table = ds.data if features is None: - features = table.column_names + selected_features = list(table.column_names) else: missing = set(features) - set(table.column_names) if missing: # pragma: no cover raise KeyError(f"Missing features in hf_dataset: {sorted(missing)}") + selected_features = features var_sample_dict = {} - if not enforce_shapes: - for name in features: + indexers = indexers or {} + + for name in selected_features: + if isinstance(table[name][i], pa.NullScalar): + var_sample_dict[name] = None # pragma: no cover + else: value = table[name][i].values if value is None: var_sample_dict[name] = None # pragma: no cover else: - var_sample_dict[name] = value.to_numpy(zero_copy_only=False) - else: - for name in features: - if isinstance(table[name][i], pa.NullScalar): - var_sample_dict[name] = None # pragma: no cover - else: - value = table[name][i].values - if value is None: - var_sample_dict[name] = None # pragma: no cover + if name in indexers: + var_sample_dict[name] = _extract_indexed_arrow( + value, indexers[name], name + ) + elif isinstance(value, pa.ListArray): + var_sample_dict[name] = np.stack( + value.to_numpy(zero_copy_only=False) + ) + elif isinstance(value, pa.StringArray): # pragma: no cover + var_sample_dict[name] = value.to_numpy(zero_copy_only=False) else: - if isinstance(value, pa.ListArray): - var_sample_dict[name] = np.stack( - value.to_numpy(zero_copy_only=False) - ) - elif isinstance(value, pa.StringArray): # pragma: no cover - var_sample_dict[name] = value.to_numpy(zero_copy_only=False) - else: - var_sample_dict[name] = value.to_numpy(zero_copy_only=True) + var_sample_dict[name] = value.to_numpy(zero_copy_only=True) return var_sample_dict +def _extract_indexed_arrow(value: Any, indexer: Any, feat_name: str) -> np.ndarray: + """Extract selected indices along the last axis from an Arrow value.""" + if indexer is None: # pragma: no cover + return _to_numpy_arrow(value) + + if isinstance(indexer, slice): + return 
_apply_indexer(_to_numpy_arrow(value), indexer, feat_name) + + idx = np.asarray(indexer, dtype=np.int64) + if idx.ndim != 1: + raise ValueError( + f"Indexer for feature '{feat_name}' must be a 1D sequence or slice" + ) + + # Best effort: for primitive arrays, gather directly with Arrow before numpy conversion. + if isinstance(value, pa.Array) and not isinstance(value, pa.ListArray): + axis_size = len(value) + if np.any(idx >= axis_size) or np.any(idx < -axis_size): + raise IndexError( + f"Indexer for feature '{feat_name}' contains out-of-bounds values " + f"for last axis of size {axis_size}" + ) + idx_arrow = pa.array(idx, type=pa.int64()) + taken = value.take(idx_arrow) + return taken.to_numpy(zero_copy_only=False) + + return _apply_indexer(_to_numpy_arrow(value), idx, feat_name) + + +def _to_numpy_arrow(value: Any) -> np.ndarray: + """Convert Arrow values used by storage bridge to numpy arrays.""" + if isinstance(value, pa.ListArray): + return np.stack(value.to_numpy(zero_copy_only=False)) + if isinstance(value, pa.StringArray): # pragma: no cover + return value.to_numpy(zero_copy_only=False) + return value.to_numpy(zero_copy_only=False) + + +def _apply_indexer(arr: np.ndarray, indexer: Any, feat_name: str) -> np.ndarray: + """Apply a last-axis indexer to a numpy array.""" + if arr.ndim == 0: + raise ValueError(f"Cannot apply indexer to scalar feature '{feat_name}'") + + selector_prefix = (slice(None),) * (arr.ndim - 1) + return np.asarray(arr[selector_prefix + (indexer,)]) + + def sample_to_var_sample_dict( hf_sample: dict[str, Any], ) -> dict[str, Any]: diff --git a/src/plaid/storage/hf_datasets/reader.py b/src/plaid/storage/hf_datasets/reader.py index 38f4ca01..502afa50 100644 --- a/src/plaid/storage/hf_datasets/reader.py +++ b/src/plaid/storage/hf_datasets/reader.py @@ -10,20 +10,12 @@ - If the dataset is already cached locally, loads from disk. - Otherwise, loads from the hub, optionally using streaming mode. """ - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - import logging import os import shutil import tempfile from pathlib import Path -from typing import Optional, Union +from typing import Iterable, Optional, Union import datasets from datasets import load_dataset, load_from_disk @@ -38,7 +30,7 @@ # Load from disk # ------------------------------------------------------ -HFDatasetDict = dict[str, datasets.DatasetDict] +HFDatasetDict = datasets.DatasetDict def init_datasetdict_from_disk(path: Union[str, Path]) -> HFDatasetDict: @@ -50,7 +42,10 @@ def init_datasetdict_from_disk(path: Union[str, Path]) -> HFDatasetDict: Returns: HFDatasetDict: The loaded dataset dictionary. 
""" - return load_from_disk(dataset_path=str(Path(path) / "data")) + dataset = load_from_disk(dataset_path=str(Path(path) / "data")) + if not isinstance(dataset, datasets.DatasetDict): # pragma: no cover + raise TypeError("Expected DatasetDict when loading hf_datasets backend from disk") + return dataset # ------------------------------------------------------ @@ -61,7 +56,7 @@ def init_datasetdict_from_disk(path: Union[str, Path]) -> HFDatasetDict: def download_datasetdict_from_hub( repo_id: str, local_dir: Union[str, Path], - split_ids: Optional[dict[str, int]] = None, # noqa: ARG001 + split_ids: Optional[dict[str, Iterable[int]]] = None, # noqa: ARG001 features: Optional[list[str]] = None, # noqa: ARG001 overwrite: bool = False, ) -> str: # pragma: no cover (not tested in unit tests) @@ -70,7 +65,7 @@ def download_datasetdict_from_hub( Args: repo_id (str): The repository ID on Hugging Face Hub. local_dir (Union[str, Path]): Local directory to download to. - split_ids (Optional[dict[str, int]]): Unused parameter for split selection. + split_ids (Optional[dict[str, Iterable[int]]]): Unused parameter for split selection. features (Optional[list[str]]): Unused parameter for feature selection. overwrite (bool): Whether to overwrite existing directory. @@ -102,19 +97,19 @@ def download_datasetdict_from_hub( datasetdict = load_dataset("parquet", data_files=data_files, cache_dir=tmp_dir) datasetdict.save_to_disk(str(Path(output_folder) / "data")) - return output_folder + return str(output_folder) def init_datasetdict_streaming_from_hub( repo_id: str, - split_ids: Optional[dict[str, int]] = None, # noqa: ARG001 + split_ids: Optional[dict[str, Iterable[int]]] = None, # noqa: ARG001 features: Optional[list[str]] = None, ) -> datasets.IterableDatasetDict: # pragma: no cover """Initializes a streaming DatasetDict from Hugging Face Hub. Args: repo_id (str): The repository ID on Hugging Face Hub. - split_ids (Optional[dict[str, int]]): Unused parameter for split selection. + split_ids (Optional[dict[str, Iterable[int]]]): Unused parameter for split selection. features (Optional[list[str]]): Optional list of features to load. Returns: diff --git a/src/plaid/storage/hf_datasets/writer.py b/src/plaid/storage/hf_datasets/writer.py index 430ddcc2..625e1769 100644 --- a/src/plaid/storage/hf_datasets/writer.py +++ b/src/plaid/storage/hf_datasets/writer.py @@ -10,13 +10,6 @@ - Hub uploading with optimized sharding - Dataset card configuration and updating """ - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. 
-# -# import gc import logging import tempfile @@ -27,11 +20,12 @@ import yaml from huggingface_hub import DatasetCard, hf_hub_download -from plaid import Sample from plaid.storage.hf_datasets.bridge import generator_to_datasetdict from plaid.storage.hf_datasets.reader import init_datasetdict_from_disk from plaid.types import IndexType +from ...containers.sample import Sample + logger = logging.getLogger(__name__) diff --git a/src/plaid/storage/in_memory/__init__.py b/src/plaid/storage/in_memory/__init__.py new file mode 100644 index 00000000..3b50b12a --- /dev/null +++ b/src/plaid/storage/in_memory/__init__.py @@ -0,0 +1,247 @@ +from pathlib import Path +from typing import ( + Any, + Callable, + Dict, + Generator, + Iterable, + Optional, + Sequence, + Union, + overload, +) + +import numpy as np + +from ...containers.sample import Sample +from ...types import IndexType + + +def _find_first_missing(d: Iterable[int]) -> int: + key = 0 # Or 0, depending on your starting preference + while key in d: + key += 1 + return key + + +class InMemoryBackend: + name = "in_memory" + + @staticmethod + def init_from_disk(path: Union[str, Path]) -> Any: + """Raise because the in-memory backend cannot be initialized from disk.""" + raise NotImplementedError("inMemoryBackend does not support init from disk") + + def download_from_hub( + self, + repo_id: str, + local_dir: Union[str, Path], + split_ids: Optional[dict[str, Iterable[int]]] = None, + features: Optional[list[str]] = None, + overwrite: bool = False, + ) -> str: + """Raise because hub download is not implemented for the in-memory backend.""" + raise NotImplementedError("InMemoryBackend download_from_hub not implemented") + + def init_datasetdict_streaming_from_hub( + self, + repo_id: str, + split_ids: Optional[dict[str, Iterable[int]]] = None, + features: Optional[list[str]] = None, + ) -> dict[str, Any]: + """Raise because streaming from hub is not implemented for this backend.""" + raise NotImplementedError( + "InMemoryBackend init_datasetdict_streaming_from_hub not implemented" + ) + + def generate_to_disk( + self, + output_folder: Union[str, Path], + generators: dict[str, Callable[..., Generator[Sample, None, None]]], + variable_schema: Optional[dict[str, dict]] = None, + gen_kwargs: Optional[dict[str, dict[str, list[IndexType]]]] = None, + num_proc: int = 1, + verbose: bool = False, + ) -> None: + """Raise because writing to disk is not implemented for the in-memory backend.""" + raise NotImplementedError("InMemoryBackend generate_to_disk not implemented") + + def push_local_to_hub( + self, repo_id: str, local_dir: Union[str, Path], num_workers: int = 1 + ) -> None: + """Raise because pushing to hub is not implemented for this backend.""" + raise NotImplementedError("InMemoryBackend push_local_to_hub not implemented") + + def configure_dataset_card( + self, + repo_id: str, + infos: dict, + local_dir: Optional[Union[str, Path]] = None, + viewer: bool = False, + pretty_name: Optional[str] = None, + dataset_long_description: Optional[str] = None, + illustration_urls: Optional[list[str]] = None, + arxiv_paper_urls: Optional[list[str]] = None, + ) -> None: + """Raise because dataset-card configuration is not implemented for this backend.""" + raise NotImplementedError( + "InMemoryBackend configure_dataset_card not implemented" + ) + + def __init__(self) -> None: + self._samples: Dict[int, Sample] = {} + + def __len__(self) -> int: + return len(self._samples) + + def __getitem__( + self, key: Union[int, slice, Sequence[int]] + ) -> Union[Sample, 
list[Sample]]: + if isinstance(key, (slice, Sequence)): + return [ + self._samples[k] + for k in ( + range(*key.indices(len(self))) if isinstance(key, slice) else key + ) + ] + return self._samples[key] + + def add_sample( + self, + sample: Union[Sample, Sequence[Sample]], + sample_id: Optional[Union[int, Sequence[int]]] = None, + *, + id: Optional[Union[int, Sequence[int]]] = None, + ) -> Union[int, list[int]]: + """Add one sample or a sequence of samples to the in-memory backend. + + Args: + sample: One :class:`Sample` or a sequence of samples. + sample_id: Optional id(s) associated with ``sample``. + id: Alias of ``sample_id`` for backward compatibility. + + Returns: + Added sample id or list of added ids. + """ + if sample_id is None: + sample_id = id + + if isinstance(sample, Sample): + if sample_id is None: + sample_id = len(self) + elif not isinstance(sample_id, int): + raise TypeError("sample_id must be an int when samples is a Sample") + + self.set_sample(sample=sample, sample_id=sample_id) + return sample_id + elif isinstance(sample, Sequence): + if sample_id is None: + sample_id = list(range(len(self), len(self) + len(sample))) + elif not isinstance(sample_id, Sequence): + raise TypeError( + "sample_id must be a sequence when samples is a sequence" + ) + else: + if len(sample_id) != len(np.unique(sample_id)): + raise ValueError("sample_ids must be unique") + + if len(sample) != len(sample_id): + raise ValueError( + "The length of the list of samples to add and the list of IDs are different" + ) + + return [ + self.add_sample(sample=s, sample_id=i) + for i, s in zip(sample_id, sample) + ] + else: + raise TypeError( + f"sample must be a Sample of sequence[Samples], not : {type(sample)}" + ) + + @overload + def set_sample(self, sample: Sample, sample_id: Optional[int] = None) -> int: ... + + @overload + def set_sample( + self, sample: Sequence[Sample], sample_id: Optional[Sequence[int]] + ) -> list[int]: ... + + def set_sample( + self, + sample: Union[Sample, Sequence[Sample]], + sample_id: Optional[Union[int, Sequence[int]]] = None, + *, + id: Optional[Union[int, Sequence[int]]] = None, + ) -> Union[int, list[int]]: + """Set the samples of the data set, overwriting the existing ones. + + Args: + sample: A single sample or a sequence of samples to set. + sample_id: Optional single id or sequence of ids matching ``sample``. + + Raises: + TypeError: If ``sample`` is not a :class:`Sample` or sequence of samples. + TypeError: If ``sample_id`` type does not match the ``sample`` kind. + ValueError: If a provided integer sample_id is negative. 
+ """ + if sample_id is None: + sample_id = id + + if isinstance(sample, Sequence) and not isinstance(sample, Sample): + if sample_id is None: + return [self.set_sample(s) for s in sample] + if not isinstance(sample_id, Sequence): # pragma: no cover + raise TypeError( + "sample_id should be a sequence when sample is a sequence" + ) + added_ids: list[int] = [] + for i, s in zip(sample_id, sample): + added_id = self.set_sample(sample=s, sample_id=i) + if not isinstance(added_id, int): # pragma: no cover + raise TypeError("expected integer id when adding one sample") + added_ids.append(added_id) + return added_ids + + if not (isinstance(sample, Sample)): + raise TypeError(f"sample should be of type Sample but is {type(sample)=}") + + if sample_id is None: + sample_id = _find_first_missing(self._samples) + elif not (isinstance(sample_id, int)): + raise TypeError( + f"sample_id should be of type {int.__class__} but {type(sample_id)=}" + ) + + if sample_id < 0: + raise ValueError(f"sample_id should be positive (sample_id>=0) but {sample_id=}") + + self._samples[sample_id] = sample + + return sample_id + + def merge_dataset(self, dataset: Any) -> Optional[list[int]]: + """Merges samples of another dataset into this one. + + Args: + dataset (Dataset): The data set to be merged into this one (self). + in_place (bool, option): If True, modifies the current dataset in place. + + Returns: + Optional[list[int]]: ids of added :class:`Samples ` + from input :class:`Dataset `. Returns + ``None`` when ``dataset`` is ``None``. + + Raises: + ValueError: If the provided dataset value is not an instance of Dataset + """ + if dataset is None: + return None + + added_ids: list[int] = [] + for i in range(len(dataset)): + added_id = self.add_sample(dataset[i]) + if not isinstance(added_id, int): # pragma: no cover + raise TypeError("expected integer id when merging dataset samples") + added_ids.append(added_id) + return added_ids diff --git a/src/plaid/storage/reader.py b/src/plaid/storage/reader.py index bfb141ff..6d5628e9 100644 --- a/src/plaid/storage/reader.py +++ b/src/plaid/storage/reader.py @@ -10,18 +10,9 @@ - Automatic backend detection and converter creation - Sample conversion between storage formats and PLAID objects """ - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - from pathlib import Path from typing import Any, Iterable, Optional, Union -from plaid import Sample from plaid.storage.common.bridge import ( plaid_to_sample_dict, to_plaid_sample, @@ -42,6 +33,8 @@ from plaid.storage.registry import get_backend from plaid.utils.cgns_helper import update_features_for_CGNS_compatibility +from ..containers.sample import Sample + class Converter: """Converter class for transforming samples between storage and PLAID formats. @@ -83,6 +76,7 @@ def to_dict( dataset: Any, idx: int, features: Optional[list[str]] = None, + indexers: Optional[dict[str, Any]] = None, ) -> dict[float, dict[str, Any]]: """Convert a dataset sample to dictionary format. @@ -91,6 +85,9 @@ def to_dict( idx: Index of the sample to convert. features: Optional list of feature names to include from the variable fields. If None, all variable features available for the backend are included. + indexers: Optional mapping ``feature_path -> indexer`` used to extract only + selected indices inside variable features. Indexing semantics are + backend-dependent and ignored for non-requested features. 
Returns: dict: Sample data in dictionary format. @@ -98,21 +95,37 @@ def to_dict( Raises: ValueError: If called with CGNS backend. """ - if self.backend_spec.to_var_sample_dict is None: + if self.backend_spec.to_var_sample_dict is None: # pragma: no cover raise ValueError( f"Converter.to_dict not available for {self.backend} backend" ) if features: features = update_features_for_CGNS_compatibility( - features, self.constant_features, self.variable_features + features, + self.constant_features, + self.variable_features, ) req_var_feat = [f for f in features if f in self.variable_features] else: req_var_feat = None + if indexers is not None: + unknown = set(indexers.keys()) - self.variable_features + if unknown: + raise KeyError( + f"Indexers contain unknown variable features: {sorted(unknown)}" + ) + if req_var_feat is not None: + not_requested = set(indexers.keys()) - set(req_var_feat) + if not_requested: + raise KeyError( + "Indexers contain features not present in requested variable " + f"features: {sorted(not_requested)}" + ) + var_sample_dict = self.backend_spec.to_var_sample_dict( - dataset, idx, features=req_var_feat + dataset, idx, features=req_var_feat, indexers=indexers ) return to_sample_dict(var_sample_dict, self.flat_cst, self.cgns_types, features) @@ -121,6 +134,7 @@ def to_plaid( dataset: Any, idx: int, features: Optional[list[str]] = None, + indexers: Optional[dict[str, Any]] = None, ) -> Sample: """Convert a dataset sample to PLAID Sample object. @@ -130,16 +144,20 @@ def to_plaid( features: Optional list of feature names to include from the variable fields. If None, all variable features available for the backend are included. Features are retreated based on self.constant_features and self.variable_features to satisfy the CGNS conventions. + indexers: Optional mapping ``feature_path -> indexer`` used to extract only + selected indices inside variable features. Returns: Sample: A PLAID Sample object. """ if features: features = update_features_for_CGNS_compatibility( - features, self.constant_features, self.variable_features + features, + self.constant_features, + self.variable_features, ) if self.backend != "cgns": - sample_dict = self.to_dict(dataset, idx, features) + sample_dict = self.to_dict(dataset, idx, features, indexers=indexers) return to_plaid_sample(sample_dict, self.cgns_types) else: return dataset[idx] @@ -156,7 +174,7 @@ def sample_to_dict(self, sample: Sample) -> dict[float, dict[str, Any]]: Raises: ValueError: If called with CGNS backend. """ - if self.backend_spec.sample_to_var_sample_dict is None: + if self.backend_spec.sample_to_var_sample_dict is None: # pragma: no cover raise ValueError( f"Converter.sample_to_var_sample_dict not available for {self.backend} backend" ) @@ -188,7 +206,9 @@ def plaid_to_dict(self, plaid_sample: Sample) -> dict[str, Any]: dict: Sample data in dictionary format suitable for storage. 
""" return plaid_to_sample_dict( - plaid_sample, self.variable_features, self.constant_features + plaid_sample, + self.variable_features, + self.constant_features, ) def __repr__(self) -> str: @@ -225,8 +245,7 @@ def init_from_disk( backend = infos["storage_backend"] num_samples = infos["num_samples"] - backend_spec = get_backend(backend) - datasetdict = backend_spec.init_from_disk(local_dir) + datasetdict = get_backend(backend).init_from_disk(path=local_dir) if splits is None: splits = list(datasetdict.keys()) @@ -310,7 +329,9 @@ def init_streaming_from_hub( num_samples = infos["num_samples"] backend_spec = get_backend(backend) - datasetdict = backend_spec.init_streaming_from_hub(repo_id, split_ids, features) + datasetdict = backend_spec.init_datasetdict_streaming_from_hub( + repo_id, split_ids, features + ) converterdict = {} for split in datasetdict.keys(): diff --git a/src/plaid/storage/registry.py b/src/plaid/storage/registry.py index 09da48a6..17ab6d73 100644 --- a/src/plaid/storage/registry.py +++ b/src/plaid/storage/registry.py @@ -1,95 +1,28 @@ -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - """Backend registry for plaid.storage. This module centralizes backend wiring so reader/writer code can use a single source of truth for backend capabilities. """ -from dataclasses import dataclass -from pathlib import Path -from typing import Any, Callable, Optional, Union - -import datasets - -from plaid.storage.cgns.reader import CGNSDatasetDict -from plaid.storage.hf_datasets.reader import HFDatasetDict -from plaid.storage.zarr.reader import ZarrDatasetDict - -from . import cgns, hf_datasets, zarr - - -@dataclass(frozen=True) -class BackendSpec: - """Backend wiring for storage operations.""" - - name: str - init_from_disk: Callable[ - [Union[str, Path]], Union[CGNSDatasetDict, HFDatasetDict, ZarrDatasetDict] - ] - download_from_hub: Callable[..., str] - init_streaming_from_hub: Callable[ - ..., dict[str, datasets.IterableDataset] | datasets.IterableDatasetDict - ] - generate_to_disk: Callable[..., None] - push_local_to_hub: Callable[..., None] - configure_dataset_card: Callable[..., None] - to_var_sample_dict: Optional[Callable[..., dict[str, Any]]] - sample_to_var_sample_dict: Optional[Callable[..., dict[str, Any]]] - +from . 
import cgns, hf_datasets, in_memory, zarr +from .backend_api import BackendModule BACKENDS = { - "cgns": BackendSpec( - name="cgns", - init_from_disk=cgns.init_datasetdict_from_disk, - download_from_hub=cgns.download_datasetdict_from_hub, - init_streaming_from_hub=cgns.init_datasetdict_streaming_from_hub, - generate_to_disk=cgns.generate_datasetdict_to_disk, - push_local_to_hub=cgns.push_local_datasetdict_to_hub, - configure_dataset_card=cgns.configure_dataset_card, - to_var_sample_dict=None, - sample_to_var_sample_dict=None, - ), - "hf_datasets": BackendSpec( - name="hf_datasets", - init_from_disk=hf_datasets.init_datasetdict_from_disk, - download_from_hub=hf_datasets.download_datasetdict_from_hub, - init_streaming_from_hub=hf_datasets.init_datasetdict_streaming_from_hub, - generate_to_disk=hf_datasets.generate_datasetdict_to_disk, - push_local_to_hub=hf_datasets.push_local_datasetdict_to_hub, - configure_dataset_card=hf_datasets.configure_dataset_card, - to_var_sample_dict=hf_datasets.to_var_sample_dict, - sample_to_var_sample_dict=hf_datasets.sample_to_var_sample_dict, - ), - "zarr": BackendSpec( - name="zarr", - init_from_disk=zarr.init_datasetdict_from_disk, - download_from_hub=zarr.download_datasetdict_from_hub, - init_streaming_from_hub=zarr.init_datasetdict_streaming_from_hub, - generate_to_disk=zarr.generate_datasetdict_to_disk, - push_local_to_hub=zarr.push_local_datasetdict_to_hub, - configure_dataset_card=zarr.configure_dataset_card, - to_var_sample_dict=zarr.to_var_sample_dict, - sample_to_var_sample_dict=zarr.sample_to_var_sample_dict, - ), + "in_memory": in_memory.InMemoryBackend, + "cgns": cgns.CgnsBackend, + "hf_datasets": hf_datasets.HFBackend, + "zarr": zarr.ZarrBackend, } +def get_backend(name: str) -> type[BackendModule]: + if name not in BACKENDS: + raise ValueError( + f"Error! backend '{name}' not available, option are: {list(BACKENDS.keys())}" + ) + return BACKENDS[name] + + def available_backends() -> list[str]: """Return available backend names in stable order.""" return list(BACKENDS.keys()) - - -def get_backend(name: str) -> BackendSpec: - """Return the backend spec for a given name.""" - try: - return BACKENDS[name] - except KeyError as exc: - raise ValueError( - f"backend {name} not among available ones: {available_backends()}" - ) from exc diff --git a/src/plaid/storage/writer.py b/src/plaid/storage/writer.py index d54f9df8..231b0dce 100644 --- a/src/plaid/storage/writer.py +++ b/src/plaid/storage/writer.py @@ -10,14 +10,6 @@ - Metadata and problem definition handling - Hub integration with dataset cards and metadata """ - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. 
-# -# - import logging import shutil from pathlib import Path @@ -25,13 +17,6 @@ from packaging.version import Version -import plaid -from plaid import ProblemDefinition, Sample -from plaid.containers.utils import validate_required_infos -from plaid.storage.common.preprocessor import preprocess -from plaid.storage.common.reader import ( - load_infos_from_disk, -) from plaid.storage.common.writer import ( push_infos_to_hub, push_local_metadata_to_hub, @@ -42,6 +27,15 @@ ) from plaid.storage.registry import available_backends, get_backend +from ..containers.sample import Sample +from ..problem_definition import ProblemDefinition +from ..utils.info import validate_required_infos +from ..version import __version__ +from .common.preprocessor import preprocess +from .common.reader import ( + load_infos_from_disk, +) + logger = logging.getLogger(__name__) @@ -230,7 +224,7 @@ def sample_constructor(file_path): infos = infos.copy() if infos else {} infos.setdefault("num_samples", num_samples) infos.setdefault("storage_backend", backend) - infos.setdefault("plaid", {"version": str(Version(plaid.__version__))}) + infos.setdefault("plaid", {"version": str(Version(__version__))}) save_infos_to_disk(output_folder, infos) diff --git a/src/plaid/storage/zarr/__init__.py b/src/plaid/storage/zarr/__init__.py index 1eec2a15..a88869f1 100644 --- a/src/plaid/storage/zarr/__init__.py +++ b/src/plaid/storage/zarr/__init__.py @@ -1,27 +1,130 @@ """Package for Zarr storage.""" +from collections.abc import Iterable, Mapping +from pathlib import Path +from typing import Any, Optional, Union -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# +import numpy as np +import zarr +from datasets import IterableDataset -from plaid.storage.zarr.bridge import ( +from .bridge import ( sample_to_var_sample_dict, to_var_sample_dict, ) -from plaid.storage.zarr.reader import ( +from .reader import ( download_datasetdict_from_hub, init_datasetdict_from_disk, init_datasetdict_streaming_from_hub, ) -from plaid.storage.zarr.writer import ( +from .writer import ( configure_dataset_card, generate_datasetdict_to_disk, push_local_datasetdict_to_hub, ) + +class ZarrBackend: + name = "zarr" + + @staticmethod + def init_from_disk(path: Union[str, Path]) -> Mapping[str, Any]: + return init_datasetdict_from_disk(path=path) + + @staticmethod + def download_from_hub( + repo_id: str, + local_dir: Union[str, Path], + split_ids: Optional[dict[str, Iterable[int]]] = None, + features: Optional[list[str]] = None, + overwrite: bool = False, + ) -> str: + return download_datasetdict_from_hub( + repo_id=repo_id, + local_dir=local_dir, + split_ids=split_ids, + features=features, + overwrite=overwrite, + ) + + @staticmethod + def init_datasetdict_streaming_from_hub( + repo_id: str, + split_ids: Optional[dict[str, Iterable[int]]] = None, + features: Optional[list[str]] = None, + ) -> dict[str, IterableDataset]: + return init_datasetdict_streaming_from_hub( + repo_id=repo_id, split_ids=split_ids, features=features + ) + + @staticmethod + def generate_to_disk( + output_folder: Union[str, Path], + generators: dict, + variable_schema: dict[str, dict], + gen_kwargs: Optional[dict[str, dict[str, list]]] = None, + num_proc: int = 1, + verbose: bool = False, + ) -> None: + return generate_datasetdict_to_disk( + output_folder=output_folder, + generators=generators, + variable_schema=variable_schema, + gen_kwargs=gen_kwargs, + num_proc=num_proc, + 
verbose=verbose, + ) + + @staticmethod + def push_local_to_hub( + repo_id: str, local_dir: Union[str, Path], num_workers: int = 1 + ) -> None: + return push_local_datasetdict_to_hub( + repo_id=repo_id, local_dir=local_dir, num_workers=num_workers + ) + + @staticmethod + def configure_dataset_card( + repo_id: str, + infos: dict, + local_dir: Optional[Union[str, Path]] = None, + viewer: bool = False, + pretty_name: Optional[str] = None, + dataset_long_description: Optional[str] = None, + illustration_urls: Optional[list[str]] = None, + arxiv_paper_urls: Optional[list[str]] = None, + ) -> None: + if local_dir is None: # pragma: no cover + raise ValueError("local_dir must be provided for zarr backend") + return configure_dataset_card( + repo_id=repo_id, + infos=infos, + local_dir=local_dir, + viewer=viewer, + pretty_name=pretty_name, + dataset_long_description=dataset_long_description, + illustration_urls=illustration_urls, + arxiv_paper_urls=arxiv_paper_urls, + ) + + @staticmethod + def to_var_sample_dict( + dataset: zarr.Group, + idx: int, + features: Optional[list[str]] = None, + indexers: Optional[dict[str, Any]] = None, + ) -> dict[str, Optional[np.ndarray]]: + return to_var_sample_dict( + zarr_dataset=dataset, + idx=idx, + features=features, + indexers=indexers, + ) + + @staticmethod + def sample_to_var_sample_dict(sample: dict[str, Any]) -> dict[str, Any]: + return sample_to_var_sample_dict(zarr_sample=sample) + + __all__ = [ "configure_dataset_card", "download_datasetdict_from_hub", diff --git a/src/plaid/storage/zarr/bridge.py b/src/plaid/storage/zarr/bridge.py index 2b376a4a..9ca3ccb6 100644 --- a/src/plaid/storage/zarr/bridge.py +++ b/src/plaid/storage/zarr/bridge.py @@ -3,23 +3,18 @@ This module provides utility functions for bridging between PLAID samples and Zarr storage format. It includes functions for key transformation and sample data conversion. """ - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - from typing import Any, Optional -import zarr +import numpy as np from plaid.storage.common.bridge import flatten_path, unflatten_path def to_var_sample_dict( - zarr_dataset: zarr.Group, idx: int, features: Optional[list[str]] + zarr_dataset: Any, + idx: int, + features: Optional[list[str]], + indexers: Optional[dict[str, Any]] = None, ) -> dict[str, Any]: """Extracts a sample dictionary from a Zarr dataset by index. @@ -27,6 +22,8 @@ def to_var_sample_dict( zarr_dataset (zarr.Group): The Zarr group containing the dataset. idx (int): The sample index to extract. features: Iterable of feature names (keys) to extract from the dataset. + indexers: Optional mapping ``feature_path -> indexer`` used to select + feature values along the last axis. Returns: dict[str, Any]: Dictionary of variable features for the sample. 
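To make the new last-axis `indexers` semantics concrete before the implementation hunk below, here is a minimal NumPy sketch; the feature array and its shape are invented for illustration, and only the selection rule mirrors what `_apply_indexer` computes.

```python
import numpy as np

# Invented example array: a variable feature whose last axis has size 4.
field = np.arange(12.0).reshape(3, 4)

# An indexer is either a slice or a 1D integer sequence applied to the last axis.
indexer = np.array([0, 2])

# Same selection rule as _apply_indexer: keep all leading axes, index the last one.
prefix = (slice(None),) * (field.ndim - 1)
subset = field[prefix + (indexer,)]

print(subset.shape)  # (3, 2)
```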
@@ -41,11 +38,48 @@ def to_var_sample_dict( # if missing: # pragma: no cover # raise KeyError(f"Missing features in sample {idx}: {sorted(missing)}") - return { - feat: zarr_sample[flat_feat] - for feat, flat_feat in flattened.items() - if flat_feat in zarr_sample.array_keys() - } + indexers = indexers or {} + out = {} + for feat, flat_feat in flattened.items(): + if flat_feat not in zarr_sample.array_keys(): + continue + + arr = zarr_sample[flat_feat] + if feat in indexers: + out[feat] = _apply_indexer(arr, indexers[feat], feat) + else: + out[feat] = arr + + return out + + +def _apply_indexer(arr: Any, indexer: Any, feat_name: str) -> np.ndarray: + """Apply a last-axis indexer to a Zarr array-like object.""" + if indexer is None: # pragma: no cover + return np.asarray(arr) + + if arr.ndim == 0: + raise ValueError(f"Cannot apply indexer to scalar feature '{feat_name}'") + + selector_prefix = (slice(None),) * (arr.ndim - 1) + + if isinstance(indexer, slice): + return np.asarray(arr[selector_prefix + (indexer,)]) + + idx = np.asarray(indexer, dtype=np.int64) + if idx.ndim != 1: + raise ValueError( + f"Indexer for feature '{feat_name}' must be a 1D sequence or slice" + ) + + axis_size = int(arr.shape[-1]) + if np.any(idx >= axis_size) or np.any(idx < -axis_size): + raise IndexError( + f"Indexer for feature '{feat_name}' contains out-of-bounds values " + f"for last axis of size {axis_size}" + ) + + return np.asarray(arr.oindex[selector_prefix + (idx,)]) def sample_to_var_sample_dict(zarr_sample: dict[str, Any]) -> dict[str, Any]: diff --git a/src/plaid/storage/zarr/reader.py b/src/plaid/storage/zarr/reader.py index 9e187170..8ed4de5d 100644 --- a/src/plaid/storage/zarr/reader.py +++ b/src/plaid/storage/zarr/reader.py @@ -11,13 +11,6 @@ - Selective loading of splits and features - ZarrDataset class for convenient data access """ - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# import logging import os import shutil @@ -104,14 +97,14 @@ def __repr__(self) -> str: def _zarr_patterns( repo_id: str, - split_ids: Optional[dict[str, list[int]]] = None, + split_ids: Optional[dict[str, Iterable[int]]] = None, features: Optional[list[str]] = None, ): # pragma: no cover """Generates allow and ignore patterns for Zarr dataset downloading. Args: repo_id (str): The Hugging Face repository ID. - split_ids (Optional[dict[str, list[int]]]): Optional split IDs for selective loading. + split_ids (Optional[dict[str, Iterable[int]]]): Optional split IDs for selective loading. features (Optional[list[str]]): Optional features for selective loading. Returns: @@ -243,7 +236,7 @@ def init_datasetdict_from_disk( def download_datasetdict_from_hub( repo_id: str, local_dir: Union[str, Path], - split_ids: Optional[dict[str, list[int]]] = None, + split_ids: Optional[dict[str, Iterable[int]]] = None, features: Optional[list[str]] = None, overwrite: bool = False, ) -> str: # pragma: no cover @@ -252,7 +245,7 @@ def download_datasetdict_from_hub( Args: repo_id (str): The Hugging Face repository ID. local_dir (Union[str, Path]): Local directory to download to. - split_ids (Optional[dict[str, list[int]]]): Optional split IDs for selective download. + split_ids (Optional[dict[str, Iterable[int]]]): Optional split IDs for selective download. features (Optional[list[str]]): Optional features for selective download. overwrite (bool): Whether to overwrite existing directory. 
@@ -283,7 +276,7 @@ def download_datasetdict_from_hub( def init_datasetdict_streaming_from_hub( repo_id: str, - split_ids: Optional[dict[str, list[int]]] = None, + split_ids: Optional[dict[str, Iterable[int]]] = None, features: Optional[list[str]] = None, ) -> dict[str, IterableDataset]: # pragma: no cover """Initializes streaming dataset dictionaries from Hugging Face Hub. @@ -295,7 +288,7 @@ def init_datasetdict_streaming_from_hub( Args: repo_id (str): The Hugging Face repository ID (e.g., "username/dataset_name"). - split_ids (Optional[dict[str, list[int]]]): Optional dictionary mapping split names + split_ids (Optional[dict[str, Iterable[int]]]): Optional dictionary mapping split names to lists of sample IDs to include. If None, all samples from all splits are included. features (Optional[list[str]]): Optional list of feature names to include. diff --git a/src/plaid/storage/zarr/writer.py b/src/plaid/storage/zarr/writer.py index 5f92dbce..87d90c78 100644 --- a/src/plaid/storage/zarr/writer.py +++ b/src/plaid/storage/zarr/writer.py @@ -11,14 +11,6 @@ - Integration with Hugging Face Hub for dataset sharing - Dataset card generation with splits, features, and documentation """ - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - import gc import multiprocessing as mp from pathlib import Path @@ -30,11 +22,12 @@ from huggingface_hub import DatasetCard, HfApi from tqdm import tqdm -from plaid import Sample from plaid.storage.common.bridge import flatten_path from plaid.storage.common.preprocessor import build_sample_dict from plaid.types import IndexType +from ...containers.sample import Sample + def _auto_chunks(shape: tuple[int, ...], target_n: int) -> tuple[int, ...]: """Computes automatic chunk sizes for Zarr arrays based on shape and target size. diff --git a/src/plaid/types/__init__.py b/src/plaid/types/__init__.py index 8af26c25..ba832356 100644 --- a/src/plaid/types/__init__.py +++ b/src/plaid/types/__init__.py @@ -1,24 +1,9 @@ """Custom types for PLAID library.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - -from plaid.types.cgns_types import ( +from .cgns_types import ( CGNSNode, CGNSTree, ) -from plaid.types.common import Array, ArrayDType, IndexType -from plaid.types.feature_types import ( - Feature, - Field, - Scalar, - TimeSequence, -) -from plaid.types.sklearn_types import SklearnBlock +from .common import Array, ArrayDType, IndexType __all__ = [ "Array", @@ -26,14 +11,4 @@ "IndexType", "CGNSNode", "CGNSTree", - "Scalar", - "Field", - "TimeSequence", - "Feature", - "FeatureIdentifier", - "SklearnBlock", ] - -# Re-export FeatureIdentifier from containers to maintain backwards compatibility -# Import is done at the bottom to avoid circular import issues -from plaid.containers.feature_identifier import FeatureIdentifier diff --git a/src/plaid/types/cgns_types.py b/src/plaid/types/cgns_types.py index bd8a898d..4a060e59 100644 --- a/src/plaid/types/cgns_types.py +++ b/src/plaid/types/cgns_types.py @@ -1,16 +1,8 @@ """Custom types for CGNS data structures.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. 
-# -# - import sys from typing import Any, Optional -from pydantic import BaseModel, Field +from pydantic import BaseModel, Field, RootModel, field_validator if sys.version_info >= (3, 11): from typing import TypeAlias @@ -34,3 +26,62 @@ class CGNSNode(BaseModel): # A CGNSTree is simply the root CGNSNode CGNSTree: TypeAlias = CGNSNode + +import re + +CGNS_PATTERN = re.compile(r"^Base_\d+_\d+/[^/]+/[^/]+$") + + +class CGNSPath(RootModel): + root: str + + @field_validator("root") + @classmethod + def validate_path(cls, v: str) -> str: + """Validate CGNS path format. + + Args: + v: Candidate CGNS path. + + Returns: + The validated path. + + Raises: + ValueError: If the path does not match the expected CGNS pattern. + """ + if not CGNS_PATTERN.match(v): + raise ValueError( + "Invalid CGNS variable format. Need to be in the form of 'Base_X_Y/ZoneName/VariableName'" + ) + return v + + @property + def path(self) -> str: + """Return the full CGNS path.""" + return self.root + + @property + def base(self) -> str: + """Return the base component of the CGNS path.""" + return self.root.split("/")[0] + + def zone(self) -> str: + """Return the zone component of the CGNS path.""" + return self.root.split("/")[1] + + +# Example usage of CGNSPath +if __name__ == "__main__": + # Valid CGNS paths + valid_path = CGNSPath("Base_1_0/Zone/GridCoordinates") + print(f"Valid path: {valid_path.root}") + + valid_path2 = CGNSPath("Base_0_0/Normal/Normals") + print(f"Valid path: {valid_path2.root}") + print(f"Valid path: {valid_path2.path}") + + # Invalid CGNS paths will raise ValidationError + try: + invalid_path = CGNSPath("InvalidPath") + except Exception as e: + print(f"Invalid path error: {e}") diff --git a/src/plaid/types/common.py b/src/plaid/types/common.py index 98dd33c6..c3ca93ce 100644 --- a/src/plaid/types/common.py +++ b/src/plaid/types/common.py @@ -1,14 +1,8 @@ """Common types used across the PLAID library.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - import sys -from typing import Union +from typing import Annotated, Any, Union + +from pydantic import BeforeValidator, PlainSerializer if sys.version_info >= (3, 11): from typing import TypeAlias @@ -24,4 +18,13 @@ Array: TypeAlias = NDArray[ArrayDType] # Types used in indexing operations -IndexType = Union[list[int], NDArray[Union[np.int32, np.int64]], str] +IndexType = Union[list[int], NDArray[Union[np.int32, np.int64]]] + +# Define a reusable custom type +NDArrayInt = Annotated[ + Any, + BeforeValidator(lambda v: np.asarray(v, dtype=int)), # Convert input to numpy array + PlainSerializer( + lambda v: v.tolist(), return_type=list + ), # Serialize back to list for JSON +] diff --git a/src/plaid/types/feature_types.py b/src/plaid/types/feature_types.py deleted file mode 100644 index 1cf7b4fb..00000000 --- a/src/plaid/types/feature_types.py +++ /dev/null @@ -1,27 +0,0 @@ -"""Custom types for features.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. 
-# -# - -import sys -from typing import Union - -if sys.version_info >= (3, 11): - from typing import TypeAlias -else: # pragma: no cover - from typing_extensions import TypeAlias - - -from plaid.types.common import Array - -# Physical data types -Scalar: TypeAlias = Union[float, int] -Field: TypeAlias = Array -TimeSequence: TypeAlias = Array - -# Feature data types -Feature: TypeAlias = Union[Scalar, Field, Array] diff --git a/src/plaid/types/sklearn_types.py b/src/plaid/types/sklearn_types.py deleted file mode 100644 index ac91ff4f..00000000 --- a/src/plaid/types/sklearn_types.py +++ /dev/null @@ -1,34 +0,0 @@ -"""Custom types for scikit-learn related objects.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - -from typing import Union - -from sklearn.base import ( - BaseEstimator, - BiclusterMixin, - ClassifierMixin, - ClusterMixin, - DensityMixin, - MultiOutputMixin, - OutlierMixin, - RegressorMixin, - TransformerMixin, -) - -SklearnBlock = Union[ - BaseEstimator, - TransformerMixin, - RegressorMixin, - ClassifierMixin, - ClusterMixin, - BiclusterMixin, - DensityMixin, - OutlierMixin, - MultiOutputMixin, -] diff --git a/src/plaid/utils/__init__.py b/src/plaid/utils/__init__.py index c6835793..c5cf2694 100644 --- a/src/plaid/utils/__init__.py +++ b/src/plaid/utils/__init__.py @@ -1,8 +1 @@ """Common utilities for the PLAID package.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# diff --git a/src/plaid/utils/base.py b/src/plaid/utils/base.py index 745a0c34..89c58a52 100644 --- a/src/plaid/utils/base.py +++ b/src/plaid/utils/base.py @@ -1,12 +1,4 @@ """Base utilities.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - # %% Imports from functools import wraps diff --git a/src/plaid/utils/cgns_helper.py b/src/plaid/utils/cgns_helper.py index 22c58def..a48f71ca 100644 --- a/src/plaid/utils/cgns_helper.py +++ b/src/plaid/utils/cgns_helper.py @@ -1,14 +1,6 @@ """Utility functions for working with CGNS trees and nodes.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - from copy import copy -from typing import Any, Optional +from typing import Any, Iterable, Optional import CGNS.PAT.cgnsutils as CGU import numpy as np @@ -536,8 +528,8 @@ def unflatten_cgns_tree( def update_features_for_CGNS_compatibility( features: list[str], - context_constant_features: list[str], - context_variable_features: list[str], + context_constant_features: Iterable[str], + context_variable_features: Iterable[str], ): """Expand a list of feature paths to include all CGNS hierarchy nodes and metadata required for compatibility. diff --git a/src/plaid/utils/cgns_worker.py b/src/plaid/utils/cgns_worker.py index 760cbdb2..68244a6b 100644 --- a/src/plaid/utils/cgns_worker.py +++ b/src/plaid/utils/cgns_worker.py @@ -1,12 +1,4 @@ """Utility function to save a pickled CGNS tree in a subprocess.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. 
-# -# - import logging import os import pickle diff --git a/src/plaid/utils/deprecation.py b/src/plaid/utils/deprecation.py deleted file mode 100644 index 5eb7ae33..00000000 --- a/src/plaid/utils/deprecation.py +++ /dev/null @@ -1,173 +0,0 @@ -"""Implementation of deprecation utilities.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - -# %% Imports - -import functools -import warnings -from typing import Optional, Union - -from packaging.version import Version - -import plaid -from plaid.utils.base import DeprecatedError - -try: - from warnings import deprecated as deprecated_builtin # Python 3.13+ -except ImportError: - deprecated_builtin = None - -# %% Functions - - -def deprecated( - reason: str, - version: Optional[Union[str, Version]] = None, - removal: Optional[Union[str, Version]] = None, -): - """Decorator to mark a function, method, or class as deprecated. - - Uses built-in `warnings.deprecated` when running on Python 3.13+, - otherwise falls back to a custom warning wrapper. - - Args: - reason (str): Explanation and suggested replacement. - version (Union[str,Version], optional): Version since deprecation. - removal (Union[str,Version], optional): Planned removal version. - """ - message_parts = [reason] - if version: - if isinstance(version, str): - version = Version(version) - message_parts.append(f"[since v{version}]") - if removal: - if isinstance(removal, str): - removal = Version(removal) - message_parts.append(f"(will be removed in v{removal})") - full_message = " ".join(message_parts) - - if removal and Version(plaid.__version__) >= removal: # pragma: no cover - full_message = [f"Removed in v{removal}, {reason}"] - - def decorator(_func): - def wrapper(*_args, **_kwargs): - raise DeprecatedError(full_message) - - return wrapper - - if deprecated_builtin is not None: # pragma: no cover - - def decorator(obj): - return deprecated_builtin( - full_message, category=DeprecationWarning, stacklevel=2 - )(obj) - - return decorator - - def decorator(obj): - if isinstance(obj, type): - orig_init = obj.__init__ - - @functools.wraps(orig_init) - def new_init(self, *args, **kwargs): - warnings.warn(full_message, DeprecationWarning, stacklevel=2) - return orig_init(self, *args, **kwargs) - - obj.__init__ = new_init - return obj - - elif callable(obj): - - @functools.wraps(obj) - def wrapper(*args, **kwargs): - warnings.warn(full_message, DeprecationWarning, stacklevel=2) - return obj(*args, **kwargs) - - return wrapper - - else: - raise TypeError( - "@deprecated decorator with non-None category must be applied to " - f"a class or callable, not {obj!r}" - ) - - return decorator - - -def deprecated_argument( - old_arg: str, - new_arg: str, - converter=lambda x: x, - version: Optional[Union[str, Version]] = None, - removal: Optional[Union[str, Version]] = None, -): - """Decorator to mark a function argument as deprecated and redirect it to a new argument. - - Args: - old_arg (str): Name of the old argument. - new_arg (str): Name of the new argument. - converter (callable): Function to convert the old value into the new format. - version (Union[str,Version], optional): Version since deprecation. - removal (Union[str,Version], optional): Planned removal version. 
- """ - if isinstance(removal, str): - removal = Version(removal) - - if removal and Version(plaid.__version__) >= removal: # pragma: no cover - full_message = [ - f"Argument `{old_arg}` has been removed in v{removal}, use `{new_arg}` instead." - ] - - def decorator(func): - @functools.wraps(func) - def wrapper(*args, **kwargs): - if old_arg in kwargs: - raise DeprecatedError(full_message) - return func(*args, **kwargs) - - return wrapper - - else: - if isinstance(version, str): - version = Version(version) - - message_parts = [ - f"Argument `{old_arg}` is deprecated, use `{new_arg}` instead." - ] - if version: - message_parts.append(f"[since v{version}]") - if removal: - message_parts.append(f"(will be removed in v{removal})") - full_message = " ".join(message_parts) - - def decorator(func): - @functools.wraps(func) - def wrapper(*args, **kwargs): - if old_arg in kwargs: - if new_arg in kwargs: - raise ValueError( - f"Arguments `{old_arg}` and `{new_arg}` cannot be both set." - ) - # Emit deprecation warning - if deprecated_builtin is not None: # pragma: no cover - # In Python 3.13+, link warning to the function itself - decorated = deprecated_builtin( - full_message, category=DeprecationWarning, stacklevel=2 - )(func) - return decorated( - *args, **{new_arg: converter(kwargs.pop(old_arg)), **kwargs} - ) - else: - warnings.warn(full_message, DeprecationWarning, stacklevel=2) - kwargs[new_arg] = converter(kwargs.pop(old_arg)) - return func(*args, **kwargs) - - return wrapper - - return decorator diff --git a/src/plaid/utils/info.py b/src/plaid/utils/info.py new file mode 100644 index 00000000..5f91dd42 --- /dev/null +++ b/src/plaid/utils/info.py @@ -0,0 +1,72 @@ +"""Helpers for validating and normalizing dataset infos metadata.""" +import copy +from typing import Any + +from ..constants import AUTHORIZED_INFO_KEYS, REQUIRED_INFOS_KEYS + + +def verify_info(infos: dict[str, dict[str, Any]]) -> None: + """Validate infos keys against authorized categories and entries. + + Args: + infos: Metadata dictionary grouped by category. + + Raises: + KeyError: If a category or an info key is not authorized. + """ + for cat_key in infos.keys(): # Format checking on "infos" + if cat_key not in {"plaid", "num_samples", "storage_backend"}: + if cat_key not in AUTHORIZED_INFO_KEYS: + raise KeyError( + f"{cat_key=} not among authorized keys. Maybe you want to try among these keys {list(AUTHORIZED_INFO_KEYS.keys())}" + ) + for info_key in infos[cat_key].keys(): + if info_key not in AUTHORIZED_INFO_KEYS[cat_key]: + raise KeyError( + f"{info_key=} not among authorized keys. Maybe you want to try among these keys {AUTHORIZED_INFO_KEYS[cat_key]}" + ) + + +def validate_required_infos(infos: dict[str, Any]) -> None: + """Validate that required infos categories and keys are present. + + Args: + infos: Dataset infos dictionary loaded from disk. + + Raises: + ValueError: If a required infos category or key is missing. + """ + assert isinstance(infos, dict) + + missing_entries: list[str] = [] + for category, required_keys in REQUIRED_INFOS_KEYS.items(): + category_infos = infos.get(category) + assert isinstance(category_infos, dict) + + for key in required_keys: + if key not in category_infos: + missing_entries.append(f"{category}.{key}") + + if missing_entries: + raise ValueError( + "Missing required infos entries: " + + ", ".join(sorted(missing_entries)) + + f". Required entries are defined by {REQUIRED_INFOS_KEYS!r}." 
+ ) + + +def normalize_infos(infos: dict[str, dict[str, Any]]) -> dict[str, dict[str, Any]]: + """Return a validated deep copy of infos with guaranteed ``plaid`` section. + + Args: + infos: Metadata dictionary grouped by category. + + Returns: + dict[str, dict[str, Any]]: Validated infos with a guaranteed + ``plaid`` section. + """ + verify_info(infos) + + normalized = copy.deepcopy(infos) + normalized.setdefault("plaid", {}) + return normalized diff --git a/src/plaid/utils/init_with_tabular.py b/src/plaid/utils/init_with_tabular.py deleted file mode 100644 index da6bcc78..00000000 --- a/src/plaid/utils/init_with_tabular.py +++ /dev/null @@ -1,100 +0,0 @@ -"""Utility functions to initialize a Dataset with tabular data.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - -# %% Imports - -import logging - -import numpy as np - -from plaid import Dataset, Sample - -# from plaid.quantity import QuantityValueType - -logger = logging.getLogger(__name__) - - -# %% Functions - - -def initialize_dataset_with_tabular_data( - tabular_data: dict[str, np.ndarray], -) -> Dataset: - """Initialize a Dataset with tabular data. - - This function takes a dictionary of tabular data where keys represent scalar names, - and values are numpy arrays of the same length. It creates a Dataset and adds samples - to it based on the provided tabular data. - - Args: - tabular_data (dict[str,np.ndarray]): A dictionary of scalar names and corresponding numpy arrays. - - Returns: - Dataset: A Dataset initialized with the tabular data. - - Raises: - AssertionError: If the lengths of the numpy arrays in tabular data are not identical. - - Example: - .. code-block:: python - - import numpy as np - from plaid.utils.init import initialize_dataset_with_tabular_data - tabular_data = {'feature1': np.array([1, 2, 3]), 'feature2': np.array([4, 5, 6])} - dataset = initialize_dataset_with_tabular_data(tabular_data) - """ - lengths = [len(value) for value in tabular_data.values()] - assert len(list(set(lengths))) == 1, "sizes not identical in tabular data" - - dataset = Dataset() - - nb_samples = lengths[0] - for i in range(nb_samples): - sample = Sample() - for scalar_name, value in tabular_data.items(): - sample.add_scalar(scalar_name, value[i]) - dataset.add_sample(sample) - - # TODO: - # logger.info("Pour l'instant on boucle sur les samples, il y a probablement mieux à faire, mais l'API est simple") - - return dataset - - -# def initialize_quantity_dataset_with_tabular_data(tabular_data:dict[str,Union[list[QuantityValueType],np.ndarray]]) -> Dataset: -# """_summary_ - -# Args: -# tabular_data (dict[str,Union[list[QuantityValueType],np.ndarray]]): -# `feature_name` -> tabular values - -# Returns: -# Dataset -# """ -# lengths = [len(value) for value in tabular_data.values()] -# assert len(list(set(lengths))) == 1, "sizes not identical in tabular data" - -# #---# Adds data to collection -# data_collection = DataCollection() -# for name in tabular_data: -# storage = data_collection.add_storage('quantity', name) -# storage.add_values(tabular_data[name]) - -# #---# Link samples to data in collection -# dataset = Dataset() -# nb_samples = lengths[0] -# for i_samp in range(nb_samples): -# sample = Sample(data_collection = data_collection) -# for feature_name in tabular_data: -# sample.link_to_value("quantity", feature_name, i_samp) -# dataset.add_sample(sample) - -# return dataset - -# %% Classes diff --git 
a/src/plaid/utils/interpolation.py b/src/plaid/utils/interpolation.py deleted file mode 100644 index 2ba4237f..00000000 --- a/src/plaid/utils/interpolation.py +++ /dev/null @@ -1,168 +0,0 @@ -"""Interpolation utilities for working with ordered lists and vectors.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - -# %% Imports - -import bisect -from typing import Union - -import numpy as np - -# %% Functions - - -def binary_search( - ordered_list: Union[list, np.ndarray], item: Union[float, int] -) -> int: - """Find the rank of the largest element smaller or equal to the given item in a sorted list. - - Inspects the sorted list "ordered_list" and returns: - - 0 if item <= ordered_list[0] - - the rank of the largest element smaller or equal than item otherwise - - Args: - ordered_list (Union[list, np.ndarray]): The data sorted in increasing order from which the previous rank is searched. - item (Union[float, int]): The item for which the previous rank is searched. - - Returns: - int: 0 or the rank of the largest element smaller or equal than item in "ordered_list". - """ - return max(bisect.bisect_right(ordered_list, item) - 1, 0) - - -def binary_search_vectorized( - ordered_list: Union[list, np.ndarray], items: Union[list, np.ndarray] -) -> np.ndarray: - """Vectorized binary search for multiple items in a sorted list (items is now a list or one-dimensional np.ndarray). - - Args: - ordered_list (Union[list, np.ndarray]): The data sorted in increasing order. - items (Union[list, np.ndarray]): The items for which ranks are searched. - - Returns: - np.ndarray: An array containing the ranks of the largest elements smaller or equal to each item. - """ - return np.fromiter( - map(lambda item: binary_search(ordered_list, item), items), dtype=int - ) - - -def piece_wise_linear_interpolation( - item: float, - item_indices: np.ndarray, - vectors: Union[np.ndarray, dict], - tolerance: float = 1e-4, -) -> np.ndarray: - """Computes a item interpolation for temporal vectors defined either by item_indices and vectors at these indices. - - Args: - item (float): The input item at which the interpolation is required. - item_indices (np.ndarray): The items where the available data is defined, of size (numberOfTimeIndices). - vectors (Union[np.ndarray, dict]): The available data, of size (numberOfVectors, numberOfDofs). - tolerance (float): Tolerance for deciding when using the closest timestep value instead of carrying out the linear interpolation, default to 1e-4. - - Returns: - np.ndarray: Interpolated vector, of size (numberOfDofs). - """ - if item <= item_indices[0]: - return vectors[0] - if item >= item_indices[-1]: - return vectors[-1] - - prev = binary_search(item_indices, item) - coef = (item - item_indices[prev]) / (item_indices[prev + 1] - item_indices[prev]) - - if 0.5 - abs(coef - 0.5) < tolerance: - coef = round(coef) - - return coef * vectors[prev + 1] + (1 - coef) * vectors[prev] - - -def piece_wise_linear_interpolation_with_map( - item: float, - item_indices: np.ndarray, - vectors: Union[np.ndarray, dict], - vectors_map: list = None, - tolerance: float = 1e-4, -) -> np.ndarray: - """Computes a item interpolation for temporal vectors defined either by item_indices, some tags at these item indices (vectors_map), and vectors at those tags. - - Args: - item (float): The input item at which the interpolation is required. 
- item_indices (np.ndarray): The items where the available data is defined, of size (numberOfTimeIndices). - vectors (Union[np.ndarray, dict]): The available data, of size (numberOfVectors, numberOfDofs). - vectors_map (list, optional): List containing the mapping from the numberOfTimeIndices items indices to the numberOfVectors vectors, of size (numberOfTimeIndices,). Defaults to None. - tolerance (float, optional): Tolerance for deciding when using the closest timestep value instead of carrying out the linear interpolation, default to 1e-4. - - Returns: - np.ndarray: Interpolated vector, of size (numberOfDofs). - """ - # TODO What if vectorsMap = None ??? it will crash - if item <= item_indices[0]: - return vectors[vectors_map[0]] - if item >= item_indices[-1]: - return vectors[vectors_map[-1]] - - prev = binary_search(item_indices, item) - coef = (item - item_indices[prev]) / (item_indices[prev + 1] - item_indices[prev]) - - if 0.5 - abs(coef - 0.5) < tolerance: - coef = round(coef) - - return ( - coef * vectors[vectors_map[prev + 1]] + (1 - coef) * vectors[vectors_map[prev]] - ) - - -def piece_wise_linear_interpolation_vectorized( - items: list[float], item_indices: np.ndarray, vectors: Union[np.ndarray, str] -) -> list[np.ndarray]: - """piece_wise_linear_interpolation for more than one call (items is now a list or one-dimensional np.ndarray). - - Args: - items (list[float]): The input items at which interpolations are required. - item_indices (np.ndarray): The items where the available data is defined, of size (numberOfTimeIndices). - vectors (np.ndarray or dict): The available data, of size (numberOfVectors, numberOfDofs). - - Returns: - list[np.ndarray]: List of interpolated vectors, each of size (numberOfDofs). - """ - return [ - piece_wise_linear_interpolation(item, item_indices, vectors) for item in items - ] - # return np.fromiter(map(lambda item: piece_wise_linear_interpolation(item, - # item_indices, vectors), items), dtype = type(vectors[0])) - - -def piece_wise_linear_interpolation_vectorized_with_map( - items: list[float], - item_indices: np.ndarray, - vectors: Union[np.ndarray, dict], - vectors_map: list = None, -) -> list[np.ndarray]: - """piece_wise_linear_interpolation_with_map for more than one call (items is now a list or one-dimensional np.ndarray). - - Args: - items (list[float]): The input items at which interpolations are required. - item_indices (np.ndarray): The items where the available data is defined, of size (numberOfTimeIndices). - vectors (np.ndarray or dict): The available data, of size (numberOfVectors, numberOfDofs). - vectors_map (list): List containing the mapping from the numberOfTimeIndices items indices to the numberOfVectors vectors, of size (numberOfTimeIndices,). Default is None, in which case numberOfVectors = numberOfTimeIndices. - - Returns: - list[np.ndarray]: List of interpolated vectors, each of size (numberOfDofs). 
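> Reviewer aside: `plaid.utils.interpolation` is removed on this branch, so downstream users may want a local replacement. A self-contained numpy sketch of the same clamped piece-wise linear rule as the removed `piece_wise_linear_interpolation` (the `tolerance`-based snapping to the nearest timestep is omitted for brevity; all names and values below are illustrative):

```python
import numpy as np

# Time indices and one snapshot vector per index (3 DoFs here).
times = np.array([0.0, 1.0, 2.0])
snapshots = np.array([[0.0, 0.0, 0.0],
                      [1.0, 10.0, 100.0],
                      [2.0, 20.0, 200.0]])


def interpolate(t: float) -> np.ndarray:
    # Clamp outside the covered range, otherwise blend the two bracketing snapshots.
    if t <= times[0]:
        return snapshots[0]
    if t >= times[-1]:
        return snapshots[-1]
    prev = np.searchsorted(times, t, side="right") - 1
    coef = (t - times[prev]) / (times[prev + 1] - times[prev])
    return coef * snapshots[prev + 1] + (1 - coef) * snapshots[prev]


assert np.allclose(interpolate(0.5), [0.5, 5.0, 50.0])
assert np.allclose(interpolate(5.0), snapshots[-1])
```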
- """ - return [ - piece_wise_linear_interpolation_with_map( - item, item_indices, vectors, vectors_map - ) - for item in items - ] - # return np.fromiter(map(lambda item: - # piece_wise_linear_interpolation_with_map(item, item_indices, vectors, - # vectors_map), items), dtype = np.float) diff --git a/src/plaid/utils/split.py b/src/plaid/utils/split.py deleted file mode 100644 index 107377ea..00000000 --- a/src/plaid/utils/split.py +++ /dev/null @@ -1,258 +0,0 @@ -"""Utility function for splitting a Dataset.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - -# %% Imports - -import logging -from typing import Any, Optional - -import numpy as np -from numpy.typing import NDArray -from scipy.spatial.distance import cdist - -from plaid import Dataset - -logger = logging.getLogger(__name__) - - -# %% Functions - - -def split_dataset(dset: Dataset, options: dict[str, Any]) -> dict[str, int]: - """Splits a Dataset in several sub Datasets. - - Args: - dset(Dataset): dataset to be splited. - options([str,Any]): may have keys 'shuffle', 'split_sizes', 'split_ratios' or 'split_ids': - - 'split_sizes' is supposed to be a dict[str,int]: split name -> size of splited dataset - - 'split_ratios' is supposed to be a dict[str,float]: split name -> size ratios of splited dataset - - 'split_ids' is supposed to be a dict[str,np.ndarray(int)]: split name -> ids of samples in splited dataset - - if 'shuffle' is not set, it is supposed to be False - - if 'split_ids' is present, other keys will be ignored - Returns: - Dataset: the dataset with splits. - - Raises: - ValueError: If a split is named 'other' (not authorized). - ValueError: If there are some ids out of bounds. - ValueError: If some split names are in 'split_ratios' and 'split_sizes'. - - Example: - .. code-block:: python - - # Given a dataset of 2 samples - print(dataset) - >>> Dataset(2 samples, 2 scalars, 2 fields) - - options = { - 'shuffle':False, - 'split_sizes': { - 'train':1, - 'val':1, - }, - } - split = split_dataset(dataset, options) - print(split) - >>> {'train': [0], 'val': [1]} - - """ - _splits = {} - all_ids = dset.get_sample_ids() - total_size = len(dset) - - # Verify that split option validity - def check_options_validity(split_option: dict): - assert isinstance(split_option, dict), "split option must be a dictionary" - if "other" in split_option: - raise ValueError("name 'other' is not authorized for a split") - - # Check that the keys in options are among authorized keys - authorized_task = ["split_ids", "split_ratios", "split_sizes", "shuffle"] - for task in options: - if task in authorized_task: - continue - logger.warning(f"option {task} is not authorized. 
{task} key will be ignored") - - f_case = len(set(["split_ids"]).intersection(set(options.keys()))) - s_case = len(set(["split_ratios", "split_sizes"]).intersection(set(options.keys()))) - assert f_case == 0 or s_case == 0, ( - "split by id cannot exist with split by ratios or sizes" - ) - - # First case - if "split_ids" in options: - check_options_validity(options["split_ids"]) - - if len(options) > 1: - logger.warning( - "options has key 'split_ids' and 'shuffle' -> 'shuffle' key will be ignored" - ) - - # all_ids = np.arange(total_size) - used_ids = np.unique( - np.concatenate([ids for ids in options["split_ids"].values()]) - ) - - if np.min(used_ids) < 0 or np.max(used_ids) >= total_size: - raise ValueError( - "there are some ids out of bounds -> min/max:{}/{} | dataset len:{}".format( - np.min(used_ids), np.max(used_ids), total_size - ) - ) - - other_ids = np.setdiff1d(all_ids, used_ids) - if len(other_ids) > 0: - options["split_ids"]["other"] = other_ids - - if len(used_ids) < np.sum([len(ids) for ids in options["split_ids"].values()]): - logger.warning("there are some ids present in several splits") - - for name in options["split_ids"]: - _splits[name] = options["split_ids"][name] - # split_samples = [] - # for id in options['split_ids'][name]: - # split_samples.append(dset[id]) - # dset._splits[name] = Dataset() - # dset._splits[name].add_samples(split_samples) - return _splits - - if "shuffle" in options: - shuffle = options["shuffle"] - else: - shuffle = False - - split_sizes = [0] - split_names = [] - # Second case - if "split_ratios" in options: - check_options_validity(options["split_ratios"]) - - for key, value in options["split_ratios"].items(): - assert isinstance(value, float) - split_names.append(key) - split_sizes.append(int(total_size * value)) - - if "split_sizes" in options: - check_options_validity(options["split_sizes"]) - - for key, value in options["split_sizes"].items(): - assert "split_ratios" not in options or key not in options["split_ratios"] - assert isinstance(value, int) - split_names.append(key) - split_sizes.append(value) - - assert np.sum(split_sizes) <= total_size - if np.sum(split_sizes) < total_size: - split_names.append("other") - split_sizes.append(total_size - np.sum(split_sizes)) - slices = np.cumsum(split_sizes) - - # all_ids = np.arange(total_size) - if shuffle: - all_ids = np.random.permutation(all_ids) - - for i_split in range(len(split_names)): - _splits[split_names[i_split]] = all_ids[slices[i_split] : slices[i_split + 1]] - # split_samples = [] - # for id in all_ids[slices[i_split]:slices[i_split+1]]: - # split_samples.append(dset[id]) - # dset._splits[split_names[i_split]] = Dataset() - # dset._splits[split_names[i_split]].add_samples(split_samples) - - return _splits - - -def mmd_subsample_fn( - X: NDArray[np.float64], - size: int, - initial_ids: Optional[list[int]] = None, - memory_safe: bool = False, -) -> NDArray[np.int64]: - """Selects samples in the input table by greedily minimizing the maximum mena discrepancy (MMD). - - Args: - X(np.ndarray): input table of shape n_samples x n_features - size(int): number of samples to select - initial_ids(list[int]): a list of ids of points to initialize the gready algorithm. Defaults to None. - memory_safe(bool): if True, avoids a memory expensive computation. Useful for large tables. Defaults to False. - - Returns: - np.ndarray: array of selected samples - Example: - .. 
code-block:: python - - # Let X be drawn from a standard 10-dimensional Gaussian distribution - np.random.seed(0) - X = np.random.randn(1000,10) - # Select 100 particles - idx = mmd_subsample_fn(X, size=100) - print(idx) - >>> [765 113 171 727 796 855 715 207 458 603 23 384 860 3 459 708 794 138 - 221 639 8 816 619 806 398 236 36 404 167 87 201 676 961 624 556 840 - 485 975 283 150 554 409 69 769 332 357 388 216 900 134 15 730 80 694 - 251 714 11 817 525 382 328 67 356 514 597 668 959 260 968 26 209 789 - 305 122 989 571 801 322 14 160 908 12 1 980 582 440 42 452 666 526 - 290 231 712 21 606 575 656 950 879 948] - # In this simple Gaussian example, the means and standard deviations of the - # selected subsample should be close to the ones of the original sample - print(np.abs(np.mean(x[idx], axis=0) - np.mean(x, axis=0))) - >>> [0.00280955 0.00220179 0.01359079 0.00461107 0.0011997 0.01106616 - 0.01157571 0.0061314 0.00813494 0.0026543] - print(np.abs(np.std(x[idx], axis=0) - np.std(x, axis=0))) - >>> [0.0067711 0.00316008 0.00860733 0.07130127 0.02858514 0.0173707 - 0.00739646 0.03526784 0.0054039 0.00351996] - """ - n = X.shape[0] - assert size <= n - # Precompute norms and distance matrix - norms = np.linalg.norm(X, axis=1) - if memory_safe: - k0_mean = np.zeros(n) - for i in range(n): - kxy = norms[i : i + 1, None] + norms[None, :] - cdist(X[i : i + 1], X) - k0_mean[i] = np.mean(kxy) - else: - dist_matrix = cdist(X, X) - gram_matrix = norms[:, None] + norms[None, :] - dist_matrix - k0_mean = np.mean(gram_matrix, axis=1) - - idx = np.zeros(size, dtype=np.int64) - if initial_ids is None or len(initial_ids) == 0: - k0 = np.zeros((n, size)) - k0[:, 0] = 2.0 * norms - - idx[0] = np.argmin(k0[:, 0] - 2.0 * k0_mean) - for i in range(1, size): - x_ = X[idx[i - 1]] - dist = np.linalg.norm(X - x_, axis=1) - k0[:, i] = -dist + norms[idx[i - 1]] + norms - - idx[i] = np.argmin( - k0[:, 0] - + 2.0 * np.sum(k0[:, 1 : (i + 1)], axis=1) - - 2.0 * (i + 1) * k0_mean - ) - else: - assert len(initial_ids) < size - idx[: len(initial_ids)] = initial_ids - k0 = np.zeros((n, size)) - - k0[:, 0] = 2.0 * norms - for i in range(1, size): - x_ = X[idx[i - 1]] - dist = np.linalg.norm(X - x_, axis=1) - k0[:, i] = -dist + norms[idx[i - 1]] + norms - - if i >= len(initial_ids): - idx[i] = np.argmin( - k0[:, 0] - + 2.0 * np.sum(k0[:, 1 : (i + 1)], axis=1) - - 2.0 * (i + 1) * k0_mean - ) - return idx diff --git a/src/plaid/utils/stats.py b/src/plaid/utils/stats.py deleted file mode 100644 index d904a4ad..00000000 --- a/src/plaid/utils/stats.py +++ /dev/null @@ -1,616 +0,0 @@ -"""Utility functions for computing statistics on datasets.""" - -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - -# %% Imports - -import copy -import logging -import sys -from typing import Union - -if sys.version_info >= (3, 11): - from typing import Self -else: # pragma: no cover - from typing import TypeVar - - Self = TypeVar("Self") - - -import numpy as np - -from plaid import Dataset, Sample -from plaid.constants import CGNS_FIELD_LOCATIONS - -logger = logging.getLogger(__name__) - - -# %% Functions - - -def aggregate_stats( - sizes: np.ndarray, means: np.ndarray, vars: np.ndarray -) -> tuple[np.ndarray, np.ndarray, np.ndarray]: - """Compute aggregated statistics of a batch of already computed statistics (without original samples information). 
- - This function calculates aggregated statistics, such as the total number of samples, mean, and variance, by taking into account the statistics computed for each batch of data. - - cf: [Variance from (cardinal,mean,variance) of several statistical series](https://fr.wikipedia.org/wiki/Variance_(math%C3%A9matiques)#Formules) - - Args: - sizes (np.ndarray): An array containing the sizes (number of samples) of each batch. Expect shape (n_batches,1). - means (np.ndarray): An array containing the means of each batch. Expect shape (n_batches, n_features). - vars (np.ndarray): An array containing the variances of each batch. Expect shape (n_batches, n_features). - - Returns: - tuple[np.ndarray,np.ndarray,np.ndarray]: A tuple containing the aggregated statistics in the following order: - - Total number of samples in all batches. - - Weighted mean calculated from the batch means. - - Weighted variance calculated from the batch variances, considering the means. - """ - assert sizes.ndim == 1 - assert means.ndim == 2 - assert len(sizes) == len(means) - assert means.shape == vars.shape - sizes = sizes.reshape((-1, 1)) - total_n_samples = np.sum(sizes) - total_mean = np.sum(sizes * means, axis=0, keepdims=True) / total_n_samples - total_var = ( - np.sum(sizes * (vars + (total_mean - means) ** 2), axis=0, keepdims=True) - / total_n_samples - ) - return total_n_samples, total_mean, total_var - - -# %% Classes - - -class OnlineStatistics(object): - """OnlineStatistics is a class for computing online statistics of numpy arrays. - - This class computes running statistics (min, max, mean, variance, std) for streaming data - without storing all samples in memory. - - Example: - >>> stats = OnlineStatistics() - >>> stats.add_samples(np.array([[1, 2], [3, 4]])) - >>> stats.add_samples(np.array([[5, 6]])) - >>> print(stats.get_stats()['mean']) - [[3. 4.]] - """ - - def __init__(self) -> None: - """Initialize an empty OnlineStatistics object.""" - self.n_samples: int = 0 - self.n_features: int = None - self.n_points: int = None - self.min: np.ndarray = None - self.max: np.ndarray = None - self.mean: np.ndarray = None - self.var: np.ndarray = None - self.std: np.ndarray = None - - def add_samples(self, x: np.ndarray, n_samples: int = None) -> None: - """Add samples to compute statistics for. - - Args: - x (np.ndarray): The input numpy array containing samples data. Expect 2D arrays with shape (n_samples, n_features). - n_samples (int, optional): The number of samples in the input array. If not provided, it will be inferred from the shape of `x`. Use this argument when the input array has already been flattened because of shape inconsistencies. - - Raises: - ValueError: Raised when input contains NaN or Inf values. 
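> Reviewer aside: the pooled-mean/pooled-variance rule implemented by `aggregate_stats` above can be checked end to end with a few lines of numpy. A self-contained sketch, independent of the removed module (batch sizes and feature count are illustrative):

```python
import numpy as np

# Two batches of the same 3-feature data, summarized independently.
rng = np.random.default_rng(0)
a, b = rng.normal(size=(40, 3)), rng.normal(size=(60, 3))
sizes = np.array([len(a), len(b)])
means = np.stack([a.mean(axis=0), b.mean(axis=0)])
variances = np.stack([a.var(axis=0), b.var(axis=0)])

# Pooled statistics, following the same rule as aggregate_stats.
n = sizes.sum()
mean = (sizes[:, None] * means).sum(axis=0) / n
var = (sizes[:, None] * (variances + (mean - means) ** 2)).sum(axis=0) / n

# The pooled values match the statistics of the concatenated data exactly.
full = np.concatenate([a, b])
assert np.allclose(mean, full.mean(axis=0))
assert np.allclose(var, full.var(axis=0))
```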
- """ - # Validate input - if not isinstance(x, np.ndarray): - raise TypeError("Input must be a numpy array") - - if np.any(~np.isfinite(x)): - raise ValueError("Input contains NaN or Inf values") - - # Handle 1D arrays - if x.ndim == 1: - if self.min is not None: - if self.min.shape[1] == 1: - x = x.reshape((-1, 1)) - else: - x = x.reshape((1, -1)) - else: - x = x.reshape((-1, 1)) # Default to column vector - - # Handle n-dimensional arrays - elif x.ndim > 2: - # if we have array of shape (n_samples, n_points, n_features) - # it will be reshaped to (n_samples * n_points, n_features) - x = x.reshape((-1, x.shape[-1])) - - if self.n_features is None: - self.n_features = x.shape[1] - - if x.shape[1] != self.n_features: - # it means that stats where previously on a per-point mode, - # but it is no longer possible as the new added samples have a different shape - # so we need to shift the stats to a per-sample mode, and then flatten the stats array - self.flatten_array() - n_samples = x.shape[0] - x = x.reshape((-1, 1)) - - added_n_samples = len(x) if n_samples is None else n_samples - added_n_points = x.size - added_min = np.min(x, axis=0, keepdims=True) - added_max = np.max(x, axis=0, keepdims=True) - added_mean = np.mean(x, axis=0, keepdims=True) - added_var = np.var(x, axis=0, keepdims=True) - - if ( - (self.n_samples == 0) - or (self.min is None) - or (self.max is None) - or (self.mean is None) - or (self.var is None) - ): - self.n_samples = added_n_samples - self.n_points = added_n_points - self.min = added_min - self.max = added_max - self.mean = added_mean - self.var = added_var - else: - self.min = np.min( - np.concatenate((self.min, added_min), axis=0), axis=0, keepdims=True - ) - self.max = np.max( - np.concatenate((self.max, added_max), axis=0), axis=0, keepdims=True - ) - if self.n_features > 1: - # feature not flattened, we are on a per-sample mode - self.n_points += added_n_points - self.n_samples, self.mean, self.var = aggregate_stats( - np.array([self.n_samples, added_n_samples]), - np.concatenate([self.mean, added_mean]), - np.concatenate([self.var, added_var]), - ) - else: - # feature flattened, we are on a per-point mode - self.n_samples += added_n_samples - self.n_points, self.mean, self.var = aggregate_stats( - np.array([self.n_points, added_n_points]), - np.concatenate([self.mean, added_mean]), - np.concatenate([self.var, added_var]), - ) - - self.std = np.sqrt(self.var) - - def merge_stats(self, other: Self) -> None: - """Merge statistics from another instance. - - Args: - other (Self): The other instance to merge statistics from. 
- """ - if not isinstance(other, self.__class__): - raise TypeError("Can only merge with another instance of the same class") - - if self.n_features != other.n_features: - # flatten both - self.flatten_array() - other = copy.deepcopy(other) - other.flatten_array() - assert self.min.shape == other.min.shape, ( - "Shape mismatch in OnlineStatistics merging" - ) - - self.min = np.min( - np.concatenate((self.min, other.min), axis=0), axis=0, keepdims=True - ) - self.max = np.max( - np.concatenate((self.max, other.max), axis=0), axis=0, keepdims=True - ) - self.n_points += other.n_points - self.n_samples, self.mean, self.var = aggregate_stats( - np.array([self.n_samples, other.n_samples]), - np.concatenate([self.mean, other.mean]), - np.concatenate([self.var, other.var]), - ) - self.std = np.sqrt(self.var) - - def flatten_array(self) -> None: - """When a shape incoherence is detected, you should call this function.""" - self.min = np.min(self.min, keepdims=True).reshape(1, 1) - self.max = np.max(self.max, keepdims=True).reshape(1, 1) - self.n_points = self.n_samples * self.n_features - assert self.mean.shape == self.var.shape - self.n_points, self.mean, self.var = aggregate_stats( - np.array([self.n_samples] * self.n_features), - self.mean.reshape(-1, 1), - self.var.reshape(-1, 1), - ) - self.std = np.sqrt(self.var) - - self.n_features = 1 - - def get_stats(self) -> dict[str, Union[int, np.ndarray]]: - """Get computed statistics. - - Returns: - dict[str, Union[int, np.ndarray]]: A dictionary containing computed statistics. - The shapes of the arrays depend on the input data and may vary. - """ - return { - "n_samples": self.n_samples, - "n_points": self.n_points, - "n_features": self.n_features, - "min": self.min, - "max": self.max, - "mean": self.mean, - "var": self.var, - "std": self.std, - } - - -class Stats: - """Class for aggregating and computing statistics across datasets. - - The Stats class processes both scalar and field data from samples or datasets, - computing running statistics like min, max, mean, variance and standard deviation. - - Attributes: - _stats (dict[str, OnlineStatistics]): Dictionary mapping data identifiers to their statistics - """ - - def __init__(self): - """Initialize an empty Stats object.""" - self._stats: dict[str, OnlineStatistics] = {} - self._feature_is_flattened: dict[str, bool] = {} - - def add_dataset(self, dset: Dataset) -> None: - """Add a dataset to compute statistics for. - - Args: - dset (Dataset): The dataset to add. - """ - self.add_samples(dset) - - def add_samples(self, samples: Union[list[Sample], Dataset]) -> None: - """Add samples or a dataset to compute statistics for. - - Compute stats for each features present in the samples among scalars and fields. - For fields, as long as the added samples have the same shape as the existing ones, - the stats will be computed per-coordinates (n_features=x.shape[-1]). - But as soon as the shapes differ, the stats and added fields will be flattened (n_features=1), - then stats will be computed over all values of the field. 
- - Args: - samples (Union[list[Sample], Dataset]): List of samples or dataset to process - - Raises: - TypeError: If samples is not a list[Sample] or Dataset - ValueError: If a sample contains invalid data - """ - # Input validation - if not isinstance(samples, (list, Dataset)): - raise TypeError("samples must be a list[Sample] or Dataset") - - # Process each sample - new_data: dict[str, list] = {} - - for sample in samples: - # Process scalars - self._process_scalar_data(sample, new_data) - - # Process fields - self._process_field_data(sample, new_data) - - # ---# SpatialSupport (Meshes) - # TODO - - # ---# TemporalSupport - # TODO - - # ---# Categorical - # TODO - - # Update statistics - self._update_statistics(new_data) - - def get_stats( - self, identifiers: list[str] = None - ) -> dict[str, dict[str, np.ndarray]]: - """Get computed statistics for specified data identifiers. - - Args: - identifiers (list[str], optional): List of data identifiers to retrieve. - If None, returns statistics for all identifiers. - - Returns: - dict[str, dict[str, np.ndarray]]: Dictionary mapping identifiers to their statistics - """ - if identifiers is None: - identifiers = self.get_available_statistics() - - stats = {} - for identifier in identifiers: - if identifier in self._stats: - stats[identifier] = {} - for stat_name, stat_value in ( - self._stats[identifier].get_stats().items() - ): - stats[identifier][stat_name] = stat_value - # stats[identifier][stat_name] = np.squeeze(stat_value) - - return stats - - def get_available_statistics(self) -> list[str]: - """Get list of data identifiers with computed statistics. - - Returns: - list[str]: List of data identifiers - """ - return sorted(self._stats.keys()) - - def clear_statistics(self) -> None: - """Clear all computed statistics.""" - self._stats.clear() - - def merge_stats(self, other: Self) -> None: - """Merge statistics from another Stats object. - - Args: - other (Stats): Stats object to merge with - """ - for name, stats in other._stats.items(): - if name not in self._stats: - self._stats[name] = copy.deepcopy(stats) - else: - self._stats[name].merge_stats(stats) - - def _process_scalar_data(self, sample: Sample, data_dict: dict[str, list]) -> None: - """Process scalar data from a sample. - - Args: - sample (Sample): Sample containing scalar data - data_dict (dict[str, list]): Dictionary to store processed data - """ - for name in sample.get_scalar_names(): - if name not in data_dict: - data_dict[name] = [] - value = sample.get_scalar(name) - if value is not None: - data_dict[name].append(np.array(value).reshape((1, -1))) - - def _process_field_data(self, sample: Sample, data_dict: dict[str, list]) -> None: - """Process field data from a sample. 
- - Args: - sample (Sample): Sample containing field data - data_dict (dict[str, list]): Dictionary to store processed data - """ - for time in sample.features.get_all_time_values(): - for base_name in sample.features.get_base_names(time=time): - for zone_name in sample.features.get_zone_names( - base_name=base_name, time=time - ): - for location in CGNS_FIELD_LOCATIONS: - for field_name in sample.get_field_names( - location=location, - zone_name=zone_name, - base_name=base_name, - time=time, - ): - stat_key = ( - f"{base_name}/{zone_name}/{location}/{field_name}" - ) - if stat_key not in data_dict: - data_dict[stat_key] = [] - field = sample.get_field( - field_name, - location=location, - zone_name=zone_name, - base_name=base_name, - time=time, - ).reshape((1, -1)) - if field is not None: - # check if all previous arrays are the same shape as the new one that will be added to data_dict[stat_key] - if len( - data_dict[stat_key] - ) > 0 and not self._feature_is_flattened.get( - stat_key, False - ): - prev_shape = data_dict[stat_key][0].shape - if field.shape != prev_shape: - # set this stat as flattened - self._feature_is_flattened[stat_key] = True - # flatten corresponding stat - if stat_key in self._stats: - self._stats[stat_key].flatten_array() - - if self._feature_is_flattened.get(stat_key, False): - field = field.reshape((-1, 1)) - - data_dict[stat_key].append(field) - - def _update_statistics(self, new_data: dict[str, list]) -> None: - """Update running statistics with new data. - - Args: - new_data (dict[str, list]): Dictionary containing new data to process - """ - for name, list_of_arrays in new_data.items(): - if len(list_of_arrays) > 0: - if name not in self._stats: - self._stats[name] = OnlineStatistics() - - # internal check, should never happen if self._process_* functions work correctly - for sample_id in range(len(list_of_arrays)): - assert isinstance(list_of_arrays[sample_id], np.ndarray) - assert list_of_arrays[sample_id].ndim == 2, ( - f"for feature <{name}> -> {sample_id=}: {list_of_arrays[sample_id].ndim=} should be 2" - ) - - if self._feature_is_flattened.get(name, False): - # flatten all arrays in list_of_arrays - n_samples = len(list_of_arrays) - for i in range(len(list_of_arrays)): - list_of_arrays[i] = list_of_arrays[i].reshape((-1, 1)) - else: - n_samples = None - - # Convert to numpy array and reshape if needed - data = np.concatenate(list_of_arrays) - assert data.ndim == 2 - - self._stats[name].add_samples(data, n_samples=n_samples) - - # # old version of _update_statistics logic - # for name in new_data: - # # new_shapes = [value.shape for value in new_data[name] if value.shape!=new_data[name][0].shape] - # # has_same_shape = (len(new_shapes)==0) - # has_same_shape = True - - # if has_same_shape: - # new_data[name] = np.array(new_data[name]) - # else: # pragma: no cover ### remove "no cover" when "has_same_shape = True" is no longer used - # if name in self._stats: - # self._stats[name].flatten_array() - # new_data[name] = np.concatenate( - # [np.ravel(value) for value in new_data[name]] - # ) - - # if new_data[name].ndim == 1: - # new_data[name] = new_data[name].reshape((-1, 1)) - - # if name not in self._stats: - # self._stats[name] = OnlineStatistics() - - # self._stats[name].add_samples(new_data[name]) - - # TODO : FAIRE DEUX FONCTIONS : - # - compute_stats(samples) -> stats - # - aggregate_stats(list[stats]) - - # TODO: reuse this ? 
more adapted to heterogenous data - # def _compute_scalars_stats_(self) -> None: - # nb_samples_with_scalars = 0 - # scalars_have_timestamps = False - # full_scalars = [] - # full_scalars_timestamps = [] - # for sample in self.samples: - # if 'scalars' in sample._data: - # nb_samples_with_scalars += 1 - # if isinstance(sample._data['scalars'], dict): - # scalars_have_timestamps = True - # for k in sample._data['scalars']: - # full_scalars_timestamps.append(k) - # for val in sample._data['scalars'].values(): - # full_scalars.append(val) - # elif isinstance(sample._data['scalars'], tuple): - # scalars_have_timestamps = True - # full_scalars_timestamps.append(sample._data['scalars'][0]) - # full_scalars.append(sample._data['scalars'][1]) - # else: - # full_scalars.append(sample._data['scalars']) - # if nb_samples_with_scalars>0: - # full_scalars = np.array(full_scalars) - # logger.debug("full_scalars.shape: {}".format(full_scalars.shape)) - # self._stats['scalars'] = { - # 'min': np.min(full_scalars, axis=0), - # 'max': np.max(full_scalars, axis=0), - # 'mean': np.mean(full_scalars, axis=0), - # 'std': np.std(full_scalars, axis=0), - # 'var': np.var(full_scalars, axis=0), - # } - # if scalars_have_timestamps: - # full_scalars_timestamps = np.array(full_scalars_timestamps) - # logger.debug("full_scalars_timestamps.shape: {}".format(full_scalars_timestamps.shape)) - # self._stats['scalars_timestamps'] = { - # 'min': np.min(full_scalars_timestamps), - # 'max': np.max(full_scalars_timestamps), - # 'mean': np.mean(full_scalars_timestamps), - # 'std': np.std(full_scalars_timestamps), - # 'var': np.var(full_scalars_timestamps), - # } - - # def _compute_fields_stats_(self) -> None: - # nb_samples_with_fields = 0 - # fields_have_timestamps = False - # full_fields = [] - # full_fields_timestamps = [] - # for sample in self.samples: - # if 'fields' in sample._data: - # nb_samples_with_fields += 1 - # if isinstance(sample._data['fields'], dict): - # fields_have_timestamps = True - # for k in sample._data['fields']: - # full_fields_timestamps.append(k) - # for val in sample._data['fields'].values(): - # full_fields.append(val) - # elif isinstance(sample._data['fields'], tuple): - # fields_have_timestamps = True - # full_fields_timestamps.append(sample._data['fields'][0]) - # full_fields.append(sample._data['fields'][1]) - # else: - # full_fields.append(sample._data['fields']) - # if nb_samples_with_fields>0: - # full_fields = np.concatenate(full_fields, axis=0) - # logger.debug("full_fields.shape: {}".format(full_fields.shape)) - # self._stats['fields'] = { - # 'min': np.min(full_fields, axis=0), - # 'max': np.max(full_fields, axis=0), - # 'mean': np.mean(full_fields, axis=0), - # 'std': np.std(full_fields, axis=0), - # 'var': np.var(full_fields, axis=0), - # } - # if fields_have_timestamps: - # full_fields_timestamps = np.array(full_fields_timestamps) - # logger.debug("full_fields_timestamps.shape: {}".format(full_fields_timestamps.shape)) - # self._stats['fields_timestamps'] = { - # 'min': np.min(full_fields_timestamps), - # 'max': np.max(full_fields_timestamps), - # 'mean': np.mean(full_fields_timestamps), - # 'std': np.std(full_fields_timestamps), - # 'var': np.var(full_fields_timestamps), - # } - - # def _compute_mesh_stats_(self) -> None: - # nb_samples_with_mesh = 0 - # mesh_have_timestamps = False - # full_mesh = [] - # full_mesh_timestamps = [] - # for sample in self.samples: - # if 'mesh' in sample._data: - # nb_samples_with_mesh += 1 - # if isinstance(sample._data['mesh'], dict): - # 
mesh_have_timestamps = True - # for k in sample._data['mesh']: - # full_mesh_timestamps.append(k) - # for val in sample._data['mesh'].values(): - # full_mesh.append(val) - # elif isinstance(sample._data['mesh'], tuple): - # mesh_have_timestamps = True - # full_mesh_timestamps.append(sample._data['mesh'][0]) - # full_mesh.append(sample._data['mesh'][1]) - # else: - # full_mesh.append(sample._data['mesh']) - # if nb_samples_with_mesh>0: - # full_mesh = np.array(full_mesh) - # logger.debug("full_mesh.shape: {}".format(full_mesh.shape)) - # self._stats['mesh'] = { - # 'min': np.min(full_mesh, axis=0), - # 'max': np.max(full_mesh, axis=0), - # 'mean': np.mean(full_mesh, axis=0), - # 'std': np.std(full_mesh, axis=0), - # 'var': np.var(full_mesh, axis=0), - # } - # if mesh_have_timestamps: - # full_mesh_timestamps = np.array(full_mesh_timestamps) - # logger.debug("full_mesh_timestamps.shape: {}".format(full_mesh_timestamps.shape)) - # self._stats['mesh_timestamps'] = { - # 'min': np.min(full_mesh_timestamps), - # 'max': np.max(full_mesh_timestamps), - # 'mean': np.mean(full_mesh_timestamps), - # 'std': np.std(full_mesh_timestamps), - # 'var': np.var(full_mesh_timestamps), - # } diff --git a/src/plaid/version.py b/src/plaid/version.py new file mode 100644 index 00000000..5852aaaf --- /dev/null +++ b/src/plaid/version.py @@ -0,0 +1,6 @@ +try: + from ._version import __version__ +except ImportError: # pragma: no cover + __version__ = "None" + +__all__ = ["__version__"] diff --git a/tests/__init__.py b/tests/__init__.py index a9efb940..e69de29b 100644 --- a/tests/__init__.py +++ b/tests/__init__.py @@ -1,6 +0,0 @@ -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# diff --git a/tests/bridges/__init__.py b/tests/bridges/__init__.py deleted file mode 100644 index a9efb940..00000000 --- a/tests/bridges/__init__.py +++ /dev/null @@ -1,6 +0,0 @@ -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# diff --git a/tests/bridges/test_huggingface_bridge.py b/tests/bridges/test_huggingface_bridge.py deleted file mode 100644 index b4d0c131..00000000 --- a/tests/bridges/test_huggingface_bridge.py +++ /dev/null @@ -1,371 +0,0 @@ -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. 
-# -# - -# %% Imports - -import pickle -from typing import Callable - -import pytest - -from plaid.bridges import huggingface_bridge -from plaid.containers.dataset import Dataset -from plaid.containers.sample import Sample -from plaid.problem_definition import ProblemDefinition -from plaid.utils import cgns_helper - - -# %% Fixtures -@pytest.fixture() -def dataset(samples, infos) -> Dataset: - samples_ = [] - for i, sample in enumerate(samples): - if i == 1: - sample.add_scalar("toto", 1.0) - samples_.append(sample) - samples_.append(sample) - dataset = Dataset(samples=samples_) - dataset.set_infos(infos) - return dataset - - -@pytest.fixture() -def problem_definition() -> ProblemDefinition: - problem_definition = ProblemDefinition() - problem_definition.set_task("regression") - problem_definition.add_input_scalars_names(["feature_name_1", "feature_name_2"]) - problem_definition.set_split({"train": [0, 2], "test": [1, 3]}) - return problem_definition - - -@pytest.fixture() -def generator(dataset) -> Callable: - def generator_(): - for sample in dataset: - yield sample - - return generator_ - - -@pytest.fixture() -def gen_kwargs(problem_definition) -> dict[str, dict]: - gen_kwargs = {} - for split_name, ids in problem_definition.get_split().items(): - mid = len(ids) // 2 - gen_kwargs[split_name] = {"shards_ids": [ids[:mid], ids[mid:]]} - return gen_kwargs - - -@pytest.fixture() -def generator_split(dataset, problem_definition) -> dict[str, Callable]: - generators_ = {} - - main_splits = problem_definition.get_split() - - for split_name, ids in main_splits.items(): - - def generator_(): - for id in ids: - yield dataset[id] - - generators_[split_name] = generator_ - - return generators_ - - -@pytest.fixture() -def generator_split_with_kwargs(dataset, gen_kwargs) -> dict[str, Callable]: - generators_ = {} - - for split_name in gen_kwargs.keys(): - - def generator_(shards_ids): - for ids in shards_ids: - if isinstance(ids, int): - ids = [ids] - for id in ids: - yield dataset[id] - - generators_[split_name] = generator_ - - return generators_ - - -@pytest.fixture() -def generator_binary(dataset) -> Callable: - def generator_(): - for sample in dataset: - yield { - "sample": pickle.dumps(sample), - } - - return generator_ - - -@pytest.fixture() -def generator_split_binary(dataset, problem_definition) -> dict[str, Callable]: - generators_ = {} - for split_name, ids in problem_definition.get_split().items(): - - def generator_(): - for id in ids: - yield {"sample": pickle.dumps(dataset[id])} - - generators_[split_name] = generator_ - return generators_ - - -@pytest.fixture() -def hf_dataset(generator_binary) -> Dataset: - hf_dataset = huggingface_bridge.plaid_generator_to_huggingface_binary( - generator_binary - ) - return hf_dataset - - -class Test_Huggingface_Bridge: - def assert_sample(self, sample): - assert isinstance(sample, Sample) - assert sample.get_scalar_names()[0] == "test_scalar" - assert "test_field_same_size" in sample.get_field_names() - assert sample.get_field("test_field_same_size").shape[0] == 17 - - def assert_hf_dataset_binary(self, hfds_binary): - self.assert_sample(huggingface_bridge.binary_to_plaid_sample(hfds_binary[0])) - - def assert_plaid_dataset(self, ds): - self.assert_sample(ds[0]) - - # ------------------------------------------------------------------------------ - # HUGGING FACE BRIDGE (with tree flattening and pyarrow tables) - # ------------------------------------------------------------------------------ - - def test_with_datasetdict(self, dataset, 
problem_definition): - main_splits = problem_definition.get_split() - - hf_dataset_dict, flat_cst, key_mappings = ( - huggingface_bridge.plaid_dataset_to_huggingface_datasetdict( - dataset, main_splits - ) - ) - - huggingface_bridge.to_plaid_sample( - hf_dataset_dict["train"], 0, flat_cst["train"], key_mappings["cgns_types"] - ) - huggingface_bridge.to_plaid_sample( - hf_dataset_dict["test"], - 0, - flat_cst["test"], - key_mappings["cgns_types"], - enforce_shapes=False, - ) - huggingface_bridge.to_plaid_dataset( - hf_dataset_dict["train"], flat_cst["train"], key_mappings["cgns_types"] - ) - huggingface_bridge.to_plaid_dataset( - hf_dataset_dict["test"], - flat_cst=flat_cst["test"], - cgns_types=key_mappings["cgns_types"], - enforce_shapes=False, - ) - cgns_helper.compare_cgns_trees(dataset[0].get_tree(), dataset[0].get_tree()) - cgns_helper.compare_cgns_trees_no_types( - dataset[0].get_tree(), dataset[0].get_tree() - ) - - def test_with_generator( - self, generator_split_with_kwargs, generator_split, gen_kwargs - ): - hf_dataset_dict, flat_cst, key_mappings = ( - huggingface_bridge.plaid_generator_to_huggingface_datasetdict( - generator_split_with_kwargs, gen_kwargs - ) - ) - hf_dataset_dict, flat_cst, key_mappings = ( - huggingface_bridge.plaid_generator_to_huggingface_datasetdict( - generator_split - ) - ) - huggingface_bridge.to_plaid_sample( - hf_dataset_dict["train"], 0, flat_cst["train"], key_mappings["cgns_types"] - ) - huggingface_bridge.to_plaid_sample( - hf_dataset_dict["test"], - 0, - flat_cst["test"], - key_mappings["cgns_types"], - enforce_shapes=True, - ) - - # ------------------------------------------------------------------------------ - # HUGGING FACE INTERACTIONS ON DISK - # ------------------------------------------------------------------------------ - - def test_save_load_to_disk( - self, tmp_path, generator_split, infos, problem_definition - ): - hf_dataset_dict, flat_cst, key_mappings = ( - huggingface_bridge.plaid_generator_to_huggingface_datasetdict( - generator_split - ) - ) - - test_dir = tmp_path / "test" - huggingface_bridge.save_dataset_dict_to_disk(test_dir, hf_dataset_dict) - huggingface_bridge.save_infos_to_disk(test_dir, infos) - huggingface_bridge.save_problem_definition_to_disk( - test_dir, "task_1", problem_definition - ) - huggingface_bridge.save_tree_struct_to_disk(test_dir, flat_cst, key_mappings) - - huggingface_bridge.load_dataset_from_disk(test_dir) - huggingface_bridge.load_infos_from_disk(test_dir) - huggingface_bridge.load_problem_definition_from_disk(test_dir, "task_1") - huggingface_bridge.load_tree_struct_from_disk(test_dir) - - # ------------------------------------------------------------------------------ - # HUGGING FACE BINARY BRIDGE - # ------------------------------------------------------------------------------ - - def test_save_load_to_disk_binary( - self, tmp_path, generator_split_binary, infos, problem_definition - ): - hf_dataset_dict = ( - huggingface_bridge.plaid_generator_to_huggingface_datasetdict_binary( - generator_split_binary - ) - ) - test_dir = tmp_path / "test" - huggingface_bridge.save_dataset_dict_to_disk(test_dir, hf_dataset_dict) - huggingface_bridge.save_infos_to_disk(test_dir, infos) - huggingface_bridge.save_problem_definition_to_disk( - test_dir, "task_1", problem_definition - ) - huggingface_bridge.load_dataset_from_disk(test_dir) - huggingface_bridge.load_infos_from_disk(test_dir) - huggingface_bridge.load_problem_definition_from_disk(test_dir, "task_1") - - def test_binary_to_plaid_sample(self, 
generator_binary): - hfds = huggingface_bridge.plaid_generator_to_huggingface_binary( - generator_binary - ) - huggingface_bridge.binary_to_plaid_sample(hfds[0]) - - def test_binary_to_plaid_sample_fallback_build_succeeds(self, dataset): - sample = dataset[0] - old_hf_sample = { - "path": getattr(sample, "path", None), - "scalars": {sn: sample.get_scalar(sn) for sn in sample.get_scalar_names()}, - "meshes": sample.features.data, - } - old_hf_sample = {"sample": pickle.dumps(old_hf_sample)} - plaid_sample = huggingface_bridge.binary_to_plaid_sample(old_hf_sample) - assert isinstance(plaid_sample, Sample) - - def test_plaid_dataset_to_huggingface_binary(self, dataset): - hfds = huggingface_bridge.plaid_dataset_to_huggingface_binary(dataset) - hfds = huggingface_bridge.plaid_dataset_to_huggingface_binary( - dataset, ids=[0, 1] - ) - self.assert_hf_dataset_binary(hfds) - - def test_plaid_dataset_to_huggingface_datasetdict_binary( - self, dataset, problem_definition - ): - huggingface_bridge.plaid_dataset_to_huggingface_datasetdict_binary( - dataset, main_splits=problem_definition.get_split() - ) - - def test_plaid_generator_to_huggingface_binary(self, generator_binary): - hfds = huggingface_bridge.plaid_generator_to_huggingface_binary( - generator_binary - ) - hfds = huggingface_bridge.plaid_generator_to_huggingface_binary( - generator_binary, processes_number=2 - ) - self.assert_hf_dataset_binary(hfds) - - def test_plaid_generator_to_huggingface_datasetdict_binary( - self, generator_split_binary - ): - huggingface_bridge.plaid_generator_to_huggingface_datasetdict_binary( - generator_split_binary - ) - - def test_huggingface_dataset_to_plaid(self, hf_dataset): - ds, _ = huggingface_bridge.huggingface_dataset_to_plaid(hf_dataset) - self.assert_plaid_dataset(ds) - - def test_huggingface_dataset_to_plaid_no_warning(self, hf_dataset, caplog): - """Test that huggingface_dataset_to_plaid does not trigger infos replacement warning.""" - import logging - - with caplog.at_level(logging.WARNING): - ds, _ = huggingface_bridge.huggingface_dataset_to_plaid( - hf_dataset, verbose=False - ) - - # Should not warn about replacing infos - assert "infos not empty, replacing it anyway" not in caplog.text - # Dataset should still be valid - self.assert_plaid_dataset(ds) - - def test_huggingface_dataset_to_plaid_with_ids_binary(self, hf_dataset): - huggingface_bridge.huggingface_dataset_to_plaid(hf_dataset, ids=[0, 1]) - - def test_huggingface_dataset_to_plaid_large_binary(self, hf_dataset): - huggingface_bridge.huggingface_dataset_to_plaid( - hf_dataset, processes_number=2, large_dataset=True - ) - - def test_huggingface_dataset_to_plaid_large_binary_2(self, hf_dataset): - huggingface_bridge.huggingface_dataset_to_plaid(hf_dataset, processes_number=2) - - def test_huggingface_dataset_to_plaid_with_ids_large_binary(self, hf_dataset): - with pytest.raises(NotImplementedError): - huggingface_bridge.huggingface_dataset_to_plaid( - hf_dataset, ids=[0, 1], processes_number=2, large_dataset=True - ) - - def test_huggingface_dataset_to_plaid_error_processes_number_binary( - self, hf_dataset - ): - with pytest.raises(AssertionError): - huggingface_bridge.huggingface_dataset_to_plaid( - hf_dataset, processes_number=128 - ) - - def test_huggingface_dataset_to_plaid_error_processes_number_binary_2( - self, hf_dataset - ): - with pytest.raises(AssertionError): - huggingface_bridge.huggingface_dataset_to_plaid( - hf_dataset, ids=[0], processes_number=2 - ) - - def test_huggingface_description_to_problem_definition(self, 
hf_dataset): - huggingface_bridge.huggingface_description_to_problem_definition( - hf_dataset.description - ) - - def test_huggingface_description_to_infos(self, infos): - hf_description = {} - hf_description.update(infos) - huggingface_bridge.huggingface_description_to_infos(hf_description) - - # ---- Deprecated ---- - def test_create_string_for_huggingface_dataset_card(self, infos): - dataset_card = "---\ndataset_name: my_dataset\n---" - - huggingface_bridge.update_dataset_card( - dataset_card=dataset_card, - infos=infos, - pretty_name="2D quasistatic non-linear structural mechanics solutions", - dataset_long_description="my long description", - illustration_urls=["url0", "url1"], - arxiv_paper_urls=["url2"], - ) diff --git a/tests/conftest.py b/tests/conftest.py index 09c1aec9..8777f3f8 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -31,21 +31,21 @@ def generate_samples_no_string(nb: int, zone_name: str, base_name: str) -> list[ sample.init_zone( np.array([[17, 10, 0]]), zone_name=zone_name, base_name=base_name ) - sample.add_scalar("test_scalar", float(i)) - sample.add_scalar("test_scalar_2", float(i**2)) + sample.add_global("test_scalar", float(i)) + sample.add_global("test_scalar_2", float(i**2)) sample.add_global("global_0", 0.5 + np.ones((2, 3))) sample.add_global("global_1", 1.5 + i + np.ones((2, 3, 2))) sample.add_field( name="test_field_same_size", field=float(i**4) * np.ones(17), - zone_name=zone_name, - base_name=base_name, + zone=zone_name, + base=base_name, ) sample.add_field( name="test_field_2785", field=float(i**5) * np.ones(10), - zone_name=zone_name, - base_name=base_name, + zone=zone_name, + base=base_name, location="CellCenter", ) sample_list.append(sample) @@ -207,36 +207,34 @@ def empty_dataset(): @pytest.fixture() -def dataset_with_samples(dataset, samples, infos): - dataset.add_samples(samples) - dataset.set_infos(infos) +def dataset_with_samples(dataset, samples): + dataset.get_backend().add_sample(samples, list(range(len(samples)) )) return dataset @pytest.fixture() -def dataset_with_samples_with_tree(samples_with_tree, infos): +def dataset_with_samples_with_tree(samples_with_tree): dataset = Dataset() - dataset.add_samples(samples_with_tree) - dataset.set_infos(infos) + dataset.get_backend_new().add_sample(samples_with_tree) return dataset @pytest.fixture() def other_dataset_with_samples(other_samples): other_dataset = Dataset() - other_dataset.add_samples(other_samples) + other_dataset.get_backend_new().add_sample(other_samples) return other_dataset @pytest.fixture() def heterogeneous_dataset(dataset_with_samples_with_tree): dataset = dataset_with_samples_with_tree.copy() - dataset.add_sample(Sample()) + dataset.get_backend_new().add_sample(Sample()) sample_with_scalar = Sample() - sample_with_scalar.add_scalar("scalar", 1.0) - dataset.add_sample(sample_with_scalar) + sample_with_scalar.add_global("scalar", 1.0) + dataset.get_backend_new().add_sample(sample_with_scalar) sample_with_ts = Sample() - dataset.add_sample(sample_with_ts) + dataset.get_backend_new().add_sample(sample_with_ts) return dataset @@ -244,10 +242,10 @@ def heterogeneous_dataset(dataset_with_samples_with_tree): def scalar_dataset(): dataset = Dataset() sample = Sample() - sample.add_scalar("test_scalar", 0.0) - dataset.add_sample(sample) + sample.add_global("test_scalar", 0.0) + dataset.get_backend_new().add_sample(sample) sample2 = Sample() for i in range(8): - sample2.add_scalar(f"scalar_{i}", float(i)) - dataset.add_sample(sample2) + sample2.add_global(f"scalar_{i}", float(i)) 
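> Reviewer aside: the conftest changes above double as a migration cheat-sheet for the container API. A hedged before/after sketch assembled only from calls visible in this diff; it is not independently verified to run, zone/base names and values are placeholders, and `get_backend_new` in particular looks like an interim name that may change before release:

```python
import numpy as np

from plaid.containers.dataset import Dataset
from plaid.containers.sample import Sample

sample = Sample()
sample.init_zone(np.array([[17, 10, 0]]), zone_name="Zone", base_name="Base_2_3")

# main:           sample.add_scalar("test_scalar", 0.0)
# BasicCleaning:  scalars are now plain globals
sample.add_global("test_scalar", 0.0)

# main:           sample.add_field(..., zone_name=..., base_name=...)
# BasicCleaning:  keyword arguments shortened to zone= / base=
sample.add_field(name="test_field", field=np.ones(17), zone="Zone", base="Base_2_3")

dataset = Dataset()
# main:           dataset.add_sample(sample)
# BasicCleaning:  samples are added through the storage backend
dataset.get_backend_new().add_sample(sample)
```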
+ dataset.get_backend_new().add_sample(sample2) return dataset diff --git a/tests/containers/__init__.py b/tests/containers/__init__.py index a9efb940..e69de29b 100644 --- a/tests/containers/__init__.py +++ b/tests/containers/__init__.py @@ -1,6 +0,0 @@ -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# diff --git a/tests/containers/dataset/cgns_types.yaml b/tests/containers/dataset/cgns_types.yaml new file mode 100644 index 00000000..8355d699 --- /dev/null +++ b/tests/containers/dataset/cgns_types.yaml @@ -0,0 +1,19 @@ +CGNSLibraryVersion: CGNSLibraryVersion_t +Base_2_3: CGNSBase_t +Base_2_3/2D: Family_t +Base_2_3/Zone: Zone_t +Base_2_3/Zone/ZoneType: ZoneType_t +Base_2_3/Zone/GridCoordinates: GridCoordinates_t +Base_2_3/Zone/GridCoordinates/CoordinateX: DataArray_t +Base_2_3/Zone/GridCoordinates/CoordinateY: DataArray_t +Base_2_3/Zone/GridCoordinates/CoordinateZ: DataArray_t +Base_2_3/Zone/Elements_TRI_3: Elements_t +Base_2_3/Zone/Elements_TRI_3/ElementRange: IndexRange_t +Base_2_3/Zone/Elements_TRI_3/ElementConnectivity: DataArray_t +Base_2_3/Zone/VertexFields: FlowSolution_t +Base_2_3/Zone/VertexFields/GridLocation: GridLocation_t +Base_2_3/Zone/VertexFields/pressure: DataArray_t +Base_2_3/Zone/FamilyName: FamilyName_t +Base_2_3/Time: BaseIterativeData_t +Base_2_3/Time/IterationValues: DataArray_t +Base_2_3/Time/TimeValues: DataArray_t diff --git a/tests/containers/dataset/constants/test/constant_schema.yaml b/tests/containers/dataset/constants/test/constant_schema.yaml new file mode 100644 index 00000000..31c0fa8d --- /dev/null +++ b/tests/containers/dataset/constants/test/constant_schema.yaml @@ -0,0 +1,69 @@ +Base_2_3: + dtype: int32 + ndim: 1 +Base_2_3/2D: + dtype: null +Base_2_3/2D_times: + dtype: null +Base_2_3/Zone: + dtype: int64 + ndim: 2 +Base_2_3/Zone/Elements_TRI_3: + dtype: int32 + ndim: 1 +Base_2_3/Zone/Elements_TRI_3/ElementConnectivity_times: + dtype: float64 + ndim: 1 +Base_2_3/Zone/Elements_TRI_3/ElementRange: + dtype: int64 + ndim: 1 +Base_2_3/Zone/Elements_TRI_3/ElementRange_times: + dtype: float64 + ndim: 1 +Base_2_3/Zone/Elements_TRI_3_times: + dtype: float64 + ndim: 1 +Base_2_3/Zone/FamilyName: + dtype: string + ndim: 1 +Base_2_3/Zone/FamilyName_times: + dtype: float64 + ndim: 1 +Base_2_3/Zone/GridCoordinates: + dtype: null +Base_2_3/Zone/GridCoordinates/CoordinateX_times: + dtype: float64 + ndim: 1 +Base_2_3/Zone/GridCoordinates/CoordinateY_times: + dtype: float64 + ndim: 1 +Base_2_3/Zone/GridCoordinates/CoordinateZ_times: + dtype: float64 + ndim: 1 +Base_2_3/Zone/GridCoordinates_times: + dtype: null +Base_2_3/Zone/VertexFields: + dtype: null +Base_2_3/Zone/VertexFields/GridLocation: + dtype: string + ndim: 1 +Base_2_3/Zone/VertexFields/GridLocation_times: + dtype: float64 + ndim: 1 +Base_2_3/Zone/VertexFields/pressure_times: + dtype: float64 + ndim: 1 +Base_2_3/Zone/VertexFields_times: + dtype: null +Base_2_3/Zone/ZoneType: + dtype: string + ndim: 1 +Base_2_3/Zone/ZoneType_times: + dtype: float64 + ndim: 1 +Base_2_3/Zone_times: + dtype: float64 + ndim: 1 +Base_2_3_times: + dtype: float64 + ndim: 1 diff --git a/tests/containers/dataset/constants/test/data.mmap b/tests/containers/dataset/constants/test/data.mmap new file mode 100644 index 00000000..ebf489da Binary files /dev/null and b/tests/containers/dataset/constants/test/data.mmap differ diff --git a/tests/containers/dataset/constants/test/layout.json 
b/tests/containers/dataset/constants/test/layout.json new file mode 100644 index 00000000..ec7cdda4 --- /dev/null +++ b/tests/containers/dataset/constants/test/layout.json @@ -0,0 +1,142 @@ +{ + "Base_2_3": { + "offset": 0, + "shape": [ + 2 + ], + "dtype": " Path: class Test_Sample: # -------------------------------------------------------------------------# def test___init__(self, current_directory): - sample_path_1 = current_directory / "dataset" / "samples" / "sample_000000000" - sample_path_2 = current_directory / "dataset" / "samples" / "sample_000000001" - sample_path_3 = current_directory / "dataset" / "samples" / "sample_000000002" + sample_path_1 = ( + current_directory / "dataset" / "data" / "test" / "sample_000000000" + ) + sample_path_2 = ( + current_directory / "dataset" / "data" / "test" / "sample_000000001" + ) + sample_path_3 = ( + current_directory / "dataset" / "data" / "test" / "sample_000000002" + ) sample_already_filled_1 = Sample(path=sample_path_1) sample_already_filled_2 = Sample(path=sample_path_2) sample_already_filled_3 = Sample(path=sample_path_3) @@ -182,23 +180,24 @@ def test__init__unknown_directory(self, current_directory): Sample(path=sample_path) def test__init__file_provided(self, current_directory): - sample_path = current_directory / "dataset" / "samples" / "sample_000067392" + sample_path = ( + current_directory + / "dataset" + / "data" + / "test" + / "sample_000000000" + / "meshes" + / "mesh_000000000.cgns" + ) with pytest.raises(FileExistsError): Sample(path=sample_path) def test__init__path(self, current_directory): - sample_path = current_directory / "dataset" / "samples" / "sample_000000000" + sample_path = ( + current_directory / "dataset" / "data" / "test" / "sample_000000000" + ) Sample(path=sample_path) - # def test__init__directory_path(self, current_directory): - # sample_path = current_directory / "dataset" / "samples" / "sample_000000000" - # Sample(directory_path=sample_path) - - # def test__init__both_path_and_directory_path(self, current_directory): - # sample_path = current_directory / "dataset" / "samples" / "sample_000000000" - # with pytest.raises(ValueError): - # Sample(path=sample_path, directory_path=sample_path) - def test_copy(self, sample_with_tree_and_scalar): sample_with_tree_and_scalar.copy() @@ -643,91 +642,74 @@ def test_get_zone(self, sample: Sample, zone_name, base_name): # -------------------------------------------------------------------------# def test_get_scalar_names(self, sample: Sample): - assert sample.get_scalar_names() == [] + assert sample.get_global_names() == [] + + def test_get_global_names_at_specific_time(self, sample: Sample): + sample.add_global("g_t0", np.array([1.0]), time=0.0) + sample.add_global("g_t1", np.array([2.0]), time=1.0) + + assert sample.get_global_names(time=0.0) == ["g_t0"] + assert sample.get_global_names(time=1.0) == ["g_t1"] def test_get_scalar_empty(self, sample): - assert sample.get_scalar("missing_scalar_name") is None + assert sample.get_global("missing_scalar_name") is None def test_get_scalar(self, sample_with_scalar): - assert sample_with_scalar.get_scalar("missing_scalar_name") is None - assert sample_with_scalar.get_scalar("test_scalar_1") is not None - assert isinstance(sample_with_scalar.get_scalar("test_scalar_1"), np.float64) + assert sample_with_scalar.get_global("missing_scalar_name") is None + assert sample_with_scalar.get_global("test_scalar_1") is not None + assert isinstance(sample_with_scalar.get_global("test_scalar_1"), np.float64) def 
test_scalars_add_empty(self, sample_with_scalar): - assert isinstance(sample_with_scalar.get_scalar("test_scalar_1"), float) + assert isinstance(sample_with_scalar.get_global("test_scalar_1"), float) def test_scalars_add(self, sample_with_scalar): - sample_with_scalar.add_scalar("test_scalar_2", np.random.randn()) + sample_with_scalar.add_global("test_scalar_2", np.random.randn()) def test_del_scalar_unknown_scalar(self, sample_with_scalar): with pytest.raises(KeyError): - sample_with_scalar.del_scalar("non_existent_scalar") + sample_with_scalar.del_global("non_existent_scalar") def test_del_scalar_no_scalar(self): sample = Sample() with pytest.raises(KeyError): - sample.del_scalar("non_existent_scalar") + sample.del_global("non_existent_scalar") - def test_del_scalar(self, sample_with_scalar): - assert len(sample_with_scalar.get_scalar_names()) == 1 + def test_del_global(self, sample_with_scalar): + assert len(sample_with_scalar.get_global_names()) == 1 - sample_with_scalar.add_scalar("test_scalar_2", np.random.randn(5)) - assert len(sample_with_scalar.get_scalar_names()) == 2 + sample_with_scalar.add_global("test_scalar_2", np.random.randn(5)) + assert len(sample_with_scalar.get_global_names()) == 2 - scalar = sample_with_scalar.del_scalar("test_scalar_1") - assert len(sample_with_scalar.get_scalar_names()) == 1 + scalar = sample_with_scalar.del_global("test_scalar_1") + assert len(sample_with_scalar.get_global_names()) == 1 assert scalar is not None assert isinstance(scalar, float) - scalar = sample_with_scalar.del_scalar("test_scalar_2") - assert len(sample_with_scalar.get_scalar_names()) == 0 + scalar = sample_with_scalar.del_global("test_scalar_2") + assert len(sample_with_scalar.get_global_names()) == 0 assert scalar is not None assert isinstance(scalar, np.ndarray) - def test_add_feature(self, sample_with_scalar): - sample_with_scalar.add_feature( - feature_identifier=FeatureIdentifier( - {"type": "scalar", "name": "test_scalar_2"} - ), + def test_add_feature(self, sample_with_tree3d): + sample_with_tree3d.add_feature( + feature_path="Global/test_scalar_2", feature=[3.1415], ) - def test_del_feature(self, sample_with_scalar: Sample, sample_with_tree3d: Sample): - sample_with_scalar.del_feature( - feature_identifier=FeatureIdentifier( - {"type": "scalar", "name": "test_scalar_1"} - ), - ) - assert sample_with_scalar.get_all_features_identifiers_by_type("scalar") == [] - sample_with_tree3d.del_feature( - feature_identifier=FeatureIdentifier( - {"type": "field", "name": "test_node_field_1"} - ), - ) - sample_with_tree3d.del_feature( - feature_identifier=FeatureIdentifier( - {"type": "field", "name": "big_node_field"} - ), + sample_with_tree3d.add_feature( + feature_path="Base_2_3/Zone/VertexFields/pressure", + feature=np.arange(5), ) - sample_with_tree3d.del_feature( - feature_identifier=FeatureIdentifier( - {"type": "field", "name": "test_elem_field_1", "location": "CellCenter"} - ), - ) - sample_with_tree3d.del_feature( - feature_identifier=FeatureIdentifier( - {"type": "field", "name": "OriginalIds"} - ), - ) - sample_with_tree3d.del_feature( - feature_identifier=FeatureIdentifier( - {"type": "field", "name": "OriginalIds", "location": "CellCenter"} - ), + + sample_with_tree3d.add_feature( + feature_path="Base_2_3/Zone/GridCoordinates", + feature=np.zeros((5, 3)), ) - with pytest.raises(NotImplementedError): - sample_with_tree3d.del_feature( - feature_identifier=FeatureIdentifier({"type": "nodes"}), - ) + + def test_del_feature(self, sample_with_scalar: Sample, sample_with_tree3d: 
Sample): + sample_with_scalar.del_feature_by_path(path="Global/test_scalar_1") + assert sample_with_scalar.get_all_features_identifiers_by_type("scalar") == [] + sample_with_tree3d.del_feature_by_path("Base_2_3/Zone/VertexFields/test_node_field_1") # -------------------------------------------------------------------------# def test_get_nodal_tags_empty(self, sample): @@ -746,6 +728,15 @@ def test_get_nodes(self, sample_with_tree, nodes): def test_get_nodes3d(self, sample_with_tree3d, nodes3d): assert np.all(sample_with_tree3d.get_nodes() == nodes3d) + def test_get_nodes_by_coordinate_name(self, sample_with_tree, nodes): + assert np.all(sample_with_tree.get_nodes(name="CoordinateX") == nodes[:, 0]) + assert np.all(sample_with_tree.get_nodes(name="CoordinateY") == nodes[:, 1]) + sample_with_tree.get_nodes(name="CoordinateZ") + + def test_get_nodes_unknown_coordinate_name(self, sample_with_tree): + with pytest.raises(ValueError): + sample_with_tree.get_nodes(name="UnknownCoordinate") + def test_set_nodes(self, sample, nodes, zone_name, base_name): sample.init_base(3, 3, base_name) with pytest.raises(KeyError): @@ -841,128 +832,128 @@ def test_get_field_names_several_bases(self): name="vertex_Zone_1_Base_1_2_t_m0.1", field=np.random.randn(5), location="Vertex", - zone_name="Zone_1", - base_name="Base_1_2", + zone="Zone_1", + base="Base_1_2", time=-0.1, ) sample.add_field( name="cell_Zone_1_Base_1_2_t_m0.1", field=np.random.randn(3), location="CellCenter", - zone_name="Zone_1", - base_name="Base_1_2", + zone="Zone_1", + base="Base_1_2", time=-0.1, ) sample.add_field( name="vertex_Zone_2_Base_1_2_t_m0.1", field=np.random.randn(5), location="Vertex", - zone_name="Zone_2", - base_name="Base_1_2", + zone="Zone_2", + base="Base_1_2", time=-0.1, ) sample.add_field( name="cell_Zone_2_Base_1_2_t_m0.1", field=np.random.randn(3), location="CellCenter", - zone_name="Zone_2", - base_name="Base_1_2", + zone="Zone_2", + base="Base_1_2", time=-0.1, ) sample.add_field( name="vertex_Zone_1_Base_2_2_t_m0.1", field=np.random.randn(5), location="Vertex", - zone_name="Zone_1", - base_name="Base_2_2", + zone="Zone_1", + base="Base_2_2", time=-0.1, ) sample.add_field( name="cell_Zone_1_Base_2_2_t_m0.1", field=np.random.randn(3), location="CellCenter", - zone_name="Zone_1", - base_name="Base_2_2", + zone="Zone_1", + base="Base_2_2", time=-0.1, ) sample.add_field( name="vertex_Zone_2_Base_2_2_t_m0.1", field=np.random.randn(5), location="Vertex", - zone_name="Zone_2", - base_name="Base_2_2", + zone="Zone_2", + base="Base_2_2", time=-0.1, ) sample.add_field( name="cell_Zone_2_Base_2_2_t_m0.1", field=np.random.randn(3), location="CellCenter", - zone_name="Zone_2", - base_name="Base_2_2", + zone="Zone_2", + base="Base_2_2", time=-0.1, ) sample.add_field( name="vertex_Zone_1_Base_1_3_t_1.0", field=np.random.randn(5), location="Vertex", - zone_name="Zone_1", - base_name="Base_1_3", + zone="Zone_1", + base="Base_1_3", time=1.0, ) sample.add_field( name="cell_Zone_1_Base_1_3_t_1.0", field=np.random.randn(3), location="CellCenter", - zone_name="Zone_1", - base_name="Base_1_3", + zone="Zone_1", + base="Base_1_3", time=1.0, ) sample.add_field( name="vertex_Zone_2_Base_1_3_t_1.0", field=np.random.randn(5), location="Vertex", - zone_name="Zone_2", - base_name="Base_1_3", + zone="Zone_2", + base="Base_1_3", time=1.0, ) sample.add_field( name="cell_Zone_2_Base_1_3_t_1.0", field=np.random.randn(3), location="CellCenter", - zone_name="Zone_2", - base_name="Base_1_3", + zone="Zone_2", + base="Base_1_3", time=1.0, ) sample.add_field( 
name="vertex_Zone_1_Base_3_3_t_1.0", field=np.random.randn(5), location="Vertex", - zone_name="Zone_1", - base_name="Base_3_3", + zone="Zone_1", + base="Base_3_3", time=1.0, ) sample.add_field( name="cell_Zone_1_Base_3_3_t_1.0", field=np.random.randn(3), location="CellCenter", - zone_name="Zone_1", - base_name="Base_3_3", + zone="Zone_1", + base="Base_3_3", time=1.0, ) sample.add_field( name="vertex_Zone_2_Base_3_3_t_1.0", field=np.random.randn(5), location="Vertex", - zone_name="Zone_2", - base_name="Base_3_3", + zone="Zone_2", + base="Base_3_3", time=1.0, ) sample.add_field( name="cell_Zone_2_Base_3_3_t_1.0", field=np.random.randn(3), location="CellCenter", - zone_name="Zone_2", - base_name="Base_3_3", + zone="Zone_2", + base="Base_3_3", time=1.0, ) expected_field_names = [ @@ -987,12 +978,12 @@ def test_get_field_names_several_bases(self): sample.add_field( name="field_of_ints", field=np.arange(5), - zone_name="Zone_2", - base_name="Base_3_3", + zone="Zone_2", + base="Base_3_3", time=1.0, ) field = sample.get_field( - "field_of_ints", zone_name="Zone_2", base_name="Base_3_3", time=1.0 + "field_of_ints", zone="Zone_2", base="Base_3_3", time=1.0 ) assert field.dtype == np.float64 @@ -1013,15 +1004,15 @@ def test_add_field_vertex(self, sample: Sample, vertex_field, zone_name, base_na sample.add_field( name="test_node_field_2", field=vertex_field, - zone_name=zone_name, - base_name=base_name, + zone=zone_name, + base=base_name, ) with pytest.raises(ValueError): sample.add_field( name="test_node_field_2", field=np.zeros((5, 2)), - zone_name=zone_name, - base_name=base_name, + zone=zone_name, + base=base_name, ) sample.init_zone( np.array([[5, 3, 0]]), zone_name=zone_name, base_name=base_name @@ -1029,15 +1020,15 @@ def test_add_field_vertex(self, sample: Sample, vertex_field, zone_name, base_na sample.add_field( name="test_node_field_2", field=vertex_field, - zone_name=zone_name, - base_name=base_name, + zone=zone_name, + base=base_name, ) with pytest.raises(ValueError): sample.add_field( name="test_node_field_2", field=np.zeros((13)), - zone_name=zone_name, - base_name=base_name, + zone=zone_name, + base=base_name, ) def test_add_field_cell_center( @@ -1049,8 +1040,8 @@ def test_add_field_cell_center( name="test_elem_field_2", field=cell_center_field, location="CellCenter", - zone_name=zone_name, - base_name=base_name, + zone=zone_name, + base=base_name, ) sample.init_zone( np.array([[5, 3, 0]]), zone_name=zone_name, base_name=base_name @@ -1059,16 +1050,16 @@ def test_add_field_cell_center( name="test_elem_field_2", location="CellCenter", field=cell_center_field, - zone_name=zone_name, - base_name=base_name, + zone=zone_name, + base=base_name, ) with pytest.raises(ValueError): sample.add_field( name="test_elem_field_2", location="CellCenter", field=np.zeros((13)), - zone_name=zone_name, - base_name=base_name, + zone=zone_name, + base=base_name, ) def test_add_field_vertex_already_present( @@ -1079,8 +1070,8 @@ def test_add_field_vertex_already_present( sample_with_tree.add_field( name="test_node_field_1", field=vertex_field, - zone_name="Zone", - base_name="Base_2_2", + zone="Zone", + base="Base_2_2", ) def test_add_field_cell_center_already_present( @@ -1092,8 +1083,8 @@ def test_add_field_cell_center_already_present( name="test_elem_field_1", field=cell_center_field, location="CellCenter", - zone_name="Zone", - base_name="Base_2_2", + zone="Zone", + base="Base_2_2", ) def test_del_field_existing(self, sample_with_tree): @@ -1101,15 +1092,15 @@ def test_del_field_existing(self, 
sample_with_tree): sample_with_tree.del_field( name="unknown", location="CellCenter", - zone_name="Zone", - base_name="Base_2_2", + zone="Zone", + base="Base_2_2", ) with pytest.raises(KeyError): sample_with_tree.del_field( name="unknown", location="CellCenter", - zone_name="unknown_zone", - base_name="Base_2_2", + zone="unknown_zone", + base="Base_2_2", ) def test_del_field_nonexistent(self, base_name): @@ -1119,8 +1110,8 @@ def test_del_field_nonexistent(self, base_name): sample.del_field( name="unknown", location="CellCenter", - zone_name="unknown_zone", - base_name=base_name, + zone="unknown_zone", + base=base_name, ) def test_del_field_in_zone(self, zone_name, base_name, cell_center_field): @@ -1133,8 +1124,8 @@ def test_del_field_in_zone(self, zone_name, base_name, cell_center_field): name="test_elem_field_1", field=cell_center_field, location="CellCenter", - zone_name=zone_name, - base_name=base_name, + zone=zone_name, + base=base_name, ) # Add field 'test_elem_field_2' @@ -1142,15 +1133,15 @@ def test_del_field_in_zone(self, zone_name, base_name, cell_center_field): name="test_elem_field_2", field=cell_center_field, location="CellCenter", - zone_name=zone_name, - base_name=base_name, + zone=zone_name, + base=base_name, ) assert isinstance( sample.get_field( name="test_elem_field_2", location="CellCenter", - zone_name=zone_name, - base_name=base_name, + zone=zone_name, + base=base_name, ), np.ndarray, ) @@ -1159,8 +1150,8 @@ def test_del_field_in_zone(self, zone_name, base_name, cell_center_field): new_tree = sample.del_field( name="test_elem_field_2", location="CellCenter", - zone_name=zone_name, - base_name=base_name, + zone=zone_name, + base=base_name, ) # Testing new tree on field 'test_elem_field_2' @@ -1171,13 +1162,13 @@ def test_del_field_in_zone(self, zone_name, base_name, cell_center_field): new_sample.get_field( name="test_elem_field_2", location="CellCenter", - zone_name=zone_name, - base_name=base_name, + zone=zone_name, + base=base_name, ) is None ) fields = new_sample.get_field_names( - location="CellCenter", zone_name=zone_name, base_name=base_name + location="CellCenter", zone=zone_name, base=base_name ) assert "test_elem_field_2" not in fields @@ -1187,8 +1178,8 @@ def test_del_field_in_zone(self, zone_name, base_name, cell_center_field): new_tree = sample.del_field( name="test_elem_field_1", location="CellCenter", - zone_name=zone_name, - base_name=base_name, + zone=zone_name, + base=base_name, ) # Testing new tree on field 'test_elem_field_1' @@ -1199,423 +1190,78 @@ def test_del_field_in_zone(self, zone_name, base_name, cell_center_field): new_sample.get_field( name="test_elem_field_1", location="CellCenter", - zone_name=zone_name, - base_name=base_name, + zone=zone_name, + base=base_name, ) is None ) fields = new_sample.get_field_names( - location="CellCenter", zone_name=zone_name, base_name=base_name + location="CellCenter", zone=zone_name, base=base_name ) assert len(fields) == 0 - def test_del_all_fields(self, sample_with_tree): - sample_with_tree.del_all_fields() - # -------------------------------------------------------------------------# def test_get_feature_by_path(self, sample_with_tree_and_scalar): sample_with_tree_and_scalar.get_feature_by_path( "Base_2_2/Zone/Elements_TRI_3/ElementConnectivity", 0.0 ) - def test_get_feature_from_string_identifier(self, sample_with_tree_and_scalar): - sample_with_tree_and_scalar.get_feature_from_string_identifier( - "scalar::test_scalar_1" - ) - - sample_with_tree_and_scalar.get_feature_from_string_identifier( - 
"field::test_node_field_1" - ) - sample_with_tree_and_scalar.get_feature_from_string_identifier( - "field::test_node_field_1///Base_2_2" - ) - sample_with_tree_and_scalar.get_feature_from_string_identifier( - "field::test_node_field_1//Zone/Base_2_2" - ) - sample_with_tree_and_scalar.get_feature_from_string_identifier( - "field::test_node_field_1/Vertex/Zone/Base_2_2" - ) - sample_with_tree_and_scalar.get_feature_from_string_identifier( - "field::test_node_field_1/Vertex/Zone/Base_2_2/0" - ) - - sample_with_tree_and_scalar.get_feature_from_string_identifier("nodes::") - sample_with_tree_and_scalar.get_feature_from_string_identifier( - "nodes::/Base_2_2" - ) - sample_with_tree_and_scalar.get_feature_from_string_identifier( - "nodes::Zone/Base_2_2" - ) - sample_with_tree_and_scalar.get_feature_from_string_identifier( - "nodes::Zone/Base_2_2/0" - ) - def test_get_feature_from_identifier(self, sample_with_tree_and_scalar): - sample_with_tree_and_scalar.get_feature_from_identifier( - {"type": "scalar", "name": "test_scalar_1"} - ) - - sample_with_tree_and_scalar.get_feature_from_identifier( - {"type": "field", "name": "test_node_field_1"} - ) - sample_with_tree_and_scalar.get_feature_from_identifier( - {"type": "field", "name": "test_node_field_1", "base_name": "Base_2_2"} - ) - sample_with_tree_and_scalar.get_feature_from_identifier( - { - "type": "field", - "name": "test_node_field_1", - "base_name": "Base_2_2", - "zone_name": "Zone", - } - ) - sample_with_tree_and_scalar.get_feature_from_identifier( - { - "type": "field", - "name": "test_node_field_1", - "base_name": "Base_2_2", - "zone_name": "Zone", - "location": "Vertex", - } - ) - sample_with_tree_and_scalar.get_feature_from_identifier( - { - "type": "field", - "name": "test_node_field_1", - "base_name": "Base_2_2", - "zone_name": "Zone", - "location": "Vertex", - "time": 0.0, - } - ) - sample_with_tree_and_scalar.get_feature_from_identifier( - {"type": "field", "name": "test_node_field_1", "time": 0.0} - ) - sample_with_tree_and_scalar.get_feature_from_identifier( - { - "type": "field", - "name": "test_node_field_1", - "base_name": "Base_2_2", - "time": 0.0, - } - ) - sample_with_tree_and_scalar.get_feature_from_identifier( - { - "type": "field", - "name": "test_node_field_1", - "zone_name": "Zone", - "location": "Vertex", - "time": 0.0, - } - ) - - sample_with_tree_and_scalar.get_feature_from_identifier({"type": "nodes"}) - sample_with_tree_and_scalar.get_feature_from_identifier( - {"type": "nodes", "base_name": "Base_2_2"} - ) - sample_with_tree_and_scalar.get_feature_from_identifier( - {"type": "nodes", "base_name": "Base_2_2", "zone_name": "Zone"} - ) - sample_with_tree_and_scalar.get_feature_from_identifier( - {"type": "nodes", "base_name": "Base_2_2", "zone_name": "Zone", "time": 0.0} - ) - sample_with_tree_and_scalar.get_feature_from_identifier( - {"type": "nodes", "zone_name": "Zone"} - ) - sample_with_tree_and_scalar.get_feature_from_identifier( - {"type": "nodes", "time": 0.0} - ) - - def test_get_features_from_identifiers(self, sample_with_tree_and_scalar): - sample_with_tree_and_scalar.get_features_from_identifiers( - [{"type": "scalar", "name": "test_scalar_1"}] - ) - sample_with_tree_and_scalar.get_features_from_identifiers( - [ - {"type": "scalar", "name": "test_scalar_1"}, - ] - ) - - sample_with_tree_and_scalar.get_features_from_identifiers( - [ - { - "type": "field", - "name": "test_node_field_1", - "base_name": "Base_2_2", - "zone_name": "Zone", - "location": "Vertex", - "time": 0.0, - }, - {"type": "scalar", "name": 
"test_scalar_1"}, - {"type": "nodes"}, - ] - ) - - def test_update_features_from_identifier(self, sample_with_tree_and_scalar): - before = sample_with_tree_and_scalar.get_scalar("test_scalar_1") - sample_ = sample_with_tree_and_scalar.update_features_from_identifier( - feature_identifiers={"type": "scalar", "name": "test_scalar_1"}, - features=3.141592, - in_place=False, - ) - after = sample_.get_scalar("test_scalar_1") - show_cgns_tree(sample_.features.data[0]) - assert after != before - - before = sample_with_tree_and_scalar.get_field( - name="test_node_field_1", - zone_name="Zone", - base_name="Base_2_2", - location="Vertex", - time=0.0, - ) - sample_ = sample_with_tree_and_scalar.update_features_from_identifier( - feature_identifiers=FeatureIdentifier( - { - "type": "field", - "name": "test_node_field_1", - "base_name": "Base_2_2", - "zone_name": "Zone", - "location": "Vertex", - "time": 0.0, - } - ), - features=np.random.rand(*before.shape), - in_place=False, + sample_with_tree_and_scalar.get_feature_by_path( + "Base_2_2/Zone/GridCoordinates/CoordinateX" + ) is not None + print(sample_with_tree_and_scalar.show_tree()) + assert ( + sample_with_tree_and_scalar.get_feature_by_path( + "Base_2_2/Zone/VertexFields/test_node_field_1" + ) + is not None ) - after = sample_.get_field( - name="test_node_field_1", - zone_name="Zone", - base_name="Base_2_2", - location="Vertex", - time=0.0, + assert ( + sample_with_tree_and_scalar.get_feature_by_path("Global/test_scalar_1") + is not None ) - assert np.any(~np.isclose(after, before)) - before = sample_with_tree_and_scalar.get_nodes( - zone_name="Zone", base_name="Base_2_2", time=0.0 - ) - sample_ = sample_with_tree_and_scalar.update_features_from_identifier( - feature_identifiers=FeatureIdentifier( - { - "type": "nodes", - "base_name": "Base_2_2", - "zone_name": "Zone", - "time": 0.0, - } - ), - features=np.random.rand(*before.shape), - in_place=False, - ) - after = sample_.get_nodes(zone_name="Zone", base_name="Base_2_2", time=0.0) - assert np.any(~np.isclose(after, before)) - - before_1 = sample_with_tree_and_scalar.get_field("test_node_field_1") - before_2 = sample_with_tree_and_scalar.get_nodes() - sample_ = sample_with_tree_and_scalar.update_features_from_identifier( - feature_identifiers=[ - {"type": "field", "name": "test_node_field_1"}, - {"type": "nodes"}, - ], - features=[ - np.random.rand(*before_1.shape), - np.random.rand(*before_2.shape), - ], + def test_update_features_by_path(self, sample_with_tree_and_scalar): + sample_with_tree_and_scalar.update_features_by_path( + "Global/test_scalar_1", + features=3.141592, in_place=False, ) - after_1 = sample_.get_field("test_node_field_1") - after_2 = sample_.get_nodes() - assert np.any(~np.isclose(after_1, before_1)) - assert np.any(~np.isclose(after_2, before_2)) - - sample_ = sample_with_tree_and_scalar.update_features_from_identifier( - feature_identifiers=[{"type": "field", "name": "test_node_field_1"}], - features=[np.random.rand(*before_1.shape)], - in_place=True, - ) - ref_1 = sample_with_tree_and_scalar.get_field("test_node_field_1") - ref_2 = sample_.get_field("test_node_field_1") - assert np.any(np.isclose(ref_1, ref_2)) - - def test_extract_sample_from_identifier(self, sample_with_tree_and_scalar): - sample_: Sample = sample_with_tree_and_scalar.extract_sample_from_identifier( - feature_identifiers={"type": "scalar", "name": "test_scalar_1"}, - ) - assert sample_.get_scalar_names() == ["test_scalar_1"] - assert len(sample_.get_field_names()) == 0 - - sample_: Sample = 
sample_with_tree_and_scalar.extract_sample_from_identifier( - feature_identifiers={ - "type": "field", - "name": "test_node_field_1", - "base_name": "Base_2_2", - "zone_name": "Zone", - "location": "Vertex", - "time": 0.0, - }, - ) - show_cgns_tree(sample_with_tree_and_scalar.features.data[0]) - assert len(sample_.get_scalar_names()) == 0 - assert sample_.get_field_names() == ["test_node_field_1"] - - sample_: Sample = sample_with_tree_and_scalar.extract_sample_from_identifier( - feature_identifiers={ - "type": "nodes", - "base_name": "Base_2_2", - "zone_name": "Zone", - "time": 0.0, - }, - ) - assert len(sample_.get_scalar_names()) == 0 - assert len(sample_.get_field_names()) == 0 - - sample_: Sample = sample_with_tree_and_scalar.extract_sample_from_identifier( - feature_identifiers=[ - {"type": "field", "name": "test_node_field_1"}, - {"type": "nodes"}, - ], - ) - assert len(sample_.get_scalar_names()) == 0 - assert sample_.get_field_names() == ["test_node_field_1"] - - def test_get_all_features_identifiers(self, sample_with_tree_and_scalar): - feat_ids = sample_with_tree_and_scalar.get_all_features_identifiers() - assert len(feat_ids) == 8 - assert {"type": "scalar", "name": "r"} in feat_ids - assert {"type": "scalar", "name": "test_scalar_1"} in feat_ids - assert { - "type": "nodes", - "base_name": "Base_2_2", - "zone_name": "Zone", - "time": 0.0, - } in feat_ids - assert { - "type": "field", - "name": "big_node_field", - "base_name": "Base_2_2", - "zone_name": "Zone", - "location": "Vertex", - "time": 0.0, - } in feat_ids - assert { - "type": "field", - "name": "test_node_field_1", - "base_name": "Base_2_2", - "zone_name": "Zone", - "location": "Vertex", - "time": 0.0, - } in feat_ids - assert { - "type": "field", - "name": "OriginalIds", - "base_name": "Base_2_2", - "zone_name": "Zone", - "location": "Vertex", - "time": 0.0, - } in feat_ids - assert { - "type": "field", - "name": "test_elem_field_1", - "base_name": "Base_2_2", - "zone_name": "Zone", - "location": "CellCenter", - "time": 0.0, - } in feat_ids - assert { - "type": "field", - "name": "OriginalIds", - "base_name": "Base_2_2", - "zone_name": "Zone", - "location": "CellCenter", - "time": 0.0, - } in feat_ids def test_get_all_features_identifiers_by_type(self, sample_with_tree_and_scalar): feat_ids = sample_with_tree_and_scalar.get_all_features_identifiers_by_type( "scalar" ) assert len(feat_ids) == 2 - assert {"type": "scalar", "name": "r"} in feat_ids - assert {"type": "scalar", "name": "test_scalar_1"} in feat_ids + assert "r" in feat_ids + assert "test_scalar_1" in feat_ids feat_ids = sample_with_tree_and_scalar.get_all_features_identifiers_by_type( - "nodes" + "field" ) - assert { - "type": "nodes", - "base_name": "Base_2_2", - "zone_name": "Zone", - "time": 0.0, - } in feat_ids + assert len(feat_ids) == 4 feat_ids = sample_with_tree_and_scalar.get_all_features_identifiers_by_type( - "field" - ) - assert len(feat_ids) == 5 - assert { - "type": "field", - "name": "big_node_field", - "base_name": "Base_2_2", - "zone_name": "Zone", - "location": "Vertex", - "time": 0.0, - } in feat_ids - - def test_merge_features(self, sample_with_tree_and_scalar, sample_with_tree): - feat_id = sample_with_tree_and_scalar.get_all_features_identifiers() - feat_id = [fid for fid in feat_id if fid["type"] not in ["scalar"]] - sample_1 = sample_with_tree_and_scalar.extract_sample_from_identifier(feat_id) - feat_id = sample_with_tree.get_all_features_identifiers() - feat_id = [fid for fid in feat_id if fid["type"] not in ["field"]] - sample_2 = 
sample_with_tree.extract_sample_from_identifier(feat_id) - sample_merge_1 = sample_1.merge_features(sample_2, in_place=False) - sample_merge_2 = sample_2.merge_features(sample_1, in_place=False) - assert ( - sample_merge_1.get_all_features_identifiers() - == sample_merge_2.get_all_features_identifiers() - ) - sample_2.merge_features(sample_1, in_place=True) - sample_1.merge_features(sample_2, in_place=True) - - def test_merge_features2(self, sample_with_tree_and_scalar, sample_with_tree): - feat_id = sample_with_tree_and_scalar.get_all_features_identifiers() - feat_id = [fid for fid in feat_id if fid["type"] not in ["scalar"]] - sample_1 = sample_with_tree_and_scalar.extract_sample_from_identifier(feat_id) - feat_id = sample_with_tree.get_all_features_identifiers() - feat_id = [fid for fid in feat_id if fid["type"] not in ["field", "nodes"]] - sample_2 = sample_with_tree.extract_sample_from_identifier(feat_id) - sample_merge_1 = sample_1.merge_features(sample_2, in_place=False) - sample_merge_2 = sample_2.merge_features(sample_1, in_place=False) - assert ( - sample_merge_1.get_all_features_identifiers() - == sample_merge_2.get_all_features_identifiers() + "nodes" ) - sample_2.merge_features(sample_1, in_place=True) - sample_1.merge_features(sample_2, in_place=True) + assert len(feat_ids) == 2 # -------------------------------------------------------------------------# def test_save(self, sample_with_tree_and_scalar, tmp_path): save_dir = tmp_path / "test_dir" - sample_with_tree_and_scalar.save(save_dir) + sample_with_tree_and_scalar.save_to_dir(save_dir) assert save_dir.is_dir() with pytest.raises(ValueError): - sample_with_tree_and_scalar.save(save_dir, memory_safe=False) - sample_with_tree_and_scalar.save(save_dir, overwrite=True) - sample_with_tree_and_scalar.save(save_dir, overwrite=True, memory_safe=True) - - def test_load_from_saved_file(self, sample_with_tree_and_scalar, tmp_path): - save_dir = tmp_path / "test_dir" - sample_with_tree_and_scalar.save(save_dir) - new_sample = Sample() - new_sample.load(save_dir) - assert CGU.checkSameTree( - sample_with_tree_and_scalar.get_tree(), - new_sample.get_tree(), + sample_with_tree_and_scalar.save_to_dir(save_dir, memory_safe=False) + sample_with_tree_and_scalar.save_to_dir(save_dir, overwrite=True) + sample_with_tree_and_scalar.save_to_dir( + save_dir, overwrite=True, memory_safe=True ) def test_load_from_dir(self, sample_with_tree_and_scalar, tmp_path): save_dir = tmp_path / "test_dir" - sample_with_tree_and_scalar.save(save_dir) + sample_with_tree_and_scalar.save_to_dir(save_dir) new_sample = Sample.load_from_dir(save_dir) assert CGU.checkSameTree( sample_with_tree_and_scalar.get_tree(), @@ -1663,3 +1309,76 @@ def test_check_completeness_with_tree(self, sample_with_tree): def test_check_completeness_with_tree_and_scalar(self, sample_with_tree_and_scalar): print(sample_with_tree_and_scalar.check_completeness()) + + +# %% Tests for delegated methods proxy + + +class TestSampleFeaturesDelegation: + """Tests for the ``@delegate_methods("features", FEATURES_METHODS)`` proxy. + + These tests ensure that every method listed in ``FEATURES_METHODS`` is + actually exposed on ``Sample`` as a delegate to the corresponding method on + ``SampleFeatures``. Any method added to ``FEATURES_METHODS`` is picked up + automatically, so these tests also guard against future regressions (e.g. + forgetting to add a new method name, or renaming a method on + ``SampleFeatures`` without updating the proxy list). 
+ """ + + @pytest.mark.parametrize("method_name", FEATURES_METHODS) + def test_proxy_exposes_sample_features_method(self, method_name: str): + """Each delegated method exists on both ``Sample`` and ``SampleFeatures``.""" + assert hasattr(SampleFeatures, method_name), ( + f"'{method_name}' is listed in FEATURES_METHODS but not defined on " + f"SampleFeatures." + ) + assert hasattr(Sample, method_name), ( + f"'{method_name}' is listed in FEATURES_METHODS but not proxied on Sample." + ) + assert callable(getattr(Sample, method_name)), ( + f"Sample.{method_name} exists but is not callable." + ) + + @pytest.mark.parametrize("method_name", FEATURES_METHODS) + def test_proxy_forwards_call_to_sample_features( + self, sample: Sample, method_name: str + ): + """Calling ``sample.`` delegates to ``sample.features.``. + + We patch the method on the ``features`` instance and verify that the + proxy forwards the call with the same positional and keyword arguments, + and returns the value produced by the underlying method. + """ + sentinel = object() + calls: list[tuple[tuple, dict]] = [] + + def fake(*args, **kwargs): + calls.append((args, kwargs)) + return sentinel + + # Bind the fake on the instance so only this ``sample`` is affected. + setattr(sample.features, method_name, fake) + + result = getattr(sample, method_name)(1, 2, key="value") + + assert result is sentinel, ( + f"Sample.{method_name} did not return the value produced by " + f"SampleFeatures.{method_name}." + ) + assert calls == [((1, 2), {"key": "value"})], ( + f"Sample.{method_name} did not forward args/kwargs to " + f"SampleFeatures.{method_name} as-is." + ) + + def test_set_trees_proxy_end_to_end(self, sample: Sample, tree): + """``sample.set_trees`` actually stores trees on the underlying features. + + This is a functional sanity check for the specific method added in + this change set: calling the proxy without going through ``.features`` + must produce the same observable state as calling the underlying + method directly. + """ + sample.set_trees({0.0: tree}) + + assert sample.features.get_all_time_values() == [0.0] + assert sample.features.get_tree(time=0.0) is tree diff --git a/tests/containers/test_utils.py b/tests/containers/test_utils.py index 0683c747..914679e9 100644 --- a/tests/containers/test_utils.py +++ b/tests/containers/test_utils.py @@ -1,10 +1,3 @@ -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. 
-# -# - # %% Imports from pathlib import Path @@ -13,13 +6,11 @@ import pytest from plaid.containers.utils import ( - check_features_type_homogeneity, get_feature_details_from_path, get_number_of_samples, get_sample_ids, - has_duplicates_feature_ids, - validate_required_infos, ) +from plaid.utils.info import validate_required_infos # %% Fixtures @@ -34,121 +25,123 @@ def current_directory(): class Test_Container_Utils: def test_get_sample_ids(self, current_directory): - dataset_path = current_directory / "dataset" - assert get_sample_ids(dataset_path) == list(np.arange(0, 3)) + dataset_path = current_directory / "dataset" /"data"/"test" + assert get_sample_ids(dataset_path) == list(np.arange(0, 10)) def test_get_number_of_samples(self, current_directory): - dataset_path = current_directory / "dataset" - assert get_number_of_samples(dataset_path) == 3 + dataset_path = current_directory / "dataset" /"data"/"test" + assert get_number_of_samples(dataset_path) == 10 def test_get_sample_ids_with_str(self, current_directory): - dataset_path = current_directory / "dataset" - assert get_sample_ids(str(dataset_path)) == list(np.arange(0, 3)) + dataset_path = current_directory / "dataset" /"data"/"test" + assert get_sample_ids(str(dataset_path)) == list(np.arange(0, 10)) def test_get_number_of_samples_with_str(self, current_directory): - dataset_path = current_directory / "dataset" - assert get_number_of_samples(str(dataset_path)) == 3 - - def test_check_features_type_homogeneity(self): - check_features_type_homogeneity( - [{"type": "scalar", "name": "Mach"}, {"type": "scalar", "name": "P"}] - ) - - def test_check_features_type_homogeneity_fail_type(self): - with pytest.raises(AssertionError): - check_features_type_homogeneity(0) - - def test_check_features_type_homogeneity_fail(self): - with pytest.raises(AssertionError): - check_features_type_homogeneity( - [{"type": "scalar", "name": "Mach"}, {"type": "nodes"}] - ) - - def test_has_duplicates_feature_ids(self): - assert not has_duplicates_feature_ids( - [{"type": "scalar", "name": "Mach"}, {"type": "scalar", "name": "P"}] - ) - assert has_duplicates_feature_ids( - [{"type": "scalar", "name": "Mach"}, {"type": "scalar", "name": "Mach"}] - ) - - def test_get_feature_details_from_path(self): - details = get_feature_details_from_path("Base_2_2") - assert details["base"] == "Base_2_2" - - details = get_feature_details_from_path("Global/toto") - assert details["type"] == "global" - assert details["name"] == "toto" - - details = get_feature_details_from_path("Base_2_2/Zone") - assert details["base"] == "Base_2_2" - assert details["zone"] == "Zone" - - details = get_feature_details_from_path( - "Base_2_2/Zone/Elements_QUAD_4/ElementConnectivity" - ) - assert details["base"] == "Base_2_2" - assert details["zone"] == "Zone" - assert details["type"] == "elements" - assert details["sub_type"] == "connectivity" - assert details["element_type"] == "QUAD_4" - - details = get_feature_details_from_path( - "Base_2_2/Zone/Elements_QUAD_4/ElementRange" - ) - assert details["base"] == "Base_2_2" - assert details["zone"] == "Zone" - assert details["type"] == "elements" - assert details["sub_type"] == "range" - assert details["element_type"] == "QUAD_4" - - details = get_feature_details_from_path( - "Base_2_2/Zone/GridCoordinates/CoordinateX" - ) - assert details["base"] == "Base_2_2" - assert details["zone"] == "Zone" - assert details["type"] == "coordinate" - assert details["sub_type"] == "node" - assert details["name"] == "CoordinateX" - - details = 
get_feature_details_from_path("Base_2_2/Zone/VertexFields/materialID") - assert details["base"] == "Base_2_2" - assert details["zone"] == "Zone" - assert details["type"] == "field" - assert details["location"] == "Vertex" - assert details["name"] == "materialID" - - details = get_feature_details_from_path( - "Base_2_2/Zone/ZoneBC/BottomLeft/PointList" - ) - assert details["base"] == "Base_2_2" - assert details["zone"] == "Zone" - assert details["type"] == "boundary_condition" - assert details["sub_type"] == "PointList" - assert details["name"] == "BottomLeft" - - details = get_feature_details_from_path("Base_2_2/Time") - assert details["base"] == "Base_2_2" - assert details["zone"] == "Time" - assert details["type"] == "zone" - - details = get_feature_details_from_path("Base_2_2/Time/IterationValues") - assert details["base"] == "Base_2_2" - assert details["zone"] == "Time" - assert details["type"] == "other" - assert details["path"] == "Base_2_2/Time/IterationValues" - - details = get_feature_details_from_path("Base_2_2/Time/TimeValues") - assert details["base"] == "Base_2_2" - assert details["zone"] == "Time" - assert details["type"] == "other" - assert details["path"] == "Base_2_2/Time/TimeValues" - - with pytest.raises(AssertionError): - get_feature_details_from_path("Dummy") - - with pytest.raises(AssertionError): - get_feature_details_from_path("Dummy/Dummy/Dummy/Dummy/Dummy/Dummy/Dummy") + dataset_path = current_directory / "dataset" /"data"/"test" + assert get_number_of_samples(str(dataset_path)) == 10 + + @pytest.mark.parametrize( + ("url", "expected"), + [ + ( + "CGNSLibraryVersion", + {"type": "cgns", "sub_type": "library_version"}, + ), + ("Global", {"type": "global", "sub_type": "root"}), + ( + "Global/Time/IterationValues", + { + "type": "global", + "sub_type": "time", + "name": "IterationValues", + }, + ), + ( + "Global/Mach", + {"type": "global", "sub_type": "scalar", "name": "Mach"}, + ), + ( + "Global_times/Q", + {"type": "global", "sub_type": "scalar", "name": "Q"}, + ), + ("Base_2_2", {"base": "Base_2_2", "type": "base"}), + ( + "Base_2_2/Zone", + {"base": "Base_2_2", "zone": "Zone", "type": "zone"}, + ), + ( + "Base_2_2/Zone/GridCoordinates", + { + "base": "Base_2_2", + "zone": "Zone", + "type": "coordinate", + "sub_type": "node", + }, + ), + ( + "Base_2_2/Zone/GridCoordinates/CoordinateX", + { + "base": "Base_2_2", + "zone": "Zone", + "type": "coordinate", + "sub_type": "node", + "name": "CoordinateX", + }, + ), + ( + "Base_2_2/Zone/Elements_QUAD_4/ElementConnectivity", + { + "base": "Base_2_2", + "zone": "Zone", + "type": "elements", + "element_type": "QUAD_4", + "sub_type": "connectivity", + }, + ), + ( + "Base_2_2/Zone/Elements_QUAD_4/ElementRange", + { + "base": "Base_2_2", + "zone": "Zone", + "type": "elements", + "element_type": "QUAD_4", + "sub_type": "range", + }, + ), + ( + "Base_2_2/Zone/VertexFields/materialID", + { + "base": "Base_2_2", + "zone": "Zone", + "type": "field", + "location": "Vertex", + "name": "materialID", + }, + ), + ( + "Base_2_2/Zone/PointData/rov", + { + "base": "Base_2_2", + "zone": "Zone", + "type": "field", + "location": "Vertex", + "name": "rov", + }, + ), + ( + "Base_2_2/Zone/Time/IterationValues", + { + "base": "Base_2_2", + "zone": "Zone", + "type": "other", + "path": "Base_2_2/Zone/Time/IterationValues", + }, + ), + ], + ) + def test_get_feature_details_from_path(self, url, expected): + assert get_feature_details_from_path(url) == expected + def test_validate_required_infos(self): infos = { @@ -164,6 +157,3 @@ def 
test_validate_required_infos(self): with pytest.raises(ValueError): validate_required_infos(infos_missing_license) - infos_dummy = {"dummy": "toto"} - with pytest.raises(AssertionError): - validate_required_infos(infos_dummy) diff --git a/tests/pipelines/conftest.py b/tests/pipelines/conftest.py deleted file mode 100644 index fec95327..00000000 --- a/tests/pipelines/conftest.py +++ /dev/null @@ -1,145 +0,0 @@ -"""This file defines shared pytest fixtures and test configurations for pipelines.""" - -import pytest -from sklearn.decomposition import PCA -from sklearn.gaussian_process import GaussianProcessRegressor -from sklearn.gaussian_process.kernels import RBF -from sklearn.linear_model import LinearRegression -from sklearn.multioutput import MultiOutputRegressor -from sklearn.preprocessing import MinMaxScaler - -from plaid.pipelines.plaid_blocks import ( - ColumnTransformer, - TransformedTargetRegressor, -) -from plaid.pipelines.sklearn_block_wrappers import ( - WrappedSklearnRegressor, - WrappedSklearnTransformer, -) - - -@pytest.fixture() -def sklearn_scaler(): - return MinMaxScaler() - - -@pytest.fixture() -def sklearn_pca(): - return PCA(n_components=2) - - -@pytest.fixture() -def sklearn_linear_regressor(): - return LinearRegression() - - -@pytest.fixture() -def sklearn_multioutput_gp_regressor(): - gpr = GaussianProcessRegressor(kernel=RBF(), random_state=42) - return MultiOutputRegressor(gpr) - - -@pytest.fixture() -def dataset_with_samples_scalar1_feat_ids(dataset_with_samples): - return [dataset_with_samples.get_all_features_identifiers_by_type("scalar")[0]] - - -@pytest.fixture() -def dataset_with_samples_scalar2_feat_ids(dataset_with_samples): - return [dataset_with_samples.get_all_features_identifiers_by_type("scalar")[1]] - - -@pytest.fixture() -def dataset_with_samples_with_tree_field_feat_ids(dataset_with_samples_with_tree): - return dataset_with_samples_with_tree.get_all_features_identifiers_by_type("field") - - -@pytest.fixture() -def dataset_with_samples_with_tree_1field_feat_ids(dataset_with_samples_with_tree): - return [ - dataset_with_samples_with_tree.get_all_features_identifiers_by_type("field")[0] - ] - - -@pytest.fixture() -def dataset_with_samples_with_tree_nodes_feat_ids(dataset_with_samples_with_tree): - return dataset_with_samples_with_tree.get_all_features_identifiers_by_type("nodes") - - -# --------------------------------------------------------------------------------------- - - -@pytest.fixture() -def wrapped_sklearn_transformer(sklearn_scaler, dataset_with_samples_scalar1_feat_ids): - return WrappedSklearnTransformer( - sklearn_block=sklearn_scaler, - in_features_identifiers=dataset_with_samples_scalar1_feat_ids, - ) - - -@pytest.fixture() -def wrapped_sklearn_transformer_2(sklearn_pca): - return WrappedSklearnTransformer( - sklearn_block=sklearn_pca, - in_features_identifiers=[{"type": "field", "name": "test_field_same_size"}], - out_features_identifiers=[ - {"type": "scalar", "name": "pca_component_0"}, - {"type": "scalar", "name": "pca_component_1"}, - ], - ) - - -@pytest.fixture() -def wrapped_sklearn_multioutput_gp_regressor( - sklearn_multioutput_gp_regressor, - dataset_with_samples_scalar1_feat_ids, - dataset_with_samples_scalar2_feat_ids, -): - return WrappedSklearnRegressor( - sklearn_block=sklearn_multioutput_gp_regressor, - in_features_identifiers=dataset_with_samples_scalar2_feat_ids, - out_features_identifiers=dataset_with_samples_scalar1_feat_ids, - ) - - -@pytest.fixture() -def wrapped_sklearn_blocks( - wrapped_sklearn_transformer, 
wrapped_sklearn_multioutput_gp_regressor -): - return [wrapped_sklearn_transformer, wrapped_sklearn_multioutput_gp_regressor] - - -# --------------------------------------------------------------------------------------- - - -@pytest.fixture() -def plaid_column_transformer( - wrapped_sklearn_transformer, wrapped_sklearn_transformer_2 -): - return ColumnTransformer( - plaid_transformers=[ - ("scaler_scalar", wrapped_sklearn_transformer), - ("pca_field", wrapped_sklearn_transformer_2), - ], - ) - - -@pytest.fixture() -def plaid_transformed_target_regressor( - wrapped_sklearn_multioutput_gp_regressor, wrapped_sklearn_transformer -): - return TransformedTargetRegressor( - regressor=wrapped_sklearn_multioutput_gp_regressor, - transformer=wrapped_sklearn_transformer, - ) - - -@pytest.fixture() -def plaid_blocks(plaid_column_transformer, plaid_transformed_target_regressor): - return [plaid_column_transformer, plaid_transformed_target_regressor] - - -# --------------------------------------------------------------------------------------- -@pytest.fixture() -def all_blocks(wrapped_sklearn_blocks, plaid_blocks): - return wrapped_sklearn_blocks + plaid_blocks diff --git a/tests/pipelines/test_blocks.py b/tests/pipelines/test_blocks.py deleted file mode 100644 index 36909170..00000000 --- a/tests/pipelines/test_blocks.py +++ /dev/null @@ -1,25 +0,0 @@ -import os - -import joblib -import pytest -from sklearn.base import clone -from sklearn.exceptions import NotFittedError -from sklearn.utils.validation import check_is_fitted - - -class Test_Blocks: - def test_clone(self, all_blocks): - for block in all_blocks: - clone(block) - - def test_save_load(self, all_blocks, tmp_path): - for block in all_blocks: - joblib.dump(block, os.path.join(tmp_path, "block_state.pkl")) - loaded_block = joblib.load(os.path.join(tmp_path, "block_state.pkl")) - with pytest.raises(NotFittedError): - check_is_fitted(loaded_block) - - def test_get_set_params(self, all_blocks): - for block in all_blocks: - param_name = next(iter(block.get_params())) - block.set_params(**{param_name: 0.0}) diff --git a/tests/pipelines/test_plaid_blocks.py b/tests/pipelines/test_plaid_blocks.py deleted file mode 100644 index 9e27ed0e..00000000 --- a/tests/pipelines/test_plaid_blocks.py +++ /dev/null @@ -1,104 +0,0 @@ -import numpy as np - -from plaid.pipelines.plaid_blocks import ( - ColumnTransformer, - TransformedTargetRegressor, -) -from plaid.pipelines.sklearn_block_wrappers import ( - get_2Darray_from_homogeneous_identifiers, -) - - -class Test_ColumnTransformer: - def test___init__(self, wrapped_sklearn_transformer, wrapped_sklearn_transformer_2): - ColumnTransformer([("titi", wrapped_sklearn_transformer)]) - ColumnTransformer( - [("toto", wrapped_sklearn_transformer)], - ) - ColumnTransformer( - plaid_transformers=[ - ("scaler_scalar", wrapped_sklearn_transformer), - ("pca_field", wrapped_sklearn_transformer_2), - ], - ) - - def test_fit_transform(self, plaid_column_transformer, dataset_with_samples): - transformed_dataset = plaid_column_transformer.fit_transform( - dataset_with_samples - ) - assert id(dataset_with_samples) != id(transformed_dataset) - - in_feat_id_0 = plaid_column_transformer.transformers_[0][ - 1 - ].in_features_identifiers_ - in_feat_id_1 = plaid_column_transformer.transformers_[1][ - 1 - ].in_features_identifiers_ - out_feat_id_0 = plaid_column_transformer.transformers_[0][ - 1 - ].out_features_identifiers_ - out_feat_id_1 = plaid_column_transformer.transformers_[1][ - 1 - ].out_features_identifiers_ - in_features_0 = 
get_2Darray_from_homogeneous_identifiers( - dataset_with_samples, in_feat_id_0 - ) - in_features_1 = get_2Darray_from_homogeneous_identifiers( - dataset_with_samples, in_feat_id_1 - ) - out_features_0 = get_2Darray_from_homogeneous_identifiers( - transformed_dataset, out_feat_id_0 - ) - out_features_1 = get_2Darray_from_homogeneous_identifiers( - transformed_dataset, out_feat_id_1 - ) - assert not np.allclose(in_features_0, out_features_0) - assert in_features_1.shape != out_features_1.shape - - transformed_dataset = plaid_column_transformer.fit_transform( - [s for s in dataset_with_samples] - ) - - def test_inverse_transform(self, plaid_column_transformer, dataset_with_samples): - transformed_dataset = plaid_column_transformer.fit_transform( - dataset_with_samples - ) - plaid_column_transformer.inverse_transform(transformed_dataset) - plaid_column_transformer.inverse_transform([s for s in transformed_dataset]) - - -class Test_TransformedTargetRegressor: - def test___init__( - self, wrapped_sklearn_multioutput_gp_regressor, wrapped_sklearn_transformer - ): - TransformedTargetRegressor( - regressor=wrapped_sklearn_multioutput_gp_regressor, - transformer=wrapped_sklearn_transformer, - ) - - def test_fit(self, plaid_transformed_target_regressor, dataset_with_samples): - plaid_transformed_target_regressor.fit(dataset_with_samples) - plaid_transformed_target_regressor.fit([s for s in dataset_with_samples]) - - def test_predict(self, plaid_transformed_target_regressor, dataset_with_samples): - plaid_transformed_target_regressor.fit(dataset_with_samples) - pred_dataset = plaid_transformed_target_regressor.predict(dataset_with_samples) - assert id(dataset_with_samples) != id(pred_dataset) - - out_feat_ids = plaid_transformed_target_regressor.out_features_identifiers_ - y_ref = get_2Darray_from_homogeneous_identifiers( - dataset_with_samples, out_feat_ids - ) - y_pred = get_2Darray_from_homogeneous_identifiers(pred_dataset, out_feat_ids) - - assert np.allclose(y_pred, y_ref) - plaid_transformed_target_regressor.predict([s for s in dataset_with_samples]) - - def test_score(self, plaid_transformed_target_regressor, dataset_with_samples): - plaid_transformed_target_regressor.fit(dataset_with_samples) - plaid_transformed_target_regressor.score([s for s in dataset_with_samples]) - plaid_transformed_target_regressor.score( - dataset_with_samples, [s for s in dataset_with_samples] - ) - score = plaid_transformed_target_regressor.score(dataset_with_samples) - assert np.isclose(score, 1.0) diff --git a/tests/pipelines/test_sklearn_block_wrappers.py b/tests/pipelines/test_sklearn_block_wrappers.py deleted file mode 100644 index f158f1f4..00000000 --- a/tests/pipelines/test_sklearn_block_wrappers.py +++ /dev/null @@ -1,149 +0,0 @@ -import numpy as np -import pytest - -from plaid.pipelines.sklearn_block_wrappers import ( - WrappedSklearnTransformer, - get_2Darray_from_homogeneous_identifiers, -) - - -def test_get_2Darray_from_homogeneous_identifiers( - dataset_with_samples, - dataset_with_samples_scalar1_feat_ids, - dataset_with_samples_scalar2_feat_ids, -): - # dataset_with_samples.get_all_features_identifiers() - X = get_2Darray_from_homogeneous_identifiers( - dataset_with_samples, dataset_with_samples_scalar1_feat_ids - ) - assert X.shape == (4, 1) - - feat_ids = ( - dataset_with_samples_scalar1_feat_ids + dataset_with_samples_scalar2_feat_ids - ) - X = get_2Darray_from_homogeneous_identifiers(dataset_with_samples, feat_ids) - assert X.shape == (4, 2) - - field_same_size_feat_id = { - "type": "field", - 
"name": "test_field_same_size", - "base_name": "Base_Name", - "zone_name": "Zone_Name", - "location": "Vertex", - "time": 0.0, - } - feat_ids = [field_same_size_feat_id] - X = get_2Darray_from_homogeneous_identifiers(dataset_with_samples, feat_ids) - assert X.shape == (4, 17) - - with pytest.raises(ValueError): - feat_ids = [field_same_size_feat_id, field_same_size_feat_id] - X = get_2Darray_from_homogeneous_identifiers(dataset_with_samples, feat_ids) - - with pytest.raises(AssertionError): - feat_ids = [ - { - "type": "field", - "name": "test_field_2785", - "base_name": "Base_Name", - "zone_name": "Zone_Name", - "location": "Vertex", - "time": 0.0, - }, - field_same_size_feat_id, - ] - X = get_2Darray_from_homogeneous_identifiers(dataset_with_samples, feat_ids) - - -def test_get_2Darray_from_homogeneous_identifiers_nodes( - dataset_with_samples_with_tree, dataset_with_samples_with_tree_nodes_feat_ids -): - X = get_2Darray_from_homogeneous_identifiers( - dataset_with_samples_with_tree, dataset_with_samples_with_tree_nodes_feat_ids - ) - assert X.shape == (4, 10) - - -class Test_WrappedSklearnTransformer: - def test___init__( - self, - sklearn_scaler, - dataset_with_samples_scalar1_feat_ids, - dataset_with_samples_scalar2_feat_ids, - ): - WrappedSklearnTransformer( - sklearn_block=sklearn_scaler, - in_features_identifiers=dataset_with_samples_scalar1_feat_ids, - ) - WrappedSklearnTransformer( - sklearn_block=sklearn_scaler, - in_features_identifiers=dataset_with_samples_scalar1_feat_ids, - out_features_identifiers=dataset_with_samples_scalar2_feat_ids, - ) - - def test_fit( - self, - wrapped_sklearn_transformer, - dataset_with_samples, - dataset_with_samples_scalar2_feat_ids, - ): - wrapped_sklearn_transformer.fit(dataset_with_samples) - wrapped_sklearn_transformer.out_features_identifiers = ( - dataset_with_samples_scalar2_feat_ids - ) - wrapped_sklearn_transformer.fit(dataset_with_samples) - - def test_transform(self, wrapped_sklearn_transformer, dataset_with_samples): - transformed_dataset = wrapped_sklearn_transformer.fit_transform( - dataset_with_samples - ) - assert id(dataset_with_samples) != id(transformed_dataset) - - in_features = get_2Darray_from_homogeneous_identifiers( - dataset_with_samples, wrapped_sklearn_transformer.in_features_identifiers_ - ) - tranformed_out_features = get_2Darray_from_homogeneous_identifiers( - transformed_dataset, wrapped_sklearn_transformer.out_features_identifiers_ - ) - assert not np.allclose(in_features, tranformed_out_features) - - def test_inverse_transform(self, wrapped_sklearn_transformer, dataset_with_samples): - wrapped_sklearn_transformer.fit(dataset_with_samples) - transformed_dataset = wrapped_sklearn_transformer.inverse_transform( - dataset_with_samples - ) - assert id(dataset_with_samples) != id(transformed_dataset) - - in_features = get_2Darray_from_homogeneous_identifiers( - dataset_with_samples, wrapped_sklearn_transformer.in_features_identifiers - ) - tranformed_in_features = get_2Darray_from_homogeneous_identifiers( - transformed_dataset, wrapped_sklearn_transformer.in_features_identifiers - ) - assert not np.allclose(in_features, tranformed_in_features) - - -class Test_WrappedSklearnRegressor: - def test___init__(self, wrapped_sklearn_multioutput_gp_regressor): - # __init__ is called in the input fixture - pass - - def test_fit(self, wrapped_sklearn_multioutput_gp_regressor, dataset_with_samples): - wrapped_sklearn_multioutput_gp_regressor.fit(dataset_with_samples) - - def test_predict( - self, 
wrapped_sklearn_multioutput_gp_regressor, dataset_with_samples - ): - out_feat_ids = wrapped_sklearn_multioutput_gp_regressor.out_features_identifiers - y_ref = get_2Darray_from_homogeneous_identifiers( - dataset_with_samples, out_feat_ids - ) - - wrapped_sklearn_multioutput_gp_regressor.fit(dataset_with_samples) - pred_dataset = wrapped_sklearn_multioutput_gp_regressor.predict( - dataset_with_samples - ) - - assert id(dataset_with_samples) != id(pred_dataset) - y_pred = get_2Darray_from_homogeneous_identifiers(pred_dataset, out_feat_ids) - assert np.allclose(y_pred, y_ref) diff --git a/tests/post/create_datasets.py b/tests/post/create_datasets.py deleted file mode 100644 index b9d23247..00000000 --- a/tests/post/create_datasets.py +++ /dev/null @@ -1,76 +0,0 @@ -import numpy as np - -from plaid import Dataset, ProblemDefinition, Sample -from plaid.types import FeatureIdentifier -from plaid.utils.split import split_dataset - -ins = [] -outs = [] -for i in range(30): - ins.append(np.random.rand()) - outs.append(np.random.rand()) - -samples = [] -for i in range(30): - sample = Sample() - sample.add_scalar("feature_1", ins[i]) - sample.add_scalar("feature_2", outs[i]) - samples.append(sample) - -dataset = Dataset(samples=samples) -dataset._save_to_dir_(path="dataset_ref", verbose=True) - - -samples = [] -for i in range(30): - sample = Sample() - sample.add_scalar("feature_1", 1.00001 * ins[i]) - sample.add_scalar("feature_2", 1.00001 * outs[i]) - samples.append(sample) - -dataset = Dataset(samples=samples) -dataset._save_to_dir_(path="dataset_near_pred", verbose=True) - - -samples = [] -for i in range(30): - sample = Sample() - sample.add_scalar("feature_1", 0.5 * ins[i]) - sample.add_scalar("feature_2", 0.5 * outs[i]) - samples.append(sample) - -dataset = Dataset(samples=samples) -dataset._save_to_dir_(path="dataset_pred", verbose=True) - - -print("dataset =", dataset) -print(dataset[0].get_scalar("feature_1")) - - -pb_def = ProblemDefinition() - -scalar_1_feat_id = FeatureIdentifier({"type": "scalar", "name": "feature_1"}) -scalar_2_feat_id = FeatureIdentifier({"type": "scalar", "name": "feature_2"}) - -pb_def.add_in_feature_identifier(scalar_1_feat_id) -pb_def.add_out_feature_identifier(scalar_2_feat_id) - -pb_def.add_input_scalar_name("feature_1") -pb_def.add_output_scalar_name("feature_2") - -pb_def.set_task("regression") - -options = { - "shuffle": False, - "split_sizes": { - "train": 20, - "test": 10, - }, -} - -split = split_dataset(dataset, options) -print(f"{split = }") - -pb_def.set_split(split) - -pb_def._save_to_dir_("problem_definition") diff --git a/tests/post/problem_definition/problem_infos.yaml b/tests/post/problem_definition/problem_infos.yaml deleted file mode 100644 index 470b12fb..00000000 --- a/tests/post/problem_definition/problem_infos.yaml +++ /dev/null @@ -1,17 +0,0 @@ -task: regression -input_features: -- type: scalar - name: feature_1 -output_features: -- type: scalar - name: feature_2 -input_scalars: -- feature_1 -output_scalars: -- feature_2 -input_fields: [] -output_fields: [] -input_timeseries: [] -output_timeseries: [] -input_meshes: [] -output_meshes: [] diff --git a/tests/post/problem_definition/split.json b/tests/post/problem_definition/split.json deleted file mode 100644 index ac64f861..00000000 --- a/tests/post/problem_definition/split.json +++ /dev/null @@ -1 +0,0 @@ -{"train": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19], "test": [20, 21, 22, 23, 24, 25, 26, 27, 28, 29]} \ No newline at end of file diff --git 
a/tests/post/test_bisect.py b/tests/post/test_bisect.py deleted file mode 100644 index 764b0a8b..00000000 --- a/tests/post/test_bisect.py +++ /dev/null @@ -1,89 +0,0 @@ -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - -import shutil -from pathlib import Path - -import pytest - -from plaid.containers.dataset import Dataset -from plaid.post.bisect import plot_bisect -from plaid.problem_definition import ProblemDefinition - - -@pytest.fixture() -def current_directory() -> Path: - return Path(__file__).absolute().parent - - -@pytest.fixture() -def working_directory() -> Path: - return Path.cwd() - - -class Test_Bisect: - def test_bisect_with_paths(self, current_directory, working_directory): - ref_path = current_directory / "dataset_ref" - pred_path = current_directory / "dataset_pred" - problem_path = current_directory / "problem_definition" - plot_bisect( - ref_path, pred_path, problem_path, "feature_2", "differ_bisect_plot" - ) - shutil.move( - working_directory / "differ_bisect_plot.png", - current_directory / "differ_bisect_plot.png", - ) - - def test_bisect_with_objects(self, current_directory, working_directory): - ref_path = Dataset(current_directory / "dataset_pred") - pred_path = Dataset(current_directory / "dataset_pred") - problem_path = ProblemDefinition(current_directory / "problem_definition") - plot_bisect(ref_path, pred_path, problem_path, "feature_2", "equal_bisect_plot") - shutil.move( - working_directory / "equal_bisect_plot.png", - current_directory / "equal_bisect_plot.png", - ) - - def test_bisect_with_mix(self, current_directory, working_directory): - scalar_index = 0 - ref_path = current_directory / "dataset_ref" - pred_path = current_directory / "dataset_near_pred" - problem_path = ProblemDefinition(current_directory / "problem_definition") - plot_bisect( - ref_path, - pred_path, - problem_path, - scalar_index, - "converge_bisect_plot", - verbose=True, - ) - shutil.move( - working_directory / "converge_bisect_plot.png", - current_directory / "converge_bisect_plot.png", - ) - - def test_bisect_error(self, current_directory): - ref_path = current_directory / "dataset_ref" - pred_path = current_directory / "dataset_near_pred" - problem_path = ProblemDefinition(current_directory / "problem_definition") - with pytest.raises(KeyError): - plot_bisect( - ref_path, - pred_path, - problem_path, - "unknown_scalar", - "converge_bisect_plot", - verbose=True, - ) - - def test_generated_files(self, current_directory): - path_1 = current_directory / "differ_bisect_plot.png" - path_2 = current_directory / "equal_bisect_plot.png" - path_3 = current_directory / "converge_bisect_plot.png" - assert path_1.is_file() - assert path_2.is_file() - assert path_3.is_file() diff --git a/tests/post/test_metrics.py b/tests/post/test_metrics.py deleted file mode 100644 index 9deb1c39..00000000 --- a/tests/post/test_metrics.py +++ /dev/null @@ -1,91 +0,0 @@ -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. 
-# -# - -import shutil -from pathlib import Path - -import pytest -import yaml - -from plaid.containers.dataset import Dataset -from plaid.post.metrics import compute_metrics -from plaid.problem_definition import ProblemDefinition - - -@pytest.fixture() -def current_directory() -> Path: - return Path(__file__).absolute().parent - - -@pytest.fixture() -def working_directory() -> Path: - return Path.cwd() - - -class Test_Metrics: - def test_compute_metrics_with_paths(self, current_directory, working_directory): - ref_ds = current_directory / "dataset_ref" - pred_ds = current_directory / "dataset_near_pred" - problem = current_directory / "problem_definition" - compute_metrics(ref_ds, pred_ds, problem, "first_metrics") - shutil.move( - working_directory / "first_metrics.yaml", - current_directory / "first_metrics.yaml", - ) - - def test_compute_metrics_with_objects(self, current_directory, working_directory): - ref_ds = Dataset(current_directory / "dataset_ref") - pred_ds = Dataset(current_directory / "dataset_pred") - problem = ProblemDefinition(current_directory / "problem_definition") - compute_metrics(ref_ds, pred_ds, problem, "second_metrics", verbose=True) - shutil.move( - working_directory / "second_metrics.yaml", - current_directory / "second_metrics.yaml", - ) - - def test_compute_metrics_mix(self, current_directory, working_directory): - ref_ds = Dataset(current_directory / "dataset_ref") - pred_ds = Dataset(current_directory / "dataset_ref") - problem = ProblemDefinition(current_directory / "problem_definition") - compute_metrics(ref_ds, pred_ds, problem, "third_metrics", verbose=True) - shutil.move( - working_directory / "third_metrics.yaml", - current_directory / "third_metrics.yaml", - ) - - def test_compute_RMSE_data(self, current_directory): - path = current_directory / "first_metrics.yaml" - with path.open("r") as file: - contenu_yaml = yaml.load(file, Loader=yaml.FullLoader) - assert contenu_yaml["rRMSE for scalars"]["train"]["feature_2"] < 0.2 - assert contenu_yaml["rRMSE for scalars"]["test"]["feature_2"] < 0.2 - assert contenu_yaml["RMSE for scalars"]["train"]["feature_2"] < 0.2 - assert contenu_yaml["RMSE for scalars"]["test"]["feature_2"] < 0.2 - assert contenu_yaml["R2 for scalars"]["train"]["feature_2"] > 0.8 - assert contenu_yaml["R2 for scalars"]["test"]["feature_2"] > 0.8 - - def test_compute_rRMSE_data(self, current_directory): - path = current_directory / "second_metrics.yaml" - with path.open("r") as file: - contenu_yaml = yaml.load(file, Loader=yaml.FullLoader) - assert contenu_yaml["rRMSE for scalars"]["train"]["feature_2"] > 0.25 - assert contenu_yaml["rRMSE for scalars"]["test"]["feature_2"] > 0.25 - assert contenu_yaml["RMSE for scalars"]["train"]["feature_2"] > 0.25 - assert contenu_yaml["RMSE for scalars"]["test"]["feature_2"] > 0.25 - assert contenu_yaml["R2 for scalars"]["train"]["feature_2"] < 0.0 - assert contenu_yaml["R2 for scalars"]["test"]["feature_2"] < 0.0 - - def test_compute_R2_data(self, current_directory): - path = current_directory / "third_metrics.yaml" - with path.open("r") as file: - contenu_yaml = yaml.load(file, Loader=yaml.FullLoader) - assert contenu_yaml["rRMSE for scalars"]["train"]["feature_2"] == 0.0 - assert contenu_yaml["rRMSE for scalars"]["test"]["feature_2"] == 0.0 - assert contenu_yaml["RMSE for scalars"]["train"]["feature_2"] == 0.0 - assert contenu_yaml["RMSE for scalars"]["test"]["feature_2"] == 0.0 - assert contenu_yaml["R2 for scalars"]["train"]["feature_2"] == 1.0 - assert contenu_yaml["R2 for 
scalars"]["test"]["feature_2"] == 1.0 diff --git a/tests/storage/test_cgns_init.py b/tests/storage/test_cgns_init.py new file mode 100644 index 00000000..7610d4b2 --- /dev/null +++ b/tests/storage/test_cgns_init.py @@ -0,0 +1,194 @@ +from pathlib import Path + +import pytest + +import plaid.storage.cgns as cgns +from plaid.storage.cgns import CgnsBackend + + +def test_public_exports_and_backend_name(): + expected = { + "configure_dataset_card", + "download_datasetdict_from_hub", + "generate_datasetdict_to_disk", + "init_datasetdict_from_disk", + "init_datasetdict_streaming_from_hub", + "push_local_datasetdict_to_hub", + } + assert set(cgns.__all__) == expected + assert CgnsBackend.name == "cgns" + + +def test_cgns_backend_init_from_disk_delegates(monkeypatch): + call = {} + + def fake_init_datasetdict_from_disk(path): + call["path"] = path + return {"train": "dataset"} + + monkeypatch.setattr(cgns, "init_datasetdict_from_disk", fake_init_datasetdict_from_disk) + + local_path = Path("/tmp/my_dataset") + result = CgnsBackend.init_from_disk(local_path) + + assert result == {"train": "dataset"} + assert call == {"path": local_path} + + +def test_cgns_backend_download_from_hub_delegates(monkeypatch): + call = {} + + def fake_download_datasetdict_from_hub( + repo_id, local_dir, split_ids=None, features=None, overwrite=False + ): + call["repo_id"] = repo_id + call["local_dir"] = local_dir + call["split_ids"] = split_ids + call["features"] = features + call["overwrite"] = overwrite + return "downloaded_path" + + monkeypatch.setattr(cgns, "download_datasetdict_from_hub", fake_download_datasetdict_from_hub) + + backend = CgnsBackend() + result = backend.download_from_hub("dummy/repo", "/tmp/local") + + assert result == "downloaded_path" + assert call == { + "repo_id": "dummy/repo", + "local_dir": "/tmp/local", + "split_ids": None, + "features": None, + "overwrite": False, + } + + +def test_cgns_backend_streaming_from_hub_delegates(monkeypatch): + call = {} + + def fake_init_datasetdict_streaming_from_hub(repo_id, split_ids=None, features=None): + call["repo_id"] = repo_id + call["split_ids"] = split_ids + call["features"] = features + return {"train": "stream"} + + monkeypatch.setattr( + cgns, + "init_datasetdict_streaming_from_hub", + fake_init_datasetdict_streaming_from_hub, + ) + + result = CgnsBackend.init_datasetdict_streaming_from_hub( + "PhysArena/Rotor37", split_ids=["train"], features={"a": "b"} + ) + + assert result == {"train": "stream"} + assert call == { + "repo_id": "PhysArena/Rotor37", + "split_ids": ["train"], + "features": {"a": "b"}, + } + + +def test_cgns_backend_generate_to_disk_delegates(monkeypatch): + call = {} + + def fake_generate_datasetdict_to_disk(**kwargs): + call.update(kwargs) + return "ok" + + monkeypatch.setattr(cgns, "generate_datasetdict_to_disk", fake_generate_datasetdict_to_disk) + + generators = {"train": lambda: iter(())} + variable_schema = {"Global/temperature": {"dtype": "float32", "ndim": 1}} + gen_kwargs = {"train": {"shards_ids": [[0, 1]]}} + + result = CgnsBackend.generate_to_disk( + output_folder="/tmp/output", + generators=generators, + variable_schema=variable_schema, + gen_kwargs=gen_kwargs, + num_proc=2, + verbose=True, + ) + + assert result == "ok" + assert call == { + "output_folder": "/tmp/output", + "generators": generators, + "variable_schema": variable_schema, + "gen_kwargs": gen_kwargs, + "num_proc": 2, + "verbose": True, + } + + +def test_cgns_backend_push_local_to_hub_delegates(monkeypatch): + call = {} + + def 
fake_push_local_datasetdict_to_hub(repo_id, local_dir, num_workers=1): + call["repo_id"] = repo_id + call["local_dir"] = local_dir + call["num_workers"] = num_workers + return "pushed" + + monkeypatch.setattr(cgns, "push_local_datasetdict_to_hub", fake_push_local_datasetdict_to_hub) + + result = CgnsBackend.push_local_to_hub("dummy/repo", "/tmp/local") + + assert result == "pushed" + assert call == { + "repo_id": "dummy/repo", + "local_dir": "/tmp/local", + "num_workers": 1, + } + + +def test_cgns_backend_configure_dataset_card_requires_local_dir(): + with pytest.raises(ValueError, match="local_dir must be provided for cgns backend"): + CgnsBackend.configure_dataset_card(repo_id="dummy/repo", infos={}) + + +def test_cgns_backend_configure_dataset_card_delegates(monkeypatch): + call = {} + + def fake_configure_dataset_card(**kwargs): + call.update(kwargs) + return "configured" + + monkeypatch.setattr(cgns, "configure_dataset_card", fake_configure_dataset_card) + + result = CgnsBackend.configure_dataset_card( + repo_id="dummy/repo", + infos={"source": "synthetic"}, + local_dir="/tmp/local", + viewer=True, + pretty_name="My Dataset", + dataset_long_description="Long description", + illustration_urls=["https://example.com/img.png"], + arxiv_paper_urls=["https://arxiv.org/abs/1234.5678"], + ) + + assert result == "configured" + assert call == { + "repo_id": "dummy/repo", + "infos": {"source": "synthetic"}, + "local_dir": "/tmp/local", + "viewer": True, + "pretty_name": "My Dataset", + "dataset_long_description": "Long description", + "illustration_urls": ["https://example.com/img.png"], + "arxiv_paper_urls": ["https://arxiv.org/abs/1234.5678"], + } + + +def test_cgns_backend_to_var_sample_dict_raises_value_error(): + with pytest.raises(ValueError, match="to_dict not available for 'cgns' backend"): + CgnsBackend.to_var_sample_dict(dataset=None, idx=0, features=[]) + + +def test_cgns_backend_sample_to_var_sample_dict_raises_value_error(): + with pytest.raises( + ValueError, match="sample_to_var_sample_dict not available for 'cgns' backend" + ): + CgnsBackend.sample_to_var_sample_dict(sample={}) \ No newline at end of file diff --git a/tests/storage/test_hf_datasets_init.py b/tests/storage/test_hf_datasets_init.py new file mode 100644 index 00000000..f4e5148c --- /dev/null +++ b/tests/storage/test_hf_datasets_init.py @@ -0,0 +1,246 @@ +from pathlib import Path + +import plaid.storage.hf_datasets as hf_datasets +from plaid.storage.hf_datasets import HFBackend + + +def test_public_exports_and_backend_name(): + expected = { + "configure_dataset_card", + "download_datasetdict_from_hub", + "generate_datasetdict_to_disk", + "init_datasetdict_from_disk", + "init_datasetdict_streaming_from_hub", + "push_local_datasetdict_to_hub", + "sample_to_var_sample_dict", + "to_var_sample_dict", + } + assert set(hf_datasets.__all__) == expected + assert HFBackend.name == "hf_datasets" + + +def test_hf_backend_init_from_disk_delegates(monkeypatch): + call = {} + + def fake_init_datasetdict_from_disk(path): + call["path"] = path + return {"train": "dataset"} + + monkeypatch.setattr(hf_datasets, "init_datasetdict_from_disk", fake_init_datasetdict_from_disk) + + local_path = Path("/tmp/my_dataset") + result = HFBackend.init_from_disk(local_path) + + assert result == {"train": "dataset"} + assert call == {"path": local_path} + + +def test_hf_backend_download_from_hub_delegates(monkeypatch): + call = {} + + def fake_download_datasetdict_from_hub(repo_id, local_dir, split_ids, features, overwrite): + call["repo_id"] = 
repo_id + call["local_dir"] = local_dir + call["split_ids"] = split_ids + call["features"] = features + call["overwrite"] = overwrite + return "downloaded_path" + + monkeypatch.setattr(hf_datasets, "download_datasetdict_from_hub", fake_download_datasetdict_from_hub) + + result = HFBackend.download_from_hub( + repo_id="dummy/repo", + local_dir="/tmp/local", + split_ids={"train": 0}, + features=["path/to/feature"], + overwrite=True, + ) + + assert result == "downloaded_path" + assert call == { + "repo_id": "dummy/repo", + "local_dir": "/tmp/local", + "split_ids": {"train": 0}, + "features": ["path/to/feature"], + "overwrite": True, + } + + +def test_hf_backend_streaming_from_hub_delegates(monkeypatch): + call = {} + + def fake_init_datasetdict_streaming_from_hub(repo_id, split_ids, features): + call["repo_id"] = repo_id + call["split_ids"] = split_ids + call["features"] = features + return {"train": "streaming_dataset"} + + monkeypatch.setattr( + hf_datasets, + "init_datasetdict_streaming_from_hub", + fake_init_datasetdict_streaming_from_hub, + ) + + result = HFBackend.init_datasetdict_streaming_from_hub( + repo_id="PhysArena/Rotor37", + split_ids={"train": [0, 1]}, + features=["Base/Zone/Field"], + ) + + assert result == {"train": "streaming_dataset"} + assert call == { + "repo_id": "PhysArena/Rotor37", + "split_ids": {"train": [0, 1]}, + "features": ["Base/Zone/Field"], + } + + +def test_hf_backend_streaming_from_hub_default_args(monkeypatch): + call = {} + + def fake_init_datasetdict_streaming_from_hub(repo_id, split_ids=None, features=None): + call["repo_id"] = repo_id + call["split_ids"] = split_ids + call["features"] = features + return {"train": "streaming_dataset"} + + monkeypatch.setattr( + hf_datasets, + "init_datasetdict_streaming_from_hub", + fake_init_datasetdict_streaming_from_hub, + ) + + result = HFBackend.init_datasetdict_streaming_from_hub("PhysArena/Rotor37") + + assert result == {"train": "streaming_dataset"} + assert call == { + "repo_id": "PhysArena/Rotor37", + "split_ids": None, + "features": None, + } + + +def test_hf_backend_generate_to_disk_delegates(monkeypatch): + call = {} + + def fake_generate_datasetdict_to_disk(**kwargs): + call.update(kwargs) + return "ok" + + monkeypatch.setattr(hf_datasets, "generate_datasetdict_to_disk", fake_generate_datasetdict_to_disk) + + generators = {"train": lambda: iter(())} + variable_schema = {"Global/temperature": {"dtype": "float32", "ndim": 1}} + gen_kwargs = {"train": {"shards_ids": [[0, 1]]}} + + result = HFBackend.generate_to_disk( + output_folder="/tmp/output", + generators=generators, + variable_schema=variable_schema, + gen_kwargs=gen_kwargs, + num_proc=2, + verbose=True, + ) + + assert result == "ok" + assert call == { + "output_folder": "/tmp/output", + "generators": generators, + "variable_schema": variable_schema, + "gen_kwargs": gen_kwargs, + "num_proc": 2, + "verbose": True, + } + + +def test_hf_backend_push_local_to_hub_delegates(monkeypatch): + call = {} + + def fake_push_local_datasetdict_to_hub(repo_id, local_dir, num_workers=1): + call["repo_id"] = repo_id + call["local_dir"] = local_dir + call["num_workers"] = num_workers + return "pushed" + + monkeypatch.setattr( + hf_datasets, + "push_local_datasetdict_to_hub", + fake_push_local_datasetdict_to_hub, + ) + + result = HFBackend.push_local_to_hub("dummy/repo", "/tmp/local") + + assert result == "pushed" + assert call == { + "repo_id": "dummy/repo", + "local_dir": "/tmp/local", + "num_workers": 1, + } + + +def 
test_hf_backend_configure_dataset_card_delegates(monkeypatch): + call = {} + + def fake_configure_dataset_card( + repo_id, + infos, + local_dir=None, + viewer=False, + pretty_name=None, + dataset_long_description=None, + illustration_urls=None, + arxiv_paper_urls=None, + ): + call["repo_id"] = repo_id + call["infos"] = infos + call["local_dir"] = local_dir + return "configured" + + monkeypatch.setattr(hf_datasets, "configure_dataset_card", fake_configure_dataset_card) + + infos = {"legal": {"owner": "owner", "license": "cc-by-4.0"}} + result = HFBackend.configure_dataset_card("dummy/repo", infos) + + assert result == "configured" + assert call == {"repo_id": "dummy/repo", "infos": infos, "local_dir": None} + + +def test_hf_backend_to_var_sample_dict_delegates(monkeypatch): + call = {} + + def fake_to_var_sample_dict(ds, i, features, indexers=None): + call["ds"] = ds + call["i"] = i + call["features"] = features + call["indexers"] = indexers + return {"field": [1, 2, 3]} + + monkeypatch.setattr(hf_datasets, "to_var_sample_dict", fake_to_var_sample_dict) + + dataset = object() + features = ["Base/Zone/Field"] + result = HFBackend.to_var_sample_dict(dataset=dataset, idx=3, features=features) + + assert result == {"field": [1, 2, 3]} + assert call == { + "ds": dataset, + "i": 3, + "features": features, + "indexers": None, + } + + +def test_hf_backend_sample_to_var_sample_dict_delegates(monkeypatch): + call = {} + + def fake_sample_to_var_sample_dict(hf_sample): + call["hf_sample"] = hf_sample + return {"field": [4, 5]} + + monkeypatch.setattr(hf_datasets, "sample_to_var_sample_dict", fake_sample_to_var_sample_dict) + + sample = {"Base": {"Zone": {}}} + result = HFBackend.sample_to_var_sample_dict(sample) + + assert result == {"field": [4, 5]} + assert call == {"hf_sample": sample} \ No newline at end of file diff --git a/tests/storage/test_in_memory.py b/tests/storage/test_in_memory.py new file mode 100644 index 00000000..b85191d7 --- /dev/null +++ b/tests/storage/test_in_memory.py @@ -0,0 +1,159 @@ +"""Tests for :mod:`plaid.storage.in_memory`.""" + +from typing import Any, cast + +import pytest + +from plaid.containers.sample import Sample +from plaid.storage.in_memory import InMemoryBackend, _find_first_missing + + +def _new_sample() -> Sample: + """Build a minimal valid sample for in-memory storage tests.""" + return Sample(path=None) + + +def test_find_first_missing(): + """Helper returns the first non-negative missing integer key.""" + assert _find_first_missing([]) == 0 + assert _find_first_missing([0, 1, 3]) == 2 + + +def test_backend_basic_len_and_getitem(): + """Backend supports len and indexing by int/slice/sequence.""" + backend = InMemoryBackend() + assert backend.name == "in_memory" + assert len(backend) == 0 + + samples = [_new_sample(), _new_sample(), _new_sample()] + ids = backend.add_sample(samples) + assert ids == [0, 1, 2] + assert len(backend) == 3 + + assert backend[0] is samples[0] + assert backend[1:3] == [samples[1], samples[2]] + assert backend[[2, 0]] == [samples[2], samples[0]] + + +def test_add_sample_single_and_validation_errors(): + """Single add_sample path validates sample_id and sample types.""" + backend = InMemoryBackend() + sample = _new_sample() + + new_id = backend.add_sample(sample) + assert new_id == 0 + assert backend[0] is sample + + with pytest.raises(TypeError, match="sample_id must be an int"): + backend.add_sample(sample, sample_id=[1]) + + with pytest.raises(TypeError, match="sample must be a Sample"): + backend.add_sample(123) # type: 
ignore[arg-type] + + +def test_add_sample_sequence_and_validation_errors(): + """Sequence add_sample path checks IDs shape/type/uniqueness.""" + backend = InMemoryBackend() + samples = [_new_sample(), _new_sample()] + + assert backend.add_sample(samples, sample_id=[10, 11]) == [10, 11] + assert backend[10] is samples[0] + assert backend[11] is samples[1] + + with pytest.raises(TypeError, match="sample_id must be a sequence"): + backend.add_sample(samples, sample_id=1) + + with pytest.raises(ValueError, match="sample_ids must be unique"): + backend.add_sample(samples, sample_id=[0, 0]) + + with pytest.raises(ValueError, match="length of the list of samples"): + backend.add_sample(samples, sample_id=[0]) + + +def test_set_sample_single_iterable_and_validation_errors(): + """set_sample supports single and iterable paths with validations.""" + backend = InMemoryBackend() + + s0 = _new_sample() + s1 = _new_sample() + s2 = _new_sample() + + assert backend.set_sample(s0, sample_id=None) == 0 + assert backend.set_sample(s1, sample_id=2) == 2 + assert backend.set_sample(s2, sample_id=None) == 1 + + # Overwrite existing id + replacement = _new_sample() + assert backend.set_sample(replacement, sample_id=2) == 2 + assert backend[2] is replacement + + # Iterable path with explicit ids + s3 = _new_sample() + s4 = _new_sample() + ids = backend.set_sample([s3, s4], sample_id=[5, 6]) + assert ids == [5, 6] + assert backend[5] is s3 + assert backend[6] is s4 + + # Iterable path with inferred ids + s5 = _new_sample() + s6 = _new_sample() + ids = backend.set_sample([s5, s6], sample_id=None) + assert len(ids) == 2 + assert all(isinstance(i, int) for i in ids) + + with pytest.raises(TypeError, match="sample should be of type Sample"): + backend.set_sample(3.14, sample_id=None) # type: ignore[arg-type] + + with pytest.raises(TypeError, match="sample_id should be of type"): + backend.set_sample(_new_sample(), sample_id="abc") # type: ignore[arg-type] + + with pytest.raises(ValueError, match="sample_id should be positive"): + backend.set_sample(_new_sample(), sample_id=-1) + + +def test_merge_dataset_and_unsupported_operations(): + """merge_dataset behavior and unsupported hub/disk methods.""" + backend = InMemoryBackend() + assert backend.merge_dataset(None) is None + + source = InMemoryBackend() + src_samples = [_new_sample(), _new_sample()] + source.add_sample(src_samples) + + merged_ids = backend.merge_dataset(source) + assert merged_ids == [0, 1] + assert backend[0] is src_samples[0] + assert backend[1] is src_samples[1] + + with pytest.raises(NotImplementedError): + InMemoryBackend.init_from_disk("/tmp/dummy") + + with pytest.raises(NotImplementedError): + backend.download_from_hub("repo/id", "/tmp/dummy") + + with pytest.raises(NotImplementedError): + backend.download_from_hub( + "repo/id", + "/tmp/dummy", + split_ids={"train": [0]}, + features=["Base/Zone/Field"], + overwrite=True, + ) + + with pytest.raises(NotImplementedError): + backend.init_datasetdict_streaming_from_hub("repo/id") + + with pytest.raises(NotImplementedError): + backend.init_datasetdict_streaming_from_hub( + "repo/id", split_ids={"train": [0]}, features=["Base/Zone/Field"] + ) + + with pytest.raises(NotImplementedError): + backend.generate_to_disk("/tmp/out", generators={}) + + with pytest.raises(NotImplementedError): + backend.push_local_to_hub("repo/id", "/tmp/out") + + with pytest.raises(NotImplementedError): + backend.configure_dataset_card("repo/id", cast(dict[str, Any], {"a": 1})) diff --git a/tests/storage/test_storage.py 
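
The `test_in_memory.py` additions above double as the clearest reference for the runtime-only backend API. Below is a minimal usage sketch assembled from the calls those tests make; names and behaviour are as exercised there, not an authoritative API reference.

```python
# Minimal sketch based on tests/storage/test_in_memory.py; illustrative, not an API reference.
from plaid.containers.sample import Sample
from plaid.storage.in_memory import InMemoryBackend

backend = InMemoryBackend()          # backend.name == "in_memory", len(backend) == 0

# add_sample appends one sample or a sequence; ids default to the first free non-negative ints.
ids = backend.add_sample([Sample(path=None), Sample(path=None)])   # -> [0, 1]

# set_sample writes at an explicit id (overwriting if it exists) or infers the next free one.
backend.set_sample(Sample(path=None), sample_id=5)                 # -> 5
backend.set_sample(Sample(path=None), sample_id=None)              # -> 2, the first missing id

# Indexing accepts an int, a slice, or a sequence of ids.
first = backend[0]
pair = backend[[5, 0]]

# merge_dataset pulls samples from another in-memory backend and returns their new ids.
other = InMemoryBackend()
other.add_sample([Sample(path=None)])
new_ids = backend.merge_dataset(other)

# Disk/hub operations (init_from_disk, download_from_hub, push_local_to_hub, ...)
# raise NotImplementedError on this backend.
```
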
b/tests/storage/test_storage.py index aa58932c..6c914aae 100644 --- a/tests/storage/test_storage.py +++ b/tests/storage/test_storage.py @@ -1,10 +1,3 @@ -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - # %% Imports import json @@ -27,9 +20,6 @@ from plaid.storage.cgns.writer import ( generate_datasetdict_to_disk as cgns_generate_datasetdict_to_disk, ) -from plaid.storage.hf_datasets.bridge import ( - to_var_sample_dict, -) from plaid.storage.writer import ( _build_gen_kwargs, _SampleFuncGenerator, @@ -161,16 +151,16 @@ def current_directory(): # %% Fixtures @pytest.fixture() -def dataset(samples, infos) -> Dataset: +def dataset(samples) -> Dataset: samples_ = [] for i, sample in enumerate(samples): sample_ = deepcopy(sample) if i == 0 or i == 2: - sample_.add_scalar("toto", 1.0) + sample_.add_global("toto", 1.0) samples_.append(sample_) - dataset = Dataset(samples=samples_) - dataset.set_infos(infos) + dataset = Dataset() + dataset.get_backend().add_sample(sample=samples_) return dataset @@ -182,9 +172,10 @@ def main_splits() -> dict: @pytest.fixture() def problem_definition(main_splits) -> ProblemDefinition: problem_definition = ProblemDefinition() - problem_definition.set_task("regression") - problem_definition.add_input_scalars_names(["feature_name_1", "feature_name_2"]) - problem_definition.set_split(main_splits) + problem_definition.task = "regression" + problem_definition.add_in_features_identifiers(["feature_name_1", "feature_name_2"]) + problem_definition.train_split = {"train": main_splits["train"]} + problem_definition.test_split = {"test": main_splits["test"]} return problem_definition @@ -200,13 +191,16 @@ def _sample_constructor(id): @pytest.fixture() def split_ids(problem_definition) -> dict: - return problem_definition.get_split() + return { + "train": problem_definition.get_train_split_indices(), + "test": problem_definition.get_test_split_indices(), + } class Test_Storage: def assert_sample(self, sample): assert isinstance(sample, Sample) - sorted_names = sorted(sample.get_scalar_names()) + sorted_names = sorted(sample.get_global_names()) for i in range(4): assert sorted_names[i] == f"global_{i}" assert "test_field_same_size" in sample.get_field_names() @@ -224,6 +218,8 @@ def test_hf_datasets( infos, problem_definition, ): + import plaid.storage.hf_datasets.bridge as hf_bridge + test_dir = tmp_path / "test_hf" save_to_disk( @@ -247,7 +243,7 @@ def test_hf_datasets( ) with pytest.raises(ValueError): - problem_definition.set_name(None) + problem_definition.name = None save_to_disk( output_folder=test_dir, sample_constructor=sample_constructor, @@ -296,13 +292,7 @@ def test_hf_datasets( converter.plaid_to_dict(plaid_sample) - to_var_sample_dict(hf_dataset, 0, enforce_shapes=False) - - for t in plaid_sample.get_all_time_values(): - for path in problem_definition.get_in_features_identifiers(): - plaid_sample.get_feature_by_path(path=path, time=t) - for path in problem_definition.get_out_features_identifiers(): - plaid_sample.get_feature_by_path(path=path, time=t) + hf_bridge.to_var_sample_dict(hf_dataset, 0) converter.to_dict(hf_dataset, 0) converter.sample_to_dict(hf_dataset[0]) @@ -369,12 +359,6 @@ def test_zarr( zarr_dataset.toto = 1.0 print(zarr_dataset) - for t in plaid_sample.get_all_time_values(): - for path in problem_definition.get_in_features_identifiers(): - plaid_sample.get_feature_by_path(path=path, time=t) - for path in 
problem_definition.get_out_features_identifiers(): - plaid_sample.get_feature_by_path(path=path, time=t) - converter.to_dict(zarr_dataset, 0) converter.sample_to_dict(zarr_dataset[0]) @@ -389,6 +373,216 @@ def test_zarr( with pytest.raises(KeyError): converter.to_dict(zarr_dataset, 0, features=["dummy"]) + def test_hf_datasets_indexers( + self, + tmp_path, + sample_constructor, + split_ids, + infos, + problem_definition, + ): + test_dir = tmp_path / "test_hf_indexers" + + save_to_disk( + output_folder=test_dir, + sample_constructor=sample_constructor, + ids=split_ids, + backend="hf_datasets", + infos=infos, + pb_defs={"pb_def": problem_definition}, + overwrite=True, + ) + + datasetdict, converterdict = init_from_disk(test_dir) + hf_dataset = datasetdict["train"] + converter = converterdict["train"] + + field_path = "Base_Name/Zone_Name/VertexFields/test_field_same_size" + selected_idx = [1, 3, 7, 11] + + sampled = converter.to_plaid( + hf_dataset, + 0, + features=[field_path], + indexers={field_path: selected_idx}, + ) + full = converter.to_plaid(hf_dataset, 0, features=[field_path]) + + expected = full.get_field("test_field_same_size")[selected_idx] + got = sampled.get_field("test_field_same_size") + assert np.array_equal(got, expected) + + with pytest.raises(KeyError): + converter.to_dict( + hf_dataset, + 0, + features=[field_path], + indexers={"dummy": selected_idx}, + ) + + # Valid variable feature key, but not among requested features + other_variable_feature = next( + f + for f in converter.variable_features + if f != field_path and not f.endswith("_times") + ) + with pytest.raises(KeyError): + converter.to_dict( + hf_dataset, + 0, + features=[field_path], + indexers={other_variable_feature: [0]}, + ) + + def test_zarr_indexers( + self, + tmp_path, + sample_constructor, + split_ids, + infos, + problem_definition, + ): + test_dir = tmp_path / "test_zarr_indexers" + + save_to_disk( + output_folder=test_dir, + sample_constructor=sample_constructor, + ids=split_ids, + backend="zarr", + infos=infos, + pb_defs={"pb_def": problem_definition}, + overwrite=True, + ) + + datasetdict, converterdict = init_from_disk(test_dir) + zarr_dataset = datasetdict["train"] + converter = converterdict["train"] + + field_path = "Base_Name/Zone_Name/VertexFields/test_field_same_size" + selected_idx = [0, 2, 4, 8, 16] + + sampled = converter.to_plaid( + zarr_dataset, + 0, + features=[field_path], + indexers={field_path: selected_idx}, + ) + full = converter.to_plaid(zarr_dataset, 0, features=[field_path]) + + expected = full.get_field("test_field_same_size")[selected_idx] + got = sampled.get_field("test_field_same_size") + assert np.array_equal(got, expected) + + with pytest.raises(IndexError): + converter.to_dict( + zarr_dataset, + 0, + features=[field_path], + indexers={field_path: [999]}, + ) + + def test_zarr_bridge_indexer_branches( + self, + tmp_path, + sample_constructor, + split_ids, + infos, + problem_definition, + ): + import plaid.storage.zarr.bridge as zarr_bridge + + test_dir = tmp_path / "test_zarr_bridge_indexers" + save_to_disk( + output_folder=test_dir, + sample_constructor=sample_constructor, + ids=split_ids, + backend="zarr", + infos=infos, + pb_defs={"pb_def": problem_definition}, + overwrite=True, + ) + datasetdict, _ = init_from_disk(test_dir) + zarr_dataset = datasetdict["train"] + + # cover `continue` on missing feature key + out = zarr_bridge.to_var_sample_dict( + zarr_dataset, 0, features=["missing/feature/path"] + ) + assert out == {} + + # cover slice branch + arr = np.arange(10) + 
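
The indexer tests added above introduce partial reads of variable features: `to_plaid` (and `to_dict`) accept an `indexers` mapping from a requested feature path to the entry indices to load. A short sketch of that call pattern, assuming `converter` and the backend dataset were obtained from `init_from_disk` as in the fixtures; the helper name `read_field_subset` is illustrative.

```python
# Sketch of the indexer-based partial read exercised by test_hf_datasets_indexers / test_zarr_indexers.
import numpy as np


def read_field_subset(converter, backend_dataset, field_path, selected_idx):
    """Load only `selected_idx` entries of one variable feature; illustrative helper."""
    full = converter.to_plaid(backend_dataset, 0, features=[field_path])
    sampled = converter.to_plaid(
        backend_dataset,
        0,
        features=[field_path],
        indexers={field_path: selected_idx},
    )
    # The indexed read matches slicing the fully loaded field.
    field_name = field_path.rsplit("/", 1)[-1]
    assert np.array_equal(
        sampled.get_field(field_name),
        full.get_field(field_name)[selected_idx],
    )
    # Indexer keys must be among the requested features; unknown keys raise KeyError,
    # and out-of-range indices raise IndexError (zarr backend), per the tests above.
    return sampled
```
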
sliced = zarr_bridge._apply_indexer(arr, slice(1, 6, 2), "feat") + assert np.array_equal(sliced, np.array([1, 3, 5])) + + # cover scalar and invalid-shape indexer branches + with pytest.raises(ValueError): + zarr_bridge._apply_indexer(np.array(1), [0], "feat") + with pytest.raises(ValueError): + zarr_bridge._apply_indexer(np.arange(5), [[0, 1]], "feat") + + def test_hf_bridge_indexer_branches( + self, + tmp_path, + sample_constructor, + split_ids, + infos, + problem_definition, + ): + import pyarrow as pa + + import plaid.storage.hf_datasets.bridge as hf_bridge + + test_dir = tmp_path / "test_hf_bridge_indexers" + save_to_disk( + output_folder=test_dir, + sample_constructor=sample_constructor, + ids=split_ids, + backend="hf_datasets", + infos=infos, + pb_defs={"pb_def": problem_definition}, + overwrite=True, + ) + datasetdict, _ = init_from_disk(test_dir) + hf_dataset = datasetdict["train"] + + field_path = "Base_Name/Zone_Name/VertexFields/test_field_same_size" + + # cover indexed branch + out = hf_bridge.to_var_sample_dict( + hf_dataset, + 0, + features=[field_path], + indexers={field_path: [0, 2, 4]}, + ) + assert out[field_path].shape == (3,) + + # cover sample_to_var_sample_dict None branch + assert hf_bridge.sample_to_var_sample_dict({"a": None}) == {"a": None} + + # cover _extract_indexed_arrow slice / invalid ndim / oob + primitive = pa.array([0, 1, 2, 3, 4], type=pa.int64()) + assert np.array_equal( + hf_bridge._extract_indexed_arrow(primitive, slice(1, 4), "f"), + np.array([1, 2, 3]), + ) + with pytest.raises(ValueError): + hf_bridge._extract_indexed_arrow(primitive, [[0, 1]], "f") + with pytest.raises(IndexError): + hf_bridge._extract_indexed_arrow(primitive, [99], "f") + + # cover fallback branch (ListArray -> _to_numpy_arrow + _apply_indexer) + list_arr = pa.array([[1, 2, 3], [4, 5, 6]], type=pa.list_(pa.int64())) + fallback = hf_bridge._extract_indexed_arrow(list_arr, [0, 2], "f") + assert np.array_equal(fallback, np.array([[1, 3], [4, 6]])) + + # cover _to_numpy_arrow default and _apply_indexer scalar guard + assert np.array_equal( + hf_bridge._to_numpy_arrow(primitive), np.array([0, 1, 2, 3, 4]) + ) + with pytest.raises(ValueError): + hf_bridge._apply_indexer(np.array(1), [0], "f") + def test_cgns( self, tmp_path, sample_constructor, split_ids, infos, problem_definition ): @@ -426,12 +620,6 @@ def test_cgns( converter.plaid_to_dict(plaid_sample) - for t in plaid_sample.get_all_time_values(): - for path in problem_definition.get_in_features_identifiers(): - plaid_sample.get_feature_by_path(path=path, time=t) - for path in problem_definition.get_out_features_identifiers(): - plaid_sample.get_feature_by_path(path=path, time=t) - with pytest.raises(ValueError): converter.to_dict(cgns_dataset, 0) with pytest.raises(ValueError): diff --git a/tests/storage/test_zarr_init.py b/tests/storage/test_zarr_init.py new file mode 100644 index 00000000..a1f18d85 --- /dev/null +++ b/tests/storage/test_zarr_init.py @@ -0,0 +1,217 @@ +from pathlib import Path + +import plaid.storage.zarr as zarr +from plaid.storage.zarr import ZarrBackend + + +def test_public_exports_and_backend_name(): + expected = { + "configure_dataset_card", + "download_datasetdict_from_hub", + "generate_datasetdict_to_disk", + "init_datasetdict_from_disk", + "init_datasetdict_streaming_from_hub", + "push_local_datasetdict_to_hub", + "sample_to_var_sample_dict", + "to_var_sample_dict", + } + assert set(zarr.__all__) == expected + assert ZarrBackend.name == "zarr" + + +def 
test_zarr_backend_init_from_disk_delegates(monkeypatch): + call = {} + + def fake_init_datasetdict_from_disk(path): + call["path"] = path + return {"train": "dataset"} + + monkeypatch.setattr(zarr, "init_datasetdict_from_disk", fake_init_datasetdict_from_disk) + + local_path = Path("/tmp/my_dataset") + result = ZarrBackend.init_from_disk(local_path) + + assert result == {"train": "dataset"} + assert call == {"path": local_path} + + +def test_zarr_backend_download_from_hub_delegates(monkeypatch): + call = {} + + def fake_download_datasetdict_from_hub( + repo_id, local_dir, split_ids=None, features=None, overwrite=False + ): + call["repo_id"] = repo_id + call["local_dir"] = local_dir + call["split_ids"] = split_ids + call["features"] = features + call["overwrite"] = overwrite + return "downloaded_path" + + monkeypatch.setattr(zarr, "download_datasetdict_from_hub", fake_download_datasetdict_from_hub) + + backend = ZarrBackend() + result = backend.download_from_hub("dummy/repo", "/tmp/local") + + assert result == "downloaded_path" + assert call == { + "repo_id": "dummy/repo", + "local_dir": "/tmp/local", + "split_ids": None, + "features": None, + "overwrite": False, + } + + +def test_zarr_backend_streaming_from_hub_delegates(monkeypatch): + call = {} + + def fake_init_datasetdict_streaming_from_hub( + repo_id, split_ids=None, features=None + ): + call["repo_id"] = repo_id + call["split_ids"] = split_ids + call["features"] = features + return {"train": "streaming_dataset"} + + monkeypatch.setattr( + zarr, + "init_datasetdict_streaming_from_hub", + fake_init_datasetdict_streaming_from_hub, + ) + + backend = ZarrBackend() + result = backend.init_datasetdict_streaming_from_hub("PhysArena/Rotor37") + + assert result == {"train": "streaming_dataset"} + assert call == { + "repo_id": "PhysArena/Rotor37", + "split_ids": None, + "features": None, + } + + +def test_zarr_backend_generate_to_disk_delegates(monkeypatch): + call = {} + + def fake_generate_datasetdict_to_disk(**kwargs): + call.update(kwargs) + return "ok" + + monkeypatch.setattr(zarr, "generate_datasetdict_to_disk", fake_generate_datasetdict_to_disk) + + generators = {"train": lambda: iter(())} + variable_schema = {"Global/temperature": {"dtype": "float32", "ndim": 1}} + gen_kwargs = {"train": {"shards_ids": [[0, 1]]}} + + result = ZarrBackend.generate_to_disk( + output_folder="/tmp/output", + generators=generators, + variable_schema=variable_schema, + gen_kwargs=gen_kwargs, + num_proc=2, + verbose=True, + ) + + assert result == "ok" + assert call == { + "output_folder": "/tmp/output", + "generators": generators, + "variable_schema": variable_schema, + "gen_kwargs": gen_kwargs, + "num_proc": 2, + "verbose": True, + } + + +def test_zarr_backend_push_local_to_hub_delegates(monkeypatch): + call = {} + + def fake_push_local_datasetdict_to_hub(repo_id, local_dir, num_workers=1): + call["repo_id"] = repo_id + call["local_dir"] = local_dir + call["num_workers"] = num_workers + return "pushed" + + monkeypatch.setattr( + zarr, + "push_local_datasetdict_to_hub", + fake_push_local_datasetdict_to_hub, + ) + + result = ZarrBackend.push_local_to_hub("dummy/repo", "/tmp/local") + + assert result == "pushed" + assert call == { + "repo_id": "dummy/repo", + "local_dir": "/tmp/local", + "num_workers": 1, + } + + +def test_zarr_backend_configure_dataset_card_delegates(monkeypatch): + call = {} + + def fake_configure_dataset_card( + repo_id, + infos, + local_dir, + viewer=False, + pretty_name=None, + dataset_long_description=None, + illustration_urls=None, + 
arxiv_paper_urls=None, + ): + call["repo_id"] = repo_id + call["infos"] = infos + call["local_dir"] = local_dir + return "configured" + + monkeypatch.setattr(zarr, "configure_dataset_card", fake_configure_dataset_card) + + infos = {"legal": {"owner": "owner", "license": "cc-by-4.0"}} + result = ZarrBackend.configure_dataset_card("dummy/repo", infos, "/tmp/local") + + assert result == "configured" + assert call == {"repo_id": "dummy/repo", "infos": infos, "local_dir": "/tmp/local"} + + +def test_zarr_backend_to_var_sample_dict_delegates(monkeypatch): + call = {} + + def fake_to_var_sample_dict(zarr_dataset, idx, features, indexers=None): + call["zarr_dataset"] = zarr_dataset + call["idx"] = idx + call["features"] = features + call["indexers"] = indexers + return {"field": [1, 2, 3]} + + monkeypatch.setattr(zarr, "to_var_sample_dict", fake_to_var_sample_dict) + + dataset = object() + features = ["Base/Zone/Field"] + result = ZarrBackend.to_var_sample_dict(dataset=dataset, idx=3, features=features) + + assert result == {"field": [1, 2, 3]} + assert call == { + "zarr_dataset": dataset, + "idx": 3, + "features": features, + "indexers": None, + } + + +def test_zarr_backend_sample_to_var_sample_dict_delegates(monkeypatch): + call = {} + + def fake_sample_to_var_sample_dict(zarr_sample): + call["zarr_sample"] = zarr_sample + return {"field": [4, 5]} + + monkeypatch.setattr(zarr, "sample_to_var_sample_dict", fake_sample_to_var_sample_dict) + + sample = {"Base": {"Zone": {}}} + result = ZarrBackend.sample_to_var_sample_dict(sample) + + assert result == {"field": [4, 5]} + assert call == {"zarr_sample": sample} \ No newline at end of file diff --git a/tests/test___init__.py b/tests/test___init__.py index 0bc5758f..a5c933eb 100644 --- a/tests/test___init__.py +++ b/tests/test___init__.py @@ -1,10 +1,3 @@ -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - # %% Tests diff --git a/tests/test_problem_definition.py b/tests/test_problem_definition.py index 3176d6c4..77e8339b 100644 --- a/tests/test_problem_definition.py +++ b/tests/test_problem_definition.py @@ -1,10 +1,3 @@ -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. 
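
Taken together, `test_cgns_init.py`, `test_hf_datasets_init.py`, and `test_zarr_init.py` exercise the same classmethod surface on each backend class. A rough sketch of that shared contract as a `typing.Protocol` follows, with member names lifted from these tests; the real contract in `plaid.storage.backend_api` may differ in naming, signatures, and defaults.

```python
# Rough sketch of the backend surface exercised by the test_*_init.py files above.
# Member names come from those tests; plaid.storage.backend_api may define the contract differently.
from pathlib import Path
from typing import Any, Optional, Protocol, Sequence


class StorageBackendSketch(Protocol):
    name: str  # e.g. "cgns", "hf_datasets", "zarr", "in_memory"

    def init_from_disk(self, path: Path) -> Any: ...

    def download_from_hub(
        self,
        repo_id: str,
        local_dir: str,
        split_ids: Optional[Any] = None,
        features: Optional[Sequence[str]] = None,
        overwrite: bool = False,
    ) -> Any: ...

    def init_datasetdict_streaming_from_hub(
        self,
        repo_id: str,
        split_ids: Optional[Any] = None,
        features: Optional[Sequence[str]] = None,
    ) -> Any: ...

    def generate_to_disk(
        self,
        output_folder: str,
        generators: dict,
        variable_schema: Optional[dict] = None,
        gen_kwargs: Optional[dict] = None,
        num_proc: int = 1,
        verbose: bool = False,
    ) -> Any: ...

    def push_local_to_hub(self, repo_id: str, local_dir: str, num_workers: int = 1) -> Any: ...

    def configure_dataset_card(self, repo_id: str, infos: dict, **card_kwargs: Any) -> Any: ...

    # Variable-feature extraction; the cgns backend raises ValueError for these two.
    def to_var_sample_dict(self, dataset: Any, idx: int, features: Sequence[str]) -> dict: ...

    def sample_to_var_sample_dict(self, sample: Any) -> dict: ...
```
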
-# -# - # %% Imports import os @@ -12,10 +5,8 @@ from pathlib import Path import pytest -from packaging.version import Version +from pydantic import ValidationError -import plaid -from plaid.containers import FeatureIdentifier from plaid.problem_definition import ProblemDefinition # %% Fixtures @@ -28,25 +19,22 @@ def problem_definition() -> ProblemDefinition: @pytest.fixture() def problem_definition_full(problem_definition: ProblemDefinition) -> ProblemDefinition: - problem_definition.set_task("regression") - problem_definition.set_name("regression_1") + problem_definition.task = "regression" + problem_definition.name = "regression_1" # ---- - feature_identifier = FeatureIdentifier({"type": "scalar", "name": "feature"}) - predict_feature_identifier = FeatureIdentifier( - {"type": "scalar", "name": "predict_feature"} - ) - test_feature_identifier = FeatureIdentifier( - {"type": "scalar", "name": "test_feature"} - ) + feature_identifier = "Global/feature" + predict_feature_identifier = "Global/predict_feature" + test_feature_identifier = "Global/test_feature" + problem_definition.add_in_features_identifiers( [predict_feature_identifier, test_feature_identifier] ) - problem_definition.add_in_feature_identifier(feature_identifier) + problem_definition.add_in_features_identifiers(feature_identifier) problem_definition.add_out_features_identifiers( [predict_feature_identifier, test_feature_identifier] ) - problem_definition.add_out_feature_identifier(feature_identifier) + problem_definition.add_out_features_identifiers(feature_identifier) # ---- feature_identifier = "Base_2_2/Zone/PointData/U1" predict_feature_identifier = "Base_2_2/Zone/PointData/U2" @@ -54,44 +42,12 @@ def problem_definition_full(problem_definition: ProblemDefinition) -> ProblemDef problem_definition.add_in_features_identifiers( [predict_feature_identifier, test_feature_identifier] ) - problem_definition.add_in_feature_identifier(feature_identifier) + problem_definition.add_in_features_identifiers(feature_identifier) problem_definition.add_out_features_identifiers( [predict_feature_identifier, test_feature_identifier] ) - problem_definition.add_constant_feature_identifier(feature_identifier) - problem_definition.add_constant_features_identifiers( - [predict_feature_identifier, test_feature_identifier] - ) - - # ---- - problem_definition.add_input_scalars_names(["scalar", "test_scalar"]) - problem_definition.add_input_scalar_name("predict_scalar") - problem_definition.add_output_scalars_names(["scalar", "test_scalar"]) - problem_definition.add_output_scalar_name("predict_scalar") - - problem_definition.add_input_fields_names(["field", "test_field"]) - problem_definition.add_input_field_name("predict_field") - problem_definition.add_output_fields_names(["field", "test_field"]) - problem_definition.add_output_field_name("predict_field") - - problem_definition.add_input_timeseries_names(["timeseries", "test_timeseries"]) - problem_definition.add_input_timeseries_name("predict_timeseries") - problem_definition.add_output_timeseries_names(["timeseries", "test_timeseries"]) - problem_definition.add_output_timeseries_name("predict_timeseries") - - problem_definition.add_input_meshes_names(["mesh", "test_mesh"]) - problem_definition.add_input_mesh_name("predict_mesh") - problem_definition.add_output_meshes_names(["mesh", "test_mesh"]) - problem_definition.add_output_mesh_name("predict_mesh") - - new_split = {"train": [0, 1, 2], "test": [3, 4]} - problem_definition.set_split(new_split) - - new_split = {"train_1": [0, 1, 2], "train_2": 
"all"} - problem_definition.set_train_split(new_split) - - new_split = {"test_1": "all", "test_2": [0, 2]} - problem_definition.set_test_split(new_split) + problem_definition.train_split = {"train_1": [0, 1, 2]} + problem_definition.test_split = {"test_1": "all"} return problem_definition @@ -120,650 +76,240 @@ def clean_tests(): class Test_ProblemDefinition: def test__init__(self, problem_definition): - assert problem_definition.get_task() is None + assert problem_definition.task is None print(problem_definition) - def test__init__path(self, current_directory): - d_path = current_directory / "problem_definition" - ProblemDefinition(path=d_path) - - def test__init__directory_path(self, current_directory): - d_path = current_directory / "problem_definition" - ProblemDefinition(directory_path=d_path) - def test__init__both_path_and_directory_path(self, current_directory): d_path = current_directory / "problem_definition" with pytest.raises(ValueError): ProblemDefinition(path=d_path, directory_path=d_path) - # -------------------------------------------------------------------------# - def test_version(self, problem_definition): - # Unauthorized version - assert problem_definition.get_version() == Version(plaid.__version__) - # -------------------------------------------------------------------------# def test_task(self, problem_definition): # Unauthorized task - with pytest.raises(TypeError): - problem_definition.set_task("ighyurgv") - problem_definition.set_task("classification") - with pytest.raises(ValueError): - problem_definition.set_task("regression") - assert problem_definition.get_task() == "classification" + with pytest.raises(ValidationError): + problem_definition.task = "ighyurgv" + problem_definition.task = "classification" + assert problem_definition.task == "classification" print(problem_definition) # -------------------------------------------------------------------------# def test_score_function(self, problem_definition): # Unauthorized task - with pytest.raises(TypeError): - problem_definition.set_score_function("ighyurgv") - problem_definition.set_score_function("RRMSE") - with pytest.raises(ValueError): - problem_definition.set_score_function("RRMSE") - assert problem_definition.get_score_function() == "RRMSE" - print(problem_definition) - - # -#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-# - # -------------------------------------------------------------------------# - def test_get_in_features_identifiers(self, problem_definition): - assert problem_definition.get_in_features_identifiers() == [] - - def test_set_in_features_identifiers(self, problem_definition): - dummy_identifier = FeatureIdentifier({"type": "scalar", "name": "dummy"}) - problem_definition.set_in_features_identifiers(dummy_identifier) - dummy_identifier = "Global/toto" - problem_definition.set_in_features_identifiers(dummy_identifier) - - def test_add_in_features_identifiers_fail_same_identifier(self, problem_definition): - dummy_identifier = FeatureIdentifier({"type": "scalar", "name": "dummy"}) - with pytest.raises(ValueError): - problem_definition.add_in_features_identifiers( - [dummy_identifier, dummy_identifier] - ) - problem_definition.add_in_feature_identifier(dummy_identifier) - with pytest.raises(ValueError): - problem_definition.add_in_feature_identifier(dummy_identifier) - - def test_add_in_features_identifiers(self, problem_definition): - dummy_identifier_1 = FeatureIdentifier({"type": "scalar", "name": "dummy_1"}) - dummy_identifier_2 = FeatureIdentifier({"type": 
"scalar", "name": "dummy_2"}) - dummy_identifier_3 = FeatureIdentifier({"type": "scalar", "name": "dummy_3"}) - problem_definition.add_in_features_identifiers( - [dummy_identifier_1, dummy_identifier_2] - ) - problem_definition.add_in_feature_identifier(dummy_identifier_3) - inputs = problem_definition.get_in_features_identifiers() - assert len(inputs) == 3 - assert set(inputs) == set( - [dummy_identifier_1, dummy_identifier_2, dummy_identifier_3] - ) - print(problem_definition) - - # -------------------------------------------------------------------------# - def test_get_out_features_identifiers(self, problem_definition): - assert problem_definition.get_out_features_identifiers() == [] - - def test_set_out_features_identifiers(self, problem_definition): - dummy_identifier = FeatureIdentifier({"type": "scalar", "name": "dummy"}) - problem_definition.set_out_features_identifiers(dummy_identifier) - dummy_identifier = "Global/toto" - problem_definition.set_out_features_identifiers(dummy_identifier) - - def test_add_out_features_identifiers_fail(self, problem_definition): - dummy_identifier = FeatureIdentifier({"type": "scalar", "name": "dummy"}) - with pytest.raises(ValueError): - problem_definition.add_out_features_identifiers( - [dummy_identifier, dummy_identifier] - ) - problem_definition.add_out_feature_identifier(dummy_identifier) - with pytest.raises(ValueError): - problem_definition.add_out_feature_identifier(dummy_identifier) - - def test_add_out_features_identifiers(self, problem_definition): - dummy_identifier_1 = FeatureIdentifier({"type": "scalar", "name": "dummy_1"}) - dummy_identifier_2 = FeatureIdentifier({"type": "scalar", "name": "dummy_2"}) - dummy_identifier_3 = FeatureIdentifier({"type": "scalar", "name": "dummy_3"}) - problem_definition.add_out_features_identifiers( - [dummy_identifier_1, dummy_identifier_2] - ) - problem_definition.add_out_feature_identifier(dummy_identifier_3) - outputs = problem_definition.get_out_features_identifiers() - assert len(outputs) == 3 - assert set(outputs) == set( - [dummy_identifier_1, dummy_identifier_2, dummy_identifier_3] - ) - print(problem_definition) - - # -------------------------------------------------------------------------# - def test_get_constant_features_identifiers(self, problem_definition): - assert problem_definition.get_constant_features_identifiers() == [] - - def test_set_constant_features_identifiers(self, problem_definition): - dummy_identifier_1 = "Global/P" - dummy_identifier_2 = "Base_2_2/Zone/GridCoordinates" - problem_definition.set_constant_features_identifiers( - [dummy_identifier_1, dummy_identifier_2] - ) - constants = problem_definition.get_constant_features_identifiers() - assert len(constants) == 2 - assert set(constants) == set([dummy_identifier_1, dummy_identifier_2]) - - def test_add_constant_features_identifiers_fail(self, problem_definition): - dummy_identifier = FeatureIdentifier({"type": "scalar", "name": "dummy"}) - with pytest.raises(ValueError): - problem_definition.add_constant_features_identifiers( - [dummy_identifier, dummy_identifier] - ) - problem_definition.add_constant_feature_identifier(dummy_identifier) - with pytest.raises(ValueError): - problem_definition.add_constant_feature_identifier(dummy_identifier) - - def test_add_constant_features_identifiers(self, problem_definition): - dummy_identifier_1 = FeatureIdentifier({"type": "scalar", "name": "dummy_1"}) - dummy_identifier_2 = FeatureIdentifier({"type": "scalar", "name": "dummy_2"}) - dummy_identifier_3 = FeatureIdentifier({"type": 
"scalar", "name": "dummy_3"}) - problem_definition.add_constant_features_identifiers( - [dummy_identifier_1, dummy_identifier_2] - ) - problem_definition.add_constant_feature_identifier(dummy_identifier_3) - constants = problem_definition.get_constant_features_identifiers() - assert len(constants) == 3 - assert set(constants) == set( - [dummy_identifier_1, dummy_identifier_2, dummy_identifier_3] - ) + with pytest.raises(ValidationError): + problem_definition.score_function = "ighyurgv" + problem_definition.score_function = "RRMSE" + # can be set again to the same value + problem_definition.score_function = "RRMSE" + assert problem_definition.score_function == "RRMSE" print(problem_definition) - # -------------------------------------------------------------------------# - def test_filter_features_identifiers(self, current_directory): - d_path = current_directory / "problem_definition" - problem = ProblemDefinition(d_path) - predict_feature_identifier = FeatureIdentifier( - {"type": "scalar", "name": "predict_feature"} - ) - test_feature_identifier = FeatureIdentifier( - {"type": "scalar", "name": "test_feature"} - ) - filter_in = problem.filter_in_features_identifiers( - [predict_feature_identifier, test_feature_identifier] - ) - filter_out = problem.filter_out_features_identifiers( - [predict_feature_identifier, test_feature_identifier] - ) - filter_cte = problem.filter_constant_features_identifiers( - [predict_feature_identifier, test_feature_identifier] - ) - filter_cte - assert len(filter_in) == 2 and filter_in == [ - predict_feature_identifier, - test_feature_identifier, - ] - assert filter_in != [test_feature_identifier, predict_feature_identifier], ( - "common inputs not sorted" - ) - - assert len(filter_out) == 2 and filter_out == [ - predict_feature_identifier, - test_feature_identifier, - ] - assert filter_out != [test_feature_identifier, predict_feature_identifier], ( - "common outputs not sorted" - ) - - inexisting_feature_identifier = FeatureIdentifier( - {"type": "scalar", "name": "inexisting_feature"} - ) - fail_filter_in = problem.filter_in_features_identifiers( - [inexisting_feature_identifier] - ) - fail_filter_out = problem.filter_out_features_identifiers( - [inexisting_feature_identifier] - ) - fail_filter_cte = problem.filter_constant_features_identifiers( - ["Base_2_2/Zone/PointData/inexisting_feature"] - ) - - assert fail_filter_in == [] - assert fail_filter_out == [] - assert fail_filter_cte == [] - - # -#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-# - # -------------------------------------------------------------------------# - def test_get_input_scalars_names(self, problem_definition): - assert problem_definition.get_input_scalars_names() == [] - - def test_add_input_scalars_names_fail_same_name(self, problem_definition): - with pytest.raises(ValueError): - problem_definition.add_input_scalars_names(["feature_name", "feature_name"]) - problem_definition.add_input_scalar_name("feature_name") - with pytest.raises(ValueError): - problem_definition.add_input_scalar_name("feature_name") - - def test_add_input_scalars_names(self, problem_definition): - problem_definition.add_input_scalars_names(["scalar", "test_scalar"]) - problem_definition.add_input_scalar_name("predict_scalar") - inputs = problem_definition.get_input_scalars_names() - assert len(inputs) == 3 - assert set(inputs) == set(["predict_scalar", "scalar", "test_scalar"]) - print(problem_definition) - - # -------------------------------------------------------------------------# - 
def test_get_output_scalars_names(self, problem_definition): - assert problem_definition.get_output_scalars_names() == [] - - def test_add_output_scalars_names_fail(self, problem_definition): - with pytest.raises(ValueError): - problem_definition.add_output_scalars_names( - ["feature_name", "feature_name"] - ) - problem_definition.add_output_scalar_name("feature_name") - with pytest.raises(ValueError): - problem_definition.add_output_scalar_name("feature_name") - - def test_add_output_scalars_names(self, problem_definition): - problem_definition.add_output_scalars_names(["scalar", "test_scalar"]) - problem_definition.add_output_scalar_name("predict_scalar") - outputs = problem_definition.get_output_scalars_names() - assert len(outputs) == 3 - assert set(outputs) == set(["predict_scalar", "scalar", "test_scalar"]) - print(problem_definition) - - # -------------------------------------------------------------------------# - def test_filter_scalars_names(self, current_directory): - d_path = current_directory / "problem_definition" - problem = ProblemDefinition(d_path) - filter_in = problem.filter_input_scalars_names( - ["predict_scalar", "test_scalar"] - ) - filter_out = problem.filter_output_scalars_names( - ["predict_scalar", "test_scalar"] - ) - assert len(filter_in) == 2 and filter_in == ["predict_scalar", "test_scalar"] - assert filter_in != ["test_scalar", "predict_scalar"], ( - "common inputs not sorted" - ) - - assert len(filter_out) == 2 and filter_out == ["predict_scalar", "test_scalar"] - assert filter_out != ["test_scalar", "predict_scalar"], ( - "common outputs not sorted" - ) - - fail_filter_in = problem.filter_input_scalars_names(["a_scalar"]) - fail_filter_out = problem.filter_output_scalars_names(["b_scalar"]) - - assert fail_filter_in == [] - assert fail_filter_out == [] - - # -#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-# - # -------------------------------------------------------------------------# - def test_get_input_fields_names(self, problem_definition): - assert problem_definition.get_input_fields_names() == [] - - def test_add_input_fields_names_fail_same_name(self, problem_definition): - with pytest.raises(ValueError): - problem_definition.add_input_fields_names(["feature_name", "feature_name"]) - problem_definition.add_input_field_name("feature_name") - with pytest.raises(ValueError): - problem_definition.add_input_field_name("feature_name") - - def test_add_input_fields_names(self, problem_definition): - problem_definition.add_input_fields_names(["field", "test_field"]) - problem_definition.add_input_field_name("predict_field") - inputs = problem_definition.get_input_fields_names() - assert len(inputs) == 3 - assert set(inputs) == set(["predict_field", "field", "test_field"]) - print(problem_definition) - - # -------------------------------------------------------------------------# - def test_get_output_fields_names(self, problem_definition): - assert problem_definition.get_output_fields_names() == [] - - def test_add_output_fields_names_fail(self, problem_definition): - with pytest.raises(ValueError): - problem_definition.add_output_fields_names(["feature_name", "feature_name"]) - problem_definition.add_output_field_name("feature_name") - with pytest.raises(ValueError): - problem_definition.add_output_field_name("feature_name") - - def test_add_output_fields_names(self, problem_definition): - problem_definition.add_output_fields_names(["field", "test_field"]) - problem_definition.add_output_field_name("predict_field") - outputs = 
problem_definition.get_output_fields_names() - assert len(outputs) == 3 - assert set(outputs) == set(["predict_field", "field", "test_field"]) - print(problem_definition) - - # -------------------------------------------------------------------------# - def test_filter_fields_names(self, current_directory): - d_path = current_directory / "problem_definition" - problem = ProblemDefinition(d_path) - filter_in = problem.filter_input_fields_names(["predict_field", "test_field"]) - filter_out = problem.filter_output_fields_names(["predict_field", "test_field"]) - assert len(filter_in) == 2 and filter_in == ["predict_field", "test_field"] - assert filter_in != ["test_field", "predict_field"], "common inputs not sorted" - - assert len(filter_out) == 2 and filter_out == ["predict_field", "test_field"] - assert filter_out != ["test_field", "predict_field"], ( - "common outputs not sorted" - ) - - fail_filter_in = problem.filter_input_fields_names(["a_field"]) - fail_filter_out = problem.filter_output_fields_names(["b_field"]) - - assert fail_filter_in == [] - assert fail_filter_out == [] - - # -#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-# - # -------------------------------------------------------------------------# - def test_get_input_timeseries_names(self, problem_definition): - assert problem_definition.get_input_timeseries_names() == [] - - def test_add_input_timeseries_names_fail_same_name(self, problem_definition): - with pytest.raises(ValueError): - problem_definition.add_input_timeseries_names( - ["feature_name", "feature_name"] - ) - problem_definition.add_input_timeseries_name("feature_name") - with pytest.raises(ValueError): - problem_definition.add_input_timeseries_name("feature_name") - - def test_add_input_timeseries_names(self, problem_definition): - problem_definition.add_input_timeseries_names(["timeseries", "test_timeseries"]) - problem_definition.add_input_timeseries_name("predict_timeseries") - inputs = problem_definition.get_input_timeseries_names() - assert len(inputs) == 3 - assert set(inputs) == set( - ["predict_timeseries", "timeseries", "test_timeseries"] - ) - print(problem_definition) - - # -------------------------------------------------------------------------# - def test_get_output_timeseries_names(self, problem_definition): - assert problem_definition.get_output_timeseries_names() == [] - - def test_add_output_timeseries_names_fail(self, problem_definition): - with pytest.raises(ValueError): - problem_definition.add_output_timeseries_names( - ["feature_name", "feature_name"] - ) - problem_definition.add_output_timeseries_name("feature_name") - with pytest.raises(ValueError): - problem_definition.add_output_timeseries_name("feature_name") - - def test_add_output_timeseries_names(self, problem_definition): - problem_definition.add_output_timeseries_names( - ["timeseries", "test_timeseries"] - ) - problem_definition.add_output_timeseries_name("predict_timeseries") - outputs = problem_definition.get_output_timeseries_names() - assert len(outputs) == 3 - assert set(outputs) == set( - ["predict_timeseries", "timeseries", "test_timeseries"] - ) - print(problem_definition) - - # -------------------------------------------------------------------------# - def test_filter_timeseries_names(self, current_directory): - d_path = current_directory / "problem_definition" - problem = ProblemDefinition(d_path) - filter_in = problem.filter_input_timeseries_names( - ["predict_timeseries", "test_timeseries"] - ) - filter_out = 
problem.filter_output_timeseries_names( - ["predict_timeseries", "test_timeseries"] - ) - assert len(filter_in) == 2 and filter_in == [ - "predict_timeseries", - "test_timeseries", - ] - assert filter_in != ["test_timeseries", "predict_timeseries"], ( - "common inputs not sorted" - ) - - assert len(filter_out) == 2 and filter_out == [ - "predict_timeseries", - "test_timeseries", - ] - assert filter_out != ["test_timeseries", "predict_timeseries"], ( - "common outputs not sorted" - ) - - fail_filter_in = problem.filter_input_timeseries_names(["a_timeseries"]) - fail_filter_out = problem.filter_output_timeseries_names(["b_timeseries"]) - - assert fail_filter_in == [] - assert fail_filter_out == [] - - # -#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-# - # -------------------------------------------------------------------------# - def test_get_input_meshes_names(self, problem_definition): - assert problem_definition.get_input_meshes_names() == [] - - def test_add_input_meshes_names_fail_same_name(self, problem_definition): - with pytest.raises(ValueError): - problem_definition.add_input_meshes_names(["feature_name", "feature_name"]) - problem_definition.add_input_mesh_name("feature_name") - with pytest.raises(ValueError): - problem_definition.add_input_mesh_name("feature_name") - - def test_add_input_meshes_names(self, problem_definition): - problem_definition.add_input_meshes_names(["mesh", "test_mesh"]) - problem_definition.add_input_mesh_name("predict_mesh") - inputs = problem_definition.get_input_meshes_names() - assert len(inputs) == 3 - assert set(inputs) == set(["predict_mesh", "mesh", "test_mesh"]) - print(problem_definition) - - # -------------------------------------------------------------------------# - def test_get_output_meshes_names(self, problem_definition): - assert problem_definition.get_output_meshes_names() == [] - - def test_add_output_meshes_names_fail(self, problem_definition): - with pytest.raises(ValueError): - problem_definition.add_output_meshes_names(["feature_name", "feature_name"]) - problem_definition.add_output_mesh_name("feature_name") - with pytest.raises(ValueError): - problem_definition.add_output_mesh_name("feature_name") - - def test_add_output_meshes_names(self, problem_definition): - problem_definition.add_output_meshes_names(["mesh", "test_mesh"]) - problem_definition.add_output_mesh_name("predict_mesh") - outputs = problem_definition.get_output_meshes_names() - assert len(outputs) == 3 - assert set(outputs) == set(["predict_mesh", "mesh", "test_mesh"]) - print(problem_definition) - - # -------------------------------------------------------------------------# - def test_filter_meshes_names(self, current_directory): - d_path = current_directory / "problem_definition" - problem = ProblemDefinition(d_path) - print(f"{problem=}") - print(f"{problem.get_input_meshes_names()=}") - filter_in = problem.filter_input_meshes_names(["predict_mesh", "test_mesh"]) - filter_out = problem.filter_output_meshes_names(["predict_mesh", "test_mesh"]) - assert len(filter_in) == 2 and filter_in == ["predict_mesh", "test_mesh"] - assert filter_in != ["test_mesh", "predict_mesh"], "common inputs not sorted" - - assert len(filter_out) == 2 and filter_out == ["predict_mesh", "test_mesh"] - assert filter_out != ["test_mesh", "predict_mesh"], "common outputs not sorted" - - fail_filter_in = problem.filter_input_meshes_names(["a_mesh"]) - fail_filter_out = problem.filter_output_meshes_names(["b_mesh"]) - - assert fail_filter_in == [] - assert 
fail_filter_out == []
+    def test_from_path_single_definition(self, monkeypatch, tmp_path):
+        expected = ProblemDefinition(
+            name="pb_single",
+            task="regression",
+            input_features=["in_a"],
+            output_features=["out_a"],
+            train_split={"train_0": [0, 1]},
+            test_split={"test_0": [2]},
+        )
+
+        def fake_loader(path):
+            assert path == tmp_path
+            return {"pb_single": expected}
+
+        monkeypatch.setattr(
+            "plaid.storage.load_problem_definitions_from_disk", fake_loader
+        )
+
+        loaded = ProblemDefinition.from_path(tmp_path)
+        assert loaded.name == "pb_single"
+        assert loaded.task == "regression"
+        assert loaded.input_features == ["in_a"]
+        assert loaded.output_features == ["out_a"]
+        assert loaded.get_train_split_name() == "train_0"
+        assert loaded.get_test_split_name() == "test_0"
+        assert loaded.get_train_split_indices() == [0, 1]
+        assert loaded.get_test_split_indices() == [2]
+
+    def test_from_path_named_definition_and_override(self, monkeypatch, tmp_path):
+        pb_1 = ProblemDefinition(
+            name="pb_1",
+            task="regression",
+            input_features=["in_a"],
+            output_features=["out_a"],
+            train_split={"train_0": [0, 1]},
+            test_split={"test_0": [2]},
+        )
+        pb_2 = ProblemDefinition(
+            name="pb_2",
+            task="classification",
+            input_features=["in_b"],
+            output_features=["out_b"],
+            train_split={"train_1": [3, 4]},
+            test_split={"test_1": [5]},
+        )
+
+        def fake_loader(path):
+            assert path == tmp_path
+            return {"pb_1": pb_1, "pb_2": pb_2}
+
+        monkeypatch.setattr(
+            "plaid.storage.load_problem_definitions_from_disk", fake_loader
+        )
+
+        loaded = ProblemDefinition.from_path(
+            tmp_path,
+            name="pb_2",
+            score_function="RRMSE",
+        )
+        assert loaded.name == "pb_2"
+        assert loaded.task == "classification"
+        assert loaded.score_function == "RRMSE"
+
+    def test_from_path_unknown_name_raises(self, monkeypatch, tmp_path):
+        pb = ProblemDefinition(
+            name="existing",
+            task="regression",
+            input_features=["in_a"],
+            output_features=["out_a"],
+            train_split={"train_0": [0, 1]},
+            test_split={"test_0": [2]},
+        )
+
+        monkeypatch.setattr(
+            "plaid.storage.load_problem_definitions_from_disk",
+            lambda path: {"existing": pb},
+        )
+
+        with pytest.raises(ValueError, match="Problem definition 'missing' not found"):
+            ProblemDefinition.from_path(tmp_path, name="missing")
+
+    def test_from_path_requires_name_when_multiple(self, monkeypatch, tmp_path):
+        pb_1 = ProblemDefinition(
+            name="pb_1",
+            task="regression",
+            input_features=["in_a"],
+            output_features=["out_a"],
+            train_split={"train_0": [0, 1]},
+            test_split={"test_0": [2]},
+        )
+        pb_2 = ProblemDefinition(
+            name="pb_2",
+            task="classification",
+            input_features=["in_b"],
+            output_features=["out_b"],
+            train_split={"train_1": [3, 4]},
+            test_split={"test_1": [5]},
+        )
+
+        monkeypatch.setattr(
+            "plaid.storage.load_problem_definitions_from_disk",
+            lambda path: {"pb_1": pb_1, "pb_2": pb_2},
+        )
+
+        with pytest.raises(RuntimeError, match="more than one Problem definition"):
+            ProblemDefinition.from_path(tmp_path)
+
+    def test_feature_validators_reject_duplicates(self):
+        with pytest.raises(
+            ValidationError, match="duplicated values in input_features"
+        ):
+            ProblemDefinition(input_features=["a", "a"])
+
+        with pytest.raises(
+            ValidationError, match="duplicated values in output_features"
+        ):
+            ProblemDefinition(output_features=["a", "a"])
+
+    def test_split_validator_rejects_more_than_one_key(self):
+        with pytest.raises(ValidationError, match="Splits only support one element"):
+            ProblemDefinition(train_split={"train_1": [0], "train_2": [1]})
+
+    def test_non_overwritable_attributes_raise(self, problem_definition):
+        problem_definition.name = "problem_a"
+        with pytest.raises(AttributeError, match="'name' is already set"):
+            problem_definition.name = "problem_b"
+
+        problem_definition.task = "regression"
+        with pytest.raises(AttributeError, match="'task' is already set"):
+            problem_definition.task = "classification"
+
+        problem_definition.score_function = "RRMSE"
+        with pytest.raises(AttributeError, match="'score_function' is already set"):
+            problem_definition.score_function = "MSE"
+
+    def test_split_replacement_logs_warning(self, problem_definition, caplog):
+        problem_definition.train_split = {"train_0": [0, 1]}
+        with caplog.at_level("WARNING"):
+            problem_definition.train_split = {"train_1": [2, 3]}
+
+        assert "already exists -> data will be replaced" in caplog.text
+
+    def test_get_split_paths(self, problem_definition):
+        problem_definition.train_split = {"train_0": [0, 1, 2]}
+        problem_definition.test_split = {"test_0": [3, 4]}
+
+        assert problem_definition.get_train_split_name() == "train_0"
+        assert problem_definition.get_test_split_name() == "test_0"
+        assert problem_definition.get_train_split_indices() == [0, 1, 2]
+        assert problem_definition.get_test_split_indices() == [3, 4]
+
+    def test_add_feature_identifiers_duplicate_checks(self, problem_definition):
+        problem_definition.add_in_features_identifiers(["in_1", "in_2"])
+        with pytest.raises(ValueError, match="in_1 is already in"):
+            problem_definition.add_in_features_identifiers("in_1")
+        with pytest.raises(
+            ValueError, match="Some input features share the same identifier"
+        ):
+            problem_definition.add_in_features_identifiers(["x", "x"])
+
+        problem_definition.add_out_features_identifiers(["out_1", "out_2"])
+        with pytest.raises(ValueError, match="out_1 is already in"):
+            problem_definition.add_out_features_identifiers("out_1")
+        with pytest.raises(
+            ValueError, match="Some output features share the same identifier"
+        ):
+            problem_definition.add_out_features_identifiers(["y", "y"])
     # -------------------------------------------------------------------------#
     def test_split(self, problem_definition):
-        new_split = {"train": [0, 1, 2], "test": [3, 4]}
-        problem_definition.set_split(new_split)
-        assert problem_definition.get_split("train") == [0, 1, 2]
-        assert problem_definition.get_split("test") == [3, 4]
-
-        all_split = problem_definition.get_split()
-        assert all_split["train"] == [0, 1, 2] and all_split["test"] == [3, 4]
-        assert problem_definition.get_all_indices() == [0, 1, 2, 3, 4]
-
-    def test_train_split(self, problem_definition):
-        train_split = {"train1": [0, 1, 2], "train2": [3, 4]}
-        problem_definition.set_train_split(train_split)
-        problem_definition.get_train_split()
-        assert problem_definition.get_train_split("train1") == [0, 1, 2]
-        assert problem_definition.get_train_split("train2") == [3, 4]
-
-    def test_test_split(self, problem_definition):
-        test_split = {"test1": [0, 1, 2], "test2": [3, 4]}
-        problem_definition.set_test_split(test_split)
-        problem_definition.get_test_split()
-        assert problem_definition.get_test_split("test1") == [0, 1, 2]
-        assert problem_definition.get_test_split("test2") == [3, 4]
-
-    # -------------------------------------------------------------------------#
-    def test__save_to_dir_(
-        self, problem_definition_full: ProblemDefinition, tmp_path: Path
-    ):
-        problem_definition_full._save_to_dir_(tmp_path / "problem_definition")
-
-    def test_save_to_dir(
-        self, problem_definition_full: ProblemDefinition, tmp_path: Path
-    ):
-        problem_definition_full.save_to_dir(tmp_path / "problem_definition")
-
-    def test_load_path_object(self, current_directory):
-        my_dir = Path(current_directory)
-        ProblemDefinition(my_dir / "problem_definition")
-
-    def test___init___path(
-        self, problem_definition_full: ProblemDefinition, tmp_path: Path
-    ):
-        d_path = tmp_path / "problem_definition"
-        problem_definition_full._save_to_dir_(d_path)
-        #
-        problem = ProblemDefinition(d_path)
-        assert problem.get_task() == "regression"
-        assert set(problem.get_input_scalars_names()) == set(
-            ["predict_scalar", "scalar", "test_scalar"]
-        )
-        assert set(problem.get_output_scalars_names()) == set(
-            ["predict_scalar", "scalar", "test_scalar"]
-        )
-        all_split = problem.get_split()
-        assert all_split["train"] == [0, 1, 2] and all_split["test"] == [3, 4]
-
-    def test__load_from_dir_(
-        self, problem_definition_full: ProblemDefinition, tmp_path: Path
-    ):
-        d_path = tmp_path / "problem_definition"
-        problem_definition_full._save_to_dir_(d_path)
-        #
-        problem = ProblemDefinition()
-        problem._load_from_dir_(d_path)
-        assert problem.get_task() == "regression"
-        assert set(problem.get_input_scalars_names()) == set(
-            ["predict_scalar", "scalar", "test_scalar"]
-        )
-        assert set(problem.get_output_scalars_names()) == set(
-            ["predict_scalar", "scalar", "test_scalar"]
-        )
-        all_split = problem.get_split()
-        assert all_split["train"] == [0, 1, 2] and all_split["test"] == [3, 4]
+        problem_definition.train_split = {"train_0": [0, 1, 2]}
+        problem_definition.test_split = {"test-1": [3, 4]}
+        assert problem_definition.train_split == {"train_0": [0, 1, 2]}
+        assert problem_definition.test_split == {"test-1": [3, 4]}
     def test__load_from_file_(
         self, problem_definition_full: ProblemDefinition, tmp_path: Path
     ):
+        path = tmp_path / "pb_def"
         problem_definition_full.save_to_file(path)
-        #
         problem = ProblemDefinition()
         problem._load_from_file_(path)
-        assert problem.get_task() == "regression"
-        assert set(problem.get_input_scalars_names()) == set(
-            ["predict_scalar", "scalar", "test_scalar"]
-        )
-        assert set(problem.get_output_scalars_names()) == set(
-            ["predict_scalar", "scalar", "test_scalar"]
+        assert problem.task == "regression"
+        assert set(problem.input_features) == set(
+            [
+                "Base_2_2/Zone/PointData/sig12",
+                "Base_2_2/Zone/PointData/U1",
+                "Base_2_2/Zone/PointData/U2",
+                "Global/predict_feature",
+                "Global/test_feature",
+                "Global/feature",
+            ]
+        )
+        assert set(problem.output_features) == set(
+            [
+                "Global/predict_feature",
+                "Base_2_2/Zone/PointData/sig12",
+                "Global/feature",
+                "Base_2_2/Zone/PointData/U2",
+                "Global/test_feature",
+            ]
         )
-    def test_load(self, problem_definition_full: ProblemDefinition, tmp_path: Path):
-        d_path = tmp_path / "problem_definition"
-        problem_definition_full._save_to_dir_(d_path)
-        #
-        problem = ProblemDefinition.load(d_path)
-        assert problem.get_task() == "regression"
-        assert problem.get_name() == "regression_1"
-        assert set(problem.get_input_scalars_names()) == set(
-            ["predict_scalar", "scalar", "test_scalar"]
-        )
-        assert set(problem.get_output_scalars_names()) == set(
-            ["predict_scalar", "scalar", "test_scalar"]
-        )
-        all_split = problem.get_split()
-        assert all_split["train"] == [0, 1, 2] and all_split["test"] == [3, 4]
-
-    def test__load_from_dir__old_version(
-        self, problem_definition_full: ProblemDefinition, tmp_path: Path
-    ):
-        d_path = tmp_path / "problem_definition"
-        problem_definition_full._save_to_dir_(d_path)
-        # Modify the plaid version in saved file
-        infos_path = d_path / "problem_infos.yaml"
-        with infos_path.open("r") as f:
-            text = f.read().splitlines()
-            text.pop()
-            text.append("version: 0.1.7")
-            text.append("")
-            infos_path.write_text("\n".join(text))
-
-        # Load the problem definition from the directory
-        problem = ProblemDefinition.load(d_path)
-        assert problem.get_version() == Version("0.1.7")
-
-    def test__load_from_dir__empty_dir(self, tmp_path):
-        problem = ProblemDefinition()
-        with pytest.raises(FileNotFoundError):
-            problem._load_from_dir_(tmp_path)
-
-    def test__load_from_dir__non_existing_dir(self):
-        problem = ProblemDefinition()
-        non_existing_dir = Path("non_existing_path")
-        with pytest.raises(FileNotFoundError):
-            problem._load_from_dir_(non_existing_dir)
-
     def test__load_from_file__non_existing_file(self):
         problem = ProblemDefinition()
         non_existing_path = Path("non_existing_path")
         with pytest.raises(FileNotFoundError):
             problem._load_from_file_(non_existing_path)
-
-    def test__load_from_dir__path_is_file(self, tmp_path):
-        problem = ProblemDefinition()
-        file_path = tmp_path / "file.yaml"
-        file_path.touch()  # Create an empty file
-        with pytest.raises(FileExistsError):
-            problem._load_from_dir_(file_path)
-
-    def test_extract_problem_definition_from_identifiers(self, problem_definition):
-        in_id_1 = FeatureIdentifier({"type": "scalar", "name": "in_1"})
-        in_id_2 = FeatureIdentifier({"type": "scalar", "name": "in_2"})
-        out_id_1 = FeatureIdentifier({"type": "scalar", "name": "out_1"})
-        out_id_2 = FeatureIdentifier({"type": "scalar", "name": "out_2"})
-
-        problem_definition.add_in_features_identifiers([in_id_1, in_id_2])
-        problem_definition.add_out_features_identifiers([out_id_1, out_id_2])
-        problem_definition.set_task("regression")
-        problem_definition.set_name("regression_1")
-        with pytest.raises(ValueError):
-            problem_definition.set_name("regression_2")
-        problem_definition.set_split({"train": [0, 1], "test": [2, 3]})
-
-        sub_problem_definition = (
-            problem_definition.extract_problem_definition_from_identifiers(
-                [in_id_1, out_id_1]
-            )
-        )
-
-        assert sub_problem_definition.get_in_features_identifiers() == [in_id_1]
-        assert sub_problem_definition.get_out_features_identifiers() == [out_id_1]
-        assert sub_problem_definition.get_version() == problem_definition.get_version()
-        assert sub_problem_definition.get_task() == "regression"
-        assert sub_problem_definition.get_name() == "regression_1"
-        assert sub_problem_definition.get_split() == {"train": [0, 1], "test": [2, 3]}
diff --git a/tests/types/test_cgns_types.py b/tests/types/test_cgns_types.py
new file mode 100644
index 00000000..3713dc95
--- /dev/null
+++ b/tests/types/test_cgns_types.py
@@ -0,0 +1,60 @@
+import importlib.util
+import runpy
+from pathlib import Path
+
+import pytest
+
+
+def _load_cgns_types_module():
+    module_path = (
+        Path(__file__).resolve().parents[2] / "src" / "plaid" / "types" / "cgns_types.py"
+    )
+    spec = importlib.util.spec_from_file_location("cgns_types_for_test", module_path)
+    assert spec is not None
+    assert spec.loader is not None
+    module = importlib.util.module_from_spec(spec)
+    spec.loader.exec_module(module)
+    return module
+
+
+def test_cgns_node_and_tree_alias():
+    cgns_types = _load_cgns_types_module()
+    child = cgns_types.CGNSNode(name="Child", value=1, label="DataArray_t")
+    root = cgns_types.CGNSNode(
+        name="Root",
+        value=None,
+        children=[child],
+        label="CGNSTree_t",
+    )
+
+    assert root.name == "Root"
+    assert root.children[0].name == "Child"
+    assert cgns_types.CGNSTree is cgns_types.CGNSNode
+
+
+def test_cgns_path_properties_and_zone_method():
+    cgns_types = _load_cgns_types_module()
+    path = cgns_types.CGNSPath("Base_1_0/Zone/GridCoordinates")
+
+    assert path.root == "Base_1_0/Zone/GridCoordinates"
+    assert path.path == "Base_1_0/Zone/GridCoordinates"
+    assert path.base == "Base_1_0"
+    assert path.zone() == "Zone"
+
+
+def test_cgns_path_rejects_invalid_pattern():
+    cgns_types = _load_cgns_types_module()
+    with pytest.raises(ValueError, match="Invalid CGNS variable format"):
+        cgns_types.CGNSPath("InvalidPath")
+
+
+def test_module_main_example_runs(capsys):
+    module_path = (
+        Path(__file__).resolve().parents[2] / "src" / "plaid" / "types" / "cgns_types.py"
+    )
+    runpy.run_path(str(module_path), run_name="__main__")
+
+    out = capsys.readouterr().out
+    assert "Valid path: Base_1_0/Zone/GridCoordinates" in out
+    assert "Valid path: Base_0_0/Normal/Normals" in out
+    assert "Invalid path error:" in out
\ No newline at end of file
diff --git a/tests/utils/__init__.py b/tests/utils/__init__.py
index a9efb940..e69de29b 100644
--- a/tests/utils/__init__.py
+++ b/tests/utils/__init__.py
@@ -1,6 +0,0 @@
-# -*- coding: utf-8 -*-
-#
-# This file is subject to the terms and conditions defined in
-# file 'LICENSE.txt', which is part of this source code package.
-#
-#
diff --git a/tests/utils/test_base.py b/tests/utils/test_base.py
index d468b9bb..8d728acc 100644
--- a/tests/utils/test_base.py
+++ b/tests/utils/test_base.py
@@ -1,10 +1,3 @@
-# -*- coding: utf-8 -*-
-#
-# This file is subject to the terms and conditions defined in
-# file 'LICENSE.txt', which is part of this source code package.
-#
-#
-
 # %% Imports
diff --git a/tests/utils/test_cgns_helper.py b/tests/utils/test_cgns_helper.py
index e1dfe97f..536b917a 100644
--- a/tests/utils/test_cgns_helper.py
+++ b/tests/utils/test_cgns_helper.py
@@ -1,10 +1,3 @@
-# -*- coding: utf-8 -*-
-#
-# This file is subject to the terms and conditions defined in
-# file 'LICENSE.txt', which is part of this source code package.
-#
-#
-
 # %% Imports

 import copy
diff --git a/tests/utils/test_cgns_worker.py b/tests/utils/test_cgns_worker.py
index 53d5ed36..5203668d 100644
--- a/tests/utils/test_cgns_worker.py
+++ b/tests/utils/test_cgns_worker.py
@@ -1,10 +1,3 @@
-# -*- coding: utf-8 -*-
-#
-# This file is subject to the terms and conditions defined in
-# file 'LICENSE.txt', which is part of this source code package.
-#
-#
-
 # %% Imports
diff --git a/tests/utils/test_deprecation.py b/tests/utils/test_deprecation.py
deleted file mode 100644
index acc65775..00000000
--- a/tests/utils/test_deprecation.py
+++ /dev/null
@@ -1,137 +0,0 @@
-# -*- coding: utf-8 -*-
-#
-# This file is subject to the terms and conditions defined in
-# file 'LICENSE.txt', which is part of this source code package.
-# -# - -# %% Imports - -import warnings - -import pytest - -import plaid.utils.deprecation as dep - -# %% Tests - - -def test_deprecated_function_warns(): - @dep.deprecated("use new_func instead", version="1.0", removal="2.0") - def old_func(x): - return x * 2 - - # Capture warning - with pytest.warns(DeprecationWarning, match="use new_func instead"): - result = old_func(5) - - assert result == 10 - - -def test_deprecated_class_warns(): - @dep.deprecated("use NewClass instead", version="1.1") - class OldClass: - def __init__(self, val): - self.val = val - - with pytest.warns(DeprecationWarning, match="use NewClass instead"): - obj = OldClass(42) - - assert obj.val == 42 - - -def test_deprecated_invalid_type(): - decorator = dep.deprecated("invalid use") - with pytest.raises( - TypeError, - match="@deprecated decorator with non-None category must be applied to a class or callable, not *", - ): - decorator(pytest) - with pytest.raises( - TypeError, - match=f"@deprecated decorator with non-None category must be applied to a class or callable, not {3!r}", - ): - decorator(3) - with pytest.raises( - TypeError, - match=f"@deprecated decorator with non-None category must be applied to a class or callable, not {3.14!r}", - ): - decorator(3.14) - with pytest.raises( - TypeError, - match=f"@deprecated decorator with non-None category must be applied to a class or callable, not {'test'!r}", - ): - decorator("test") - - -def test_deprecated_argument_warns_and_converts(): - @dep.deprecated_argument( - "old", "new", converter=lambda v: v + 1, version="1.2", removal="2.0" - ) - def func(new): - return new * 2 - - # Using old argument -> should warn and convert - with pytest.warns( - DeprecationWarning, match="Argument `old` is deprecated, use `new` instead." - ): - result = func(old=3) - assert result == 8 # (3+1)*2 - - # Using new argument directly -> no warning - with warnings.catch_warnings(record=True) as w: - warnings.simplefilter("always") - result2 = func(new=5) - assert result2 == 10 - assert len(w) == 0 - - -def test_deprecated_argument_no_old_arg(): - """Calling a function without the deprecated argument should just work.""" - - @dep.deprecated_argument("old", "new") - def func(new): - return new * 3 - - result = func(new=4) - assert result == 12 - - -def test_deprecated_fallback(monkeypatch): - """Simulate Python < 3.13 where warnings.deprecated does not exist.""" - - monkeypatch.setattr(dep, "deprecated_builtin", None) - - @dep.deprecated("use fallback", version="9.9", removal="10.0") - def legacy(x): - return x + 1 - - with pytest.warns(DeprecationWarning, match="use fallback"): - assert legacy(4) == 5 - - @dep.deprecated("old class fallback") - class LegacyClass: - def __init__(self, x): - self.x = x - - with pytest.warns(DeprecationWarning, match="old class fallback"): - obj = LegacyClass(7) - assert obj.x == 7 - - @dep.deprecated_argument("old", "new", version="9.9", removal="10.0") - def func(new): - return new - - with pytest.warns(DeprecationWarning, match="Argument `old` is deprecated"): - assert func(old=123) == 123 - - -def test_deprecated_argument_converter_identity(): - """Ensure converter default (identity) is applied.""" - - @dep.deprecated_argument("x", "y") - def func(y): - return y - - with pytest.warns(DeprecationWarning, match="Argument `x` is deprecated"): - assert func(x="hello") == "hello" diff --git a/tests/utils/test_info.py b/tests/utils/test_info.py new file mode 100644 index 00000000..c4623f4c --- /dev/null +++ b/tests/utils/test_info.py @@ -0,0 +1,36 @@ +import 
pytest
+
+from plaid.utils.info import normalize_infos, validate_required_infos, verify_info
+
+
+def test_verify_info_accepts_special_internal_keys():
+    infos = {
+        "legal": {"owner": "owner", "license": "cc-by-4.0"},
+        "num_samples": {"train": 10},
+        "storage_backend": "zarr",
+    }
+    verify_info(infos)
+
+
+def test_verify_info_rejects_unknown_category():
+    with pytest.raises(KeyError):
+        verify_info({"unknown": {"x": "y"}})
+
+
+def test_verify_info_rejects_unknown_key():
+    with pytest.raises(KeyError):
+        verify_info({"legal": {"unknown_key": "v"}})
+
+
+def test_validate_required_infos_missing_required_key():
+    with pytest.raises(ValueError):
+        validate_required_infos({"legal": {"owner": "someone"}})
+
+
+def test_normalize_infos_adds_plaid_section_and_copies():
+    infos = {"legal": {"owner": "owner", "license": "cc-by-4.0"}}
+    normalized = normalize_infos(infos)
+
+    assert "plaid" in normalized
+    assert normalized["plaid"] == {}
+    assert "plaid" not in infos
diff --git a/tests/utils/test_init_with_tabular.py b/tests/utils/test_init_with_tabular.py
deleted file mode 100644
index 1a5bcfc7..00000000
--- a/tests/utils/test_init_with_tabular.py
+++ /dev/null
@@ -1,66 +0,0 @@
-# -*- coding: utf-8 -*-
-#
-# This file is subject to the terms and conditions defined in
-# file 'LICENSE.txt', which is part of this source code package.
-#
-#
-
-# %% Imports
-
-import numpy as np
-import pytest
-
-from plaid.utils.init_with_tabular import initialize_dataset_with_tabular_data
-
-# %% Fixtures
-
-
-@pytest.fixture()
-def nb_samples():
-    return 400
-
-
-@pytest.fixture()
-def scalar_tabular_data(nb_samples):
-    return {
-        "scalar_name_1": np.random.randn(nb_samples),
-        "scalar_name_2": np.random.randn(nb_samples),
-    }
-
-
-@pytest.fixture()
-def quantity_tabular_data(nb_samples):
-    nx = 11
-    ny = 7
-    nz = 5
-    return {
-        "test_scalar": np.random.randn(nb_samples),
-        "test_1D_field": np.random.randn(nb_samples, nx),
-        "test_2D_field": np.random.randn(nb_samples, nx, ny),
-        "test_3D_field": np.random.randn(nb_samples, nx, ny, nz),
-    }
-
-
-# %% Tests
-
-
-class Test_initialize_dataset_with_tabular_data:
-    def test_initialize_dataset_with_tabular_data(
-        self, scalar_tabular_data, nb_samples
-    ):
-        dataset = initialize_dataset_with_tabular_data(scalar_tabular_data)
-        assert len(dataset) == nb_samples
-
-        sample_1 = dataset[1]
-        scalar_value = sample_1.get_scalar("scalar_name_1")
-        assert isinstance(scalar_value, float)
-
-    def test_initialize_dataset_with_quantity_tabular_data(
-        self, quantity_tabular_data, nb_samples
-    ):
-        dataset = initialize_dataset_with_tabular_data(quantity_tabular_data)
-        assert len(dataset) == nb_samples
-
-        # scalar_names = ["test_scalar", "test_1D_field", "test_2D_field"]
-        # tabular_data_subset = dataset.get_scalars_to_tabular(scalar_names)
-        # assert isinstance(tabular_data_subset, dict)
diff --git a/tests/utils/test_interpolation.py b/tests/utils/test_interpolation.py
deleted file mode 100644
index 6a0fb346..00000000
--- a/tests/utils/test_interpolation.py
+++ /dev/null
@@ -1,227 +0,0 @@
-# -*- coding: utf-8 -*-
-#
-# This file is subject to the terms and conditions defined in
-# file 'LICENSE.txt', which is part of this source code package.
-# -# - -import numpy as np -import pytest - -from plaid.utils.interpolation import ( - binary_search, - binary_search_vectorized, - piece_wise_linear_interpolation, - piece_wise_linear_interpolation_vectorized, - piece_wise_linear_interpolation_vectorized_with_map, - piece_wise_linear_interpolation_with_map, -) - - -@pytest.fixture() -def time_indices(): - return np.array([0.0, 1.0, 2.5]) - - -@pytest.fixture() -def vectors(): - return np.array([np.ones(5), 2.0 * np.ones(5), 3.0 * np.ones(5)]) - - -@pytest.fixture() -def vectors_map(): - return ["vec1", "vec2", "vec1"] - - -@pytest.fixture() -def vectors_dict(): - return {"vec1": np.ones(5), "vec2": 2.0 * np.ones(5)} - - -@pytest.fixture() -def input_values(): - return np.array([-0.1, 2.0, 3.0]) - - -@pytest.fixture() -def time_indices_bis(): - return np.array( - [ - 0.0, - 100.0, - 200.0, - 300.0, - 400.0, - 500.0, - 600.0, - 700.0, - 800.0, - 900.0, - 1000.0, - 2000.0, - ] - ) - - -@pytest.fixture() -def coefficients(): - return np.array( - [ - 2000000.0, - 2200000.0, - 2400000.0, - 2000000.0, - 2400000.0, - 3000000.0, - 2500000.0, - 2400000.0, - 2100000.0, - 2800000.0, - 4000000.0, - 3000000.0, - ] - ) - - -@pytest.fixture() -def vals(): - return np.array( - [ - -10.0, - 0.0, - 100.0, - 150.0, - 200.0, - 300.0, - 400.0, - 500.0, - 600.0, - 700.0, - 800.0, - 900.0, - 1000.0, - 3000.0, - 701.4752695491923, - ] - ) - - -class Test_sinterpolation: - def test_piece_wise_linear_interpolation_1(self, time_indices, vectors): - result = piece_wise_linear_interpolation(-1.0, time_indices, vectors) - np.testing.assert_almost_equal(result, [1.0, 1.0, 1.0, 1.0, 1.0]) - - def test_piece_wise_linear_interpolation_2(self, time_indices, vectors): - result = piece_wise_linear_interpolation(1.0, time_indices, vectors) - np.testing.assert_almost_equal(result, [2.0, 2.0, 2.0, 2.0, 2.0]) - - def test_piece_wise_linear_interpolation_3(self, time_indices, vectors): - result = piece_wise_linear_interpolation(0.4, time_indices, vectors) - np.testing.assert_almost_equal(result, [1.4, 1.4, 1.4, 1.4, 1.4]) - - def test_piece_wise_linear_interpolation_with_map_1( - self, time_indices, vectors_map, vectors_dict - ): - result = piece_wise_linear_interpolation_with_map( - 3.0, time_indices, vectors_dict, vectors_map - ) - np.testing.assert_almost_equal(result, [1.0, 1.0, 1.0, 1.0, 1.0]) - - def test_piece_wise_linear_interpolation_with_map_2( - self, time_indices, vectors_map, vectors_dict - ): - result = piece_wise_linear_interpolation_with_map( - 1.0, time_indices, vectors_dict, vectors_map - ) - np.testing.assert_almost_equal(result, [2.0, 2.0, 2.0, 2.0, 2.0]) - - def test_piece_wise_linear_interpolation_with_map_3( - self, time_indices, vectors_map, vectors_dict - ): - result = piece_wise_linear_interpolation_with_map( - 0.6, time_indices, vectors_dict, vectors_map - ) - np.testing.assert_almost_equal(result, [1.6, 1.6, 1.6, 1.6, 1.6]) - - def test_piece_wise_linear_interpolation_vectorized_with_map( - self, input_values, time_indices, vectors_dict, vectors_map - ): - result = piece_wise_linear_interpolation_vectorized_with_map( - input_values, time_indices, vectors_dict, vectors_map - ) - - expected_result = [ - np.array([1.0, 1.0, 1.0, 1.0, 1.0]), - np.array([1.33333333, 1.33333333, 1.33333333, 1.33333333, 1.33333333]), - np.array([1.0, 1.0, 1.0, 1.0, 1.0]), - ] - - np.testing.assert_almost_equal(result, expected_result) - - def test_piece_wise_linear_interpolation_loop( - self, time_indices_bis, coefficients, vals - ): - expected_result = np.array( - [ - 
2000000.0, - 2000000.0, - 2200000.0, - 2300000.0, - 2400000.0, - 2000000.0, - 2400000.0, - 3000000.0, - 2500000.0, - 2400000.0, - 2100000.0, - 2800000.0, - 4000000.0, - 3000000.0, - 2395574.19135242, - ] - ) - - for i in range(vals.shape[0]): - assert ( - piece_wise_linear_interpolation(vals[i], time_indices_bis, coefficients) - - expected_result[i] - ) / expected_result[i] < 1.0e-10 - - def test_piece_wise_linear_interpolation_vectorized( - self, time_indices_bis, coefficients, vals - ): - result = piece_wise_linear_interpolation_vectorized( - np.array(vals), time_indices_bis, coefficients - ) - - expected_result = [ - 2000000.0, - 2000000.0, - 2200000.0, - 2300000.0, - 2400000.0, - 2000000.0, - 2400000.0, - 3000000.0, - 2500000.0, - 2400000.0, - 2100000.0, - 2800000.0, - 4000000.0, - 3000000.0, - 2395574.1913524233, - ] - - np.testing.assert_almost_equal(result, expected_result) - - def test_binary_search(self): - test_list = np.array([0.0, 1.0, 2.5, 10.0]) - val_list = np.array([-1.0, 11.0, 0.6, 2.0, 2.6, 9.9, 1.0]) - - # Apply binary search to find indices for given values within a reference list - ref = np.array([0, 3, 0, 1, 2, 2, 1], dtype=int) - result = binary_search_vectorized(test_list, val_list) - - for i, val in enumerate(val_list): - assert binary_search(test_list, val) == ref[i] - assert result[i] == ref[i] diff --git a/tests/utils/test_split.py b/tests/utils/test_split.py deleted file mode 100644 index ca760f24..00000000 --- a/tests/utils/test_split.py +++ /dev/null @@ -1,273 +0,0 @@ -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - -# %% Imports - -import numpy as np -import pytest - -from plaid.utils.init_with_tabular import initialize_dataset_with_tabular_data -from plaid.utils.split import mmd_subsample_fn, split_dataset - -# %% Fixtures - - -@pytest.fixture() -def nb_scalars(): - return 7 - - -@pytest.fixture() -def nb_samples(): - return 70 - - -@pytest.fixture() -def tabular_data(nb_scalars, nb_samples): - return {f"scalar_{j}": np.random.randn(nb_samples) for j in range(nb_scalars)} - - -@pytest.fixture() -def tabular_X(tabular_data): - return np.stack([v for v in tabular_data.values()]).T - - -@pytest.fixture() -def dataset(tabular_data): - return initialize_dataset_with_tabular_data(tabular_data) - - -# %% Tests - - -class Test_split_dataset: - def test_ratios(self, dataset): - options = { - "shuffle": True, - "split_ratios": { - "train": 0.8, - "val": 0.1, - "test": 0.1, - }, - "unknown": { # it will be ignored - "train": 0.8, - "val": 0.1, - "test": 0.1, - }, - } - split = split_dataset(dataset, options) - assert len(split["train"]) == 56 - assert len(split["val"]) == 7 - assert len(split["test"]) == 7 - - result = np.concatenate((split["train"], split["val"], split["test"]), axis=0) - assert len(set(result.tolist())) == len(result) - - def test_ratios_other(self, dataset): - options = { - "shuffle": True, - "split_ratios": { - "train": 0.8, - "val": 0.1, - }, - } - split = split_dataset(dataset, options) - assert len(split["train"]) == 56 - assert len(split["val"]) == 7 - assert len(split["other"]) == 7 - - result = np.concatenate((split["train"], split["val"], split["other"]), axis=0) - assert len(set(result.tolist())) == len(result) - - def test_split_size(self, dataset): - options = { - "shuffle": True, - "split_sizes": { - "train": 40, - "val": 20, - "test": 10, - }, - } - split = split_dataset(dataset, options) - assert len(split["train"]) == 
40 - assert len(split["val"]) == 20 - assert len(split["test"]) == 10 - - result = np.concatenate((split["train"], split["val"], split["test"]), axis=0) - assert len(set(result.tolist())) == len(result) - - def test_split_size_other(self, dataset): - options = { - "shuffle": True, - "split_sizes": { - "train": 40, - "val": 10, - }, - } - split = split_dataset(dataset, options) - assert len(split["train"]) == 40 - assert len(split["val"]) == 10 - assert len(split["other"]) == 20 - - result = np.concatenate((split["train"], split["val"], split["other"]), axis=0) - assert len(set(result.tolist())) == len(result) - - def test_split_ids_unique_use(self, dataset): - options = { - "split_ids": { - "train": np.arange(30), - "val": np.arange(30, 60), - "predict": np.arange(60, 70), - } - } - split = split_dataset(dataset, options) - assert len(split["train"]) == 30 - assert len(split["val"]) == 30 - assert len(split["predict"]) == 10 - - result = np.concatenate( - (split["train"], split["val"], split["predict"]), axis=0 - ) - assert len(set(result.tolist())) == len(result) - - def test_split_ids(self, dataset): - options = { - "shuffle": True, - "split_ids": { - "train": np.arange(30), - "val": np.arange(30, 70), - "predict": np.arange(25, 35), - }, - } - split = split_dataset(dataset, options) - assert len(split["train"]) == 30 - assert len(split["val"]) == 40 - assert len(split["predict"]) == 10 - - def test_split_ids_other(self, dataset): - options = { - "split_ids": { - "train": np.arange(20), - "val": np.arange(30, 60), - "predict": np.arange(25, 35), - } - } - split = split_dataset(dataset, options) - assert len(split["train"]) == 20 - assert len(split["val"]) == 30 - assert len(split["predict"]) == 10 - assert len(split["other"]) == 15 - - def test_split_ratios_and_sizes(self, dataset): - options = { - "shuffle": True, - "split_ratios": { - "train": 0.8, - "test": 0.1, - }, - "split_sizes": {"val": 7}, - } - split = split_dataset(dataset, options) - assert len(split["train"]) == 56 - assert len(split["test"]) == 7 - assert len(split["val"]) == 7 - - result = np.concatenate((split["train"], split["val"], split["test"]), axis=0) - assert len(set(result.tolist())) == len(result) - - def test_split_ratios_and_sizes_other(self, dataset): - options = { - "shuffle": True, - "split_ratios": { - "train": 0.7, - "test": 0.1, - }, - "split_sizes": {"val": 7}, - } - split = split_dataset(dataset, options) - assert len(split["train"]) == 49 - assert len(split["test"]) == 7 - assert len(split["val"]) == 7 - assert len(split["other"]) == 7 - - result = np.concatenate( - (split["train"], split["val"], split["test"], split["other"]), axis=0 - ) - assert len(set(result.tolist())) == len(result) - - def test_split_ids_out_of_bounds(self, dataset): - with pytest.raises(ValueError): - split_dataset( - dataset, {"shuffle": True, "split_ids": {"train": np.arange(-1, 69)}} - ) - with pytest.raises(ValueError): - split_dataset( - dataset, {"shuffle": True, "split_ids": {"train": np.arange(0, 80)}} - ) - - def test_split_ratios_sizes_out_of_bounds(self, dataset): - with pytest.raises(AssertionError): - split_dataset( - dataset, - { - "shuffle": True, - "split_ratios": {"train": 0.8, "predict": 0.05, "test": 0.1}, - "split_sizes": {"val": 80}, - }, - ) - - def test_ratios_error(self, dataset): - options = { - "shuffle": True, - "split_ratios": { - "train": 0.8, - "val": 0.8, - }, - } - with pytest.raises(AssertionError): - split_dataset(dataset, options) - - def test_fail_other(self, dataset): - # 'other' key name is 
not authorized - with pytest.raises(ValueError): - split_dataset(dataset, {"split_ratios": {"other": 0.8}}) - with pytest.raises(ValueError): - split_dataset(dataset, {"split_sizes": {"other": 1}}) - with pytest.raises(ValueError): - split_dataset(dataset, {"split_ids": {"other": 1}}) - - def test_various_assertion_cases(self, dataset): - # Incompatible strategies - with pytest.raises(AssertionError): - split_dataset(dataset, {"split_ids": {"a": 1}, "split_ratios": {"b": 0.2}}) - with pytest.raises(AssertionError): - split_dataset(dataset, {"split_ids": {"a": 1}, "split_sizes": {"b": 1}}) - # Same key name 'a' - with pytest.raises(AssertionError): - split_dataset( - dataset, {"split_ratios": {"a": 0.2}, "split_sizes": {"a": 1}} - ) - # Bad type for ratios (must be float) - with pytest.raises(AssertionError): - split_dataset(dataset, {"split_ratios": {"a": 1}}) - # Bad type for ratios (must be int) - with pytest.raises(AssertionError): - split_dataset(dataset, {"split_sizes": {"a": 0.1}}) - - def test_shuffle(self, dataset): - split1 = split_dataset(dataset, {"shuffle": True}) - split2 = split_dataset(dataset, {"shuffle": True}) - assert not np.array_equal(split1["other"], split2["other"]), ( - "shuffle didn't work" - ) - - -class Test_mmd_subsample_fn: - def test_mmd_subsample_fn(self, tabular_X): - mmd_subsample_fn(tabular_X, size=10) - mmd_subsample_fn(tabular_X, size=10, initial_ids=[0, 12]) - mmd_subsample_fn(tabular_X, size=10, initial_ids=[0, 12], memory_safe=True) diff --git a/tests/utils/test_stats.py b/tests/utils/test_stats.py deleted file mode 100644 index 523b06b9..00000000 --- a/tests/utils/test_stats.py +++ /dev/null @@ -1,289 +0,0 @@ -# -*- coding: utf-8 -*- -# -# This file is subject to the terms and conditions defined in -# file 'LICENSE.txt', which is part of this source code package. -# -# - -# %% Imports - -import numpy as np -import pytest - -from plaid.constants import CGNS_FIELD_LOCATIONS -from plaid.containers.sample import Sample -from plaid.utils.stats import OnlineStatistics, Stats - -# %% Fixtures - - -@pytest.fixture() -def np_samples_1(): - return np.random.randn(400, 7) - - -@pytest.fixture() -def np_samples_2(): - return np.random.randn(20, 5, 7) - - -@pytest.fixture() -def np_samples_3(): - return np.random.randn(400, 1) - - -@pytest.fixture() -def np_samples_4(): - return np.random.randn(1, 400) - - -@pytest.fixture() -def np_samples_5(): - return np.random.randn(400) - - -@pytest.fixture() -def np_samples_6(): - return np.random.randn(50) - - -@pytest.fixture() -def online_stats(): - return OnlineStatistics() - - -@pytest.fixture() -def stats(): - return Stats() - - -@pytest.fixture() -def sample_with_scalar(np_samples_3): - s = Sample() - s.add_scalar("foo", float(np_samples_3.mean())) - return s - - -@pytest.fixture() -def sample_with_field(np_samples_6): - s = Sample() - # 1. Initialize the CGNS tree - s.features.init_tree() - # 2. Create a base and a zone - s.init_base(topological_dim=3, physical_dim=3) - s.init_zone(zone_shape=np.array([[np_samples_6.shape[0], 0, 0]])) - # 3. Set node coordinates (required for a valid zone) - s.set_nodes(np.zeros((np_samples_6.shape[0], 3))) - # 4. 
Add a field named "bar" - s.add_field(name="bar", field=np_samples_6) - return s - - -@pytest.fixture() -def field_data(): - return np.random.randn(101) - - -@pytest.fixture() -def field_data_of_different_size(): - return np.random.randn(51) - - -# %% Functions - - -def check_stats_dict(stats_dict): - # Check that all expected statistics keys are present - expected_keys = [ - {"name": "mean", "type": np.ndarray, "ndim": 2}, - {"name": "min", "type": np.ndarray, "ndim": 2}, - {"name": "max", "type": np.ndarray, "ndim": 2}, - {"name": "var", "type": np.ndarray, "ndim": 2}, - {"name": "std", "type": np.ndarray, "ndim": 2}, - {"name": "n_samples", "type": (int, np.integer)}, - {"name": "n_points", "type": (int, np.integer)}, - {"name": "n_features", "type": (int, np.integer)}, - ] - for key_info in expected_keys: - key = key_info["name"] - assert key in stats_dict, f"Missing key: {key}" - if "type" in key_info: - assert isinstance(stats_dict[key], key_info["type"]), ( - f"Key '{key}' has wrong type: {type(stats_dict[key])}, expected {key_info['type']}" - ) - if "ndim" in key_info: - assert hasattr(stats_dict[key], "ndim"), ( - f"Key '{key}' does not have 'ndim' attribute" - ) - assert stats_dict[key].ndim == key_info["ndim"], ( - f"Key '{key}' has wrong ndim: {stats_dict[key].ndim}, expected {key_info['ndim']}" - ) - - -# %% Tests - - -class Test_OnlineStatistics: - def test__init__(self, online_stats): - pass - - def test_add_samples_1(self, online_stats, np_samples_1, np_samples_2): - online_stats.add_samples(np_samples_1) - online_stats.add_samples(np_samples_2) - - def test_add_samples_2(self, online_stats, np_samples_4, np_samples_5): - online_stats.min = np_samples_4 - online_stats.add_samples(np_samples_5) - - def test_add_samples_3(self, online_stats, np_samples_3, np_samples_5): - online_stats.min = np_samples_3 - online_stats.add_samples(np_samples_5) - - def test_add_samples_4(self, online_stats, np_samples_5): - online_stats.add_samples(np_samples_5) - - def test_add_samples_already_present(self, online_stats, np_samples_1): - online_stats.add_samples(np_samples_1) - online_stats.add_samples(np_samples_1) - - def test_add_samples_and_flatten(self, online_stats, np_samples_1, np_samples_2): - online_stats.add_samples(np_samples_1) - online_stats.add_samples(np_samples_2) - online_stats.flatten_array() - - def test_get_stats(self, online_stats, np_samples_1): - online_stats.add_samples(np_samples_1) - stats_dict = online_stats.get_stats() - # Check that all expected statistics keys are present - check_stats_dict(stats_dict) - - def test_invalid_input_type(self, online_stats): - with pytest.raises(TypeError): - online_stats.add_samples([1, 2, 3]) # List instead of ndarray - - def test_nan_inf_input(self, online_stats): - with pytest.raises(ValueError): - online_stats.add_samples(np.array([1, np.nan, 3])) - with pytest.raises(ValueError): - online_stats.add_samples(np.array([1, np.inf, 3])) - - def test_merge_stats(self, np_samples_3, np_samples_4, np_samples_6): - stats1 = OnlineStatistics() - stats2 = OnlineStatistics() - stats1.add_samples(np_samples_3) - stats2.add_samples(np_samples_6) - n_samples_before = stats1.n_samples - n_samples_other = stats2.n_samples - mean_before = stats1.mean.copy() - other_mean = stats2.mean.copy() - stats3 = OnlineStatistics() - stats3.add_samples(np_samples_4) - # do the merging - stats1.merge_stats(stats2) - assert stats1.n_samples == n_samples_before + stats2.n_samples - expected_mean = ( - mean_before * n_samples_before + other_mean * 
n_samples_other - ) / (n_samples_before + n_samples_other) - assert np.allclose(stats1.mean, expected_mean) - # other merging tests - with pytest.raises(TypeError): - stats1.merge_stats(0.0) - stats1.merge_stats(stats3) - - -class Test_Stats: - def test__init__(self, stats): - pass - - def test_add_samples(self, stats, samples_no_string): - stats.add_samples(samples_no_string) - - def test_add_dataset(self, stats, dataset): - stats.add_dataset(dataset) - - def test_get_stats(self, stats, samples_no_string): - stats.add_samples(samples_no_string) - stats_dict = stats.get_stats() - - sample: Sample = samples_no_string[0] - feature_names = sample.get_scalar_names() - for base_name in sample.features.get_base_names(): - for zone_name in sample.features.get_zone_names(base_name=base_name): - for location in CGNS_FIELD_LOCATIONS: - for field_name in sample.get_field_names( - location=location, zone_name=zone_name, base_name=base_name - ): - feature_names.append( - f"{base_name}/{zone_name}/{location}/{field_name}" - ) - - for feat_name in feature_names: - assert feat_name in stats_dict, ( - f"Missing {feat_name=}, in {stats_dict.keys()}" - ) - check_stats_dict(stats_dict[feat_name]) - - def test_invalid_input(self, stats): - with pytest.raises(TypeError): - stats.add_samples("invalid") - - def test_empty_samples(self, stats): - stats.add_samples([]) - assert len(stats.get_available_statistics()) == 0 - - def test_merge_stats(self, sample_with_scalar, sample_with_field): - # Create two Stats objects with different samples - stats1 = Stats() - stats2 = Stats() - stats1.add_samples([sample_with_scalar]) - stats2.add_samples([sample_with_field]) - # Merge stats2 into stats1 - stats1.merge_stats(stats2) - # Both keys should be present - keys = stats1.get_available_statistics() - assert "foo" in keys or "bar" in keys - # Check that statistics are present for merged keys - for key in keys: - s = stats1._stats[key] - assert s.n_samples > 0 - - def test_clear_statistics(self, stats, samples_no_string): - stats.add_samples(samples_no_string) - stats.clear_statistics() - assert len(stats.get_available_statistics()) == 0 - - def test_merge_stats_with_same_sizes(self, sample_with_field): - stats1 = Stats() - stats2 = Stats() - stats1.add_samples([sample_with_field]) - stats1.add_samples([sample_with_field]) - stats1.merge_stats(stats2) - keys = stats1.get_available_statistics() - assert "Base_3_3/Zone/Vertex/bar" in keys - - stat_field = stats1._stats["Base_3_3/Zone/Vertex/bar"] - assert stat_field.n_samples == 2 - assert stat_field.n_points == 100 - assert stat_field.n_features == 50 - stats_dict = stat_field.get_stats() - check_stats_dict(stats_dict) - assert stats_dict["mean"].shape == (1, 50) - - def test_merge_stats_with_different_feautres( - self, sample_with_scalar, sample_with_field - ): - stats1 = Stats() - stats2 = Stats() - stats1.add_samples([sample_with_scalar]) - stats2.add_samples([sample_with_field]) - stats1.merge_stats(stats2) - keys = stats1.get_available_statistics() - assert "foo" in keys - - stat_field = stats1._stats["foo"] - assert stat_field.n_samples == 1 - assert stat_field.n_points == 1 - assert stat_field.n_features == 1 - stats_dict = stat_field.get_stats() - check_stats_dict(stats_dict) - assert stats_dict["mean"].shape == (1, 1) diff --git a/uv.lock b/uv.lock index 519c1389..11f4260a 100644 --- a/uv.lock +++ b/uv.lock @@ -226,11 +226,11 @@ wheels = [ [[package]] name = "certifi" -version = "2026.2.25" +version = "2026.4.22" source = { registry = "https://pypi.org/simple" } 
-sdist = { url = "https://files.pythonhosted.org/packages/af/2d/7bf41579a8986e348fa033a31cdd0e4121114f6bce2457e8876010b092dd/certifi-2026.2.25.tar.gz", hash = "sha256:e887ab5cee78ea814d3472169153c2d12cd43b14bd03329a39a9c6e2e80bfba7", size = 155029, upload-time = "2026-02-25T02:54:17.342Z" } +sdist = { url = "https://files.pythonhosted.org/packages/25/ee/6caf7a40c36a1220410afe15a1cc64993a1f864871f698c0f93acb72842a/certifi-2026.4.22.tar.gz", hash = "sha256:8d455352a37b71bf76a79caa83a3d6c25afee4a385d632127b6afb3963f1c580", size = 137077, upload-time = "2026-04-22T11:26:11.191Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/9a/3c/c17fb3ca2d9c3acff52e30b309f538586f9f5b9c9cf454f3845fc9af4881/certifi-2026.2.25-py3-none-any.whl", hash = "sha256:027692e4402ad994f1c42e52a4997a9763c646b73e4096e4d5d6db8af1d6f0fa", size = 153684, upload-time = "2026-02-25T02:54:15.766Z" }, + { url = "https://files.pythonhosted.org/packages/22/30/7cd8fdcdfbc5b869528b079bfb76dcdf6056b1a2097a662e5e8c04f42965/certifi-2026.4.22-py3-none-any.whl", hash = "sha256:3cb2210c8f88ba2318d29b0388d1023c8492ff72ecdde4ebdaddbb13a31b1c4a", size = 135707, upload-time = "2026-04-22T11:26:09.372Z" }, ] [[package]] @@ -292,59 +292,59 @@ wheels = [ [[package]] name = "charset-normalizer" -version = "3.4.6" -source = { registry = "https://pypi.org/simple" } -sdist = { url = "https://files.pythonhosted.org/packages/7b/60/e3bec1881450851b087e301bedc3daa9377a4d45f1c26aa90b0b235e38aa/charset_normalizer-3.4.6.tar.gz", hash = "sha256:1ae6b62897110aa7c79ea2f5dd38d1abca6db663687c0b1ad9aed6f6bae3d9d6", size = 143363, upload-time = "2026-03-15T18:53:25.478Z" } -wheels = [ - { url = "https://files.pythonhosted.org/packages/62/28/ff6f234e628a2de61c458be2779cb182bc03f6eec12200d4a525bbfc9741/charset_normalizer-3.4.6-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:82060f995ab5003a2d6e0f4ad29065b7672b6593c8c63559beefe5b443242c3e", size = 293582, upload-time = "2026-03-15T18:50:25.454Z" }, - { url = "https://files.pythonhosted.org/packages/1c/b7/b1a117e5385cbdb3205f6055403c2a2a220c5ea80b8716c324eaf75c5c95/charset_normalizer-3.4.6-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:60c74963d8350241a79cb8feea80e54d518f72c26db618862a8f53e5023deaf9", size = 197240, upload-time = "2026-03-15T18:50:27.196Z" }, - { url = "https://files.pythonhosted.org/packages/a1/5f/2574f0f09f3c3bc1b2f992e20bce6546cb1f17e111c5be07308dc5427956/charset_normalizer-3.4.6-cp311-cp311-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:f6e4333fb15c83f7d1482a76d45a0818897b3d33f00efd215528ff7c51b8e35d", size = 217363, upload-time = "2026-03-15T18:50:28.601Z" }, - { url = "https://files.pythonhosted.org/packages/4a/d1/0ae20ad77bc949ddd39b51bf383b6ca932f2916074c95cad34ae465ab71f/charset_normalizer-3.4.6-cp311-cp311-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:bc72863f4d9aba2e8fd9085e63548a324ba706d2ea2c83b260da08a59b9482de", size = 212994, upload-time = "2026-03-15T18:50:30.102Z" }, - { url = "https://files.pythonhosted.org/packages/60/ac/3233d262a310c1b12633536a07cde5ddd16985e6e7e238e9f3f9423d8eb9/charset_normalizer-3.4.6-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:9cc4fc6c196d6a8b76629a70ddfcd4635a6898756e2d9cac5565cf0654605d73", size = 204697, upload-time = "2026-03-15T18:50:31.654Z" }, - { url = 
"https://files.pythonhosted.org/packages/25/3c/8a18fc411f085b82303cfb7154eed5bd49c77035eb7608d049468b53f87c/charset_normalizer-3.4.6-cp311-cp311-manylinux_2_31_armv7l.whl", hash = "sha256:0c173ce3a681f309f31b87125fecec7a5d1347261ea11ebbb856fa6006b23c8c", size = 191673, upload-time = "2026-03-15T18:50:33.433Z" }, - { url = "https://files.pythonhosted.org/packages/ff/a7/11cfe61d6c5c5c7438d6ba40919d0306ed83c9ab957f3d4da2277ff67836/charset_normalizer-3.4.6-cp311-cp311-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:c907cdc8109f6c619e6254212e794d6548373cc40e1ec75e6e3823d9135d29cc", size = 201120, upload-time = "2026-03-15T18:50:35.105Z" }, - { url = "https://files.pythonhosted.org/packages/b5/10/cf491fa1abd47c02f69687046b896c950b92b6cd7337a27e6548adbec8e4/charset_normalizer-3.4.6-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:404a1e552cf5b675a87f0651f8b79f5f1e6fd100ee88dc612f89aa16abd4486f", size = 200911, upload-time = "2026-03-15T18:50:36.819Z" }, - { url = "https://files.pythonhosted.org/packages/28/70/039796160b48b18ed466fde0af84c1b090c4e288fae26cd674ad04a2d703/charset_normalizer-3.4.6-cp311-cp311-musllinux_1_2_armv7l.whl", hash = "sha256:e3c701e954abf6fc03a49f7c579cc80c2c6cc52525340ca3186c41d3f33482ef", size = 192516, upload-time = "2026-03-15T18:50:38.228Z" }, - { url = "https://files.pythonhosted.org/packages/ff/34/c56f3223393d6ff3124b9e78f7de738047c2d6bc40a4f16ac0c9d7a1cb3c/charset_normalizer-3.4.6-cp311-cp311-musllinux_1_2_ppc64le.whl", hash = "sha256:7a6967aaf043bceabab5412ed6bd6bd26603dae84d5cb75bf8d9a74a4959d398", size = 218795, upload-time = "2026-03-15T18:50:39.664Z" }, - { url = "https://files.pythonhosted.org/packages/e8/3b/ce2d4f86c5282191a041fdc5a4ce18f1c6bd40a5bd1f74cf8625f08d51c1/charset_normalizer-3.4.6-cp311-cp311-musllinux_1_2_riscv64.whl", hash = "sha256:5feb91325bbceade6afab43eb3b508c63ee53579fe896c77137ded51c6b6958e", size = 201833, upload-time = "2026-03-15T18:50:41.552Z" }, - { url = "https://files.pythonhosted.org/packages/3b/9b/b6a9f76b0fd7c5b5ec58b228ff7e85095370282150f0bd50b3126f5506d6/charset_normalizer-3.4.6-cp311-cp311-musllinux_1_2_s390x.whl", hash = "sha256:f820f24b09e3e779fe84c3c456cb4108a7aa639b0d1f02c28046e11bfcd088ed", size = 213920, upload-time = "2026-03-15T18:50:43.33Z" }, - { url = "https://files.pythonhosted.org/packages/ae/98/7bc23513a33d8172365ed30ee3a3b3fe1ece14a395e5fc94129541fc6003/charset_normalizer-3.4.6-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:b35b200d6a71b9839a46b9b7fff66b6638bb52fc9658aa58796b0326595d3021", size = 206951, upload-time = "2026-03-15T18:50:44.789Z" }, - { url = "https://files.pythonhosted.org/packages/32/73/c0b86f3d1458468e11aec870e6b3feac931facbe105a894b552b0e518e79/charset_normalizer-3.4.6-cp311-cp311-win32.whl", hash = "sha256:9ca4c0b502ab399ef89248a2c84c54954f77a070f28e546a85e91da627d1301e", size = 143703, upload-time = "2026-03-15T18:50:46.103Z" }, - { url = "https://files.pythonhosted.org/packages/c6/e3/76f2facfe8eddee0bbd38d2594e709033338eae44ebf1738bcefe0a06185/charset_normalizer-3.4.6-cp311-cp311-win_amd64.whl", hash = "sha256:a9e68c9d88823b274cf1e72f28cb5dc89c990edf430b0bfd3e2fb0785bfeabf4", size = 153857, upload-time = "2026-03-15T18:50:47.563Z" }, - { url = "https://files.pythonhosted.org/packages/e2/dc/9abe19c9b27e6cd3636036b9d1b387b78c40dedbf0b47f9366737684b4b0/charset_normalizer-3.4.6-cp311-cp311-win_arm64.whl", hash = "sha256:97d0235baafca5f2b09cf332cc275f021e694e8362c6bb9c96fc9a0eb74fc316", size = 142751, upload-time = "2026-03-15T18:50:49.234Z" }, - { url = 
"https://files.pythonhosted.org/packages/e5/62/c0815c992c9545347aeea7859b50dc9044d147e2e7278329c6e02ac9a616/charset_normalizer-3.4.6-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:2ef7fedc7a6ecbe99969cd09632516738a97eeb8bd7258bf8a0f23114c057dab", size = 295154, upload-time = "2026-03-15T18:50:50.88Z" }, - { url = "https://files.pythonhosted.org/packages/a8/37/bdca6613c2e3c58c7421891d80cc3efa1d32e882f7c4a7ee6039c3fc951a/charset_normalizer-3.4.6-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:a4ea868bc28109052790eb2b52a9ab33f3aa7adc02f96673526ff47419490e21", size = 199191, upload-time = "2026-03-15T18:50:52.658Z" }, - { url = "https://files.pythonhosted.org/packages/6c/92/9934d1bbd69f7f398b38c5dae1cbf9cc672e7c34a4adf7b17c0a9c17d15d/charset_normalizer-3.4.6-cp312-cp312-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:836ab36280f21fc1a03c99cd05c6b7af70d2697e374c7af0b61ed271401a72a2", size = 218674, upload-time = "2026-03-15T18:50:54.102Z" }, - { url = "https://files.pythonhosted.org/packages/af/90/25f6ab406659286be929fd89ab0e78e38aa183fc374e03aa3c12d730af8a/charset_normalizer-3.4.6-cp312-cp312-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:f1ce721c8a7dfec21fcbdfe04e8f68174183cf4e8188e0645e92aa23985c57ff", size = 215259, upload-time = "2026-03-15T18:50:55.616Z" }, - { url = "https://files.pythonhosted.org/packages/4e/ef/79a463eb0fff7f96afa04c1d4c51f8fc85426f918db467854bfb6a569ce3/charset_normalizer-3.4.6-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:0e28d62a8fc7a1fa411c43bd65e346f3bce9716dc51b897fbe930c5987b402d5", size = 207276, upload-time = "2026-03-15T18:50:57.054Z" }, - { url = "https://files.pythonhosted.org/packages/f7/72/d0426afec4b71dc159fa6b4e68f868cd5a3ecd918fec5813a15d292a7d10/charset_normalizer-3.4.6-cp312-cp312-manylinux_2_31_armv7l.whl", hash = "sha256:530d548084c4a9f7a16ed4a294d459b4f229db50df689bfe92027452452943a0", size = 195161, upload-time = "2026-03-15T18:50:58.686Z" }, - { url = "https://files.pythonhosted.org/packages/bf/18/c82b06a68bfcb6ce55e508225d210c7e6a4ea122bfc0748892f3dc4e8e11/charset_normalizer-3.4.6-cp312-cp312-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:30f445ae60aad5e1f8bdbb3108e39f6fbc09f4ea16c815c66578878325f8f15a", size = 203452, upload-time = "2026-03-15T18:51:00.196Z" }, - { url = "https://files.pythonhosted.org/packages/44/d6/0c25979b92f8adafdbb946160348d8d44aa60ce99afdc27df524379875cb/charset_normalizer-3.4.6-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:ac2393c73378fea4e52aa56285a3d64be50f1a12395afef9cce47772f60334c2", size = 202272, upload-time = "2026-03-15T18:51:01.703Z" }, - { url = "https://files.pythonhosted.org/packages/2e/3d/7fea3e8fe84136bebbac715dd1221cc25c173c57a699c030ab9b8900cbb7/charset_normalizer-3.4.6-cp312-cp312-musllinux_1_2_armv7l.whl", hash = "sha256:90ca27cd8da8118b18a52d5f547859cc1f8354a00cd1e8e5120df3e30d6279e5", size = 195622, upload-time = "2026-03-15T18:51:03.526Z" }, - { url = "https://files.pythonhosted.org/packages/57/8a/d6f7fd5cb96c58ef2f681424fbca01264461336d2a7fc875e4446b1f1346/charset_normalizer-3.4.6-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:8e5a94886bedca0f9b78fecd6afb6629142fd2605aa70a125d49f4edc6037ee6", size = 220056, upload-time = "2026-03-15T18:51:05.269Z" }, - { url = 
"https://files.pythonhosted.org/packages/16/50/478cdda782c8c9c3fb5da3cc72dd7f331f031e7f1363a893cdd6ca0f8de0/charset_normalizer-3.4.6-cp312-cp312-musllinux_1_2_riscv64.whl", hash = "sha256:695f5c2823691a25f17bc5d5ffe79fa90972cc34b002ac6c843bb8a1720e950d", size = 203751, upload-time = "2026-03-15T18:51:06.858Z" }, - { url = "https://files.pythonhosted.org/packages/75/fc/cc2fcac943939c8e4d8791abfa139f685e5150cae9f94b60f12520feaa9b/charset_normalizer-3.4.6-cp312-cp312-musllinux_1_2_s390x.whl", hash = "sha256:231d4da14bcd9301310faf492051bee27df11f2bc7549bc0bb41fef11b82daa2", size = 216563, upload-time = "2026-03-15T18:51:08.564Z" }, - { url = "https://files.pythonhosted.org/packages/a8/b7/a4add1d9a5f68f3d037261aecca83abdb0ab15960a3591d340e829b37298/charset_normalizer-3.4.6-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:a056d1ad2633548ca18ffa2f85c202cfb48b68615129143915b8dc72a806a923", size = 209265, upload-time = "2026-03-15T18:51:10.312Z" }, - { url = "https://files.pythonhosted.org/packages/6c/18/c094561b5d64a24277707698e54b7f67bd17a4f857bbfbb1072bba07c8bf/charset_normalizer-3.4.6-cp312-cp312-win32.whl", hash = "sha256:c2274ca724536f173122f36c98ce188fd24ce3dad886ec2b7af859518ce008a4", size = 144229, upload-time = "2026-03-15T18:51:11.694Z" }, - { url = "https://files.pythonhosted.org/packages/ab/20/0567efb3a8fd481b8f34f739ebddc098ed062a59fed41a8d193a61939e8f/charset_normalizer-3.4.6-cp312-cp312-win_amd64.whl", hash = "sha256:c8ae56368f8cc97c7e40a7ee18e1cedaf8e780cd8bc5ed5ac8b81f238614facb", size = 154277, upload-time = "2026-03-15T18:51:13.004Z" }, - { url = "https://files.pythonhosted.org/packages/15/57/28d79b44b51933119e21f65479d0864a8d5893e494cf5daab15df0247c17/charset_normalizer-3.4.6-cp312-cp312-win_arm64.whl", hash = "sha256:899d28f422116b08be5118ef350c292b36fc15ec2daeb9ea987c89281c7bb5c4", size = 142817, upload-time = "2026-03-15T18:51:14.408Z" }, - { url = "https://files.pythonhosted.org/packages/1e/1d/4fdabeef4e231153b6ed7567602f3b68265ec4e5b76d6024cf647d43d981/charset_normalizer-3.4.6-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:11afb56037cbc4b1555a34dd69151e8e069bee82e613a73bef6e714ce733585f", size = 294823, upload-time = "2026-03-15T18:51:15.755Z" }, - { url = "https://files.pythonhosted.org/packages/47/7b/20e809b89c69d37be748d98e84dce6820bf663cf19cf6b942c951a3e8f41/charset_normalizer-3.4.6-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:423fb7e748a08f854a08a222b983f4df1912b1daedce51a72bd24fe8f26a1843", size = 198527, upload-time = "2026-03-15T18:51:17.177Z" }, - { url = "https://files.pythonhosted.org/packages/37/a6/4f8d27527d59c039dce6f7622593cdcd3d70a8504d87d09eb11e9fdc6062/charset_normalizer-3.4.6-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:d73beaac5e90173ac3deb9928a74763a6d230f494e4bfb422c217a0ad8e629bf", size = 218388, upload-time = "2026-03-15T18:51:18.934Z" }, - { url = "https://files.pythonhosted.org/packages/f6/9b/4770ccb3e491a9bacf1c46cc8b812214fe367c86a96353ccc6daf87b01ec/charset_normalizer-3.4.6-cp313-cp313-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:d60377dce4511655582e300dc1e5a5f24ba0cb229005a1d5c8d0cb72bb758ab8", size = 214563, upload-time = "2026-03-15T18:51:20.374Z" }, - { url = "https://files.pythonhosted.org/packages/2b/58/a199d245894b12db0b957d627516c78e055adc3a0d978bc7f65ddaf7c399/charset_normalizer-3.4.6-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = 
"sha256:530e8cebeea0d76bdcf93357aa5e41336f48c3dc709ac52da2bb167c5b8271d9", size = 206587, upload-time = "2026-03-15T18:51:21.807Z" }, - { url = "https://files.pythonhosted.org/packages/7e/70/3def227f1ec56f5c69dfc8392b8bd63b11a18ca8178d9211d7cc5e5e4f27/charset_normalizer-3.4.6-cp313-cp313-manylinux_2_31_armv7l.whl", hash = "sha256:a26611d9987b230566f24a0a125f17fe0de6a6aff9f25c9f564aaa2721a5fb88", size = 194724, upload-time = "2026-03-15T18:51:23.508Z" }, - { url = "https://files.pythonhosted.org/packages/58/ab/9318352e220c05efd31c2779a23b50969dc94b985a2efa643ed9077bfca5/charset_normalizer-3.4.6-cp313-cp313-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:34315ff4fc374b285ad7f4a0bf7dcbfe769e1b104230d40f49f700d4ab6bbd84", size = 202956, upload-time = "2026-03-15T18:51:25.239Z" }, - { url = "https://files.pythonhosted.org/packages/75/13/f3550a3ac25b70f87ac98c40d3199a8503676c2f1620efbf8d42095cfc40/charset_normalizer-3.4.6-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:5f8ddd609f9e1af8c7bd6e2aca279c931aefecd148a14402d4e368f3171769fd", size = 201923, upload-time = "2026-03-15T18:51:26.682Z" }, - { url = "https://files.pythonhosted.org/packages/1b/db/c5c643b912740b45e8eec21de1bbab8e7fc085944d37e1e709d3dcd9d72f/charset_normalizer-3.4.6-cp313-cp313-musllinux_1_2_armv7l.whl", hash = "sha256:80d0a5615143c0b3225e5e3ef22c8d5d51f3f72ce0ea6fb84c943546c7b25b6c", size = 195366, upload-time = "2026-03-15T18:51:28.129Z" }, - { url = "https://files.pythonhosted.org/packages/5a/67/3b1c62744f9b2448443e0eb160d8b001c849ec3fef591e012eda6484787c/charset_normalizer-3.4.6-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:92734d4d8d187a354a556626c221cd1a892a4e0802ccb2af432a1d85ec012194", size = 219752, upload-time = "2026-03-15T18:51:29.556Z" }, - { url = "https://files.pythonhosted.org/packages/f6/98/32ffbaf7f0366ffb0445930b87d103f6b406bc2c271563644bde8a2b1093/charset_normalizer-3.4.6-cp313-cp313-musllinux_1_2_riscv64.whl", hash = "sha256:613f19aa6e082cf96e17e3ffd89383343d0d589abda756b7764cf78361fd41dc", size = 203296, upload-time = "2026-03-15T18:51:30.921Z" }, - { url = "https://files.pythonhosted.org/packages/41/12/5d308c1bbe60cabb0c5ef511574a647067e2a1f631bc8634fcafaccd8293/charset_normalizer-3.4.6-cp313-cp313-musllinux_1_2_s390x.whl", hash = "sha256:2b1a63e8224e401cafe7739f77efd3f9e7f5f2026bda4aead8e59afab537784f", size = 215956, upload-time = "2026-03-15T18:51:32.399Z" }, - { url = "https://files.pythonhosted.org/packages/53/e9/5f85f6c5e20669dbe56b165c67b0260547dea97dba7e187938833d791687/charset_normalizer-3.4.6-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:6cceb5473417d28edd20c6c984ab6fee6c6267d38d906823ebfe20b03d607dc2", size = 208652, upload-time = "2026-03-15T18:51:34.214Z" }, - { url = "https://files.pythonhosted.org/packages/f1/11/897052ea6af56df3eef3ca94edafee410ca699ca0c7b87960ad19932c55e/charset_normalizer-3.4.6-cp313-cp313-win32.whl", hash = "sha256:d7de2637729c67d67cf87614b566626057e95c303bc0a55ffe391f5205e7003d", size = 143940, upload-time = "2026-03-15T18:51:36.15Z" }, - { url = "https://files.pythonhosted.org/packages/a1/5c/724b6b363603e419829f561c854b87ed7c7e31231a7908708ac086cdf3e2/charset_normalizer-3.4.6-cp313-cp313-win_amd64.whl", hash = "sha256:572d7c822caf521f0525ba1bce1a622a0b85cf47ffbdae6c9c19e3b5ac3c4389", size = 154101, upload-time = "2026-03-15T18:51:37.876Z" }, - { url = "https://files.pythonhosted.org/packages/01/a5/7abf15b4c0968e47020f9ca0935fb3274deb87cb288cd187cad92e8cdffd/charset_normalizer-3.4.6-cp313-cp313-win_arm64.whl", hash = 
"sha256:a4474d924a47185a06411e0064b803c68be044be2d60e50e8bddcc2649957c1f", size = 143109, upload-time = "2026-03-15T18:51:39.565Z" }, - { url = "https://files.pythonhosted.org/packages/2a/68/687187c7e26cb24ccbd88e5069f5ef00eba804d36dde11d99aad0838ab45/charset_normalizer-3.4.6-py3-none-any.whl", hash = "sha256:947cf925bc916d90adba35a64c82aace04fa39b46b52d4630ece166655905a69", size = 61455, upload-time = "2026-03-15T18:53:23.833Z" }, +version = "3.4.7" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/e7/a1/67fe25fac3c7642725500a3f6cfe5821ad557c3abb11c9d20d12c7008d3e/charset_normalizer-3.4.7.tar.gz", hash = "sha256:ae89db9e5f98a11a4bf50407d4363e7b09b31e55bc117b4f7d80aab97ba009e5", size = 144271, upload-time = "2026-04-02T09:28:39.342Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/c2/d7/b5b7020a0565c2e9fa8c09f4b5fa6232feb326b8c20081ccded47ea368fd/charset_normalizer-3.4.7-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:7641bb8895e77f921102f72833904dcd9901df5d6d72a2ab8f31d04b7e51e4e7", size = 309705, upload-time = "2026-04-02T09:26:02.191Z" }, + { url = "https://files.pythonhosted.org/packages/5a/53/58c29116c340e5456724ecd2fff4196d236b98f3da97b404bc5e51ac3493/charset_normalizer-3.4.7-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:202389074300232baeb53ae2569a60901f7efadd4245cf3a3bf0617d60b439d7", size = 206419, upload-time = "2026-04-02T09:26:03.583Z" }, + { url = "https://files.pythonhosted.org/packages/b2/02/e8146dc6591a37a00e5144c63f29fb7c97a734ea8a111190783c0e60ab63/charset_normalizer-3.4.7-cp311-cp311-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:30b8d1d8c52a48c2c5690e152c169b673487a2a58de1ec7393196753063fcd5e", size = 227901, upload-time = "2026-04-02T09:26:04.738Z" }, + { url = "https://files.pythonhosted.org/packages/fb/73/77486c4cd58f1267bf17db420e930c9afa1b3be3fe8c8b8ebbebc9624359/charset_normalizer-3.4.7-cp311-cp311-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:532bc9bf33a68613fd7d65e4b1c71a6a38d7d42604ecf239c77392e9b4e8998c", size = 222742, upload-time = "2026-04-02T09:26:06.36Z" }, + { url = "https://files.pythonhosted.org/packages/a1/fa/f74eb381a7d94ded44739e9d94de18dc5edc9c17fb8c11f0a6890696c0a9/charset_normalizer-3.4.7-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:2fe249cb4651fd12605b7288b24751d8bfd46d35f12a20b1ba33dea122e690df", size = 214061, upload-time = "2026-04-02T09:26:08.347Z" }, + { url = "https://files.pythonhosted.org/packages/dc/92/42bd3cefcf7687253fb86694b45f37b733c97f59af3724f356fa92b8c344/charset_normalizer-3.4.7-cp311-cp311-manylinux_2_31_armv7l.whl", hash = "sha256:65bcd23054beab4d166035cabbc868a09c1a49d1efe458fe8e4361215df40265", size = 199239, upload-time = "2026-04-02T09:26:09.823Z" }, + { url = "https://files.pythonhosted.org/packages/4c/3d/069e7184e2aa3b3cddc700e3dd267413dc259854adc3380421c805c6a17d/charset_normalizer-3.4.7-cp311-cp311-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:08e721811161356f97b4059a9ba7bafb23ea5ee2255402c42881c214e173c6b4", size = 210173, upload-time = "2026-04-02T09:26:10.953Z" }, + { url = "https://files.pythonhosted.org/packages/62/51/9d56feb5f2e7074c46f93e0ebdbe61f0848ee246e2f0d89f8e20b89ebb8f/charset_normalizer-3.4.7-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:e060d01aec0a910bdccb8be71faf34e7799ce36950f8294c8bf612cba65a2c9e", size 
= 209841, upload-time = "2026-04-02T09:26:12.142Z" }, + { url = "https://files.pythonhosted.org/packages/d2/59/893d8f99cc4c837dda1fe2f1139079703deb9f321aabcb032355de13b6c7/charset_normalizer-3.4.7-cp311-cp311-musllinux_1_2_armv7l.whl", hash = "sha256:38c0109396c4cfc574d502df99742a45c72c08eff0a36158b6f04000043dbf38", size = 200304, upload-time = "2026-04-02T09:26:13.711Z" }, + { url = "https://files.pythonhosted.org/packages/7d/1d/ee6f3be3464247578d1ed5c46de545ccc3d3ff933695395c402c21fa6b77/charset_normalizer-3.4.7-cp311-cp311-musllinux_1_2_ppc64le.whl", hash = "sha256:1c2a768fdd44ee4a9339a9b0b130049139b8ce3c01d2ce09f67f5a68048d477c", size = 229455, upload-time = "2026-04-02T09:26:14.941Z" }, + { url = "https://files.pythonhosted.org/packages/54/bb/8fb0a946296ea96a488928bdce8ef99023998c48e4713af533e9bb98ef07/charset_normalizer-3.4.7-cp311-cp311-musllinux_1_2_riscv64.whl", hash = "sha256:1a87ca9d5df6fe460483d9a5bbf2b18f620cbed41b432e2bddb686228282d10b", size = 210036, upload-time = "2026-04-02T09:26:16.478Z" }, + { url = "https://files.pythonhosted.org/packages/9a/bc/015b2387f913749f82afd4fcba07846d05b6d784dd16123cb66860e0237d/charset_normalizer-3.4.7-cp311-cp311-musllinux_1_2_s390x.whl", hash = "sha256:d635aab80466bc95771bb78d5370e74d36d1fe31467b6b29b8b57b2a3cd7d22c", size = 224739, upload-time = "2026-04-02T09:26:17.751Z" }, + { url = "https://files.pythonhosted.org/packages/17/ab/63133691f56baae417493cba6b7c641571a2130eb7bceba6773367ab9ec5/charset_normalizer-3.4.7-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:ae196f021b5e7c78e918242d217db021ed2a6ace2bc6ae94c0fc596221c7f58d", size = 216277, upload-time = "2026-04-02T09:26:18.981Z" }, + { url = "https://files.pythonhosted.org/packages/06/6d/3be70e827977f20db77c12a97e6a9f973631a45b8d186c084527e53e77a4/charset_normalizer-3.4.7-cp311-cp311-win32.whl", hash = "sha256:adb2597b428735679446b46c8badf467b4ca5f5056aae4d51a19f9570301b1ad", size = 147819, upload-time = "2026-04-02T09:26:20.295Z" }, + { url = "https://files.pythonhosted.org/packages/20/d9/5f67790f06b735d7c7637171bbfd89882ad67201891b7275e51116ed8207/charset_normalizer-3.4.7-cp311-cp311-win_amd64.whl", hash = "sha256:8e385e4267ab76874ae30db04c627faaaf0b509e1ccc11a95b3fc3e83f855c00", size = 159281, upload-time = "2026-04-02T09:26:21.74Z" }, + { url = "https://files.pythonhosted.org/packages/ca/83/6413f36c5a34afead88ce6f66684d943d91f233d76dd083798f9602b75ae/charset_normalizer-3.4.7-cp311-cp311-win_arm64.whl", hash = "sha256:d4a48e5b3c2a489fae013b7589308a40146ee081f6f509e047e0e096084ceca1", size = 147843, upload-time = "2026-04-02T09:26:22.901Z" }, + { url = "https://files.pythonhosted.org/packages/0c/eb/4fc8d0a7110eb5fc9cc161723a34a8a6c200ce3b4fbf681bc86feee22308/charset_normalizer-3.4.7-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:eca9705049ad3c7345d574e3510665cb2cf844c2f2dcfe675332677f081cbd46", size = 311328, upload-time = "2026-04-02T09:26:24.331Z" }, + { url = "https://files.pythonhosted.org/packages/f8/e3/0fadc706008ac9d7b9b5be6dc767c05f9d3e5df51744ce4cc9605de7b9f4/charset_normalizer-3.4.7-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:6178f72c5508bfc5fd446a5905e698c6212932f25bcdd4b47a757a50605a90e2", size = 208061, upload-time = "2026-04-02T09:26:25.568Z" }, + { url = "https://files.pythonhosted.org/packages/42/f0/3dd1045c47f4a4604df85ec18ad093912ae1344ac706993aff91d38773a2/charset_normalizer-3.4.7-cp312-cp312-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = 
"sha256:e1421b502d83040e6d7fb2fb18dff63957f720da3d77b2fbd3187ceb63755d7b", size = 229031, upload-time = "2026-04-02T09:26:26.865Z" }, + { url = "https://files.pythonhosted.org/packages/dc/67/675a46eb016118a2fbde5a277a5d15f4f69d5f3f5f338e5ee2f8948fcf43/charset_normalizer-3.4.7-cp312-cp312-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:edac0f1ab77644605be2cbba52e6b7f630731fc42b34cb0f634be1a6eface56a", size = 225239, upload-time = "2026-04-02T09:26:28.044Z" }, + { url = "https://files.pythonhosted.org/packages/4b/f8/d0118a2f5f23b02cd166fa385c60f9b0d4f9194f574e2b31cef350ad7223/charset_normalizer-3.4.7-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:5649fd1c7bade02f320a462fdefd0b4bd3ce036065836d4f42e0de958038e116", size = 216589, upload-time = "2026-04-02T09:26:29.239Z" }, + { url = "https://files.pythonhosted.org/packages/b1/f1/6d2b0b261b6c4ceef0fcb0d17a01cc5bc53586c2d4796fa04b5c540bc13d/charset_normalizer-3.4.7-cp312-cp312-manylinux_2_31_armv7l.whl", hash = "sha256:203104ed3e428044fd943bc4bf45fa73c0730391f9621e37fe39ecf477b128cb", size = 202733, upload-time = "2026-04-02T09:26:30.5Z" }, + { url = "https://files.pythonhosted.org/packages/6f/c0/7b1f943f7e87cc3db9626ba17807d042c38645f0a1d4415c7a14afb5591f/charset_normalizer-3.4.7-cp312-cp312-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:298930cec56029e05497a76988377cbd7457ba864beeea92ad7e844fe74cd1f1", size = 212652, upload-time = "2026-04-02T09:26:31.709Z" }, + { url = "https://files.pythonhosted.org/packages/38/dd/5a9ab159fe45c6e72079398f277b7d2b523e7f716acc489726115a910097/charset_normalizer-3.4.7-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:708838739abf24b2ceb208d0e22403dd018faeef86ddac04319a62ae884c4f15", size = 211229, upload-time = "2026-04-02T09:26:33.282Z" }, + { url = "https://files.pythonhosted.org/packages/d5/ff/531a1cad5ca855d1c1a8b69cb71abfd6d85c0291580146fda7c82857caa1/charset_normalizer-3.4.7-cp312-cp312-musllinux_1_2_armv7l.whl", hash = "sha256:0f7eb884681e3938906ed0434f20c63046eacd0111c4ba96f27b76084cd679f5", size = 203552, upload-time = "2026-04-02T09:26:34.845Z" }, + { url = "https://files.pythonhosted.org/packages/c1/4c/a5fb52d528a8ca41f7598cb619409ece30a169fbdf9cdce592e53b46c3a6/charset_normalizer-3.4.7-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:4dc1e73c36828f982bfe79fadf5919923f8a6f4df2860804db9a98c48824ce8d", size = 230806, upload-time = "2026-04-02T09:26:36.152Z" }, + { url = "https://files.pythonhosted.org/packages/59/7a/071feed8124111a32b316b33ae4de83d36923039ef8cf48120266844285b/charset_normalizer-3.4.7-cp312-cp312-musllinux_1_2_riscv64.whl", hash = "sha256:aed52fea0513bac0ccde438c188c8a471c4e0f457c2dd20cdbf6ea7a450046c7", size = 212316, upload-time = "2026-04-02T09:26:37.672Z" }, + { url = "https://files.pythonhosted.org/packages/fd/35/f7dba3994312d7ba508e041eaac39a36b120f32d4c8662b8814dab876431/charset_normalizer-3.4.7-cp312-cp312-musllinux_1_2_s390x.whl", hash = "sha256:fea24543955a6a729c45a73fe90e08c743f0b3334bbf3201e6c4bc1b0c7fa464", size = 227274, upload-time = "2026-04-02T09:26:38.93Z" }, + { url = "https://files.pythonhosted.org/packages/8a/2d/a572df5c9204ab7688ec1edc895a73ebded3b023bb07364710b05dd1c9be/charset_normalizer-3.4.7-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:bb6d88045545b26da47aa879dd4a89a71d1dce0f0e549b1abcb31dfe4a8eac49", size = 218468, upload-time = "2026-04-02T09:26:40.17Z" }, + { url = 
"https://files.pythonhosted.org/packages/86/eb/890922a8b03a568ca2f336c36585a4713c55d4d67bf0f0c78924be6315ca/charset_normalizer-3.4.7-cp312-cp312-win32.whl", hash = "sha256:2257141f39fe65a3fdf38aeccae4b953e5f3b3324f4ff0daf9f15b8518666a2c", size = 148460, upload-time = "2026-04-02T09:26:41.416Z" }, + { url = "https://files.pythonhosted.org/packages/35/d9/0e7dffa06c5ab081f75b1b786f0aefc88365825dfcd0ac544bdb7b2b6853/charset_normalizer-3.4.7-cp312-cp312-win_amd64.whl", hash = "sha256:5ed6ab538499c8644b8a3e18debabcd7ce684f3fa91cf867521a7a0279cab2d6", size = 159330, upload-time = "2026-04-02T09:26:42.554Z" }, + { url = "https://files.pythonhosted.org/packages/9e/5d/481bcc2a7c88ea6b0878c299547843b2521ccbc40980cb406267088bc701/charset_normalizer-3.4.7-cp312-cp312-win_arm64.whl", hash = "sha256:56be790f86bfb2c98fb742ce566dfb4816e5a83384616ab59c49e0604d49c51d", size = 147828, upload-time = "2026-04-02T09:26:44.075Z" }, + { url = "https://files.pythonhosted.org/packages/c1/3b/66777e39d3ae1ddc77ee606be4ec6d8cbd4c801f65e5a1b6f2b11b8346dd/charset_normalizer-3.4.7-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:f496c9c3cc02230093d8330875c4c3cdfc3b73612a5fd921c65d39cbcef08063", size = 309627, upload-time = "2026-04-02T09:26:45.198Z" }, + { url = "https://files.pythonhosted.org/packages/2e/4e/b7f84e617b4854ade48a1b7915c8ccfadeba444d2a18c291f696e37f0d3b/charset_normalizer-3.4.7-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:0ea948db76d31190bf08bd371623927ee1339d5f2a0b4b1b4a4439a65298703c", size = 207008, upload-time = "2026-04-02T09:26:46.824Z" }, + { url = "https://files.pythonhosted.org/packages/c4/bb/ec73c0257c9e11b268f018f068f5d00aa0ef8c8b09f7753ebd5f2880e248/charset_normalizer-3.4.7-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:a277ab8928b9f299723bc1a2dabb1265911b1a76341f90a510368ca44ad9ab66", size = 228303, upload-time = "2026-04-02T09:26:48.397Z" }, + { url = "https://files.pythonhosted.org/packages/85/fb/32d1f5033484494619f701e719429c69b766bfc4dbc61aa9e9c8c166528b/charset_normalizer-3.4.7-cp313-cp313-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:3bec022aec2c514d9cf199522a802bd007cd588ab17ab2525f20f9c34d067c18", size = 224282, upload-time = "2026-04-02T09:26:49.684Z" }, + { url = "https://files.pythonhosted.org/packages/fa/07/330e3a0dda4c404d6da83b327270906e9654a24f6c546dc886a0eb0ffb23/charset_normalizer-3.4.7-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e044c39e41b92c845bc815e5ae4230804e8e7bc29e399b0437d64222d92809dd", size = 215595, upload-time = "2026-04-02T09:26:50.915Z" }, + { url = "https://files.pythonhosted.org/packages/e3/7c/fc890655786e423f02556e0216d4b8c6bcb6bdfa890160dc66bf52dee468/charset_normalizer-3.4.7-cp313-cp313-manylinux_2_31_armv7l.whl", hash = "sha256:f495a1652cf3fbab2eb0639776dad966c2fb874d79d87ca07f9d5f059b8bd215", size = 201986, upload-time = "2026-04-02T09:26:52.197Z" }, + { url = "https://files.pythonhosted.org/packages/d8/97/bfb18b3db2aed3b90cf54dc292ad79fdd5ad65c4eae454099475cbeadd0d/charset_normalizer-3.4.7-cp313-cp313-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:e712b419df8ba5e42b226c510472b37bd57b38e897d3eca5e8cfd410a29fa859", size = 211711, upload-time = "2026-04-02T09:26:53.49Z" }, + { url = 
"https://files.pythonhosted.org/packages/6f/a5/a581c13798546a7fd557c82614a5c65a13df2157e9ad6373166d2a3e645d/charset_normalizer-3.4.7-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:7804338df6fcc08105c7745f1502ba68d900f45fd770d5bdd5288ddccb8a42d8", size = 210036, upload-time = "2026-04-02T09:26:54.975Z" }, + { url = "https://files.pythonhosted.org/packages/8c/bf/b3ab5bcb478e4193d517644b0fb2bf5497fbceeaa7a1bc0f4d5b50953861/charset_normalizer-3.4.7-cp313-cp313-musllinux_1_2_armv7l.whl", hash = "sha256:481551899c856c704d58119b5025793fa6730adda3571971af568f66d2424bb5", size = 202998, upload-time = "2026-04-02T09:26:56.303Z" }, + { url = "https://files.pythonhosted.org/packages/e7/4e/23efd79b65d314fa320ec6017b4b5834d5c12a58ba4610aa353af2e2f577/charset_normalizer-3.4.7-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:f59099f9b66f0d7145115e6f80dd8b1d847176df89b234a5a6b3f00437aa0832", size = 230056, upload-time = "2026-04-02T09:26:57.554Z" }, + { url = "https://files.pythonhosted.org/packages/b9/9f/1e1941bc3f0e01df116e68dc37a55c4d249df5e6fa77f008841aef68264f/charset_normalizer-3.4.7-cp313-cp313-musllinux_1_2_riscv64.whl", hash = "sha256:f59ad4c0e8f6bba240a9bb85504faa1ab438237199d4cce5f622761507b8f6a6", size = 211537, upload-time = "2026-04-02T09:26:58.843Z" }, + { url = "https://files.pythonhosted.org/packages/80/0f/088cbb3020d44428964a6c97fe1edfb1b9550396bf6d278330281e8b709c/charset_normalizer-3.4.7-cp313-cp313-musllinux_1_2_s390x.whl", hash = "sha256:3dedcc22d73ec993f42055eff4fcfed9318d1eeb9a6606c55892a26964964e48", size = 226176, upload-time = "2026-04-02T09:27:00.437Z" }, + { url = "https://files.pythonhosted.org/packages/6a/9f/130394f9bbe06f4f63e22641d32fc9b202b7e251c9aef4db044324dac493/charset_normalizer-3.4.7-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:64f02c6841d7d83f832cd97ccf8eb8a906d06eb95d5276069175c696b024b60a", size = 217723, upload-time = "2026-04-02T09:27:02.021Z" }, + { url = "https://files.pythonhosted.org/packages/73/55/c469897448a06e49f8fa03f6caae97074fde823f432a98f979cc42b90e69/charset_normalizer-3.4.7-cp313-cp313-win32.whl", hash = "sha256:4042d5c8f957e15221d423ba781e85d553722fc4113f523f2feb7b188cc34c5e", size = 148085, upload-time = "2026-04-02T09:27:03.192Z" }, + { url = "https://files.pythonhosted.org/packages/5d/78/1b74c5bbb3f99b77a1715c91b3e0b5bdb6fe302d95ace4f5b1bec37b0167/charset_normalizer-3.4.7-cp313-cp313-win_amd64.whl", hash = "sha256:3946fa46a0cf3e4c8cb1cc52f56bb536310d34f25f01ca9b6c16afa767dab110", size = 158819, upload-time = "2026-04-02T09:27:04.454Z" }, + { url = "https://files.pythonhosted.org/packages/68/86/46bd42279d323deb8687c4a5a811fd548cb7d1de10cf6535d099877a9a9f/charset_normalizer-3.4.7-cp313-cp313-win_arm64.whl", hash = "sha256:80d04837f55fc81da168b98de4f4b797ef007fc8a79ab71c6ec9bc4dd662b15b", size = 147915, upload-time = "2026-04-02T09:27:05.971Z" }, + { url = "https://files.pythonhosted.org/packages/db/8f/61959034484a4a7c527811f4721e75d02d653a35afb0b6054474d8185d4c/charset_normalizer-3.4.7-py3-none-any.whl", hash = "sha256:3dce51d0f5e7951f8bb4900c257dad282f49190fdbebecd4ba99bcc41fef404d", size = 61958, upload-time = "2026-04-02T09:28:37.794Z" }, ] [[package]] @@ -367,14 +367,14 @@ wheels = [ [[package]] name = "click" -version = "8.3.1" +version = "8.3.3" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "colorama", marker = "sys_platform == 'win32'" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/3d/fa/656b739db8587d7b5dfa22e22ed02566950fbfbcdc20311993483657a5c0/click-8.3.1.tar.gz", hash 
= "sha256:12ff4785d337a1bb490bb7e9c2b1ee5da3112e94a8622f26a6c77f5d2fc6842a", size = 295065, upload-time = "2025-11-15T20:45:42.706Z" } +sdist = { url = "https://files.pythonhosted.org/packages/bb/63/f9e1ea081ce35720d8b92acde70daaedace594dc93b693c869e0d5910718/click-8.3.3.tar.gz", hash = "sha256:398329ad4837b2ff7cbe1dd166a4c0f8900c3ca3a218de04466f38f6497f18a2", size = 328061, upload-time = "2026-04-22T15:11:27.506Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/98/78/01c019cdb5d6498122777c1a43056ebb3ebfeef2076d9d026bfe15583b2b/click-8.3.1-py3-none-any.whl", hash = "sha256:981153a64e25f12d547d3426c367a4857371575ee7ad18df2a6183ab0545b2a6", size = 108274, upload-time = "2025-11-15T20:45:41.139Z" }, + { url = "https://files.pythonhosted.org/packages/ae/44/c1221527f6a71a01ec6fbad7fa78f1d50dfa02217385cf0fa3eec7087d59/click-8.3.3-py3-none-any.whl", hash = "sha256:a2bf429bb3033c89fa4936ffb35d5cb471e3719e1f3c8a7c3fff0b8314305613", size = 110502, upload-time = "2026-04-22T15:11:25.044Z" }, ] [[package]] @@ -583,7 +583,7 @@ wheels = [ [[package]] name = "datasets" -version = "4.8.4" +version = "4.8.5" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "dill" }, @@ -601,9 +601,9 @@ dependencies = [ { name = "tqdm" }, { name = "xxhash" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/22/22/73e46ac7a8c25e7ef0b3bd6f10da3465021d90219a32eb0b4d2afea4c56e/datasets-4.8.4.tar.gz", hash = "sha256:a1429ed853275ce7943a01c6d2e25475b4501eb758934362106a280470df3a52", size = 604382, upload-time = "2026-03-23T14:21:17.987Z" } +sdist = { url = "https://files.pythonhosted.org/packages/66/34/14cd8e76f907f7d4dca2334cfeec9f81d30fd15c25a015f99aaea694eaed/datasets-4.8.5.tar.gz", hash = "sha256:0f0c1c3d56ffff2c93b2f4c63c95bac94f3d7e8621aea2a2a576275233bba772", size = 605649, upload-time = "2026-04-27T15:43:57.384Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/b0/e5/247d094108e42ac26363ab8dc57f168840cf7c05774b40ffeb0d78868fcc/datasets-4.8.4-py3-none-any.whl", hash = "sha256:cdc8bee4698e549d78bf1fed6aea2eebc760b22b084f07e6fc020c6577a6ce6d", size = 526991, upload-time = "2026-03-23T14:21:15.89Z" }, + { url = "https://files.pythonhosted.org/packages/65/99/00f3196036501b53032c4b1ab8337a0b978dee832ed276dae3815df4e8b5/datasets-4.8.5-py3-none-any.whl", hash = "sha256:5079900781719c0e063a8efdd2cd95a31ad0c63209178669cd23cf1b926149ff", size = 528973, upload-time = "2026-04-27T15:43:53.702Z" }, ] [[package]] @@ -636,6 +636,15 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/4e/8c/f3147f5c4b73e7550fe5f9352eaa956ae838d5c51eb58e7a25b9f3e2643b/decorator-5.2.1-py3-none-any.whl", hash = "sha256:d316bb415a2d9e2d2b3abcc4084c6502fc09240e292cd76a76afc106a1c8e04a", size = 9190, upload-time = "2025-02-24T04:41:32.565Z" }, ] +[[package]] +name = "deepmerge" +version = "2.0" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/a8/3a/b0ba594708f1ad0bc735884b3ad854d3ca3bdc1d741e56e40bbda6263499/deepmerge-2.0.tar.gz", hash = "sha256:5c3d86081fbebd04dd5de03626a0607b809a98fb6ccba5770b62466fe940ff20", size = 19890, upload-time = "2024-08-30T05:31:50.308Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/2d/82/e5d2c1c67d19841e9edc74954c827444ae826978499bde3dfc1d007c8c11/deepmerge-2.0-py3-none-any.whl", hash = "sha256:6de9ce507115cff0bed95ff0ce9ecc31088ef50cbdf09bc90a09349a318b3d00", size = 13475, upload-time = "2024-08-30T05:31:48.659Z" }, +] + [[package]] name = "dill" version = "0.4.1" @@ -677,28 
+686,28 @@ wheels = [ [[package]] name = "eigency" -version = "3.4.0.7" +version = "5.0.1.0" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "numpy" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/86/3f/603c67e1e4e30aecbd3d9bf7be02675b23004534d77c63b4c042d7d1b1bd/eigency-3.4.0.7.tar.gz", hash = "sha256:4f123342f0740b2d50d5cad4f2a89594200c55896c2de6c53864d90a96a0e651", size = 1254175, upload-time = "2026-02-25T07:55:01.01Z" } +sdist = { url = "https://files.pythonhosted.org/packages/24/db/ad7e9c8fdd43040ce4ebe23761a9810e38c26e422b117188bffe880eef22/eigency-5.0.1.0.tar.gz", hash = "sha256:aba3a16eb2bb1a42be2983abb95c11e58f92e277645a77dd35db1cfcec333f88", size = 1254353, upload-time = "2026-04-25T13:36:16.941Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/45/35/e2a31f3039083443dc349c86207ac11a6fbfddfccc7356bddb386427b13f/eigency-3.4.0.7-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:0f019389258b8a887afc750b676a20b7a7de39a4c1301433b0936055a79dadf7", size = 1621592, upload-time = "2026-02-25T07:54:20.436Z" }, - { url = "https://files.pythonhosted.org/packages/76/1f/53007228f0fbf7f9f7801a95ba3beb403bc2993bcd39f3b6c6638fba0198/eigency-3.4.0.7-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:d78be187e30be3a50a2fbd4c38282401ce48851a4f856ba1ca6924b75228e4d7", size = 2581366, upload-time = "2026-02-25T07:54:22.228Z" }, - { url = "https://files.pythonhosted.org/packages/a3/bc/38fac7543c269b7adf37b8e398fbc0e6ad0e081263b26601d8bacc92244a/eigency-3.4.0.7-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:62645b24a0aed8fe15440bf13e0471c905fe207f810706dd0ef205a9c58aa7ba", size = 2428651, upload-time = "2026-02-25T07:54:24.092Z" }, - { url = "https://files.pythonhosted.org/packages/c4/2e/1802e74626b2ff34e8db03c32948c822ea9091f7a4e49efa1ab5f4c64888/eigency-3.4.0.7-cp311-cp311-win32.whl", hash = "sha256:0ee34edc310fabe04515a384c7ddb2342d5fc6c29acd07f563725ebb3c1df436", size = 1593342, upload-time = "2026-02-25T07:54:27.652Z" }, - { url = "https://files.pythonhosted.org/packages/40/db/508e024e6bb5e40acb777100ecbd9e8da55ced94b6f5f39e39f11c0f6c46/eigency-3.4.0.7-cp311-cp311-win_amd64.whl", hash = "sha256:3eb32ca1758fa03d52899f36acfdd18607195920cf1c06d77634933a7061384c", size = 1618609, upload-time = "2026-02-25T07:54:29.417Z" }, - { url = "https://files.pythonhosted.org/packages/4c/89/efb1c04cf656061b6acc088da6911388594beabf15a52dc9fe365b4bc7c9/eigency-3.4.0.7-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:9f866c2b9ee6c81c3cc335876bc0cd9f455ea595bb8e12de8e436e535033dd99", size = 1620245, upload-time = "2026-02-25T07:54:31.385Z" }, - { url = "https://files.pythonhosted.org/packages/7f/a1/b2a24bb526c1ead9bdfc16a7d336c83b61d18cc5e144c9fa3c4dd7c79293/eigency-3.4.0.7-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:97131f1b3dc722c46a6ed443f07521b1982571bd9af03efd3798568f5d260358", size = 2556830, upload-time = "2026-02-25T07:54:33.511Z" }, - { url = "https://files.pythonhosted.org/packages/55/bd/27afd65a41f4eb68f70f81acbcfbf7ada8bed7bae29241171376501aaa44/eigency-3.4.0.7-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:bc98d3b46bac63f5dadae94a7e9a2e7907c59db739d98ff09a6bbead9507d6c7", size = 2396840, upload-time = "2026-02-25T07:54:35.257Z" }, - { url = "https://files.pythonhosted.org/packages/61/57/51160b6cdcef2ee46c72d108729010b54b7e4e0b9b318754a3664d3b6102/eigency-3.4.0.7-cp312-cp312-win32.whl", hash = 
"sha256:708bab5af3041a5731b3bc9d00cee7234c2ff417d01da7a6c799c85632553397", size = 1585879, upload-time = "2026-02-25T07:54:37.365Z" }, - { url = "https://files.pythonhosted.org/packages/9f/4c/8c65e972a9685471dbe38880ad5cf8df03451dad798056566d0fd3538444/eigency-3.4.0.7-cp312-cp312-win_amd64.whl", hash = "sha256:96e1a447e131424d662d48c2936d50e037bf928327398e415a72ccd26984dbe7", size = 1608924, upload-time = "2026-02-25T07:54:39.217Z" }, - { url = "https://files.pythonhosted.org/packages/af/9f/867229dd7e4b885a1eaa5f7d0c17bbca876df4fb0c87a977b0176b38d296/eigency-3.4.0.7-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:1e78f068055da30934eede33960968468ec0883b6b52f7c1658cbdb41e50131a", size = 1619459, upload-time = "2026-02-25T07:54:41.18Z" }, - { url = "https://files.pythonhosted.org/packages/38/e8/4ee7f74c3be0b3c202a27a9cb6e0ba39d79b752bdb7592c2480b1515d60d/eigency-3.4.0.7-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:030417445bad9acd44c3ee7ee90e02424f42a60781cd914a09d3e23e430968f8", size = 2544291, upload-time = "2026-02-25T07:54:43.138Z" }, - { url = "https://files.pythonhosted.org/packages/bd/40/48ef7bac1dee5013aa71d9115c046906c7aa3725f8c3b500f8f1d34ecf90/eigency-3.4.0.7-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:c9d67f62724e635a3cfc59c46c3dca0efde73a10527cb37b63839522349ed381", size = 2382996, upload-time = "2026-02-25T07:54:44.815Z" }, - { url = "https://files.pythonhosted.org/packages/86/35/ad9403497ad582dfc3f084bc9b0f517b78c2dbc93dc85e8290625a568132/eigency-3.4.0.7-cp313-cp313-win32.whl", hash = "sha256:48512e223fc362c016dcddb382fe5a62e749d5b2ae5cb27936901a4d4caf379a", size = 1585867, upload-time = "2026-02-25T07:54:46.541Z" }, - { url = "https://files.pythonhosted.org/packages/57/9b/5d0abd45b014edbd9114a55385f82061bf3f95a08bccb9bf1c0588e5773e/eigency-3.4.0.7-cp313-cp313-win_amd64.whl", hash = "sha256:a74eb3c218ec1785c0be44d671807bc739755d27b34c7497f0af3b6ad2ac4d5d", size = 1609933, upload-time = "2026-02-25T07:54:48.963Z" }, + { url = "https://files.pythonhosted.org/packages/f4/ba/bd4f30f825e8a9bfb4950a0682e8e6f7b258615beaa65b712c6b767481cc/eigency-5.0.1.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:27795b5dc4564cf00d42bb3e1d777df0e76b01a65a248e786dceb9bd297aa3cb", size = 1621555, upload-time = "2026-04-25T13:35:38.898Z" }, + { url = "https://files.pythonhosted.org/packages/d9/89/8aa501a8939b8df36ac918e62e3ba11b471cda8d88744656aa01caa5cbc4/eigency-5.0.1.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:c0b6c3e77ec5cacbe66cc6fb4b858f54f6829594e024d351cddb94ff0fb73331", size = 2581353, upload-time = "2026-04-25T13:35:40.962Z" }, + { url = "https://files.pythonhosted.org/packages/a2/2c/57a89d1dbf1c2b6fae491c50f74397808d71f2fa2a9ed9594fc3553a7d04/eigency-5.0.1.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:ba43680b154351e599f06e3dcd43f8c0187109f90fda11f0365047fa07d9b240", size = 2428649, upload-time = "2026-04-25T13:35:42.969Z" }, + { url = "https://files.pythonhosted.org/packages/b5/f3/064e52a16821f54524c161035cbbec3eb537abe6071b3ac8d5cb5ef6f64c/eigency-5.0.1.0-cp311-cp311-win32.whl", hash = "sha256:77eca0e133d968dc8db1e2b12163bfcb70a2041ecfb3e249cc249299c158c57a", size = 1593348, upload-time = "2026-04-25T13:35:45.218Z" }, + { url = "https://files.pythonhosted.org/packages/c8/38/1580086322333cd4079e1dda6fc4156c4df9b1d39fea2d61e5d1361f200b/eigency-5.0.1.0-cp311-cp311-win_amd64.whl", hash = 
"sha256:710b858d0a308400b0dce1b96d78fb6b41efe8d49ba669446a615333040543dc", size = 1618602, upload-time = "2026-04-25T13:35:47.448Z" }, + { url = "https://files.pythonhosted.org/packages/b8/b4/966679cebb2bd8efb1477c47f4889b8d2936661ef252decf73034fe78a79/eigency-5.0.1.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:3b3fa67fa3c8aae8144c8956909a70a12ebd883d28120d01ccac10218edf6fd2", size = 1620211, upload-time = "2026-04-25T13:35:49.146Z" }, + { url = "https://files.pythonhosted.org/packages/c9/db/abef8c2b294c0894ec6f3d957e07d475f75b462b9b262357f1d18e3b66a4/eigency-5.0.1.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:29f1dc3f4466c512b4e435fc70154a948a53d62588460fc38cb7adba4e0ee1cf", size = 2556831, upload-time = "2026-04-25T13:35:50.916Z" }, + { url = "https://files.pythonhosted.org/packages/3d/c4/ba5239b435c21296451c5b4fef8147141c1cd93ed9072b787da3cd946fa8/eigency-5.0.1.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:f9a32c5925cdb013d343a15fb6d869d9b8b2670f21a60223511cb730de1e5951", size = 2396834, upload-time = "2026-04-25T13:35:52.667Z" }, + { url = "https://files.pythonhosted.org/packages/dd/59/f2f79fd0896ebc2a3abcba4e3a1b16ee6a0fb4b7d103f32be0aff6b990af/eigency-5.0.1.0-cp312-cp312-win32.whl", hash = "sha256:a2d615b7862865702a86c46f212e2e39def3b02d1aa464318297ed3de27a9373", size = 1585848, upload-time = "2026-04-25T13:35:55.148Z" }, + { url = "https://files.pythonhosted.org/packages/69/cc/c0c910a3f31c124ed1cd14f08c3c829b154b59edce2280d4d90640bce3bf/eigency-5.0.1.0-cp312-cp312-win_amd64.whl", hash = "sha256:0578dbef42bc09fdbe99048038b13775f9d67b3ae04888dcb853d2105d215282", size = 1608927, upload-time = "2026-04-25T13:35:56.841Z" }, + { url = "https://files.pythonhosted.org/packages/df/70/e42b7121aab50369ba8e999512340a1e2ccc63bb8c7c958fbef42aee3fa6/eigency-5.0.1.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:7ca712b400b0e5c8a1b99918c322a7d62686660ffa664deb9c2b557edc363edf", size = 1619432, upload-time = "2026-04-25T13:35:58.804Z" }, + { url = "https://files.pythonhosted.org/packages/99/9d/e31f035e4b5357c48ac2e31730373db3e48b606413c8de1a3c8daf3b5bf4/eigency-5.0.1.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:d75639d68cf8fb64b3cf6e59d7892ae1d4164286b58bfb6a0db41219b654dc6a", size = 2544289, upload-time = "2026-04-25T13:36:00.839Z" }, + { url = "https://files.pythonhosted.org/packages/40/f9/4e585a05ff6c81ef23cd64d34f2431dfe6fb29890f7f2f68233cf2001ea9/eigency-5.0.1.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:546a98be48e616ee59cf5b9a9716cb5001c2425f9606794eaa16c7b09667a301", size = 2382996, upload-time = "2026-04-25T13:36:02.813Z" }, + { url = "https://files.pythonhosted.org/packages/de/18/a0e25637ece20c123f438c5180523b5b29bb8ef63518b39bf0b4cf2fc1f7/eigency-5.0.1.0-cp313-cp313-win32.whl", hash = "sha256:4301ca88a7426a4a65ff827ad9c00e20808115ec9f18497f76cbb5662df246b7", size = 1585843, upload-time = "2026-04-25T13:36:04.434Z" }, + { url = "https://files.pythonhosted.org/packages/3e/3d/80aa31bd9f1e1f50562893af3880643b975d8c97b125909b2218a4f70dac/eigency-5.0.1.0-cp313-cp313-win_amd64.whl", hash = "sha256:be05dd2c0ad9e1ab534e25e03d7fb8b53c75e38aaa743f986dff4872312d04ff", size = 1609927, upload-time = "2026-04-25T13:36:05.969Z" }, ] [[package]] @@ -721,11 +730,11 @@ wheels = [ [[package]] name = "filelock" -version = "3.25.2" +version = "3.29.0" source = { registry = "https://pypi.org/simple" } -sdist = { url = 
"https://files.pythonhosted.org/packages/94/b8/00651a0f559862f3bb7d6f7477b192afe3f583cc5e26403b44e59a55ab34/filelock-3.25.2.tar.gz", hash = "sha256:b64ece2b38f4ca29dd3e810287aa8c48182bbecd1ae6e9ae126c9b35f1382694", size = 40480, upload-time = "2026-03-11T20:45:38.487Z" } +sdist = { url = "https://files.pythonhosted.org/packages/b5/fe/997687a931ab51049acce6fa1f23e8f01216374ea81374ddee763c493db5/filelock-3.29.0.tar.gz", hash = "sha256:69974355e960702e789734cb4871f884ea6fe50bd8404051a3530bc07809cf90", size = 57571, upload-time = "2026-04-19T15:39:10.068Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/a4/a5/842ae8f0c08b61d6484b52f99a03510a3a72d23141942d216ebe81fefbce/filelock-3.25.2-py3-none-any.whl", hash = "sha256:ca8afb0da15f229774c9ad1b455ed96e85a81373065fb10446672f64444ddf70", size = 26759, upload-time = "2026-03-11T20:45:37.437Z" }, + { url = "https://files.pythonhosted.org/packages/81/47/dd9a212ef6e343a6857485ffe25bba537304f1913bdbed446a23f7f592e1/filelock-3.29.0-py3-none-any.whl", hash = "sha256:96f5f6344709aa1572bbf631c640e4ebeeb519e08da902c39a001882f30ac258", size = 39812, upload-time = "2026-04-19T15:39:08.752Z" }, ] [[package]] @@ -892,34 +901,34 @@ wheels = [ [[package]] name = "greenlet" -version = "3.3.2" +version = "3.5.0" source = { registry = "https://pypi.org/simple" } -sdist = { url = "https://files.pythonhosted.org/packages/a3/51/1664f6b78fc6ebbd98019a1fd730e83fa78f2db7058f72b1463d3612b8db/greenlet-3.3.2.tar.gz", hash = "sha256:2eaf067fc6d886931c7962e8c6bede15d2f01965560f3359b27c80bde2d151f2", size = 188267, upload-time = "2026-02-20T20:54:15.531Z" } -wheels = [ - { url = "https://files.pythonhosted.org/packages/f3/47/16400cb42d18d7a6bb46f0626852c1718612e35dcb0dffa16bbaffdf5dd2/greenlet-3.3.2-cp311-cp311-macosx_11_0_universal2.whl", hash = "sha256:c56692189a7d1c7606cb794be0a8381470d95c57ce5be03fb3d0ef57c7853b86", size = 278890, upload-time = "2026-02-20T20:19:39.263Z" }, - { url = "https://files.pythonhosted.org/packages/a3/90/42762b77a5b6aa96cd8c0e80612663d39211e8ae8a6cd47c7f1249a66262/greenlet-3.3.2-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:1ebd458fa8285960f382841da585e02201b53a5ec2bac6b156fc623b5ce4499f", size = 581120, upload-time = "2026-02-20T20:47:30.161Z" }, - { url = "https://files.pythonhosted.org/packages/bf/6f/f3d64f4fa0a9c7b5c5b3c810ff1df614540d5aa7d519261b53fba55d4df9/greenlet-3.3.2-cp311-cp311-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:a443358b33c4ec7b05b79a7c8b466f5d275025e750298be7340f8fc63dff2a55", size = 594363, upload-time = "2026-02-20T20:55:56.965Z" }, - { url = "https://files.pythonhosted.org/packages/72/83/3e06a52aca8128bdd4dcd67e932b809e76a96ab8c232a8b025b2850264c5/greenlet-3.3.2-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:8e2cd90d413acbf5e77ae41e5d3c9b3ac1d011a756d7284d7f3f2b806bbd6358", size = 594156, upload-time = "2026-02-20T20:20:59.955Z" }, - { url = "https://files.pythonhosted.org/packages/70/79/0de5e62b873e08fe3cef7dbe84e5c4bc0e8ed0c7ff131bccb8405cd107c8/greenlet-3.3.2-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:442b6057453c8cb29b4fb36a2ac689382fc71112273726e2423f7f17dc73bf99", size = 1554649, upload-time = "2026-02-20T20:49:32.293Z" }, - { url = "https://files.pythonhosted.org/packages/5a/00/32d30dee8389dc36d42170a9c66217757289e2afb0de59a3565260f38373/greenlet-3.3.2-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:45abe8eb6339518180d5a7fa47fa01945414d7cca5ecb745346fc6a87d2750be", size = 1619472, upload-time = 
"2026-02-20T20:21:07.966Z" }, - { url = "https://files.pythonhosted.org/packages/f1/3a/efb2cf697fbccdf75b24e2c18025e7dfa54c4f31fab75c51d0fe79942cef/greenlet-3.3.2-cp311-cp311-win_amd64.whl", hash = "sha256:1e692b2dae4cc7077cbb11b47d258533b48c8fde69a33d0d8a82e2fe8d8531d5", size = 230389, upload-time = "2026-02-20T20:17:18.772Z" }, - { url = "https://files.pythonhosted.org/packages/e1/a1/65bbc059a43a7e2143ec4fc1f9e3f673e04f9c7b371a494a101422ac4fd5/greenlet-3.3.2-cp311-cp311-win_arm64.whl", hash = "sha256:02b0a8682aecd4d3c6c18edf52bc8e51eacdd75c8eac52a790a210b06aa295fd", size = 229645, upload-time = "2026-02-20T20:18:18.695Z" }, - { url = "https://files.pythonhosted.org/packages/ea/ab/1608e5a7578e62113506740b88066bf09888322a311cff602105e619bd87/greenlet-3.3.2-cp312-cp312-macosx_11_0_universal2.whl", hash = "sha256:ac8d61d4343b799d1e526db579833d72f23759c71e07181c2d2944e429eb09cd", size = 280358, upload-time = "2026-02-20T20:17:43.971Z" }, - { url = "https://files.pythonhosted.org/packages/a5/23/0eae412a4ade4e6623ff7626e38998cb9b11e9ff1ebacaa021e4e108ec15/greenlet-3.3.2-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:3ceec72030dae6ac0c8ed7591b96b70410a8be370b6a477b1dbc072856ad02bd", size = 601217, upload-time = "2026-02-20T20:47:31.462Z" }, - { url = "https://files.pythonhosted.org/packages/f8/16/5b1678a9c07098ecb9ab2dd159fafaf12e963293e61ee8d10ecb55273e5e/greenlet-3.3.2-cp312-cp312-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:a2a5be83a45ce6188c045bcc44b0ee037d6a518978de9a5d97438548b953a1ac", size = 611792, upload-time = "2026-02-20T20:55:58.423Z" }, - { url = "https://files.pythonhosted.org/packages/50/1f/5155f55bd71cabd03765a4aac9ac446be129895271f73872c36ebd4b04b6/greenlet-3.3.2-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:43e99d1749147ac21dde49b99c9abffcbc1e2d55c67501465ef0930d6e78e070", size = 613875, upload-time = "2026-02-20T20:21:01.102Z" }, - { url = "https://files.pythonhosted.org/packages/fc/dd/845f249c3fcd69e32df80cdab059b4be8b766ef5830a3d0aa9d6cad55beb/greenlet-3.3.2-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:4c956a19350e2c37f2c48b336a3afb4bff120b36076d9d7fb68cb44e05d95b79", size = 1571467, upload-time = "2026-02-20T20:49:33.495Z" }, - { url = "https://files.pythonhosted.org/packages/2a/50/2649fe21fcc2b56659a452868e695634722a6655ba245d9f77f5656010bf/greenlet-3.3.2-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:6c6f8ba97d17a1e7d664151284cb3315fc5f8353e75221ed4324f84eb162b395", size = 1640001, upload-time = "2026-02-20T20:21:09.154Z" }, - { url = "https://files.pythonhosted.org/packages/9b/40/cc802e067d02af8b60b6771cea7d57e21ef5e6659912814babb42b864713/greenlet-3.3.2-cp312-cp312-win_amd64.whl", hash = "sha256:34308836d8370bddadb41f5a7ce96879b72e2fdfb4e87729330c6ab52376409f", size = 231081, upload-time = "2026-02-20T20:17:28.121Z" }, - { url = "https://files.pythonhosted.org/packages/58/2e/fe7f36ff1982d6b10a60d5e0740c759259a7d6d2e1dc41da6d96de32fff6/greenlet-3.3.2-cp312-cp312-win_arm64.whl", hash = "sha256:d3a62fa76a32b462a97198e4c9e99afb9ab375115e74e9a83ce180e7a496f643", size = 230331, upload-time = "2026-02-20T20:17:23.34Z" }, - { url = "https://files.pythonhosted.org/packages/ac/48/f8b875fa7dea7dd9b33245e37f065af59df6a25af2f9561efa8d822fde51/greenlet-3.3.2-cp313-cp313-macosx_11_0_universal2.whl", hash = "sha256:aa6ac98bdfd716a749b84d4034486863fd81c3abde9aa3cf8eff9127981a4ae4", size = 279120, upload-time = "2026-02-20T20:19:01.9Z" }, - { url = 
"https://files.pythonhosted.org/packages/49/8d/9771d03e7a8b1ee456511961e1b97a6d77ae1dea4a34a5b98eee706689d3/greenlet-3.3.2-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:ab0c7e7901a00bc0a7284907273dc165b32e0d109a6713babd04471327ff7986", size = 603238, upload-time = "2026-02-20T20:47:32.873Z" }, - { url = "https://files.pythonhosted.org/packages/59/0e/4223c2bbb63cd5c97f28ffb2a8aee71bdfb30b323c35d409450f51b91e3e/greenlet-3.3.2-cp313-cp313-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:d248d8c23c67d2291ffd47af766e2a3aa9fa1c6703155c099feb11f526c63a92", size = 614219, upload-time = "2026-02-20T20:55:59.817Z" }, - { url = "https://files.pythonhosted.org/packages/7a/34/259b28ea7a2a0c904b11cd36c79b8cef8019b26ee5dbe24e73b469dea347/greenlet-3.3.2-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:b6997d360a4e6a4e936c0f9625b1c20416b8a0ea18a8e19cabbefc712e7397ab", size = 616774, upload-time = "2026-02-20T20:21:02.454Z" }, - { url = "https://files.pythonhosted.org/packages/0a/03/996c2d1689d486a6e199cb0f1cf9e4aa940c500e01bdf201299d7d61fa69/greenlet-3.3.2-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:64970c33a50551c7c50491671265d8954046cb6e8e2999aacdd60e439b70418a", size = 1571277, upload-time = "2026-02-20T20:49:34.795Z" }, - { url = "https://files.pythonhosted.org/packages/d9/c4/2570fc07f34a39f2caf0bf9f24b0a1a0a47bc2e8e465b2c2424821389dfc/greenlet-3.3.2-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:1a9172f5bf6bd88e6ba5a84e0a68afeac9dc7b6b412b245dd64f52d83c81e55b", size = 1640455, upload-time = "2026-02-20T20:21:10.261Z" }, - { url = "https://files.pythonhosted.org/packages/91/39/5ef5aa23bc545aa0d31e1b9b55822b32c8da93ba657295840b6b34124009/greenlet-3.3.2-cp313-cp313-win_amd64.whl", hash = "sha256:a7945dd0eab63ded0a48e4dcade82939783c172290a7903ebde9e184333ca124", size = 230961, upload-time = "2026-02-20T20:16:58.461Z" }, - { url = "https://files.pythonhosted.org/packages/62/6b/a89f8456dcb06becff288f563618e9f20deed8dd29beea14f9a168aef64b/greenlet-3.3.2-cp313-cp313-win_arm64.whl", hash = "sha256:394ead29063ee3515b4e775216cb756b2e3b4a7e55ae8fd884f17fa579e6b327", size = 230221, upload-time = "2026-02-20T20:17:37.152Z" }, +sdist = { url = "https://files.pythonhosted.org/packages/3c/3f/dbf99fb14bfeb88c28f16729215478c0e265cacd6dc22270c8f31bb6892f/greenlet-3.5.0.tar.gz", hash = "sha256:d419647372241bc68e957bf38d5c1f98852155e4146bd1e4121adea81f4f01e4", size = 196995, upload-time = "2026-04-27T13:37:15.544Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/8b/0f/a91f143f356523ff682309732b175765a9bc2836fd7c081c2c67fedc1ad4/greenlet-3.5.0-cp311-cp311-macosx_11_0_universal2.whl", hash = "sha256:8f1cc966c126639cd152fdaa52624d2655f492faa79e013fea161de3e6dda082", size = 284726, upload-time = "2026-04-27T12:20:51.402Z" }, + { url = "https://files.pythonhosted.org/packages/95/82/800646c7ffc5dbabd75ddd2f6b519bb898c0c9c969e5d0473bfe5d20bcce/greenlet-3.5.0-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:362624e6a8e5bca3b8233e45eef33903a100e9539a2b995c364d595dbc4018b3", size = 604264, upload-time = "2026-04-27T12:52:39.494Z" }, + { url = "https://files.pythonhosted.org/packages/ca/ac/354867c0bba812fc33b15bc55aedafedd0aee3c7dd91dfca22444157dc0c/greenlet-3.5.0-cp311-cp311-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:5ecd83806b0f4c2f53b1018e0005cd82269ea01d42befc0368730028d850ed1c", size = 616099, upload-time = "2026-04-27T12:59:39.623Z" }, + { url = 
"https://files.pythonhosted.org/packages/ff/b0/815bece7399e01cadb69014219eebd0042339875c59a59b0820a46ece356/greenlet-3.5.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:0ff251e9a0279522e62f6176412869395a64ddf2b5c5f782ff609a8216a4e662", size = 615198, upload-time = "2026-04-27T12:25:25.928Z" }, + { url = "https://files.pythonhosted.org/packages/10/80/3b2c0a895d6698f6ddb31b07942ebfa982f3e30888bc5546a5b5990de8b2/greenlet-3.5.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:6d874e79afd41a96e11ff4c5d0bc90a80973e476fda1c2c64985667397df432b", size = 1574927, upload-time = "2026-04-27T12:53:25.81Z" }, + { url = "https://files.pythonhosted.org/packages/44/0e/f354af514a4c61454dbc68e44d47544a5a4d6317e30b77ddfa3a09f4c5f3/greenlet-3.5.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:0ed006e4b86c59de7467eb2601cd1b77b5a7d657d1ee55e30fe30d76451edba4", size = 1642683, upload-time = "2026-04-27T12:25:23.9Z" }, + { url = "https://files.pythonhosted.org/packages/fa/6a/87f38255201e993a1915265ebb80cd7c2c78b04a45744995abbf6b259fd8/greenlet-3.5.0-cp311-cp311-win_amd64.whl", hash = "sha256:703cb211b820dbffbbc55a16bfc6e4583a6e6e990f33a119d2cc8b83211119c8", size = 238115, upload-time = "2026-04-27T12:21:48.845Z" }, + { url = "https://files.pythonhosted.org/packages/e3/f8/450fe3c5938fa737ea4d22699772e6e34e8e24431a47bf4e8a1ceed4a98e/greenlet-3.5.0-cp311-cp311-win_arm64.whl", hash = "sha256:6c18dfb59c70f5a94acd271c72e90128c3c776e41e5f07767908c8c1b74ad339", size = 235017, upload-time = "2026-04-27T12:22:26.768Z" }, + { url = "https://files.pythonhosted.org/packages/ef/32/f2ce6d4cac3e55bc6173f92dbe627e782e1850f89d986c3606feb63aafa7/greenlet-3.5.0-cp312-cp312-macosx_11_0_universal2.whl", hash = "sha256:db2910d3c809444e0a20147361f343fe2798e106af8d9d8506f5305302655a9f", size = 286228, upload-time = "2026-04-27T12:20:34.421Z" }, + { url = "https://files.pythonhosted.org/packages/b7/aa/caed9e5adf742315fc7be2a84196373aab4816e540e38ba0d76cb7584d68/greenlet-3.5.0-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:3ec9ea74e7268ace7f9aab1b1a4e730193fc661b39a993cd91c606c32d4a3628", size = 601775, upload-time = "2026-04-27T12:52:41.045Z" }, + { url = "https://files.pythonhosted.org/packages/c7/af/90ae08497400a941595d12774447f752d3dfe0fbb012e35b76bc5c0ff37e/greenlet-3.5.0-cp312-cp312-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:54d243512da35485fc7a6bf3c178fdda6327a9d6506fcdd62b1abd1e41b2927b", size = 614436, upload-time = "2026-04-27T12:59:41.595Z" }, + { url = "https://files.pythonhosted.org/packages/2b/e0/2e13df68f367e2f9960616927d60857dd7e56aaadd59a47c644216b2f920/greenlet-3.5.0-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:9d280a7f5c331622c69f97eb167f33577ff2d1df282c41cd15907fc0a3ca198c", size = 611388, upload-time = "2026-04-27T12:25:28.008Z" }, + { url = "https://files.pythonhosted.org/packages/82/f7/393c64055132ac0d488ef6be549253b7e6274194863967ddc0bc8f5b87b8/greenlet-3.5.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:1eb67d5adefb5bd2e182d42678a328979a209e4e82eb93575708185d31d1f588", size = 1570768, upload-time = "2026-04-27T12:53:28.099Z" }, + { url = "https://files.pythonhosted.org/packages/b8/4b/eaf7735253522cf56d1b74d672a58f54fc114702ceaf05def59aae72f6e1/greenlet-3.5.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:2628d6c86f6cb0cb45e0c3c54058bbec559f57eaae699447748cb3928150577e", size = 1635983, upload-time = "2026-04-27T12:25:26.903Z" }, + { url = 
"https://files.pythonhosted.org/packages/4c/fe/4fb3a0805bd5165da5ebf858da7cc01cce8061674106d2cf5bdab32cbfde/greenlet-3.5.0-cp312-cp312-win_amd64.whl", hash = "sha256:d4d9f0624c775f2dfc56ba54d515a8c771044346852a918b405914f6b19d7fd8", size = 238840, upload-time = "2026-04-27T12:23:54.806Z" }, + { url = "https://files.pythonhosted.org/packages/cb/cb/baa584cb00532126ffe12d9787db0a60c5a4f55c27bfe2666df5d4c30a32/greenlet-3.5.0-cp312-cp312-win_arm64.whl", hash = "sha256:83ed9f27f1680b50e89f40f6df348a290ea234b249a4003d366663a12eab94f2", size = 235615, upload-time = "2026-04-27T12:21:38.57Z" }, + { url = "https://files.pythonhosted.org/packages/0c/58/fc576f99037ce19c5aa16628e4c3226b6d1419f72a62c79f5f40576e6eb3/greenlet-3.5.0-cp313-cp313-macosx_11_0_universal2.whl", hash = "sha256:5a5ed18de6a0f6cc7087f1563f6bd93fc7df1c19165ca01e9bde5a5dc281d106", size = 285066, upload-time = "2026-04-27T12:23:05.033Z" }, + { url = "https://files.pythonhosted.org/packages/4a/ba/b28ddbe6bfad6a8ac196ef0e8cff37bc65b79735995b9e410923fffeeb70/greenlet-3.5.0-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:3a717fbc46d8a354fa675f7c1e813485b6ba3885f9bef0cd56e5ba27d758ff5b", size = 604414, upload-time = "2026-04-27T12:52:42.358Z" }, + { url = "https://files.pythonhosted.org/packages/09/06/4b69f8f0b67603a8be2790e55107a190b376f2627fe0eaf5695d85ffb3cd/greenlet-3.5.0-cp313-cp313-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:ddc090c5c1792b10246a78e8c2163ebbe04cf877f9d785c230a7b27b39ad038e", size = 617349, upload-time = "2026-04-27T12:59:43.32Z" }, + { url = "https://files.pythonhosted.org/packages/8a/17/a3918541fd0ddefe024a69de6d16aa7b46d36ac19562adaa63c7fa180eff/greenlet-3.5.0-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:2094acd54b272cb6eae8c03dd87b3fa1820a4cef18d6889c378d503500a1dc13", size = 613927, upload-time = "2026-04-27T12:25:30.28Z" }, + { url = "https://files.pythonhosted.org/packages/ee/e1/bd0af6213c7dd33175d8a462d4c1fe1175124ebed4855bc1475a5b5242c2/greenlet-3.5.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:5e05ba267789ea87b5a155cf0e810b1ab88bf18e9e8740813945ceb8ee4350ba", size = 1570893, upload-time = "2026-04-27T12:53:29.483Z" }, + { url = "https://files.pythonhosted.org/packages/9b/2a/0789702f864f5382cb476b93d7a9c823c10472658102ccd65f415747d2e2/greenlet-3.5.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:0ecec963079cd58cbd14723582384f11f166fd58883c15dcbfb342e0bc9b5846", size = 1636060, upload-time = "2026-04-27T12:25:28.845Z" }, + { url = "https://files.pythonhosted.org/packages/b2/8f/22bf9df92bbff0eb07842b60f7e63bf7675a9742df628437a9f02d09137f/greenlet-3.5.0-cp313-cp313-win_amd64.whl", hash = "sha256:728d9667d8f2f586644b748dbd9bb67e50d6a9381767d1357714ea6825bb3bf5", size = 238740, upload-time = "2026-04-27T12:24:01.341Z" }, + { url = "https://files.pythonhosted.org/packages/b6/b7/9c5c3d653bd4ff614277c049ac676422e2c557db47b4fe43e6313fc005dc/greenlet-3.5.0-cp313-cp313-win_arm64.whl", hash = "sha256:47422135b1d308c14b2c6e758beedb1acd33bb91679f5670edf77bf46244722b", size = 235525, upload-time = "2026-04-27T12:23:12.308Z" }, ] [[package]] @@ -992,43 +1001,43 @@ wheels = [ [[package]] name = "highspy" -version = "1.13.1" +version = "1.14.0" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "numpy" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/f3/17/74553587944f514142618c6d1b76551acf09fb0dd48abe9f4c37fb7d2bb8/highspy-1.13.1.tar.gz", hash = 
"sha256:7888873501c6ca3e0fa19fee960c8b3cb1c64132c5a9b514903cc7e259b5b0c7", size = 1597930, upload-time = "2026-02-11T16:39:55.185Z" } -wheels = [ - { url = "https://files.pythonhosted.org/packages/b2/59/93e441e5067942113519d46e5786b948a5b356972dd5e744ab62ca21b5b5/highspy-1.13.1-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:a756dc2339f56d5979ddff1081ed49b68e272cae57ce33e44df6c9a64c306381", size = 2238251, upload-time = "2026-02-11T16:38:05.501Z" }, - { url = "https://files.pythonhosted.org/packages/40/af/024a071d4a0e172b71e5a4be3c3cffc87759043420534257897971f5c273/highspy-1.13.1-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:9a215cfb0975cc70a90cea5663d3a0c65c675ced50f0c3148727cac9b3b27a58", size = 2048259, upload-time = "2026-02-11T16:38:07.075Z" }, - { url = "https://files.pythonhosted.org/packages/45/f8/98080872d04063473467b99ac1d49cc0a24f95c6ca67760e1bec20fb34bf/highspy-1.13.1-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:4d98fe763e42980e253edb8832c4fc719e15107d32ae80802dcb43bf506bec20", size = 2318544, upload-time = "2026-02-11T16:38:09.408Z" }, - { url = "https://files.pythonhosted.org/packages/10/d1/596aa00778547fb649a585e8bb053c08ae1fdced2b9788040b4cb0b5b762/highspy-1.13.1-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:bd5a921cdea21e816a1b1b36944803a51b72b0a87f7b7e00635ec9cd6aefca60", size = 2524373, upload-time = "2026-02-11T16:38:10.924Z" }, - { url = "https://files.pythonhosted.org/packages/80/8e/1696c272036b37eb40d5b718ddbb650d4e8d3e4242af79ad5755cbf5a899/highspy-1.13.1-cp311-cp311-manylinux_2_26_i686.manylinux_2_28_i686.whl", hash = "sha256:cdfc79cf4395434d27cc84a965e7bc9e76e59b38defb3879b694be6bbd44ff28", size = 2690185, upload-time = "2026-02-11T16:38:12.408Z" }, - { url = "https://files.pythonhosted.org/packages/cf/01/a30e3473ed2a7cf5ecf8eda016a2c34b4235cfba5c597c64ced325e28ace/highspy-1.13.1-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:c1087def948268ef7100acea50d029f8ffce3db1d623dce3112babfe677dab90", size = 3377439, upload-time = "2026-02-11T16:38:14.135Z" }, - { url = "https://files.pythonhosted.org/packages/89/d7/fd5ed5d4f7197c8871c093622936091b9d701eda9844b051d88e467a4a4d/highspy-1.13.1-cp311-cp311-musllinux_1_2_i686.whl", hash = "sha256:4bfbc28842ec5286b7db14ff3f9dcd772196327242183a65209f9558c5562888", size = 3938370, upload-time = "2026-02-11T16:38:15.834Z" }, - { url = "https://files.pythonhosted.org/packages/4f/de/26254f08f7b439310b78dd3a89d9515fe1007bc7297997aaa0d689434e4d/highspy-1.13.1-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:5d08e309f514dad949453b3f962dc1c6f757e6d7db34f670e14f8dd27fb0e4d5", size = 3605772, upload-time = "2026-02-11T16:38:19.094Z" }, - { url = "https://files.pythonhosted.org/packages/e7/b7/9af79898d70db1c0b93f83b077dbb213075f9eafd4a084a751dd703b2fa5/highspy-1.13.1-cp311-cp311-win32.whl", hash = "sha256:46feb46c3109e9b66dc53db1804e647e0351234c603fcc53287c1d3d53998d08", size = 1888063, upload-time = "2026-02-11T16:38:21.341Z" }, - { url = "https://files.pythonhosted.org/packages/6a/52/760ceb66b5e64736d6620906e6b7f41ea60cf5eeacd214f337ecf640b2dc/highspy-1.13.1-cp311-cp311-win_amd64.whl", hash = "sha256:2a2dc6b909284de95f84d288cfecafe2826245ae108dc33887e999bd8f9df75f", size = 2232508, upload-time = "2026-02-11T16:38:23.225Z" }, - { url = "https://files.pythonhosted.org/packages/98/9f/98a103b443e42755aa4e63800d6a517a928e32f28c40ff2ecce7d9c4e20a/highspy-1.13.1-cp312-cp312-macosx_10_13_x86_64.whl", hash = 
"sha256:31f9f7001b9a6afd1cb06e5f3ad24848aa0df78a32b004538f2630535a881e7d", size = 2240384, upload-time = "2026-02-11T16:38:24.765Z" }, - { url = "https://files.pythonhosted.org/packages/8e/98/920115e7e451e20a2a84e3fec4f5ade6760561d96a001ee8c87886331b9a/highspy-1.13.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:ce77b3771c20715e4552b890b9a2f4eee33faf1bd86ab08c6a7b00424ebe12b3", size = 2051929, upload-time = "2026-02-11T16:38:26.235Z" }, - { url = "https://files.pythonhosted.org/packages/f5/6b/9efc679003ff5d10cbf88e02deb47c7095b2e23d700eea895a4faecf1dcc/highspy-1.13.1-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:f469f0a340870e1f16324a096dead3775857470b7e9d552b08a878a945f34917", size = 2321820, upload-time = "2026-02-11T16:38:27.75Z" }, - { url = "https://files.pythonhosted.org/packages/ee/ad/be6847577282389a5b8ab4ef6bee71c5ccb2383c84a123561b4e538449b7/highspy-1.13.1-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:f00b426b09e28306b279645beec2f892dd652ad41ec1597b742865bb844dd38d", size = 2525364, upload-time = "2026-02-11T16:38:29.152Z" }, - { url = "https://files.pythonhosted.org/packages/70/83/6c72c558c3bed2a06b3d850943621376116f374949bc77f2b353995cfba9/highspy-1.13.1-cp312-cp312-manylinux_2_26_i686.manylinux_2_28_i686.whl", hash = "sha256:dce26ba2926612847be6d3e01ee98df2af8472a99de8ad9dc562060f6c8e5d0a", size = 2696833, upload-time = "2026-02-11T16:38:30.65Z" }, - { url = "https://files.pythonhosted.org/packages/1e/94/376940d949547d5bfdd68a86b0631d8c2fd86a7fd771c93a23c683be1293/highspy-1.13.1-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:506038db5c4732a4b65755aa63f580cef91e1a99a5d1399e311dc31d3300dadd", size = 3382720, upload-time = "2026-02-11T16:38:32.135Z" }, - { url = "https://files.pythonhosted.org/packages/fd/8d/6ecee935e5f57df9fbe9fedfcac14e4ac76186335595bf201f86302a7379/highspy-1.13.1-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:eda07ca3ec5506d279679d775dedc447fbcc3711814c5f2e22d2d063598586af", size = 3943880, upload-time = "2026-02-11T16:38:33.544Z" }, - { url = "https://files.pythonhosted.org/packages/a5/7b/524d977387e3ff3eba165feb81614ead36a7974c0f8d9948f8d9347f74ff/highspy-1.13.1-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:ce9481fb96771c3b74592e1a94cc8dc2748d18eafb0bad6a03ebca5f4cfaa8fd", size = 3610040, upload-time = "2026-02-11T16:38:35.582Z" }, - { url = "https://files.pythonhosted.org/packages/99/89/3133cc87d13399f66250f9e2ff0592c72a42edae945d27ec1dfadf009ebd/highspy-1.13.1-cp312-cp312-win32.whl", hash = "sha256:0a10b8d7bc6a2a50226bc6821f21ed5f456479950271e742aff8bc97412c4591", size = 1888436, upload-time = "2026-02-11T16:38:37.327Z" }, - { url = "https://files.pythonhosted.org/packages/2a/50/0c6986b87cc373dc9e9eb6254508f0f77bb4de897ae1d2a8c7e6ae70ef6c/highspy-1.13.1-cp312-cp312-win_amd64.whl", hash = "sha256:26f023093ed2fa2407f12a4a7dc9c1de253cd14228874e58e6719d78a74e2a9c", size = 2231427, upload-time = "2026-02-11T16:38:38.805Z" }, - { url = "https://files.pythonhosted.org/packages/86/4b/77e7f6b936dc130eb4dc78e64bb75d15940232027f519a7bb6564c91153b/highspy-1.13.1-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:0a255204f4ec79f5e447765d745e41bf3115a6c196ecfab5ed636d1ff53c0401", size = 2240587, upload-time = "2026-02-11T16:38:40.249Z" }, - { url = "https://files.pythonhosted.org/packages/6a/5c/8a57a0f2d18bd9fb88f64c87c11882defe90832361ecd4f612b36d1c5f05/highspy-1.13.1-cp313-cp313-macosx_11_0_arm64.whl", hash = 
"sha256:cbb7509b5606a1ac7bc1fff794cebeb668f43531678f898803c12cbb694bea61", size = 2052178, upload-time = "2026-02-11T16:38:41.628Z" }, - { url = "https://files.pythonhosted.org/packages/5d/d8/fec866c1a48e98897f6348a303fa565dc00fd29e5fb90e6befe1e1ead12b/highspy-1.13.1-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:e2e572c9ec5eb0fd5d178690981116413f277cb59d150bf271d927161335df23", size = 2321460, upload-time = "2026-02-11T16:38:43.04Z" }, - { url = "https://files.pythonhosted.org/packages/24/6f/3bfec9e2a0ff3adc20d3ae071572ae473f810c35e04a7455abf278623932/highspy-1.13.1-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e577ff0fdeccc7e207be8577dd8b223816d024c40c1b7af4afb6e96818f4544b", size = 2525308, upload-time = "2026-02-11T16:38:44.631Z" }, - { url = "https://files.pythonhosted.org/packages/33/7b/b21a2dffca27742a50c48e88b3c850b1a212506543ec7b38a5f843a689e0/highspy-1.13.1-cp313-cp313-manylinux_2_26_i686.manylinux_2_28_i686.whl", hash = "sha256:dc8677bb6da2216fd72110d95d904022bf3150832441eb08562bdb8d2c5aeb6c", size = 2696881, upload-time = "2026-02-11T16:38:47.498Z" }, - { url = "https://files.pythonhosted.org/packages/ea/cc/793b6819c1a8d360b8a536f62526d3bb54a3ba8deaf3b0ba7d4076e56da3/highspy-1.13.1-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:3ec5adcef1028b40ae6d76d03c24b512c5f33849c5e09b70c15734b0e48c68ca", size = 3382687, upload-time = "2026-02-11T16:38:49.18Z" }, - { url = "https://files.pythonhosted.org/packages/18/13/20bbbf7a20779d799565ab4ce9c9f80ad39eeeb069b2a91e1fd5dece3109/highspy-1.13.1-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:1a435dd11fcefab2e9e0ad90d75a459a656b177ef3d1173b31db889808580e92", size = 3943115, upload-time = "2026-02-11T16:38:50.761Z" }, - { url = "https://files.pythonhosted.org/packages/02/df/1c37558d33e52376822136c3288b65d74cb0482da841904dc5ec89540dc4/highspy-1.13.1-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:0f473e94b19c186fda0abed1e31d3682ac4a9109930e77a1901c63f1b3d7d772", size = 3610136, upload-time = "2026-02-11T16:38:52.724Z" }, - { url = "https://files.pythonhosted.org/packages/27/e3/094bae0233275f5d293c40598a97a7ec8b0dee61572d5c2086aec8e35056/highspy-1.13.1-cp313-cp313-win32.whl", hash = "sha256:1464ed94e467de3cc20fdf3779609dd13aa91db8d11305b1d8f6a029ae129f33", size = 1888446, upload-time = "2026-02-11T16:38:54.341Z" }, - { url = "https://files.pythonhosted.org/packages/e2/15/d9be1b1f22ecafedddbf504b3c63b7e4a17a710e3b45f615be1869160938/highspy-1.13.1-cp313-cp313-win_amd64.whl", hash = "sha256:620fad11a92517d525300ad70f27f84f41e8c0af10ece5ae5537f2be13ae0970", size = 2231418, upload-time = "2026-02-11T16:38:56.499Z" }, +sdist = { url = "https://files.pythonhosted.org/packages/7a/66/e74b1a805f65c52666e3b54cfc1ba783e745c2c8a7abaae9e7ef2d9e7270/highspy-1.14.0.tar.gz", hash = "sha256:b09cb5e3179a25fc615b8b0941130b0f71e19372c119f3dd620d63b54cd3ca4c", size = 1654913, upload-time = "2026-04-06T15:53:31.738Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/73/4e/c813156aa513eb3344b333c1424373cebf1f5843868b2ba5c49c64beecde/highspy-1.14.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:69558127aabad8b5718a58009dc3d36618c3aa5aa5e733206c32ce396189a132", size = 2310870, upload-time = "2026-04-06T15:51:51.586Z" }, + { url = "https://files.pythonhosted.org/packages/3f/6c/d4baa83e8745d729764bf960b51828e9c99c90b4f5cd99e65b59fbf2b6f9/highspy-1.14.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:3699fc8072a70664d5bbaf1c9239f8e2e8700c5090f57486f2ce4567f9b2b6aa", size 
= 2116701, upload-time = "2026-04-06T15:51:53.175Z" }, + { url = "https://files.pythonhosted.org/packages/cd/64/9dbafa1f3f9ec9293c4038d64b4a49a7a577e1ffcc5c48cf861849d2cff0/highspy-1.14.0-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:4dd237b94494b14693edbebd05a0068fa02ae36b2629e6353be42bc2b491c1f0", size = 2404995, upload-time = "2026-04-06T15:51:54.664Z" }, + { url = "https://files.pythonhosted.org/packages/94/d6/d73cfcce4d3863d9839174a42fd976d84bb7781c132bf0bccdc74e83d9a7/highspy-1.14.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:526c54ff6e0384abcc495cda08fb2f98156544031afb9ac8bb02f20968d743ba", size = 2629054, upload-time = "2026-04-06T15:51:56.157Z" }, + { url = "https://files.pythonhosted.org/packages/71/c2/5ec46d5381815b849f25f4327af187d70325aa693bccdad960228c98ebb1/highspy-1.14.0-cp311-cp311-manylinux_2_26_i686.manylinux_2_28_i686.whl", hash = "sha256:60265c69fb9d199526b8190f3bd42bc241896a335c299570d1dfc9d6e97f8895", size = 2787400, upload-time = "2026-04-06T15:51:57.682Z" }, + { url = "https://files.pythonhosted.org/packages/25/f0/1e89d849701388886d39ffda26f64bdc835d63f26aa4b5d865067d56fae1/highspy-1.14.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:895b3450ece50c1cdcd348aa5c6efd6ee823598f3a21c23677d57a06e3c6f28a", size = 3465733, upload-time = "2026-04-06T15:51:59.9Z" }, + { url = "https://files.pythonhosted.org/packages/fd/31/b3477f7ca17526167e5eff9d194ba8ea3eca0f04e98a248b26cc6612aa8f/highspy-1.14.0-cp311-cp311-musllinux_1_2_i686.whl", hash = "sha256:b50202f02c7db95163b0ad92160167541dad263b19327164dcf0f827cf9f69b3", size = 4044165, upload-time = "2026-04-06T15:52:01.482Z" }, + { url = "https://files.pythonhosted.org/packages/5d/85/67bead3e385de8f433572809b3cab6d7694c1a606edac2fbf94309b746d4/highspy-1.14.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:e12591c1d495ebb5f6d8dc50e99ed48f338dfb59741929cd68c2c2b40d97a58f", size = 3707245, upload-time = "2026-04-06T15:52:03.05Z" }, + { url = "https://files.pythonhosted.org/packages/08/1c/0f61f66855a39e22f1f8e7d1ab3632b2a27cc1f42ee628fd9c317a5b4616/highspy-1.14.0-cp311-cp311-win32.whl", hash = "sha256:9ba82456280ef72cde8e45ecf6bdb2a244c56d80f7e44bb2e5ef7a9abf21f4d9", size = 1953948, upload-time = "2026-04-06T15:52:04.549Z" }, + { url = "https://files.pythonhosted.org/packages/f8/bd/4eaa775022d55519101a51a2b1ad5c46cfa8c725a400033943a79138001f/highspy-1.14.0-cp311-cp311-win_amd64.whl", hash = "sha256:e726092e35237dccdd8093f8c91be195a5826e48aab349d6e7856e32c0e87b41", size = 2320210, upload-time = "2026-04-06T15:52:06.417Z" }, + { url = "https://files.pythonhosted.org/packages/c6/3d/83ee11de10ff6499efdb6edbb4586b472a4e9f982c0f6c5d3faa670bf1c3/highspy-1.14.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:2a8d0339887d65f5ef20c59be529af33115197d47f775ee21fca911a87d30f92", size = 2311831, upload-time = "2026-04-06T15:52:08.042Z" }, + { url = "https://files.pythonhosted.org/packages/b6/74/51cfca0c382886e302c4e6b9a50f9b160d214a88b4bc5937f5f8e2452dd9/highspy-1.14.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:2f61406a287b19680ece9a8c0bc3926e31d26ad2b4df8ba49f22666ce762fb7d", size = 2122237, upload-time = "2026-04-06T15:52:09.587Z" }, + { url = "https://files.pythonhosted.org/packages/78/69/a60f9dc033712f564089700441fb08c7f89db32bccab7926ef95db6b2306/highspy-1.14.0-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:74ee022cc8cc0e3a576f9b2309287974ea80db68adf8c9f1c5698dc725ec0497", size = 2409165, upload-time = 
"2026-04-06T15:52:11.067Z" }, + { url = "https://files.pythonhosted.org/packages/cd/c5/efa6d74704aa0bc5ffce9975553f6d13f4527e6fad8f79e7cacadfedd3d3/highspy-1.14.0-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:eb7c564ee426355671edf6ad17b2ece5aaa74f21b326ed5e7a8ba5bdc880f207", size = 2632243, upload-time = "2026-04-06T15:52:12.529Z" }, + { url = "https://files.pythonhosted.org/packages/f2/1f/a701fee9ca318e6d175d719f8916d090dd7c8100c28bc591adac9fc2db35/highspy-1.14.0-cp312-cp312-manylinux_2_26_i686.manylinux_2_28_i686.whl", hash = "sha256:44d80756dde11941336933d777ae85b913c303fe40c3b7eb55fae0fe62bbea08", size = 2792226, upload-time = "2026-04-06T15:52:14.065Z" }, + { url = "https://files.pythonhosted.org/packages/cc/a4/b1db0018292e46d75d7aa7889f220d0fa0f2243a628e57790de80f1fc22f/highspy-1.14.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:6ba00a35b6c4b96eb2d20d75238c6046ecf82ac85a603706e5f588d28c3b3abf", size = 3465843, upload-time = "2026-04-06T15:52:15.692Z" }, + { url = "https://files.pythonhosted.org/packages/6c/41/64c4b290a5237e14fbc7e4812d110aecf457273bbade8b27ca9b9ea86c5b/highspy-1.14.0-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:e991b5ac8af64d686f2a56fb59391b4bb81a50d647fe1e31a1b0ff34d9d2bb51", size = 4042641, upload-time = "2026-04-06T15:52:17.573Z" }, + { url = "https://files.pythonhosted.org/packages/30/3a/cff37994f2fd313467749bc9939e5baab4aef0210c79547471d6bbda5e81/highspy-1.14.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:c10554bd0a37d4be126f24cbb7f5aca5a4252dd6e572b51b2433ad6d94ce7a9d", size = 3712946, upload-time = "2026-04-06T15:52:19.466Z" }, + { url = "https://files.pythonhosted.org/packages/6d/60/f328af00a9f05838e766aa7808dc733bb74a4fa79adcf5fdd665cfb8d7ed/highspy-1.14.0-cp312-cp312-win32.whl", hash = "sha256:7290540f0352192e43bdc790a59a82cee1f8029bd8d6b9ca20b54b651256bab4", size = 1955124, upload-time = "2026-04-06T15:52:20.965Z" }, + { url = "https://files.pythonhosted.org/packages/69/ea/0b47c49b6df4474c603b6a232278d5d6e6afddcb9da5044e06dfec579222/highspy-1.14.0-cp312-cp312-win_amd64.whl", hash = "sha256:c0568d0fb514dc82776c3be1041988fe428c9df2be0ce98c1bff6382c0d2a5ba", size = 2323321, upload-time = "2026-04-06T15:52:22.418Z" }, + { url = "https://files.pythonhosted.org/packages/2d/2e/43b5d51852a8b06a079cc324ee877a91d2e87128ecd99905b45840b0fd3e/highspy-1.14.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:777ce930bc29a984826bc4e65ca7fec0407284a30877bc0d179dacc6bb7ee972", size = 2311865, upload-time = "2026-04-06T15:52:24.112Z" }, + { url = "https://files.pythonhosted.org/packages/1d/49/7e1a308163954faa3e91cbc0f73282a24d99b9cb6e7314c6f4266fee88d5/highspy-1.14.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:479be627bbdf7646dccf5621059b9416e6480c3ffa16998e6cc4de89c40a3716", size = 2122334, upload-time = "2026-04-06T15:52:25.938Z" }, + { url = "https://files.pythonhosted.org/packages/87/8c/8ae1a6f3f645deeaaf5522ebd47c61f7d856e8883dd5e7f51318e00ca287/highspy-1.14.0-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:863a9363b624d0ef0a5a5bfcbe437bcd3891676c8b7c4d801462320b1387ef37", size = 2409256, upload-time = "2026-04-06T15:52:27.306Z" }, + { url = "https://files.pythonhosted.org/packages/7c/40/501586f760677501f3b2f19f0b7236515d06c08db29ccaae838a6f20c05d/highspy-1.14.0-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:ba290153919da6368e9144189060a496077a738522555e89403504d379e8d63f", size = 2632208, upload-time = "2026-04-06T15:52:29.122Z" }, + { url = 
"https://files.pythonhosted.org/packages/ad/0c/a4137e3fd564fe2615093771463d66fa74cc4a2a94e6b68c15a0f8572218/highspy-1.14.0-cp313-cp313-manylinux_2_26_i686.manylinux_2_28_i686.whl", hash = "sha256:bdc317bc2572a591cf9f0f2415d4b0879f9f0a47bdff3cda44f85a943941e64b", size = 2792091, upload-time = "2026-04-06T15:52:30.586Z" }, + { url = "https://files.pythonhosted.org/packages/af/91/aa3d6758212f4bb14c4c424053932a030ead226f8138da396a0bb6216d66/highspy-1.14.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:02a987e0020bf28757df9efbdfd6abde6d501b64294f8201301958095dae7979", size = 3466025, upload-time = "2026-04-06T15:52:32.209Z" }, + { url = "https://files.pythonhosted.org/packages/64/10/25cd0fe6b0dfbae39bee2b7a6dfb91e29b1c78a338e537eb5fc8475ea03d/highspy-1.14.0-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:03c9b39a9ca8fe973eb89093ac386a8606ac37beb0e637b1a9255e31bfe87ef3", size = 4043109, upload-time = "2026-04-06T15:52:34.203Z" }, + { url = "https://files.pythonhosted.org/packages/8c/27/4dbcba90d2c8b2ddcd3157d62f5531c2bfa75a3e360b8f05ac52996b489b/highspy-1.14.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:a92b50e7508b90c6d65b315b7dd3b37c6d0db331e698052f7054da523e446eb4", size = 3712794, upload-time = "2026-04-06T15:52:36.163Z" }, + { url = "https://files.pythonhosted.org/packages/de/5a/72d9840b26b0b26916bd7624d606f31a0ce83f14709fcfaf52df31e29898/highspy-1.14.0-cp313-cp313-win32.whl", hash = "sha256:e8eb72be8766717ec856cf5fa3fcadd8ddd59bee0f4d71d4f45ae0a2b861795f", size = 1955044, upload-time = "2026-04-06T15:52:37.532Z" }, + { url = "https://files.pythonhosted.org/packages/22/7e/89f07a03ff9f5460043ba4815583fd3fda22aa6d088be3fb730346ccae63/highspy-1.14.0-cp313-cp313-win_amd64.whl", hash = "sha256:3d15faaab62b408320373540bfd5a7b9a48e9a01072b847f2270b8d5d881647c", size = 2323514, upload-time = "2026-04-06T15:52:38.939Z" }, ] [[package]] @@ -1061,7 +1070,7 @@ wheels = [ [[package]] name = "huggingface-hub" -version = "1.8.0" +version = "1.12.0" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "filelock" }, @@ -1074,27 +1083,27 @@ dependencies = [ { name = "typer" }, { name = "typing-extensions" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/8e/2a/a847fd02261cd051da218baf99f90ee7c7040c109a01833db4f838f25256/huggingface_hub-1.8.0.tar.gz", hash = "sha256:c5627b2fd521e00caf8eff4ac965ba988ea75167fad7ee72e17f9b7183ec63f3", size = 735839, upload-time = "2026-03-25T16:01:28.152Z" } +sdist = { url = "https://files.pythonhosted.org/packages/56/52/1b54cb569509c725a32c1315261ac9fd0e6b91bbbf74d86fca10d3376164/huggingface_hub-1.12.0.tar.gz", hash = "sha256:7c3fe85e24b652334e5d456d7a812cd9a071e75630fac4365d9165ab5e4a34b6", size = 763091, upload-time = "2026-04-24T13:32:08.674Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/a9/ae/8a3a16ea4d202cb641b51d2681bdd3d482c1c592d7570b3fa264730829ce/huggingface_hub-1.8.0-py3-none-any.whl", hash = "sha256:d3eb5047bd4e33c987429de6020d4810d38a5bef95b3b40df9b17346b7f353f2", size = 625208, upload-time = "2026-03-25T16:01:26.603Z" }, + { url = "https://files.pythonhosted.org/packages/7e/2b/ef03ddb96bd1123503c2bd6932001020292deea649e9bf4caa2cb65a85bf/huggingface_hub-1.12.0-py3-none-any.whl", hash = "sha256:d74939969585ee35748bd66de09baf84099d461bda7287cd9043bfb99b0e424d", size = 646806, upload-time = "2026-04-24T13:32:06.717Z" }, ] [[package]] name = "identify" -version = "2.6.18" +version = "2.6.19" source = { registry = "https://pypi.org/simple" } -sdist = { url = 
"https://files.pythonhosted.org/packages/46/c4/7fb4db12296cdb11893d61c92048fe617ee853f8523b9b296ac03b43757e/identify-2.6.18.tar.gz", hash = "sha256:873ac56a5e3fd63e7438a7ecbc4d91aca692eb3fefa4534db2b7913f3fc352fd", size = 99580, upload-time = "2026-03-15T18:39:50.319Z" } +sdist = { url = "https://files.pythonhosted.org/packages/52/63/51723b5f116cc04b061cb6f5a561790abf249d25931d515cd375e063e0f4/identify-2.6.19.tar.gz", hash = "sha256:6be5020c38fcb07da56c53733538a3081ea5aa70d36a156f83044bfbf9173842", size = 99567, upload-time = "2026-04-17T18:39:50.265Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/46/33/92ef41c6fad0233e41d3d84ba8e8ad18d1780f1e5d99b3c683e6d7f98b63/identify-2.6.18-py2.py3-none-any.whl", hash = "sha256:8db9d3c8ea9079db92cafb0ebf97abdc09d52e97f4dcf773a2e694048b7cd737", size = 99394, upload-time = "2026-03-15T18:39:48.915Z" }, + { url = "https://files.pythonhosted.org/packages/94/84/d9273cd09688070a6523c4aee4663a8538721b2b755c4962aafae0011e72/identify-2.6.19-py2.py3-none-any.whl", hash = "sha256:20e6a87f786f768c092a721ad107fc9df0eb89347be9396cadf3f4abbd1fb78a", size = 99397, upload-time = "2026-04-17T18:39:49.221Z" }, ] [[package]] name = "idna" -version = "3.11" +version = "3.13" source = { registry = "https://pypi.org/simple" } -sdist = { url = "https://files.pythonhosted.org/packages/6f/6d/0703ccc57f3a7233505399edb88de3cbd678da106337b9fcde432b65ed60/idna-3.11.tar.gz", hash = "sha256:795dafcc9c04ed0c1fb032c2aa73654d8e8c5023a7df64a53f39190ada629902", size = 194582, upload-time = "2025-10-12T14:55:20.501Z" } +sdist = { url = "https://files.pythonhosted.org/packages/ce/cc/762dfb036166873f0059f3b7de4565e1b5bc3d6f28a414c13da27e442f99/idna-3.13.tar.gz", hash = "sha256:585ea8fe5d69b9181ec1afba340451fba6ba764af97026f92a91d4eef164a242", size = 194210, upload-time = "2026-04-22T16:42:42.314Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/0e/61/66938bbb5fc52dbdf84594873d5b51fb1f7c7794e9c0f5bd885f30bc507b/idna-3.11-py3-none-any.whl", hash = "sha256:771a87f49d9defaf64091e6e6fe9c18d4833f140bd19464795bc32d966ca37ea", size = 71008, upload-time = "2025-10-12T14:55:18.883Z" }, + { url = "https://files.pythonhosted.org/packages/5d/13/ad7d7ca3808a898b4612b6fe93cde56b53f3034dcde235acb1f0e1df24c6/idna-3.13-py3-none-any.whl", hash = "sha256:892ea0cde124a99ce773decba204c5552b69c3c67ffd5f232eb7696135bc8bb3", size = 68629, upload-time = "2026-04-22T16:42:40.909Z" }, ] [[package]] @@ -1135,8 +1144,7 @@ dependencies = [ { name = "appnope", marker = "sys_platform == 'darwin'" }, { name = "comm" }, { name = "debugpy" }, - { name = "ipython", version = "9.10.1", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.12'" }, - { name = "ipython", version = "9.12.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.12'" }, + { name = "ipython" }, { name = "jupyter-client" }, { name = "jupyter-core" }, { name = "matplotlib-inline" }, @@ -1154,55 +1162,25 @@ wheels = [ [[package]] name = "ipython" -version = "9.10.1" +version = "9.13.0" source = { registry = "https://pypi.org/simple" } -resolution-markers = [ - "python_full_version < '3.12' and sys_platform == 'win32'", - "python_full_version < '3.12' and sys_platform == 'emscripten'", - "python_full_version < '3.12' and sys_platform != 'emscripten' and sys_platform != 'win32'", -] dependencies = [ - { name = "colorama", marker = "python_full_version < '3.12' and sys_platform == 'win32'" }, - { name = "decorator", marker = "python_full_version < '3.12'" }, 
- { name = "ipython-pygments-lexers", marker = "python_full_version < '3.12'" }, - { name = "jedi", marker = "python_full_version < '3.12'" }, - { name = "matplotlib-inline", marker = "python_full_version < '3.12'" }, - { name = "pexpect", marker = "python_full_version < '3.12' and sys_platform != 'emscripten' and sys_platform != 'win32'" }, - { name = "prompt-toolkit", marker = "python_full_version < '3.12'" }, - { name = "pygments", marker = "python_full_version < '3.12'" }, - { name = "stack-data", marker = "python_full_version < '3.12'" }, - { name = "traitlets", marker = "python_full_version < '3.12'" }, + { name = "colorama", marker = "sys_platform == 'win32'" }, + { name = "decorator" }, + { name = "ipython-pygments-lexers" }, + { name = "jedi" }, + { name = "matplotlib-inline" }, + { name = "pexpect", marker = "sys_platform != 'emscripten' and sys_platform != 'win32'" }, + { name = "prompt-toolkit" }, + { name = "psutil" }, + { name = "pygments" }, + { name = "stack-data" }, + { name = "traitlets" }, { name = "typing-extensions", marker = "python_full_version < '3.12'" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/c5/25/daae0e764047b0a2480c7bbb25d48f4f509b5818636562eeac145d06dfee/ipython-9.10.1.tar.gz", hash = "sha256:e170e9b2a44312484415bdb750492699bf329233b03f2557a9692cce6466ada4", size = 4426663, upload-time = "2026-03-27T09:53:26.244Z" } +sdist = { url = "https://files.pythonhosted.org/packages/cd/c4/87cda5842cf5c31837c06ddb588e11c3c35d8ece89b7a0108c06b8c9b00a/ipython-9.13.0.tar.gz", hash = "sha256:7e834b6afc99f020e3f05966ced34792f40267d64cb1ea9043886dab0dde5967", size = 4430549, upload-time = "2026-04-24T12:24:55.221Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/01/09/ba70f8d662d5671687da55ad2cc0064cf795b15e1eea70907532202e7c97/ipython-9.10.1-py3-none-any.whl", hash = "sha256:82d18ae9fb9164ded080c71ef92a182ee35ee7db2395f67616034bebb020a232", size = 622827, upload-time = "2026-03-27T09:53:24.566Z" }, -] - -[[package]] -name = "ipython" -version = "9.12.0" -source = { registry = "https://pypi.org/simple" } -resolution-markers = [ - "python_full_version >= '3.12' and sys_platform == 'win32'", - "python_full_version >= '3.12' and sys_platform == 'emscripten'", - "python_full_version >= '3.12' and sys_platform != 'emscripten' and sys_platform != 'win32'", -] -dependencies = [ - { name = "colorama", marker = "python_full_version >= '3.12' and sys_platform == 'win32'" }, - { name = "decorator", marker = "python_full_version >= '3.12'" }, - { name = "ipython-pygments-lexers", marker = "python_full_version >= '3.12'" }, - { name = "jedi", marker = "python_full_version >= '3.12'" }, - { name = "matplotlib-inline", marker = "python_full_version >= '3.12'" }, - { name = "pexpect", marker = "python_full_version >= '3.12' and sys_platform != 'emscripten' and sys_platform != 'win32'" }, - { name = "prompt-toolkit", marker = "python_full_version >= '3.12'" }, - { name = "pygments", marker = "python_full_version >= '3.12'" }, - { name = "stack-data", marker = "python_full_version >= '3.12'" }, - { name = "traitlets", marker = "python_full_version >= '3.12'" }, -] -sdist = { url = "https://files.pythonhosted.org/packages/3a/73/7114f80a8f9cabdb13c27732dce24af945b2923dcab80723602f7c8bc2d8/ipython-9.12.0.tar.gz", hash = "sha256:01daa83f504b693ba523b5a407246cabde4eb4513285a3c6acaff11a66735ee4", size = 4428879, upload-time = "2026-03-27T09:42:45.312Z" } -wheels = [ - { url = 
"https://files.pythonhosted.org/packages/59/22/906c8108974c673ebef6356c506cebb6870d48cedea3c41e949e2dd556bb/ipython-9.12.0-py3-none-any.whl", hash = "sha256:0f2701e8ee86e117e37f50563205d36feaa259d2e08d4a6bc6b6d74b18ce128d", size = 625661, upload-time = "2026-03-27T09:42:42.831Z" }, + { url = "https://files.pythonhosted.org/packages/b9/86/3060e8029b7cc505cce9a0137431dda81d0a3fde93a8f0f50ee0bf37a795/ipython-9.13.0-py3-none-any.whl", hash = "sha256:57f9d4639e20818d328d287c7b549af3d05f12486ea8f2e7f73e52a36ec4d201", size = 627274, upload-time = "2026-04-24T12:24:53.038Z" }, ] [[package]] @@ -1428,14 +1406,23 @@ wheels = [ [[package]] name = "mako" -version = "1.3.10" +version = "1.3.12" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "markupsafe" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/9e/38/bd5b78a920a64d708fe6bc8e0a2c075e1389d53bef8413725c63ba041535/mako-1.3.10.tar.gz", hash = "sha256:99579a6f39583fa7e5630a28c3c1f440e4e97a414b80372649c0ce338da2ea28", size = 392474, upload-time = "2025-04-10T12:44:31.16Z" } +sdist = { url = "https://files.pythonhosted.org/packages/00/62/791b31e69ae182791ec67f04850f2f062716bbd205483d63a215f3e062d3/mako-1.3.12.tar.gz", hash = "sha256:9f778e93289bd410bb35daadeb4fc66d95a746f0b75777b942088b7fd7af550a", size = 400219, upload-time = "2026-04-28T19:01:08.512Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/87/fb/99f81ac72ae23375f22b7afdb7642aba97c00a713c217124420147681a2f/mako-1.3.10-py3-none-any.whl", hash = "sha256:baef24a52fc4fc514a0887ac600f9f1cff3d82c61d4d700a1fa84d597b88db59", size = 78509, upload-time = "2025-04-10T12:50:53.297Z" }, + { url = "https://files.pythonhosted.org/packages/bc/b1/a0ec7a5a9db730a08daef1fdfb8090435b82465abbf758a596f0ea88727e/mako-1.3.12-py3-none-any.whl", hash = "sha256:8f61569480282dbf557145ce441e4ba888be453c30989f879f0d652e39f53ea9", size = 78521, upload-time = "2026-04-28T19:01:10.393Z" }, +] + +[[package]] +name = "markdown" +version = "3.10.2" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/2b/f4/69fa6ed85ae003c2378ffa8f6d2e3234662abd02c10d216c0ba96081a238/markdown-3.10.2.tar.gz", hash = "sha256:994d51325d25ad8aa7ce4ebaec003febcce822c3f8c911e3b17c52f7f589f950", size = 368805, upload-time = "2026-02-09T14:57:26.942Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/de/1f/77fa3081e4f66ca3576c896ae5d31c3002ac6607f9747d2e3aa49227e464/markdown-3.10.2-py3-none-any.whl", hash = "sha256:e91464b71ae3ee7afd3017d9f358ef0baf158fd9a298db92f1d4761133824c36", size = 108180, upload-time = "2026-02-09T14:57:25.787Z" }, ] [[package]] @@ -1504,7 +1491,7 @@ wheels = [ [[package]] name = "matplotlib" -version = "3.10.8" +version = "3.10.9" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "contourpy" }, @@ -1517,39 +1504,39 @@ dependencies = [ { name = "pyparsing" }, { name = "python-dateutil" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/8a/76/d3c6e3a13fe484ebe7718d14e269c9569c4eb0020a968a327acb3b9a8fe6/matplotlib-3.10.8.tar.gz", hash = "sha256:2299372c19d56bcd35cf05a2738308758d32b9eaed2371898d8f5bd33f084aa3", size = 34806269, upload-time = "2025-12-10T22:56:51.155Z" } -wheels = [ - { url = "https://files.pythonhosted.org/packages/f8/86/de7e3a1cdcfc941483af70609edc06b83e7c8a0e0dc9ac325200a3f4d220/matplotlib-3.10.8-cp311-cp311-macosx_10_12_x86_64.whl", hash = "sha256:6be43b667360fef5c754dda5d25a32e6307a03c204f3c0fc5468b78fa87b4160", size = 8251215, upload-time = 
"2025-12-10T22:55:16.175Z" }, - { url = "https://files.pythonhosted.org/packages/fd/14/baad3222f424b19ce6ad243c71de1ad9ec6b2e4eb1e458a48fdc6d120401/matplotlib-3.10.8-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:a2b336e2d91a3d7006864e0990c83b216fcdca64b5a6484912902cef87313d78", size = 8139625, upload-time = "2025-12-10T22:55:17.712Z" }, - { url = "https://files.pythonhosted.org/packages/8f/a0/7024215e95d456de5883e6732e708d8187d9753a21d32f8ddb3befc0c445/matplotlib-3.10.8-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:efb30e3baaea72ce5928e32bab719ab4770099079d66726a62b11b1ef7273be4", size = 8712614, upload-time = "2025-12-10T22:55:20.8Z" }, - { url = "https://files.pythonhosted.org/packages/5a/f4/b8347351da9a5b3f41e26cf547252d861f685c6867d179a7c9d60ad50189/matplotlib-3.10.8-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:d56a1efd5bfd61486c8bc968fa18734464556f0fb8e51690f4ac25d85cbbbbc2", size = 9540997, upload-time = "2025-12-10T22:55:23.258Z" }, - { url = "https://files.pythonhosted.org/packages/9e/c0/c7b914e297efe0bc36917bf216b2acb91044b91e930e878ae12981e461e5/matplotlib-3.10.8-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:238b7ce5717600615c895050239ec955d91f321c209dd110db988500558e70d6", size = 9596825, upload-time = "2025-12-10T22:55:25.217Z" }, - { url = "https://files.pythonhosted.org/packages/6f/d3/a4bbc01c237ab710a1f22b4da72f4ff6d77eb4c7735ea9811a94ae239067/matplotlib-3.10.8-cp311-cp311-win_amd64.whl", hash = "sha256:18821ace09c763ec93aef5eeff087ee493a24051936d7b9ebcad9662f66501f9", size = 8135090, upload-time = "2025-12-10T22:55:27.162Z" }, - { url = "https://files.pythonhosted.org/packages/89/dd/a0b6588f102beab33ca6f5218b31725216577b2a24172f327eaf6417d5c9/matplotlib-3.10.8-cp311-cp311-win_arm64.whl", hash = "sha256:bab485bcf8b1c7d2060b4fcb6fc368a9e6f4cd754c9c2fea281f4be21df394a2", size = 8012377, upload-time = "2025-12-10T22:55:29.185Z" }, - { url = "https://files.pythonhosted.org/packages/9e/67/f997cdcbb514012eb0d10cd2b4b332667997fb5ebe26b8d41d04962fa0e6/matplotlib-3.10.8-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:64fcc24778ca0404ce0cb7b6b77ae1f4c7231cdd60e6778f999ee05cbd581b9a", size = 8260453, upload-time = "2025-12-10T22:55:30.709Z" }, - { url = "https://files.pythonhosted.org/packages/7e/65/07d5f5c7f7c994f12c768708bd2e17a4f01a2b0f44a1c9eccad872433e2e/matplotlib-3.10.8-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:b9a5ca4ac220a0cdd1ba6bcba3608547117d30468fefce49bb26f55c1a3d5c58", size = 8148321, upload-time = "2025-12-10T22:55:33.265Z" }, - { url = "https://files.pythonhosted.org/packages/3e/f3/c5195b1ae57ef85339fd7285dfb603b22c8b4e79114bae5f4f0fcf688677/matplotlib-3.10.8-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:3ab4aabc72de4ff77b3ec33a6d78a68227bf1123465887f9905ba79184a1cc04", size = 8716944, upload-time = "2025-12-10T22:55:34.922Z" }, - { url = "https://files.pythonhosted.org/packages/00/f9/7638f5cc82ec8a7aa005de48622eecc3ed7c9854b96ba15bd76b7fd27574/matplotlib-3.10.8-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:24d50994d8c5816ddc35411e50a86ab05f575e2530c02752e02538122613371f", size = 9550099, upload-time = "2025-12-10T22:55:36.789Z" }, - { url = "https://files.pythonhosted.org/packages/57/61/78cd5920d35b29fd2a0fe894de8adf672ff52939d2e9b43cb83cd5ce1bc7/matplotlib-3.10.8-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:99eefd13c0dc3b3c1b4d561c1169e65fe47aab7b8158754d7c084088e2329466", size = 9613040, upload-time = 
"2025-12-10T22:55:38.715Z" }, - { url = "https://files.pythonhosted.org/packages/30/4e/c10f171b6e2f44d9e3a2b96efa38b1677439d79c99357600a62cc1e9594e/matplotlib-3.10.8-cp312-cp312-win_amd64.whl", hash = "sha256:dd80ecb295460a5d9d260df63c43f4afbdd832d725a531f008dad1664f458adf", size = 8142717, upload-time = "2025-12-10T22:55:41.103Z" }, - { url = "https://files.pythonhosted.org/packages/f1/76/934db220026b5fef85f45d51a738b91dea7d70207581063cd9bd8fafcf74/matplotlib-3.10.8-cp312-cp312-win_arm64.whl", hash = "sha256:3c624e43ed56313651bc18a47f838b60d7b8032ed348911c54906b130b20071b", size = 8012751, upload-time = "2025-12-10T22:55:42.684Z" }, - { url = "https://files.pythonhosted.org/packages/3d/b9/15fd5541ef4f5b9a17eefd379356cf12175fe577424e7b1d80676516031a/matplotlib-3.10.8-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:3f2e409836d7f5ac2f1c013110a4d50b9f7edc26328c108915f9075d7d7a91b6", size = 8261076, upload-time = "2025-12-10T22:55:44.648Z" }, - { url = "https://files.pythonhosted.org/packages/8d/a0/2ba3473c1b66b9c74dc7107c67e9008cb1782edbe896d4c899d39ae9cf78/matplotlib-3.10.8-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:56271f3dac49a88d7fca5060f004d9d22b865f743a12a23b1e937a0be4818ee1", size = 8148794, upload-time = "2025-12-10T22:55:46.252Z" }, - { url = "https://files.pythonhosted.org/packages/75/97/a471f1c3eb1fd6f6c24a31a5858f443891d5127e63a7788678d14e249aea/matplotlib-3.10.8-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:a0a7f52498f72f13d4a25ea70f35f4cb60642b466cbb0a9be951b5bc3f45a486", size = 8718474, upload-time = "2025-12-10T22:55:47.864Z" }, - { url = "https://files.pythonhosted.org/packages/01/be/cd478f4b66f48256f42927d0acbcd63a26a893136456cd079c0cc24fbabf/matplotlib-3.10.8-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:646d95230efb9ca614a7a594d4fcacde0ac61d25e37dd51710b36477594963ce", size = 9549637, upload-time = "2025-12-10T22:55:50.048Z" }, - { url = "https://files.pythonhosted.org/packages/5d/7c/8dc289776eae5109e268c4fb92baf870678dc048a25d4ac903683b86d5bf/matplotlib-3.10.8-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:f89c151aab2e2e23cb3fe0acad1e8b82841fd265379c4cecd0f3fcb34c15e0f6", size = 9613678, upload-time = "2025-12-10T22:55:52.21Z" }, - { url = "https://files.pythonhosted.org/packages/64/40/37612487cc8a437d4dd261b32ca21fe2d79510fe74af74e1f42becb1bdb8/matplotlib-3.10.8-cp313-cp313-win_amd64.whl", hash = "sha256:e8ea3e2d4066083e264e75c829078f9e149fa119d27e19acd503de65e0b13149", size = 8142686, upload-time = "2025-12-10T22:55:54.253Z" }, - { url = "https://files.pythonhosted.org/packages/66/52/8d8a8730e968185514680c2a6625943f70269509c3dcfc0dcf7d75928cb8/matplotlib-3.10.8-cp313-cp313-win_arm64.whl", hash = "sha256:c108a1d6fa78a50646029cb6d49808ff0fc1330fda87fa6f6250c6b5369b6645", size = 8012917, upload-time = "2025-12-10T22:55:56.268Z" }, - { url = "https://files.pythonhosted.org/packages/b5/27/51fe26e1062f298af5ef66343d8ef460e090a27fea73036c76c35821df04/matplotlib-3.10.8-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:ad3d9833a64cf48cc4300f2b406c3d0f4f4724a91c0bd5640678a6ba7c102077", size = 8305679, upload-time = "2025-12-10T22:55:57.856Z" }, - { url = "https://files.pythonhosted.org/packages/2c/1e/4de865bc591ac8e3062e835f42dd7fe7a93168d519557837f0e37513f629/matplotlib-3.10.8-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:eb3823f11823deade26ce3b9f40dcb4a213da7a670013929f31d5f5ed1055b22", size = 8198336, upload-time = "2025-12-10T22:55:59.371Z" }, - { url = 
"https://files.pythonhosted.org/packages/c6/cb/2f7b6e75fb4dce87ef91f60cac4f6e34f4c145ab036a22318ec837971300/matplotlib-3.10.8-cp313-cp313t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:d9050fee89a89ed57b4fb2c1bfac9a3d0c57a0d55aed95949eedbc42070fea39", size = 8731653, upload-time = "2025-12-10T22:56:01.032Z" }, - { url = "https://files.pythonhosted.org/packages/46/b3/bd9c57d6ba670a37ab31fb87ec3e8691b947134b201f881665b28cc039ff/matplotlib-3.10.8-cp313-cp313t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:b44d07310e404ba95f8c25aa5536f154c0a8ec473303535949e52eb71d0a1565", size = 9561356, upload-time = "2025-12-10T22:56:02.95Z" }, - { url = "https://files.pythonhosted.org/packages/c0/3d/8b94a481456dfc9dfe6e39e93b5ab376e50998cddfd23f4ae3b431708f16/matplotlib-3.10.8-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:0a33deb84c15ede243aead39f77e990469fff93ad1521163305095b77b72ce4a", size = 9614000, upload-time = "2025-12-10T22:56:05.411Z" }, - { url = "https://files.pythonhosted.org/packages/bd/cd/bc06149fe5585ba800b189a6a654a75f1f127e8aab02fd2be10df7fa500c/matplotlib-3.10.8-cp313-cp313t-win_amd64.whl", hash = "sha256:3a48a78d2786784cc2413e57397981fb45c79e968d99656706018d6e62e57958", size = 8220043, upload-time = "2025-12-10T22:56:07.551Z" }, - { url = "https://files.pythonhosted.org/packages/e3/de/b22cf255abec916562cc04eef457c13e58a1990048de0c0c3604d082355e/matplotlib-3.10.8-cp313-cp313t-win_arm64.whl", hash = "sha256:15d30132718972c2c074cd14638c7f4592bd98719e2308bccea40e0538bc0cb5", size = 8062075, upload-time = "2025-12-10T22:56:09.178Z" }, - { url = "https://files.pythonhosted.org/packages/04/30/3afaa31c757f34b7725ab9d2ba8b48b5e89c2019c003e7d0ead143aabc5a/matplotlib-3.10.8-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:6da7c2ce169267d0d066adcf63758f0604aa6c3eebf67458930f9d9b79ad1db1", size = 8249198, upload-time = "2025-12-10T22:56:45.584Z" }, - { url = "https://files.pythonhosted.org/packages/48/2f/6334aec331f57485a642a7c8be03cb286f29111ae71c46c38b363230063c/matplotlib-3.10.8-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:9153c3292705be9f9c64498a8872118540c3f4123d1a1c840172edf262c8be4a", size = 8136817, upload-time = "2025-12-10T22:56:47.339Z" }, - { url = "https://files.pythonhosted.org/packages/73/e4/6d6f14b2a759c622f191b2d67e9075a3f56aaccb3be4bb9bb6890030d0a0/matplotlib-3.10.8-pp311-pypy311_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:1ae029229a57cd1e8fe542485f27e7ca7b23aa9e8944ddb4985d0bc444f1eca2", size = 8713867, upload-time = "2025-12-10T22:56:48.954Z" }, +sdist = { url = "https://files.pythonhosted.org/packages/63/1b/4be5be87d43d327a0cf4de1a56e86f7f84c89312452406cf122efe2839e6/matplotlib-3.10.9.tar.gz", hash = "sha256:fd66508e8c6877d98e586654b608a0456db8d7e8a546eb1e2600efd957302358", size = 34811233, upload-time = "2026-04-24T00:14:13.539Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/4c/8c/290f021104741fea63769c31494f5324c0cd249bf536a65a4350767b1f22/matplotlib-3.10.9-cp311-cp311-macosx_10_12_x86_64.whl", hash = "sha256:68cfdcede415f7c8f5577b03303dd94526cdb6d11036cecdc205e08733b2d2bb", size = 8306860, upload-time = "2026-04-24T00:12:01.207Z" }, + { url = "https://files.pythonhosted.org/packages/51/18/325cd32ece1120d1da51cc4e4294c6580190699490183fc2fe8cb6d61ec5/matplotlib-3.10.9-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:dfca0129678bd56379db26c52b5d77ed7de314c047492fbdc763aa7501710cfb", size = 8199254, upload-time = "2026-04-24T00:12:04.239Z" }, + { url = 
"https://files.pythonhosted.org/packages/79/db/e28c1b83e3680740aa78925f5fb2ae4d16207207419ad75ea9fe604f8676/matplotlib-3.10.9-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:8e436d155fa8a3399dc62683f8f5d0e2e50d25d0144a73edd73f82eec8f4abfb", size = 8777092, upload-time = "2026-04-24T00:12:06.793Z" }, + { url = "https://files.pythonhosted.org/packages/55/fa/3ce7adfe9ba101748f465211660d9c6374c876b671bdb8c2bb6d347e8b94/matplotlib-3.10.9-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:56fc0bd271b00025c6edfdc7c2dcd247372c8e1544971d62e1dc7c17367e8bf9", size = 9595691, upload-time = "2026-04-24T00:12:09.706Z" }, + { url = "https://files.pythonhosted.org/packages/36/c4/6960a76686ed668f2c60f84e9799ba4c0d56abdb36b1577b60c1d061d1ec/matplotlib-3.10.9-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:a5a6104ed666402ba5106d7f36e0e0cdca4e8d7fa4d39708ca88019e2835a2eb", size = 9659771, upload-time = "2026-04-24T00:12:12.766Z" }, + { url = "https://files.pythonhosted.org/packages/7e/0d/271aace3342157c64700c9ff4c59c7b392f3dbab393692e8db6fbe7ab96c/matplotlib-3.10.9-cp311-cp311-win_amd64.whl", hash = "sha256:d730e984eddf56974c3e72b6129c7ca462ac38dc624338f4b0b23eb23ecba00f", size = 8205112, upload-time = "2026-04-24T00:12:15.773Z" }, + { url = "https://files.pythonhosted.org/packages/e2/ee/cb57ad4754f3e7b9174ce6ce66d9205fb827067e48a9f58ac09d7e7d6b77/matplotlib-3.10.9-cp311-cp311-win_arm64.whl", hash = "sha256:51bf0ddbdc598e060d46c16b5590708f81a1624cefbaaf62f6a81bf9285b8c80", size = 8132310, upload-time = "2026-04-24T00:12:18.645Z" }, + { url = "https://files.pythonhosted.org/packages/35/c6/5581e26c72233ebb2a2a6fed2d24fb7c66b4700120b813f51b0555acf0b6/matplotlib-3.10.9-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:f0c3c28d9fbcc1fe7a03be236d73430cf6409c41fb2383a7ac52fe932b072cb1", size = 8319908, upload-time = "2026-04-24T00:12:21.323Z" }, + { url = "https://files.pythonhosted.org/packages/b7/18/4880dd762e40cd360c1bf06e890c5a97b997e91cb324602b1a19950ad5ce/matplotlib-3.10.9-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:41cb28c2bd769aa3e98322c6ab09854cbcc52ab69d2759d681bba3e327b2b320", size = 8216016, upload-time = "2026-04-24T00:12:23.4Z" }, + { url = "https://files.pythonhosted.org/packages/32/91/d024616abdba99e83120e07a20658976f6a343646710760c4a51df126029/matplotlib-3.10.9-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:ae20801130378b82d647ff5047c07316295b68dc054ca6b3c13519d0ea624285", size = 8789336, upload-time = "2026-04-24T00:12:26.096Z" }, + { url = "https://files.pythonhosted.org/packages/5c/04/030a2f61ef2158f5e4c259487a92ac877732499fb33d871585d89e03c42d/matplotlib-3.10.9-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:6c63ebcd8b4b169eb2f5c200552ae6b8be8999a005b6b507ed76fb8d7d674fe2", size = 9604602, upload-time = "2026-04-24T00:12:29.052Z" }, + { url = "https://files.pythonhosted.org/packages/fc/c2/541e4d09d87bb6b5830fc28b4c887a9a8cf4e1c6cee698a8c05552ae2003/matplotlib-3.10.9-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:d75d11c949914165976c621b2324f9ef162af7ebf4b057ddf95dd1dba7e5edcf", size = 9670966, upload-time = "2026-04-24T00:12:32.131Z" }, + { url = "https://files.pythonhosted.org/packages/04/a1/4571fc46e7702de8d0c2dc54ad1b2f8e29328dea3ee90831181f7353d93c/matplotlib-3.10.9-cp312-cp312-win_amd64.whl", hash = "sha256:d091f9d758b34aaaaa6331d13574bf01891d903b3dec59bfff458ef7551de5d6", size = 8217462, upload-time = "2026-04-24T00:12:35.226Z" }, + { url = 
"https://files.pythonhosted.org/packages/4b/d0/2269edb12aa30c13c8bcc9382892e39943ce1d28aab4ec296e0381798e81/matplotlib-3.10.9-cp312-cp312-win_arm64.whl", hash = "sha256:10cc5ce06d10231c36f40e875f3c7e8050362a4ee8f0ee5d29a6b3277d57bb42", size = 8136688, upload-time = "2026-04-24T00:12:37.442Z" }, + { url = "https://files.pythonhosted.org/packages/aa/d3/8d4f6afbecb49fc04e060a57c0fce39ea51cc163a6bd87303ccd698e4fa6/matplotlib-3.10.9-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:b580440f1ff81a0e34122051a3dfabb7e4b7f9e380629929bde0eff9af72165f", size = 8320331, upload-time = "2026-04-24T00:12:39.688Z" }, + { url = "https://files.pythonhosted.org/packages/63/d9/9e14bc7564bf92d5ffa801ae5fac819ce74b925dfb55e3ebde61a3bbad3e/matplotlib-3.10.9-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:b1b745c489cd1a77a0dc1120a05dc87af9798faebc913601feb8c73d89bf2d1e", size = 8216461, upload-time = "2026-04-24T00:12:42.494Z" }, + { url = "https://files.pythonhosted.org/packages/8a/17/4402d0d14ccf1dfc70932600b68097fbbf9c898a4871d2cbbe79c7801a32/matplotlib-3.10.9-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:8f3bcac1ca5ed000a6f4337d47ba67dfddf37ed6a46c15fd7f014997f7bf865f", size = 8790091, upload-time = "2026-04-24T00:12:44.789Z" }, + { url = "https://files.pythonhosted.org/packages/3e/0b/322aeec06dd9b91411f92028b37d447342770a24392aa4813e317064dad5/matplotlib-3.10.9-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:7a8d66a55def891c33147ba3ba9bfcabf0b526a43764c818acbb4525e5ed0838", size = 9605027, upload-time = "2026-04-24T00:12:47.583Z" }, + { url = "https://files.pythonhosted.org/packages/74/88/5f13482f55e7b00bcfc09838b093c2456e1379978d2a146844aae05350ad/matplotlib-3.10.9-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:d843374407c4017a6403b59c6c81606773d136f3259d5b6da3131bc814542cc2", size = 9671269, upload-time = "2026-04-24T00:12:50.878Z" }, + { url = "https://files.pythonhosted.org/packages/c5/e0/0840fd2f93da988ec660b8ad1984abe9f25d2aed22a5e394ff1c68c88307/matplotlib-3.10.9-cp313-cp313-win_amd64.whl", hash = "sha256:f4399f64b3e94cd500195490972ae1ee81170df1636fa15364d157d5bdd7b921", size = 8217588, upload-time = "2026-04-24T00:12:53.784Z" }, + { url = "https://files.pythonhosted.org/packages/47/b9/d706d06dd605c49b9f83a2aed8c13e3e5db70697d7a80b7e3d7915de6b17/matplotlib-3.10.9-cp313-cp313-win_arm64.whl", hash = "sha256:ba7b3b8ef09eab7df0e86e9ae086faa433efbfbdb46afcb3aa16aabf779469a8", size = 8136913, upload-time = "2026-04-24T00:12:56.501Z" }, + { url = "https://files.pythonhosted.org/packages/9b/45/6e32d96978264c8ca8c4b1010adb955a1a49cfaf314e212bbc8908f04a61/matplotlib-3.10.9-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:09218df8a93712bd6ea133e83a153c755448cf7868316c531cffcc43f69d1cc9", size = 8368019, upload-time = "2026-04-24T00:12:58.896Z" }, + { url = "https://files.pythonhosted.org/packages/86/0a/c8e3d3bba245f0f7fc424937f8ff7ef77291a36af3edb97ccd78aa93d84f/matplotlib-3.10.9-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:82368699727bfb7b0182e1aa13082e3c08e092fa1a25d3e1fd92405bff96f6d4", size = 8264645, upload-time = "2026-04-24T00:13:01.406Z" }, + { url = "https://files.pythonhosted.org/packages/3d/aa/5bf5a14fe4fed73a4209a155606f8096ff797aad89c6c35179026571133e/matplotlib-3.10.9-cp313-cp313t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:3225f4e1edcb8c86c884ddf79ebe20ecd0a67d30188f279897554ccd8fded4dc", size = 8802194, upload-time = "2026-04-24T00:13:03.702Z" }, + { url = 
"https://files.pythonhosted.org/packages/dd/5e/b4be852d6bba6fd15893fadf91ff26ae49cb91aac789e95dde9d342e664f/matplotlib-3.10.9-cp313-cp313t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:de2445a0c6690d21b7eb6ce071cebad6d40a2e9bdf10d039074a96ba19797b99", size = 9622684, upload-time = "2026-04-24T00:13:06.647Z" }, + { url = "https://files.pythonhosted.org/packages/4c/3d/ed428c971139112ef730f62770654d609467346d09d4b62617e1afd68a5a/matplotlib-3.10.9-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:b2b9516251cb89ff618d757daec0e2ed1bf21248013844a853d87ef85ab3081d", size = 9680790, upload-time = "2026-04-24T00:13:10.009Z" }, + { url = "https://files.pythonhosted.org/packages/e7/09/052e884aaf2b985c63cb79f715f1d5b6a3eaa7de78f6a52b9dbc077d5b53/matplotlib-3.10.9-cp313-cp313t-win_amd64.whl", hash = "sha256:e9fae004b941b23ff2edcf1567a857ed77bafc8086ffa258190462328434faf8", size = 8287571, upload-time = "2026-04-24T00:13:13.087Z" }, + { url = "https://files.pythonhosted.org/packages/f4/38/ae27288e788c35a4250491422f3db7750366fc8c97d6f36fbdecfc1f5518/matplotlib-3.10.9-cp313-cp313t-win_arm64.whl", hash = "sha256:6b63d9c7c769b88ab81e10dc86e4e0607cf56817b9f9e6cf24b2a5f1693b8e38", size = 8188292, upload-time = "2026-04-24T00:13:15.546Z" }, + { url = "https://files.pythonhosted.org/packages/63/e2/9f66ca6a651a52abfe0d4964ce01439ed34f3f1e119de10ff3a07f403043/matplotlib-3.10.9-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:42fb814efabe95c06c1994d8ab5a8385f43a249e23badd3ba931d4308e5bca20", size = 8304420, upload-time = "2026-04-24T00:14:04.57Z" }, + { url = "https://files.pythonhosted.org/packages/e8/e8/467c03568218792906aa87b5e7bb379b605e056ed0c74fe00c051786d925/matplotlib-3.10.9-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:f76e640a5268850bfda54b5131b1b1941cc685e42c5fa98ed9f2d64038308cba", size = 8197981, upload-time = "2026-04-24T00:14:07.233Z" }, + { url = "https://files.pythonhosted.org/packages/6f/87/afead29192170917537934c6aff4b008c805fff7b1ccea0c79120d96beda/matplotlib-3.10.9-pp311-pypy311_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:3fc0364dfbe1d07f6d15c5ebd0c5bf89e126916e5a8667dd4a7a6e84c36653d4", size = 8774002, upload-time = "2026-04-24T00:14:09.816Z" }, ] [[package]] @@ -1747,8 +1734,7 @@ source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "importlib-metadata" }, { name = "ipykernel" }, - { name = "ipython", version = "9.10.1", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.12'" }, - { name = "ipython", version = "9.12.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.12'" }, + { name = "ipython" }, { name = "jupyter-cache" }, { name = "myst-parser" }, { name = "nbclient" }, @@ -1973,11 +1959,11 @@ wheels = [ [[package]] name = "packaging" -version = "26.0" +version = "26.2" source = { registry = "https://pypi.org/simple" } -sdist = { url = "https://files.pythonhosted.org/packages/65/ee/299d360cdc32edc7d2cf530f3accf79c4fca01e96ffc950d8a52213bd8e4/packaging-26.0.tar.gz", hash = "sha256:00243ae351a257117b6a241061796684b084ed1c516a08c48a3f7e147a9d80b4", size = 143416, upload-time = "2026-01-21T20:50:39.064Z" } +sdist = { url = "https://files.pythonhosted.org/packages/d7/f1/e7a6dd94a8d4a5626c03e4e99c87f241ba9e350cd9e6d75123f992427270/packaging-26.2.tar.gz", hash = "sha256:ff452ff5a3e828ce110190feff1178bb1f2ea2281fa2075aadb987c2fb221661", size = 228134, upload-time = "2026-04-24T20:15:23.917Z" } wheels = [ - { url = 
"https://files.pythonhosted.org/packages/b7/b9/c538f279a4e237a006a2c98387d081e9eb060d203d8ed34467cc0f0b9b53/packaging-26.0-py3-none-any.whl", hash = "sha256:b36f1fef9334a5588b4166f8bcd26a14e521f2b55e6b9de3aaa80d3ff7a37529", size = 74366, upload-time = "2026-01-21T20:50:37.788Z" }, + { url = "https://files.pythonhosted.org/packages/df/b2/87e62e8c3e2f4b32e5fe99e0b86d576da1312593b39f47d8ceef365e95ed/packaging-26.2-py3-none-any.whl", hash = "sha256:5fc45236b9446107ff2415ce77c807cee2862cb6fac22b8a73826d0693b0980e", size = 100195, upload-time = "2026-04-24T20:15:22.081Z" }, ] [[package]] @@ -2047,73 +2033,73 @@ wheels = [ [[package]] name = "pillow" -version = "12.1.1" -source = { registry = "https://pypi.org/simple" } -sdist = { url = "https://files.pythonhosted.org/packages/1f/42/5c74462b4fd957fcd7b13b04fb3205ff8349236ea74c7c375766d6c82288/pillow-12.1.1.tar.gz", hash = "sha256:9ad8fa5937ab05218e2b6a4cff30295ad35afd2f83ac592e68c0d871bb0fdbc4", size = 46980264, upload-time = "2026-02-11T04:23:07.146Z" } -wheels = [ - { url = "https://files.pythonhosted.org/packages/2b/46/5da1ec4a5171ee7bf1a0efa064aba70ba3d6e0788ce3f5acd1375d23c8c0/pillow-12.1.1-cp311-cp311-macosx_10_10_x86_64.whl", hash = "sha256:e879bb6cd5c73848ef3b2b48b8af9ff08c5b71ecda8048b7dd22d8a33f60be32", size = 5304084, upload-time = "2026-02-11T04:20:27.501Z" }, - { url = "https://files.pythonhosted.org/packages/78/93/a29e9bc02d1cf557a834da780ceccd54e02421627200696fcf805ebdc3fb/pillow-12.1.1-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:365b10bb9417dd4498c0e3b128018c4a624dc11c7b97d8cc54effe3b096f4c38", size = 4657866, upload-time = "2026-02-11T04:20:29.827Z" }, - { url = "https://files.pythonhosted.org/packages/13/84/583a4558d492a179d31e4aae32eadce94b9acf49c0337c4ce0b70e0a01f2/pillow-12.1.1-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:d4ce8e329c93845720cd2014659ca67eac35f6433fd3050393d85f3ecef0dad5", size = 6232148, upload-time = "2026-02-11T04:20:31.329Z" }, - { url = "https://files.pythonhosted.org/packages/d5/e2/53c43334bbbb2d3b938978532fbda8e62bb6e0b23a26ce8592f36bcc4987/pillow-12.1.1-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:fc354a04072b765eccf2204f588a7a532c9511e8b9c7f900e1b64e3e33487090", size = 8038007, upload-time = "2026-02-11T04:20:34.225Z" }, - { url = "https://files.pythonhosted.org/packages/b8/a6/3d0e79c8a9d58150dd98e199d7c1c56861027f3829a3a60b3c2784190180/pillow-12.1.1-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:7e7976bf1910a8116b523b9f9f58bf410f3e8aa330cd9a2bb2953f9266ab49af", size = 6345418, upload-time = "2026-02-11T04:20:35.858Z" }, - { url = "https://files.pythonhosted.org/packages/a2/c8/46dfeac5825e600579157eea177be43e2f7ff4a99da9d0d0a49533509ac5/pillow-12.1.1-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:597bd9c8419bc7c6af5604e55847789b69123bbe25d65cc6ad3012b4f3c98d8b", size = 7034590, upload-time = "2026-02-11T04:20:37.91Z" }, - { url = "https://files.pythonhosted.org/packages/af/bf/e6f65d3db8a8bbfeaf9e13cc0417813f6319863a73de934f14b2229ada18/pillow-12.1.1-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:2c1fc0f2ca5f96a3c8407e41cca26a16e46b21060fe6d5b099d2cb01412222f5", size = 6458655, upload-time = "2026-02-11T04:20:39.496Z" }, - { url = "https://files.pythonhosted.org/packages/f9/c2/66091f3f34a25894ca129362e510b956ef26f8fb67a0e6417bc5744e56f1/pillow-12.1.1-cp311-cp311-musllinux_1_2_x86_64.whl", hash = 
"sha256:578510d88c6229d735855e1f278aa305270438d36a05031dfaae5067cc8eb04d", size = 7159286, upload-time = "2026-02-11T04:20:41.139Z" }, - { url = "https://files.pythonhosted.org/packages/7b/5a/24bc8eb526a22f957d0cec6243146744966d40857e3d8deb68f7902ca6c1/pillow-12.1.1-cp311-cp311-win32.whl", hash = "sha256:7311c0a0dcadb89b36b7025dfd8326ecfa36964e29913074d47382706e516a7c", size = 6328663, upload-time = "2026-02-11T04:20:43.184Z" }, - { url = "https://files.pythonhosted.org/packages/31/03/bef822e4f2d8f9d7448c133d0a18185d3cce3e70472774fffefe8b0ed562/pillow-12.1.1-cp311-cp311-win_amd64.whl", hash = "sha256:fbfa2a7c10cc2623f412753cddf391c7f971c52ca40a3f65dc5039b2939e8563", size = 7031448, upload-time = "2026-02-11T04:20:44.696Z" }, - { url = "https://files.pythonhosted.org/packages/49/70/f76296f53610bd17b2e7d31728b8b7825e3ac3b5b3688b51f52eab7c0818/pillow-12.1.1-cp311-cp311-win_arm64.whl", hash = "sha256:b81b5e3511211631b3f672a595e3221252c90af017e399056d0faabb9538aa80", size = 2453651, upload-time = "2026-02-11T04:20:46.243Z" }, - { url = "https://files.pythonhosted.org/packages/07/d3/8df65da0d4df36b094351dce696f2989bec731d4f10e743b1c5f4da4d3bf/pillow-12.1.1-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:ab323b787d6e18b3d91a72fc99b1a2c28651e4358749842b8f8dfacd28ef2052", size = 5262803, upload-time = "2026-02-11T04:20:47.653Z" }, - { url = "https://files.pythonhosted.org/packages/d6/71/5026395b290ff404b836e636f51d7297e6c83beceaa87c592718747e670f/pillow-12.1.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:adebb5bee0f0af4909c30db0d890c773d1a92ffe83da908e2e9e720f8edf3984", size = 4657601, upload-time = "2026-02-11T04:20:49.328Z" }, - { url = "https://files.pythonhosted.org/packages/b1/2e/1001613d941c67442f745aff0f7cc66dd8df9a9c084eb497e6a543ee6f7e/pillow-12.1.1-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:bb66b7cc26f50977108790e2456b7921e773f23db5630261102233eb355a3b79", size = 6234995, upload-time = "2026-02-11T04:20:51.032Z" }, - { url = "https://files.pythonhosted.org/packages/07/26/246ab11455b2549b9233dbd44d358d033a2f780fa9007b61a913c5b2d24e/pillow-12.1.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:aee2810642b2898bb187ced9b349e95d2a7272930796e022efaf12e99dccd293", size = 8045012, upload-time = "2026-02-11T04:20:52.882Z" }, - { url = "https://files.pythonhosted.org/packages/b2/8b/07587069c27be7535ac1fe33874e32de118fbd34e2a73b7f83436a88368c/pillow-12.1.1-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:a0b1cd6232e2b618adcc54d9882e4e662a089d5768cd188f7c245b4c8c44a397", size = 6349638, upload-time = "2026-02-11T04:20:54.444Z" }, - { url = "https://files.pythonhosted.org/packages/ff/79/6df7b2ee763d619cda2fb4fea498e5f79d984dae304d45a8999b80d6cf5c/pillow-12.1.1-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:7aac39bcf8d4770d089588a2e1dd111cbaa42df5a94be3114222057d68336bd0", size = 7041540, upload-time = "2026-02-11T04:20:55.97Z" }, - { url = "https://files.pythonhosted.org/packages/2c/5e/2ba19e7e7236d7529f4d873bdaf317a318896bac289abebd4bb00ef247f0/pillow-12.1.1-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:ab174cd7d29a62dd139c44bf74b698039328f45cb03b4596c43473a46656b2f3", size = 6462613, upload-time = "2026-02-11T04:20:57.542Z" }, - { url = "https://files.pythonhosted.org/packages/03/03/31216ec124bb5c3dacd74ce8efff4cc7f52643653bad4825f8f08c697743/pillow-12.1.1-cp312-cp312-musllinux_1_2_x86_64.whl", hash = 
"sha256:339ffdcb7cbeaa08221cd401d517d4b1fe7a9ed5d400e4a8039719238620ca35", size = 7166745, upload-time = "2026-02-11T04:20:59.196Z" }, - { url = "https://files.pythonhosted.org/packages/1f/e7/7c4552d80052337eb28653b617eafdef39adfb137c49dd7e831b8dc13bc5/pillow-12.1.1-cp312-cp312-win32.whl", hash = "sha256:5d1f9575a12bed9e9eedd9a4972834b08c97a352bd17955ccdebfeca5913fa0a", size = 6328823, upload-time = "2026-02-11T04:21:01.385Z" }, - { url = "https://files.pythonhosted.org/packages/3d/17/688626d192d7261bbbf98846fc98995726bddc2c945344b65bec3a29d731/pillow-12.1.1-cp312-cp312-win_amd64.whl", hash = "sha256:21329ec8c96c6e979cd0dfd29406c40c1d52521a90544463057d2aaa937d66a6", size = 7033367, upload-time = "2026-02-11T04:21:03.536Z" }, - { url = "https://files.pythonhosted.org/packages/ed/fe/a0ef1f73f939b0eca03ee2c108d0043a87468664770612602c63266a43c4/pillow-12.1.1-cp312-cp312-win_arm64.whl", hash = "sha256:af9a332e572978f0218686636610555ae3defd1633597be015ed50289a03c523", size = 2453811, upload-time = "2026-02-11T04:21:05.116Z" }, - { url = "https://files.pythonhosted.org/packages/d5/11/6db24d4bd7685583caeae54b7009584e38da3c3d4488ed4cd25b439de486/pillow-12.1.1-cp313-cp313-ios_13_0_arm64_iphoneos.whl", hash = "sha256:d242e8ac078781f1de88bf823d70c1a9b3c7950a44cdf4b7c012e22ccbcd8e4e", size = 4062689, upload-time = "2026-02-11T04:21:06.804Z" }, - { url = "https://files.pythonhosted.org/packages/33/c0/ce6d3b1fe190f0021203e0d9b5b99e57843e345f15f9ef22fcd43842fd21/pillow-12.1.1-cp313-cp313-ios_13_0_arm64_iphonesimulator.whl", hash = "sha256:02f84dfad02693676692746df05b89cf25597560db2857363a208e393429f5e9", size = 4138535, upload-time = "2026-02-11T04:21:08.452Z" }, - { url = "https://files.pythonhosted.org/packages/a0/c6/d5eb6a4fb32a3f9c21a8c7613ec706534ea1cf9f4b3663e99f0d83f6fca8/pillow-12.1.1-cp313-cp313-ios_13_0_x86_64_iphonesimulator.whl", hash = "sha256:e65498daf4b583091ccbb2556c7000abf0f3349fcd57ef7adc9a84a394ed29f6", size = 3601364, upload-time = "2026-02-11T04:21:10.194Z" }, - { url = "https://files.pythonhosted.org/packages/14/a1/16c4b823838ba4c9c52c0e6bbda903a3fe5a1bdbf1b8eb4fff7156f3e318/pillow-12.1.1-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:6c6db3b84c87d48d0088943bf33440e0c42370b99b1c2a7989216f7b42eede60", size = 5262561, upload-time = "2026-02-11T04:21:11.742Z" }, - { url = "https://files.pythonhosted.org/packages/bb/ad/ad9dc98ff24f485008aa5cdedaf1a219876f6f6c42a4626c08bc4e80b120/pillow-12.1.1-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:8b7e5304e34942bf62e15184219a7b5ad4ff7f3bb5cca4d984f37df1a0e1aee2", size = 4657460, upload-time = "2026-02-11T04:21:13.786Z" }, - { url = "https://files.pythonhosted.org/packages/9e/1b/f1a4ea9a895b5732152789326202a82464d5254759fbacae4deea3069334/pillow-12.1.1-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:18e5bddd742a44b7e6b1e773ab5db102bd7a94c32555ba656e76d319d19c3850", size = 6232698, upload-time = "2026-02-11T04:21:15.949Z" }, - { url = "https://files.pythonhosted.org/packages/95/f4/86f51b8745070daf21fd2e5b1fe0eb35d4db9ca26e6d58366562fb56a743/pillow-12.1.1-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:fc44ef1f3de4f45b50ccf9136999d71abb99dca7706bc75d222ed350b9fd2289", size = 8041706, upload-time = "2026-02-11T04:21:17.723Z" }, - { url = "https://files.pythonhosted.org/packages/29/9b/d6ecd956bb1266dd1045e995cce9b8d77759e740953a1c9aad9502a0461e/pillow-12.1.1-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = 
"sha256:5a8eb7ed8d4198bccbd07058416eeec51686b498e784eda166395a23eb99138e", size = 6346621, upload-time = "2026-02-11T04:21:19.547Z" }, - { url = "https://files.pythonhosted.org/packages/71/24/538bff45bde96535d7d998c6fed1a751c75ac7c53c37c90dc2601b243893/pillow-12.1.1-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:47b94983da0c642de92ced1702c5b6c292a84bd3a8e1d1702ff923f183594717", size = 7038069, upload-time = "2026-02-11T04:21:21.378Z" }, - { url = "https://files.pythonhosted.org/packages/94/0e/58cb1a6bc48f746bc4cb3adb8cabff73e2742c92b3bf7a220b7cf69b9177/pillow-12.1.1-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:518a48c2aab7ce596d3bf79d0e275661b846e86e4d0e7dec34712c30fe07f02a", size = 6460040, upload-time = "2026-02-11T04:21:23.148Z" }, - { url = "https://files.pythonhosted.org/packages/6c/57/9045cb3ff11eeb6c1adce3b2d60d7d299d7b273a2e6c8381a524abfdc474/pillow-12.1.1-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:a550ae29b95c6dc13cf69e2c9dc5747f814c54eeb2e32d683e5e93af56caa029", size = 7164523, upload-time = "2026-02-11T04:21:25.01Z" }, - { url = "https://files.pythonhosted.org/packages/73/f2/9be9cb99f2175f0d4dbadd6616ce1bf068ee54a28277ea1bf1fbf729c250/pillow-12.1.1-cp313-cp313-win32.whl", hash = "sha256:a003d7422449f6d1e3a34e3dd4110c22148336918ddbfc6a32581cd54b2e0b2b", size = 6332552, upload-time = "2026-02-11T04:21:27.238Z" }, - { url = "https://files.pythonhosted.org/packages/3f/eb/b0834ad8b583d7d9d42b80becff092082a1c3c156bb582590fcc973f1c7c/pillow-12.1.1-cp313-cp313-win_amd64.whl", hash = "sha256:344cf1e3dab3be4b1fa08e449323d98a2a3f819ad20f4b22e77a0ede31f0faa1", size = 7040108, upload-time = "2026-02-11T04:21:29.462Z" }, - { url = "https://files.pythonhosted.org/packages/d5/7d/fc09634e2aabdd0feabaff4a32f4a7d97789223e7c2042fd805ea4b4d2c2/pillow-12.1.1-cp313-cp313-win_arm64.whl", hash = "sha256:5c0dd1636633e7e6a0afe7bf6a51a14992b7f8e60de5789018ebbdfae55b040a", size = 2453712, upload-time = "2026-02-11T04:21:31.072Z" }, - { url = "https://files.pythonhosted.org/packages/19/2a/b9d62794fc8a0dd14c1943df68347badbd5511103e0d04c035ffe5cf2255/pillow-12.1.1-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:0330d233c1a0ead844fc097a7d16c0abff4c12e856c0b325f231820fee1f39da", size = 5264880, upload-time = "2026-02-11T04:21:32.865Z" }, - { url = "https://files.pythonhosted.org/packages/26/9d/e03d857d1347fa5ed9247e123fcd2a97b6220e15e9cb73ca0a8d91702c6e/pillow-12.1.1-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:5dae5f21afb91322f2ff791895ddd8889e5e947ff59f71b46041c8ce6db790bc", size = 4660616, upload-time = "2026-02-11T04:21:34.97Z" }, - { url = "https://files.pythonhosted.org/packages/f7/ec/8a6d22afd02570d30954e043f09c32772bfe143ba9285e2fdb11284952cd/pillow-12.1.1-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:2e0c664be47252947d870ac0d327fea7e63985a08794758aa8af5b6cb6ec0c9c", size = 6269008, upload-time = "2026-02-11T04:21:36.623Z" }, - { url = "https://files.pythonhosted.org/packages/3d/1d/6d875422c9f28a4a361f495a5f68d9de4a66941dc2c619103ca335fa6446/pillow-12.1.1-cp313-cp313t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:691ab2ac363b8217f7d31b3497108fb1f50faab2f75dfb03284ec2f217e87bf8", size = 8073226, upload-time = "2026-02-11T04:21:38.585Z" }, - { url = "https://files.pythonhosted.org/packages/a1/cd/134b0b6ee5eda6dc09e25e24b40fdafe11a520bc725c1d0bbaa5e00bf95b/pillow-12.1.1-cp313-cp313t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = 
"sha256:e9e8064fb1cc019296958595f6db671fba95209e3ceb0c4734c9baf97de04b20", size = 6380136, upload-time = "2026-02-11T04:21:40.562Z" }, - { url = "https://files.pythonhosted.org/packages/7a/a9/7628f013f18f001c1b98d8fffe3452f306a70dc6aba7d931019e0492f45e/pillow-12.1.1-cp313-cp313t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:472a8d7ded663e6162dafdf20015c486a7009483ca671cece7a9279b512fcb13", size = 7067129, upload-time = "2026-02-11T04:21:42.521Z" }, - { url = "https://files.pythonhosted.org/packages/1e/f8/66ab30a2193b277785601e82ee2d49f68ea575d9637e5e234faaa98efa4c/pillow-12.1.1-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:89b54027a766529136a06cfebeecb3a04900397a3590fd252160b888479517bf", size = 6491807, upload-time = "2026-02-11T04:21:44.22Z" }, - { url = "https://files.pythonhosted.org/packages/da/0b/a877a6627dc8318fdb84e357c5e1a758c0941ab1ddffdafd231983788579/pillow-12.1.1-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:86172b0831b82ce4f7877f280055892b31179e1576aa00d0df3bb1bbf8c3e524", size = 7190954, upload-time = "2026-02-11T04:21:46.114Z" }, - { url = "https://files.pythonhosted.org/packages/83/43/6f732ff85743cf746b1361b91665d9f5155e1483817f693f8d57ea93147f/pillow-12.1.1-cp313-cp313t-win32.whl", hash = "sha256:44ce27545b6efcf0fdbdceb31c9a5bdea9333e664cda58a7e674bb74608b3986", size = 6336441, upload-time = "2026-02-11T04:21:48.22Z" }, - { url = "https://files.pythonhosted.org/packages/3b/44/e865ef3986611bb75bfabdf94a590016ea327833f434558801122979cd0e/pillow-12.1.1-cp313-cp313t-win_amd64.whl", hash = "sha256:a285e3eb7a5a45a2ff504e31f4a8d1b12ef62e84e5411c6804a42197c1cf586c", size = 7045383, upload-time = "2026-02-11T04:21:50.015Z" }, - { url = "https://files.pythonhosted.org/packages/a8/c6/f4fb24268d0c6908b9f04143697ea18b0379490cb74ba9e8d41b898bd005/pillow-12.1.1-cp313-cp313t-win_arm64.whl", hash = "sha256:cc7d296b5ea4d29e6570dabeaed58d31c3fea35a633a69679fb03d7664f43fb3", size = 2456104, upload-time = "2026-02-11T04:21:51.633Z" }, - { url = "https://files.pythonhosted.org/packages/56/11/5d43209aa4cb58e0cc80127956ff1796a68b928e6324bbf06ef4db34367b/pillow-12.1.1-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:600fd103672b925fe62ed08e0d874ea34d692474df6f4bf7ebe148b30f89f39f", size = 5228606, upload-time = "2026-02-11T04:22:52.106Z" }, - { url = "https://files.pythonhosted.org/packages/5f/d5/3b005b4e4fda6698b371fa6c21b097d4707585d7db99e98d9b0b87ac612a/pillow-12.1.1-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:665e1b916b043cef294bc54d47bf02d87e13f769bc4bc5fa225a24b3a6c5aca9", size = 4622321, upload-time = "2026-02-11T04:22:53.827Z" }, - { url = "https://files.pythonhosted.org/packages/df/36/ed3ea2d594356fd8037e5a01f6156c74bc8d92dbb0fa60746cc96cabb6e8/pillow-12.1.1-pp311-pypy311_pp73-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:495c302af3aad1ca67420ddd5c7bd480c8867ad173528767d906428057a11f0e", size = 5247579, upload-time = "2026-02-11T04:22:56.094Z" }, - { url = "https://files.pythonhosted.org/packages/54/9a/9cc3e029683cf6d20ae5085da0dafc63148e3252c2f13328e553aaa13cfb/pillow-12.1.1-pp311-pypy311_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:8fd420ef0c52c88b5a035a0886f367748c72147b2b8f384c9d12656678dfdfa9", size = 6989094, upload-time = "2026-02-11T04:22:58.288Z" }, - { url = "https://files.pythonhosted.org/packages/00/98/fc53ab36da80b88df0967896b6c4b4cd948a0dc5aa40a754266aa3ae48b3/pillow-12.1.1-pp311-pypy311_pp73-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = 
"sha256:f975aa7ef9684ce7e2c18a3aa8f8e2106ce1e46b94ab713d156b2898811651d3", size = 5313850, upload-time = "2026-02-11T04:23:00.554Z" }, - { url = "https://files.pythonhosted.org/packages/30/02/00fa585abfd9fe9d73e5f6e554dc36cc2b842898cbfc46d70353dae227f8/pillow-12.1.1-pp311-pypy311_pp73-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:8089c852a56c2966cf18835db62d9b34fef7ba74c726ad943928d494fa7f4735", size = 5963343, upload-time = "2026-02-11T04:23:02.934Z" }, - { url = "https://files.pythonhosted.org/packages/f2/26/c56ce33ca856e358d27fda9676c055395abddb82c35ac0f593877ed4562e/pillow-12.1.1-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:cb9bb857b2d057c6dfc72ac5f3b44836924ba15721882ef103cecb40d002d80e", size = 7029880, upload-time = "2026-02-11T04:23:04.783Z" }, +version = "12.2.0" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/8c/21/c2bcdd5906101a30244eaffc1b6e6ce71a31bd0742a01eb89e660ebfac2d/pillow-12.2.0.tar.gz", hash = "sha256:a830b1a40919539d07806aa58e1b114df53ddd43213d9c8b75847eee6c0182b5", size = 46987819, upload-time = "2026-04-01T14:46:17.687Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/68/e1/748f5663efe6edcfc4e74b2b93edfb9b8b99b67f21a854c3ae416500a2d9/pillow-12.2.0-cp311-cp311-macosx_10_10_x86_64.whl", hash = "sha256:8be29e59487a79f173507c30ddf57e733a357f67881430449bb32614075a40ab", size = 5354347, upload-time = "2026-04-01T14:42:44.255Z" }, + { url = "https://files.pythonhosted.org/packages/47/a1/d5ff69e747374c33a3b53b9f98cca7889fce1fd03d79cdc4e1bccc6c5a87/pillow-12.2.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:71cde9a1e1551df7d34a25462fc60325e8a11a82cc2e2f54578e5e9a1e153d65", size = 4695873, upload-time = "2026-04-01T14:42:46.452Z" }, + { url = "https://files.pythonhosted.org/packages/df/21/e3fbdf54408a973c7f7f89a23b2cb97a7ef30c61ab4142af31eee6aebc88/pillow-12.2.0-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:f490f9368b6fc026f021db16d7ec2fbf7d89e2edb42e8ec09d2c60505f5729c7", size = 6280168, upload-time = "2026-04-01T14:42:49.228Z" }, + { url = "https://files.pythonhosted.org/packages/d3/f1/00b7278c7dd52b17ad4329153748f87b6756ec195ff786c2bdf12518337d/pillow-12.2.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:8bd7903a5f2a4545f6fd5935c90058b89d30045568985a71c79f5fd6edf9b91e", size = 8088188, upload-time = "2026-04-01T14:42:51.735Z" }, + { url = "https://files.pythonhosted.org/packages/ad/cf/220a5994ef1b10e70e85748b75649d77d506499352be135a4989c957b701/pillow-12.2.0-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:3997232e10d2920a68d25191392e3a4487d8183039e1c74c2297f00ed1c50705", size = 6394401, upload-time = "2026-04-01T14:42:54.343Z" }, + { url = "https://files.pythonhosted.org/packages/e9/bd/e51a61b1054f09437acfbc2ff9106c30d1eb76bc1453d428399946781253/pillow-12.2.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e74473c875d78b8e9d5da2a70f7099549f9eb37ded4e2f6a463e60125bccd176", size = 7079655, upload-time = "2026-04-01T14:42:56.954Z" }, + { url = "https://files.pythonhosted.org/packages/6b/3d/45132c57d5fb4b5744567c3817026480ac7fc3ce5d4c47902bc0e7f6f853/pillow-12.2.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:56a3f9c60a13133a98ecff6197af34d7824de9b7b38c3654861a725c970c197b", size = 6503105, upload-time = "2026-04-01T14:42:59.847Z" }, + { url = 
"https://files.pythonhosted.org/packages/7d/2e/9df2fc1e82097b1df3dce58dc43286aa01068e918c07574711fcc53e6fb4/pillow-12.2.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:90e6f81de50ad6b534cab6e5aef77ff6e37722b2f5d908686f4a5c9eba17a909", size = 7203402, upload-time = "2026-04-01T14:43:02.664Z" }, + { url = "https://files.pythonhosted.org/packages/bd/2e/2941e42858ebb67e50ae741473de81c2984e6eff7b397017623c676e2e8d/pillow-12.2.0-cp311-cp311-win32.whl", hash = "sha256:8c984051042858021a54926eb597d6ee3012393ce9c181814115df4c60b9a808", size = 6378149, upload-time = "2026-04-01T14:43:05.274Z" }, + { url = "https://files.pythonhosted.org/packages/69/42/836b6f3cd7f3e5fa10a1f1a5420447c17966044c8fbf589cc0452d5502db/pillow-12.2.0-cp311-cp311-win_amd64.whl", hash = "sha256:6e6b2a0c538fc200b38ff9eb6628228b77908c319a005815f2dde585a0664b60", size = 7082626, upload-time = "2026-04-01T14:43:08.557Z" }, + { url = "https://files.pythonhosted.org/packages/c2/88/549194b5d6f1f494b485e493edc6693c0a16f4ada488e5bd974ed1f42fad/pillow-12.2.0-cp311-cp311-win_arm64.whl", hash = "sha256:9a8a34cc89c67a65ea7437ce257cea81a9dad65b29805f3ecee8c8fe8ff25ffe", size = 2463531, upload-time = "2026-04-01T14:43:10.743Z" }, + { url = "https://files.pythonhosted.org/packages/58/be/7482c8a5ebebbc6470b3eb791812fff7d5e0216c2be3827b30b8bb6603ed/pillow-12.2.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:2d192a155bbcec180f8564f693e6fd9bccff5a7af9b32e2e4bf8c9c69dbad6b5", size = 5308279, upload-time = "2026-04-01T14:43:13.246Z" }, + { url = "https://files.pythonhosted.org/packages/d8/95/0a351b9289c2b5cbde0bacd4a83ebc44023e835490a727b2a3bd60ddc0f4/pillow-12.2.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:f3f40b3c5a968281fd507d519e444c35f0ff171237f4fdde090dd60699458421", size = 4695490, upload-time = "2026-04-01T14:43:15.584Z" }, + { url = "https://files.pythonhosted.org/packages/de/af/4e8e6869cbed569d43c416fad3dc4ecb944cb5d9492defaed89ddd6fe871/pillow-12.2.0-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:03e7e372d5240cc23e9f07deca4d775c0817bffc641b01e9c3af208dbd300987", size = 6284462, upload-time = "2026-04-01T14:43:18.268Z" }, + { url = "https://files.pythonhosted.org/packages/e9/9e/c05e19657fd57841e476be1ab46c4d501bffbadbafdc31a6d665f8b737b6/pillow-12.2.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:b86024e52a1b269467a802258c25521e6d742349d760728092e1bc2d135b4d76", size = 8094744, upload-time = "2026-04-01T14:43:20.716Z" }, + { url = "https://files.pythonhosted.org/packages/2b/54/1789c455ed10176066b6e7e6da1b01e50e36f94ba584dc68d9eebfe9156d/pillow-12.2.0-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:7371b48c4fa448d20d2714c9a1f775a81155050d383333e0a6c15b1123dda005", size = 6398371, upload-time = "2026-04-01T14:43:23.443Z" }, + { url = "https://files.pythonhosted.org/packages/43/e3/fdc657359e919462369869f1c9f0e973f353f9a9ee295a39b1fea8ee1a77/pillow-12.2.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:62f5409336adb0663b7caa0da5c7d9e7bdbaae9ce761d34669420c2a801b2780", size = 7087215, upload-time = "2026-04-01T14:43:26.758Z" }, + { url = "https://files.pythonhosted.org/packages/8b/f8/2f6825e441d5b1959d2ca5adec984210f1ec086435b0ed5f52c19b3b8a6e/pillow-12.2.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:01afa7cf67f74f09523699b4e88c73fb55c13346d212a59a2db1f86b0a63e8c5", size = 6509783, upload-time = "2026-04-01T14:43:29.56Z" }, + { url = 
"https://files.pythonhosted.org/packages/67/f9/029a27095ad20f854f9dba026b3ea6428548316e057e6fc3545409e86651/pillow-12.2.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:fc3d34d4a8fbec3e88a79b92e5465e0f9b842b628675850d860b8bd300b159f5", size = 7212112, upload-time = "2026-04-01T14:43:32.091Z" }, + { url = "https://files.pythonhosted.org/packages/be/42/025cfe05d1be22dbfdb4f264fe9de1ccda83f66e4fc3aac94748e784af04/pillow-12.2.0-cp312-cp312-win32.whl", hash = "sha256:58f62cc0f00fd29e64b29f4fd923ffdb3859c9f9e6105bfc37ba1d08994e8940", size = 6378489, upload-time = "2026-04-01T14:43:34.601Z" }, + { url = "https://files.pythonhosted.org/packages/5d/7b/25a221d2c761c6a8ae21bfa3874988ff2583e19cf8a27bf2fee358df7942/pillow-12.2.0-cp312-cp312-win_amd64.whl", hash = "sha256:7f84204dee22a783350679a0333981df803dac21a0190d706a50475e361c93f5", size = 7084129, upload-time = "2026-04-01T14:43:37.213Z" }, + { url = "https://files.pythonhosted.org/packages/10/e1/542a474affab20fd4a0f1836cb234e8493519da6b76899e30bcc5d990b8b/pillow-12.2.0-cp312-cp312-win_arm64.whl", hash = "sha256:af73337013e0b3b46f175e79492d96845b16126ddf79c438d7ea7ff27783a414", size = 2463612, upload-time = "2026-04-01T14:43:39.421Z" }, + { url = "https://files.pythonhosted.org/packages/4a/01/53d10cf0dbad820a8db274d259a37ba50b88b24768ddccec07355382d5ad/pillow-12.2.0-cp313-cp313-ios_13_0_arm64_iphoneos.whl", hash = "sha256:8297651f5b5679c19968abefd6bb84d95fe30ef712eb1b2d9b2d31ca61267f4c", size = 4100837, upload-time = "2026-04-01T14:43:41.506Z" }, + { url = "https://files.pythonhosted.org/packages/0f/98/f3a6657ecb698c937f6c76ee564882945f29b79bad496abcba0e84659ec5/pillow-12.2.0-cp313-cp313-ios_13_0_arm64_iphonesimulator.whl", hash = "sha256:50d8520da2a6ce0af445fa6d648c4273c3eeefbc32d7ce049f22e8b5c3daecc2", size = 4176528, upload-time = "2026-04-01T14:43:43.773Z" }, + { url = "https://files.pythonhosted.org/packages/69/bc/8986948f05e3ea490b8442ea1c1d4d990b24a7e43d8a51b2c7d8b1dced36/pillow-12.2.0-cp313-cp313-ios_13_0_x86_64_iphonesimulator.whl", hash = "sha256:766cef22385fa1091258ad7e6216792b156dc16d8d3fa607e7545b2b72061f1c", size = 3640401, upload-time = "2026-04-01T14:43:45.87Z" }, + { url = "https://files.pythonhosted.org/packages/34/46/6c717baadcd62bc8ed51d238d521ab651eaa74838291bda1f86fe1f864c9/pillow-12.2.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:5d2fd0fa6b5d9d1de415060363433f28da8b1526c1c129020435e186794b3795", size = 5308094, upload-time = "2026-04-01T14:43:48.438Z" }, + { url = "https://files.pythonhosted.org/packages/71/43/905a14a8b17fdb1ccb58d282454490662d2cb89a6bfec26af6d3520da5ec/pillow-12.2.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:56b25336f502b6ed02e889f4ece894a72612fe885889a6e8c4c80239ff6e5f5f", size = 4695402, upload-time = "2026-04-01T14:43:51.292Z" }, + { url = "https://files.pythonhosted.org/packages/73/dd/42107efcb777b16fa0393317eac58f5b5cf30e8392e266e76e51cff28c3d/pillow-12.2.0-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:f1c943e96e85df3d3478f7b691f229887e143f81fedab9b20205349ab04d73ed", size = 6280005, upload-time = "2026-04-01T14:43:54.242Z" }, + { url = "https://files.pythonhosted.org/packages/a8/68/b93e09e5e8549019e61acf49f65b1a8530765a7f812c77a7461bca7e4494/pillow-12.2.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:03f6fab9219220f041c74aeaa2939ff0062bd5c364ba9ce037197f4c6d498cd9", size = 8090669, upload-time = "2026-04-01T14:43:57.335Z" }, + { url = 
"https://files.pythonhosted.org/packages/4b/6e/3ccb54ce8ec4ddd1accd2d89004308b7b0b21c4ac3d20fa70af4760a4330/pillow-12.2.0-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5cdfebd752ec52bf5bb4e35d9c64b40826bc5b40a13df7c3cda20a2c03a0f5ed", size = 6395194, upload-time = "2026-04-01T14:43:59.864Z" }, + { url = "https://files.pythonhosted.org/packages/67/ee/21d4e8536afd1a328f01b359b4d3997b291ffd35a237c877b331c1c3b71c/pillow-12.2.0-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:eedf4b74eda2b5a4b2b2fb4c006d6295df3bf29e459e198c90ea48e130dc75c3", size = 7082423, upload-time = "2026-04-01T14:44:02.74Z" }, + { url = "https://files.pythonhosted.org/packages/78/5f/e9f86ab0146464e8c133fe85df987ed9e77e08b29d8d35f9f9f4d6f917ba/pillow-12.2.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:00a2865911330191c0b818c59103b58a5e697cae67042366970a6b6f1b20b7f9", size = 6505667, upload-time = "2026-04-01T14:44:05.381Z" }, + { url = "https://files.pythonhosted.org/packages/ed/1e/409007f56a2fdce61584fd3acbc2bbc259857d555196cedcadc68c015c82/pillow-12.2.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:1e1757442ed87f4912397c6d35a0db6a7b52592156014706f17658ff58bbf795", size = 7208580, upload-time = "2026-04-01T14:44:08.39Z" }, + { url = "https://files.pythonhosted.org/packages/23/c4/7349421080b12fb35414607b8871e9534546c128a11965fd4a7002ccfbee/pillow-12.2.0-cp313-cp313-win32.whl", hash = "sha256:144748b3af2d1b358d41286056d0003f47cb339b8c43a9ea42f5fea4d8c66b6e", size = 6375896, upload-time = "2026-04-01T14:44:11.197Z" }, + { url = "https://files.pythonhosted.org/packages/3f/82/8a3739a5e470b3c6cbb1d21d315800d8e16bff503d1f16b03a4ec3212786/pillow-12.2.0-cp313-cp313-win_amd64.whl", hash = "sha256:390ede346628ccc626e5730107cde16c42d3836b89662a115a921f28440e6a3b", size = 7081266, upload-time = "2026-04-01T14:44:13.947Z" }, + { url = "https://files.pythonhosted.org/packages/c3/25/f968f618a062574294592f668218f8af564830ccebdd1fa6200f598e65c5/pillow-12.2.0-cp313-cp313-win_arm64.whl", hash = "sha256:8023abc91fba39036dbce14a7d6535632f99c0b857807cbbbf21ecc9f4717f06", size = 2463508, upload-time = "2026-04-01T14:44:16.312Z" }, + { url = "https://files.pythonhosted.org/packages/4d/a4/b342930964e3cb4dce5038ae34b0eab4653334995336cd486c5a8c25a00c/pillow-12.2.0-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:042db20a421b9bafecc4b84a8b6e444686bd9d836c7fd24542db3e7df7baad9b", size = 5309927, upload-time = "2026-04-01T14:44:18.89Z" }, + { url = "https://files.pythonhosted.org/packages/9f/de/23198e0a65a9cf06123f5435a5d95cea62a635697f8f03d134d3f3a96151/pillow-12.2.0-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:dd025009355c926a84a612fecf58bb315a3f6814b17ead51a8e48d3823d9087f", size = 4698624, upload-time = "2026-04-01T14:44:21.115Z" }, + { url = "https://files.pythonhosted.org/packages/01/a6/1265e977f17d93ea37aa28aa81bad4fa597933879fac2520d24e021c8da3/pillow-12.2.0-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:88ddbc66737e277852913bd1e07c150cc7bb124539f94c4e2df5344494e0a612", size = 6321252, upload-time = "2026-04-01T14:44:23.663Z" }, + { url = "https://files.pythonhosted.org/packages/3c/83/5982eb4a285967baa70340320be9f88e57665a387e3a53a7f0db8231a0cd/pillow-12.2.0-cp313-cp313t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:d362d1878f00c142b7e1a16e6e5e780f02be8195123f164edf7eddd911eefe7c", size = 8126550, upload-time = "2026-04-01T14:44:26.772Z" }, + { url = 
"https://files.pythonhosted.org/packages/4e/48/6ffc514adce69f6050d0753b1a18fd920fce8cac87620d5a31231b04bfc5/pillow-12.2.0-cp313-cp313t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:2c727a6d53cb0018aadd8018c2b938376af27914a68a492f59dfcaca650d5eea", size = 6433114, upload-time = "2026-04-01T14:44:29.615Z" }, + { url = "https://files.pythonhosted.org/packages/36/a3/f9a77144231fb8d40ee27107b4463e205fa4677e2ca2548e14da5cf18dce/pillow-12.2.0-cp313-cp313t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:efd8c21c98c5cc60653bcb311bef2ce0401642b7ce9d09e03a7da87c878289d4", size = 7115667, upload-time = "2026-04-01T14:44:32.773Z" }, + { url = "https://files.pythonhosted.org/packages/c1/fc/ac4ee3041e7d5a565e1c4fd72a113f03b6394cc72ab7089d27608f8aaccb/pillow-12.2.0-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:9f08483a632889536b8139663db60f6724bfcb443c96f1b18855860d7d5c0fd4", size = 6538966, upload-time = "2026-04-01T14:44:35.252Z" }, + { url = "https://files.pythonhosted.org/packages/c0/a8/27fb307055087f3668f6d0a8ccb636e7431d56ed0750e07a60547b1e083e/pillow-12.2.0-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:dac8d77255a37e81a2efcbd1fc05f1c15ee82200e6c240d7e127e25e365c39ea", size = 7238241, upload-time = "2026-04-01T14:44:37.875Z" }, + { url = "https://files.pythonhosted.org/packages/ad/4b/926ab182c07fccae9fcb120043464e1ff1564775ec8864f21a0ebce6ac25/pillow-12.2.0-cp313-cp313t-win32.whl", hash = "sha256:ee3120ae9dff32f121610bb08e4313be87e03efeadfc6c0d18f89127e24d0c24", size = 6379592, upload-time = "2026-04-01T14:44:40.336Z" }, + { url = "https://files.pythonhosted.org/packages/c2/c4/f9e476451a098181b30050cc4c9a3556b64c02cf6497ea421ac047e89e4b/pillow-12.2.0-cp313-cp313t-win_amd64.whl", hash = "sha256:325ca0528c6788d2a6c3d40e3568639398137346c3d6e66bb61db96b96511c98", size = 7085542, upload-time = "2026-04-01T14:44:43.251Z" }, + { url = "https://files.pythonhosted.org/packages/00/a4/285f12aeacbe2d6dc36c407dfbbe9e96d4a80b0fb710a337f6d2ad978c75/pillow-12.2.0-cp313-cp313t-win_arm64.whl", hash = "sha256:2e5a76d03a6c6dcef67edabda7a52494afa4035021a79c8558e14af25313d453", size = 2465765, upload-time = "2026-04-01T14:44:45.996Z" }, + { url = "https://files.pythonhosted.org/packages/4e/b7/2437044fb910f499610356d1352e3423753c98e34f915252aafecc64889f/pillow-12.2.0-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:0538bd5e05efec03ae613fd89c4ce0368ecd2ba239cc25b9f9be7ed426b0af1f", size = 5273969, upload-time = "2026-04-01T14:45:55.538Z" }, + { url = "https://files.pythonhosted.org/packages/f6/f4/8316e31de11b780f4ac08ef3654a75555e624a98db1056ecb2122d008d5a/pillow-12.2.0-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:394167b21da716608eac917c60aa9b969421b5dcbbe02ae7f013e7b85811c69d", size = 4659674, upload-time = "2026-04-01T14:45:58.093Z" }, + { url = "https://files.pythonhosted.org/packages/d4/37/664fca7201f8bb2aa1d20e2c3d5564a62e6ae5111741966c8319ca802361/pillow-12.2.0-pp311-pypy311_pp73-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:5d04bfa02cc2d23b497d1e90a0f927070043f6cbf303e738300532379a4b4e0f", size = 5288479, upload-time = "2026-04-01T14:46:01.141Z" }, + { url = "https://files.pythonhosted.org/packages/49/62/5b0ed78fce87346be7a5cfcfaaad91f6a1f98c26f86bdbafa2066c647ef6/pillow-12.2.0-pp311-pypy311_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:0c838a5125cee37e68edec915651521191cef1e6aa336b855f495766e77a366e", size = 7032230, upload-time = "2026-04-01T14:46:03.874Z" }, + { url = 
"https://files.pythonhosted.org/packages/c3/28/ec0fc38107fc32536908034e990c47914c57cd7c5a3ece4d8d8f7ffd7e27/pillow-12.2.0-pp311-pypy311_pp73-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:4a6c9fa44005fa37a91ebfc95d081e8079757d2e904b27103f4f5fa6f0bf78c0", size = 5355404, upload-time = "2026-04-01T14:46:06.33Z" }, + { url = "https://files.pythonhosted.org/packages/5e/8b/51b0eddcfa2180d60e41f06bd6d0a62202b20b59c68f5a132e615b75aecf/pillow-12.2.0-pp311-pypy311_pp73-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:25373b66e0dd5905ed63fa3cae13c82fbddf3079f2c8bf15c6fb6a35586324c1", size = 6002215, upload-time = "2026-04-01T14:46:08.83Z" }, + { url = "https://files.pythonhosted.org/packages/bc/60/5382c03e1970de634027cee8e1b7d39776b778b81812aaf45b694dfe9e28/pillow-12.2.0-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:bfa9c230d2fe991bed5318a5f119bd6780cda2915cca595393649fc118ab895e", size = 7080946, upload-time = "2026-04-01T14:46:11.734Z" }, ] [[package]] name = "platformdirs" -version = "4.9.4" +version = "4.9.6" source = { registry = "https://pypi.org/simple" } -sdist = { url = "https://files.pythonhosted.org/packages/19/56/8d4c30c8a1d07013911a8fdbd8f89440ef9f08d07a1b50ab8ca8be5a20f9/platformdirs-4.9.4.tar.gz", hash = "sha256:1ec356301b7dc906d83f371c8f487070e99d3ccf9e501686456394622a01a934", size = 28737, upload-time = "2026-03-05T18:34:13.271Z" } +sdist = { url = "https://files.pythonhosted.org/packages/9f/4a/0883b8e3802965322523f0b200ecf33d31f10991d0401162f4b23c698b42/platformdirs-4.9.6.tar.gz", hash = "sha256:3bfa75b0ad0db84096ae777218481852c0ebc6c727b3168c1b9e0118e458cf0a", size = 29400, upload-time = "2026-04-09T00:04:10.812Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/63/d7/97f7e3a6abb67d8080dd406fd4df842c2be0efaf712d1c899c32a075027c/platformdirs-4.9.4-py3-none-any.whl", hash = "sha256:68a9a4619a666ea6439f2ff250c12a853cd1cbd5158d258bd824a7df6be2f868", size = 21216, upload-time = "2026-03-05T18:34:12.172Z" }, + { url = "https://files.pythonhosted.org/packages/75/a6/a0a304dc33b49145b21f4808d763822111e67d1c3a32b524a1baf947b6e1/platformdirs-4.9.6-py3-none-any.whl", hash = "sha256:e61adb1d5e5cb3441b4b7710bea7e4c12250ca49439228cc1021c00dcfac0917", size = 21348, upload-time = "2026-04-09T00:04:09.463Z" }, ] [[package]] @@ -2127,7 +2113,7 @@ wheels = [ [[package]] name = "pre-commit" -version = "4.5.1" +version = "4.6.0" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "cfgv" }, @@ -2136,9 +2122,9 @@ dependencies = [ { name = "pyyaml" }, { name = "virtualenv" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/40/f1/6d86a29246dfd2e9b6237f0b5823717f60cad94d47ddc26afa916d21f525/pre_commit-4.5.1.tar.gz", hash = "sha256:eb545fcff725875197837263e977ea257a402056661f09dae08e4b149b030a61", size = 198232, upload-time = "2025-12-16T21:14:33.552Z" } +sdist = { url = "https://files.pythonhosted.org/packages/8e/22/2de9408ac81acbb8a7d05d4cc064a152ccf33b3d480ebe0cd292153db239/pre_commit-4.6.0.tar.gz", hash = "sha256:718d2208cef53fdc38206e40524a6d4d9576d103eb16f0fec11c875e7716e9d9", size = 198525, upload-time = "2026-04-21T20:31:41.613Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/5d/19/fd3ef348460c80af7bb4669ea7926651d1f95c23ff2df18b9d24bab4f3fa/pre_commit-4.5.1-py2.py3-none-any.whl", hash = "sha256:3b3afd891e97337708c1674210f8eba659b52a38ea5f822ff142d10786221f77", size = 226437, upload-time = "2025-12-16T21:14:32.409Z" }, + { url = 
"https://files.pythonhosted.org/packages/80/6e/4b28b62ecb6aae56769c34a8ff1d661473ec1e9519e2d5f8b2c150086b26/pre_commit-4.6.0-py2.py3-none-any.whl", hash = "sha256:e2cf246f7299edcabcf15f9b0571fdce06058527f0a06535068a86d38089f29b", size = 226472, upload-time = "2026-04-21T20:31:40.092Z" }, ] [[package]] @@ -2264,51 +2250,51 @@ wheels = [ [[package]] name = "pyarrow" -version = "23.0.1" -source = { registry = "https://pypi.org/simple" } -sdist = { url = "https://files.pythonhosted.org/packages/88/22/134986a4cc224d593c1afde5494d18ff629393d74cc2eddb176669f234a4/pyarrow-23.0.1.tar.gz", hash = "sha256:b8c5873e33440b2bc2f4a79d2b47017a89c5a24116c055625e6f2ee50523f019", size = 1167336, upload-time = "2026-02-16T10:14:12.39Z" } -wheels = [ - { url = "https://files.pythonhosted.org/packages/b0/41/8e6b6ef7e225d4ceead8459427a52afdc23379768f54dd3566014d7618c1/pyarrow-23.0.1-cp311-cp311-macosx_12_0_arm64.whl", hash = "sha256:6f0147ee9e0386f519c952cc670eb4a8b05caa594eeffe01af0e25f699e4e9bb", size = 34302230, upload-time = "2026-02-16T10:09:03.859Z" }, - { url = "https://files.pythonhosted.org/packages/bf/4a/1472c00392f521fea03ae93408bf445cc7bfa1ab81683faf9bc188e36629/pyarrow-23.0.1-cp311-cp311-macosx_12_0_x86_64.whl", hash = "sha256:0ae6e17c828455b6265d590100c295193f93cc5675eb0af59e49dbd00d2de350", size = 35850050, upload-time = "2026-02-16T10:09:11.877Z" }, - { url = "https://files.pythonhosted.org/packages/0c/b2/bd1f2f05ded56af7f54d702c8364c9c43cd6abb91b0e9933f3d77b4f4132/pyarrow-23.0.1-cp311-cp311-manylinux_2_28_aarch64.whl", hash = "sha256:fed7020203e9ef273360b9e45be52a2a47d3103caf156a30ace5247ffb51bdbd", size = 44491918, upload-time = "2026-02-16T10:09:18.144Z" }, - { url = "https://files.pythonhosted.org/packages/0b/62/96459ef5b67957eac38a90f541d1c28833d1b367f014a482cb63f3b7cd2d/pyarrow-23.0.1-cp311-cp311-manylinux_2_28_x86_64.whl", hash = "sha256:26d50dee49d741ac0e82185033488d28d35be4d763ae6f321f97d1140eb7a0e9", size = 47562811, upload-time = "2026-02-16T10:09:25.792Z" }, - { url = "https://files.pythonhosted.org/packages/7d/94/1170e235add1f5f45a954e26cd0e906e7e74e23392dcb560de471f7366ec/pyarrow-23.0.1-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:3c30143b17161310f151f4a2bcfe41b5ff744238c1039338779424e38579d701", size = 48183766, upload-time = "2026-02-16T10:09:34.645Z" }, - { url = "https://files.pythonhosted.org/packages/0e/2d/39a42af4570377b99774cdb47f63ee6c7da7616bd55b3d5001aa18edfe4f/pyarrow-23.0.1-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:db2190fa79c80a23fdd29fef4b8992893f024ae7c17d2f5f4db7171fa30c2c78", size = 50607669, upload-time = "2026-02-16T10:09:44.153Z" }, - { url = "https://files.pythonhosted.org/packages/00/ca/db94101c187f3df742133ac837e93b1f269ebdac49427f8310ee40b6a58f/pyarrow-23.0.1-cp311-cp311-win_amd64.whl", hash = "sha256:f00f993a8179e0e1c9713bcc0baf6d6c01326a406a9c23495ec1ba9c9ebf2919", size = 27527698, upload-time = "2026-02-16T10:09:50.263Z" }, - { url = "https://files.pythonhosted.org/packages/9a/4b/4166bb5abbfe6f750fc60ad337c43ecf61340fa52ab386da6e8dbf9e63c4/pyarrow-23.0.1-cp312-cp312-macosx_12_0_arm64.whl", hash = "sha256:f4b0dbfa124c0bb161f8b5ebb40f1a680b70279aa0c9901d44a2b5a20806039f", size = 34214575, upload-time = "2026-02-16T10:09:56.225Z" }, - { url = "https://files.pythonhosted.org/packages/e1/da/3f941e3734ac8088ea588b53e860baeddac8323ea40ce22e3d0baa865cc9/pyarrow-23.0.1-cp312-cp312-macosx_12_0_x86_64.whl", hash = "sha256:7707d2b6673f7de054e2e83d59f9e805939038eebe1763fe811ee8fa5c0cd1a7", size = 35832540, upload-time = "2026-02-16T10:10:03.428Z" 
}, - { url = "https://files.pythonhosted.org/packages/88/7c/3d841c366620e906d54430817531b877ba646310296df42ef697308c2705/pyarrow-23.0.1-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:86ff03fb9f1a320266e0de855dee4b17da6794c595d207f89bba40d16b5c78b9", size = 44470940, upload-time = "2026-02-16T10:10:10.704Z" }, - { url = "https://files.pythonhosted.org/packages/2c/a5/da83046273d990f256cb79796a190bbf7ec999269705ddc609403f8c6b06/pyarrow-23.0.1-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:813d99f31275919c383aab17f0f455a04f5a429c261cc411b1e9a8f5e4aaaa05", size = 47586063, upload-time = "2026-02-16T10:10:17.95Z" }, - { url = "https://files.pythonhosted.org/packages/5b/3c/b7d2ebcff47a514f47f9da1e74b7949138c58cfeb108cdd4ee62f43f0cf3/pyarrow-23.0.1-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:bf5842f960cddd2ef757d486041d57c96483efc295a8c4a0e20e704cbbf39c67", size = 48173045, upload-time = "2026-02-16T10:10:25.363Z" }, - { url = "https://files.pythonhosted.org/packages/43/b2/b40961262213beaba6acfc88698eb773dfce32ecdf34d19291db94c2bd73/pyarrow-23.0.1-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:564baf97c858ecc03ec01a41062e8f4698abc3e6e2acd79c01c2e97880a19730", size = 50621741, upload-time = "2026-02-16T10:10:33.477Z" }, - { url = "https://files.pythonhosted.org/packages/f6/70/1fdda42d65b28b078e93d75d371b2185a61da89dda4def8ba6ba41ebdeb4/pyarrow-23.0.1-cp312-cp312-win_amd64.whl", hash = "sha256:07deae7783782ac7250989a7b2ecde9b3c343a643f82e8a4df03d93b633006f0", size = 27620678, upload-time = "2026-02-16T10:10:39.31Z" }, - { url = "https://files.pythonhosted.org/packages/47/10/2cbe4c6f0fb83d2de37249567373d64327a5e4d8db72f486db42875b08f6/pyarrow-23.0.1-cp313-cp313-macosx_12_0_arm64.whl", hash = "sha256:6b8fda694640b00e8af3c824f99f789e836720aa8c9379fb435d4c4953a756b8", size = 34210066, upload-time = "2026-02-16T10:10:45.487Z" }, - { url = "https://files.pythonhosted.org/packages/cb/4f/679fa7e84dadbaca7a65f7cdba8d6c83febbd93ca12fa4adf40ba3b6362b/pyarrow-23.0.1-cp313-cp313-macosx_12_0_x86_64.whl", hash = "sha256:8ff51b1addc469b9444b7c6f3548e19dc931b172ab234e995a60aea9f6e6025f", size = 35825526, upload-time = "2026-02-16T10:10:52.266Z" }, - { url = "https://files.pythonhosted.org/packages/f9/63/d2747d930882c9d661e9398eefc54f15696547b8983aaaf11d4a2e8b5426/pyarrow-23.0.1-cp313-cp313-manylinux_2_28_aarch64.whl", hash = "sha256:71c5be5cbf1e1cb6169d2a0980850bccb558ddc9b747b6206435313c47c37677", size = 44473279, upload-time = "2026-02-16T10:11:01.557Z" }, - { url = "https://files.pythonhosted.org/packages/b3/93/10a48b5e238de6d562a411af6467e71e7aedbc9b87f8d3a35f1560ae30fb/pyarrow-23.0.1-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:9b6f4f17b43bc39d56fec96e53fe89d94bac3eb134137964371b45352d40d0c2", size = 47585798, upload-time = "2026-02-16T10:11:09.401Z" }, - { url = "https://files.pythonhosted.org/packages/5c/20/476943001c54ef078dbf9542280e22741219a184a0632862bca4feccd666/pyarrow-23.0.1-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:9fc13fc6c403d1337acab46a2c4346ca6c9dec5780c3c697cf8abfd5e19b6b37", size = 48179446, upload-time = "2026-02-16T10:11:17.781Z" }, - { url = "https://files.pythonhosted.org/packages/4b/b6/5dd0c47b335fcd8edba9bfab78ad961bd0fd55ebe53468cc393f45e0be60/pyarrow-23.0.1-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:5c16ed4f53247fa3ffb12a14d236de4213a4415d127fe9cebed33d51671113e2", size = 50623972, upload-time = "2026-02-16T10:11:26.185Z" }, - { url = 
"https://files.pythonhosted.org/packages/d5/09/a532297c9591a727d67760e2e756b83905dd89adb365a7f6e9c72578bcc1/pyarrow-23.0.1-cp313-cp313-win_amd64.whl", hash = "sha256:cecfb12ef629cf6be0b1887f9f86463b0dd3dc3195ae6224e74006be4736035a", size = 27540749, upload-time = "2026-02-16T10:12:23.297Z" }, - { url = "https://files.pythonhosted.org/packages/a5/8e/38749c4b1303e6ae76b3c80618f84861ae0c55dd3c2273842ea6f8258233/pyarrow-23.0.1-cp313-cp313t-macosx_12_0_arm64.whl", hash = "sha256:29f7f7419a0e30264ea261fdc0e5fe63ce5a6095003db2945d7cd78df391a7e1", size = 34471544, upload-time = "2026-02-16T10:11:32.535Z" }, - { url = "https://files.pythonhosted.org/packages/a3/73/f237b2bc8c669212f842bcfd842b04fc8d936bfc9d471630569132dc920d/pyarrow-23.0.1-cp313-cp313t-macosx_12_0_x86_64.whl", hash = "sha256:33d648dc25b51fd8055c19e4261e813dfc4d2427f068bcecc8b53d01b81b0500", size = 35949911, upload-time = "2026-02-16T10:11:39.813Z" }, - { url = "https://files.pythonhosted.org/packages/0c/86/b912195eee0903b5611bf596833def7d146ab2d301afeb4b722c57ffc966/pyarrow-23.0.1-cp313-cp313t-manylinux_2_28_aarch64.whl", hash = "sha256:cd395abf8f91c673dd3589cadc8cc1ee4e8674fa61b2e923c8dd215d9c7d1f41", size = 44520337, upload-time = "2026-02-16T10:11:47.764Z" }, - { url = "https://files.pythonhosted.org/packages/69/c2/f2a717fb824f62d0be952ea724b4f6f9372a17eed6f704b5c9526f12f2f1/pyarrow-23.0.1-cp313-cp313t-manylinux_2_28_x86_64.whl", hash = "sha256:00be9576d970c31defb5c32eb72ef585bf600ef6d0a82d5eccaae96639cf9d07", size = 47548944, upload-time = "2026-02-16T10:11:56.607Z" }, - { url = "https://files.pythonhosted.org/packages/84/a7/90007d476b9f0dc308e3bc57b832d004f848fd6c0da601375d20d92d1519/pyarrow-23.0.1-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:c2139549494445609f35a5cda4eb94e2c9e4d704ce60a095b342f82460c73a83", size = 48236269, upload-time = "2026-02-16T10:12:04.47Z" }, - { url = "https://files.pythonhosted.org/packages/b0/3f/b16fab3e77709856eb6ac328ce35f57a6d4a18462c7ca5186ef31b45e0e0/pyarrow-23.0.1-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:7044b442f184d84e2351e5084600f0d7343d6117aabcbc1ac78eb1ae11eb4125", size = 50604794, upload-time = "2026-02-16T10:12:11.797Z" }, - { url = "https://files.pythonhosted.org/packages/e9/a1/22df0620a9fac31d68397a75465c344e83c3dfe521f7612aea33e27ab6c0/pyarrow-23.0.1-cp313-cp313t-win_amd64.whl", hash = "sha256:a35581e856a2fafa12f3f54fce4331862b1cfb0bef5758347a858a4aa9d6bae8", size = 27660642, upload-time = "2026-02-16T10:12:17.746Z" }, +version = "24.0.0" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/91/13/13e1069b351bdc3881266e11147ffccf687505dbb0ea74036237f5d454a5/pyarrow-24.0.0.tar.gz", hash = "sha256:85fe721a14dd823aca09127acbb06c3ca723efbd436c004f16bca601b04dcc83", size = 1180261, upload-time = "2026-04-21T10:51:25.837Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/62/c9/a47ab7ece0d86cbe6678418a0fbd1ac4bb493b9184a3891dfa0e7f287ae0/pyarrow-24.0.0-cp311-cp311-macosx_12_0_arm64.whl", hash = "sha256:b0e131f880cda8d04e076cee175a46fc0e8bc8b65c99c6c09dff6669335fde74", size = 35068898, upload-time = "2026-04-21T10:46:36.599Z" }, + { url = "https://files.pythonhosted.org/packages/d1/bc/8db86617a9a58008acf8913d6fed68ea2a46acb6de928db28d724c891a68/pyarrow-24.0.0-cp311-cp311-macosx_12_0_x86_64.whl", hash = "sha256:1b2fe7f9a5566401a0ef2571f197eb92358925c1f0c8dba305d6e43ea0871bb3", size = 36679915, upload-time = "2026-04-21T10:46:42.602Z" }, + { url = 
"https://files.pythonhosted.org/packages/eb/8e/fb178720400ef69db251eb4a9c3ccf4af269bc1feb5055529b8fc87170d1/pyarrow-24.0.0-cp311-cp311-manylinux_2_28_aarch64.whl", hash = "sha256:0b3537c00fb8d384f15ac1e79b6eb6db04a16514c8c1d22e59a9b95c8ba42868", size = 45697931, upload-time = "2026-04-21T10:46:48.403Z" }, + { url = "https://files.pythonhosted.org/packages/f3/27/99c42abe8e21b44f4917f62631f3aa31404882a2c41d8a4cd5c110e13d52/pyarrow-24.0.0-cp311-cp311-manylinux_2_28_x86_64.whl", hash = "sha256:14e31a3c9e35f1ab6356c6378f6f72830e6d2d5f1791df3774a7b097d18a6a1e", size = 48837449, upload-time = "2026-04-21T10:46:55.329Z" }, + { url = "https://files.pythonhosted.org/packages/36/b6/333749e2666e9032891125bf9c691146e92901bece62030ac1430e2e7c88/pyarrow-24.0.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:b7d9a514e73bc42711e6a35aaccf3587c520024fe0a25d830a1a8a27c15f4f57", size = 49395949, upload-time = "2026-04-21T10:47:01.869Z" }, + { url = "https://files.pythonhosted.org/packages/17/25/c5201706a2dd374e8ba6ee3fd7a8c89fb7ffc16eed5217a91fd2bd7f7626/pyarrow-24.0.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:b196eb3f931862af3fa84c2a253514d859c08e0d8fe020e07be12e75a5a9780c", size = 51912986, upload-time = "2026-04-21T10:47:09.872Z" }, + { url = "https://files.pythonhosted.org/packages/f8/d2/4d1bbba65320b21a49678d6fbdc6ff7c649251359fdcfc03568c4136231d/pyarrow-24.0.0-cp311-cp311-win_amd64.whl", hash = "sha256:35405aecb474e683fb36af650618fd5340ee5471fc65a21b36076a18bbc6c981", size = 27255371, upload-time = "2026-04-21T10:47:15.943Z" }, + { url = "https://files.pythonhosted.org/packages/b4/a9/9686d9f07837f91f775e8932659192e02c74f9d8920524b480b85212cc68/pyarrow-24.0.0-cp312-cp312-macosx_12_0_arm64.whl", hash = "sha256:6233c9ed9ab9d1db47de57d9753256d9dcffbf42db341576099f0fd9f6bf4810", size = 34981559, upload-time = "2026-04-21T10:47:22.17Z" }, + { url = "https://files.pythonhosted.org/packages/80/b6/0ddf0e9b6ead3474ab087ae598c76b031fc45532bf6a63f3a553440fb258/pyarrow-24.0.0-cp312-cp312-macosx_12_0_x86_64.whl", hash = "sha256:f7616236ec1bc2b15bfdec22a71ab38851c86f8f05ff64f379e1278cf20c634a", size = 36663654, upload-time = "2026-04-21T10:47:28.315Z" }, + { url = "https://files.pythonhosted.org/packages/7c/3b/926382efe8ce27ba729071d3566ade6dfb86bdf112f366000196b2f5780a/pyarrow-24.0.0-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:1617043b99bd33e5318ae18eb2919af09c71322ef1ca46566cdafc6e6712fb66", size = 45679394, upload-time = "2026-04-21T10:47:34.821Z" }, + { url = "https://files.pythonhosted.org/packages/b3/7a/829f7d9dfd37c207206081d6dad474d81dde29952401f07f2ba507814818/pyarrow-24.0.0-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:6165461f55ef6314f026de6638d661188e3455d3ec49834556a0ebbdbace18bb", size = 48863122, upload-time = "2026-04-21T10:47:42.056Z" }, + { url = "https://files.pythonhosted.org/packages/5f/e8/f88ce625fe8babaae64e8db2d417c7653adb3019b08aae85c5ed787dc816/pyarrow-24.0.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:3b13dedfe76a0ad2d1d859b0811b53827a4e9d93a0bcb05cf59333ab4980cc7e", size = 49376032, upload-time = "2026-04-21T10:47:48.967Z" }, + { url = "https://files.pythonhosted.org/packages/36/7a/82c363caa145fff88fb475da50d3bf52bb024f61917be5424c3392eaf878/pyarrow-24.0.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:25ea65d868eb04015cd18e6df2fbe98f07e5bda2abefabcb88fce39a947716f6", size = 51929490, upload-time = "2026-04-21T10:47:55.981Z" }, + { url = 
"https://files.pythonhosted.org/packages/66/1c/e3e72c8014ad2743ca64a701652c733cc5cbcee15c0463a32a8c55518d9e/pyarrow-24.0.0-cp312-cp312-win_amd64.whl", hash = "sha256:295f0a7f2e242dabd513737cf076007dc5b2d59237e3eca37b05c0c6446f3826", size = 27355660, upload-time = "2026-04-21T10:48:01.718Z" }, + { url = "https://files.pythonhosted.org/packages/6f/d3/a1abf004482026ddc17f4503db227787fa3cfe41ec5091ff20e4fea55e57/pyarrow-24.0.0-cp313-cp313-macosx_12_0_arm64.whl", hash = "sha256:02b001b3ed4723caa44f6cd1af2d5c86aa2cf9971dacc2ffa55b21237713dfba", size = 34976759, upload-time = "2026-04-21T10:48:07.258Z" }, + { url = "https://files.pythonhosted.org/packages/4f/4a/34f0a36d28a2dd32225301b79daad44e243dc1a2bb77d43b60749be255c4/pyarrow-24.0.0-cp313-cp313-macosx_12_0_x86_64.whl", hash = "sha256:04920d6a71aabd08a0417709efce97d45ea8e6fb733d9ca9ecffb13c67839f68", size = 36658471, upload-time = "2026-04-21T10:48:13.347Z" }, + { url = "https://files.pythonhosted.org/packages/1f/78/543b94712ae8bb1a6023bcc1acf1a740fbff8286747c289cd9468fced2a5/pyarrow-24.0.0-cp313-cp313-manylinux_2_28_aarch64.whl", hash = "sha256:a964266397740257f16f7bb2e4f08a0c81454004beab8ff59dd531b73610e9f2", size = 45675981, upload-time = "2026-04-21T10:48:20.201Z" }, + { url = "https://files.pythonhosted.org/packages/84/9f/8fb7c222b100d314137fa40ec050de56cd8c6d957d1cfff685ce72f15b17/pyarrow-24.0.0-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:6f066b179d68c413374294bc1735f68475457c933258df594443bb9d88ddc2a0", size = 48859172, upload-time = "2026-04-21T10:48:27.541Z" }, + { url = "https://files.pythonhosted.org/packages/a7/d3/1ea72538e6c8b3b475ed78d1049a2c518e655761ea50fe1171fc855fcab7/pyarrow-24.0.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:1183baeb14c5f587b1ec52831e665718ce632caab84b7cd6b85fd44f96114495", size = 49385733, upload-time = "2026-04-21T10:48:34.7Z" }, + { url = "https://files.pythonhosted.org/packages/c3/be/c3d8b06a1ba35f2260f8e1f771abbee7d5e345c0937aab90675706b1690a/pyarrow-24.0.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:806f24b4085453c197a5078218d1ee08783ebbba271badd153d1ae22a3ee804f", size = 51934335, upload-time = "2026-04-21T10:48:42.099Z" }, + { url = "https://files.pythonhosted.org/packages/9c/62/89e07a1e7329d2cde3e3c6994ba0839a24977a2beda8be6005ea3d860b99/pyarrow-24.0.0-cp313-cp313-win_amd64.whl", hash = "sha256:e4505fc6583f7b05ab854934896bcac8253b04ac1171a77dfb73efef92076d91", size = 27271748, upload-time = "2026-04-21T10:49:42.532Z" }, + { url = "https://files.pythonhosted.org/packages/17/1a/cff3a59f80b5b1658549d46611b67163f65e0664431c076ad728bf9d5af4/pyarrow-24.0.0-cp313-cp313t-macosx_12_0_arm64.whl", hash = "sha256:1a4e45017efbf115032e4475ee876d525e0e36c742214fbe405332480ecd6275", size = 35238554, upload-time = "2026-04-21T10:48:48.526Z" }, + { url = "https://files.pythonhosted.org/packages/a8/99/cce0f42a327bfef2c420fb6078a3eb834826e5d6697bf3009fe11d2ad051/pyarrow-24.0.0-cp313-cp313t-macosx_12_0_x86_64.whl", hash = "sha256:7986f1fa71cee060ad00758bcc79d3a93bab8559bf978fab9e53472a2e25a17b", size = 36782301, upload-time = "2026-04-21T10:48:55.181Z" }, + { url = "https://files.pythonhosted.org/packages/2a/66/8e560d5ff6793ca29aca213c53eec0dd482dd46cb93b2819e5aab52e4252/pyarrow-24.0.0-cp313-cp313t-manylinux_2_28_aarch64.whl", hash = "sha256:d3e0b61e8efb24ed38898e5cdc5fffa9124be480008d401a1f8071500494ae42", size = 45721929, upload-time = "2026-04-21T10:49:03.676Z" }, + { url = 
"https://files.pythonhosted.org/packages/27/0c/a26e25505d030716e078d9f16eb74973cbf0b33b672884e9f9da1c83b871/pyarrow-24.0.0-cp313-cp313t-manylinux_2_28_x86_64.whl", hash = "sha256:55a3bc1e3df3b5567b7d27ef551b2283f0c68a5e86f1cd56abc569da4f31335b", size = 48825365, upload-time = "2026-04-21T10:49:11.714Z" }, + { url = "https://files.pythonhosted.org/packages/5f/eb/771f9ecb0c65e73fe9dccdd1717901b9594f08c4515d000c7c62df573811/pyarrow-24.0.0-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:641f795b361874ac9da5294f8f443dfdbee355cf2bd9e3b8d97aaac2306b9b37", size = 49451819, upload-time = "2026-04-21T10:49:21.474Z" }, + { url = "https://files.pythonhosted.org/packages/48/da/61ae89a88732f5a785646f3ec6125dbb640fa98a540eb2b9889caa561403/pyarrow-24.0.0-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:8adc8e6ce5fccf5dc707046ae4914fd537def529709cc0d285d37a7f9cd442ca", size = 51909252, upload-time = "2026-04-21T10:49:31.164Z" }, + { url = "https://files.pythonhosted.org/packages/cb/1a/8dd5cafab7b66573fa91c03d06d213356ad4edd71813aa75e08ce2b3a844/pyarrow-24.0.0-cp313-cp313t-win_amd64.whl", hash = "sha256:9b18371ad2f44044b81a8d23bc2d8a9b6a6226dca775e8e16cfee640473d6c5d", size = 27388127, upload-time = "2026-04-21T10:49:37.334Z" }, ] [[package]] name = "pybtex" -version = "0.25.1" +version = "0.26.1" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "latexcodec" }, { name = "pyyaml" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/5f/bc/c2be05ca72f8c103670e983df8be26d1e288bc6556f487fa8cccaa27779f/pybtex-0.25.1.tar.gz", hash = "sha256:9eaf90267c7e83e225af89fea65c370afbf65f458220d3946a9e3049e1eca491", size = 406157, upload-time = "2025-06-26T13:27:41.903Z" } +sdist = { url = "https://files.pythonhosted.org/packages/4d/f5/f30da9c93f0fa6d619332b2f69597219b625f35780473a05164a9981fd9a/pybtex-0.26.1.tar.gz", hash = "sha256:2e5543bea424e60e9e42eef70bff597be48649d8f68ba061a7a092b2477d5464", size = 692991, upload-time = "2026-04-03T13:05:39.014Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/25/68/ceb5d6679baa326261f5d3e5113d9cfed6efef2810afd9f18bffb8ed312b/pybtex-0.25.1-py2.py3-none-any.whl", hash = "sha256:9053b0d619409a0a83f38abad5d9921de5f7b3ede00742beafcd9f10ad0d8c5c", size = 127437, upload-time = "2025-06-26T13:27:43.585Z" }, + { url = "https://files.pythonhosted.org/packages/44/f6/775eb92e865b28cdb4ad1f2bed7a5446197516f76b58a950faa3be3fd08d/pybtex-0.26.1-py3-none-any.whl", hash = "sha256:e26c0412cc54f5f21b2a6d9d175762a2d2af9ccf3a8f651cdb89ec035db77aa1", size = 126134, upload-time = "2026-04-03T13:05:40.623Z" }, ] [[package]] @@ -2349,7 +2335,7 @@ wheels = [ [[package]] name = "pydantic" -version = "2.12.5" +version = "2.13.3" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "annotated-types" }, @@ -2357,78 +2343,81 @@ dependencies = [ { name = "typing-extensions" }, { name = "typing-inspection" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/69/44/36f1a6e523abc58ae5f928898e4aca2e0ea509b5aa6f6f392a5d882be928/pydantic-2.12.5.tar.gz", hash = "sha256:4d351024c75c0f085a9febbb665ce8c0c6ec5d30e903bdb6394b7ede26aebb49", size = 821591, upload-time = "2025-11-26T15:11:46.471Z" } +sdist = { url = "https://files.pythonhosted.org/packages/d9/e4/40d09941a2cebcb20609b86a559817d5b9291c49dd6f8c87e5feffbe703a/pydantic-2.13.3.tar.gz", hash = "sha256:af09e9d1d09f4e7fe37145c1f577e1d61ceb9a41924bf0094a36506285d0a84d", size = 844068, upload-time = "2026-04-20T14:46:43.632Z" } wheels = [ - { url = 
"https://files.pythonhosted.org/packages/5a/87/b70ad306ebb6f9b585f114d0ac2137d792b48be34d732d60e597c2f8465a/pydantic-2.12.5-py3-none-any.whl", hash = "sha256:e561593fccf61e8a20fc46dfc2dfe075b8be7d0188df33f221ad1f0139180f9d", size = 463580, upload-time = "2025-11-26T15:11:44.605Z" }, + { url = "https://files.pythonhosted.org/packages/f3/0a/fd7d723f8f8153418fb40cf9c940e82004fce7e987026b08a68a36dd3fe7/pydantic-2.13.3-py3-none-any.whl", hash = "sha256:6db14ac8dfc9a1e57f87ea2c0de670c251240f43cb0c30a5130e9720dc612927", size = 471981, upload-time = "2026-04-20T14:46:41.402Z" }, ] [[package]] name = "pydantic-core" -version = "2.41.5" +version = "2.46.3" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "typing-extensions" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/71/70/23b021c950c2addd24ec408e9ab05d59b035b39d97cdc1130e1bce647bb6/pydantic_core-2.41.5.tar.gz", hash = "sha256:08daa51ea16ad373ffd5e7606252cc32f07bc72b28284b6bc9c6df804816476e", size = 460952, upload-time = "2025-11-04T13:43:49.098Z" } -wheels = [ - { url = "https://files.pythonhosted.org/packages/e8/72/74a989dd9f2084b3d9530b0915fdda64ac48831c30dbf7c72a41a5232db8/pydantic_core-2.41.5-cp311-cp311-macosx_10_12_x86_64.whl", hash = "sha256:a3a52f6156e73e7ccb0f8cced536adccb7042be67cb45f9562e12b319c119da6", size = 2105873, upload-time = "2025-11-04T13:39:31.373Z" }, - { url = "https://files.pythonhosted.org/packages/12/44/37e403fd9455708b3b942949e1d7febc02167662bf1a7da5b78ee1ea2842/pydantic_core-2.41.5-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:7f3bf998340c6d4b0c9a2f02d6a400e51f123b59565d74dc60d252ce888c260b", size = 1899826, upload-time = "2025-11-04T13:39:32.897Z" }, - { url = "https://files.pythonhosted.org/packages/33/7f/1d5cab3ccf44c1935a359d51a8a2a9e1a654b744b5e7f80d41b88d501eec/pydantic_core-2.41.5-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:378bec5c66998815d224c9ca994f1e14c0c21cb95d2f52b6021cc0b2a58f2a5a", size = 1917869, upload-time = "2025-11-04T13:39:34.469Z" }, - { url = "https://files.pythonhosted.org/packages/6e/6a/30d94a9674a7fe4f4744052ed6c5e083424510be1e93da5bc47569d11810/pydantic_core-2.41.5-cp311-cp311-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:e7b576130c69225432866fe2f4a469a85a54ade141d96fd396dffcf607b558f8", size = 2063890, upload-time = "2025-11-04T13:39:36.053Z" }, - { url = "https://files.pythonhosted.org/packages/50/be/76e5d46203fcb2750e542f32e6c371ffa9b8ad17364cf94bb0818dbfb50c/pydantic_core-2.41.5-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:6cb58b9c66f7e4179a2d5e0f849c48eff5c1fca560994d6eb6543abf955a149e", size = 2229740, upload-time = "2025-11-04T13:39:37.753Z" }, - { url = "https://files.pythonhosted.org/packages/d3/ee/fed784df0144793489f87db310a6bbf8118d7b630ed07aa180d6067e653a/pydantic_core-2.41.5-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:88942d3a3dff3afc8288c21e565e476fc278902ae4d6d134f1eeda118cc830b1", size = 2350021, upload-time = "2025-11-04T13:39:40.94Z" }, - { url = "https://files.pythonhosted.org/packages/c8/be/8fed28dd0a180dca19e72c233cbf58efa36df055e5b9d90d64fd1740b828/pydantic_core-2.41.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f31d95a179f8d64d90f6831d71fa93290893a33148d890ba15de25642c5d075b", size = 2066378, upload-time = "2025-11-04T13:39:42.523Z" }, - { url = 
"https://files.pythonhosted.org/packages/b0/3b/698cf8ae1d536a010e05121b4958b1257f0b5522085e335360e53a6b1c8b/pydantic_core-2.41.5-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:c1df3d34aced70add6f867a8cf413e299177e0c22660cc767218373d0779487b", size = 2175761, upload-time = "2025-11-04T13:39:44.553Z" }, - { url = "https://files.pythonhosted.org/packages/b8/ba/15d537423939553116dea94ce02f9c31be0fa9d0b806d427e0308ec17145/pydantic_core-2.41.5-cp311-cp311-musllinux_1_1_aarch64.whl", hash = "sha256:4009935984bd36bd2c774e13f9a09563ce8de4abaa7226f5108262fa3e637284", size = 2146303, upload-time = "2025-11-04T13:39:46.238Z" }, - { url = "https://files.pythonhosted.org/packages/58/7f/0de669bf37d206723795f9c90c82966726a2ab06c336deba4735b55af431/pydantic_core-2.41.5-cp311-cp311-musllinux_1_1_armv7l.whl", hash = "sha256:34a64bc3441dc1213096a20fe27e8e128bd3ff89921706e83c0b1ac971276594", size = 2340355, upload-time = "2025-11-04T13:39:48.002Z" }, - { url = "https://files.pythonhosted.org/packages/e5/de/e7482c435b83d7e3c3ee5ee4451f6e8973cff0eb6007d2872ce6383f6398/pydantic_core-2.41.5-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:c9e19dd6e28fdcaa5a1de679aec4141f691023916427ef9bae8584f9c2fb3b0e", size = 2319875, upload-time = "2025-11-04T13:39:49.705Z" }, - { url = "https://files.pythonhosted.org/packages/fe/e6/8c9e81bb6dd7560e33b9053351c29f30c8194b72f2d6932888581f503482/pydantic_core-2.41.5-cp311-cp311-win32.whl", hash = "sha256:2c010c6ded393148374c0f6f0bf89d206bf3217f201faa0635dcd56bd1520f6b", size = 1987549, upload-time = "2025-11-04T13:39:51.842Z" }, - { url = "https://files.pythonhosted.org/packages/11/66/f14d1d978ea94d1bc21fc98fcf570f9542fe55bfcc40269d4e1a21c19bf7/pydantic_core-2.41.5-cp311-cp311-win_amd64.whl", hash = "sha256:76ee27c6e9c7f16f47db7a94157112a2f3a00e958bc626e2f4ee8bec5c328fbe", size = 2011305, upload-time = "2025-11-04T13:39:53.485Z" }, - { url = "https://files.pythonhosted.org/packages/56/d8/0e271434e8efd03186c5386671328154ee349ff0354d83c74f5caaf096ed/pydantic_core-2.41.5-cp311-cp311-win_arm64.whl", hash = "sha256:4bc36bbc0b7584de96561184ad7f012478987882ebf9f9c389b23f432ea3d90f", size = 1972902, upload-time = "2025-11-04T13:39:56.488Z" }, - { url = "https://files.pythonhosted.org/packages/5f/5d/5f6c63eebb5afee93bcaae4ce9a898f3373ca23df3ccaef086d0233a35a7/pydantic_core-2.41.5-cp312-cp312-macosx_10_12_x86_64.whl", hash = "sha256:f41a7489d32336dbf2199c8c0a215390a751c5b014c2c1c5366e817202e9cdf7", size = 2110990, upload-time = "2025-11-04T13:39:58.079Z" }, - { url = "https://files.pythonhosted.org/packages/aa/32/9c2e8ccb57c01111e0fd091f236c7b371c1bccea0fa85247ac55b1e2b6b6/pydantic_core-2.41.5-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:070259a8818988b9a84a449a2a7337c7f430a22acc0859c6b110aa7212a6d9c0", size = 1896003, upload-time = "2025-11-04T13:39:59.956Z" }, - { url = "https://files.pythonhosted.org/packages/68/b8/a01b53cb0e59139fbc9e4fda3e9724ede8de279097179be4ff31f1abb65a/pydantic_core-2.41.5-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e96cea19e34778f8d59fe40775a7a574d95816eb150850a85a7a4c8f4b94ac69", size = 1919200, upload-time = "2025-11-04T13:40:02.241Z" }, - { url = "https://files.pythonhosted.org/packages/38/de/8c36b5198a29bdaade07b5985e80a233a5ac27137846f3bc2d3b40a47360/pydantic_core-2.41.5-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:ed2e99c456e3fadd05c991f8f437ef902e00eedf34320ba2b0842bd1c3ca3a75", size = 2052578, upload-time = "2025-11-04T13:40:04.401Z" }, - { url = 
"https://files.pythonhosted.org/packages/00/b5/0e8e4b5b081eac6cb3dbb7e60a65907549a1ce035a724368c330112adfdd/pydantic_core-2.41.5-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:65840751b72fbfd82c3c640cff9284545342a4f1eb1586ad0636955b261b0b05", size = 2208504, upload-time = "2025-11-04T13:40:06.072Z" }, - { url = "https://files.pythonhosted.org/packages/77/56/87a61aad59c7c5b9dc8caad5a41a5545cba3810c3e828708b3d7404f6cef/pydantic_core-2.41.5-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:e536c98a7626a98feb2d3eaf75944ef6f3dbee447e1f841eae16f2f0a72d8ddc", size = 2335816, upload-time = "2025-11-04T13:40:07.835Z" }, - { url = "https://files.pythonhosted.org/packages/0d/76/941cc9f73529988688a665a5c0ecff1112b3d95ab48f81db5f7606f522d3/pydantic_core-2.41.5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:eceb81a8d74f9267ef4081e246ffd6d129da5d87e37a77c9bde550cb04870c1c", size = 2075366, upload-time = "2025-11-04T13:40:09.804Z" }, - { url = "https://files.pythonhosted.org/packages/d3/43/ebef01f69baa07a482844faaa0a591bad1ef129253ffd0cdaa9d8a7f72d3/pydantic_core-2.41.5-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:d38548150c39b74aeeb0ce8ee1d8e82696f4a4e16ddc6de7b1d8823f7de4b9b5", size = 2171698, upload-time = "2025-11-04T13:40:12.004Z" }, - { url = "https://files.pythonhosted.org/packages/b1/87/41f3202e4193e3bacfc2c065fab7706ebe81af46a83d3e27605029c1f5a6/pydantic_core-2.41.5-cp312-cp312-musllinux_1_1_aarch64.whl", hash = "sha256:c23e27686783f60290e36827f9c626e63154b82b116d7fe9adba1fda36da706c", size = 2132603, upload-time = "2025-11-04T13:40:13.868Z" }, - { url = "https://files.pythonhosted.org/packages/49/7d/4c00df99cb12070b6bccdef4a195255e6020a550d572768d92cc54dba91a/pydantic_core-2.41.5-cp312-cp312-musllinux_1_1_armv7l.whl", hash = "sha256:482c982f814460eabe1d3bb0adfdc583387bd4691ef00b90575ca0d2b6fe2294", size = 2329591, upload-time = "2025-11-04T13:40:15.672Z" }, - { url = "https://files.pythonhosted.org/packages/cc/6a/ebf4b1d65d458f3cda6a7335d141305dfa19bdc61140a884d165a8a1bbc7/pydantic_core-2.41.5-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:bfea2a5f0b4d8d43adf9d7b8bf019fb46fdd10a2e5cde477fbcb9d1fa08c68e1", size = 2319068, upload-time = "2025-11-04T13:40:17.532Z" }, - { url = "https://files.pythonhosted.org/packages/49/3b/774f2b5cd4192d5ab75870ce4381fd89cf218af999515baf07e7206753f0/pydantic_core-2.41.5-cp312-cp312-win32.whl", hash = "sha256:b74557b16e390ec12dca509bce9264c3bbd128f8a2c376eaa68003d7f327276d", size = 1985908, upload-time = "2025-11-04T13:40:19.309Z" }, - { url = "https://files.pythonhosted.org/packages/86/45/00173a033c801cacf67c190fef088789394feaf88a98a7035b0e40d53dc9/pydantic_core-2.41.5-cp312-cp312-win_amd64.whl", hash = "sha256:1962293292865bca8e54702b08a4f26da73adc83dd1fcf26fbc875b35d81c815", size = 2020145, upload-time = "2025-11-04T13:40:21.548Z" }, - { url = "https://files.pythonhosted.org/packages/f9/22/91fbc821fa6d261b376a3f73809f907cec5ca6025642c463d3488aad22fb/pydantic_core-2.41.5-cp312-cp312-win_arm64.whl", hash = "sha256:1746d4a3d9a794cacae06a5eaaccb4b8643a131d45fbc9af23e353dc0a5ba5c3", size = 1976179, upload-time = "2025-11-04T13:40:23.393Z" }, - { url = "https://files.pythonhosted.org/packages/87/06/8806241ff1f70d9939f9af039c6c35f2360cf16e93c2ca76f184e76b1564/pydantic_core-2.41.5-cp313-cp313-macosx_10_12_x86_64.whl", hash = "sha256:941103c9be18ac8daf7b7adca8228f8ed6bb7a1849020f643b3a14d15b1924d9", size = 2120403, upload-time = "2025-11-04T13:40:25.248Z" }, - 
{ url = "https://files.pythonhosted.org/packages/94/02/abfa0e0bda67faa65fef1c84971c7e45928e108fe24333c81f3bfe35d5f5/pydantic_core-2.41.5-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:112e305c3314f40c93998e567879e887a3160bb8689ef3d2c04b6cc62c33ac34", size = 1896206, upload-time = "2025-11-04T13:40:27.099Z" }, - { url = "https://files.pythonhosted.org/packages/15/df/a4c740c0943e93e6500f9eb23f4ca7ec9bf71b19e608ae5b579678c8d02f/pydantic_core-2.41.5-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0cbaad15cb0c90aa221d43c00e77bb33c93e8d36e0bf74760cd00e732d10a6a0", size = 1919307, upload-time = "2025-11-04T13:40:29.806Z" }, - { url = "https://files.pythonhosted.org/packages/9a/e3/6324802931ae1d123528988e0e86587c2072ac2e5394b4bc2bc34b61ff6e/pydantic_core-2.41.5-cp313-cp313-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:03ca43e12fab6023fc79d28ca6b39b05f794ad08ec2feccc59a339b02f2b3d33", size = 2063258, upload-time = "2025-11-04T13:40:33.544Z" }, - { url = "https://files.pythonhosted.org/packages/c9/d4/2230d7151d4957dd79c3044ea26346c148c98fbf0ee6ebd41056f2d62ab5/pydantic_core-2.41.5-cp313-cp313-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:dc799088c08fa04e43144b164feb0c13f9a0bc40503f8df3e9fde58a3c0c101e", size = 2214917, upload-time = "2025-11-04T13:40:35.479Z" }, - { url = "https://files.pythonhosted.org/packages/e6/9f/eaac5df17a3672fef0081b6c1bb0b82b33ee89aa5cec0d7b05f52fd4a1fa/pydantic_core-2.41.5-cp313-cp313-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:97aeba56665b4c3235a0e52b2c2f5ae9cd071b8a8310ad27bddb3f7fb30e9aa2", size = 2332186, upload-time = "2025-11-04T13:40:37.436Z" }, - { url = "https://files.pythonhosted.org/packages/cf/4e/35a80cae583a37cf15604b44240e45c05e04e86f9cfd766623149297e971/pydantic_core-2.41.5-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:406bf18d345822d6c21366031003612b9c77b3e29ffdb0f612367352aab7d586", size = 2073164, upload-time = "2025-11-04T13:40:40.289Z" }, - { url = "https://files.pythonhosted.org/packages/bf/e3/f6e262673c6140dd3305d144d032f7bd5f7497d3871c1428521f19f9efa2/pydantic_core-2.41.5-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:b93590ae81f7010dbe380cdeab6f515902ebcbefe0b9327cc4804d74e93ae69d", size = 2179146, upload-time = "2025-11-04T13:40:42.809Z" }, - { url = "https://files.pythonhosted.org/packages/75/c7/20bd7fc05f0c6ea2056a4565c6f36f8968c0924f19b7d97bbfea55780e73/pydantic_core-2.41.5-cp313-cp313-musllinux_1_1_aarch64.whl", hash = "sha256:01a3d0ab748ee531f4ea6c3e48ad9dac84ddba4b0d82291f87248f2f9de8d740", size = 2137788, upload-time = "2025-11-04T13:40:44.752Z" }, - { url = "https://files.pythonhosted.org/packages/3a/8d/34318ef985c45196e004bc46c6eab2eda437e744c124ef0dbe1ff2c9d06b/pydantic_core-2.41.5-cp313-cp313-musllinux_1_1_armv7l.whl", hash = "sha256:6561e94ba9dacc9c61bce40e2d6bdc3bfaa0259d3ff36ace3b1e6901936d2e3e", size = 2340133, upload-time = "2025-11-04T13:40:46.66Z" }, - { url = "https://files.pythonhosted.org/packages/9c/59/013626bf8c78a5a5d9350d12e7697d3d4de951a75565496abd40ccd46bee/pydantic_core-2.41.5-cp313-cp313-musllinux_1_1_x86_64.whl", hash = "sha256:915c3d10f81bec3a74fbd4faebe8391013ba61e5a1a8d48c4455b923bdda7858", size = 2324852, upload-time = "2025-11-04T13:40:48.575Z" }, - { url = "https://files.pythonhosted.org/packages/1a/d9/c248c103856f807ef70c18a4f986693a46a8ffe1602e5d361485da502d20/pydantic_core-2.41.5-cp313-cp313-win32.whl", hash = 
"sha256:650ae77860b45cfa6e2cdafc42618ceafab3a2d9a3811fcfbd3bbf8ac3c40d36", size = 1994679, upload-time = "2025-11-04T13:40:50.619Z" }, - { url = "https://files.pythonhosted.org/packages/9e/8b/341991b158ddab181cff136acd2552c9f35bd30380422a639c0671e99a91/pydantic_core-2.41.5-cp313-cp313-win_amd64.whl", hash = "sha256:79ec52ec461e99e13791ec6508c722742ad745571f234ea6255bed38c6480f11", size = 2019766, upload-time = "2025-11-04T13:40:52.631Z" }, - { url = "https://files.pythonhosted.org/packages/73/7d/f2f9db34af103bea3e09735bb40b021788a5e834c81eedb541991badf8f5/pydantic_core-2.41.5-cp313-cp313-win_arm64.whl", hash = "sha256:3f84d5c1b4ab906093bdc1ff10484838aca54ef08de4afa9de0f5f14d69639cd", size = 1981005, upload-time = "2025-11-04T13:40:54.734Z" }, - { url = "https://files.pythonhosted.org/packages/11/72/90fda5ee3b97e51c494938a4a44c3a35a9c96c19bba12372fb9c634d6f57/pydantic_core-2.41.5-graalpy311-graalpy242_311_native-macosx_10_12_x86_64.whl", hash = "sha256:b96d5f26b05d03cc60f11a7761a5ded1741da411e7fe0909e27a5e6a0cb7b034", size = 2115441, upload-time = "2025-11-04T13:42:39.557Z" }, - { url = "https://files.pythonhosted.org/packages/1f/53/8942f884fa33f50794f119012dc6a1a02ac43a56407adaac20463df8e98f/pydantic_core-2.41.5-graalpy311-graalpy242_311_native-macosx_11_0_arm64.whl", hash = "sha256:634e8609e89ceecea15e2d61bc9ac3718caaaa71963717bf3c8f38bfde64242c", size = 1930291, upload-time = "2025-11-04T13:42:42.169Z" }, - { url = "https://files.pythonhosted.org/packages/79/c8/ecb9ed9cd942bce09fc888ee960b52654fbdbede4ba6c2d6e0d3b1d8b49c/pydantic_core-2.41.5-graalpy311-graalpy242_311_native-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:93e8740d7503eb008aa2df04d3b9735f845d43ae845e6dcd2be0b55a2da43cd2", size = 1948632, upload-time = "2025-11-04T13:42:44.564Z" }, - { url = "https://files.pythonhosted.org/packages/2e/1b/687711069de7efa6af934e74f601e2a4307365e8fdc404703afc453eab26/pydantic_core-2.41.5-graalpy311-graalpy242_311_native-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f15489ba13d61f670dcc96772e733aad1a6f9c429cc27574c6cdaed82d0146ad", size = 2138905, upload-time = "2025-11-04T13:42:47.156Z" }, - { url = "https://files.pythonhosted.org/packages/09/32/59b0c7e63e277fa7911c2fc70ccfb45ce4b98991e7ef37110663437005af/pydantic_core-2.41.5-graalpy312-graalpy250_312_native-macosx_10_12_x86_64.whl", hash = "sha256:7da7087d756b19037bc2c06edc6c170eeef3c3bafcb8f532ff17d64dc427adfd", size = 2110495, upload-time = "2025-11-04T13:42:49.689Z" }, - { url = "https://files.pythonhosted.org/packages/aa/81/05e400037eaf55ad400bcd318c05bb345b57e708887f07ddb2d20e3f0e98/pydantic_core-2.41.5-graalpy312-graalpy250_312_native-macosx_11_0_arm64.whl", hash = "sha256:aabf5777b5c8ca26f7824cb4a120a740c9588ed58df9b2d196ce92fba42ff8dc", size = 1915388, upload-time = "2025-11-04T13:42:52.215Z" }, - { url = "https://files.pythonhosted.org/packages/6e/0d/e3549b2399f71d56476b77dbf3cf8937cec5cd70536bdc0e374a421d0599/pydantic_core-2.41.5-graalpy312-graalpy250_312_native-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:c007fe8a43d43b3969e8469004e9845944f1a80e6acd47c150856bb87f230c56", size = 1942879, upload-time = "2025-11-04T13:42:56.483Z" }, - { url = "https://files.pythonhosted.org/packages/f7/07/34573da085946b6a313d7c42f82f16e8920bfd730665de2d11c0c37a74b5/pydantic_core-2.41.5-graalpy312-graalpy250_312_native-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:76d0819de158cd855d1cbb8fcafdf6f5cf1eb8e470abe056d5d161106e38062b", size = 2139017, upload-time = 
"2025-11-04T13:42:59.471Z" }, - { url = "https://files.pythonhosted.org/packages/5f/9b/1b3f0e9f9305839d7e84912f9e8bfbd191ed1b1ef48083609f0dabde978c/pydantic_core-2.41.5-pp311-pypy311_pp73-macosx_10_12_x86_64.whl", hash = "sha256:b2379fa7ed44ddecb5bfe4e48577d752db9fc10be00a6b7446e9663ba143de26", size = 2101980, upload-time = "2025-11-04T13:43:25.97Z" }, - { url = "https://files.pythonhosted.org/packages/a4/ed/d71fefcb4263df0da6a85b5d8a7508360f2f2e9b3bf5814be9c8bccdccc1/pydantic_core-2.41.5-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:266fb4cbf5e3cbd0b53669a6d1b039c45e3ce651fd5442eff4d07c2cc8d66808", size = 1923865, upload-time = "2025-11-04T13:43:28.763Z" }, - { url = "https://files.pythonhosted.org/packages/ce/3a/626b38db460d675f873e4444b4bb030453bbe7b4ba55df821d026a0493c4/pydantic_core-2.41.5-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:58133647260ea01e4d0500089a8c4f07bd7aa6ce109682b1426394988d8aaacc", size = 2134256, upload-time = "2025-11-04T13:43:31.71Z" }, - { url = "https://files.pythonhosted.org/packages/83/d9/8412d7f06f616bbc053d30cb4e5f76786af3221462ad5eee1f202021eb4e/pydantic_core-2.41.5-pp311-pypy311_pp73-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:287dad91cfb551c363dc62899a80e9e14da1f0e2b6ebde82c806612ca2a13ef1", size = 2174762, upload-time = "2025-11-04T13:43:34.744Z" }, - { url = "https://files.pythonhosted.org/packages/55/4c/162d906b8e3ba3a99354e20faa1b49a85206c47de97a639510a0e673f5da/pydantic_core-2.41.5-pp311-pypy311_pp73-musllinux_1_1_aarch64.whl", hash = "sha256:03b77d184b9eb40240ae9fd676ca364ce1085f203e1b1256f8ab9984dca80a84", size = 2143141, upload-time = "2025-11-04T13:43:37.701Z" }, - { url = "https://files.pythonhosted.org/packages/1f/f2/f11dd73284122713f5f89fc940f370d035fa8e1e078d446b3313955157fe/pydantic_core-2.41.5-pp311-pypy311_pp73-musllinux_1_1_armv7l.whl", hash = "sha256:a668ce24de96165bb239160b3d854943128f4334822900534f2fe947930e5770", size = 2330317, upload-time = "2025-11-04T13:43:40.406Z" }, - { url = "https://files.pythonhosted.org/packages/88/9d/b06ca6acfe4abb296110fb1273a4d848a0bfb2ff65f3ee92127b3244e16b/pydantic_core-2.41.5-pp311-pypy311_pp73-musllinux_1_1_x86_64.whl", hash = "sha256:f14f8f046c14563f8eb3f45f499cc658ab8d10072961e07225e507adb700e93f", size = 2316992, upload-time = "2025-11-04T13:43:43.602Z" }, - { url = "https://files.pythonhosted.org/packages/36/c7/cfc8e811f061c841d7990b0201912c3556bfeb99cdcb7ed24adc8d6f8704/pydantic_core-2.41.5-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:56121965f7a4dc965bff783d70b907ddf3d57f6eba29b6d2e5dabfaf07799c51", size = 2145302, upload-time = "2025-11-04T13:43:46.64Z" }, +sdist = { url = "https://files.pythonhosted.org/packages/2a/ef/f7abb56c49382a246fd2ce9c799691e3c3e7175ec74b14d99e798bcddb1a/pydantic_core-2.46.3.tar.gz", hash = "sha256:41c178f65b8c29807239d47e6050262eb6bf84eb695e41101e62e38df4a5bc2c", size = 471412, upload-time = "2026-04-20T14:40:56.672Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/22/a2/1ba90a83e85a3f94c796b184f3efde9c72f2830dcda493eea8d59ba78e6d/pydantic_core-2.46.3-cp311-cp311-macosx_10_12_x86_64.whl", hash = "sha256:ab124d49d0459b2373ecf54118a45c28a1e6d4192a533fbc915e70f556feb8e5", size = 2106740, upload-time = "2026-04-20T14:41:20.932Z" }, + { url = "https://files.pythonhosted.org/packages/b6/f6/99ae893c89a0b9d3daec9f95487aa676709aa83f67643b3f0abaf4ab628a/pydantic_core-2.46.3-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:cca67d52a5c7a16aed2b3999e719c4bcf644074eac304a5d3d62dd70ae7d4b2c", size = 
1948293, upload-time = "2026-04-20T14:43:42.115Z" }, + { url = "https://files.pythonhosted.org/packages/3e/b8/2e8e636dc9e3f16c2e16bf0849e24be82c5ee82c603c65fc0326666328fc/pydantic_core-2.46.3-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5c024e08c0ba23e6fd68c771a521e9d6a792f2ebb0fa734296b36394dc30390e", size = 1973222, upload-time = "2026-04-20T14:41:57.841Z" }, + { url = "https://files.pythonhosted.org/packages/34/36/0e730beec4d83c5306f417afbd82ff237d9a21e83c5edf675f31ed84c1fe/pydantic_core-2.46.3-cp311-cp311-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:6645ce7eec4928e29a1e3b3d5c946621d105d3e79f0c9cddf07c2a9770949287", size = 2053852, upload-time = "2026-04-20T14:40:43.077Z" }, + { url = "https://files.pythonhosted.org/packages/4b/f0/3071131f47e39136a17814576e0fada9168569f7f8c0e6ac4d1ede6a4958/pydantic_core-2.46.3-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:a712c7118e6c5ea96562f7b488435172abb94a3c53c22c9efc1412264a45cbbe", size = 2221134, upload-time = "2026-04-20T14:43:03.349Z" }, + { url = "https://files.pythonhosted.org/packages/2f/a9/a2dc023eec5aa4b02a467874bad32e2446957d2adcab14e107eab502e978/pydantic_core-2.46.3-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:69a868ef3ff206343579021c40faf3b1edc64b1cc508ff243a28b0a514ccb050", size = 2279785, upload-time = "2026-04-20T14:41:19.285Z" }, + { url = "https://files.pythonhosted.org/packages/0a/44/93f489d16fb63fbd41c670441536541f6e8cfa1e5a69f40bc9c5d30d8c90/pydantic_core-2.46.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:cc7e8c32db809aa0f6ea1d6869ebc8518a65d5150fdfad8bcae6a49ae32a22e2", size = 2089404, upload-time = "2026-04-20T14:43:10.108Z" }, + { url = "https://files.pythonhosted.org/packages/2a/78/8692e3aa72b2d004f7a5d937f1dfdc8552ba26caf0bec75f342c40f00dec/pydantic_core-2.46.3-cp311-cp311-manylinux_2_31_riscv64.whl", hash = "sha256:3481bd1341dc85779ee506bc8e1196a277ace359d89d28588a9468c3ecbe63fa", size = 2114898, upload-time = "2026-04-20T14:44:51.475Z" }, + { url = "https://files.pythonhosted.org/packages/6a/62/e83133f2e7832532060175cebf1f13748f4c7e7e7165cdd1f611f174494b/pydantic_core-2.46.3-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:8690eba565c6d68ffd3a8655525cbdd5246510b44a637ee2c6c03a7ebfe64d3c", size = 2157856, upload-time = "2026-04-20T14:43:46.64Z" }, + { url = "https://files.pythonhosted.org/packages/6d/ec/6a500e3ad7718ee50583fae79c8651f5d37e3abce1fa9ae177ae65842c53/pydantic_core-2.46.3-cp311-cp311-musllinux_1_1_aarch64.whl", hash = "sha256:4de88889d7e88d50d40ee5b39d5dac0bcaef9ba91f7e536ac064e6b2834ecccf", size = 2180168, upload-time = "2026-04-20T14:42:00.302Z" }, + { url = "https://files.pythonhosted.org/packages/d8/53/8267811054b1aa7fc1dc7ded93812372ef79a839f5e23558136a6afbfde1/pydantic_core-2.46.3-cp311-cp311-musllinux_1_1_armv7l.whl", hash = "sha256:e480080975c1ef7f780b8f99ed72337e7cc5efea2e518a20a692e8e7b278eb8b", size = 2322885, upload-time = "2026-04-20T14:41:05.253Z" }, + { url = "https://files.pythonhosted.org/packages/c8/c1/1c0acdb3aa0856ddc4ecc55214578f896f2de16f400cf51627eb3c26c1c4/pydantic_core-2.46.3-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:de3a5c376f8cd94da9a1b8fd3dd1c16c7a7b216ed31dc8ce9fd7a22bf13b836e", size = 2360328, upload-time = "2026-04-20T14:41:43.991Z" }, + { url = "https://files.pythonhosted.org/packages/f0/d0/ef39cd0f4a926814f360e71c1adeab48ad214d9727e4deb48eedfb5bce1a/pydantic_core-2.46.3-cp311-cp311-win32.whl", hash = 
"sha256:fc331a5314ffddd5385b9ee9d0d2fee0b13c27e0e02dad71b1ae5d6561f51eeb", size = 1979464, upload-time = "2026-04-20T14:43:12.215Z" }, + { url = "https://files.pythonhosted.org/packages/18/9c/f41951b0d858e343f1cf09398b2a7b3014013799744f2c4a8ad6a3eec4f2/pydantic_core-2.46.3-cp311-cp311-win_amd64.whl", hash = "sha256:b5b9c6cf08a8a5e502698f5e153056d12c34b8fb30317e0c5fd06f45162a6346", size = 2070837, upload-time = "2026-04-20T14:41:47.707Z" }, + { url = "https://files.pythonhosted.org/packages/9f/1e/264a17cd582f6ed50950d4d03dd5fefd84e570e238afe1cb3e25cf238769/pydantic_core-2.46.3-cp311-cp311-win_arm64.whl", hash = "sha256:5dfd51cf457482f04ec49491811a2b8fd5b843b64b11eecd2d7a1ee596ea78a6", size = 2053647, upload-time = "2026-04-20T14:42:27.535Z" }, + { url = "https://files.pythonhosted.org/packages/4b/cb/5b47425556ecc1f3fe18ed2a0083188aa46e1dd812b06e406475b3a5d536/pydantic_core-2.46.3-cp312-cp312-macosx_10_12_x86_64.whl", hash = "sha256:b11b59b3eee90a80a36701ddb4576d9ae31f93f05cb9e277ceaa09e6bf074a67", size = 2101946, upload-time = "2026-04-20T14:40:52.581Z" }, + { url = "https://files.pythonhosted.org/packages/a1/4f/2fb62c2267cae99b815bbf4a7b9283812c88ca3153ef29f7707200f1d4e5/pydantic_core-2.46.3-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:af8653713055ea18a3abc1537fe2ebc42f5b0bbb768d1eb79fd74eb47c0ac089", size = 1951612, upload-time = "2026-04-20T14:42:42.996Z" }, + { url = "https://files.pythonhosted.org/packages/50/6e/b7348fd30d6556d132cddd5bd79f37f96f2601fe0608afac4f5fb01ec0b3/pydantic_core-2.46.3-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:75a519dab6d63c514f3a81053e5266c549679e4aa88f6ec57f2b7b854aceb1b0", size = 1977027, upload-time = "2026-04-20T14:42:02.001Z" }, + { url = "https://files.pythonhosted.org/packages/82/11/31d60ee2b45540d3fb0b29302a393dbc01cd771c473f5b5147bcd353e593/pydantic_core-2.46.3-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:a6cd87cb1575b1ad05ba98894c5b5c96411ef678fa2f6ed2576607095b8d9789", size = 2063008, upload-time = "2026-04-20T14:44:17.952Z" }, + { url = "https://files.pythonhosted.org/packages/8a/db/3a9d1957181b59258f44a2300ab0f0be9d1e12d662a4f57bb31250455c52/pydantic_core-2.46.3-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:f80a55484b8d843c8ada81ebf70a682f3f00a3d40e378c06cf17ecb44d280d7d", size = 2233082, upload-time = "2026-04-20T14:40:57.934Z" }, + { url = "https://files.pythonhosted.org/packages/9c/e1/3277c38792aeb5cfb18c2f0c5785a221d9ff4e149abbe1184d53d5f72273/pydantic_core-2.46.3-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:3861f1731b90c50a3266316b9044f5c9b405eecb8e299b0a7120596334e4fe9c", size = 2304615, upload-time = "2026-04-20T14:42:12.584Z" }, + { url = "https://files.pythonhosted.org/packages/5e/d5/e3d9717c9eba10855325650afd2a9cba8e607321697f18953af9d562da2f/pydantic_core-2.46.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:fb528e295ed31570ac3dcc9bfdd6e0150bc11ce6168ac87a8082055cf1a67395", size = 2094380, upload-time = "2026-04-20T14:43:05.522Z" }, + { url = "https://files.pythonhosted.org/packages/a1/20/abac35dedcbfd66c6f0b03e4e3564511771d6c9b7ede10a362d03e110d9b/pydantic_core-2.46.3-cp312-cp312-manylinux_2_31_riscv64.whl", hash = "sha256:367508faa4973b992b271ba1494acaab36eb7e8739d1e47be5035fb1ea225396", size = 2135429, upload-time = "2026-04-20T14:41:55.549Z" }, + { url = 
"https://files.pythonhosted.org/packages/6c/a5/41bfd1df69afad71b5cf0535055bccc73022715ad362edbc124bc1e021d7/pydantic_core-2.46.3-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:5ad3c826fe523e4becf4fe39baa44286cff85ef137c729a2c5e269afbfd0905d", size = 2174582, upload-time = "2026-04-20T14:41:45.96Z" }, + { url = "https://files.pythonhosted.org/packages/79/65/38d86ea056b29b2b10734eb23329b7a7672ca604df4f2b6e9c02d4ee22fe/pydantic_core-2.46.3-cp312-cp312-musllinux_1_1_aarch64.whl", hash = "sha256:ec638c5d194ef8af27db69f16c954a09797c0dc25015ad6123eb2c73a4d271ca", size = 2187533, upload-time = "2026-04-20T14:40:55.367Z" }, + { url = "https://files.pythonhosted.org/packages/b6/55/a1129141678a2026badc539ad1dee0a71d06f54c2f06a4bd68c030ac781b/pydantic_core-2.46.3-cp312-cp312-musllinux_1_1_armv7l.whl", hash = "sha256:28ed528c45446062ee66edb1d33df5d88828ae167de76e773a3c7f64bd14e976", size = 2332985, upload-time = "2026-04-20T14:44:13.05Z" }, + { url = "https://files.pythonhosted.org/packages/d7/60/cb26f4077719f709e54819f4e8e1d43f4091f94e285eb6bd21e1190a7b7c/pydantic_core-2.46.3-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:aed19d0c783886d5bd86d80ae5030006b45e28464218747dcf83dabfdd092c7b", size = 2373670, upload-time = "2026-04-20T14:41:53.421Z" }, + { url = "https://files.pythonhosted.org/packages/6b/7e/c3f21882bdf1d8d086876f81b5e296206c69c6082551d776895de7801fa0/pydantic_core-2.46.3-cp312-cp312-win32.whl", hash = "sha256:06d5d8820cbbdb4147578c1fe7ffcd5b83f34508cb9f9ab76e807be7db6ff0a4", size = 1966722, upload-time = "2026-04-20T14:44:30.588Z" }, + { url = "https://files.pythonhosted.org/packages/57/be/6b5e757b859013ebfbd7adba02f23b428f37c86dcbf78b5bb0b4ffd36e99/pydantic_core-2.46.3-cp312-cp312-win_amd64.whl", hash = "sha256:c3212fda0ee959c1dd04c60b601ec31097aaa893573a3a1abd0a47bcac2968c1", size = 2072970, upload-time = "2026-04-20T14:42:54.248Z" }, + { url = "https://files.pythonhosted.org/packages/bf/f8/a989b21cc75e9a32d24192ef700eea606521221a89faa40c919ce884f2b1/pydantic_core-2.46.3-cp312-cp312-win_arm64.whl", hash = "sha256:f1f8338dd7a7f31761f1f1a3c47503a9a3b34eea3c8b01fa6ee96408affb5e72", size = 2035963, upload-time = "2026-04-20T14:44:20.4Z" }, + { url = "https://files.pythonhosted.org/packages/9b/3c/9b5e8eb9821936d065439c3b0fb1490ffa64163bfe7e1595985a47896073/pydantic_core-2.46.3-cp313-cp313-macosx_10_12_x86_64.whl", hash = "sha256:12bc98de041458b80c86c56b24df1d23832f3e166cbaff011f25d187f5c62c37", size = 2102109, upload-time = "2026-04-20T14:41:24.219Z" }, + { url = "https://files.pythonhosted.org/packages/91/97/1c41d1f5a19f241d8069f1e249853bcce378cdb76eec8ab636d7bc426280/pydantic_core-2.46.3-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:85348b8f89d2c3508b65b16c3c33a4da22b8215138d8b996912bb1532868885f", size = 1951820, upload-time = "2026-04-20T14:42:14.236Z" }, + { url = "https://files.pythonhosted.org/packages/30/b4/d03a7ae14571bc2b6b3c7b122441154720619afe9a336fa3a95434df5e2f/pydantic_core-2.46.3-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:1105677a6df914b1fb71a81b96c8cce7726857e1717d86001f29be06a25ee6f8", size = 1977785, upload-time = "2026-04-20T14:42:31.648Z" }, + { url = "https://files.pythonhosted.org/packages/ae/0c/4086f808834b59e3c8f1aa26df8f4b6d998cdcf354a143d18ef41529d1fe/pydantic_core-2.46.3-cp313-cp313-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:87082cd65669a33adeba5470769e9704c7cf026cc30afb9cc77fd865578ebaad", size = 2062761, upload-time = "2026-04-20T14:40:37.093Z" }, + { url = 
"https://files.pythonhosted.org/packages/fa/71/a649be5a5064c2df0db06e0a512c2281134ed2fcc981f52a657936a7527c/pydantic_core-2.46.3-cp313-cp313-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:60e5f66e12c4f5212d08522963380eaaeac5ebd795826cfd19b2dfb0c7a52b9c", size = 2232989, upload-time = "2026-04-20T14:42:59.254Z" }, + { url = "https://files.pythonhosted.org/packages/a2/84/7756e75763e810b3a710f4724441d1ecc5883b94aacb07ca71c5fb5cfb69/pydantic_core-2.46.3-cp313-cp313-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:b6cdf19bf84128d5e7c37e8a73a0c5c10d51103a650ac585d42dd6ae233f2b7f", size = 2303975, upload-time = "2026-04-20T14:41:32.287Z" }, + { url = "https://files.pythonhosted.org/packages/6c/35/68a762e0c1e31f35fa0dac733cbd9f5b118042853698de9509c8e5bf128b/pydantic_core-2.46.3-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:031bb17f4885a43773c8c763089499f242aee2ea85cf17154168775dccdecf35", size = 2095325, upload-time = "2026-04-20T14:42:47.685Z" }, + { url = "https://files.pythonhosted.org/packages/77/bf/1bf8c9a8e91836c926eae5e3e51dce009bf495a60ca56060689d3df3f340/pydantic_core-2.46.3-cp313-cp313-manylinux_2_31_riscv64.whl", hash = "sha256:bcf2a8b2982a6673693eae7348ef3d8cf3979c1d63b54fca7c397a635cc68687", size = 2133368, upload-time = "2026-04-20T14:41:22.766Z" }, + { url = "https://files.pythonhosted.org/packages/e5/50/87d818d6bab915984995157ceb2380f5aac4e563dddbed6b56f0ed057aba/pydantic_core-2.46.3-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:28e8cf2f52d72ced402a137145923a762cbb5081e48b34312f7a0c8f55928ec3", size = 2173908, upload-time = "2026-04-20T14:42:52.044Z" }, + { url = "https://files.pythonhosted.org/packages/91/88/a311fb306d0bd6185db41fa14ae888fb81d0baf648a761ae760d30819d33/pydantic_core-2.46.3-cp313-cp313-musllinux_1_1_aarch64.whl", hash = "sha256:17eaface65d9fc5abb940003020309c1bf7a211f5f608d7870297c367e6f9022", size = 2186422, upload-time = "2026-04-20T14:43:29.55Z" }, + { url = "https://files.pythonhosted.org/packages/8f/79/28fd0d81508525ab2054fef7c77a638c8b5b0afcbbaeee493cf7c3fef7e1/pydantic_core-2.46.3-cp313-cp313-musllinux_1_1_armv7l.whl", hash = "sha256:93fd339f23408a07e98950a89644f92c54d8729719a40b30c0a30bb9ebc55d23", size = 2332709, upload-time = "2026-04-20T14:42:16.134Z" }, + { url = "https://files.pythonhosted.org/packages/b3/21/795bf5fe5c0f379308b8ef19c50dedab2e7711dbc8d0c2acf08f1c7daa05/pydantic_core-2.46.3-cp313-cp313-musllinux_1_1_x86_64.whl", hash = "sha256:23cbdb3aaa74dfe0837975dbf69b469753bbde8eacace524519ffdb6b6e89eb7", size = 2372428, upload-time = "2026-04-20T14:41:10.974Z" }, + { url = "https://files.pythonhosted.org/packages/45/b3/ed14c659cbe7605e3ef063077680a64680aec81eb1a04763a05190d49b7f/pydantic_core-2.46.3-cp313-cp313-win32.whl", hash = "sha256:610eda2e3838f401105e6326ca304f5da1e15393ae25dacae5c5c63f2c275b13", size = 1965601, upload-time = "2026-04-20T14:41:42.128Z" }, + { url = "https://files.pythonhosted.org/packages/ef/bb/adb70d9a762ddd002d723fbf1bd492244d37da41e3af7b74ad212609027e/pydantic_core-2.46.3-cp313-cp313-win_amd64.whl", hash = "sha256:68cc7866ed863db34351294187f9b729964c371ba33e31c26f478471c52e1ed0", size = 2071517, upload-time = "2026-04-20T14:43:36.096Z" }, + { url = "https://files.pythonhosted.org/packages/52/eb/66faefabebfe68bd7788339c9c9127231e680b11906368c67ce112fdb47f/pydantic_core-2.46.3-cp313-cp313-win_arm64.whl", hash = "sha256:f64b5537ac62b231572879cd08ec05600308636a5d63bcbdb15063a466977bec", size = 2035802, upload-time = "2026-04-20T14:43:38.507Z" }, 
+ { url = "https://files.pythonhosted.org/packages/66/7f/03dbad45cd3aa9083fbc93c210ae8b005af67e4136a14186950a747c6874/pydantic_core-2.46.3-graalpy311-graalpy242_311_native-macosx_10_12_x86_64.whl", hash = "sha256:9715525891ed524a0a1eb6d053c74d4d4ad5017677fb00af0b7c2644a31bae46", size = 2105683, upload-time = "2026-04-20T14:42:19.779Z" }, + { url = "https://files.pythonhosted.org/packages/26/22/4dc186ac8ea6b257e9855031f51b62a9637beac4d68ac06bee02f046f836/pydantic_core-2.46.3-graalpy311-graalpy242_311_native-macosx_11_0_arm64.whl", hash = "sha256:9d2f400712a99a013aff420ef1eb9be077f8189a36c1e3ef87660b4e1088a874", size = 1940052, upload-time = "2026-04-20T14:43:59.274Z" }, + { url = "https://files.pythonhosted.org/packages/0d/ca/d376391a5aff1f2e8188960d7873543608130a870961c2b6b5236627c116/pydantic_core-2.46.3-graalpy311-graalpy242_311_native-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:bd2aab0e2e9dc2daf36bd2686c982535d5e7b1d930a1344a7bb6e82baab42a76", size = 1988172, upload-time = "2026-04-20T14:41:17.469Z" }, + { url = "https://files.pythonhosted.org/packages/0e/6b/523b9f85c23788755d6ab949329de692a2e3a584bc6beb67fef5e035aa9d/pydantic_core-2.46.3-graalpy311-graalpy242_311_native-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:4e9d76736da5f362fabfeea6a69b13b7f2be405c6d6966f06b2f6bfff7e64531", size = 2128596, upload-time = "2026-04-20T14:40:41.707Z" }, + { url = "https://files.pythonhosted.org/packages/34/42/f426db557e8ab2791bc7562052299944a118655496fbff99914e564c0a94/pydantic_core-2.46.3-graalpy312-graalpy250_312_native-macosx_10_12_x86_64.whl", hash = "sha256:b12dd51f1187c2eb489af8e20f880362db98e954b54ab792fa5d92e8bcc6b803", size = 2091877, upload-time = "2026-04-20T14:43:27.091Z" }, + { url = "https://files.pythonhosted.org/packages/5c/4f/86a832a9d14df58e663bfdf4627dc00d3317c2bd583c4fb23390b0f04b8e/pydantic_core-2.46.3-graalpy312-graalpy250_312_native-macosx_11_0_arm64.whl", hash = "sha256:f00a0961b125f1a47af7bcc17f00782e12f4cd056f83416006b30111d941dfa3", size = 1932428, upload-time = "2026-04-20T14:40:45.781Z" }, + { url = "https://files.pythonhosted.org/packages/11/1a/fe857968954d93fb78e0d4b6df5c988c74c4aaa67181c60be7cfe327c0ca/pydantic_core-2.46.3-graalpy312-graalpy250_312_native-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:57697d7c056aca4bbb680200f96563e841a6386ac1129370a0102592f4dddff5", size = 1997550, upload-time = "2026-04-20T14:44:02.425Z" }, + { url = "https://files.pythonhosted.org/packages/17/eb/9d89ad2d9b0ba8cd65393d434471621b98912abb10fbe1df08e480ba57b5/pydantic_core-2.46.3-graalpy312-graalpy250_312_native-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:fd35aa21299def8db7ef4fe5c4ff862941a9a158ca7b63d61e66fe67d30416b4", size = 2137657, upload-time = "2026-04-20T14:42:45.149Z" }, + { url = "https://files.pythonhosted.org/packages/1f/da/99d40830684f81dec901cac521b5b91c095394cc1084b9433393cde1c2df/pydantic_core-2.46.3-pp311-pypy311_pp73-macosx_10_12_x86_64.whl", hash = "sha256:13afdd885f3d71280cf286b13b310ee0f7ccfefd1dbbb661514a474b726e2f25", size = 2107973, upload-time = "2026-04-20T14:42:06.175Z" }, + { url = "https://files.pythonhosted.org/packages/99/a5/87024121818d75bbb2a98ddbaf638e40e7a18b5e0f5492c9ca4b1b316107/pydantic_core-2.46.3-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:f91c0aff3e3ee0928edd1232c57f643a7a003e6edf1860bc3afcdc749cb513f3", size = 1947191, upload-time = "2026-04-20T14:43:14.319Z" }, + { url = 
"https://files.pythonhosted.org/packages/60/62/0c1acfe10945b83a6a59d19fbaa92f48825381509e5701b855c08f13db76/pydantic_core-2.46.3-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:6529d1d128321a58d30afcc97b49e98836542f68dd41b33c2e972bb9e5290536", size = 2123791, upload-time = "2026-04-20T14:43:22.766Z" }, + { url = "https://files.pythonhosted.org/packages/75/3e/3b2393b4c8f44285561dc30b00cf307a56a2eff7c483a824db3b8221ca51/pydantic_core-2.46.3-pp311-pypy311_pp73-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:975c267cff4f7e7272eacbe50f6cc03ca9a3da4c4fbd66fffd89c94c1e311aa1", size = 2153197, upload-time = "2026-04-20T14:44:27.932Z" }, + { url = "https://files.pythonhosted.org/packages/ba/75/5af02fb35505051eee727c061f2881c555ab4f8ddb2d42da715a42c9731b/pydantic_core-2.46.3-pp311-pypy311_pp73-musllinux_1_1_aarch64.whl", hash = "sha256:2b8e4f2bbdf71415c544b4b1138b8060db7b6611bc927e8064c769f64bed651c", size = 2181073, upload-time = "2026-04-20T14:43:20.729Z" }, + { url = "https://files.pythonhosted.org/packages/10/92/7e0e1bd9ca3c68305db037560ca2876f89b2647deb2f8b6319005de37505/pydantic_core-2.46.3-pp311-pypy311_pp73-musllinux_1_1_armv7l.whl", hash = "sha256:e61ea8e9fff9606d09178f577ff8ccdd7206ff73d6552bcec18e1033c4254b85", size = 2315886, upload-time = "2026-04-20T14:44:04.826Z" }, + { url = "https://files.pythonhosted.org/packages/b8/d8/101655f27eaf3e44558ead736b2795d12500598beed4683f279396fa186e/pydantic_core-2.46.3-pp311-pypy311_pp73-musllinux_1_1_x86_64.whl", hash = "sha256:b504bda01bafc69b6d3c7a0c7f039dcf60f47fab70e06fe23f57b5c75bdc82b8", size = 2360528, upload-time = "2026-04-20T14:40:47.431Z" }, + { url = "https://files.pythonhosted.org/packages/07/0f/1c34a74c8d07136f0d729ffe5e1fdab04fbdaa7684f61a92f92511a84a15/pydantic_core-2.46.3-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:b00b76f7142fc60c762ce579bd29c8fa44aaa56592dd3c54fab3928d0d4ca6ff", size = 2184144, upload-time = "2026-04-20T14:42:57Z" }, ] [[package]] @@ -2440,6 +2429,19 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/f4/7e/a72dd26f3b0f4f2bf1dd8923c85f7ceb43172af56d63c7383eb62b332364/pygments-2.20.0-py3-none-any.whl", hash = "sha256:81a9e26dd42fd28a23a2d169d86d7ac03b46e2f8b59ed4698fb4785f946d0176", size = 1231151, upload-time = "2026-03-29T13:29:30.038Z" }, ] +[[package]] +name = "pymdown-extensions" +version = "10.21.2" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "markdown" }, + { name = "pyyaml" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/df/08/f1c908c581fd11913da4711ea7ba32c0eee40b0190000996bb863b0c9349/pymdown_extensions-10.21.2.tar.gz", hash = "sha256:c3f55a5b8a1d0edf6699e35dcbea71d978d34ff3fa79f3d807b8a5b3fa90fbdc", size = 853922, upload-time = "2026-03-29T15:01:55.233Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/f7/27/a2fc51a4a122dfd1015e921ae9d22fee3d20b0b8080d9a704578bf9deece/pymdown_extensions-10.21.2-py3-none-any.whl", hash = "sha256:5c0fd2a2bea14eb39af8ff284f1066d898ab2187d81b889b75d46d4348c01638", size = 268901, upload-time = "2026-03-29T15:01:53.244Z" }, +] + [[package]] name = "pyparsing" version = "3.3.2" @@ -2462,6 +2464,7 @@ dependencies = [ { name = "scikit-learn" }, { name = "tqdm" }, { name = "zarr" }, + { name = "zensical" }, ] [package.dev-dependencies] @@ -2495,6 +2498,7 @@ requires-dist = [ { name = "scikit-learn", specifier = ">=1.4,<2" }, { name = "tqdm", specifier = ">=4.60,<5" }, { name = "zarr", specifier = ">=3.1,<4" }, + { name = "zensical", specifier = ">=0.0.38" 
}, ] [package.metadata.requires-dev] @@ -2518,7 +2522,7 @@ dev = [ [[package]] name = "pytest" -version = "9.0.2" +version = "9.0.3" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "colorama", marker = "sys_platform == 'win32'" }, @@ -2527,9 +2531,9 @@ dependencies = [ { name = "pluggy" }, { name = "pygments" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/d1/db/7ef3487e0fb0049ddb5ce41d3a49c235bf9ad299b6a25d5780a89f19230f/pytest-9.0.2.tar.gz", hash = "sha256:75186651a92bd89611d1d9fc20f0b4345fd827c41ccd5c299a868a05d70edf11", size = 1568901, upload-time = "2025-12-06T21:30:51.014Z" } +sdist = { url = "https://files.pythonhosted.org/packages/7d/0d/549bd94f1a0a402dc8cf64563a117c0f3765662e2e668477624baeec44d5/pytest-9.0.3.tar.gz", hash = "sha256:b86ada508af81d19edeb213c681b1d48246c1a91d304c6c81a427674c17eb91c", size = 1572165, upload-time = "2026-04-07T17:16:18.027Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/3b/ab/b3226f0bd7cdcf710fbede2b3548584366da3b19b5021e74f5bde2a8fa3f/pytest-9.0.2-py3-none-any.whl", hash = "sha256:711ffd45bf766d5264d487b917733b453d917afd2b0ad65223959f59089f875b", size = 374801, upload-time = "2025-12-06T21:30:49.154Z" }, + { url = "https://files.pythonhosted.org/packages/d4/24/a372aaf5c9b7208e7112038812994107bc65a84cd00e0354a88c2c77a617/pytest-9.0.3-py3-none-any.whl", hash = "sha256:2c5efc453d45394fdd706ade797c0a81091eccd1d6e4bccfcd476e2b8e0ab5d9", size = 375249, upload-time = "2026-04-07T17:16:16.13Z" }, ] [[package]] @@ -2560,15 +2564,15 @@ wheels = [ [[package]] name = "python-discovery" -version = "1.2.1" +version = "1.2.2" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "filelock" }, { name = "platformdirs" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/b9/88/815e53084c5079a59df912825a279f41dd2e0df82281770eadc732f5352c/python_discovery-1.2.1.tar.gz", hash = "sha256:180c4d114bff1c32462537eac5d6a332b768242b76b69c0259c7d14b1b680c9e", size = 58457, upload-time = "2026-03-26T22:30:44.496Z" } +sdist = { url = "https://files.pythonhosted.org/packages/de/ef/3bae0e537cfe91e8431efcba4434463d2c5a65f5a89edd47c6cf2f03c55f/python_discovery-1.2.2.tar.gz", hash = "sha256:876e9c57139eb757cb5878cbdd9ae5379e5d96266c99ef731119e04fffe533bb", size = 58872, upload-time = "2026-04-07T17:28:49.249Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/67/0f/019d3949a40280f6193b62bc010177d4ce702d0fce424322286488569cd3/python_discovery-1.2.1-py3-none-any.whl", hash = "sha256:b6a957b24c1cd79252484d3566d1b49527581d46e789aaf43181005e56201502", size = 31674, upload-time = "2026-03-26T22:30:43.396Z" }, + { url = "https://files.pythonhosted.org/packages/d8/db/795879cc3ddfe338599bddea6388cc5100b088db0a4caf6e6c1af1c27e04/python_discovery-1.2.2-py3-none-any.whl", hash = "sha256:e1ae95d9af875e78f15e19aed0c6137ab1bb49c200f21f5061786490c9585c7a", size = 31894, upload-time = "2026-04-07T17:28:48.09Z" }, ] [[package]] @@ -2703,15 +2707,15 @@ wheels = [ [[package]] name = "rich" -version = "14.3.3" +version = "15.0.0" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "markdown-it-py" }, { name = "pygments" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/b3/c6/f3b320c27991c46f43ee9d856302c70dc2d0fb2dba4842ff739d5f46b393/rich-14.3.3.tar.gz", hash = "sha256:b8daa0b9e4eef54dd8cf7c86c03713f53241884e814f4e2f5fb342fe520f639b", size = 230582, upload-time = "2026-02-19T17:23:12.474Z" } +sdist = { url = 
"https://files.pythonhosted.org/packages/c0/8f/0722ca900cc807c13a6a0c696dacf35430f72e0ec571c4275d2371fca3e9/rich-15.0.0.tar.gz", hash = "sha256:edd07a4824c6b40189fb7ac9bc4c52536e9780fbbfbddf6f1e2502c31b068c36", size = 230680, upload-time = "2026-04-12T08:24:00.75Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/14/25/b208c5683343959b670dc001595f2f3737e051da617f66c31f7c4fa93abc/rich-14.3.3-py3-none-any.whl", hash = "sha256:793431c1f8619afa7d3b52b2cdec859562b950ea0d4b6b505397612db8d5362d", size = 310458, upload-time = "2026-02-19T17:23:13.732Z" }, + { url = "https://files.pythonhosted.org/packages/82/3b/64d4899d73f91ba49a8c18a8ff3f0ea8f1c1d75481760df8c68ef5235bf5/rich-15.0.0-py3-none-any.whl", hash = "sha256:33bd4ef74232fb73fe9279a257718407f169c09b78a87ad3d296f548e27de0bb", size = 310654, upload-time = "2026-04-12T08:24:02.83Z" }, ] [[package]] @@ -2804,27 +2808,27 @@ wheels = [ [[package]] name = "ruff" -version = "0.15.8" -source = { registry = "https://pypi.org/simple" } -sdist = { url = "https://files.pythonhosted.org/packages/14/b0/73cf7550861e2b4824950b8b52eebdcc5adc792a00c514406556c5b80817/ruff-0.15.8.tar.gz", hash = "sha256:995f11f63597ee362130d1d5a327a87cb6f3f5eae3094c620bcc632329a4d26e", size = 4610921, upload-time = "2026-03-26T18:39:38.675Z" } -wheels = [ - { url = "https://files.pythonhosted.org/packages/4a/92/c445b0cd6da6e7ae51e954939cb69f97e008dbe750cfca89b8cedc081be7/ruff-0.15.8-py3-none-linux_armv6l.whl", hash = "sha256:cbe05adeba76d58162762d6b239c9056f1a15a55bd4b346cfd21e26cd6ad7bc7", size = 10527394, upload-time = "2026-03-26T18:39:41.566Z" }, - { url = "https://files.pythonhosted.org/packages/eb/92/f1c662784d149ad1414cae450b082cf736430c12ca78367f20f5ed569d65/ruff-0.15.8-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:d3e3d0b6ba8dca1b7ef9ab80a28e840a20070c4b62e56d675c24f366ef330570", size = 10905693, upload-time = "2026-03-26T18:39:30.364Z" }, - { url = "https://files.pythonhosted.org/packages/ca/f2/7a631a8af6d88bcef997eb1bf87cc3da158294c57044aafd3e17030613de/ruff-0.15.8-py3-none-macosx_11_0_arm64.whl", hash = "sha256:6ee3ae5c65a42f273f126686353f2e08ff29927b7b7e203b711514370d500de3", size = 10323044, upload-time = "2026-03-26T18:39:33.37Z" }, - { url = "https://files.pythonhosted.org/packages/67/18/1bf38e20914a05e72ef3b9569b1d5c70a7ef26cd188d69e9ca8ef588d5bf/ruff-0.15.8-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:fdce027ada77baa448077ccc6ebb2fa9c3c62fd110d8659d601cf2f475858d94", size = 10629135, upload-time = "2026-03-26T18:39:44.142Z" }, - { url = "https://files.pythonhosted.org/packages/d2/e9/138c150ff9af60556121623d41aba18b7b57d95ac032e177b6a53789d279/ruff-0.15.8-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:12e617fc01a95e5821648a6df341d80456bd627bfab8a829f7cfc26a14a4b4a3", size = 10348041, upload-time = "2026-03-26T18:39:52.178Z" }, - { url = "https://files.pythonhosted.org/packages/02/f1/5bfb9298d9c323f842c5ddeb85f1f10ef51516ac7a34ba446c9347d898df/ruff-0.15.8-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:432701303b26416d22ba696c39f2c6f12499b89093b61360abc34bcc9bf07762", size = 11121987, upload-time = "2026-03-26T18:39:55.195Z" }, - { url = "https://files.pythonhosted.org/packages/10/11/6da2e538704e753c04e8d86b1fc55712fdbdcc266af1a1ece7a51fff0d10/ruff-0.15.8-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:d910ae974b7a06a33a057cb87d2a10792a3b2b3b35e33d2699fdf63ec8f6b17a", size = 11951057, upload-time = "2026-03-26T18:39:19.18Z" }, - { url = 
"https://files.pythonhosted.org/packages/83/f0/c9208c5fd5101bf87002fed774ff25a96eea313d305f1e5d5744698dc314/ruff-0.15.8-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:2033f963c43949d51e6fdccd3946633c6b37c484f5f98c3035f49c27395a8ab8", size = 11464613, upload-time = "2026-03-26T18:40:06.301Z" }, - { url = "https://files.pythonhosted.org/packages/f8/22/d7f2fabdba4fae9f3b570e5605d5eb4500dcb7b770d3217dca4428484b17/ruff-0.15.8-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:0f29b989a55572fb885b77464cf24af05500806ab4edf9a0fd8977f9759d85b1", size = 11257557, upload-time = "2026-03-26T18:39:57.972Z" }, - { url = "https://files.pythonhosted.org/packages/71/8c/382a9620038cf6906446b23ce8632ab8c0811b8f9d3e764f58bedd0c9a6f/ruff-0.15.8-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:ac51d486bf457cdc985a412fb1801b2dfd1bd8838372fc55de64b1510eff4bec", size = 11169440, upload-time = "2026-03-26T18:39:22.205Z" }, - { url = "https://files.pythonhosted.org/packages/4d/0d/0994c802a7eaaf99380085e4e40c845f8e32a562e20a38ec06174b52ef24/ruff-0.15.8-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:c9861eb959edab053c10ad62c278835ee69ca527b6dcd72b47d5c1e5648964f6", size = 10605963, upload-time = "2026-03-26T18:39:46.682Z" }, - { url = "https://files.pythonhosted.org/packages/19/aa/d624b86f5b0aad7cef6bbf9cd47a6a02dfdc4f72c92a337d724e39c9d14b/ruff-0.15.8-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:8d9a5b8ea13f26ae90838afc33f91b547e61b794865374f114f349e9036835fb", size = 10357484, upload-time = "2026-03-26T18:39:49.176Z" }, - { url = "https://files.pythonhosted.org/packages/35/c3/e0b7835d23001f7d999f3895c6b569927c4d39912286897f625736e1fd04/ruff-0.15.8-py3-none-musllinux_1_2_i686.whl", hash = "sha256:c2a33a529fb3cbc23a7124b5c6ff121e4d6228029cba374777bd7649cc8598b8", size = 10830426, upload-time = "2026-03-26T18:40:03.702Z" }, - { url = "https://files.pythonhosted.org/packages/f0/51/ab20b322f637b369383adc341d761eaaa0f0203d6b9a7421cd6e783d81b9/ruff-0.15.8-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:75e5cd06b1cf3f47a3996cfc999226b19aa92e7cce682dcd62f80d7035f98f49", size = 11345125, upload-time = "2026-03-26T18:39:27.799Z" }, - { url = "https://files.pythonhosted.org/packages/37/e6/90b2b33419f59d0f2c4c8a48a4b74b460709a557e8e0064cf33ad894f983/ruff-0.15.8-py3-none-win32.whl", hash = "sha256:bc1f0a51254ba21767bfa9a8b5013ca8149dcf38092e6a9eb704d876de94dc34", size = 10571959, upload-time = "2026-03-26T18:39:36.117Z" }, - { url = "https://files.pythonhosted.org/packages/1f/a2/ef467cb77099062317154c63f234b8a7baf7cb690b99af760c5b68b9ee7f/ruff-0.15.8-py3-none-win_amd64.whl", hash = "sha256:04f79eff02a72db209d47d665ba7ebcad609d8918a134f86cb13dd132159fc89", size = 11743893, upload-time = "2026-03-26T18:39:25.01Z" }, - { url = "https://files.pythonhosted.org/packages/15/e2/77be4fff062fa78d9b2a4dea85d14785dac5f1d0c1fb58ed52331f0ebe28/ruff-0.15.8-py3-none-win_arm64.whl", hash = "sha256:cf891fa8e3bb430c0e7fac93851a5978fc99c8fa2c053b57b118972866f8e5f2", size = 11048175, upload-time = "2026-03-26T18:40:01.06Z" }, +version = "0.15.12" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/99/43/3291f1cc9106f4c63bdce7a8d0df5047fe8422a75b091c16b5e9355e0b11/ruff-0.15.12.tar.gz", hash = "sha256:ecea26adb26b4232c0c2ca19ccbc0083a68344180bba2a600605538ce51a40a6", size = 4643852, upload-time = "2026-04-24T18:17:14.305Z" } +wheels = [ + { url = 
"https://files.pythonhosted.org/packages/c3/6e/e78ffb61d4686f3d96ba3df2c801161843746dcbcbb17a1e927d4829312b/ruff-0.15.12-py3-none-linux_armv6l.whl", hash = "sha256:f86f176e188e94d6bdbc09f09bfd9dc729059ad93d0e7390b5a73efe19f8861c", size = 10640713, upload-time = "2026-04-24T18:17:22.841Z" }, + { url = "https://files.pythonhosted.org/packages/ae/08/a317bc231fb9e7b93e4ef3089501e51922ff88d6936ce5cf870c4fe55419/ruff-0.15.12-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:e3bcd123364c3770b8e1b7baaf343cc99a35f197c5c6e8af79015c666c423a6c", size = 11069267, upload-time = "2026-04-24T18:17:30.105Z" }, + { url = "https://files.pythonhosted.org/packages/aa/a4/f828e9718d3dce1f5f11c39c4f65afd32783c8b2aebb2e3d259e492c47bd/ruff-0.15.12-py3-none-macosx_11_0_arm64.whl", hash = "sha256:fe87510d000220aa1ed530d4448a7c696a0cae1213e5ec30e5874287b66557b5", size = 10397182, upload-time = "2026-04-24T18:17:07.177Z" }, + { url = "https://files.pythonhosted.org/packages/71/e0/3310fc6d1b5e1fdea22bf3b1b807c7e187b581021b0d7d4514cccdb5fb71/ruff-0.15.12-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:84a1630093121375a3e2a95b4a6dc7b59e2b4ee76216e32d81aae550a832d002", size = 10758012, upload-time = "2026-04-24T18:16:55.759Z" }, + { url = "https://files.pythonhosted.org/packages/11/c1/a606911aee04c324ddaa883ae418f3569792fd3c4a10c50e0dd0a2311e1e/ruff-0.15.12-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:fb129f40f114f089ebe0ca56c0d251cf2061b17651d464bb6478dc01e69f11f5", size = 10447479, upload-time = "2026-04-24T18:16:51.677Z" }, + { url = "https://files.pythonhosted.org/packages/9d/68/4201e8444f0894f21ab4aeeaee68aa4f10b51613514a20d80bd628d57e88/ruff-0.15.12-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:b0c862b172d695db7598426b8af465e7e9ac00a3ea2a3630ee67eb82e366aaa6", size = 11234040, upload-time = "2026-04-24T18:17:16.529Z" }, + { url = "https://files.pythonhosted.org/packages/34/ff/8a6d6cf4ccc23fd67060874e832c18919d1557a0611ebef03fdb01fff11e/ruff-0.15.12-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:2849ea9f3484c3aca43a82f484210370319e7170df4dfe4843395ddf6c57bc33", size = 12087377, upload-time = "2026-04-24T18:17:04.944Z" }, + { url = "https://files.pythonhosted.org/packages/85/f6/c669cf73f5152f623d34e69866a46d5e6185816b19fcd5b6dd8a2d299922/ruff-0.15.12-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:9e77c7e51c07fe396826d5969a5b846d9cd4c402535835fb6e21ce8b28fef847", size = 11367784, upload-time = "2026-04-24T18:17:25.409Z" }, + { url = "https://files.pythonhosted.org/packages/e8/39/c61d193b8a1daaa8977f7dea9e8d8ba866e02ea7b65d32f6861693aa4c12/ruff-0.15.12-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:83b2f4f2f3b1026b5fb449b467d9264bf22067b600f7b6f41fc5958909f449d0", size = 11344088, upload-time = "2026-04-24T18:17:12.258Z" }, + { url = "https://files.pythonhosted.org/packages/c2/8d/49afab3645e31e12c590acb6d3b5b69d7aab5b81926dbaf7461f9441f37a/ruff-0.15.12-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:9ba3b8f1afd7e2e43d8943e55f249e13f9682fde09711644a6e7290eb4f3e339", size = 11271770, upload-time = "2026-04-24T18:17:02.457Z" }, + { url = "https://files.pythonhosted.org/packages/46/06/33f41fe94403e2b755481cdfb9b7ef3e4e0ed031c4581124658d935d52b4/ruff-0.15.12-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:e852ba9fdc890655e1d78f2df1499efbe0e54126bd405362154a75e2bde159c5", size = 10719355, upload-time = "2026-04-24T18:17:27.648Z" }, + { url = 
"https://files.pythonhosted.org/packages/0d/59/18aa4e014debbf559670e4048e39260a85c7fcee84acfd761ac01e7b8d35/ruff-0.15.12-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:dd8aed930da53780d22fc70bdf84452c843cf64f8cb4eb38984319c24c5cd5fd", size = 10462758, upload-time = "2026-04-24T18:17:32.347Z" }, + { url = "https://files.pythonhosted.org/packages/25/e7/cc9f16fd0f3b5fddcbd7ec3d6ae30c8f3fde1047f32a4093a98d633c6570/ruff-0.15.12-py3-none-musllinux_1_2_i686.whl", hash = "sha256:01da3988d225628b709493d7dc67c3b9b12c0210016b08690ef9bd27970b262b", size = 10953498, upload-time = "2026-04-24T18:17:20.674Z" }, + { url = "https://files.pythonhosted.org/packages/72/7a/a9ba7f98c7a575978698f4230c5e8cc54bbc761af34f560818f933dafa0c/ruff-0.15.12-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:9cae0f92bd5700d1213188b31cd3bdd2b315361296d10b96b8e2337d3d11f53e", size = 11447765, upload-time = "2026-04-24T18:17:09.755Z" }, + { url = "https://files.pythonhosted.org/packages/ea/f9/0ae446942c846b8266059ad8a30702a35afae55f5cdc54c5adf8d7afdc27/ruff-0.15.12-py3-none-win32.whl", hash = "sha256:d0185894e038d7043ba8fd6aee7499ece6462dc0ea9f1e260c7451807c714c20", size = 10657277, upload-time = "2026-04-24T18:17:18.591Z" }, + { url = "https://files.pythonhosted.org/packages/33/f1/9614e03e1cdcbf9437570b5400ced8a720b5db22b28d8e0f1bda429f660d/ruff-0.15.12-py3-none-win_amd64.whl", hash = "sha256:c87a162d61ab3adca47c03f7f717c68672edec7d1b5499e652331780fe74950d", size = 11837758, upload-time = "2026-04-24T18:17:00.113Z" }, + { url = "https://files.pythonhosted.org/packages/c0/98/6beb4b351e472e5f4c4613f7c35a5290b8be2497e183825310c4c3a3984b/ruff-0.15.12-py3-none-win_arm64.whl", hash = "sha256:a538f7a82d061cee7be55542aca1d86d1393d55d81d4fcc314370f4340930d4f", size = 11120821, upload-time = "2026-04-24T18:16:57.979Z" }, ] [[package]] @@ -3170,42 +3174,42 @@ wheels = [ [[package]] name = "sqlalchemy" -version = "2.0.48" +version = "2.0.49" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "greenlet", marker = "platform_machine == 'AMD64' or platform_machine == 'WIN32' or platform_machine == 'aarch64' or platform_machine == 'amd64' or platform_machine == 'ppc64le' or platform_machine == 'win32' or platform_machine == 'x86_64'" }, { name = "typing-extensions" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/1f/73/b4a9737255583b5fa858e0bb8e116eb94b88c910164ed2ed719147bde3de/sqlalchemy-2.0.48.tar.gz", hash = "sha256:5ca74f37f3369b45e1f6b7b06afb182af1fd5dde009e4ffd831830d98cbe5fe7", size = 9886075, upload-time = "2026-03-02T15:28:51.474Z" } -wheels = [ - { url = "https://files.pythonhosted.org/packages/d7/6d/b8b78b5b80f3c3ab3f7fa90faa195ec3401f6d884b60221260fd4d51864c/sqlalchemy-2.0.48-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:1b4c575df7368b3b13e0cebf01d4679f9a28ed2ae6c1cd0b1d5beffb6b2007dc", size = 2157184, upload-time = "2026-03-02T15:38:28.161Z" }, - { url = "https://files.pythonhosted.org/packages/21/4b/4f3d4a43743ab58b95b9ddf5580a265b593d017693df9e08bd55780af5bb/sqlalchemy-2.0.48-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:e83e3f959aaa1c9df95c22c528096d94848a1bc819f5d0ebf7ee3df0ca63db6c", size = 3313555, upload-time = "2026-03-02T15:58:57.21Z" }, - { url = "https://files.pythonhosted.org/packages/21/dd/3b7c53f1dbbf736fd27041aee68f8ac52226b610f914085b1652c2323442/sqlalchemy-2.0.48-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = 
"sha256:6f7b7243850edd0b8b97043f04748f31de50cf426e939def5c16bedb540698f7", size = 3313057, upload-time = "2026-03-02T15:52:29.366Z" }, - { url = "https://files.pythonhosted.org/packages/d9/cc/3e600a90ae64047f33313d7d32e5ad025417f09d2ded487e8284b5e21a15/sqlalchemy-2.0.48-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:82745b03b4043e04600a6b665cb98697c4339b24e34d74b0a2ac0a2488b6f94d", size = 3265431, upload-time = "2026-03-02T15:58:59.096Z" }, - { url = "https://files.pythonhosted.org/packages/8b/19/780138dacfe3f5024f4cf96e4005e91edf6653d53d3673be4844578faf1d/sqlalchemy-2.0.48-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:e5e088bf43f6ee6fec7dbf1ef7ff7774a616c236b5c0cb3e00662dd71a56b571", size = 3287646, upload-time = "2026-03-02T15:52:31.569Z" }, - { url = "https://files.pythonhosted.org/packages/40/fd/f32ced124f01a23151f4777e4c705f3a470adc7bd241d9f36a7c941a33bf/sqlalchemy-2.0.48-cp311-cp311-win32.whl", hash = "sha256:9c7d0a77e36b5f4b01ca398482230ab792061d243d715299b44a0b55c89fe617", size = 2116956, upload-time = "2026-03-02T15:46:54.535Z" }, - { url = "https://files.pythonhosted.org/packages/58/d5/dd767277f6feef12d05651538f280277e661698f617fa4d086cce6055416/sqlalchemy-2.0.48-cp311-cp311-win_amd64.whl", hash = "sha256:583849c743e0e3c9bb7446f5b5addeacedc168d657a69b418063dfdb2d90081c", size = 2141627, upload-time = "2026-03-02T15:46:55.849Z" }, - { url = "https://files.pythonhosted.org/packages/ef/91/a42ae716f8925e9659df2da21ba941f158686856107a61cc97a95e7647a3/sqlalchemy-2.0.48-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:348174f228b99f33ca1f773e85510e08927620caa59ffe7803b37170df30332b", size = 2155737, upload-time = "2026-03-02T15:49:13.207Z" }, - { url = "https://files.pythonhosted.org/packages/b9/52/f75f516a1f3888f027c1cfb5d22d4376f4b46236f2e8669dcb0cddc60275/sqlalchemy-2.0.48-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:53667b5f668991e279d21f94ccfa6e45b4e3f4500e7591ae59a8012d0f010dcb", size = 3337020, upload-time = "2026-03-02T15:50:34.547Z" }, - { url = "https://files.pythonhosted.org/packages/37/9a/0c28b6371e0cdcb14f8f1930778cb3123acfcbd2c95bb9cf6b4a2ba0cce3/sqlalchemy-2.0.48-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:34634e196f620c7a61d18d5cf7dc841ca6daa7961aed75d532b7e58b309ac894", size = 3349983, upload-time = "2026-03-02T15:53:25.542Z" }, - { url = "https://files.pythonhosted.org/packages/1c/46/0aee8f3ff20b1dcbceb46ca2d87fcc3d48b407925a383ff668218509d132/sqlalchemy-2.0.48-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:546572a1793cc35857a2ffa1fe0e58571af1779bcc1ffa7c9fb0839885ed69a9", size = 3279690, upload-time = "2026-03-02T15:50:36.277Z" }, - { url = "https://files.pythonhosted.org/packages/ce/8c/a957bc91293b49181350bfd55e6dfc6e30b7f7d83dc6792d72043274a390/sqlalchemy-2.0.48-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:07edba08061bc277bfdc772dd2a1a43978f5a45994dd3ede26391b405c15221e", size = 3314738, upload-time = "2026-03-02T15:53:27.519Z" }, - { url = "https://files.pythonhosted.org/packages/4b/44/1d257d9f9556661e7bdc83667cc414ba210acfc110c82938cb3611eea58f/sqlalchemy-2.0.48-cp312-cp312-win32.whl", hash = "sha256:908a3fa6908716f803b86896a09a2c4dde5f5ce2bb07aacc71ffebb57986ce99", size = 2115546, upload-time = "2026-03-02T15:54:31.591Z" }, - { url = "https://files.pythonhosted.org/packages/f2/af/c3c7e1f3a2b383155a16454df62ae8c62a30dd238e42e68c24cebebbfae6/sqlalchemy-2.0.48-cp312-cp312-win_amd64.whl", hash = 
"sha256:68549c403f79a8e25984376480959975212a670405e3913830614432b5daa07a", size = 2142484, upload-time = "2026-03-02T15:54:34.072Z" }, - { url = "https://files.pythonhosted.org/packages/d1/c6/569dc8bf3cd375abc5907e82235923e986799f301cd79a903f784b996fca/sqlalchemy-2.0.48-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:e3070c03701037aa418b55d36532ecb8f8446ed0135acb71c678dbdf12f5b6e4", size = 2152599, upload-time = "2026-03-02T15:49:14.41Z" }, - { url = "https://files.pythonhosted.org/packages/6d/ff/f4e04a4bd5a24304f38cb0d4aa2ad4c0fb34999f8b884c656535e1b2b74c/sqlalchemy-2.0.48-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:2645b7d8a738763b664a12a1542c89c940daa55196e8d73e55b169cc5c99f65f", size = 3278825, upload-time = "2026-03-02T15:50:38.269Z" }, - { url = "https://files.pythonhosted.org/packages/fe/88/cb59509e4668d8001818d7355d9995be90c321313078c912420603a7cb95/sqlalchemy-2.0.48-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:b19151e76620a412c2ac1c6f977ab1b9fa7ad43140178345136456d5265b32ed", size = 3295200, upload-time = "2026-03-02T15:53:29.366Z" }, - { url = "https://files.pythonhosted.org/packages/87/dc/1609a4442aefd750ea2f32629559394ec92e89ac1d621a7f462b70f736ff/sqlalchemy-2.0.48-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:5b193a7e29fd9fa56e502920dca47dffe60f97c863494946bd698c6058a55658", size = 3226876, upload-time = "2026-03-02T15:50:39.802Z" }, - { url = "https://files.pythonhosted.org/packages/37/c3/6ae2ab5ea2fa989fbac4e674de01224b7a9d744becaf59bb967d62e99bed/sqlalchemy-2.0.48-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:36ac4ddc3d33e852da9cb00ffb08cea62ca05c39711dc67062ca2bb1fae35fd8", size = 3265045, upload-time = "2026-03-02T15:53:31.421Z" }, - { url = "https://files.pythonhosted.org/packages/6f/82/ea4665d1bb98c50c19666e672f21b81356bd6077c4574e3d2bbb84541f53/sqlalchemy-2.0.48-cp313-cp313-win32.whl", hash = "sha256:389b984139278f97757ea9b08993e7b9d1142912e046ab7d82b3fbaeb0209131", size = 2113700, upload-time = "2026-03-02T15:54:35.825Z" }, - { url = "https://files.pythonhosted.org/packages/b7/2b/b9040bec58c58225f073f5b0c1870defe1940835549dafec680cbd58c3c3/sqlalchemy-2.0.48-cp313-cp313-win_amd64.whl", hash = "sha256:d612c976cbc2d17edfcc4c006874b764e85e990c29ce9bd411f926bbfb02b9a2", size = 2139487, upload-time = "2026-03-02T15:54:37.079Z" }, - { url = "https://files.pythonhosted.org/packages/f4/f4/7b17bd50244b78a49d22cc63c969d71dc4de54567dc152a9b46f6fae40ce/sqlalchemy-2.0.48-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:69f5bc24904d3bc3640961cddd2523e361257ef68585d6e364166dfbe8c78fae", size = 3558851, upload-time = "2026-03-02T15:57:48.607Z" }, - { url = "https://files.pythonhosted.org/packages/20/0d/213668e9aca61d370f7d2a6449ea4ec699747fac67d4bda1bb3d129025be/sqlalchemy-2.0.48-cp313-cp313t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:fd08b90d211c086181caed76931ecfa2bdfc83eea3cfccdb0f82abc6c4b876cb", size = 3525525, upload-time = "2026-03-02T16:04:38.058Z" }, - { url = "https://files.pythonhosted.org/packages/85/d7/a84edf412979e7d59c69b89a5871f90a49228360594680e667cb2c46a828/sqlalchemy-2.0.48-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:1ccd42229aaac2df431562117ac7e667d702e8e44afdb6cf0e50fa3f18160f0b", size = 3466611, upload-time = "2026-03-02T15:57:50.759Z" }, - { url = 
"https://files.pythonhosted.org/packages/86/55/42404ce5770f6be26a2b0607e7866c31b9a4176c819e9a7a5e0a055770be/sqlalchemy-2.0.48-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:f0dcbc588cd5b725162c076eb9119342f6579c7f7f55057bb7e3c6ff27e13121", size = 3475812, upload-time = "2026-03-02T16:04:40.092Z" }, - { url = "https://files.pythonhosted.org/packages/ae/ae/29b87775fadc43e627cf582fe3bda4d02e300f6b8f2747c764950d13784c/sqlalchemy-2.0.48-cp313-cp313t-win32.whl", hash = "sha256:9764014ef5e58aab76220c5664abb5d47d5bc858d9debf821e55cfdd0f128485", size = 2141335, upload-time = "2026-03-02T15:52:51.518Z" }, - { url = "https://files.pythonhosted.org/packages/91/44/f39d063c90f2443e5b46ec4819abd3d8de653893aae92df42a5c4f5843de/sqlalchemy-2.0.48-cp313-cp313t-win_amd64.whl", hash = "sha256:e2f35b4cccd9ed286ad62e0a3c3ac21e06c02abc60e20aa51a3e305a30f5fa79", size = 2173095, upload-time = "2026-03-02T15:52:52.79Z" }, - { url = "https://files.pythonhosted.org/packages/46/2c/9664130905f03db57961b8980b05cab624afd114bf2be2576628a9f22da4/sqlalchemy-2.0.48-py3-none-any.whl", hash = "sha256:a66fe406437dd65cacd96a72689a3aaaecaebbcd62d81c5ac1c0fdbeac835096", size = 1940202, upload-time = "2026-03-02T15:52:43.285Z" }, +sdist = { url = "https://files.pythonhosted.org/packages/09/45/461788f35e0364a8da7bda51a1fe1b09762d0c32f12f63727998d85a873b/sqlalchemy-2.0.49.tar.gz", hash = "sha256:d15950a57a210e36dd4cec1aac22787e2a4d57ba9318233e2ef8b2daf9ff2d5f", size = 9898221, upload-time = "2026-04-03T16:38:11.704Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/60/b5/e3617cc67420f8f403efebd7b043128f94775e57e5b84e7255203390ceae/sqlalchemy-2.0.49-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:c5070135e1b7409c4161133aa525419b0062088ed77c92b1da95366ec5cbebbe", size = 2159126, upload-time = "2026-04-03T16:50:13.242Z" }, + { url = "https://files.pythonhosted.org/packages/20/9b/91ca80403b17cd389622a642699e5f6564096b698e7cdcbcbb6409898bc4/sqlalchemy-2.0.49-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:9ac7a3e245fd0310fd31495eb61af772e637bdf7d88ee81e7f10a3f271bff014", size = 3315509, upload-time = "2026-04-03T16:54:49.332Z" }, + { url = "https://files.pythonhosted.org/packages/b1/61/0722511d98c54de95acb327824cb759e8653789af2b1944ab1cc69d32565/sqlalchemy-2.0.49-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:4d4e5a0ceba319942fa6b585cf82539288a61e314ef006c1209f734551ab9536", size = 3315014, upload-time = "2026-04-03T16:56:56.376Z" }, + { url = "https://files.pythonhosted.org/packages/46/55/d514a653ffeb4cebf4b54c47bec32ee28ad89d39fafba16eeed1d81dccd5/sqlalchemy-2.0.49-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:3ddcb27fb39171de36e207600116ac9dfd4ae46f86c82a9bf3934043e80ebb88", size = 3267388, upload-time = "2026-04-03T16:54:51.272Z" }, + { url = "https://files.pythonhosted.org/packages/2f/16/0dcc56cb6d3335c1671a2258f5d2cb8267c9a2260e27fde53cbfb1b3540a/sqlalchemy-2.0.49-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:32fe6a41ad97302db2931f05bb91abbcc65b5ce4c675cd44b972428dd2947700", size = 3289602, upload-time = "2026-04-03T16:56:57.63Z" }, + { url = "https://files.pythonhosted.org/packages/51/6c/f8ab6fb04470a133cd80608db40aa292e6bae5f162c3a3d4ab19544a67af/sqlalchemy-2.0.49-cp311-cp311-win32.whl", hash = "sha256:46d51518d53edfbe0563662c96954dc8fcace9832332b914375f45a99b77cc9a", size = 2119044, upload-time = "2026-04-03T17:00:53.455Z" }, + { url = 
"https://files.pythonhosted.org/packages/c4/59/55a6d627d04b6ebb290693681d7683c7da001eddf90b60cfcc41ee907978/sqlalchemy-2.0.49-cp311-cp311-win_amd64.whl", hash = "sha256:951d4a210744813be63019f3df343bf233b7432aadf0db54c75802247330d3af", size = 2143642, upload-time = "2026-04-03T17:00:54.769Z" }, + { url = "https://files.pythonhosted.org/packages/49/b3/2de412451330756aaaa72d27131db6dde23995efe62c941184e15242a5fa/sqlalchemy-2.0.49-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:4bbccb45260e4ff1b7db0be80a9025bb1e6698bdb808b83fff0000f7a90b2c0b", size = 2157681, upload-time = "2026-04-03T16:53:07.132Z" }, + { url = "https://files.pythonhosted.org/packages/50/84/b2a56e2105bd11ebf9f0b93abddd748e1a78d592819099359aa98134a8bf/sqlalchemy-2.0.49-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:fb37f15714ec2652d574f021d479e78cd4eb9d04396dca36568fdfffb3487982", size = 3338976, upload-time = "2026-04-03T17:07:40Z" }, + { url = "https://files.pythonhosted.org/packages/2c/fa/65fcae2ed62f84ab72cf89536c7c3217a156e71a2c111b1305ab6f0690e2/sqlalchemy-2.0.49-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:3bb9ec6436a820a4c006aad1ac351f12de2f2dbdaad171692ee457a02429b672", size = 3351937, upload-time = "2026-04-03T17:12:23.374Z" }, + { url = "https://files.pythonhosted.org/packages/f8/2f/6fd118563572a7fe475925742eb6b3443b2250e346a0cc27d8d408e73773/sqlalchemy-2.0.49-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:8d6efc136f44a7e8bc8088507eaabbb8c2b55b3dbb63fe102c690da0ddebe55e", size = 3281646, upload-time = "2026-04-03T17:07:41.949Z" }, + { url = "https://files.pythonhosted.org/packages/c5/d7/410f4a007c65275b9cf82354adb4bb8ba587b176d0a6ee99caa16fe638f8/sqlalchemy-2.0.49-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:e06e617e3d4fd9e51d385dfe45b077a41e9d1b033a7702551e3278ac597dc750", size = 3316695, upload-time = "2026-04-03T17:12:25.642Z" }, + { url = "https://files.pythonhosted.org/packages/d9/95/81f594aa60ded13273a844539041ccf1e66c5a7bed0a8e27810a3b52d522/sqlalchemy-2.0.49-cp312-cp312-win32.whl", hash = "sha256:83101a6930332b87653886c01d1ee7e294b1fe46a07dd9a2d2b4f91bcc88eec0", size = 2117483, upload-time = "2026-04-03T17:05:40.896Z" }, + { url = "https://files.pythonhosted.org/packages/47/9e/fd90114059175cac64e4fafa9bf3ac20584384d66de40793ae2e2f26f3bb/sqlalchemy-2.0.49-cp312-cp312-win_amd64.whl", hash = "sha256:618a308215b6cececb6240b9abde545e3acdabac7ae3e1d4e666896bf5ba44b4", size = 2144494, upload-time = "2026-04-03T17:05:42.282Z" }, + { url = "https://files.pythonhosted.org/packages/ae/81/81755f50eb2478eaf2049728491d4ea4f416c1eb013338682173259efa09/sqlalchemy-2.0.49-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:df2d441bacf97022e81ad047e1597552eb3f83ca8a8f1a1fdd43cd7fe3898120", size = 2154547, upload-time = "2026-04-03T16:53:08.64Z" }, + { url = "https://files.pythonhosted.org/packages/a2/bc/3494270da80811d08bcfa247404292428c4fe16294932bce5593f215cad9/sqlalchemy-2.0.49-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:8e20e511dc15265fb433571391ba313e10dd8ea7e509d51686a51313b4ac01a2", size = 3280782, upload-time = "2026-04-03T17:07:43.508Z" }, + { url = "https://files.pythonhosted.org/packages/cd/f5/038741f5e747a5f6ea3e72487211579d8cbea5eb9827a9cbd61d0108c4bd/sqlalchemy-2.0.49-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:47604cb2159f8bbd5a1ab48a714557156320f20871ee64d550d8bf2683d980d3", 
size = 3297156, upload-time = "2026-04-03T17:12:27.697Z" }, + { url = "https://files.pythonhosted.org/packages/88/50/a6af0ff9dc954b43a65ca9b5367334e45d99684c90a3d3413fc19a02d43c/sqlalchemy-2.0.49-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:22d8798819f86720bc646ab015baff5ea4c971d68121cb36e2ebc2ee43ead2b7", size = 3228832, upload-time = "2026-04-03T17:07:45.38Z" }, + { url = "https://files.pythonhosted.org/packages/bc/d1/5f6bdad8de0bf546fc74370939621396515e0cdb9067402d6ba1b8afbe9a/sqlalchemy-2.0.49-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:9b1c058c171b739e7c330760044803099c7fff11511e3ab3573e5327116a9c33", size = 3267000, upload-time = "2026-04-03T17:12:29.657Z" }, + { url = "https://files.pythonhosted.org/packages/f7/30/ad62227b4a9819a5e1c6abff77c0f614fa7c9326e5a3bdbee90f7139382b/sqlalchemy-2.0.49-cp313-cp313-win32.whl", hash = "sha256:a143af2ea6672f2af3f44ed8f9cd020e9cc34c56f0e8db12019d5d9ecf41cb3b", size = 2115641, upload-time = "2026-04-03T17:05:43.989Z" }, + { url = "https://files.pythonhosted.org/packages/17/3a/7215b1b7d6d49dc9a87211be44562077f5f04f9bb5a59552c1c8e2d98173/sqlalchemy-2.0.49-cp313-cp313-win_amd64.whl", hash = "sha256:12b04d1db2663b421fe072d638a138460a51d5a862403295671c4f3987fb9148", size = 2141498, upload-time = "2026-04-03T17:05:45.7Z" }, + { url = "https://files.pythonhosted.org/packages/28/4b/52a0cb2687a9cd1648252bb257be5a1ba2c2ded20ba695c65756a55a15a4/sqlalchemy-2.0.49-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:24bd94bb301ec672d8f0623eba9226cc90d775d25a0c92b5f8e4965d7f3a1518", size = 3560807, upload-time = "2026-04-03T16:58:31.666Z" }, + { url = "https://files.pythonhosted.org/packages/8c/d8/fda95459204877eed0458550d6c7c64c98cc50c2d8d618026737de9ed41a/sqlalchemy-2.0.49-cp313-cp313t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:a51d3db74ba489266ef55c7a4534eb0b8db9a326553df481c11e5d7660c8364d", size = 3527481, upload-time = "2026-04-03T17:06:00.155Z" }, + { url = "https://files.pythonhosted.org/packages/ff/0a/2aac8b78ac6487240cf7afef8f203ca783e8796002dc0cf65c4ee99ff8bb/sqlalchemy-2.0.49-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:55250fe61d6ebfd6934a272ee16ef1244e0f16b7af6cd18ab5b1fc9f08631db0", size = 3468565, upload-time = "2026-04-03T16:58:33.414Z" }, + { url = "https://files.pythonhosted.org/packages/a5/3d/ce71cfa82c50a373fd2148b3c870be05027155ce791dc9a5dcf439790b8b/sqlalchemy-2.0.49-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:46796877b47034b559a593d7e4b549aba151dae73f9e78212a3478161c12ab08", size = 3477769, upload-time = "2026-04-03T17:06:02.787Z" }, + { url = "https://files.pythonhosted.org/packages/d5/e8/0a9f5c1f7c6f9ca480319bf57c2d7423f08d31445974167a27d14483c948/sqlalchemy-2.0.49-cp313-cp313t-win32.whl", hash = "sha256:9c4969a86e41454f2858256c39bdfb966a20961e9b58bf8749b65abf447e9a8d", size = 2143319, upload-time = "2026-04-03T17:02:04.328Z" }, + { url = "https://files.pythonhosted.org/packages/0e/51/fb5240729fbec73006e137c4f7a7918ffd583ab08921e6ff81a999d6517a/sqlalchemy-2.0.49-cp313-cp313t-win_amd64.whl", hash = "sha256:b9870d15ef00e4d0559ae10ee5bc71b654d1f20076dbe8bc7ed19b4c0625ceba", size = 2175104, upload-time = "2026-04-03T17:02:05.989Z" }, + { url = "https://files.pythonhosted.org/packages/e5/30/8519fdde58a7bdf155b714359791ad1dc018b47d60269d5d160d311fdc36/sqlalchemy-2.0.49-py3-none-any.whl", hash = "sha256:ec44cfa7ef1a728e88ad41674de50f6db8cfdb3e2af84af86e0041aaf02d43d0", size = 1942158, upload-time = 
"2026-04-03T16:53:44.135Z" }, ] [[package]] @@ -3328,7 +3332,7 @@ wheels = [ [[package]] name = "typer" -version = "0.24.1" +version = "0.25.0" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "annotated-doc" }, @@ -3336,9 +3340,9 @@ dependencies = [ { name = "rich" }, { name = "shellingham" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/f5/24/cb09efec5cc954f7f9b930bf8279447d24618bb6758d4f6adf2574c41780/typer-0.24.1.tar.gz", hash = "sha256:e39b4732d65fbdcde189ae76cf7cd48aeae72919dea1fdfc16593be016256b45", size = 118613, upload-time = "2026-02-21T16:54:40.609Z" } +sdist = { url = "https://files.pythonhosted.org/packages/7b/27/ede8cec7596e0041ba7e7b80b47d132562f56ff454313a16f6084e555c9f/typer-0.25.0.tar.gz", hash = "sha256:123eaf9f19bb40fd268310e12a542c0c6b4fab9c98d9d23342a01ff95e3ce930", size = 120150, upload-time = "2026-04-26T08:46:14.767Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/4a/91/48db081e7a63bb37284f9fbcefda7c44c277b18b0e13fbc36ea2335b71e6/typer-0.24.1-py3-none-any.whl", hash = "sha256:112c1f0ce578bfb4cab9ffdabc68f031416ebcc216536611ba21f04e9aa84c9e", size = 56085, upload-time = "2026-02-21T16:54:41.616Z" }, + { url = "https://files.pythonhosted.org/packages/9a/72/193d4e586ec5a4db834a36bbeb47641a62f951f114ffd0fe5b1b46e8d56f/typer-0.25.0-py3-none-any.whl", hash = "sha256:ac01b48823d3db9a83c9e164338057eadbb1c9957a2a6b4eeb486669c560b5dc", size = 55993, upload-time = "2026-04-26T08:46:15.889Z" }, ] [[package]] @@ -3364,11 +3368,11 @@ wheels = [ [[package]] name = "tzdata" -version = "2025.3" +version = "2026.2" source = { registry = "https://pypi.org/simple" } -sdist = { url = "https://files.pythonhosted.org/packages/5e/a7/c202b344c5ca7daf398f3b8a477eeb205cf3b6f32e7ec3a6bac0629ca975/tzdata-2025.3.tar.gz", hash = "sha256:de39c2ca5dc7b0344f2eba86f49d614019d29f060fc4ebc8a417896a620b56a7", size = 196772, upload-time = "2025-12-13T17:45:35.667Z" } +sdist = { url = "https://files.pythonhosted.org/packages/ba/19/1b9b0e29f30c6d35cb345486df41110984ea67ae69dddbc0e8a100999493/tzdata-2026.2.tar.gz", hash = "sha256:9173fde7d80d9018e02a662e168e5a2d04f87c41ea174b139fbef642eda62d10", size = 198254, upload-time = "2026-04-24T15:22:08.651Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/c7/b0/003792df09decd6849a5e39c28b513c06e84436a54440380862b5aeff25d/tzdata-2025.3-py2.py3-none-any.whl", hash = "sha256:06a47e5700f3081aab02b2e513160914ff0694bce9947d6b76ebd6bf57cfc5d1", size = 348521, upload-time = "2025-12-13T17:45:33.889Z" }, + { url = "https://files.pythonhosted.org/packages/ce/e4/dccd7f47c4b64213ac01ef921a1337ee6e30e8c6466046018326977efd95/tzdata-2026.2-py2.py3-none-any.whl", hash = "sha256:bbe9af844f658da81a5f95019480da3a89415801f6cc966806612cc7169bffe7", size = 349321, upload-time = "2026-04-24T15:22:05.876Z" }, ] [[package]] @@ -3382,7 +3386,7 @@ wheels = [ [[package]] name = "virtualenv" -version = "21.2.0" +version = "21.3.0" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "distlib" }, @@ -3390,9 +3394,9 @@ dependencies = [ { name = "platformdirs" }, { name = "python-discovery" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/aa/92/58199fe10049f9703c2666e809c4f686c54ef0a68b0f6afccf518c0b1eb9/virtualenv-21.2.0.tar.gz", hash = "sha256:1720dc3a62ef5b443092e3f499228599045d7fea4c79199770499df8becf9098", size = 5840618, upload-time = "2026-03-09T17:24:38.013Z" } +sdist = { url = 
"https://files.pythonhosted.org/packages/3f/8b/6331f7a7fe70131c301106ec1e7cf23e2501bf7d4ca3636805801ca191bb/virtualenv-21.3.0.tar.gz", hash = "sha256:733750db978ec95c2d8eb4feadaa57091002bce404cb39ba69899cf7bd28944e", size = 7614069, upload-time = "2026-04-27T17:05:58.927Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/c6/59/7d02447a55b2e55755011a647479041bc92a82e143f96a8195cb33bd0a1c/virtualenv-21.2.0-py3-none-any.whl", hash = "sha256:1bd755b504931164a5a496d217c014d098426cddc79363ad66ac78125f9d908f", size = 5825084, upload-time = "2026-03-09T17:24:35.378Z" }, + { url = "https://files.pythonhosted.org/packages/4b/eb/03bfb1299d4c4510329e470f13f9a4ce793df7fcb5a2fd3510f911066f61/virtualenv-21.3.0-py3-none-any.whl", hash = "sha256:4d28ee41f6d9ec8f1f00cd472b9ffbcedda1b3d3b9a575b5c94a2d004fd51bd7", size = 7594690, upload-time = "2026-04-27T17:05:55.468Z" }, ] [[package]] @@ -3406,75 +3410,97 @@ wheels = [ [[package]] name = "xxhash" -version = "3.6.0" -source = { registry = "https://pypi.org/simple" } -sdist = { url = "https://files.pythonhosted.org/packages/02/84/30869e01909fb37a6cc7e18688ee8bf1e42d57e7e0777636bd47524c43c7/xxhash-3.6.0.tar.gz", hash = "sha256:f0162a78b13a0d7617b2845b90c763339d1f1d82bb04a4b07f4ab535cc5e05d6", size = 85160, upload-time = "2025-10-02T14:37:08.097Z" } -wheels = [ - { url = "https://files.pythonhosted.org/packages/17/d4/cc2f0400e9154df4b9964249da78ebd72f318e35ccc425e9f403c392f22a/xxhash-3.6.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:b47bbd8cf2d72797f3c2772eaaac0ded3d3af26481a26d7d7d41dc2d3c46b04a", size = 32844, upload-time = "2025-10-02T14:34:14.037Z" }, - { url = "https://files.pythonhosted.org/packages/5e/ec/1cc11cd13e26ea8bc3cb4af4eaadd8d46d5014aebb67be3f71fb0b68802a/xxhash-3.6.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:2b6821e94346f96db75abaa6e255706fb06ebd530899ed76d32cd99f20dc52fa", size = 30809, upload-time = "2025-10-02T14:34:15.484Z" }, - { url = "https://files.pythonhosted.org/packages/04/5f/19fe357ea348d98ca22f456f75a30ac0916b51c753e1f8b2e0e6fb884cce/xxhash-3.6.0-cp311-cp311-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:d0a9751f71a1a65ce3584e9cae4467651c7e70c9d31017fa57574583a4540248", size = 194665, upload-time = "2025-10-02T14:34:16.541Z" }, - { url = "https://files.pythonhosted.org/packages/90/3b/d1f1a8f5442a5fd8beedae110c5af7604dc37349a8e16519c13c19a9a2de/xxhash-3.6.0-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:8b29ee68625ab37b04c0b40c3fafdf24d2f75ccd778333cfb698f65f6c463f62", size = 213550, upload-time = "2025-10-02T14:34:17.878Z" }, - { url = "https://files.pythonhosted.org/packages/c4/ef/3a9b05eb527457d5db13a135a2ae1a26c80fecd624d20f3e8dcc4cb170f3/xxhash-3.6.0-cp311-cp311-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:6812c25fe0d6c36a46ccb002f40f27ac903bf18af9f6dd8f9669cb4d176ab18f", size = 212384, upload-time = "2025-10-02T14:34:19.182Z" }, - { url = "https://files.pythonhosted.org/packages/0f/18/ccc194ee698c6c623acbf0f8c2969811a8a4b6185af5e824cd27b9e4fd3e/xxhash-3.6.0-cp311-cp311-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:4ccbff013972390b51a18ef1255ef5ac125c92dc9143b2d1909f59abc765540e", size = 445749, upload-time = "2025-10-02T14:34:20.659Z" }, - { url = 
"https://files.pythonhosted.org/packages/a5/86/cf2c0321dc3940a7aa73076f4fd677a0fb3e405cb297ead7d864fd90847e/xxhash-3.6.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:297b7fbf86c82c550e12e8fb71968b3f033d27b874276ba3624ea868c11165a8", size = 193880, upload-time = "2025-10-02T14:34:22.431Z" }, - { url = "https://files.pythonhosted.org/packages/82/fb/96213c8560e6f948a1ecc9a7613f8032b19ee45f747f4fca4eb31bb6d6ed/xxhash-3.6.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:dea26ae1eb293db089798d3973a5fc928a18fdd97cc8801226fae705b02b14b0", size = 210912, upload-time = "2025-10-02T14:34:23.937Z" }, - { url = "https://files.pythonhosted.org/packages/40/aa/4395e669b0606a096d6788f40dbdf2b819d6773aa290c19e6e83cbfc312f/xxhash-3.6.0-cp311-cp311-musllinux_1_2_i686.whl", hash = "sha256:7a0b169aafb98f4284f73635a8e93f0735f9cbde17bd5ec332480484241aaa77", size = 198654, upload-time = "2025-10-02T14:34:25.644Z" }, - { url = "https://files.pythonhosted.org/packages/67/74/b044fcd6b3d89e9b1b665924d85d3f400636c23590226feb1eb09e1176ce/xxhash-3.6.0-cp311-cp311-musllinux_1_2_ppc64le.whl", hash = "sha256:08d45aef063a4531b785cd72de4887766d01dc8f362a515693df349fdb825e0c", size = 210867, upload-time = "2025-10-02T14:34:27.203Z" }, - { url = "https://files.pythonhosted.org/packages/bc/fd/3ce73bf753b08cb19daee1eb14aa0d7fe331f8da9c02dd95316ddfe5275e/xxhash-3.6.0-cp311-cp311-musllinux_1_2_s390x.whl", hash = "sha256:929142361a48ee07f09121fe9e96a84950e8d4df3bb298ca5d88061969f34d7b", size = 414012, upload-time = "2025-10-02T14:34:28.409Z" }, - { url = "https://files.pythonhosted.org/packages/ba/b3/5a4241309217c5c876f156b10778f3ab3af7ba7e3259e6d5f5c7d0129eb2/xxhash-3.6.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:51312c768403d8540487dbbfb557454cfc55589bbde6424456951f7fcd4facb3", size = 191409, upload-time = "2025-10-02T14:34:29.696Z" }, - { url = "https://files.pythonhosted.org/packages/c0/01/99bfbc15fb9abb9a72b088c1d95219fc4782b7d01fc835bd5744d66dd0b8/xxhash-3.6.0-cp311-cp311-win32.whl", hash = "sha256:d1927a69feddc24c987b337ce81ac15c4720955b667fe9b588e02254b80446fd", size = 30574, upload-time = "2025-10-02T14:34:31.028Z" }, - { url = "https://files.pythonhosted.org/packages/65/79/9d24d7f53819fe301b231044ea362ce64e86c74f6e8c8e51320de248b3e5/xxhash-3.6.0-cp311-cp311-win_amd64.whl", hash = "sha256:26734cdc2d4ffe449b41d186bbeac416f704a482ed835d375a5c0cb02bc63fef", size = 31481, upload-time = "2025-10-02T14:34:32.062Z" }, - { url = "https://files.pythonhosted.org/packages/30/4e/15cd0e3e8772071344eab2961ce83f6e485111fed8beb491a3f1ce100270/xxhash-3.6.0-cp311-cp311-win_arm64.whl", hash = "sha256:d72f67ef8bf36e05f5b6c65e8524f265bd61071471cd4cf1d36743ebeeeb06b7", size = 27861, upload-time = "2025-10-02T14:34:33.555Z" }, - { url = "https://files.pythonhosted.org/packages/9a/07/d9412f3d7d462347e4511181dea65e47e0d0e16e26fbee2ea86a2aefb657/xxhash-3.6.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:01362c4331775398e7bb34e3ab403bc9ee9f7c497bc7dee6272114055277dd3c", size = 32744, upload-time = "2025-10-02T14:34:34.622Z" }, - { url = "https://files.pythonhosted.org/packages/79/35/0429ee11d035fc33abe32dca1b2b69e8c18d236547b9a9b72c1929189b9a/xxhash-3.6.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:b7b2df81a23f8cb99656378e72501b2cb41b1827c0f5a86f87d6b06b69f9f204", size = 30816, upload-time = "2025-10-02T14:34:36.043Z" }, - { url = 
"https://files.pythonhosted.org/packages/b7/f2/57eb99aa0f7d98624c0932c5b9a170e1806406cdbcdb510546634a1359e0/xxhash-3.6.0-cp312-cp312-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:dc94790144e66b14f67b10ac8ed75b39ca47536bf8800eb7c24b50271ea0c490", size = 194035, upload-time = "2025-10-02T14:34:37.354Z" }, - { url = "https://files.pythonhosted.org/packages/4c/ed/6224ba353690d73af7a3f1c7cdb1fc1b002e38f783cb991ae338e1eb3d79/xxhash-3.6.0-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:93f107c673bccf0d592cdba077dedaf52fe7f42dcd7676eba1f6d6f0c3efffd2", size = 212914, upload-time = "2025-10-02T14:34:38.6Z" }, - { url = "https://files.pythonhosted.org/packages/38/86/fb6b6130d8dd6b8942cc17ab4d90e223653a89aa32ad2776f8af7064ed13/xxhash-3.6.0-cp312-cp312-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:2aa5ee3444c25b69813663c9f8067dcfaa2e126dc55e8dddf40f4d1c25d7effa", size = 212163, upload-time = "2025-10-02T14:34:39.872Z" }, - { url = "https://files.pythonhosted.org/packages/ee/dc/e84875682b0593e884ad73b2d40767b5790d417bde603cceb6878901d647/xxhash-3.6.0-cp312-cp312-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:f7f99123f0e1194fa59cc69ad46dbae2e07becec5df50a0509a808f90a0f03f0", size = 445411, upload-time = "2025-10-02T14:34:41.569Z" }, - { url = "https://files.pythonhosted.org/packages/11/4f/426f91b96701ec2f37bb2b8cec664eff4f658a11f3fa9d94f0a887ea6d2b/xxhash-3.6.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:49e03e6fe2cac4a1bc64952dd250cf0dbc5ef4ebb7b8d96bce82e2de163c82a2", size = 193883, upload-time = "2025-10-02T14:34:43.249Z" }, - { url = "https://files.pythonhosted.org/packages/53/5a/ddbb83eee8e28b778eacfc5a85c969673e4023cdeedcfcef61f36731610b/xxhash-3.6.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:bd17fede52a17a4f9a7bc4472a5867cb0b160deeb431795c0e4abe158bc784e9", size = 210392, upload-time = "2025-10-02T14:34:45.042Z" }, - { url = "https://files.pythonhosted.org/packages/1e/c2/ff69efd07c8c074ccdf0a4f36fcdd3d27363665bcdf4ba399abebe643465/xxhash-3.6.0-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:6fb5f5476bef678f69db04f2bd1efbed3030d2aba305b0fc1773645f187d6a4e", size = 197898, upload-time = "2025-10-02T14:34:46.302Z" }, - { url = "https://files.pythonhosted.org/packages/58/ca/faa05ac19b3b622c7c9317ac3e23954187516298a091eb02c976d0d3dd45/xxhash-3.6.0-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:843b52f6d88071f87eba1631b684fcb4b2068cd2180a0224122fe4ef011a9374", size = 210655, upload-time = "2025-10-02T14:34:47.571Z" }, - { url = "https://files.pythonhosted.org/packages/d4/7a/06aa7482345480cc0cb597f5c875b11a82c3953f534394f620b0be2f700c/xxhash-3.6.0-cp312-cp312-musllinux_1_2_s390x.whl", hash = "sha256:7d14a6cfaf03b1b6f5f9790f76880601ccc7896aff7ab9cd8978a939c1eb7e0d", size = 414001, upload-time = "2025-10-02T14:34:49.273Z" }, - { url = "https://files.pythonhosted.org/packages/23/07/63ffb386cd47029aa2916b3d2f454e6cc5b9f5c5ada3790377d5430084e7/xxhash-3.6.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:418daf3db71e1413cfe211c2f9a528456936645c17f46b5204705581a45390ae", size = 191431, upload-time = "2025-10-02T14:34:50.798Z" }, - { url = "https://files.pythonhosted.org/packages/0f/93/14fde614cadb4ddf5e7cebf8918b7e8fac5ae7861c1875964f17e678205c/xxhash-3.6.0-cp312-cp312-win32.whl", hash = "sha256:50fc255f39428a27299c20e280d6193d8b63b8ef8028995323bf834a026b4fbb", size = 
30617, upload-time = "2025-10-02T14:34:51.954Z" }, - { url = "https://files.pythonhosted.org/packages/13/5d/0d125536cbe7565a83d06e43783389ecae0c0f2ed037b48ede185de477c0/xxhash-3.6.0-cp312-cp312-win_amd64.whl", hash = "sha256:c0f2ab8c715630565ab8991b536ecded9416d615538be8ecddce43ccf26cbc7c", size = 31534, upload-time = "2025-10-02T14:34:53.276Z" }, - { url = "https://files.pythonhosted.org/packages/54/85/6ec269b0952ec7e36ba019125982cf11d91256a778c7c3f98a4c5043d283/xxhash-3.6.0-cp312-cp312-win_arm64.whl", hash = "sha256:eae5c13f3bc455a3bbb68bdc513912dc7356de7e2280363ea235f71f54064829", size = 27876, upload-time = "2025-10-02T14:34:54.371Z" }, - { url = "https://files.pythonhosted.org/packages/33/76/35d05267ac82f53ae9b0e554da7c5e281ee61f3cad44c743f0fcd354f211/xxhash-3.6.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:599e64ba7f67472481ceb6ee80fa3bd828fd61ba59fb11475572cc5ee52b89ec", size = 32738, upload-time = "2025-10-02T14:34:55.839Z" }, - { url = "https://files.pythonhosted.org/packages/31/a8/3fbce1cd96534a95e35d5120637bf29b0d7f5d8fa2f6374e31b4156dd419/xxhash-3.6.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:7d8b8aaa30fca4f16f0c84a5c8d7ddee0e25250ec2796c973775373257dde8f1", size = 30821, upload-time = "2025-10-02T14:34:57.219Z" }, - { url = "https://files.pythonhosted.org/packages/0c/ea/d387530ca7ecfa183cb358027f1833297c6ac6098223fd14f9782cd0015c/xxhash-3.6.0-cp313-cp313-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:d597acf8506d6e7101a4a44a5e428977a51c0fadbbfd3c39650cca9253f6e5a6", size = 194127, upload-time = "2025-10-02T14:34:59.21Z" }, - { url = "https://files.pythonhosted.org/packages/ba/0c/71435dcb99874b09a43b8d7c54071e600a7481e42b3e3ce1eb5226a5711a/xxhash-3.6.0-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:858dc935963a33bc33490128edc1c12b0c14d9c7ebaa4e387a7869ecc4f3e263", size = 212975, upload-time = "2025-10-02T14:35:00.816Z" }, - { url = "https://files.pythonhosted.org/packages/84/7a/c2b3d071e4bb4a90b7057228a99b10d51744878f4a8a6dd643c8bd897620/xxhash-3.6.0-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:ba284920194615cb8edf73bf52236ce2e1664ccd4a38fdb543506413529cc546", size = 212241, upload-time = "2025-10-02T14:35:02.207Z" }, - { url = "https://files.pythonhosted.org/packages/81/5f/640b6eac0128e215f177df99eadcd0f1b7c42c274ab6a394a05059694c5a/xxhash-3.6.0-cp313-cp313-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:4b54219177f6c6674d5378bd862c6aedf64725f70dd29c472eaae154df1a2e89", size = 445471, upload-time = "2025-10-02T14:35:03.61Z" }, - { url = "https://files.pythonhosted.org/packages/5e/1e/3c3d3ef071b051cc3abbe3721ffb8365033a172613c04af2da89d5548a87/xxhash-3.6.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:42c36dd7dbad2f5238950c377fcbf6811b1cdb1c444fab447960030cea60504d", size = 193936, upload-time = "2025-10-02T14:35:05.013Z" }, - { url = "https://files.pythonhosted.org/packages/2c/bd/4a5f68381939219abfe1c22a9e3a5854a4f6f6f3c4983a87d255f21f2e5d/xxhash-3.6.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:f22927652cba98c44639ffdc7aaf35828dccf679b10b31c4ad72a5b530a18eb7", size = 210440, upload-time = "2025-10-02T14:35:06.239Z" }, - { url = "https://files.pythonhosted.org/packages/eb/37/b80fe3d5cfb9faff01a02121a0f4d565eb7237e9e5fc66e73017e74dcd36/xxhash-3.6.0-cp313-cp313-musllinux_1_2_i686.whl", hash = 
"sha256:b45fad44d9c5c119e9c6fbf2e1c656a46dc68e280275007bbfd3d572b21426db", size = 197990, upload-time = "2025-10-02T14:35:07.735Z" }, - { url = "https://files.pythonhosted.org/packages/d7/fd/2c0a00c97b9e18f72e1f240ad4e8f8a90fd9d408289ba9c7c495ed7dc05c/xxhash-3.6.0-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:6f2580ffab1a8b68ef2b901cde7e55fa8da5e4be0977c68f78fc80f3c143de42", size = 210689, upload-time = "2025-10-02T14:35:09.438Z" }, - { url = "https://files.pythonhosted.org/packages/93/86/5dd8076a926b9a95db3206aba20d89a7fc14dd5aac16e5c4de4b56033140/xxhash-3.6.0-cp313-cp313-musllinux_1_2_s390x.whl", hash = "sha256:40c391dd3cd041ebc3ffe6f2c862f402e306eb571422e0aa918d8070ba31da11", size = 414068, upload-time = "2025-10-02T14:35:11.162Z" }, - { url = "https://files.pythonhosted.org/packages/af/3c/0bb129170ee8f3650f08e993baee550a09593462a5cddd8e44d0011102b1/xxhash-3.6.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:f205badabde7aafd1a31e8ca2a3e5a763107a71c397c4481d6a804eb5063d8bd", size = 191495, upload-time = "2025-10-02T14:35:12.971Z" }, - { url = "https://files.pythonhosted.org/packages/e9/3a/6797e0114c21d1725e2577508e24006fd7ff1d8c0c502d3b52e45c1771d8/xxhash-3.6.0-cp313-cp313-win32.whl", hash = "sha256:2577b276e060b73b73a53042ea5bd5203d3e6347ce0d09f98500f418a9fcf799", size = 30620, upload-time = "2025-10-02T14:35:14.129Z" }, - { url = "https://files.pythonhosted.org/packages/86/15/9bc32671e9a38b413a76d24722a2bf8784a132c043063a8f5152d390b0f9/xxhash-3.6.0-cp313-cp313-win_amd64.whl", hash = "sha256:757320d45d2fbcce8f30c42a6b2f47862967aea7bf458b9625b4bbe7ee390392", size = 31542, upload-time = "2025-10-02T14:35:15.21Z" }, - { url = "https://files.pythonhosted.org/packages/39/c5/cc01e4f6188656e56112d6a8e0dfe298a16934b8c47a247236549a3f7695/xxhash-3.6.0-cp313-cp313-win_arm64.whl", hash = "sha256:457b8f85dec5825eed7b69c11ae86834a018b8e3df5e77783c999663da2f96d6", size = 27880, upload-time = "2025-10-02T14:35:16.315Z" }, - { url = "https://files.pythonhosted.org/packages/f3/30/25e5321c8732759e930c555176d37e24ab84365482d257c3b16362235212/xxhash-3.6.0-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:a42e633d75cdad6d625434e3468126c73f13f7584545a9cf34e883aa1710e702", size = 32956, upload-time = "2025-10-02T14:35:17.413Z" }, - { url = "https://files.pythonhosted.org/packages/9f/3c/0573299560d7d9f8ab1838f1efc021a280b5ae5ae2e849034ef3dee18810/xxhash-3.6.0-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:568a6d743219e717b07b4e03b0a828ce593833e498c3b64752e0f5df6bfe84db", size = 31072, upload-time = "2025-10-02T14:35:18.844Z" }, - { url = "https://files.pythonhosted.org/packages/7a/1c/52d83a06e417cd9d4137722693424885cc9878249beb3a7c829e74bf7ce9/xxhash-3.6.0-cp313-cp313t-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:bec91b562d8012dae276af8025a55811b875baace6af510412a5e58e3121bc54", size = 196409, upload-time = "2025-10-02T14:35:20.31Z" }, - { url = "https://files.pythonhosted.org/packages/e3/8e/c6d158d12a79bbd0b878f8355432075fc82759e356ab5a111463422a239b/xxhash-3.6.0-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:78e7f2f4c521c30ad5e786fdd6bae89d47a32672a80195467b5de0480aa97b1f", size = 215736, upload-time = "2025-10-02T14:35:21.616Z" }, - { url = "https://files.pythonhosted.org/packages/bc/68/c4c80614716345d55071a396cf03d06e34b5f4917a467faf43083c995155/xxhash-3.6.0-cp313-cp313t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = 
"sha256:3ed0df1b11a79856df5ffcab572cbd6b9627034c1c748c5566fa79df9048a7c5", size = 214833, upload-time = "2025-10-02T14:35:23.32Z" }, - { url = "https://files.pythonhosted.org/packages/7e/e9/ae27c8ffec8b953efa84c7c4a6c6802c263d587b9fc0d6e7cea64e08c3af/xxhash-3.6.0-cp313-cp313t-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:0e4edbfc7d420925b0dd5e792478ed393d6e75ff8fc219a6546fb446b6a417b1", size = 448348, upload-time = "2025-10-02T14:35:25.111Z" }, - { url = "https://files.pythonhosted.org/packages/d7/6b/33e21afb1b5b3f46b74b6bd1913639066af218d704cc0941404ca717fc57/xxhash-3.6.0-cp313-cp313t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:fba27a198363a7ef87f8c0f6b171ec36b674fe9053742c58dd7e3201c1ab30ee", size = 196070, upload-time = "2025-10-02T14:35:26.586Z" }, - { url = "https://files.pythonhosted.org/packages/96/b6/fcabd337bc5fa624e7203aa0fa7d0c49eed22f72e93229431752bddc83d9/xxhash-3.6.0-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:794fe9145fe60191c6532fa95063765529770edcdd67b3d537793e8004cabbfd", size = 212907, upload-time = "2025-10-02T14:35:28.087Z" }, - { url = "https://files.pythonhosted.org/packages/4b/d3/9ee6160e644d660fcf176c5825e61411c7f62648728f69c79ba237250143/xxhash-3.6.0-cp313-cp313t-musllinux_1_2_i686.whl", hash = "sha256:6105ef7e62b5ac73a837778efc331a591d8442f8ef5c7e102376506cb4ae2729", size = 200839, upload-time = "2025-10-02T14:35:29.857Z" }, - { url = "https://files.pythonhosted.org/packages/0d/98/e8de5baa5109394baf5118f5e72ab21a86387c4f89b0e77ef3e2f6b0327b/xxhash-3.6.0-cp313-cp313t-musllinux_1_2_ppc64le.whl", hash = "sha256:f01375c0e55395b814a679b3eea205db7919ac2af213f4a6682e01220e5fe292", size = 213304, upload-time = "2025-10-02T14:35:31.222Z" }, - { url = "https://files.pythonhosted.org/packages/7b/1d/71056535dec5c3177eeb53e38e3d367dd1d16e024e63b1cee208d572a033/xxhash-3.6.0-cp313-cp313t-musllinux_1_2_s390x.whl", hash = "sha256:d706dca2d24d834a4661619dcacf51a75c16d65985718d6a7d73c1eeeb903ddf", size = 416930, upload-time = "2025-10-02T14:35:32.517Z" }, - { url = "https://files.pythonhosted.org/packages/dc/6c/5cbde9de2cd967c322e651c65c543700b19e7ae3e0aae8ece3469bf9683d/xxhash-3.6.0-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:5f059d9faeacd49c0215d66f4056e1326c80503f51a1532ca336a385edadd033", size = 193787, upload-time = "2025-10-02T14:35:33.827Z" }, - { url = "https://files.pythonhosted.org/packages/19/fa/0172e350361d61febcea941b0cc541d6e6c8d65d153e85f850a7b256ff8a/xxhash-3.6.0-cp313-cp313t-win32.whl", hash = "sha256:1244460adc3a9be84731d72b8e80625788e5815b68da3da8b83f78115a40a7ec", size = 30916, upload-time = "2025-10-02T14:35:35.107Z" }, - { url = "https://files.pythonhosted.org/packages/ad/e6/e8cf858a2b19d6d45820f072eff1bea413910592ff17157cabc5f1227a16/xxhash-3.6.0-cp313-cp313t-win_amd64.whl", hash = "sha256:b1e420ef35c503869c4064f4a2f2b08ad6431ab7b229a05cce39d74268bca6b8", size = 31799, upload-time = "2025-10-02T14:35:36.165Z" }, - { url = "https://files.pythonhosted.org/packages/56/15/064b197e855bfb7b343210e82490ae672f8bc7cdf3ddb02e92f64304ee8a/xxhash-3.6.0-cp313-cp313t-win_arm64.whl", hash = "sha256:ec44b73a4220623235f67a996c862049f375df3b1052d9899f40a6382c32d746", size = 28044, upload-time = "2025-10-02T14:35:37.195Z" }, - { url = "https://files.pythonhosted.org/packages/93/1e/8aec23647a34a249f62e2398c42955acd9b4c6ed5cf08cbea94dc46f78d2/xxhash-3.6.0-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:0f7b7e2ec26c1666ad5fc9dbfa426a6a3367ceaf79db5dd76264659d509d73b0", 
size = 30662, upload-time = "2025-10-02T14:37:01.743Z" }, - { url = "https://files.pythonhosted.org/packages/b8/0b/b14510b38ba91caf43006209db846a696ceea6a847a0c9ba0a5b1adc53d6/xxhash-3.6.0-pp311-pypy311_pp73-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:5dc1e14d14fa0f5789ec29a7062004b5933964bb9b02aae6622b8f530dc40296", size = 41056, upload-time = "2025-10-02T14:37:02.879Z" }, - { url = "https://files.pythonhosted.org/packages/50/55/15a7b8a56590e66ccd374bbfa3f9ffc45b810886c8c3b614e3f90bd2367c/xxhash-3.6.0-pp311-pypy311_pp73-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:881b47fc47e051b37d94d13e7455131054b56749b91b508b0907eb07900d1c13", size = 36251, upload-time = "2025-10-02T14:37:04.44Z" }, - { url = "https://files.pythonhosted.org/packages/62/b2/5ac99a041a29e58e95f907876b04f7067a0242cb85b5f39e726153981503/xxhash-3.6.0-pp311-pypy311_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:c6dc31591899f5e5666f04cc2e529e69b4072827085c1ef15294d91a004bc1bd", size = 32481, upload-time = "2025-10-02T14:37:05.869Z" }, - { url = "https://files.pythonhosted.org/packages/7b/d9/8d95e906764a386a3d3b596f3c68bb63687dfca806373509f51ce8eea81f/xxhash-3.6.0-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:15e0dac10eb9309508bfc41f7f9deaa7755c69e35af835db9cb10751adebc35d", size = 31565, upload-time = "2025-10-02T14:37:06.966Z" }, +version = "3.7.0" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/24/2f/e183a1b407002f5af81822bee18b61cdb94b8670208ef34734d8d2b8ebe9/xxhash-3.7.0.tar.gz", hash = "sha256:6cc4eefbb542a5d6ffd6d70ea9c502957c925e800f998c5630ecc809d6702bae", size = 82022, upload-time = "2026-04-25T11:10:32.553Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/3b/f4/7bd35089ff1f8e2c96baa2dce05775a122aacd2e3830a73165e27a4d0848/xxhash-3.7.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:fdc7d06929ae28dda98297a18eef7b0fd38991a3b405d8d7b55c9ef24c296958", size = 33423, upload-time = "2026-04-25T11:05:47.628Z" }, + { url = "https://files.pythonhosted.org/packages/a3/26/4e00c88a6a2c8a759cfb77d2a9a405f901e8aa66e60ef1fd0aeb35edda48/xxhash-3.7.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:ea6daa712f4e094a30830cf01e9b47d03b24d05cc9dab8609f0d9a9db8454712", size = 30857, upload-time = "2026-04-25T11:05:49.189Z" }, + { url = "https://files.pythonhosted.org/packages/82/2f/eeb942c17a5a761a8f01cb9180a0b76bfb62a2c39e6f46b1f9001899027a/xxhash-3.7.0-cp311-cp311-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:9e6c0d843f1daf85ea23aeb053579135552bde575b7b98af20bfc667b6e4548d", size = 194702, upload-time = "2026-04-25T11:05:50.457Z" }, + { url = "https://files.pythonhosted.org/packages/0e/fd/96f132c08b1e5951c68691d3b9ec351ec2edc028f6a01fcd294f46b9d9f0/xxhash-3.7.0-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:363c139bf15e1ac5f136b981d3c077eb551299b1effede7f12faa010b8590a60", size = 213613, upload-time = "2026-04-25T11:05:52.571Z" }, + { url = "https://files.pythonhosted.org/packages/82/89/d4e92b796c5ed052d29ed324dbfc1dc1188e0c4bf64bebbf0f8fc20698df/xxhash-3.7.0-cp311-cp311-manylinux2014_armv7l.manylinux_2_17_armv7l.manylinux_2_31_armv7l.whl", hash = "sha256:a778b25874cb0f862eaab5986bff4ca49ffb0def7c0a34c237b948b3c6c775b2", size = 236726, upload-time = "2026-04-25T11:05:54.395Z" }, + { url = 
"https://files.pythonhosted.org/packages/40/f1/81fc4361921dc6e557a9c60cb3712f36d244d06eeeb71cd2f4252ac42678/xxhash-3.7.0-cp311-cp311-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:3e1860f1e43d40e9d904cf22d93e587ea42e010ebce4160877e46bcab4bc232a", size = 212443, upload-time = "2026-04-25T11:05:56.334Z" }, + { url = "https://files.pythonhosted.org/packages/6a/d0/afeddd4cff50a332f50d4b8a2e8857673153ab0564ef472fcdeb0b5430df/xxhash-3.7.0-cp311-cp311-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:9122ad6f867c4a0f5e655f5c3bdf89103852009dbb442a3d23e688b9e699e800", size = 445793, upload-time = "2026-04-25T11:05:58.953Z" }, + { url = "https://files.pythonhosted.org/packages/f7/d0/3c91e4e6a05ca4d7df8e39ec3a75b713609258ec84705ab34be6430826a1/xxhash-3.7.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:d7d9110d0c3fb02679972837a033251fd186c529aa62f19c132fc909c74052b8", size = 193937, upload-time = "2026-04-25T11:06:00.546Z" }, + { url = "https://files.pythonhosted.org/packages/4e/3a/a6b0772d9801dd4bea4ca4fd34734d6e9b51a711c8a611a24a79de26a878/xxhash-3.7.0-cp311-cp311-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:347a93f2b4ce67ce61959665e32a7447c380f8347e55e100daa23766baacf0e5", size = 285188, upload-time = "2026-04-25T11:06:01.96Z" }, + { url = "https://files.pythonhosted.org/packages/6c/f8/cf8e31fd7282230fe7367cd501a2e75b4b67b222bfc7eacccfc20d2652cb/xxhash-3.7.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:acbb48679ddf3852c45280c10ff10d52ca2cd1da2e552fb81db1ff786c75d0e4", size = 210966, upload-time = "2026-04-25T11:06:03.453Z" }, + { url = "https://files.pythonhosted.org/packages/cc/f0/fd36cc4a81bf52ee5633275daae2b93dd958aace67fd4f5d466ec83b5f35/xxhash-3.7.0-cp311-cp311-musllinux_1_2_armv7l.whl", hash = "sha256:fe14c356f8b23ad811dc026077a6d4abccdaa7bce5ca98579605550657b6fcfb", size = 241994, upload-time = "2026-04-25T11:06:05.264Z" }, + { url = "https://files.pythonhosted.org/packages/08/e1/67f5d9c9369be42eaf99ba02c01bf14c5ecd67087b02567960bfcee43b63/xxhash-3.7.0-cp311-cp311-musllinux_1_2_i686.whl", hash = "sha256:f420ad3d41e38194353a498bbc9561fd5a9973a27b536ce46d8583479cf44335", size = 198707, upload-time = "2026-04-25T11:06:07.044Z" }, + { url = "https://files.pythonhosted.org/packages/50/17/a4c865ca22d2da6b1bc7d739bf88cab209533cf52ba06ca9da27c3039bee/xxhash-3.7.0-cp311-cp311-musllinux_1_2_ppc64le.whl", hash = "sha256:693d02c6dc7d1aa0a45921d54cd8c1ff629e09dfdc2238471507af1f7a1c6f04", size = 210917, upload-time = "2026-04-25T11:06:08.853Z" }, + { url = "https://files.pythonhosted.org/packages/49/8b/453b35810d697abac3c96bde3528bece685869227da274eb80a4a4d4a119/xxhash-3.7.0-cp311-cp311-musllinux_1_2_riscv64.whl", hash = "sha256:14bf7a54e43825ec131ee7fe3c60e142e7c2c1e676ad0f93fc893432d15414af", size = 275772, upload-time = "2026-04-25T11:06:10.645Z" }, + { url = "https://files.pythonhosted.org/packages/b5/ad/4eed7eab07fd3ee6678f416190f0413d097ab5d7c1278906bf1e9549d789/xxhash-3.7.0-cp311-cp311-musllinux_1_2_s390x.whl", hash = "sha256:ae3a39a4d96bdb6f8d154fd7f490c4ad06f0532fcd2bb656052a9a7762cf5d31", size = 414068, upload-time = "2026-04-25T11:06:12.511Z" }, + { url = "https://files.pythonhosted.org/packages/d3/4e/fd6f8a680ba248fdb83054fa71a8bfa3891225200de1708b888ef2c49829/xxhash-3.7.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:1cc07c639e3a77ef1d32987464d3e408565b8a3be57b545d3542b191054d9923", size = 191459, upload-time = 
"2026-04-25T11:06:14.07Z" }, + { url = "https://files.pythonhosted.org/packages/50/7c/8cb34b3bed4f44ca6827a534d50833f9bc6c006e83b0eb410ac9fa0793bd/xxhash-3.7.0-cp311-cp311-win32.whl", hash = "sha256:3281ba1d1e60ee7a382a7b958513ba03c2c0d5fcbd9a6f7517c0a81251a23422", size = 30628, upload-time = "2026-04-25T11:06:15.802Z" }, + { url = "https://files.pythonhosted.org/packages/0b/47/a49767bd7b40782bedae9ff0721bfe1d7e4dd9dc1585dea684e57ba67c20/xxhash-3.7.0-cp311-cp311-win_amd64.whl", hash = "sha256:a7f25baec4c5d851d40718d6fae52285b31683093d4ff5207e63ab306ccf14a5", size = 31461, upload-time = "2026-04-25T11:06:17.104Z" }, + { url = "https://files.pythonhosted.org/packages/7c/c6/3957bfacfb706bd687be246dfa8dd60f8df97c44186d229f7fd6e26c4b7e/xxhash-3.7.0-cp311-cp311-win_arm64.whl", hash = "sha256:4c2454448ce847c72635827bb75c15c5a3434b03ee1afd28cb6dc6fb2597d830", size = 27746, upload-time = "2026-04-25T11:06:18.716Z" }, + { url = "https://files.pythonhosted.org/packages/f2/8a/51a14cdef4728c6c2337db8a7d8704422cc65676d9199d77215464c880af/xxhash-3.7.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:082c87bfdd2b9f457606c7a4a53457f4c4b48b0cdc48de0277f4349d79bb3d7a", size = 33357, upload-time = "2026-04-25T11:06:20.44Z" }, + { url = "https://files.pythonhosted.org/packages/b9/1b/0c2c933809421ffd9bf42b59315552c143c755db5d9a816b2f1ae273e884/xxhash-3.7.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:5e7ce913b61f35b0c1c839a49ac9c8e75dd8d860150688aed353b0ce1bf409d8", size = 30869, upload-time = "2026-04-25T11:06:21.989Z" }, + { url = "https://files.pythonhosted.org/packages/03/a8/89d5fdd6ee12d70ba99451de46dd0e8010167468dcd913ec855653f4dd50/xxhash-3.7.0-cp312-cp312-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:3beb1de3b1e9694fcdd853e570ee64c631c7062435d2f8c69c1adf809bc086f0", size = 194100, upload-time = "2026-04-25T11:06:23.586Z" }, + { url = "https://files.pythonhosted.org/packages/87/ee/2f9f2ed993e77206d1e66991290a1ebe22e843351ca3ebec8e49e01ba186/xxhash-3.7.0-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:f3e7b689c3bce16699efcf736066f5c6cc4472c3840fe4b22bd8279daf4abdac", size = 212977, upload-time = "2026-04-25T11:06:25.019Z" }, + { url = "https://files.pythonhosted.org/packages/de/60/5a91644615a9e9d4e42c2e9925f1908e3a24e4e691d9de7340d565bea024/xxhash-3.7.0-cp312-cp312-manylinux2014_armv7l.manylinux_2_17_armv7l.manylinux_2_31_armv7l.whl", hash = "sha256:a6545e6b409e3d5cbafc850fb84c55a1ca26ed15a6b11e3bf07a0e0cd84517c8", size = 236373, upload-time = "2026-04-25T11:06:26.482Z" }, + { url = "https://files.pythonhosted.org/packages/22/c0/f3a9384eaaed9d14d4d062a5d953aa0da489bfe9747877aa994caa87cd0b/xxhash-3.7.0-cp312-cp312-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:31ab1461c77a11461d703c88eb949e132a1c6515933cf675d97ec680f4bd18de", size = 212229, upload-time = "2026-04-25T11:06:28.065Z" }, + { url = "https://files.pythonhosted.org/packages/2e/67/02f07a9fd79726804190f2172c4894c3ed9a4ebccaca05653c84beb58025/xxhash-3.7.0-cp312-cp312-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:7c4d596b7676f811172687ec567cbafb9e4dea2f9be1bbb4f622410cb7f40f40", size = 445462, upload-time = "2026-04-25T11:06:30.048Z" }, + { url = "https://files.pythonhosted.org/packages/40/37/558f5a90c0672fc9b4402dc25d87ac5b7406616e8969430c9ca4e52ee74d/xxhash-3.7.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = 
"sha256:13805f0461cba0a857924e70ff91ae6d52d2598f79a884e788db80532614a4a1", size = 193932, upload-time = "2026-04-25T11:06:31.857Z" }, + { url = "https://files.pythonhosted.org/packages/d5/90/aaa09cd58661d32044dbbad7df55bbe22a623032b810e7ed3b8c569a2a6f/xxhash-3.7.0-cp312-cp312-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:1d398f372496152f1c6933a33566373f8d1b37b98b8c9d608fa6edc0976f23b2", size = 284807, upload-time = "2026-04-25T11:06:33.697Z" }, + { url = "https://files.pythonhosted.org/packages/d6/f3/53df3719ab127a02c174f0c1c74924fcd110866e89c966bc7909cfa8fa84/xxhash-3.7.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:d610aa62cdb7d4d497740741772a24a794903bf3e79eaa51d2e800082abe11e5", size = 210445, upload-time = "2026-04-25T11:06:35.488Z" }, + { url = "https://files.pythonhosted.org/packages/72/33/d219975c0e8b6fa2eb9ccd486fe47e21bf1847985b878dd2fbc3126e0d5c/xxhash-3.7.0-cp312-cp312-musllinux_1_2_armv7l.whl", hash = "sha256:073c23900a9fbf3d26616c17c830db28af9803677cd5b33aea3224d824111514", size = 241273, upload-time = "2026-04-25T11:06:37.24Z" }, + { url = "https://files.pythonhosted.org/packages/3e/50/49b1afe610eb3964cedcb90a4d4c3d46a261ee8669cbd4f060652619ae3c/xxhash-3.7.0-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:418a463c3e6a590c0cdc890f8be19adb44a8c8acd175ca5b2a6de77e61d0b386", size = 197950, upload-time = "2026-04-25T11:06:39.148Z" }, + { url = "https://files.pythonhosted.org/packages/c6/75/5f42a1a4c78717d906a4b6a140c6dbf837ab1f547a54d23c4e2903310936/xxhash-3.7.0-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:03f8ff4474ee61c845758ce00711d7087a770d77efb36f7e74a6e867301000b8", size = 210709, upload-time = "2026-04-25T11:06:40.958Z" }, + { url = "https://files.pythonhosted.org/packages/8a/85/237e446c25abced71e9c53d269f2cef5bab8a82b3f88a12e00c5368e7368/xxhash-3.7.0-cp312-cp312-musllinux_1_2_riscv64.whl", hash = "sha256:44fba4a5f1d179b7ddc7b3dc40f56f9209046421679b57025d4d8821b376fd8d", size = 275345, upload-time = "2026-04-25T11:06:42.525Z" }, + { url = "https://files.pythonhosted.org/packages/62/34/c2c26c0a6a9cc739bc2a5f0ae03ba8b87deb12b8bce35f7ac495e790dc6d/xxhash-3.7.0-cp312-cp312-musllinux_1_2_s390x.whl", hash = "sha256:31e3516a0f829d06ded4a2c0f3c7c5561993256bfa1c493975fb9dc7bfa828a1", size = 414056, upload-time = "2026-04-25T11:06:44.343Z" }, + { url = "https://files.pythonhosted.org/packages/a0/aa/5c58e9bc8071b8afd8dcf297ff362f723c4892168faba149f19904132bf4/xxhash-3.7.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:b59ee2ac81de57771a09ecad09191e840a1d2fae1ef684208320591055768f83", size = 191485, upload-time = "2026-04-25T11:06:46.262Z" }, + { url = "https://files.pythonhosted.org/packages/d4/69/a929cf9d1e2e65a48b818cdce72cb6b69eab2e6877f21436d0a1942aff43/xxhash-3.7.0-cp312-cp312-win32.whl", hash = "sha256:74bbd92f8c7fcc397ba0a11bfdc106bc72ad7f11e3a60277753f87e7532b4d81", size = 30671, upload-time = "2026-04-25T11:06:48.039Z" }, + { url = "https://files.pythonhosted.org/packages/b9/1b/104b41a8947f4e1d4a66ce1e628eea752f37d1890bfd7453559ca7a3d950/xxhash-3.7.0-cp312-cp312-win_amd64.whl", hash = "sha256:7bd7bc82dd4f185f28f35193c2e968ef46131628e3cac62f639dadf321cba4d1", size = 31514, upload-time = "2026-04-25T11:06:49.279Z" }, + { url = "https://files.pythonhosted.org/packages/98/a0/1fd0ea1f1b886d9e7c73f0397571e22333a7d79e31da6d7127c2a4a71d75/xxhash-3.7.0-cp312-cp312-win_arm64.whl", hash = "sha256:7d7148180ec99ba36585b42c8c5de25e9b40191613bc4be68909b4d25a77a852", size = 27761, upload-time = "2026-04-25T11:06:50.448Z" }, + { url = 
"https://files.pythonhosted.org/packages/c1/ca/d5174b4c36d10f64d4ca7050563138c5a599efb01a765858ddefc9c1202a/xxhash-3.7.0-cp313-cp313-android_21_arm64_v8a.whl", hash = "sha256:4b6d6b33f141158692bd4eafbb96edbc5aa0dabdb593a962db01a91983d4f8fa", size = 36813, upload-time = "2026-04-25T11:06:51.73Z" }, + { url = "https://files.pythonhosted.org/packages/41/d0/abc6c9d347ba1f1e1e1d98125d0881a0452c7f9a76a9dd03a7b5d2197f23/xxhash-3.7.0-cp313-cp313-android_21_x86_64.whl", hash = "sha256:845d347df254d6c619f616afa921331bada8614b8d373d58725c663ba97c3605", size = 35121, upload-time = "2026-04-25T11:06:53.048Z" }, + { url = "https://files.pythonhosted.org/packages/bf/11/4cc834eb3d79f2f2b3a6ef7324195208bcdfbdcf7534d2b17267aa5f3a8f/xxhash-3.7.0-cp313-cp313-ios_13_0_arm64_iphoneos.whl", hash = "sha256:fddbbb69a6fff4f421e7a0d1fa28f894b20112e9e3fab306af451e2dfd0e459b", size = 29624, upload-time = "2026-04-25T11:06:54.311Z" }, + { url = "https://files.pythonhosted.org/packages/23/83/e97d3e7b635fe73a1dfb1e91f805324dd6d930bb42041cbf18f183bc0b6d/xxhash-3.7.0-cp313-cp313-ios_13_0_arm64_iphonesimulator.whl", hash = "sha256:54876a4e45101cec2bf8f31a973cda073a23e2e108538dad224ba07f85f22487", size = 30638, upload-time = "2026-04-25T11:06:55.864Z" }, + { url = "https://files.pythonhosted.org/packages/f4/40/d84951d80c35db1f4c40a29a64a8520eea5d56e764c603906b4fe763580f/xxhash-3.7.0-cp313-cp313-ios_13_0_x86_64_iphonesimulator.whl", hash = "sha256:0c72fe9c7e3d6dfd7f1e21e224a877917fa09c465694ba4e06464b9511b65544", size = 33323, upload-time = "2026-04-25T11:06:57.336Z" }, + { url = "https://files.pythonhosted.org/packages/89/cc/c7dc6558d97e9ab023f663d69ab28b340ed9bf4d2d94f2c259cf896bb354/xxhash-3.7.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:a6d73a830b17ef49bc04e00182bd839164c1b3c59c127cd7c54fcb10c7ed8ee8", size = 33362, upload-time = "2026-04-25T11:06:58.656Z" }, + { url = "https://files.pythonhosted.org/packages/2a/6e/46b84017b1301d54091430353d4ad5901654a3e0871649877a416f7f1644/xxhash-3.7.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:91c3b07cf3362086d8f126c6aecd8e5e9396ad8b2f2219ea7e49a8250c318acd", size = 30874, upload-time = "2026-04-25T11:06:59.834Z" }, + { url = "https://files.pythonhosted.org/packages/df/5e/8f9158e3ab906ad3fec51e09b5ea0093e769f12207bfa42a368ca204e7ab/xxhash-3.7.0-cp313-cp313-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:50e879ebbac351c81565ca108db766d7832f5b8b6a5b14b8c0151f7190028e3d", size = 194185, upload-time = "2026-04-25T11:07:01.658Z" }, + { url = "https://files.pythonhosted.org/packages/f3/29/a804ded9f5d3d3758292678d23e7528b08fda7b7e750688d08b052322475/xxhash-3.7.0-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:921c14e93817842dd0dd9f372890a0f0c72e534650b6ab13c5be5cd0db11d47e", size = 213033, upload-time = "2026-04-25T11:07:03.606Z" }, + { url = "https://files.pythonhosted.org/packages/8b/91/1ce5a7d2fdc975267320e2c78fc1cecfe7ab735ccbcf6993ec5dd541cb2c/xxhash-3.7.0-cp313-cp313-manylinux2014_armv7l.manylinux_2_17_armv7l.manylinux_2_31_armv7l.whl", hash = "sha256:e64a7c9d7dfca3e0fafcbc5e455519090706a3e36e95d655cec3e04e79f95aaa", size = 236140, upload-time = "2026-04-25T11:07:05.396Z" }, + { url = "https://files.pythonhosted.org/packages/34/04/fd595a4fd8617b05fa27bd9b684ecb4985bfed27917848eea85d54036d06/xxhash-3.7.0-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:2220af08163baf5fa36c2b8af079dc2cbe6e66ae061385267f9472362dfd53c6", size = 212291, 
upload-time = "2026-04-25T11:07:06.966Z" }, + { url = "https://files.pythonhosted.org/packages/03/fb/f1a379cbc372ae5b9f4ab36154c48a849ca6ebe3ac477067a57865bf3bc6/xxhash-3.7.0-cp313-cp313-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:f14bb8b22a4a91325813e3d553b8963c10cf8c756cff65ee50c194431296c655", size = 445532, upload-time = "2026-04-25T11:07:08.525Z" }, + { url = "https://files.pythonhosted.org/packages/65/59/172424b79f8cfd4b6d8a122b2193e6b8ad4b11f7159bb3b6f9b3191329bb/xxhash-3.7.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:496736f86a9bedaf64b0dc70e3539d0766df01c71ea22032698e88f3f04a1ce9", size = 193990, upload-time = "2026-04-25T11:07:10.315Z" }, + { url = "https://files.pythonhosted.org/packages/b9/19/aeac22161d953f139f07ba5586cb4a17c5b7b6dff985122803bb12933500/xxhash-3.7.0-cp313-cp313-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:0ff71596bd79816975b3de7130ab1ff4541410285a3c084584eeb1c8239996fd", size = 284876, upload-time = "2026-04-25T11:07:12.15Z" }, + { url = "https://files.pythonhosted.org/packages/77/d5/4fd0b59e7a02242953da05ff679fbb961b0a4368eac97a217e11dae110c1/xxhash-3.7.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:1ad86695c19b1d46fe106925db3c7a37f16be37669dcf58dcc70a9dd6e324676", size = 210495, upload-time = "2026-04-25T11:07:13.952Z" }, + { url = "https://files.pythonhosted.org/packages/aa/fb/976a3165c728c7faf74aa1b5ab3cf6a85e6d731612894741840524c7d28c/xxhash-3.7.0-cp313-cp313-musllinux_1_2_armv7l.whl", hash = "sha256:970f9f8c50961d639cbd0d988c96f80ddf66006de93641719282c4fe7a87c5e6", size = 241331, upload-time = "2026-04-25T11:07:15.557Z" }, + { url = "https://files.pythonhosted.org/packages/4a/2c/6763d5901d53ac9e6ba296e5717ae599025c9d268396e8faa8b4b0a8e0ac/xxhash-3.7.0-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:5886ad85e9e347911783760a1d16cb6b393e8f9e3b52c982568226cb56927bdc", size = 198037, upload-time = "2026-04-25T11:07:17.563Z" }, + { url = "https://files.pythonhosted.org/packages/61/2b/876e722d533833f5f9a83473e6ba993e48745701096944e77bbecf29b2c3/xxhash-3.7.0-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:6e934bbae1e0ec74e27d5f0d7f37ef547ce5ff9f0a7e63fb39e559fc99526734", size = 210744, upload-time = "2026-04-25T11:07:19.055Z" }, + { url = "https://files.pythonhosted.org/packages/21/e6/d7e7baef7ce24166b4668d3c48557bb35a23b92ecadcac7e7718d099ab69/xxhash-3.7.0-cp313-cp313-musllinux_1_2_riscv64.whl", hash = "sha256:3b6b3d28228af044ebcded71c4a3dd86e1dbd7e2f4645bf40f7b5da65bb5fb5a", size = 275406, upload-time = "2026-04-25T11:07:20.908Z" }, + { url = "https://files.pythonhosted.org/packages/92/fe/198b3763b2e01ca908f2154969a2352ec99bda892b574a11a9a151c5ede4/xxhash-3.7.0-cp313-cp313-musllinux_1_2_s390x.whl", hash = "sha256:6be4d70d9ab76c9f324ead9c01af6ff52c324745ea0c3731682a0cf99720f1fe", size = 414125, upload-time = "2026-04-25T11:07:23.037Z" }, + { url = "https://files.pythonhosted.org/packages/3a/6d/019a11affd5a5499137cacca53808659964785439855b5aa40dfd3412916/xxhash-3.7.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:151d7520838d4465461a0b7f4ae488b3b00de16183dd3214c1a6b14bf89d7fb6", size = 191555, upload-time = "2026-04-25T11:07:24.991Z" }, + { url = "https://files.pythonhosted.org/packages/76/21/b96d58568df2d01533244c3e0e5cbdd0c8b2b25c4bec4d72f19259a292d7/xxhash-3.7.0-cp313-cp313-win32.whl", hash = "sha256:d798c1e291bffb8e37b5bbe0dda77fc767cd19e89cadaf66e6ed5d0ff88c9fe6", size = 30668, upload-time = "2026-04-25T11:07:26.665Z" 
}, + { url = "https://files.pythonhosted.org/packages/99/57/d849a8d3afa1f8f4bc6a831cd89f49f9706fbbad94d2975d6140a171988c/xxhash-3.7.0-cp313-cp313-win_amd64.whl", hash = "sha256:875811ba23c543b1a1c3143c926e43996eb27ebb8f52d3500744aa608c275aed", size = 31524, upload-time = "2026-04-25T11:07:27.92Z" }, + { url = "https://files.pythonhosted.org/packages/81/52/bacc753e92dee78b058af8dcef0a50815f5f860986c664a92d75f965b6a5/xxhash-3.7.0-cp313-cp313-win_arm64.whl", hash = "sha256:54a675cb300dda83d71daae2a599389d22db8021a0f8db0dd659e14626eb3ecc", size = 27768, upload-time = "2026-04-25T11:07:29.113Z" }, + { url = "https://files.pythonhosted.org/packages/1c/47/ddbd683b7fc7e592c1a8d9d65f73ce9ab513f082b3967eee2baf549b8fc6/xxhash-3.7.0-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:a3b19a42111c4057c1547a4a1396a53961dca576a0f6b82bfa88a2d1561764b2", size = 33576, upload-time = "2026-04-25T11:07:30.469Z" }, + { url = "https://files.pythonhosted.org/packages/07/f2/36d3310161db7f72efb4562aadde0ed429f1d0531782dd6345b12d2da527/xxhash-3.7.0-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:8f4608a06e4d61b7a3425665a46d00e0579122e1a2fae97a0c52953a3aad9aa3", size = 31123, upload-time = "2026-04-25T11:07:31.989Z" }, + { url = "https://files.pythonhosted.org/packages/0d/3f/75937a5c69556ed213021e43cbedd84c8e0279d0d74e7d41a255d84ba4b1/xxhash-3.7.0-cp313-cp313t-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:ad37c7792479e49cf96c1ab25517d7003fe0d93687a772ba19a097d235bbe41e", size = 196491, upload-time = "2026-04-25T11:07:33.358Z" }, + { url = "https://files.pythonhosted.org/packages/22/29/f10d7ff8c7a733d4403a43b9de18c8fabc005f98cec054644f04418659ee/xxhash-3.7.0-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:dc026e3b89d98e30a8288c95cb696e77d150b3f0fb7a51f73dcd49ee6b5577fa", size = 215793, upload-time = "2026-04-25T11:07:34.919Z" }, + { url = "https://files.pythonhosted.org/packages/8b/fd/778f60aa295f58907938f030a8b514611f391405614a525cccd2ffc00eb5/xxhash-3.7.0-cp313-cp313t-manylinux2014_armv7l.manylinux_2_17_armv7l.manylinux_2_31_armv7l.whl", hash = "sha256:c9b31ab1f28b078a6a1ac1a54eb35e7d5390deddd56870d0be3a0a733d1c321c", size = 237993, upload-time = "2026-04-25T11:07:36.638Z" }, + { url = "https://files.pythonhosted.org/packages/70/f5/736db5de387b4a540e37a05b84b40dc58a1ce974bfd2b4e5754ce29b68c3/xxhash-3.7.0-cp313-cp313t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:3bb5fd680c038fd5229e44e9c493782f90df9bef632fd0499d442374688ff70b", size = 214887, upload-time = "2026-04-25T11:07:38.564Z" }, + { url = "https://files.pythonhosted.org/packages/4d/aa/09a095f22fdb9a27fbb716841fbff52119721f9ca4261952d07a912f7839/xxhash-3.7.0-cp313-cp313t-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:030c0fd688fce3569fbb49a2feefd4110cbb0b650186fb4610759ecfac677548", size = 448407, upload-time = "2026-04-25T11:07:40.552Z" }, + { url = "https://files.pythonhosted.org/packages/74/8a/b745efeeca9e34a91c26fdc97ad8514c43d5a81ac78565cba80a1353870a/xxhash-3.7.0-cp313-cp313t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:5b1bde10324f4c31812ae0d0502e92d916ae8917cad7209353f122b8b8f610c3", size = 196119, upload-time = "2026-04-25T11:07:42.101Z" }, + { url = "https://files.pythonhosted.org/packages/8a/5c/0cfceb024af90c191f665c7933b1f318ee234f4797858383bebd1881d52f/xxhash-3.7.0-cp313-cp313t-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = 
"sha256:503722d52a615f2604f5e7611de7d43878df010dc0053094ef91cb9a9ac3d987", size = 286751, upload-time = "2026-04-25T11:07:43.568Z" }, + { url = "https://files.pythonhosted.org/packages/0b/0a/0793e405dc3cf8f4ebe2c1acec1e4e4608cd9e7e50ea691dabbc2a95ccbb/xxhash-3.7.0-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:c72500a3b6d6c30ebfc135035bcace9eb5884f2dc220804efcaaba43e9f611dd", size = 212961, upload-time = "2026-04-25T11:07:45.388Z" }, + { url = "https://files.pythonhosted.org/packages/0c/7e/721118ffc63bfff94aa565bcf2555a820f9f4bdb0f001e0d609bdfad70de/xxhash-3.7.0-cp313-cp313t-musllinux_1_2_armv7l.whl", hash = "sha256:43475925a766d01ca8cd9a857fd87f3d50406983c8506a4c07c4df12adcc867f", size = 243703, upload-time = "2026-04-25T11:07:47.053Z" }, + { url = "https://files.pythonhosted.org/packages/6e/18/16f6267160488b8276fd3d449d425712512add292ba545c1b6946bfdb7dd/xxhash-3.7.0-cp313-cp313t-musllinux_1_2_i686.whl", hash = "sha256:8d09dfd2ab135b985daf868b594315ebe11ad86cd9fea46e6c69f19b28f7d25a", size = 200894, upload-time = "2026-04-25T11:07:48.657Z" }, + { url = "https://files.pythonhosted.org/packages/2d/94/80ba841287fd97e3e9cac1d228788c8ef623746f570404961eec748ecb5c/xxhash-3.7.0-cp313-cp313t-musllinux_1_2_ppc64le.whl", hash = "sha256:c50269d0055ac1faecfd559886d2cbe4b730de236585aba0e873f9d9dadbe585", size = 213357, upload-time = "2026-04-25T11:07:50.257Z" }, + { url = "https://files.pythonhosted.org/packages/a1/7e/106d4067130c59f1e18a55ffadcd876d8c68534883a1e02685b29d3d8153/xxhash-3.7.0-cp313-cp313t-musllinux_1_2_riscv64.whl", hash = "sha256:1910df4756a5ab58cfad8744fc2d0f23926e3efcc346ee76e87b974abab922f4", size = 277600, upload-time = "2026-04-25T11:07:51.745Z" }, + { url = "https://files.pythonhosted.org/packages/c5/86/a081dd30da71d720b2612a792bfd55e45fa9a07ac76a0507f60487473c25/xxhash-3.7.0-cp313-cp313t-musllinux_1_2_s390x.whl", hash = "sha256:d006faf3b491957efcb433489be3c149efe4787b7063d5cddb8ddaefdc60e0c1", size = 416980, upload-time = "2026-04-25T11:07:53.504Z" }, + { url = "https://files.pythonhosted.org/packages/35/29/1a95221a029a3c1293773869e1ab47b07cbbdd82444a42809e8c60156626/xxhash-3.7.0-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:abb65b4e947e958f7b3b0d71db3ce447d1bc5f37f5eab871ce7223bda8768a04", size = 193840, upload-time = "2026-04-25T11:07:55.103Z" }, + { url = "https://files.pythonhosted.org/packages/c5/e0/db909dd0823285de2286f67e10ee4d81e96ad35d7d8e964ecb07fccd8af9/xxhash-3.7.0-cp313-cp313t-win32.whl", hash = "sha256:178959906cb1716a1ce08e0d69c82886c70a15a6f2790fc084fdd146ca30cd49", size = 30966, upload-time = "2026-04-25T11:07:56.524Z" }, + { url = "https://files.pythonhosted.org/packages/7b/ff/d705b15b22f21ee106adce239cb65d35067a158c630b240270f09b17c2e6/xxhash-3.7.0-cp313-cp313t-win_amd64.whl", hash = "sha256:2524a1e20d4c231d13b50f7cf39e44265b055669a64a7a4b9a2a44faa03f19b6", size = 31784, upload-time = "2026-04-25T11:07:57.758Z" }, + { url = "https://files.pythonhosted.org/packages/a2/1f/b2cf83c3638fd0588e0b17f22e5a9400bdfb1a3e3755324ac0aee2250b88/xxhash-3.7.0-cp313-cp313t-win_arm64.whl", hash = "sha256:37d994d0ffe81ef087bb330d392caa809bb5853c77e22ea3f71db024a0543dba", size = 27932, upload-time = "2026-04-25T11:07:59.109Z" }, + { url = "https://files.pythonhosted.org/packages/54/c1/e57ac7317b1f58a92bab692da6d497e2a7ce44735b224e296347a7ecc754/xxhash-3.7.0-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:ad3aa71e12ee634f22b39a0ff439357583706e50765f17f05550f92dbf128a23", size = 31232, upload-time = "2026-04-25T11:10:21.51Z" }, + { url = 
"https://files.pythonhosted.org/packages/4f/4e/075559bd712bc62e84915ea46bbee859f935d285659082c129bdbff679dd/xxhash-3.7.0-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:5de686e73690cdaf72b96d4fa083c230ec9020bcc2627ce6316138e2cf2fe2d1", size = 28553, upload-time = "2026-04-25T11:10:23.1Z" }, + { url = "https://files.pythonhosted.org/packages/92/ca/a9c78cb384d4b033b0c58196bd5c8509873cabe76389e195127b0302a741/xxhash-3.7.0-pp311-pypy311_pp73-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:7fbec49f5341bbdea0c471f7d1e2fb41ae8925af9b6f28025c28defd8eb94274", size = 41109, upload-time = "2026-04-25T11:10:25.022Z" }, + { url = "https://files.pythonhosted.org/packages/bd/b1/dfe2629f7c77eb2fa234c72ff537cdd64939763df704e256446ed364a16d/xxhash-3.7.0-pp311-pypy311_pp73-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:48b542c347c2089f43dc5a6db31d2a6f3cdb04ee33505ec6e9f653834dbb0bde", size = 36307, upload-time = "2026-04-25T11:10:26.949Z" }, + { url = "https://files.pythonhosted.org/packages/e7/f7/5a484afce0f48dd8083208b42e4911f290a82c7b52458ef2927e4d421a45/xxhash-3.7.0-pp311-pypy311_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:a169a036bed0995e090d1493b283cc2cc8a6f5046821086b843abefff80643bc", size = 32534, upload-time = "2026-04-25T11:10:29.01Z" }, + { url = "https://files.pythonhosted.org/packages/0f/5f/4acfcd490db9780cf36c58534d828003c564cde5350220a1c783c4d10776/xxhash-3.7.0-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:ec101643395d7f21405b640f728f6f627e6986557027d740f2f9b220955edafe", size = 31552, upload-time = "2026-04-25T11:10:30.727Z" }, ] [[package]] @@ -3580,11 +3606,40 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/de/7c/ba8ca8cbe9dbef8e83a95fc208fed8e6686c98b4719aaa0aa7f3d31fe390/zarr-3.1.6-py3-none-any.whl", hash = "sha256:b5a82c5079d1c3d4ee8f06746fa3b9a98a7d804300fa3f4be154362a33e1207e", size = 295655, upload-time = "2026-03-23T17:25:17.189Z" }, ] +[[package]] +name = "zensical" +version = "0.0.38" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "click" }, + { name = "deepmerge" }, + { name = "markdown" }, + { name = "pygments" }, + { name = "pymdown-extensions" }, + { name = "pyyaml" }, + { name = "tomli" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/1d/3d/96301349abd6e425b580f33474a51a5b6d68332ed538b8b6000497883794/zensical-0.0.38.tar.gz", hash = "sha256:e6fbf98dd851f5772d84648443e44fc8d8194ba0e09ec75c267fa033f6a0e43c", size = 3912956, upload-time = "2026-04-30T12:05:02.704Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/7d/4d/6c7111f9885dd128b7caf742a160041b01d53bd61e501b8ec19c597fe699/zensical-0.0.38-cp310-abi3-macosx_10_12_x86_64.whl", hash = "sha256:c1d498eecfba2d876ef6fb535fe867af5d752ea38551faab4bc70fd5f25ed5aa", size = 12666775, upload-time = "2026-04-30T12:04:21.522Z" }, + { url = "https://files.pythonhosted.org/packages/c8/8a/d1a8359b5308cf4b0859741acbc7e5cd90641d1e4591e3bd3ca688bb8038/zensical-0.0.38-cp310-abi3-macosx_11_0_arm64.whl", hash = "sha256:edb2e54f1d299a0b5b177fc55d15e198ccb0bf143991bb2f4b2d8db0a6c3b932", size = 12528871, upload-time = "2026-04-30T12:04:25.419Z" }, + { url = "https://files.pythonhosted.org/packages/34/8b/6a47e5065bd9baf161785f1afd2c6e67dd3a7eafccb7ed06e0c7efd7b424/zensical-0.0.38-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:8adc9e87d2d5921d9aa4204c4f7488b6349efd57916680d4905414e6461c942b", size = 12925558, 
upload-time = "2026-04-30T12:04:29.073Z" }, + { url = "https://files.pythonhosted.org/packages/62/2a/62338132326dbc81bfd45d3ba47440dbd689be6c2cccf75f0005c6d0183d/zensical-0.0.38-cp310-abi3-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:9576d21e3d5d6d6208df0873231838a3e42f05ba95316e4129df26a20edb8226", size = 12887161, upload-time = "2026-04-30T12:04:32.118Z" }, + { url = "https://files.pythonhosted.org/packages/04/b3/f4f0af1eb6caf2d163fb9ba97da4592c74f26fe77309093bec35d8dbab5c/zensical-0.0.38-cp310-abi3-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:d649045e59b6ecb0f543fddeed5b0dc4dab3fdeb0dae791d71b2be29335dd603", size = 13252488, upload-time = "2026-04-30T12:04:35.558Z" }, + { url = "https://files.pythonhosted.org/packages/9f/e4/d5329e20c9417ca4789150cf78c994e2489c0c8fd92f10d93fe13c9d71da/zensical-0.0.38-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:196aa6ffaf2e80a173233e5e639227e59437a2dc31849051901a9456960f5f1a", size = 12955366, upload-time = "2026-04-30T12:04:39.159Z" }, + { url = "https://files.pythonhosted.org/packages/38/26/11ca657164a2ca9347ffe665b57f5e788b628b6f21e7cf171cda7295a730/zensical-0.0.38-cp310-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:3a0d173f4402a6201d990f05cb766aad872f222fffd9022d42421b331f69c60c", size = 13101610, upload-time = "2026-04-30T12:04:42.531Z" }, + { url = "https://files.pythonhosted.org/packages/bd/c7/0247c1efff36914b8a720dbe4accc5e1065d4ae986a81c71fb69cb1cc3e8/zensical-0.0.38-cp310-abi3-musllinux_1_2_armv7l.whl", hash = "sha256:c6056675c5f9e2e00afe6770232213e7dcf07e7e87a5e278d0dd7dbbd8b52316", size = 13159871, upload-time = "2026-04-30T12:04:46.169Z" }, + { url = "https://files.pythonhosted.org/packages/b5/ec/5ff0d64e58f2f498ba1696de3dccf147aec024f374ece4ae55f1313ad3c2/zensical-0.0.38-cp310-abi3-musllinux_1_2_i686.whl", hash = "sha256:e447ca87827b7db7802a4b071247fb72968ab482f611eb8a951917f63b7784b2", size = 13311076, upload-time = "2026-04-30T12:04:49.826Z" }, + { url = "https://files.pythonhosted.org/packages/78/80/8bd9054e15ac992c911a87a9d2651aa3468bc370ad97084f9902f2c9f7e0/zensical-0.0.38-cp310-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:8b913573ec99171534f51f0a5ab2032eee5416981ba2fe502601c5ac5a6da898", size = 13237935, upload-time = "2026-04-30T12:04:53.104Z" }, + { url = "https://files.pythonhosted.org/packages/63/75/d81ca979bc770c0d678717687b9b9fdf1e3afc0e3d52b05092a0391866c8/zensical-0.0.38-cp310-abi3-win32.whl", hash = "sha256:a2eebc767037943f93fa6f5b74f409ad2ca53d1eda7776092ebb455d7b42eb67", size = 12228161, upload-time = "2026-04-30T12:04:56.641Z" }, + { url = "https://files.pythonhosted.org/packages/14/09/52965dcb9bbae6883a1981a23d926b6410fdf61bd83f399fc9acda5ccb98/zensical-0.0.38-cp310-abi3-win_amd64.whl", hash = "sha256:e91412a38c4a7099e498b656eaf858b1f9d6c3b09dab05a4bdc65a6c3b9a45a1", size = 12469561, upload-time = "2026-04-30T12:04:59.632Z" }, +] + [[package]] name = "zipp" -version = "3.23.0" +version = "3.23.1" source = { registry = "https://pypi.org/simple" } -sdist = { url = "https://files.pythonhosted.org/packages/e3/02/0f2892c661036d50ede074e376733dca2ae7c6eb617489437771209d4180/zipp-3.23.0.tar.gz", hash = "sha256:a07157588a12518c9d4034df3fbbee09c814741a33ff63c05fa29d26a2404166", size = 25547, upload-time = "2025-06-08T17:06:39.4Z" } +sdist = { url = "https://files.pythonhosted.org/packages/30/21/093488dfc7cc8964ded15ab726fad40f25fd3d788fd741cc1c5a17d78ee8/zipp-3.23.1.tar.gz", hash = "sha256:32120e378d32cd9714ad503c1d024619063ec28aad2248dc6672ad13edfa5110", size = 
25965, upload-time = "2026-04-13T23:21:46.6Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/2e/54/647ade08bf0db230bfea292f893923872fd20be6ac6f53b2b936ba839d75/zipp-3.23.0-py3-none-any.whl", hash = "sha256:071652d6115ed432f5ce1d34c336c0adfd6a884660d1e9712a256d3d3bd4b14e", size = 10276, upload-time = "2025-06-08T17:06:38.034Z" }, + { url = "https://files.pythonhosted.org/packages/08/8a/0861bec20485572fbddf3dfba2910e38fe249796cb73ecdeb74e07eeb8d3/zipp-3.23.1-py3-none-any.whl", hash = "sha256:0b3596c50a5c700c9cb40ba8d86d9f2cc4807e9bedb06bcdf7fac85633e444dc", size = 10378, upload-time = "2026-04-13T23:21:45.386Z" }, ]