
@bocklund
Member

Implements pydantic models for the currently supported datasets. This should make it easier to implement new types of datasets that conform to normal expectations, make the datasets more usable when implementing residual functions, and provide a sane path for refactoring/removing/deprecating dataset types if that's ever needed*.

In this PR, pydantic is used only for validation within the dataset loader. It replaces check_dataset and clean_dataset, which are now deprecated and will be removed in ESPEI 0.11; their validations were migrated to the pydantic models. For now, the pydantic objects are not used anywhere else in the code. New code should use the pydantic objects instead of the arbitrary dictionary representations, and existing code should migrate as it is updated.
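
As a rough illustration of validating at load time, here is a minimal sketch. The model name `ExampleDataset`, its fields, and the `load_dataset` helper are all hypothetical and are not the schema or API introduced by this PR; only `BaseModel` and `ValidationError` are real pydantic names.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class ExampleDataset(BaseModel):
    # Hypothetical fields for illustration only, not ESPEI's actual schema.
    components: List[str]
    phases: List[str]
    output: str
    values: List[float]


def load_dataset(raw: dict) -> ExampleDataset:
    """Validate a raw dictionary once, at load time (hypothetical helper)."""
    return ExampleDataset(**raw)


good = {
    "components": ["CU", "MG"],
    "phases": ["LIQUID"],
    "output": "HM_MIX",
    "values": [-1000.0, -2000.0],
}
ds = load_dataset(good)  # downstream code gets typed attributes, e.g. ds.values

bad = dict(good, values="not a list")
try:
    load_dataset(bad)
    rejected = False
except ValidationError:
    # The faulty dataset never reaches the rest of the code.
    rejected = True
```

Code that receives `ds` can rely on attribute access and known types instead of re-checking dictionary keys.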

Some things to do before merging:

  • check existing datasets against strict mode to make sure we didn't miss anything. We probably won't use strict mode in production, both to ease development and to let users keep their own arbitrary keys (comment, etc.)
  • try to make some purposely faulty datasets - are the error messages useful? Can we make them more useful? Does adding Field(..., description="...") help at all?
  • (it would be nice to) update the web documentation for datasets to include typed versions. Ideally we could autogenerate typed schemas with descriptions, but I am skeptical that anything autogenerated would be more human-readable. The docs could perhaps be shortened by presenting the typed version with explanations of the fields, followed by several copy-pastable examples, in place of the long, mixed prose that is there now.
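
One way to try the first two items is sketched below. It assumes that "strict mode" means rejecting unknown keys (pydantic's `extra="forbid"` setting), which may not be exactly what is meant here. The model names and fields are hypothetical. The sketch only collects the structured errors pydantic reports so they can be inspected for usefulness.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class LenientDataset(BaseModel):
    # Hypothetical fields; the descriptions are illustrative.
    components: List[str] = Field(..., description="Element names, e.g. ['CU', 'MG']")
    values: List[float] = Field(..., description="Measured values")


class StrictDataset(LenientDataset, extra="forbid"):
    """Same fields, but unknown keys are treated as errors."""


def error_locations(model, raw: dict) -> list:
    """Return the location of each validation error (empty list if valid)."""
    try:
        model(**raw)
    except ValidationError as exc:
        return [tuple(err["loc"]) for err in exc.errors()]
    return []


with_comment = {"components": ["CU", "MG"], "values": [1.0], "comment": "my note"}
faulty = {"components": ["CU", "MG"], "values": ["abc"]}

lenient_locs = error_locations(LenientDataset, with_comment)  # extra key is ignored
strict_locs = error_locations(StrictDataset, with_comment)    # extra key is reported
faulty_locs = error_locations(LenientDataset, faulty)         # points at the bad entry
```

Each entry in `exc.errors()` also carries a `msg` and a `type`, which is the material to review when judging whether the messages are helpful to users.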

*Eventually I'd like to:

  • merge activity datasets into equilibrium property datasets
  • refactor fixed configuration datasets to use more uniform configurations and occupancies, that is: require sublattices with a single occupant to be wrapped in brackets like any other sublattice, i.e. change the type from list[list[float | list[float]]] to just list[list[list[float]]]
  • reconsider how broadcasting works in general
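
To make the proposed occupancy change concrete, here is a small sketch of converting the current mixed form into the uniform one. The function name `normalize_occupancies` is hypothetical and not part of ESPEI; it assumes the outer list runs over configurations and the inner list over sublattices.

```python
from typing import List, Union

# Current form: a single-occupant sublattice may be a bare float.
MixedOccupancies = List[List[Union[float, List[float]]]]
# Proposed form: every sublattice is a list of site fractions.
UniformOccupancies = List[List[List[float]]]


def normalize_occupancies(occupancies: MixedOccupancies) -> UniformOccupancies:
    """Wrap bare single-occupant entries in a list; leave lists unchanged."""
    return [
        [list(sub) if isinstance(sub, (list, tuple)) else [float(sub)] for sub in config]
        for config in occupancies
    ]


mixed = [[1.0, [0.5, 0.5]], [[0.25, 0.75], 1.0]]
uniform = normalize_occupancies(mixed)
# uniform == [[[1.0], [0.5, 0.5]], [[0.25, 0.75], [1.0]]]
```

With the uniform type, code that consumes occupancies no longer needs to branch on whether a sublattice entry is a float or a list.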
