Support reading from / writing to scaled or unwrapped coordinates #51

As coordinates in LAMMPS dump files may be scaled (`xs`, `ys`, `zs`), unwrapped (`xu`, `yu`, `zu`) or both (`xsu`, `ysu`, `zsu`), it's desirable to support reading and writing those formats.

I have a primitive implementation for reading dumps with scaled and/or unwrapped coordinates at [PhilLecl/read-dump-su](https://github.com/PhilLecl/lammpsio/tree/read-dump-su).
This implementation uses the other coordinate formats only to calculate the unscaled, wrapped coordinates (`snap.position`) and image flags (`snap.image`) when those are not present in the schema.
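
To make the conversion concrete, here is a minimal sketch of recovering wrapped coordinates and image flags from scaled, unwrapped columns. It is not taken from the linked branch; it assumes an orthorhombic box, and the function name `from_scaled_unwrapped` and its arguments are hypothetical.

```python
import numpy as np


def from_scaled_unwrapped(xsu, low, high):
    """Return (position, image) from scaled, unwrapped coordinates.

    Hypothetical helper; assumes an orthorhombic box with corners
    `low` and `high`, so each component can be treated independently.
    """
    xsu = np.asarray(xsu, dtype=float)
    low = np.asarray(low, dtype=float)
    lengths = np.asarray(high, dtype=float) - low

    # The integer part of a scaled, unwrapped coordinate is the image flag.
    image = np.floor(xsu).astype(int)
    # The fractional part is the scaled, wrapped coordinate in [0, 1).
    position = low + (xsu - image) * lengths
    return position, image
```

A triclinic box would additionally need the tilt factors, since the components are then no longer independent.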

Before creating a more advanced solution, the following points should be considered/discussed:

- Backwards compatibility: `snap.position` should continue to expose the unscaled and wrapped coordinates (corresponding to `x`, `y`, `z` in a dump file), and the image flags should remain accessible at `snap.image`.
- How should redundant information be handled? To my knowledge, dumps may contain multiple coordinate representations as well as both unwrapped coordinates and image flags.
  - Should there be a verification that the dump is internally consistent, i.e. that the information of all contained representations is equivalent?
- Should the scaled and/or unwrapped formats be accessible as a property of a snapshot? Or is it sufficient to have methods to convert from/to them?
  - In case of the former: Should the "alternative" representations be mutable? This would be a nice convenience but introduces potential downsides discussed below.
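
As a rough illustration of the consistency check raised above, the following sketch compares unwrapped coordinates against wrapped coordinates plus image flags. It assumes an orthorhombic box; the function name `is_consistent` and the default tolerance are hypothetical and would need tuning to the precision actually written in the dump.

```python
import numpy as np


def is_consistent(position, image, unwrapped, lengths, atol=1e-6):
    """Check that unwrapped == position + image * lengths within `atol`.

    Hypothetical helper; assumes an orthorhombic box with edge
    lengths `lengths`. A tolerance is used because the columns of a
    text dump are written with finite precision.
    """
    position = np.asarray(position, dtype=float)
    image = np.asarray(image)
    lengths = np.asarray(lengths, dtype=float)

    expected = position + image * lengths
    return bool(np.allclose(unwrapped, expected, rtol=0.0, atol=atol))
```

Whether a failed check should raise, warn, or be ignored is part of the open question.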

My proposal:

- `snap.position` should be an object of a subclass of `numpy.ndarray`, so that:
  - It can continue to be used as the wrapped, unscaled values
  - The other formats are available as methods of `snap.position`
  - In the simple case, these would be read-only, in which case they should probably have names that clearly reflect that (e.g. `position.to_scaled()` or `position.copy_as_scaled()`).
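
The read-only variant of this proposal could look roughly like the sketch below. The class name `Position`, the `low`/`high` attributes, and the method signatures are all assumptions made for illustration, and the box is taken to be orthorhombic.

```python
import numpy as np


class Position(np.ndarray):
    """Wrapped, unscaled coordinates with conversion methods (sketch)."""

    def __new__(cls, values, low, high):
        obj = np.asarray(values, dtype=float).view(cls)
        # Box corners are stored on the array; orthorhombic box assumed.
        obj.low = np.asarray(low, dtype=float)
        obj.high = np.asarray(high, dtype=float)
        return obj

    def __array_finalize__(self, obj):
        # Called for views and slices too, so propagate the box if present.
        if obj is None:
            return
        self.low = getattr(obj, "low", None)
        self.high = getattr(obj, "high", None)

    def to_scaled(self):
        """Return a new plain array of scaled coordinates."""
        return (np.asarray(self) - self.low) / (self.high - self.low)

    def to_unwrapped(self, image):
        """Return a new plain array of unwrapped coordinates."""
        return np.asarray(self) + np.asarray(image) * (self.high - self.low)
```

Because both methods return new plain arrays, writing to their results does not affect the stored positions, which is what the `to_` prefix is meant to signal.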

The alternative representations could also be properties such as `snap.position.scaled`, which are mutually updating so that each representation can be written to while maintaining internal consistency.  
For example, if one changes an atom's `xu`, the change should be reflected in the corresponding `x`, `xs`, `xsu`, and `ix` for consistency. In a triclinic box, the tilt factors couple the components, so changing one component may also require updates to the others (and to the corresponding other formats): for example, changing `zs` also shifts `x` and `y`.
This could be achieved by creating a subclass of `numpy.ndarray` and overriding `__setitem__`.
However, this would introduce more code complexity, storage overhead and (in case of frequent updates to the coordinates) CPU overhead, so I would be wary of choosing this option.
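
To illustrate how the components couple in a triclinic box, here is a small sketch of the scaled-to-Cartesian mapping. It assumes the LAMMPS convention for the box vectors, a = (lx, 0, 0), b = (xy, ly, 0), c = (xz, yz, lz); the function name `scaled_to_cartesian` is hypothetical.

```python
import numpy as np


def scaled_to_cartesian(scaled, low, lengths, tilt):
    """Map scaled coordinates to Cartesian ones in a triclinic box.

    Hypothetical helper; `lengths` is (lx, ly, lz) and `tilt` is
    (xy, xz, yz), following the LAMMPS box-vector convention.
    """
    lx, ly, lz = lengths
    xy, xz, yz = tilt
    # Columns of h are the box vectors a, b, c.
    h = np.array([[lx, xy, xz],
                  [0.0, ly, yz],
                  [0.0, 0.0, lz]])
    scaled = np.asarray(scaled, dtype=float)
    # Each row s of `scaled` maps to low + h @ s.
    return np.asarray(low, dtype=float) + scaled @ h.T
```

With non-zero `xz` and `yz`, a change in the third scaled component alone moves all three Cartesian components, so a mutually updating implementation would have to recompute whole rows rather than single entries.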

In a somewhat less complicated solution, one could also store only the representation that was most recently written to and calculate the other representations only when they are accessed.
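
A minimal sketch of this lazy variant, limited to two representations (wrapped and scaled) and an orthorhombic box, might look as follows. The class name `LazyPosition` and its properties are hypothetical.

```python
import numpy as np


class LazyPosition:
    """Store only the most recently written representation (sketch)."""

    def __init__(self, low, high):
        self._low = np.asarray(low, dtype=float)
        self._lengths = np.asarray(high, dtype=float) - self._low
        self._kind = None    # "wrapped" or "scaled"
        self._values = None  # data in the representation named by _kind

    @property
    def wrapped(self):
        if self._kind == "scaled":
            # Converted on access; nothing is cached.
            return self._low + self._values * self._lengths
        return self._values

    @wrapped.setter
    def wrapped(self, values):
        self._kind = "wrapped"
        self._values = np.array(values, dtype=float)

    @property
    def scaled(self):
        if self._kind == "wrapped":
            return (self._values - self._low) / self._lengths
        return self._values

    @scaled.setter
    def scaled(self, values):
        self._kind = "scaled"
        self._values = np.array(values, dtype=float)
```

One caveat of this design: only whole-array assignment switches the stored representation. In-place edits to a converted array (for example `p.scaled[0, 0] = 0.1` while the wrapped form is stored) act on a temporary copy and are lost.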