Description
As coordinates in LAMMPS dump files may be scaled (xs, ys, zs), unwrapped (xu, yu, zu) or both (xsu, ysu, zsu), it's desirable to support reading and writing those formats.
I have a primitive implementation for reading dumps with scaled and/or unwrapped coordinates at PhilLecl/read-dump-su.
This implementation only uses the other coordinate formats to calculate the unscaled, wrapped coordinates (snap.position) and image flags (snap.image), if those are not present in the schema.
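As a rough illustration of that derivation (not the code in the branch above), the sketch below recovers wrapped, unscaled coordinates and image flags from scaled and/or unwrapped values. It assumes an orthogonal box; the function names unscale and wrap_unwrapped and the lo/hi arguments are hypothetical.

```python
import numpy as np

def unscale(scaled, lo, hi):
    """Map scaled coordinates (xs or xsu style) to absolute ones.

    Assumes an orthogonal box with lower corner `lo` and upper corner `hi`.
    """
    lo = np.asarray(lo, dtype=float)
    length = np.asarray(hi, dtype=float) - lo
    return lo + np.asarray(scaled, dtype=float) * length

def wrap_unwrapped(unwrapped, lo, hi):
    """Split unwrapped coordinates into wrapped positions and image flags."""
    unwrapped = np.asarray(unwrapped, dtype=float)
    lo = np.asarray(lo, dtype=float)
    length = np.asarray(hi, dtype=float) - lo
    # Number of whole box lengths each atom sits away from the primary cell.
    image = np.floor((unwrapped - lo) / length).astype(int)
    wrapped = unwrapped - image * length
    return wrapped, image

# xsu-style input: unscale first, then wrap.
lo, hi = [0.0, 0.0, 0.0], [10.0, 10.0, 10.0]
position, image = wrap_unwrapped(unscale([[1.25, -0.1, 0.3]], lo, hi), lo, hi)
```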
Before creating a more advanced solution, the following points should be considered/discussed:
- Backwards compatibility: snap.position should continue to expose the unscaled, wrapped coordinates (corresponding to x, y, z in a dump file), and the image flags should remain accessible at snap.image.
- How should redundant information be handled? To my knowledge, dumps may contain multiple coordinate representations as well as both unwrapped coordinates and image flags.
- Should there be a verification that the dump is internally consistent, i.e. that the information of all contained representations is equivalent?
- Should the scaled and/or unwrapped formats be accessible as a property of a snapshot? Or is it sufficient to have methods to convert from/to them?
- In case of the former: Should the "alternative" representations be mutable? This would be a nice convenience but introduces potential downsides discussed below.
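To make the consistency question more concrete, one possible check is sketched below: the wrapped positions shifted by the image flags should reproduce the unwrapped coordinates. This assumes an orthogonal box; the function name is_consistent and the tolerance are hypothetical, and some tolerance is probably needed because dump files store values with limited printed precision.

```python
import numpy as np

def is_consistent(position, image, unwrapped, lo, hi, atol=1e-6):
    """Return True if position + image * box length matches unwrapped.

    Assumes an orthogonal box; `atol` is an arbitrary illustrative tolerance.
    """
    length = np.asarray(hi, dtype=float) - np.asarray(lo, dtype=float)
    rebuilt = np.asarray(position, dtype=float) + np.asarray(image) * length
    return bool(np.allclose(rebuilt, unwrapped, atol=atol))
```

A reader could either raise an error or emit a warning when such a check fails; which of the two is appropriate is part of the open question.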
My proposal:
snap.position should be an object of a subclass of numpy.ndarray, so that:
- It can continue to be used as the wrapped, unscaled values
- The other formats are available as methods of snap.position
- In the simple case, these would be read-only, in which case they should probably have names that clearly reflect that (e.g. position.to_scaled() or position.copy_as_scaled()).
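A minimal sketch of the read-only variant might look as follows. The class name Position and the lo/hi attributes are hypothetical, and the conversion assumes an orthogonal box; only the subclassing hooks (__new__ and __array_finalize__) are standard NumPy.

```python
import numpy as np

class Position(np.ndarray):
    """Hypothetical ndarray subclass holding wrapped, unscaled coordinates."""

    def __new__(cls, values, lo, hi):
        obj = np.asarray(values, dtype=float).view(cls)
        obj.lo = np.asarray(lo, dtype=float)
        obj.hi = np.asarray(hi, dtype=float)
        return obj

    def __array_finalize__(self, obj):
        # Also called for views and slices, so carry the box bounds along.
        if obj is None:
            return
        self.lo = getattr(obj, "lo", None)
        self.hi = getattr(obj, "hi", None)

    def to_scaled(self):
        """Return a new plain array of scaled coordinates (orthogonal box)."""
        return (np.asarray(self) - self.lo) / (self.hi - self.lo)

pos = Position([[2.5, 9.0, 3.0]], lo=[0, 0, 0], hi=[10, 10, 10])
scaled = pos.to_scaled()   # independent copy; writing to it leaves pos untouched
```

Because to_scaled() returns a fresh plain array, existing code that treats snap.position as an ordinary array would keep working, and the method name signals that edits to the result do not propagate back.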
The alternative representations could also be properties such as snap.position.scaled, which are mutually updating so that each representation can be written to while maintaining internal consistency.
For example, if one changes an atom's xu, the change should be reflected in the corresponding x, xs, xsu, and ix for consistency. In a triclinic box, updates to y, z (and the corresponding other formats) may also be required when x is changed.
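The coupling between components in a triclinic box can be seen from the mapping between scaled and absolute coordinates. The sketch below assumes the LAMMPS convention with tilt factors xy, xz, yz and true (not bounding-box) lower bounds; the function name triclinic_to_cartesian and its argument layout are hypothetical.

```python
import numpy as np

def triclinic_to_cartesian(scaled, lo, lengths, tilt):
    """Convert scaled coordinates to absolute ones in a triclinic box.

    Assumes x = xlo + xs*lx + ys*xy + zs*xz,
            y = ylo + ys*ly + zs*yz,
            z = zlo + zs*lz.
    """
    lx, ly, lz = lengths
    xy, xz, yz = tilt
    h = np.array([[lx, xy, xz],
                  [0.0, ly, yz],
                  [0.0, 0.0, lz]])
    # Each row of `scaled` is multiplied by h (written as scaled @ h.T).
    return np.asarray(lo, dtype=float) + np.asarray(scaled, dtype=float) @ h.T

box = dict(lo=[0.0, 0.0, 0.0], lengths=(10.0, 10.0, 10.0), tilt=(2.0, 0.0, 0.0))
before = triclinic_to_cartesian([[0.5, 0.5, 0.5]], **box)
after = triclinic_to_cartesian([[0.5, 0.6, 0.5]], **box)  # only ys changed
```

In this example only the scaled y value differs between the two calls, yet both the absolute x and y components change, which is the kind of cross-component update a mutually updating design would have to handle.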
This could be achieved by creating a subclass of numpy.ndarray and overriding __setitem__.
However, this would introduce more code complexity, storage overhead and (in case of frequent updates to the coordinates) CPU overhead, so I would be wary of choosing this option.
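To give a feel for that trade-off, here is a deliberately naive sketch of the __setitem__ approach. The class name SyncedPosition and its attributes are hypothetical; it assumes an orthogonal box and recomputes the whole scaled array on every write.

```python
import numpy as np

class SyncedPosition(np.ndarray):
    """Hypothetical subclass keeping a scaled companion array in sync."""

    def __new__(cls, values, lo, hi):
        obj = np.array(values, dtype=float).view(cls)
        obj._lo = np.asarray(lo, dtype=float)
        obj._length = np.asarray(hi, dtype=float) - obj._lo
        obj.scaled = (np.asarray(obj) - obj._lo) / obj._length
        return obj

    def __array_finalize__(self, obj):
        # Views and slices get no companion array in this sketch, so writes
        # made through them are NOT propagated to the parent's `scaled`.
        self._lo = getattr(obj, "_lo", None)
        self._length = getattr(obj, "_length", None)
        self.scaled = None

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        if self.scaled is not None:
            # Full recomputation on every write: simple, but costly.
            self.scaled[...] = (np.asarray(self) - self._lo) / self._length

pos = SyncedPosition([[2.5, 9.0, 3.0]], lo=[0, 0, 0], hi=[10, 10, 10])
pos[0, 0] = 5.0   # pos.scaled is refreshed as a side effect
```

Even this small version shows the extra storage, the per-write cost, and the gap around views; writes in the opposite direction (to pos.scaled) are not handled at all here.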
In a somewhat less complicated solution, one could also store only the representation that was most recently written to and calculate the other representations only when they are accessed.
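One way this lazy variant could be sketched is shown below. The class name LazyCoordinates and its properties are hypothetical, only the scaled and unscaled representations are covered, and an orthogonal box is assumed.

```python
import numpy as np

class LazyCoordinates:
    """Store only the most recently written representation."""

    def __init__(self, position, lo, hi):
        self._lo = np.asarray(lo, dtype=float)
        self._length = np.asarray(hi, dtype=float) - self._lo
        self._kind = "unscaled"
        self._data = np.array(position, dtype=float)

    @property
    def unscaled(self):
        if self._kind == "unscaled":
            return self._data.copy()
        # Stored data is scaled; convert on access.
        return self._lo + self._data * self._length

    @unscaled.setter
    def unscaled(self, values):
        self._kind = "unscaled"
        self._data = np.array(values, dtype=float)

    @property
    def scaled(self):
        if self._kind == "scaled":
            return self._data.copy()
        # Stored data is unscaled; convert on access.
        return (self._data - self._lo) / self._length

    @scaled.setter
    def scaled(self, values):
        self._kind = "scaled"
        self._data = np.array(values, dtype=float)

coords = LazyCoordinates([[2.5, 9.0, 3.0]], lo=[0, 0, 0], hi=[10, 10, 10])
first = coords.scaled                  # computed on access
coords.scaled = [[0.5, 0.5, 0.5]]      # now the stored representation
second = coords.unscaled               # computed from the scaled data
```

The getters return copies, so in-place edits such as coords.scaled[0, 0] = 0.1 would be silently lost; only whole-array assignment is supported in this sketch, which is one of the usability costs of the simpler design.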