Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: DimensionSystem and decomposition #16

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

nstarman
Copy link

@nstarman nstarman commented Mar 29, 2025

Growing out of discussion in #4.
This definitely needs the consensus of all stakeholder libraries as implementing this on Dimension(-adjacent) objects will require adding a new class: the dimension system.
There is no rush to get this in, just thought I'd formalize the ideas in code.

Method name in other libraries:

  • astropy: decompose
  • pint
  • unyt
  • unxt: decompose
  • [comment down below to be added]

@nstarman nstarman marked this pull request as draft March 29, 2025 17:17
@nstarman nstarman force-pushed the dimension-decompose branch 2 times, most recently from 609af76 to 2788bd9 Compare March 29, 2025 17:20
@mhvk
Copy link

mhvk commented Mar 29, 2025

I like the decompose method, but am less sure about the need for a DimensionSystem. Though I guess it can be used to verify that the tuple[Dimension, ...] is actually self-consistent?

A bit more generally, I think it is really good to have the standard bases listed as an example, but even better would be to have a different example worked out in a bit more detail. How would a c=G=h=1 system actually be implemented? Or one that includes velocity as a base (presumably then missing either length or time)?

Now on to something a bit more abstract - should there be an easy way to associate base units with dimensions? In particular, unit can logically also have a .decompose method, but it would take a set of base units. As you know, what we have in astropy works great for SI/cgs, but at best half-heartedly for anything more custom.

I guess a larger question would be whether it makes sense to have some form of reference implementation of all this... With, arguably more importantly, a reference test suite.

@nstarman
Copy link
Author

Though I guess it can be used to verify that the tuple[Dimension, ...] is actually self-consistent?

Yes! It's along the "parse, not validate" model, where instead of having decompose need to check every single time that the argument is self-consistent, it never needs to check, because DimensionSystem is the guarantor of self-consistency.

Now on to something a bit more abstract - should there be an easy way to associate base units with dimensions?

A UnitSystem? ... Yes. That would be good. Something like.

class UnitSystem(Protocol):
    ...
    @property
    def dimensionsystem(self) -> DimensionSystem:
        """DimensionSystem associated with the unit system"""

@@ -17,6 +17,70 @@ def __pow__(self, other: int, /) -> Self: ...
def __rmul__(self, other: Self, /) -> Self: ...
def __rtruediv__(self, other: Self, /) -> Self: ...

def decompose(self, dimension_system: "DimensionSystem", /) -> "dict[Dimension, float | int]":

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to stick to int | Fraction no? Don't know if float makes sense in this context.
Any examples ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also it will ease manipulation when going from one system to another manipulating only integers all the way.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would this be expected to return all the base units, or just the ones with nonzero exponents?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also what would be the expected decomposition of something like dBmW, or other equation based units. For example in the library I work there is also mechanics for completely new units, internally these are represented by specific bit sequences, which could be translated to the existing base units, but would be nonsensical and not translatable outside the string representation if the other libraries happened to represent them.

Also if this is to be a method in an implemented dimension class in a library, does it make sense to require it take as an argument a class defined in a higher level library. Wouldn't it be better to have it take a set(or Tuple, or list) of Dimensions so as to not define a separate class for that concept. That would seem more portable. Or not have an argument to that method and just return the dictionary of dimensions and exponents as defined by a library.

I guess I am not seeing the compelling case for DimensionSystem as of yet.

Copy link
Author

@nstarman nstarman Mar 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would this be expected to return all the base units, or just the ones with nonzero exponents?

I think that would be up to implementing libraries. We just specify the types.

Might be good to stick to int | Fraction no? Don't know if float makes sense in this context.
Any examples ?

A "natural" unit system taking h-bar, c, and G to be one and to define the base dimensions (a little strange, but not unreasonable) then $L_P = \sqrt{\frac{\hbar,G}{c^3}}$ has fractional exponents.
So Fraction would work in this case but int is insufficient.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@andrewgsavage
Copy link
Contributor

I feel like this is fairly niche, I don't think any existing libraries have this functionality ? You can do Planck units in pint, but it doesn't change the base dimensions; it'll still use length (but scaled to planck_length = (hbar * gravitational_constant / c ** 3) ** 0.5)

Could we set this as optional?

@mhvk
Copy link

mhvk commented Mar 31, 2025

I guess I agree with @andrewgsavage - certainly, for astropy, our physical_type was an afterthought for units.

It may be most fruitful to define the behaviour of units first, and then use the more exotic cases as examples we want to be sure to keep possible. Eg., decomposing to SI is likely available in all units packages, so that should be supported, but everything else would not seem essential. Maybe just like the Array API defines a couple of standard data types that have to be supported, we should similarly define a couple of unit (strings) that have to be supported, and define how units interact with each other?

p.s. Classes are nearly over, so I hope to have a bit more time to contribute instead of just commenting from the sidelines...

@nstarman
Copy link
Author

nstarman commented Mar 31, 2025

I'm very happy to strip the DimensionSystem from this PR and approach it at a later date!
As stated in the opening comment (and draft status), this was to codify an ongoing discussion, not for immediate merging.

In general, dimension systems can be useful for:

  1. defining unit systems by associating a unit to each dimension.
  2. dimensional analysis, using a clean definition of the base set of dimensions, e.g. Buckingham pi algs.
  3. Parsing, not validating, which can can have a large impact performance by pre-empting the need to check inputs.
  4. Easy equivalency checks between unit systems, since if each unit system has an associated dimension-system then equivalency between unit systems is just equality between dimension systems. This helps with converting a quantity between unit systems.

Also, there is prior art: https://docs.sympy.org/latest/modules/physics/units/dimensions.html#sympy.physics.units.dimensions.DimensionSystem

All that being said, it's very easy to start with the decompose() method sans dimension-system arguments and then add it in a future version.

@andrewgsavage
Copy link
Contributor

I think this would be a bit easier to think about if we did the analogous unit method first.

There's a few questions that come to mind:

  • Is decompose the most clear function name, maybe to_dimension_system?
  • Should Dimension('velocity') == Dimension('length') / Dimension('time')
  • How about Dimension('torque') == Dimension('energy')
  • Should the api specify how equality behaves?

this leads into the concept of quantity kinds.
pint only provides the decomposed dimensions so hasn't had to deal with this

@mhvk
Copy link

mhvk commented Apr 1, 2025

The power of concrete examples... If we are having dimensions, I'd like 'torque' to be different from 'energy and, my personal bugbear, 'pressure' not to be the same as 'energy density' even if their units decompose to the same (N/m2 and J/m3 both decompose to kg/(m s2)). But I'd also like 'velocity' to be 'length' / 'time'...
But I'm not sure whether I'd want J/m3 to be convertible to N/m2. In astropy, it is, but arguably it should be possible to to choose to convert to the same physical type.
I guess this comes down a bit to how one deals with composite units. In astropy, different units are kept separate, but the same ones are combined, so one can have "erg/s/cm2/Å" but not "erg/s/cm2/cm" -- the latter will become "erg/s/cm3".

Aside: I like trying to think through how to deal with physical quantities as best as possible, but worry a bit that it distracts from getting us to a minimal API that we can all agree on...

@nstarman nstarman force-pushed the dimension-decompose branch 2 times, most recently from 50f3d4d to 9195af7 Compare April 1, 2025 17:22
Signed-off-by: Nathaniel Starkman <[email protected]>
@nstarman nstarman force-pushed the dimension-decompose branch from 9195af7 to 9f5b897 Compare April 1, 2025 17:27
@nstarman
Copy link
Author

nstarman commented Apr 1, 2025

this leads into the concept of quantity kinds.

@andrewgsavage As I understood it from https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p3045r0.html there's a difference between Dimension and QuantityKind which is that

  • Dimension('velocity') == Dimension('length') / Dimension('time')
  • Dimension('torque') == Dimension('energy')

For QuantityKind these are not necessarily true.

@mhvk
Copy link

mhvk commented Apr 1, 2025

@nstarman - thanks for pointing back at that. I guess the conclusion is that both types of logic have their place, and that they are best represented by different classes.

@nstarman
Copy link
Author

nstarman commented Apr 1, 2025

Agreed. @andrewgsavage I'm a huge fan for the QuantityKind idea and think we should find some way to support it in the API at a level where Python libraries can still choose whether to use it. I think Dimensions, Units, and Quantities are the fundamental building blocks but that QuantityKinds provide an incredibly useful means to distinguish units and sensible math with Quantities, preventing incompatibly-kinded operations.

@nstarman
Copy link
Author

nstarman commented Apr 1, 2025

As for the name of the method, I'd love to get some inputs from other libraries about what they've implemented that's similar. Is there a common method name? Or is there a method name we'd all be willing to support?

@neutrinoceros
Copy link

unyt uses sympy objects to represents dimensions. Base dimensions are implemented as Symbol objects and composition gives derivatives (Pow, Mul, ...) objects. As such, dimensions get a simplify method, which I think is what comes closest to a decompose method, but in any case this is pretty far removed from a deliberate API (although I note that Unit.simplify also exists within unyt). It is not clear to me that unyt could support the proposed Dimension protocol without a substantial overhaul that I don't know anyone has the capacity to conduct, but I also don't think this should stop adoption for other implementers.

@lucascolley
Copy link
Member

It is not clear to me that unyt could support the proposed Dimension protocol without a substantial overhaul

Could you point to which parts of the protocol would be problematic?

@phlptp
Copy link

phlptp commented Apr 2, 2025

the units library I work with is primary in C++. The unit is stored as a 32 bit integer with bitfields representing the standard SI base units, along with count, radians, and currency. In addition it has 4 bit flags representing(iFlag - imaginary, eFlag-extra, perUnit, and equation). The unit object also includes a floating point scalar. The closest thing to a dimension is the base_unit which just includes the bitfield portion and no scalar. What I am doing in python is to just have the Dimension class be a unit, and force the scalar to be 1. Then for string interaction use a defaultunit method which can take any "dimension" or "measurementType" and get the default unit for it. As well as the reverse. I will probably add some ability to add the math operations in that as well to handle more complex strings. I haven't added a decompose method yet but will probably just have the method output the values for the base units as a dictionary. Not clear what should be done with the flags as in some cases those could be used as a discriminator of the "Kinds" mentioned earlier, and I doubt they have equivalents in other libraries.

@phlptp
Copy link

phlptp commented Apr 2, 2025

I should also mention that the python portion is pretty new so I can match any naming conventions pretty easily since everything in python is still pretty experimental and easy to tweak.

@neutrinoceros
Copy link

Could you point to which parts of the protocol would be problematic?

all of it, potentially, since we don't have our own Dimension class(es) at the moment, but the slot is already taken in a way. Adopting any standardized API there without breaking backwards compatibility will be a challenge, I expect.

@mhvk
Copy link

mhvk commented Apr 2, 2025

While I like the idea of a Dimension and the logic we are pursuing here, I think I should second @neutrinoceros here that it implies work in implementation for benefits that are not all that obvious -- at least, in astropy.units, what comes closest (PhysicalType) is really an afterthought that, as far as I know, is basically not used. In practice, the important class is Quantity, and unit operations that help make it function. Given this, while I think it is important to define what a Dimension should do, I'm not sure it really belongs in a minimal, required API.

@lucascolley
Copy link
Member

It sounds sensible to me to leave out Dimension for now, then. The most important missing piece right now is Unit conversion #6, right?

@mhvk
Copy link

mhvk commented Apr 2, 2025

Indeed, I think a close look at the API for unit conversion makes sense. I think it should start with a function, though, not a method on Quantity -- at least, it seems obvious to me we want some way to convert a number/array from one unit to another.

@andrewgsavage
Copy link
Contributor

we have unit conversion using asquantity

@andrewgsavage
Copy link
Contributor

It sounds sensible to me to leave out Dimension for now, then.

That seems reasonable. We should have a way to denote optional classes

@mhvk
Copy link

mhvk commented Apr 2, 2025

we have unit conversion using asquantity

Did we decide that asquantity(quantity, unit) does unit conversion? I see it defined in #9, and I think it is fairly logical, but maybe it needs some thought. Which perhaps is better done there, though... Comment forthcoming...

@lucascolley
Copy link
Member

lucascolley commented Apr 2, 2025

we have unit conversion using asquantity

I guess so, although we haven't yet spelled out the semantics. At https://github.com/quantity-dev/quantity-array/blob/ff14ef6e3e8925ce6e8fd44d79ae44d681190ccb/src/quantity_array/__init__.py#L745-L755,

def multiply(x1, x2, /, *args, **kwargs):
    x1 = asarray(x1)
    x2 = asarray(x2)
    units = x1.units * x2.units
    x1_magnitude = xp.asarray(x1.magnitude, copy=True)
    x2_magnitude = x2.m_as(x1.units)
    magnitude = xp.multiply(x1_magnitude, x2_magnitude, *args, **kwargs)
    return ArrayUnitQuantity(magnitude, units)

could then look like

def multiply(x1, x2, /, *args, **kwargs):
    x1 = as_quantity_array(x1)  # interpret `x1` as an `ArrayQuantity`
    x2 = as_quantity_array(x2)  # interpret `x2` as an `ArrayQuantity`
    unit = x1.unit * x2.unit
    x1_value = xp.asarray(x1.value, copy=True)
    x2_value = mn.asquantity(x2, x1.unit).value  # use `mn.asquantity` for conversion
    value = xp.multiply(x1_value, x2_value, *args, **kwargs)
    return as_quantity_array(value, unit)

where as_quantity_array would be a wrapper around mn.asquantity and xp.asarray to return an intersection type of Array and Quantity, ArrayQuantity, which implements the https://data-apis.org/array-api/draft/API_specification/array_object.html operators, attributes and methods.


Side note: is mn = q.__metrology_namespace__ a nice convention to mimic xp = x.__array_namespace__?

@mhvk
Copy link

mhvk commented Apr 2, 2025

See #9 (comment) for discussion about asquantity - my sense is to just keep that for creation, and have a different function for conversion. Basic worry is actually elucidated here: your as_quantity_array will assign dimensionless to any given V, while keeping the unit for Q, while asquantity would do different things for both.

@phlptp

This comment was marked as off-topic.

@mhvk
Copy link

mhvk commented Apr 2, 2025

@phlptp - that's a useful summary, but I think this belongs more in #9 rather than this discussion of Dimension!

@nstarman
Copy link
Author

nstarman commented Apr 2, 2025

See #9 (comment) for discussion about asquantity - my sense is to just keep that for creation, and have a different function for conversion. Basic worry is actually elucidated here: your as_quantity_array will assign dimensionless to any given V, while keeping the unit for Q, while asquantity would do different things for both

@mhvk I strongly agree! Also we should move this discussion elsewhere.

@andrewgsavage
Copy link
Contributor

unyt's dimension is recognised as a dimension with the current api:

@runtime_checkable
class Dimension(Protocol):
    def __mul__(self, other: Self, /) -> Self: ...
    def __truediv__(self, other: Self, /) -> Self: ...
    def __pow__(self, other: int, /) -> Self: ...

    def __rmul__(self, other: Self, /) -> Self: ...
    def __rtruediv__(self, other: Self, /) -> Self: ...

isinstance(unyt.dimensions.jerk, Dimension)
True

We could leave the decompose method to a future version

@mhvk
Copy link

mhvk commented Apr 6, 2025

While I agree with leaving decomposition for later, isn't a problem with the definition you gavethat isinstance(1.0, Dimension) would yield True too? (Or in fact an instance of almost any numerical class...)

@nstarman
Copy link
Author

nstarman commented Apr 6, 2025

Yes. This was part of the subject of #4

@andrewgsavage
Copy link
Contributor

While I agree with leaving decomposition for later, isn't a problem with the definition you gavethat isinstance(1.0, Dimension) would yield True too? (Or in fact an instance of almost any numerical class...)

yes - but you've still standardised on .dimension and asdimension('length'), which I think is preferable to not having these features.
the other options are to remove dimension from a first version, or create a first version that is incompatible with unyt

@nstarman
Copy link
Author

nstarman commented Apr 9, 2025

@neutrinoceros what are the prospects for Dimension in unyt?
Would it be worthwhile to create a sympy subclass called Dimension (or adapt / push PRs to https://docs.sympy.org/latest/modules/physics/units/dimensions.html#sympy.physics.units.dimensions.Dimension) that adheres to the Dimension API we're working on here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants