sethaxen
Collaborator

@sethaxen sethaxen commented Oct 16, 2025

This PR adds a keyword refdims to DimTable, defaulting to an empty tuple. All refdims passed through the keyword are included as table columns. Fixes #884

Potentially breaking changes:

  • dims field added to DimTable
  • refdims columns included by default (no longer the case now that the default is an empty tuple)

Example

julia> using DimensionalData, Tables

julia> ds = DimStack((; x=DimArray(randn(5, 3), (X, Y)), y=DimArray(randn(5), X)))  # current behavior unchanged
┌ 5×3 DimStack ┐
├──────────────┴───────────────────── dims ┐
   X,  Y
├────────────────────────────────── layers ┤
  :x eltype: Float64 dims: X, Y size: 5×3
  :y eltype: Float64 dims: X size: 5
└──────────────────────────────────────────┘

julia> Tables.columntable(DimTable(ds))
(X = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5], Y = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3], x = [0.8254699243160034, 0.6563247573123833, -1.015386149269186, -0.20779307600510832, 0.8077917439418265, -0.2589884103396112, -0.28982813765235693, -0.4140129099684021, -0.009627277680257368, 0.9864538849444385, 0.6227934921148799, -0.39095938820818243, 1.0829181416944083, -0.9513581890871674, -0.7324766125110088], y = [-1.1164293882611314, -0.5407934437151065, -1.7982872571605262, -0.10874883722506151, 0.7559998068264897, -1.1164293882611314, -0.5407934437151065, -1.7982872571605262, -0.10874883722506151, 0.7559998068264897, -1.1164293882611314, -0.5407934437151065, -1.7982872571605262, -0.10874883722506151, 0.7559998068264897])

julia> Tables.columntable(DimTable(ds[X(3)])) # refdims NOT included by default
(Y = Base.OneTo(3), x = [-1.015386149269186, -0.4140129099684021, 1.0829181416944083], y = [-1.7982872571605262, #undef, #undef])

julia> Tables.columntable(DimTable(ds[X(3)]; refdims=(X(3:3),)))  # manually added refdims
(Y = [1, 2, 3], X = [3, 3, 3], x = [-1.015386149269186, -0.4140129099684021, 1.0829181416944083], y = [-1.7982872571605262, -1.7982872571605262, -1.7982872571605262])

 end
-function DimTable(xs::Vararg{AbstractDimArray}; layernames=nothing, mergedims=nothing)
+function DimTable(
+    xs::Vararg{AbstractDimArray};
Collaborator Author

It's not clear from the documentation what assumptions this method makes about xs, so it's possible there's a mistake in this method.


end

_dims(t::DimTable) = getfield(t, :dims)
Collaborator Author

Is it at all weird that dims(t::DimTable) will return fewer dims than the actual dims included in the table (because it just forwards to the parent)?

Owner

I didn't even know it did that. We can fix these things and merge to breaking instead?

Collaborator Author

It didn't do that until this PR, since previously the table's dims were the parent dims, but now additional dims may be included.

Owner

Ahh, you mean the refdims? Yeah, how do we keep that separate?

Collaborator Author

2 ways I can think of:

  • Keep dims(::DimTable) forwarding to the parent and add the method refdims(t::DimTable) = otherdims(dims(t), _dims(t)). Maybe not the right way to go if the parent is a slice of a DimMatrix where both dimensions have the same dim, but that kind of thing in general may not be well supported.
  • Remove the dims field and add a refdims and refdimarrays field. Then refdims(t::DimTable) = getfield(t, :refdims). I started implementing this version originally and abandoned it because it makes the column indexing more complicated, but I could bring it back.
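
The first option can be sketched using only functions that already exist in DimensionalData (dims, refdims, otherdims). The tuple all_dims below is just a stand-in for the proposed dims field and is an assumption for illustration, not the actual implementation in this PR:

```julia
using DimensionalData

# A 5×3 array with explicit lookups on both dimensions.
A = DimArray(randn(5, 3), (X(1:5), Y(1:3)))

# Slicing with an integer index drops X from dims and records it in refdims.
slice = A[X(3)]

parent_dims = dims(slice)        # only Y remains
parent_refdims = refdims(slice)  # the sliced-away X

# Stand-in for the dims stored on the table: parent dims plus refdims.
all_dims = (parent_dims..., parent_refdims...)

# Option 1: recover the refdims as "everything that is not a parent dim".
recovered = otherdims(all_dims, parent_dims)
```

Because otherdims matches by dimension type, this only works when refdims and parent dims never share a dimension type, which is the caveat mentioned in the first bullet.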

Owner

Can we not just do refdims(dt::DimTable) = refdims(parent(dt))? Maybe I'm missing something.

Collaborator Author

@sethaxen sethaxen Oct 20, 2025

I guess it depends on what dims(::DimTable) and refdims(::DimTable) should mean. With this PR, a user can provide arbitrary refdims to the DimTable, so they may not even be in the parent. But should refdims(::DimTable) return those user-provided refdims (currently only stored in (::DimTable).dims) or those of the parent? And should dims(::DimTable) return just the dims of the parent, or all dimensions corresponding to columns (which, with this PR, also include user-provided refdims)?

codecov bot commented Oct 16, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.89%. Comparing base (6db30de) to head (bb56926).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1119      +/-   ##
==========================================
- Coverage   86.90%   86.89%   -0.02%     
==========================================
  Files          55       55              
  Lines        5338     5341       +3     
==========================================
+ Hits         4639     4641       +2     
- Misses        699      700       +1     

src/tables.jl Outdated
Comment on lines 62 to 63
- `refdims`: Additional reference dimensions to add to the table, defaults to the reference
dimensions of the input.
Collaborator

I think I would prefer if this was a bool that determines whether or not refdims are included as columns

Collaborator Author

Makes sense, though the current approach would support only including a subset of refdims, which maybe is useful in some cases? What other methods implemented here produce refdims besides slicing?
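
For what it's worth, the two spellings would not have to be mutually exclusive. A minimal sketch, assuming a hypothetical helper resolve_refdims (not part of DimensionalData or of this PR) that normalizes either a Bool or a tuple of dimensions:

```julia
using DimensionalData

# Hypothetical helper, for illustration only.
# A Bool switches the parent's refdims on or off as a whole.
resolve_refdims(A, flag::Bool) = flag ? refdims(A) : ()
# A tuple is taken as given, so a subset or arbitrary refdims can be passed.
resolve_refdims(A, rd::Tuple) = rd

A = DimArray(randn(5, 3), (X(1:5), Y(1:3)))
s = A[X(3)]                         # X becomes a refdim of the slice

none = resolve_refdims(s, false)    # empty tuple
all_rd = resolve_refdims(s, true)   # whatever refdims(s) holds
custom = resolve_refdims(s, (X(3:3),))  # user-supplied tuple passes through
```

The empty tuple dispatches to the Tuple method, so () as the default would keep working unchanged.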

Owner

@rafaqz rafaqz Oct 16, 2025

I don't mind the current syntax with a tuple, as it's the same as the constructor syntax and it's flexible.

But to merge this to main, () would need to be the default.

We can then change that on breaking.

Owner

@rafaqz rafaqz Oct 16, 2025

@sethaxen The CF standards have their own refdims concept, so a Rasters.jl Raster can be loaded with refdims already in place.

Development

Successfully merging this pull request may close these issues.

Make refdims table columns

3 participants