Skip to content

Conversation

omkar-ethz
Copy link
Member

@omkar-ethz omkar-ethz commented Oct 15, 2025

Description

Depends on #2282

Motivation

With the above PR, history for new records will be included in the datasets response. The migration script is needed to backfill the older history records as well.

Fixes

Changes:

  • Added up and down migration function for legacy history <-> generic history
  • Added a util function to convert obsolete history to generic history, used in up migration
  • Modified migrate:db:[up|down] to include npm run build, as the migration script depends on compiled dataset/utils module

Tests included

  • Included for each change/fix?
  • Passing?

Documentation

  • swagger documentation updated (required for API changes)
  • official documentation updated

Testing

Tested locally that up / down scripts preserve records across migration.

@omkar-ethz omkar-ethz marked this pull request as ready for review October 15, 2025 13:36
@omkar-ethz omkar-ethz requested a review from a team as a code owner October 15, 2025 13:36
@omkar-ethz omkar-ethz force-pushed the v3-history-migration branch from daa3dae to 8ee342b Compare October 16, 2025 08:25
@minottic
Copy link
Member

is there a way to do this with an aggregation pipeline (or at least with a cursor)? I know it might duplicate logic, but I think to find and then loop is fairly inefficient

@omkar-ethz
Copy link
Member Author

is there a way to do this with an aggregation pipeline (or at least with a cursor)? I know it might duplicate logic, but I think to find and then loop is fairly inefficient

find returns a cursor. It would have been memory inefficient if I did a toArray on the cursor.
As for doing the entire migration with an aggregation pipeline, i think the logic is complex enough, (e.g. in the up case to compute the intersection of fields in datasetlifecycle.previous/current value that have changed; or in the down case to reconstruct the previousValue using datasetSnapshot), that it might be too complex to express this using mongodb operators.. do you think it's worth trying?
As a small optimization, i replaced the insertOne with insertMany in the up case.

Base automatically changed from v3-history to master October 16, 2025 09:29
@omkar-ethz omkar-ethz force-pushed the v3-history-migration branch from 2c4cf76 to da1b775 Compare October 16, 2025 09:34
@minottic
Copy link
Member

find returns a cursor

ah, you are right, since this is not using mongoose but the mongo driver, my bad.

I gave a quick try to see how the aggregation pipeline would look, and it's indeed a bit complex and prone to typos. Happy to use this then, let's reconsider if necessary after we test the migration

Copy link
Member

@minottic minottic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor, comment, happy to approve otherwise

@omkar-ethz omkar-ethz merged commit d57621b into master Oct 16, 2025
13 checks passed
@omkar-ethz omkar-ethz deleted the v3-history-migration branch October 16, 2025 11:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants