Skip to content

[CORE-580] Enable Compact Data Table Migration from Workspace Settings #3407

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 18 commits into
base: develop
Choose a base branch
from

Conversation

kevinmarete
Copy link
Contributor

Ticket: https://broadworkbench.atlassian.net/browse/CORE-580

Enable Compact Data Table Migration when CompactDataTables Setting is Enabled
This PR replaces #3400


PR checklist

  • Include the JIRA issue number in the PR description and title
  • Make sure Swagger is updated if API changes
    • ...and Orchestration's Swagger too!
  • If you changed anything in model/, then you should publish a new official rawls-model and perform the corresponding dependency updates as specified in the README:
    • in the automation subdirectory
    • in workbench-libs
    • in firecloud-orchestration
  • Get two thumbsworth of PR review
  • Verify all tests go green, including CI tests
  • Squash commits and merge to develop (branches are automatically deleted after merging)
  • Inform other teams of any substantial changes via Slack and/or email

@kevinmarete kevinmarete self-assigned this Jul 15, 2025
@kevinmarete kevinmarete marked this pull request as ready for review July 15, 2025 06:17
@kevinmarete kevinmarete requested a review from a team as a code owner July 15, 2025 06:17
Copy link
Contributor

@davidangb davidangb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am curious what issues you ran into in changing CompactDataTablesConfig(enabled: Boolean) to something like CompactDataTablesConfig(enabled: Boolean, migrationStatus: Option[SomeStatusEnum[); I still think having an enum dedicated to tracking a migration would be cleaner than reusing/overloading the pending state of settings.

From my earlier comment, I would really love to see the EntityManager check for migration status and prevent access to data tables while a migration is running. I am fine if that work happens in a follow-on PR ... but I think exploring that implementation would shed light on THIS pr's implementation. EntityManager currently retrieves the workspace setting and returns the appropriate EntityProvider for that setting. How could we modify it to prevent access during a migration?

  1. EntityManager could also check for any pending settings and prevent data table access when settings are pending. This would prevent access to data tables while a user is changing bucket lifecycle rules or soft-delete policy.
  2. EntityManager could retrieve the CompactDataTablesSetting as it does now, but enhance to check both its enabled value and its new migrationStatus value, and prevent access if migrationStatus shows a migration in process

@kevinmarete
Copy link
Contributor Author

I am curious what issues you ran into in changing CompactDataTablesConfig(enabled: Boolean) to something like CompactDataTablesConfig(enabled: Boolean, migrationStatus: Option[SomeStatusEnum[); I still think having an enum dedicated to tracking a migration would be cleaner than reusing/overloading the pending state of settings.

From my earlier comment, I would really love to see the EntityManager check for migration status and prevent access to data tables while a migration is running. I am fine if that work happens in a follow-on PR ... but I think exploring that implementation would shed light on THIS pr's implementation. EntityManager currently retrieves the workspace setting and returns the appropriate EntityProvider for that setting. How could we modify it to prevent access during a migration?

  1. EntityManager could also check for any pending settings and prevent data table access when settings are pending. This would prevent access to data tables while a user is changing bucket lifecycle rules or soft-delete policy.
  2. EntityManager could retrieve the CompactDataTablesSetting as it does now, but enhance to check both its enabled value and its new migrationStatus value, and prevent access if migrationStatus shows a migration in process

In the previous PR, I had added state to the setting, but found it unnecessary as explained here #3407 (comment), there was already pre-built tracking of pending settings, as they were being executed. Therefore, it was simpler to re-use that logic and incorporate it in the migration and as you have suggested in the EntityManager. I have gone ahead with option 1, 67ce049 but ensured to only check for pendingSettings of the CompactDataTables type.

Copy link
Contributor

@davidangb davidangb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think it would be better to have an enum dedicated to this specific setting instead of reusing the pending/applied state - any refactoring of how we persist workspace settings could break the migration. BUT ... I won't block the PR on that, and this only needs to function until we get all workspaces migrated and then we don't need it any more.

What about removing the migration API endpoint, and making it so the only way to trigger a migration is by changing the setting? Would that simplify anything for you?

@kevinmarete
Copy link
Contributor Author

I still think it would be better to have an enum dedicated to this specific setting instead of reusing the pending/applied state - any refactoring of how we persist workspace settings could break the migration. BUT ... I won't block the PR on that, and this only needs to function until we get all workspaces migrated and then we don't need it any more.

What about removing the migration API endpoint, and making it so the only way to trigger a migration is by changing the setting? Would that simplify anything for you?

So far it works fine, so we can keep it, but we can reconsider if it becomes problematic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants