918 ability to freeze datasets or version #941

Merged
longshuicy merged 101 commits into main from 918-ability-to-freeze-datasets-or-version
Jun 28, 2024

Conversation

longshuicy
Member

@longshuicy longshuicy commented Feb 23, 2024

This PR adds the ability to "release" a dataset. Once a dataset is released, the dataset document is moved to a new collection, "freeze_datasets", with two additional fields: a frozen flag indicating that it has been released, and a version number.
The "dataset" and "freeze_datasets" collections are then joined using a Mongo view to provide the combined list of datasets for client/frontend consumption. The same approach applies to files, folders, metadata, visualizations, and thumbnails. For details, please refer to the diagrams.
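
For readers unfamiliar with Mongo views, the join described above could look roughly like the sketch below. This is an illustration only, not the code in this PR: the helper names `union_view_pipeline` and `simulate_view` are hypothetical, and using the `$unionWith` aggregation stage to build the view is an assumption. The second function is a pure-Python stand-in that shows what a client reading the view would receive.

```python
from typing import Any

Document = dict[str, Any]


def union_view_pipeline(frozen_collection: str) -> list[dict[str, Any]]:
    """Pipeline for a view that appends frozen documents to the live ones.

    Assumption: the view is defined on the live collection and pulls in the
    frozen collection with $unionWith.
    """
    return [
        # Give live documents an explicit frozen=False when the field is absent.
        {"$addFields": {"frozen": {"$ifNull": ["$frozen", False]}}},
        # Append every document from the frozen collection.
        {"$unionWith": {"coll": frozen_collection}},
    ]


def simulate_view(live: list[Document], frozen: list[Document]) -> list[Document]:
    """In-memory stand-in for the view's result, for illustration only."""
    flagged = [{**doc, "frozen": doc.get("frozen") or False} for doc in live]
    return flagged + list(frozen)
```

With pymongo, a view like this could be created with `db.create_collection("datasets_view", viewOn="dataset", pipeline=union_view_pipeline("freeze_datasets"))`, where `datasets_view` is a made-up name.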

Backend change:

  • New model for frozen datasets, and new dataset views
  • New models for frozen files, folders, metadata, visualizations (config & data), and thumbnails, with new views that join the frozen and regular collections accordingly
  • Change all related GET endpoints to use the joined views so all versions are visible; POST, PUT, and PATCH still point to the original dataset, so frozen versions stay unchanged
  • Add new endpoints to freeze a dataset.
    • Once it's frozen, the associated files, folders (parent folders recursively), metadata, thumbnails, and visualizations are all copied and linked.
    • Use an "origin_id" to link each item to its origin within each category (dataset, file, folder, vis, metadata, thumbnail)
    • The dataset version increments via the frozen_version_num field
  • Additional endpoints to list all frozen datasets, get a specific version, and get the latest version
  • Simple pytest tests
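
The freeze step in the list above can be sketched in memory as follows. This is a simplified illustration under stated assumptions, not the PR's implementation: `freeze_dataset` and `latest_version` are hypothetical helpers, a plain list stands in for the frozen collection, a counter stands in for ObjectId generation, and numbering versions from 1 is assumed. Only the field names `origin_id`, `frozen`, and `frozen_version_num` come from the description. The sketch covers the dataset document only; files, folders, metadata, visualizations, and thumbnails would be copied the same way.

```python
import copy
import itertools
from typing import Any, Optional

Document = dict[str, Any]
_ids = itertools.count(1000)  # stand-in for MongoDB ObjectId generation


def freeze_dataset(dataset: Document, frozen_store: list[Document]) -> Document:
    """Copy `dataset` into `frozen_store` as the next released version."""
    previous = [
        d["frozen_version_num"]
        for d in frozen_store
        if d["origin_id"] == dataset["_id"]
    ]
    snapshot = copy.deepcopy(dataset)  # later edits to the live copy must not leak in
    snapshot.update(
        _id=next(_ids),                # the frozen copy gets its own id
        origin_id=dataset["_id"],      # link back to the live dataset
        frozen=True,
        frozen_version_num=max(previous, default=0) + 1,  # assumed to start at 1
    )
    frozen_store.append(snapshot)
    return snapshot


def latest_version(origin_id: Any, frozen_store: list[Document]) -> Optional[Document]:
    """Return the highest-numbered frozen version of a dataset, if any."""
    versions = [d for d in frozen_store if d["origin_id"] == origin_id]
    return max(versions, key=lambda d: d["frozen_version_num"], default=None)
```

Freezing the same live dataset twice yields versions 1 and 2 that share one `origin_id`, and edits made to the live dataset between the two calls appear only in the second snapshot.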

Frontend changes (see screenshots):

[Four screenshots of the frontend changes]

Todos:

- Visualization
- Thumbnail
- Public datasets


Clowder V2 Dataset Version
Clowder V2 Dataset Version (1)
Clowder V2 Dataset Version (2)

@longshuicy longshuicy linked an issue Feb 23, 2024 that may be closed by this pull request
@ddey2
Member

ddey2 commented May 31, 2024

It says "Viewing dataset version -999" when I change the status to 'PUBLIC'. Maybe we want to say "current version"?
Screenshot 2024-05-31 at 11 41 05 AM

frontend/src/components/datasets/OtherMenu.tsx (outdated, resolved)
frontend/src/components/datasets/DatasetVersions.tsx (outdated, resolved)
backend/app/models/datasets.py (resolved)
backend/app/models/datasets.py (resolved)
frontend/src/components/datasets/PublicDatasetVersions.tsx (outdated, resolved)
Member

@lmarini lmarini left a comment

Some minor UI comments:

  • When looking at a version, the pill with the version number next to the title pushes the tabs to the right. Try switching between current and a version and look at the tabs.

  • When loading the dataset page, we always show the notification saying which version we are viewing. So by default it shows "Viewing dataset version current unreleased." Do we always want to pop this up? (Testing on the delete branch, maybe it's just there.)

  • Want to try and remove the gray lines below every version? You have enough white space. Not sure it is needed.

  • When looking at a specific version there is a little arrow that shows up to the right of the tabs. It's like the tabs think there is one more tab to the right.

@longshuicy
Member Author

Being a little picky

I was thinking maybe just putting the version name beside the dataset name?

Right now, if I switch between the current version and a released one, the alignment of the menu bar below (Files, Metadata, etc.) changes. Screenshot 2024-05-31 at 11 36 31 AM Screenshot 2024-05-31 at 11 36 39 AM

Good catch. The alignment of the menu bar has been fixed. It wasn't related to the version pill but to the tabs being scrollable: an extra scroll button ">" was causing the alignment issue. Please check again :-)

We had a discussion about the best way of displaying the dataset version, and this implementation came out of it. For the current version, we display nothing; for a released version, we put a tab next to the dataset name, as it is now.

@longshuicy
Member Author

Some minor UI comments:

  • When looking at a version, the pill with the version number next to the title pushes the tabs to the right. Try switching between current and a version and look at the tabs.
  • When loading the dataset page, we always show the notification saying which version we are viewing. So by default it shows "Viewing dataset version current unreleased." Do we always want to pop this up? (Testing on the delete branch, maybe it's just there.)
  • Want to try and remove the gray lines below every version? You have enough white space. Not sure it is needed.
  • When looking at a specific version there is a little arrow that shows up to the right of the tabs. It's like the tabs think there is one more tab to the right.
  1. The tab position issue is fixed now. It was caused by the number of tab items changing while autoscroll was in place. We had never had a chance to see the switch between a view with more tabs and one with fewer.
  2. I changed the wording to "Latest" and also turned off the snackbar notification for the latest version. Not sure if that's what you meant, though; we can discuss it further.
  3. The latest version doesn't have any gray lines. Could you pull again? Note that in the "delete dataset versions" PR the look of that section will change further, i.e., how it gets highlighted.
  4. The arrow is fixed together with the tab position (1.).

@longshuicy
Member Author

longshuicy commented Jun 4, 2024

It says "Viewing dataset version -999" when I change the status to 'PUBLIC'. Maybe we want to say "current version"?

Good catch. I fixed this bug. Now when it's latest version, I'm not showing any notification.
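
The rule after this fix can be sketched as below. This is only an illustration: the real logic lives in the TypeScript frontend, the names `CURRENT_VERSION` and `version_notification` are hypothetical, and treating -999 as a sentinel for the current, unreleased dataset is an inference from the bug report above rather than something stated in the PR.

```python
from typing import Optional

# Assumption: -999 marks the current, unreleased dataset rather than a real version.
CURRENT_VERSION = -999


def version_notification(version_num: int) -> Optional[str]:
    """Return the notification text for a version, or None when nothing is shown."""
    if version_num == CURRENT_VERSION:
        return None  # latest/current version: no notification
    return f"Viewing dataset version {version_num}"
```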

@longshuicy longshuicy requested review from lmarini and ddey2 June 4, 2024 16:00
@ddey2
Member

ddey2 commented Jun 7, 2024

This looks good to me. @longshuicy you might want to resolve the conflicts. You can do it before merging too. You might get some more conflicts :)

* fix bug with get/delete a specific version of dataset; frontend delete button is working

* dataset versioning styling

* public dataset versioning styling

* delete thumbnail and vis endpoint considering versioning

* might be a bug deleting thumbnails

* switch the order

* write utility functions to help manage delete

* backend delete is configured

* frontend delete wired in correctly

* delete modal color and prompt updated

* add docstrings

* add last modified

* add specific pytest

* add visualization tests to dataset versioning

* add more pytest after deleting the latest dataset
* first draft of the versioning documentation

* update documentation

* wording
@tcnichol
Contributor

A question about datasets and the dataset status.

I created a dataset and went through a few versions, each one adding files, metadata, etc.

I took the latest version and I made it 'public.'

The latest version is visible on the public page, but none of the previous versions.

It also looks like once a dataset is frozen, it can no longer be shared or have its status changed.

Does this seem like something that we might want users to be able to do? I could imagine a case where someone might decide they want to show previous versions to users. Or not?

@tcnichol
Contributor

Something else I noticed.

I have a dataset with 5 versions plus latest.

For the latest, I changed the dataset from 'PUBLIC' to 'PRIVATE.'

On the 'public' page, version 5 is no longer visible.

Should an older version show up on the 'public' page if it was made public, but a later version is not?

@longshuicy
Member Author

Something else I noticed.

I have a dataset with 5 versions plus latest.

For the latest, I changed the dataset from 'PUBLIC' to 'PRIVATE.'

On the 'public' page, version 5 is no longer visible.

Should an older version show up on the 'public' page if it was made public, but a later version is not?

I created an issue: #1110

Contributor

@tcnichol tcnichol left a comment

Marking approved. Made a comment about public datasets and previous versions that will be resolved in a later issue, so this one is good to go.

Member

@ddey2 ddey2 left a comment

looks good to me

@longshuicy longshuicy merged commit 001c686 into main Jun 28, 2024
6 checks passed
@longshuicy longshuicy deleted the 918-ability-to-freeze-datasets-or-version branch June 28, 2024 15:32
Development

Successfully merging this pull request may close these issues.

Ability to freeze datasets or version
4 participants