Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

migrate data pivot to visual #1114

Open
wants to merge 40 commits into
base: main
Choose a base branch
from
Open

migrate data pivot to visual #1114

wants to merge 40 commits into from

Conversation

shapiromatron
Copy link
Owner

@shapiromatron shapiromatron commented Sep 28, 2024

The DataPivot was a separate visualization type that even had a multi-table inheritance pattern, but in reality it is just a specific visualization type. This PR migrates all data pivots to be a specific visualization type, instead of a standalone model.

The main reason for the migration was delete a bunch of redundant code, and simplify cases where we want to show visualizations and data pivots on the same page, which would require 5+ queries instead of just a single query.

Migrate from this:

summary <django app>
├── Visual
└── DataPivot
    ├── DataPivotQuery
    └── DataPivotUpload

To this:

summary <django app>
└── Visual

One of the trickiest parts of the migration was migrating a few additional settings from DataPivot into the prefilters field in the Visual model; there's a little extra complexity in setting up the DataPivotQuery form that effectively unpacks and repacks this information when building the user-interface. I think/hope I added enough tests to cover the logic.

Data Pivot File are now migrated to datasets instead of a excel-file on a visual. Timestamps were preserved with the migration.

A few additional changes were also made with this migration:

  • Change all forms in summary to use fields instead of exclude
  • Added debug badge to many visual detail pages
  • Added a URL redirect to go from old Data Pivot URL to the new visual-based URL
  • Added faceted counts to django admin views for summary visuals and tables

@shapiromatron shapiromatron marked this pull request as ready for review January 23, 2025 04:33
Copy link
Collaborator

@caseyhans caseyhans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a hefty PR and covers a lot of changes, great stuff. Most of this looks good. I noticed the LabeledItems are not migrating correctly, and the LabeledItemListView (and admin list view) are still trying to prefetch information for the old labels via the nonexistent DataPivot models. We could just exclude the old labels in the queryset manager (.exclude(content_type__model__in=["datapivotquery", "datapivotupload"])) or perhaps overwrite the prefetching to avoid these content types. Otherwise, looks good to me.


for label in LabeledItem.objects.filter(object_id=dp.id, content_type_id=dpq_ct):
new_label = LabeledItem.objects.create(
object_id=visualization.id, content_type_id=dpq_ct, label_id=label.label_id
Copy link
Collaborator

@caseyhans caseyhans Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Labels are not transferring from the old datapivots to the new ones. Changing the content_type_id to the Visual content type (not the datapivot type) fixes this.


for label in LabeledItem.objects.filter(object_id=dp.id, content_type_id=dpu_ct):
new_label = LabeledItem.objects.create(
object_id=visualization.id, content_type_id=dpu_ct, label_id=label.label_id
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above comment; content_type_id needs to be changed from datapivot to visual content type id

@caseyhans caseyhans assigned shapiromatron and unassigned caseyhans Feb 14, 2025
Copy link
Collaborator

@rabstejnek rabstejnek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Numbers seem to match up correctly after migrations (ie Visual and Dataset counts) and views seem to be working correctly. My only comment is that I don't see any use for the dp_id field being kept around; we should consider removing this field unless we want to keep it for posterity / have future need of it.

@rabstejnek rabstejnek removed their assignment Feb 14, 2025

Visual.objects.bulk_update(dps, fields=("created", "last_updated"))

pd.DataFrame(data=mapping, columns="model|old_id|new_id".split("|")).to_csv(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this excel was generated just for debugging purposes, but if you intend to keep this code here consider adding the generated excel file to gitignore.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants