chore(export): Added ability to export chart YAML files with Unicode characters, fix #20331#28008
chore(export): Added ability to export chart YAML files with Unicode characters, fix #20331#28008xyb wants to merge 4 commits into
Conversation
|
Thanks @xyb for the change. Would you mind adding either a unit or integration test? |
|
@john-bodley Certainly. I've previously dedicated some time to exploring the optimal location for writing unit tests, yet I was unable to identify an appropriate spot within the tests directory. Could you possibly offer some guidance on where the unit test code should be situated, and what would be the best approach to handling fixtures? |
Take a look at: https://github.com/apache/superset/blob/master/tests/integration_tests/charts/commands_tests.py |
dpgaspar
left a comment
There was a problem hiding this comment.
Thank you for the PR, please a test for chart, dashboard, database and dataset.
|
@john-bodley @dpgaspar I've added some tests, but I'm not familiar with superset and its testing framework, I'd greatly appreciate it if someone could take a look at it and provide some guidance on how to improve it. |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #28008 +/- ##
==========================================
- Coverage 64.19% 64.18% -0.01%
==========================================
Files 2655 2655
Lines 143925 143925
Branches 33181 33181
==========================================
- Hits 92387 92385 -2
Misses 49918 49918
- Partials 1620 1622 +2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
@john-bodley @dpgaspar Before merging, are there any remaining changes or tasks I should complete to ensure a smooth merge? |
|
Thank you for adding the tests. We can merge this only if/when @dpgaspar re-reviews it :D |
|
@dpgaspar gently ping |
villebro
left a comment
There was a problem hiding this comment.
Looks great! Just one minor comment (If Daniel isn't around to review I think I can merge ASAP to get this in for 4.1)
|
@villebro If there's anything I can do, please don't hesitate to let me know. |
|
Looks like this is stuck (again) in need of a rebase. Assuming this is still needed, let's see if we can get it through :D |
|
@rusackas rebased |
|
@dpgaspar, @rusackas, @john-bodley, @villebro, the branch has been rebased. Kindly review and let me know if additional changes are required. |
|
We rebased this branch onto |
Code Review Agent Run #2cf025Actionable Suggestions - 0Additional Suggestions - 1
Filtered by Review RulesBito filtered these suggestions based on rules created automatically for your feedback. Manage rules.
Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
rusackas
left a comment
There was a problem hiding this comment.
Still LGTM from me. Two loose ends before this can land: it's blocked on @dpgaspar's earlier change-request, so we need a re-review (I can't clear it). I just updated the PR, so we'll see how CI does. Also tiny thing: the test_export_chart_command_unicode_chars docstring still says "same order as export_fields" — looks copied from the key-order test.
| engine.execute("DROP TABLE IF EXISTS 中文") | ||
| engine.execute("CREATE TABLE 中文 AS SELECT 2 as col") | ||
| if db.session.query(SqlaTable).filter_by(table_name="中文").count(): | ||
| db.session.query(SqlaTable).filter_by(table_name="中文").delete() |
There was a problem hiding this comment.
Suggestion: The cleanup query deletes datasets by table_name only, without scoping to the current example database. If another database contains a dataset with the same table name, this test will delete unrelated rows and can make other tests fail unpredictably. Restrict the deletion to the target database (for example by database_id) before creating the new dataset. [logic error]
Severity Level: Major ⚠️
- ⚠️ Other databases' datasets named 中文 deleted during test cleanup.
- ⚠️ Integration tests may fail when sharing table_name 中文.Steps of Reproduction ✅
1. Before running the dataset tests, create a dataset with `table_name="中文"` on a
non-example database by calling `CreateDatasetCommand({"table_name": "中文", "database":
<other_db_id>}).run()` (implemented in `superset/commands/dataset/create.py:3-13`, which
persists a new `SqlaTable` row with `database_id=<other_db_id>`).
2. Run the integration test `test_export_dataset_command_unicode_chars` defined at
`tests/integration_tests/datasets/commands_tests.py:276-305`, which sets up a physical
table `中文` on `examples_db` and prepares to create a dataset for it.
3. Inside this test, the cleanup guard at
`tests/integration_tests/datasets/commands_tests.py:281-285` executes
`if db.session.query(SqlaTable).filter_by(table_name="中文").count():
db.session.query(SqlaTable).filter_by(table_name="中文").delete()`
(the delete is at line 284), issuing a DELETE scoped only by `table_name` and not by
`database_id`.
4. After the test finishes, run `db.session.query(SqlaTable).filter_by(table_name="中文",
database_id=<other_db_id>).one_or_none()` in the same test process and observe that it
returns `None`, showing that the unrelated dataset on the other database was deleted by
this test's cleanup.Fix in Cursor | Fix in VSCode Claude
(Use Cmd/Ctrl + Click for best experience)
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** tests/integration_tests/datasets/commands_tests.py
**Line:** 284:284
**Comment:**
*Logic Error: The cleanup query deletes datasets by `table_name` only, without scoping to the current example database. If another database contains a dataset with the same table name, this test will delete unrelated rows and can make other tests fail unpredictably. Restrict the deletion to the target database (for example by `database_id`) before creating the new dataset.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix| assert metadata["slice_name"] == "中文" | ||
| assert "slice_name: 中文" in yaml_content | ||
| # make sure the tearDown method works well | ||
| db.session.query(Slice).filter_by(slice_name="中文").update( | ||
| {"slice_name": "Energy Sankey"}, | ||
| ) |
There was a problem hiding this comment.
Suggestion: The state reset for the renamed chart is placed after assertions instead of in a guaranteed cleanup path. If any assertion fails, the chart name remains changed and fixture teardown can fail because cleanup expects Energy Sankey to exist. Move the rename-back logic into a finally block so cleanup always runs. [missing cleanup]
Severity Level: Major ⚠️
- ❌ Failing chart export test leaves renamed slice in metadata.
- ⚠️ Later energy-dashboard fixtures can accumulate unexpected duplicate slices.Steps of Reproduction ✅
1. Run the test `TestExportChartsCommand.test_export_chart_command_unicode_chars` defined
in `tests/integration_tests/charts/commands_tests.py:41-63`, which is marked with
`@pytest.mark.usefixtures("load_energy_table_with_slice")`.
2. The `load_energy_table_with_slice` fixture in
`tests/integration_tests/fixtures/energy_dashboard.py:8-13` calls `_create_energy_table()`
(lines 21-40, 54-66) to create and COMMIT slices including one named `"Energy Sankey"`,
then yields control and later calls `_cleanup()` on teardown.
3. Inside the test, the slice `"Energy Sankey"` is renamed to `"中文"` via
`db.session.query(Slice).filter_by(slice_name="Energy Sankey").update({"slice_name":
"中文"})` at `charts/commands_tests.py:45-48`; the name is only restored back to `"Energy
Sankey"` by the `update(...)` block at lines 60-63 (197-202) if all preceding assertions
at 197-199 succeed.
4. If any assertion in this test fails (for example `assert "slice_name: 中文" in
yaml_content` at 198 when a regression reintroduces escaping), the restore `update()` at
200-202 is never executed; pytest still invokes the fixture teardown `_cleanup()`
(energy_dashboard.py:23-38), which tries to delete slices by the original `"Energy
Sankey"` title and thus leaves the renamed `"中文"` slice in the metadata database,
polluting state for any later tests that rely on `load_energy_table_with_slice`.Fix in Cursor | Fix in VSCode Claude
(Use Cmd/Ctrl + Click for best experience)
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** tests/integration_tests/charts/commands_tests.py
**Line:** 197:202
**Comment:**
*Missing Cleanup: The state reset for the renamed chart is placed after assertions instead of in a guaranteed cleanup path. If any assertion fails, the chart name remains changed and fixture teardown can fail because cleanup expects `Energy Sankey` to exist. Move the rename-back logic into a `finally` block so cleanup always runs.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix| assert path in set(contents.keys()) | ||
| yaml_content = contents[path]() | ||
| assert "database_name: 中文" in yaml_content | ||
| db.session.query(Database).filter_by(database_name="中文").delete() |
There was a problem hiding this comment.
Suggestion: The test deletes the temporary 中文 database at the end but never commits that deletion. Since creation is committed inside CreateDatabaseCommand.run(), the row can remain in the database and leak into later tests. Commit after the delete (or wrap creation/deletion in a guaranteed cleanup transaction). [missing cleanup]
Severity Level: Major ⚠️
- ⚠️ Temporary "中文" database can remain committed during test session.
- ⚠️ Other tests using separate sessions may see unexpected extra database.Steps of Reproduction ✅
1. Run the test `TestExportDatabasesCommand.test_export_database_command_unicode_chars` in
`tests/integration_tests/databases/commands_tests.py:99-115`, which uses the SQLAlchemy
session `superset.db.session`.
2. At the start of the test, any existing `"中文"` databases are removed via
`db.session.query(Database).filter_by(database_name="中文").delete()` at line 401, and then
`CreateDatabaseCommand({"database_name": "中文", ...}).run()` is called at 403-106.
3. `CreateDatabaseCommand.run()` in `superset/commands/database/create.py:52-68` is
decorated with `@transaction(...)`, and the `transaction` decorator is verified by
`tests/unit_tests/utils/test_decorators.py:300-312` to call `db.session.commit()` on
success, so the new `"中文"` `Database` row is fully committed to the metadata database.
4. At the end of the test, cleanup is attempted with
`db.session.query(Database).filter_by(database_name="中文").delete()` at
`tests/integration_tests/databases/commands_tests.py:414`, but there is no subsequent
`db.session.commit()` in this test; until some unrelated code later calls
`db.session.commit()` (for example `TestImportDatabasesCommand.test_import_v1_database` at
87-88 or the session-level `setup_sample_data` teardown commit in
`tests/integration_tests/conftest.py:136-145), the committed `"中文"` database remains in
the underlying store and can be observed by any code using a separate session/connection,
meaning this test's temporary database is not reliably cleaned up in isolation.Fix in Cursor | Fix in VSCode Claude
(Use Cmd/Ctrl + Click for best experience)
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** tests/integration_tests/databases/commands_tests.py
**Line:** 414:414
**Comment:**
*Missing Cleanup: The test deletes the temporary `中文` database at the end but never commits that deletion. Since creation is committed inside `CreateDatabaseCommand.run()`, the row can remain in the database and leak into later tests. Commit after the delete (or wrap creation/deletion in a guaranteed cleanup transaction).
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fixThere was a problem hiding this comment.
Code Review Agent Run #e26422
Actionable Suggestions - 1
-
tests/integration_tests/dashboards/commands_tests.py - 1
- Test isolation failure · Line 513-531
Additional Suggestions - 1
-
tests/integration_tests/databases/commands_tests.py - 1
-
Test resource cleanup gap · Line 399-414The test creates a database with `CreateDatabaseCommand.run()` but cleanup at line 414 only runs if no exception occurs. If an assertion fails or an exception is raised between lines 405-413, the database persists in the test database, potentially causing test pollution. Other tests in this file use `get_example_database()` (a shared persistent resource) which doesn't require cleanup. Consider using try-finally or a pytest fixture for reliable cleanup.
Code suggestion
--- tests/integration_tests/databases/commands_tests.py (lines 399-414) --- 399: @patch("superset.security.manager.g") 400: def test_export_database_command_unicode_chars(self, mock_g) -> None: 401: mock_g.user = security_manager.find_user("admin") 402: db.session.query(Database).filter_by(database_name="\u4e2d\u6587").delete() 403: command = CreateDatabaseCommand( 404: {"database_name": "\u4e2d\u6587", "sqlalchemy_uri": "sqlite:///:memory:"}, 405: ) 406: example_db = command.run() 407: + try: 408: command = ExportDatabasesCommand([example_db.id], export_related=False) 409: contents = dict(command.run()) 410: 411: path = f"databases/{example_db.id}.yaml" 412: assert path in set(contents.keys()) 413: yaml_content = contents[path]() 414: assert "database_name: \u4e2d\u6587" in yaml_content 415: + finally: 416: db.session.query(Database).filter_by(database_name="\u4e2d\u6587").delete()
-
Review Details
-
Files reviewed - 8 · Commit Range:
8a8524f..a526bf9- superset/commands/chart/export.py
- superset/commands/dashboard/export.py
- superset/commands/database/export.py
- superset/commands/dataset/export.py
- tests/integration_tests/charts/commands_tests.py
- tests/integration_tests/dashboards/commands_tests.py
- tests/integration_tests/databases/commands_tests.py
- tests/integration_tests/datasets/commands_tests.py
-
Files skipped - 0
-
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
- MyPy (Static Code Analysis) - ✔︎ Successful
- Astral Ruff (Static Code Analysis) - ✔︎ Successful
Bito Usage Guide
Commands
Type the following command in the pull request comment and save the comment.
-
/review- Manually triggers a full AI review. -
/pause- Pauses automatic reviews on this pull request. -
/resume- Resumes automatic reviews. -
/resolve- Marks all Bito-posted review comments as resolved. -
/abort- Cancels all in-progress reviews.
Refer to the documentation for additional commands.
Configuration
This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.
Documentation & Help
| def test_export_dashboard_command_unicode_chars(self, mock_g1, mock_g2): | ||
| mock_g1.user = security_manager.find_user("admin") | ||
| mock_g2.user = security_manager.find_user("admin") | ||
| db.session.query(Dashboard).filter_by(slug="world_health").update( | ||
| {"dashboard_title": "中文"}, | ||
| ) | ||
| example_dashboard = ( | ||
| db.session.query(Dashboard).filter_by(dashboard_title="中文").one() | ||
| ) | ||
|
|
||
| command = ExportDashboardsCommand([example_dashboard.id]) | ||
| contents = dict(command.run()) | ||
|
|
||
| path = f"dashboards/{example_dashboard.id}.yaml" | ||
| assert path in set(contents.keys()) | ||
| yaml_content = contents[path]() | ||
| metadata = yaml.safe_load(yaml_content) | ||
| assert metadata["dashboard_title"] == "中文" | ||
| assert "dashboard_title: 中文" in yaml_content |
There was a problem hiding this comment.
This test modifies shared fixture data (dashboard_title changed from "World Bank's Data" to "中文") without restoring it. The fixture cleanup deletes by ID, not by title, so the modification persists. If test_export_dashboard_command_no_related runs after this test, it will fail at line 506 because the exported filename will be dashboards/{id}.yaml instead of dashboards/World_Banks_Data_{id}.yaml.
Code suggestion
Check the AI-generated fix before applying
--- tests/integration_tests/dashboards/commands_tests.py (lines 528-531) ---
528: yaml_content = contents[path]()
529: metadata = yaml.safe_load(yaml_content)
530: assert metadata["dashboard_title"] == "中文"
531: assert "dashboard_title: 中文" in yaml_content
+ # Restore original dashboard title for test isolation
+ db.session.query(Dashboard).filter_by(slug="world_health").update(
+ {"dashboard_title": "World Bank's Data"},
+ )
+ db.session.commit()
Code Review Run #e26422
Should Bito avoid suggestions like this for future reviews? (Manage Rules)
- Yes, avoid them
Address review feedback on the unicode YAML export tests: - chart/dashboard: restore the renamed fixture title in a finally block so a failing assertion no longer leaks the renamed slice/dashboard into later tests (and fixture teardown). - database: commit the cleanup delete (creation is committed inside CreateDatabaseCommand) and run it in a finally block so the temporary database does not persist across tests. - dataset: scope the stale-row cleanup to the example database so datasets with the same name on other databases are left untouched; add the missing return type hint. - chart: fix a docstring copied from the key-order test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
| command = ExportDatasetsCommand([example_dataset.id], export_related=False) | ||
| contents = dict(command.run()) | ||
|
|
||
| path = f"datasets/examples/{example_dataset.id}.yaml" | ||
| assert path in set(contents.keys()) | ||
| yaml_content = contents[path]() | ||
| assert "table_name: 中文" in yaml_content | ||
|
|
||
| db.session.delete(example_dataset) | ||
| db.session.commit() | ||
| with examples_db.get_sqla_engine() as engine: | ||
| engine.execute("DROP TABLE 中文") | ||
| db.session.commit() |
There was a problem hiding this comment.
Suggestion: The dataset/table teardown runs only on the happy path; if any assertion above fails, cleanup is skipped and leaves persistent DB artifacts that can break later tests. Wrap the body in try/finally and move deletion/drop logic into the finally block. [missing cleanup]
Severity Level: Major ⚠️
- ⚠️ Failed unicode dataset test leaves dataset \"中文\" persisted.
- ⚠️ Failed test leaves physical table \"中文\" in examples DB.
- ⚠️ Later tests see unexpected tables in shared example database.Steps of Reproduction ✅
1. The test `test_export_dataset_command_unicode_chars` in
`tests/integration_tests/datasets/commands_tests.py:27-61` creates a physical table named
`中文` in the example database via `examples_db.get_sqla_engine().execute("CREATE TABLE 中文
AS SELECT 2 as col")` at lines 31-33, and then creates a corresponding `SqlaTable` dataset
via `CreateDatasetCommand(...).run()` at lines 41-47.
2. The test exports this dataset using `ExportDatasetsCommand([example_dataset.id],
export_related=False)` at line 49, builds `path =
f"datasets/examples/{example_dataset.id}.yaml"` at line 52, and asserts the YAML contains
`table_name: 中文` at line 55.
3. If any of the operations or assertions from lines 49-55 fail (for example due to a
regression in dataset export formatting or path construction), Python raises an exception
and the test function exits before executing the cleanup block at lines 57-61
(`db.session.delete(example_dataset)`, `db.session.commit()`, `DROP TABLE 中文`,
`db.session.commit()`).
4. Because there is no surrounding `try/finally`, in that failure scenario the `SqlaTable`
row for `table_name="中文"` and the physical table `中文` remain in the shared example
database returned by `get_example_database()` (`superset/utils/database.py`), unlike the
analogous database-unicode test in
`tests/integration_tests/databases/commands_tests.py:19-41` which wraps cleanup in
`finally`; these leaked artifacts can then be seen by subsequent tests that rely on the
example database contents, breaking assumptions about test isolation.Fix in Cursor | Fix in VSCode Claude
(Use Cmd/Ctrl + Click for best experience)
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** tests/integration_tests/datasets/commands_tests.py
**Line:** 298:310
**Comment:**
*Missing Cleanup: The dataset/table teardown runs only on the happy path; if any assertion above fails, cleanup is skipped and leaves persistent DB artifacts that can break later tests. Wrap the body in `try/finally` and move deletion/drop logic into the `finally` block.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
Code Review Agent Run #342f3eActionable Suggestions - 0Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
SUMMARY
Improve chart YAML export for multi-language users 🌐. Now every user can read and review their chart files in their native language.
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TESTING INSTRUCTIONS
Set name of any charts/dashboard as below strings then export them:
ADDITIONAL INFORMATION