fix(config): round-trip processors and profilers#605
Conversation
Signed-off-by: Johnny Greco <jogreco@nvidia.com>
Greptile SummaryThis PR fixes a silent data-loss bug in
|
| Filename | Overview |
|---|---|
| packages/data-designer-config/src/data_designer/config/config_builder.py | Adds processor and profiler restoration loops to from_config; fix is minimal and correct against a fresh builder instance |
| packages/data-designer-config/tests/config/test_config_builder.py | Adds three targeted regression tests: basic round-trip, DropColumnsProcessor side-effects, and full field coverage for DataDesignerConfig |
Sequence Diagram
sequenceDiagram
participant C as Caller
participant F as from_config()
participant B as DataDesignerConfigBuilder
C->>F: BuilderConfig (with processors/profilers)
F->>B: cls(model_configs, tool_configs)
loop columns
F->>B: add_column(col)
end
loop constraints
F->>B: add_constraint(constraint)
end
opt seed_config present
F->>B: with_seed_dataset(source, strategy)
end
loop processors (NEW)
F->>B: add_processor(processor)
Note over B: upsert by name, applies DROP side-effects
end
loop profilers (NEW)
F->>B: add_profiler(profiler)
Note over B: appends to _profilers list
end
F-->>C: populated builder
Reviews (3): Last reviewed commit: "Merge branch 'main' into johnny/fix-conf..." | Re-trigger Greptile
andreatgretel
left a comment
There was a problem hiding this comment.
LGTM, just noting a couple of things that review flagged:
-
The new test skips
DropColumnsProcessor, which is the one processor type with side effects on column state. Could perhaps add a case with a glob likecol_*that asserts the per-columndropflags survive the round-trip. -
Not for this PR, but
from_configenumeratesDataDesignerConfigfields by hand, which is what caused this bug. Might be worth a follow-up that asserts every field round-trips so the next field addition fails loudly. Claude Code flagged this one.
Smoke-tested locally with DropColumnsProcessor (incl. globs), multi-processor ordering, and YAML file round-trip - all good.
Signed-off-by: Johnny Greco <jogreco@nvidia.com>
|
@andreatgretel thanks for the notes. Addressed both in
Focused checks are still passing locally:
|
📋 Summary
DataDesignerConfigBuilder.from_config()did not restore processors or profilers from an existingDataDesignerConfig, so round-tripping through a builder silently dropped those settings. This PR restores those fields and adds a regression test for the behavior.🔗 Related Issue
N/A
🔄 Changes
DataDesignerConfigBuilderfrom an existing config.DataDesignerConfigBuilderfrom an existing config.🧪 Testing
make testpasses (not run; focused test suite below)Focused checks run:
uv run pytest packages/data-designer-config/tests/config/test_config_builder.py -quv run ruff check packages/data-designer-config/src/data_designer/config/config_builder.py packages/data-designer-config/tests/config/test_config_builder.pyuv run ruff format --check packages/data-designer-config/src/data_designer/config/config_builder.py packages/data-designer-config/tests/config/test_config_builder.pygit diff --check✅ Checklist