[ENH] Add config validation with descriptive error messages #529
Saurabh6266 wants to merge 1 commit into mllam:main from
Conversation
- Added `validate_config()` to `neural_lam/config.py`
- Checks `datastore.config_path` exists on disk before training starts
- Checks `ManualStateFeatureWeighting.weights` is non-empty
- Raises `InvalidConfigError` with an actionable message pointing to the field
- Added 5 tests to `tests/test_config.py` covering all validation paths

Closes mllam#528
This is a really nice improvement. Catching config issues early with clear error messages will make things much easier to debug, especially for new users. I also like that validation happens right after the config is loaded, and that you added targeted tests for the new cases. I had a few thoughts, mainly around how this might scale as more config options get added:
Out of curiosity, do you plan to expand validation to more fields over time, or keep it focused on the most common failure cases? Overall though, this is a great step toward making config errors much more user-friendly.
Thanks @princekumarlahon, really appreciate the detailed feedback! These are all good points. To respond to each:
On your question about scope: the plan is to keep the initial PR focused on the two most failure-prone cases (missing datastore file, empty manual weights) and expand incrementally as new config fields are introduced — especially the ones needed for the probabilistic forecasting extension. That way each addition stays reviewable and testable on its own. Let me know if you'd like me to address any of these before review, or if it's better to handle them as follow-up PRs.
This sounds great, I like the direction. Splitting validation into section-based helpers and standardizing error messages both make sense. And +1 to keeping this PR focused and handling structural improvements as follow-ups. I think this is in a good place.
Summary
Closes #528
Adds a `validate_config()` function to `neural_lam/config.py` that checks for common config problems before training or graph creation begins, and raises `InvalidConfigError` with clear, actionable messages instead of raw tracebacks.

Motivation
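A minimal sketch of how such a hook could look. The names `validate_config` and `InvalidConfigError` come from this PR; the config structure and message wording below are illustrative assumptions, not neural-lam's actual schema:

```python
from pathlib import Path


class InvalidConfigError(ValueError):
    """Raised when a config value is present but unusable."""


def validate_config(config: dict, config_path: str) -> None:
    """Fail fast with a message naming the offending field.

    Intended to be called by the config loader (in this PR,
    load_config_and_datastore) immediately after YAML parsing,
    before any datastore or model setup runs.
    """
    # Assumed layout: config["datastore"]["config_path"] holds a file path.
    datastore_path = config.get("datastore", {}).get("config_path")
    if datastore_path is not None and not Path(datastore_path).exists():
        raise InvalidConfigError(
            f"datastore.config_path in {config_path} points to "
            f"'{datastore_path}', which does not exist on disk"
        )
```

The key design point is that the loader calls this once, right after parsing, so every downstream consumer can assume a validated config.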
Two common config errors currently produce hard-to-debug tracebacks:
- `datastore.config_path` pointing to a non-existent file causes a deep `FileNotFoundError` inside `init_datastore`, with no indication of which config field is wrong.
- `ManualStateFeatureWeighting` with an empty `weights` dict produces a silent no-op during training rather than an early error.

Both are now caught at startup with a message pointing directly to the relevant field and showing an example fix.
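The second case can be caught with a similarly targeted check. This is a standalone sketch (using a plain `ValueError` to stay self-contained); the field names and the example variables in the message are assumptions, not the project's actual schema:

```python
def check_manual_state_feature_weights(weights: dict, config_path: str) -> None:
    """Reject an empty weights mapping at startup instead of letting it
    become a silent no-op during training.
    """
    if not weights:
        raise ValueError(
            f"ManualStateFeatureWeighting.weights in {config_path} is empty. "
            "Give each state feature a weight, e.g. "
            "weights: {t2m: 1.0, u10: 0.5}"  # hypothetical example fields
        )
```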
Changes
- `neural_lam/config.py`: Added `validate_config(config, config_path)`, called inside `load_config_and_datastore` after YAML parsing
- `tests/test_config.py`: Added five tests for `validate_config` covering: existing path (passes), missing path (raises with correct field name), error message content, empty `ManualStateFeatureWeighting` (raises), and non-empty `ManualStateFeatureWeighting` (passes). Existing tests untouched.

Why this matters for future development
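A rough shape for two of the listed tests, written against a toy stand-in rather than the real `neural_lam.config` module (plain asserts here; the actual test file may use pytest idioms like `pytest.raises`):

```python
from pathlib import Path


class InvalidConfigError(ValueError):
    pass


def validate_config(config, config_path):
    # Toy stand-in for the real neural_lam.config.validate_config
    path = config.get("datastore", {}).get("config_path")
    if path is not None and not Path(path).exists():
        raise InvalidConfigError(
            f"datastore.config_path in {config_path}: '{path}' does not exist"
        )


def test_existing_path_passes():
    # "." always exists, so validation should succeed silently
    validate_config({"datastore": {"config_path": "."}}, "cfg.yaml")


def test_missing_path_raises_and_names_field():
    try:
        validate_config(
            {"datastore": {"config_path": "/no/such/file.yaml"}}, "cfg.yaml"
        )
    except InvalidConfigError as err:
        assert "datastore.config_path" in str(err)
    else:
        raise AssertionError("expected InvalidConfigError")


test_existing_path_passes()
test_missing_path_raises_and_names_field()
```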
The probabilistic forecasting project and other extensions will introduce new required config fields. Having a single validation function makes it straightforward to add new checks without risk of new fields producing cryptic runtime errors deep in the training loop.
Testing
Checklist