
[ENH] Add config validation with descriptive error messages#529

Open
Saurabh6266 wants to merge 1 commit into mllam:main from Saurabh6266:enh/config-validation-error-messages

Conversation


@Saurabh6266 commented Mar 27, 2026

Summary

Closes #528

Adds a validate_config() function to neural_lam/config.py that checks for common config problems before training or graph creation begins, and raises InvalidConfigError with clear, actionable messages instead of raw tracebacks.
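A rough sketch of what such a validator could look like is below. The exception name `InvalidConfigError` matches the PR description, but the attribute layout (`config.datastore.config_path`, `config.training.state_feature_weighting` with `method` and `weights` fields) and the message wording are illustrative assumptions, not the exact code in this PR:

```python
from pathlib import Path


class InvalidConfigError(Exception):
    """Raised for config problems we can explain to the user."""


def validate_config(config, config_path):
    """Fail at startup with an actionable message instead of a deep traceback."""
    # Check 1: datastore.config_path must point to an existing file,
    # resolved relative to the main config file.
    datastore_path = Path(config_path).parent / config.datastore.config_path
    if not datastore_path.exists():
        raise InvalidConfigError(
            f"datastore.config_path: '{config.datastore.config_path}' does not "
            f"exist (resolved to '{datastore_path}'). Check the path relative "
            f"to '{config_path}'."
        )

    # Check 2: ManualStateFeatureWeighting with no weights would be a
    # silent no-op during training, so reject it here.
    weighting = config.training.state_feature_weighting
    if weighting.method == "ManualStateFeatureWeighting" and not weighting.weights:
        raise InvalidConfigError(
            "training.state_feature_weighting: ManualStateFeatureWeighting "
            "requires a non-empty 'weights' mapping, e.g. weights: {t2m: 1.0}."
        )
```

Calling this inside `load_config_and_datastore` right after YAML parsing means both failure modes surface before any datastore or model code runs.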

Motivation

Two common config errors currently produce hard-to-debug tracebacks:

  1. datastore.config_path pointing to a non-existent file causes a deep FileNotFoundError inside init_datastore, with no indication of which config field is wrong.
  2. ManualStateFeatureWeighting with an empty weights dict produces a silent no-op during training rather than an early error.

Both are now caught at startup with a message pointing directly to the relevant field and showing an example fix.

Changes

  • neural_lam/config.py: Added validate_config(config, config_path) called inside load_config_and_datastore after YAML parsing
  • tests/test_config.py: Added five tests for validate_config covering existing path (passes), missing path (raises with correct field name), error message content, empty ManualStateFeatureWeighting (raises), and non-empty ManualStateFeatureWeighting (passes). Existing tests untouched.

Why this matters for future development

The probabilistic forecasting project and other extensions will introduce new required config fields. Having a single validation function makes it straightforward to add new checks without risk of new fields producing cryptic runtime errors deep in the training loop.

Testing

pytest tests/test_config.py -v
pre-commit run --all-files

Checklist

  • Tests pass locally
  • pre-commit passes
  • Existing tests unaffected

- Added validate_config() to neural_lam/config.py
- Checks datastore.config_path exists on disk before training starts
- Checks ManualStateFeatureWeighting.weights is non-empty
- Raises InvalidConfigError with actionable message pointing to the field
- Added 5 tests to tests/test_config.py covering all validation paths

Closes mllam#528
@princekumarlahon

This is a really nice improvement: catching config issues early with clear error messages will make things much easier to debug, especially for new users. I also like that validation happens right after loading the config, and that you added targeted tests for the new cases. I had a few thoughts, mainly around how this might scale as more config options get added:

  1. Right now all the checks seem to live inside validate_config. That works well for now, but I wonder if it might get a bit crowded over time. Maybe grouping validations by sections (like datastore, training, etc.) could help keep things organized as it grows.

  2. It might also be worth standardizing the error messages a bit more: for example, always including the full config path (training.state_feature_weighting) and keeping a consistent format for suggested fixes.

  3. Since this is meant to support future additions (like probabilistic forecasting configs), it could be useful to have a clear pattern for adding new validation rules, so contributors don’t have to modify one large function every time.

Out of curiosity, do you plan to expand validation to more fields over time, or keep it focused on the most common failure cases? Overall though, this is a great step toward making config errors much more user-friendly.



Development

Successfully merging this pull request may close these issues.

[ENH] Improve config validation error messages for missing or invalid fields
