Skip to content

Conversation

m-rempel
Copy link
Contributor

@m-rempel m-rempel commented Sep 2, 2025

Hi @RubenImhoff and @dnerini!

Here's my first version of the implementation of the reduced-space ensemble Kalman filter method. By calling blending_method = blending.get_method("pca_enkf")

One can now compute a combined forecast by pass arrays of observed precipitation fields, the NWP ensemble model forecast, the motion vector field based on observations as well as the timestamps corresponding to the observations and NWP forecast, respectively.

It's not necessary that the temporal resolution of the NWP forecast is equal to the observation. However, one has to adjust the background inflation factor to >1.0 in this case.

There is also a comprehensive set of additional combination keyword arguments like:

combination_kwargs = dict(n_tapering=0,
                              non_precip_mask = True,
                              n_ens_prec=1,
                              lien_criterion=True,
                              n_lien=10,
                              post_prob_matching = False,
                              iterative_prob_matching = True,
                              inflation_factor_bg=1.0,
                              inflation_factor_obs=1.0,
                              offset_bg=0.0,
                              offset_obs=0.0,
                              nwp_hres_eff=14.0,
                              sampling_prob_source="ensemble",)

Currently, only an AR(1) process within the forecast step is implemented and the respective tests for the method are not implemented.

@m-rempel m-rempel marked this pull request as ready for review September 2, 2025 14:59
@RubenImhoff
Copy link
Contributor

Fantastic, Martin! I will have a look at it tomorrow (already gave it a go earlier, of course, but I'll make it a more thorough review now).

n_lien: int, (n_ens_members/2)
Minimum number of ensemble members that forecast precipitation for the Lien
criterion. As standard, half of the ensemble members should forecast precipitation.
post_prob_matching: bool, (False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is currently not used from what I can see. Should we delete it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've merged the argumentes post_prob_matching and iter_prob_matching into a new one named prob_matching.

iterative_prob_matching: bool, (True)
Flag to specify whether a probability matching should be applied at each correction
step.
inflation_factor_bg: float, (1.0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a few lines here describing what the inflation factor is used for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

Inflation factor of the background (NWC) covariance matrix.
inflation_factor_obs: float, (1.0)
Inflation factor of the observation (NWP) covariance matrix.
offset_bg: float, (0.0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a few lines here describing what the offset is used for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Offset of the background (NWC) covariance matrix.
offset_obs: float, (0.0)
Offset of the observation (NWP) covariance matrix.
nwp_hres_eff: float
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is currently not used from what I can see. Should we delete it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nwp_hres_eff is now used in ForecastModel to simulate the standard deviation of the smaller scales.

return X_ana

def get_covariance_matrix(self, M: np.ndarray, inflation_factor: float, offset: float):
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this function, can you refer to the equation(s) from Nerini et al. (2019) that are used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@RubenImhoff
Copy link
Contributor

RubenImhoff commented Sep 9, 2025

To do:

  • Refactor code.
  • Ensure that less energy/power is lost at the smallest scales.
  • Fix no-rain instances.
  • Ensure that we end with full NWP weight.
  • Add tests.
  • Test with real-world data.
  • Create gallery example.

"""

import numpy as np
from sklearn import decomposition
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be handled as an an optional dependency (and hence documented as such), see other examples in the code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @dnerini, I've implemented it in this manner.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi again @dnerini,
the tests are still failing, since the PCA from sklearn is necessary for the correction step. I've talked with @RubenImhoff about this issue and he has also no idea how to handle this.
One option could be to compute a pure extrapolation nowcast instead of a combination when there's no sklearn installed. However, that makes no really sense for me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can install sklearn for the tests, but it shouldn't be added to the base dependencies. This means that users that want to use your method need to install
Sklearn too, otherwise they get an error when they try to use the method

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test dependencies are specified here

https://github.com/pySTEPS/pysteps/blob/master/ci/ci_test_env.yml

so you can add sklearn there

Copy link

codecov bot commented Sep 11, 2025

Codecov Report

❌ Patch coverage is 87.51625% with 96 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.14%. Comparing base (f4429b9) to head (e45ecdc).
⚠️ Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
pysteps/blending/pca_ens_kalman_filter.py 89.83% 44 Missing ⚠️
pysteps/utils/reprojection.py 16.12% 26 Missing ⚠️
pysteps/blending/ens_kalman_filter_methods.py 89.43% 15 Missing ⚠️
pysteps/utils/pca.py 75.00% 11 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #491      +/-   ##
==========================================
+ Coverage   83.95%   84.14%   +0.19%     
==========================================
  Files         163      168       +5     
  Lines       13739    14507     +768     
==========================================
+ Hits        11534    12207     +673     
- Misses       2205     2300      +95     
Flag Coverage Δ
unit_tests 84.14% <87.51%> (+0.19%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@m-rempel
Copy link
Contributor Author

Even after adding the test of the PCA, codacy issues that pca is import in pysteps/utils/init.py but never used. Do you have any ideas what the reason could be for this, @dnerini and @RubenImhoff?

@RubenImhoff
Copy link
Contributor

Even after adding the test of the PCA, codacy issues that pca is import in pysteps/utils/init.py but never used. Do you have any ideas what the reason could be for this, @dnerini and @RubenImhoff?

It gives the same issue for the reprojection module. I'm also not entirely sure why. @dnerini, do you know?

@dnerini
Copy link
Member

dnerini commented Sep 12, 2025

We can safely ignore those, they shouldn't appear in the next iterations

@m-rempel
Copy link
Contributor Author

Ok, thank you @dnerini,
I've fixed another bug in the PCA test and now it should be working again.

@RubenImhoff
Copy link
Contributor

I adjusted the gallery example. Due to the very short NWP forecast horizon in pysteps_data, I have adjusted the time step artificially to 15 minutes, so that we can make a blend up to 120 minutes ahead. It works, but is maybe not the prettiest example this way.

@RubenImhoff
Copy link
Contributor

@m-rempel, looks like we still have some failing pca tests :(

@RubenImhoff
Copy link
Contributor

Not the prettiest plot, nor the best example of the reduced-space ensemble Kalman filter blending, but at least this is a sign that it also work on the KNMI ensemble data (30 members). On my laptop, I could only test it with hourly NWP data, so that explains the mismatches so far. I'll try it out with 5-min data next week. :)
Anyway, for now a sign that the model runs and finishes successfully.

Test_KNMI_data_blending

@m-rempel
Copy link
Contributor Author

Thank you @RubenImhoff for sharing these first results with other than DWD data! I've now fixed the bug in the PCA test for cases when n_components < n_ens_member.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants