Skip to content

Conversation

n0228a
Copy link

@n0228a n0228a commented Oct 8, 2025

This PR adds collocated cokriging methods to GSTools for multivariate geostatistical estimation, allowing users to improve sparse primary variable estimates by leveraging densely-sampled secondary variable data.

Collocated Cokriging Features

  • Two cokriging variants: Simple Collocated (SCCK) and Intrinsic Collocated (ICCK)
  • Inherits from Krige base class for full API compatibility
  • Implements Markov Model I (MM1): C_YZ(h) = ρ_YZ(0)·√(C_Z(h)·C_Y(h))
  • Accessible via gstools.cokriging submodule or top-level: from gstools import SimpleCollocated, IntrinsicCollocated
  • Full variance estimation with return_var=True
  • Parameter validation: cross_corr ∈ [-1, 1], secondary_var > 0
  • Boundary conditions: ρ=0 recovers Simple Kriging, ρ=±1 gives zero ICCK variance
  • SCCK can exhibit variance inflation (σ²_SCCK ≥ σ²_SK) in unstable configurations
  • ICCK provides stable variance: σ²_ICCK = (1-ρ₀²)·σ²_SK ≤ σ²_SK

Simple Collocated Cokriging

Uses only collocated secondary data at the estimation point:

import gstools as gs
model = gs.Gaussian(dim=1, var=0.5, len_scale=2)
scck = gs.SimpleCollocated(
    model, cond_pos, cond_val,
    cross_corr=0.8, secondary_var=1.5,
    mean=1.0, secondary_mean=0.5
)
field, var = scck(pos, secondary_data=sec_data, return_var=True)

Intrinsic Collocated Cokriging

Uses collocated secondary data plus secondary values at all primary locations for more stable variance:

icck = gs.IntrinsicCollocated(
    model, cond_pos, cond_val,
    sec_cond_pos, sec_cond_val,
    cross_corr=0.8, secondary_var=1.5,
    mean=1.0, secondary_mean=0.5
)
field, var = icck(pos, secondary_data=sec_data, return_var=True)

Implementation Details

  • Located in new gstools.cokriging submodule with base class and methods
  • CollocatedCokriging base class handles MM1 formulation and variance computation
  • SimpleCollocated implements SCCK estimator: Z*_SCCK = Z*_SK·(1-k·λ_Y0) + λ_Y0·(Y(u0)-m_Y) + k·λ_Y0·m_Z
  • IntrinsicCollocated implements ICCK estimator using secondary weights
  • Both methods require secondary_data parameter on call
  • Cross-covariance computed as: C_YZ(0) = ρ_YZ(0)·√(C_Z(0)·C_Y(0))
  • Inherits all Krige functionality: grids, models, dimensions, anisotropy, drift, normalizers

Tests

11 tests in test_cokriging.py testing only cokriging logic:

  • Parameter validation (secondary_data required, cross_corr bounds, secondary_var positive, ICCK data length)
  • Boundary conditions (ρ=0 equals SK, ρ=±1 near-zero ICCK variance)
  • Variance formulas (SCCK and ICCK mathematical correctness)
  • Edge cases (SCCK variance inflation, SCCK vs ICCK stability comparison)
  • Exact Interpolation at conditioning points

All tests passing with no warnings.

Examples

Two minimal examples following GSTools conventions (76 & 78 lines):

  • 10_simple_collocated_cokriging.py - SCCK demonstration
  • 11_intrinsic_collocated_cokriging.py - ICCK demonstration
  • Same data with 2 spatial gap features demonstrating cokriging value
  • Two-panel plots showing raw data (left) and SK vs cokriging comparison (right)
  • Synthetic correlated secondary data for reproducibility

References

n0228a added 16 commits August 26, 2025 00:41
- Implement Simple Collocated Cokriging (SCCK) extending Krige class
- Implement Intrinsic Collocated Cokriging (ICCK) with flexible secondary models
- Add comprehensive test suite with 14 test cases covering:
  - Matrix construction and dimensions
  - Cross-correlation validation
  - RHS vector structure
  - Integration with drift functions
  - Edge cases (zero/perfect correlation)
- Follow gstools patterns: property validation, error handling, documentation
- Matrix structure: (n+1) x (n+1) for n conditioning points + 1 secondary variable
- Uses Markov model assumption: C_zy(h) = ρ * √(C_zz(h) * C_yy(h))
- All tests passing with proper position handling via pre_pos method
- Clean minimal implementation extending Krige base class
- Follows existing gstools design patterns exactly
- Only adds cross_corr parameter and secondary_data requirement
- Uses (n+1)×(n+1) matrix system solved per estimation point
- Full API compatibility: return_var, chunk_size, only_mean, etc.
- Proper integration with gstools post-processing and chunking
- Zero cross-correlation equals Simple Kriging (verified)
- Located in separate cokriging module as requested
- Create CollocatedCokriging base class following kriging module pattern
- Refactor SCCK and ICCK as thin wrappers (algorithm='MM1' vs 'intrinsic')
- Eliminate ~400 lines of duplicated code
- Maintain full backward compatibility
- All tests passing (14/14)
- Cleaner architecture for future extensibility
@MuellerSeb
Copy link
Member

This is great! I love it. I wanted this for a long time but never found the time to work on it.

I have some general questions/remarks:

  1. What about creating a Correlogram (abstract)-base-class with the MarkovModel1 as one child that holds all the logic currently added with several keywords in CollocatedCokriging.
    I think it could look like this:
import gstools as gs

correlo_model = gs.MarkovModel1(
    model=gs.Gaussian(dim=1, var=0.5, len_scale=2),
    cross_corr=0.8,
    secondary_var=1.5,
)
scck = gs.SimpleCollocated(
    correlo_model, cond_pos, cond_val,
    mean=1.0, secondary_mean=0.5
)

Maybe the means could be combined in a list then, or we could also move them into the MarkovModel1 (not sure what makes more sense here), or keep them as is.
We would then also need to think about which functionality we move into the Correlogram class. (e.g. _compute_covariances)
As always my motivation for this would be, to be future prove (for MM2 and maybe other possible models).

  1. How does/should Cokriging work with normalizer, fit_normalizer and fit_variogram?

  2. Do we also need an estimator for the empirical correlogram?

  3. please use math environments in the documentation for formuals.

  4. There should be a _apply_intrinsic_collocated method like the _apply_simple_collocated method.

@n0228a
Copy link
Author

n0228a commented Oct 14, 2025

1: I'm not sure if this will work very well with the approach I chose. As you maybe have seen, after some linear algebra reformulations of the problem, SCCK and ICCK can be shown as an extension to simple Kriging, making it not necessary to construct new matrices, that would need to be solved for every estimation point in collocated cokriging. That is also where the normalizer goes currently., it applies in the way it would apply for Simple Kriging.
3. As for the Markov Model 1 so far, I don't think a cmpirical correlogramm would be necessary, for mm2 also not as I understand the theory (there you would need to fit to the primary variogram again based on your secondary variable and the correlation coefficient).
4. I will update that.
5. I will update that too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants