.. _covariancesection:

Covariance Matrix Estimation
============================

The goal of parameter estimation (see the :ref:`driversection` section) is to estimate unknown model
parameters from experimental data. When the model parameters are estimated from the data, their accuracy
is measured by computing the covariance matrix. The diagonal of this covariance matrix contains the
variances of the estimated parameters, which are used to calculate their uncertainty. Assuming Gaussian,
independent, and identically distributed measurement errors, the covariance matrix of the estimated
parameters can be computed using the following methods, which have been implemented in parmest.

1. Reduced Hessian Method

When the objective function is the sum of squared errors (SSE) for homogeneous data, defined as
:math:`\text{SSE} = \sum_{i = 1}^{n} \left(\boldsymbol{y}_{i} - \boldsymbol{f}(\boldsymbol{x}_{i};
\boldsymbol{\theta})\right)^\text{T} \left(\boldsymbol{y}_{i} - \boldsymbol{f}(\boldsymbol{x}_{i};
\boldsymbol{\theta})\right)`, the covariance matrix is:

.. math::
    \boldsymbol{V}_{\boldsymbol{\theta}} = 2 \sigma^2 \left(\frac{\partial^2 \text{SSE}}
    {\partial \boldsymbol{\theta}^2}\right)^{-1}_{\boldsymbol{\theta} = \hat{\boldsymbol{\theta}}}

Similarly, when the objective function is the weighted SSE (WSSE) for heterogeneous data, defined as
:math:`\text{WSSE} = \frac{1}{2} \sum_{i = 1}^{n} \left(\boldsymbol{y}_{i} -
\boldsymbol{f}(\boldsymbol{x}_{i};\boldsymbol{\theta})\right)^\text{T} \boldsymbol{\Sigma}_{\boldsymbol{y}}^{-1}
\left(\boldsymbol{y}_{i} - \boldsymbol{f}(\boldsymbol{x}_{i};\boldsymbol{\theta})\right)`,
the covariance matrix is:

.. math::
    \boldsymbol{V}_{\boldsymbol{\theta}} = \left(\frac{\partial^2 \text{WSSE}}
    {\partial \boldsymbol{\theta}^2}\right)^{-1}_{\boldsymbol{\theta} = \hat{\boldsymbol{\theta}}}

where :math:`\boldsymbol{V}_{\boldsymbol{\theta}}` is the covariance matrix of the estimated
parameters :math:`\hat{\boldsymbol{\theta}} \in \mathbb{R}^p`, :math:`\boldsymbol{y}_{i} \in \mathbb{R}^m` are
observations of the measured variables, :math:`\boldsymbol{f}` is the model function,
:math:`\boldsymbol{x}_{i} \in \mathbb{R}^{q}` are the input variables, :math:`n` is the number of experiments,
:math:`\boldsymbol{\Sigma}_{\boldsymbol{y}}` is the measurement error covariance matrix, and :math:`\sigma^2`
is the variance of the measurement error. When the standard deviation of the measurement error is not supplied
by the user, parmest approximates the variance as
:math:`\hat{\sigma}^2 = \frac{1}{n-p} \sum_{i=1}^{n} \boldsymbol{\varepsilon}_{i}(\boldsymbol{\theta})^{\text{T}}
\boldsymbol{\varepsilon}_{i}(\boldsymbol{\theta})`, where :math:`\boldsymbol{\varepsilon}_{i} \in \mathbb{R}^m`
are the residuals between the data and the model predictions for experiment :math:`i`.

In parmest, this method computes the inverse of the Hessian by scaling the
objective function (SSE or WSSE) with a constant probability factor, :math:`\frac{1}{n}`.
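
To make this concrete, here is a minimal numerical sketch (a hypothetical linear model and data,
not parmest's internal implementation) that evaluates the reduced Hessian formula for an SSE fit,
including the :math:`\hat{\sigma}^2` approximation used when the measurement error standard
deviation is not supplied:

.. code-block:: python

    import numpy as np

    # Hypothetical linear model y = theta_1 * x + theta_2 with n = 5 experiments
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 3.9, 6.2, 8.0, 9.8])

    # Least-squares estimate of theta = [theta_1, theta_2]
    A = np.column_stack([x, np.ones_like(x)])
    theta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)

    # Estimated measurement error variance: sigma^2 ~ SSE / (n - p)
    resid = y - A @ theta_hat
    n, p = len(y), len(theta_hat)
    sigma2 = resid @ resid / (n - p)

    # For SSE = sum_i (y_i - A_i theta)^2, the Hessian is 2 A^T A, so
    # V_theta = 2 sigma^2 (2 A^T A)^{-1} = sigma^2 (A^T A)^{-1}
    V_theta = 2.0 * sigma2 * np.linalg.inv(2.0 * A.T @ A)
    print(np.sqrt(np.diag(V_theta)))  # standard errors of theta_1, theta_2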

2. Finite Difference Method

In this method, the covariance matrix, :math:`\boldsymbol{V}_{\boldsymbol{\theta}}`, is
computed by applying the Gauss-Newton approximation to the Hessian,
:math:`\frac{\partial^2 \text{SSE}}{\partial \boldsymbol{\theta}^2}`
or
:math:`\frac{\partial^2 \text{WSSE}}{\partial \boldsymbol{\theta}^2}`,
which results in:

.. math::
    \boldsymbol{V}_{\boldsymbol{\theta}} = \left(\sum_{i = 1}^n \boldsymbol{G}_{i}^{\text{T}}
    \boldsymbol{\Sigma}_{\boldsymbol{y}}^{-1} \boldsymbol{G}_{i} \right)^{-1}

where

.. math::
    \boldsymbol{G}_{i} = \frac{\partial \boldsymbol{f}(\boldsymbol{x}_i;\boldsymbol{\theta})}
    {\partial \boldsymbol{\theta}}

This method uses a central finite difference scheme to compute the Jacobian matrix,
:math:`\boldsymbol{G}_{i}`, for experiment :math:`i`:

.. math::
    \boldsymbol{G}_{i}[:,\,k] \approx \frac{\boldsymbol{f}(\boldsymbol{x}_i;\theta_k + \Delta \theta_k)
    \vert_{\hat{\boldsymbol{\theta}}} - \boldsymbol{f}(\boldsymbol{x}_i;\theta_k - \Delta \theta_k)
    \vert_{\hat{\boldsymbol{\theta}}}}{2 \Delta \theta_k} \quad \forall \quad \theta_k \, \in \,
    [\theta_1,\cdots, \theta_p]
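
The following minimal NumPy sketch (with a hypothetical single-output model, parameter estimates,
and measurement error variance; it is not parmest's internal implementation) builds each Jacobian
by central finite differences and assembles :math:`\boldsymbol{V}_{\boldsymbol{\theta}}`:

.. code-block:: python

    import numpy as np

    def model(x, theta):
        # Hypothetical single-output model f(x; theta)
        return np.array([theta[0] * (1.0 - np.exp(-theta[1] * x))])

    # Hypothetical parameter estimates, experiment inputs, and
    # measurement error covariance (m = 1 measured variable)
    theta_hat = np.array([2.0, 0.5])
    x_data = [0.5, 1.0, 2.0, 4.0]                    # one input per experiment
    sigma_y_inv = np.linalg.inv(np.array([[0.01]]))  # Sigma_y^{-1}

    def jacobian(x, theta, rel_step=1e-6):
        # Central finite difference approximation of G_i = df/dtheta
        p = len(theta)
        G = np.zeros((len(model(x, theta)), p))
        for k in range(p):
            dt = rel_step * max(abs(theta[k]), 1.0)
            t_plus, t_minus = theta.copy(), theta.copy()
            t_plus[k] += dt
            t_minus[k] -= dt
            G[:, k] = (model(x, t_plus) - model(x, t_minus)) / (2.0 * dt)
        return G

    # V_theta = (sum_i G_i^T Sigma_y^{-1} G_i)^{-1}
    info = np.zeros((len(theta_hat), len(theta_hat)))
    for x in x_data:
        G = jacobian(x, theta_hat)
        info += G.T @ sigma_y_inv @ G
    V_theta = np.linalg.inv(info)
    print(V_theta)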

3. Automatic Differentiation Method

Similar to the finite difference method, the covariance matrix is calculated as:

.. math::
    \boldsymbol{V}_{\boldsymbol{\theta}} = \left( \sum_{i = 1}^n \boldsymbol{G}_{\text{kaug},\, i}^{\text{T}}
    \boldsymbol{\Sigma}_{\boldsymbol{y}}^{-1} \boldsymbol{G}_{\text{kaug},\, i} \right)^{-1}

However, this method uses implicit differentiation of the model optimality, or Karush-Kuhn-Tucker (KKT),
conditions to compute the Jacobian matrix, :math:`\boldsymbol{G}_{\text{kaug},\, i}`, for experiment :math:`i`:

.. math::
    \boldsymbol{G}_{\text{kaug},\,i} = \frac{\partial \boldsymbol{f}(\boldsymbol{x}_i;\boldsymbol{\theta})}
    {\partial \boldsymbol{\theta}} + \frac{\partial \boldsymbol{f}(\boldsymbol{x}_i;\boldsymbol{\theta})}
    {\partial \boldsymbol{x}_i}\frac{\partial \boldsymbol{x}_i}{\partial \boldsymbol{\theta}}
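
As a usage sketch, this method can be selected through the ``method`` argument of
:meth:`~pyomo.contrib.parmest.parmest.Estimator.cov_est` (introduced below); the exact method
string shown here is an assumption and may differ between parmest versions:

.. code-block:: python

    # Usage sketch: select the k_aug-based automatic differentiation method.
    # The method string is an assumption; check your parmest version for the
    # exact option name. `pest` is an Estimator whose parameters have already
    # been estimated with theta_est() (see the workflow below).
    cov = pest.cov_est(method="automatic_differentiation_kaug")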

The covariance matrix calculation is only supported with the built-in objective functions "SSE" or "SSE_weighted".

In parmest, the covariance matrix can be computed after creating the
:class:`~pyomo.contrib.parmest.experiment.Experiment` class,
defining the :class:`~pyomo.contrib.parmest.parmest.Estimator` object,
and estimating the model parameters using :meth:`~pyomo.contrib.parmest.parmest.Estimator.theta_est`
(these steps are described in the :ref:`driversection` section).

To estimate the covariance matrix, call the
:meth:`~pyomo.contrib.parmest.parmest.Estimator.cov_est` function; the default method is
"finite_difference".

.. testsetup:: *
    :skipif: not __import__('pyomo.contrib.parmest.parmest').contrib.parmest.parmest.parmest_available
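
A minimal sketch of the workflow follows, assuming ``exp_list`` is a list of experiment objects
created as described in the :ref:`driversection` section:

.. code-block:: python

    import pyomo.contrib.parmest.parmest as parmest

    # Create the estimator from a list of experiment objects (assumed to
    # exist; see the parameter estimation section) and a built-in objective
    pest = parmest.Estimator(exp_list, obj_function="SSE")

    # Estimate the parameters, then compute their covariance matrix with
    # the default "finite_difference" method
    obj, theta = pest.theta_est()
    cov = pest.cov_est()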