
Multiple quantile regression with preserving monotonicity #5727

Open
RektPunk opened this issue Feb 17, 2023 · 13 comments

@RektPunk
Contributor

RektPunk commented Feb 17, 2023

Summary

Hello, while using LightGBM to estimate multiple quantiles, I have encountered several cases where monotonicity between the quantiles is not guaranteed, just as in #3447, #4201, and #5296. To address this, I implemented a custom loss together with monotone constraints so that the estimated quantiles satisfy a non-crossing condition, inspired by the references below. I created this issue (even though it may not be the most appropriate use of the issues tab ;) ) because I wanted to share my solution.

Description

Basically, I used the features already provided by LightGBM.
Conceptually, the input training data (explanatory and response variables) is duplicated as many times as there are input alphas (the target quantile levels), and each alpha is written into an additional column, following Cannon's approach. Next, the gradient of the composite quantile loss is computed for the stacked data as in the references, hess is set to 1, and a first-order approximation is used. Finally, if an increasing monotone constraint (value 1) is placed on the alpha column, monotonicity across alphas is preserved for any new data, even when the prediction alphas differ from the training alphas.
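To make the stacking concrete, here is a minimal standalone sketch (my illustration, not part of the implementation below, using made-up toy values) of what the design matrix looks like after duplication, with the quantile level appended as a "_tau" column; an increasing monotone constraint on that column then forces predictions to be non-decreasing in the quantile level.

import pandas as pd

# toy data: 3 rows, 1 feature, 2 quantile levels
x = pd.DataFrame({"feature": [1.0, 2.0, 3.0]})
alphas = [0.1, 0.9]

# stack the rows once per alpha and record the level in a "_tau" column
stacked = pd.concat(
    [x.assign(_tau=alpha) for alpha in alphas], axis=0, ignore_index=True
)
print(stacked)
#    feature  _tau
# 0      1.0   0.1
# 1      2.0   0.1
# 2      3.0   0.1
# 3      1.0   0.9
# 4      2.0   0.9
# 5      3.0   0.9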

from typing import List, Union, Dict, Any, Tuple
from functools import partial
from itertools import repeat, chain

import numpy as np
import lightgbm as lgb
import pandas as pd


def _grad_rho(u: np.ndarray, alpha: float) -> np.ndarray:
    # gradient of the check (pinball) loss with respect to the prediction, where u = y - pred
    return -(alpha - (u < 0).astype(float))


def check_loss_grad_hess(
    y_pred: np.ndarray, dtrain: lgb.basic.Dataset, alphas: List[float]
) -> Tuple[np.ndarray, np.ndarray]:
    _len_alpha = len(alphas)
    _y_train = dtrain.get_label()
    _y_pred_reshaped = y_pred.reshape(_len_alpha, -1)
    _y_train_reshaped = _y_train.reshape(_len_alpha, -1)

    grads = []
    for alpha_inx in range(_len_alpha):
        _err_for_alpha = _y_train_reshaped[alpha_inx] - _y_pred_reshaped[alpha_inx]
        grad = _grad_rho(_err_for_alpha, alphas[alpha_inx])
        grads.append(grad)

    grad = np.concatenate(grads)
    hess = np.ones(_y_train.shape)

    return grad, hess


def _alpha_validate(
    alphas: Union[List[float], float],
) -> List[float]:
    if isinstance(alphas, float):
        alphas = [alphas]
    return alphas


def _prepare_x(
    x: Union[pd.DataFrame, pd.Series, np.ndarray],
    alphas: List[float],
) -> pd.DataFrame:
    if isinstance(x, np.ndarray) or isinstance(x, pd.Series):
        x = pd.DataFrame(x)
    assert "_tau" not in x.columns, "Column name '_tau' is not allowed."
    _alpha_repeat_count_list = [list(repeat(alpha, len(x))) for alpha in alphas]
    _alpha_repeat_list = list(chain.from_iterable(_alpha_repeat_count_list))
    _repeated_x = pd.concat([x] * len(alphas), axis=0)

    _repeated_x = _repeated_x.assign(
        _tau=_alpha_repeat_list,
    )
    return _repeated_x


def _prepare_train(
    x: Union[pd.DataFrame, pd.Series, np.ndarray],
    y: Union[pd.Series, np.ndarray],
    alphas: List[float],
) -> Tuple[pd.DataFrame, np.ndarray]:
    _train_df = _prepare_x(x, alphas)
    _repeated_y = np.concatenate(list(repeat(y, len(alphas))))
    return (_train_df, _repeated_y)


class MonotonicQuantileRegressor:
    def __init__(
        self,
        x: Union[pd.DataFrame, pd.Series, np.ndarray],
        y: Union[pd.Series, np.ndarray],
        alphas: Union[List[float], float],
    ):
        alphas = _alpha_validate(alphas)
        self.x_train, self.y_train = _prepare_train(x, y, alphas)
        self.dataset = lgb.Dataset(data=self.x_train, label=self.y_train)
        self.fobj = partial(check_loss_grad_hess, alphas=alphas)

    def train(self, params: Dict[str, Any]) -> lgb.basic.Booster:
        self._params = params.copy()
        # put an increasing monotone constraint (1) on the appended "_tau" column
        if "monotone_constraints" in self._params:
            self._params["monotone_constraints"].append(1)
        else:
            self._params.update(
                {
                    "monotone_constraints": [
                        1 if "_tau" == col else 0 for col in self.x_train.columns
                    ]
                }
            )
        self.model = lgb.train(
            train_set=self.dataset,
            verbose_eval=False,
            params=self._params,
            fobj=self.fobj,
        )
        return self.model

    def predict(
        self,
        x: Union[pd.DataFrame, pd.Series, np.ndarray],
        alphas: Union[List[float], float],
    ) -> np.ndarray:
        alphas = _alpha_validate(alphas)
        _x = _prepare_x(x, alphas)
        _pred = self.model.predict(_x)
        _pred = _pred.reshape(len(alphas), len(x))
        return _pred

I created a simple example to test the non-crossing condition.

sample_size = 500
params = {
    "max_depth": 4,
    "num_leaves": 15,
    "learning_rate": 0.1,
    "n_estimators": 100,
    "boosting_type": "gbdt",
}
alphas = [0.3, 0.4, 0.5, 0.6, 0.7]
x = np.linspace(-10, 10, sample_size)
x_test = np.linspace(-10, 10, sample_size)
y_noise = np.sin(x) + np.random.uniform(-0.4, 0.4, sample_size)
y_test = np.sin(x_test) + np.random.uniform(-0.4, 0.4, sample_size)

monotonic_quantile_regressor = MonotonicQuantileRegressor(x=x, y=y_noise, alphas=alphas)
model = monotonic_quantile_regressor.train(params=params)
preds = monotonic_quantile_regressor.predict(x=x_test, alphas=alphas)
preds_df = pd.DataFrame(preds).T
# count quantile-crossing violations between consecutive quantile columns; expected to be 0
(preds_df.diff(axis=1) < 0).sum(axis=1).sum()


When visualized, the result was as shown in the figure below.
[figure: predicted quantile curves for the test data]
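The description above claims that monotonicity is preserved even for quantile levels that were not used during training. A small follow-up sketch (my addition; the new levels 0.35, 0.45, 0.55, 0.65 are chosen arbitrarily) that reuses the fitted regressor to check this:

# predict at quantile levels that were not in the training alphas
new_alphas = [0.35, 0.45, 0.55, 0.65]
new_preds = monotonic_quantile_regressor.predict(x=x_test, alphas=new_alphas)
new_preds_df = pd.DataFrame(new_preds).T

# count crossings between consecutive (unseen) quantile levels; expected to be 0
print((new_preds_df.diff(axis=1) < 0).sum(axis=1).sum())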

I am very curious about how others have approached this problem and would love to hear any new ideas, insights, or feedback.

References

Cannon, A. J. (2018). Non-crossing nonlinear regression quantiles by monotone composite quantile regression neural network, with application to rainfall extremes. Stochastic Environmental Research and Risk Assessment.

@chulhongsung

Awesome!

@juandados

Thanks for sharing this here @RektPunk. Could you please elaborate a little more on:

"..., hess is set to 1, and a first-order approximation is used".

In particular, I am interested in understanding two things:

  • The pinball loss is piecewise linear with respect to the model output, so the second derivative is zero and the Hessian is null. Why set it to one?
  • Where is the first-order approximation used here?

@RektPunk
Contributor Author

RektPunk commented Dec 1, 2023

Hey, @juandados. Thanks for asking :)
As you said, the Hessian should be zero, but for computation hess is set to 1. With a constant Hessian, only the gradient drives the boosting updates, so the fit effectively uses a first-order approximation of the loss; this is how the parameters are estimated in lgb.train (https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.train.html).
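For intuition, here is a small standalone sketch (my illustration, not code from this thread) of the check-loss gradient for a single leaf, showing that with hess fixed to 1 LightGBM's second-order leaf value reduces to a plain averaged-gradient (first-order) step:

import numpy as np

def grad_pinball(u: np.ndarray, alpha: float) -> np.ndarray:
    # d/dpred of the check loss rho_alpha(y - pred), with u = y - pred: -(alpha - 1{u < 0})
    return -(alpha - (u < 0).astype(float))

# toy residuals u = y - pred for the samples falling in one leaf
u = np.array([0.5, -0.2, 1.3, -0.7])
alpha = 0.9
g = grad_pinball(u, alpha)
h = np.ones_like(g)  # Hessian forced to 1 (the true second derivative is 0 almost everywhere)

lambda_l2 = 0.0
# LightGBM-style leaf value: -sum(grad) / (sum(hess) + lambda_l2)
leaf_value = -g.sum() / (h.sum() + lambda_l2)
# with hess = 1 this equals the negative mean gradient, i.e. a pure first-order update
print(leaf_value, -g.mean())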

@StatMixedML
Contributor

StatMixedML commented Jan 3, 2024

Very interesting discussions.

Even though it does not offer quantile regression yet, you might find my repo LightGBMLSS interesting: it models and predicts the full conditional distribution of a univariate target as a function of covariates. Choosing from a wide range of continuous, discrete, and mixed discrete-continuous distributions, modelling and predicting the entire conditional distribution greatly enhances the flexibility of LightGBM, as it allows creating probabilistic outputs from which prediction intervals and quantiles of interest can be derived. Since the distribution function is strictly monotonically increasing, quantile crossing cannot happen.
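As a purely illustrative aside (this is not LightGBMLSS code), once a full conditional distribution has been predicted, quantiles come from its inverse CDF, which is non-decreasing in the level, so they are ordered by construction:

import numpy as np
from scipy.stats import norm

# hypothetical predicted conditional parameters for a single observation
mu, sigma = 1.5, 0.8

alphas = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
quantiles = norm.ppf(alphas, loc=mu, scale=sigma)

# the inverse CDF is monotone in alpha, so the quantiles cannot cross
assert np.all(np.diff(quantiles) >= 0)
print(quantiles)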

Currently, I am also working on a quantile regression extension, with strictly non-crossing quantiles. I'll post again once it is available.

@mat-ej

mat-ej commented Feb 3, 2024

@StatMixedML

Have you managed to add non-crossing quantile regression to your package?

@StatMixedML
Contributor

@mat-ej Sorry for the late reply.

Yes, I have a working version of non-crossing quantiles, where all quantiles are jointly estimated in one model. I haven't released it in the package yet, since I need to evaluate it in more detail.

@StatMixedML
Contributor

StatMixedML commented Feb 21, 2024

@RektPunk The approach of Cannon (2018) is extremely nice, and using the monotonicity constraint on the quantile column "guarantees" that the quantiles are non-crossing. Extremely happy you brought this to our attention!

I have some questions for the community, especially when it comes to the creation of the dataset and its implications for estimation:

  • The original dataset is replicated/stacked across the different quantile levels, with the only difference being the quantile column. Is there any experience with how this affects training? The reason I am asking is that there is redundancy in the dataset when it is stacked multiple times, as each row from the original dataset is repeated for each quantile level.
  • @RektPunk In the paper, there are two tau columns: the first and the last. In your implementation, there is only one column, which is used for the monotonicity constraint. Any reason for not using the first as well?

Many thanks for the great suggestions!

@RektPunk
Contributor Author

Hey @StatMixedML, thanks for your comment. The first point is exactly what I was concerned about; I also wonder whether there is any way to remove the redundancy :).
As for the second point, I don't think there are two tau columns in Cannon's paper. Even so, one tau column is enough to guarantee the non-crossing property.

@StatMixedML
Contributor

StatMixedML commented Feb 27, 2024

I used the above implementation and compared it with the base quantile regression available in LightGBM. Both models have the same set of hyper-parameters and number of iterations, and the dashed lines show the actual quantiles.

The constrained estimation does not properly capture the shape of the quantiles, especially for the higher and lower quantiles. This might be an effect of the monotonicity constraint or the lack of proper HPO. Any comments on this?

@RektPunk
Contributor Author

Hey, thanks for conducting the experiment. I think the hyper-parameters should be different, since the constrained model is more complex and also has the constraint, as you mentioned.

@StatMixedML
Contributor

The thing is that, due to the multiple stacking of the data, the approach becomes infeasible if the original data is already big...

@RektPunk
Contributor Author

RektPunk commented Mar 1, 2024

Yeah, I totally agree with you. For that case, another trick should be considered.

@RektPunk
Contributor Author

RektPunk commented Jun 28, 2024

The LightGBM version has changed and some details have been modified. The latest code can be found here. I also added code for XGBoost.
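For readers wondering how the same trick carries over to XGBoost, here is a rough sketch (my own illustration, not RektPunk's actual code; x_stacked, y_stacked, and alphas are assumed to be built exactly as in the LightGBM version above, with "_tau" as the last feature column):

import numpy as np
import xgboost as xgb

def xgb_check_loss_objective(alphas):
    # custom objective over the stacked data, mirroring check_loss_grad_hess above
    def objective(y_pred: np.ndarray, dtrain: xgb.DMatrix):
        y_true = dtrain.get_label()
        n_alpha = len(alphas)
        err = y_true.reshape(n_alpha, -1) - y_pred.reshape(n_alpha, -1)
        grad = np.concatenate(
            [-(alpha - (err[i] < 0).astype(float)) for i, alpha in enumerate(alphas)]
        )
        hess = np.ones_like(grad)  # Hessian forced to 1, as in the LightGBM version
        return grad, hess
    return objective

# dtrain = xgb.DMatrix(x_stacked, label=y_stacked)
# params = {
#     "max_depth": 4,
#     "eta": 0.1,
#     # increasing constraint only on the last ("_tau") column
#     "monotone_constraints": "(" + ",".join(["0"] * (x_stacked.shape[1] - 1) + ["1"]) + ")",
# }
# booster = xgb.train(params, dtrain, num_boost_round=100, obj=xgb_check_loss_objective(alphas))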
