Multiple quantile regression with preserving monotonicity #5727
Comments
Awesome!
Thanks for sharing this here @RektPunk. Could you please elaborate a little more on "..., set hess = 1, and do first order approximation"? In particular, I am interested in understanding two things:
Hey, @juandados. Thanks for asking :)
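For context, a minimal sketch of what "set hess = 1, and do first order approximation" typically means for the pinball (quantile) loss: the loss is piecewise linear, so its true second derivative is zero almost everywhere, and substituting a constant Hessian of 1 turns LightGBM's Newton-style leaf values into plain gradient (first-order) steps.

```python
import numpy as np

def quantile_grad_hess(y_true, y_pred, alpha):
    """Gradient and (constant) Hessian of the pinball loss.

    L(y, f) = alpha * (y - f)        if y >= f
            = (1 - alpha) * (f - y)  if y <  f
    is piecewise linear, so its second derivative is 0 almost everywhere.
    Returning hess = 1 makes the boosting update a first-order step.
    """
    # dL/df = -alpha where y > f, and (1 - alpha) where y <= f
    grad = np.where(y_true > y_pred, -alpha, 1.0 - alpha)
    hess = np.ones_like(y_pred)  # constant Hessian: first-order approximation
    return grad, hess
```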
Very interesting discussion. Even though quantile regression is not available there yet, you might find my repo LightGBMLSS interesting: it models and predicts the full conditional distribution of a univariate target as a function of covariates. By choosing from a wide range of continuous, discrete, and mixed discrete-continuous distributions, modelling and predicting the entire conditional distribution greatly enhances the flexibility of LightGBM, as it allows creating probabilistic outputs from which prediction intervals and quantiles of interest can be derived. Since the distribution function is strictly monotonically increasing, quantile crossing cannot happen. Currently, I am also working on a quantile regression extension with strictly non-crossing quantiles. I'll post again once it is available.
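As a minimal illustration of why distribution-based quantiles cannot cross (generic SciPy, not the LightGBMLSS API): quantiles are read off the inverse CDF, which is monotone in the probability level by construction.

```python
import numpy as np
from scipy.stats import norm

# Once a conditional distribution is predicted (here: a Normal with some
# predicted mu and sigma), quantiles come from the inverse CDF (ppf).
mu, sigma = 2.0, 0.5
alphas = np.array([0.1, 0.5, 0.9])
quantiles = norm.ppf(alphas, loc=mu, scale=sigma)

# The ppf is monotone in alpha, so the quantiles are ordered by construction.
assert np.all(np.diff(quantiles) > 0)  # no quantile crossing, ever
```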
Have you managed to add non-crossing quantile regression to your package?
@mat-ej Sorry for the late reply. Yes, I have a working version of non-crossing quantiles, where all quantiles are jointly estimated in one model. I haven't included it in a package release yet, since I need to evaluate it in more detail.
@RektPunk The approach of Cannon (2018) is extremely nice, and using the monotonicity constraint on the quantile column "guarantees" they are non-crossing. Extremely happy you brought this to our attention! I have some questions for the community, especially when it comes to the creation of the dataset and its implications for estimation:
Many thanks for the great suggestions!
Hey @StatMixedML, thanks for your comment. The first point is exactly what I was concerned about. I also wonder whether there is any way to remove the redundancy :).
Hey, thanks for running the experiment. I think the hyperparameters should be different, because the constrained model is more complex and also has the constraint, as you mentioned.
The thing is that, due to the multiple stacking of the data, the approach becomes infeasible if the original data is already big: the training set grows by a factor of the number of quantiles (e.g., 10 million rows trained on 9 quantiles becomes 90 million rows)...
Yeah, I totally agree with you. For that case, another trick should be considered.
The LightGBM version has changed and some details have been modified. The latest code can be found here. I also added code for XGBoost.
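For reference, the same construction carries over to XGBoost, which also supports per-feature monotone constraints and custom objectives. A rough sketch with hypothetical names (`X_stacked`, `y_stacked`, and `alphas_col` mirror the stacking described in the Description below; this is not the linked repo's code):

```python
import numpy as np
import xgboost as xgb

# alphas_col: the alpha value of each stacked row, kept as a NumPy array,
# since it is hard to recover a single column from a DMatrix inside the
# objective. y_stacked is the target tiled once per alpha level.
def make_objective(alphas_col):
    def composite_obj(preds, dtrain):
        y = dtrain.get_label()
        grad = np.where(y > preds, -alphas_col, 1.0 - alphas_col)
        hess = np.ones_like(preds)  # hess = 1: first-order approximation
        return grad, hess
    return composite_obj

# monotone_constraints: 0 for ordinary features, +1 for the alpha column
# (here assumed to be the last of three), so predictions are non-decreasing
# in alpha and quantiles cannot cross.
params = {"monotone_constraints": "(0,0,1)", "tree_method": "hist"}
# dtrain = xgb.DMatrix(X_stacked, label=y_stacked)
# booster = xgb.train(params, dtrain, num_boost_round=100,
#                     obj=make_objective(alphas_col))
```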
Summary
Hello, while using LightGBM to estimate multiple quantiles, I have encountered several cases where monotonicity between the quantiles is not guaranteed, just as in #3447, #4201, and #5296. To address this, I implemented a custom loss and monotone constraints to ensure that the quantiles satisfy a non-crossing condition, inspired by the references below. I created this issue (even though this may not be appropriate for the issues tab ;) ) because I wanted to share my solution.
Description
Basically, I used the features provided by LightGBM. Speaking of the concept, the input training data (including the explanatory and response variables) is duplicated as many times as the number of input alphas (the target quantiles), and each alpha is put into its own column, as in Cannon's research. Next, the gradient of the composite quantile loss is calculated for the stacked data as in the references, with hess set to 1, which amounts to a first-order approximation. Finally, if we set an increasing monotone constraint of $1$ on the alpha column, monotonicity over alpha is preserved for any new data, even when the prediction alphas differ from the training alphas. (See the sketch below.)
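A minimal sketch of this pipeline (the function and variable names here are illustrative, not the ones in the linked repo):

```python
import numpy as np
import lightgbm as lgb

def stack_dataset(X, y, alphas):
    """Duplicate (X, y) once per alpha and append alpha as the last feature."""
    k = len(alphas)
    X_stacked = np.repeat(X, k, axis=0)       # each row repeated k times
    alpha_col = np.tile(alphas, len(y))       # alphas cycle once per row
    X_stacked = np.hstack([X_stacked, alpha_col[:, None]])
    y_stacked = np.repeat(y, k)
    return X_stacked, y_stacked, alpha_col

def composite_quantile_objective(alpha_col):
    """Pinball-loss gradient for the stacked data, with hess fixed at 1."""
    def objective(preds, train_data):
        y = train_data.get_label()
        grad = np.where(y > preds, -alpha_col, 1.0 - alpha_col)
        hess = np.ones_like(preds)  # hess = 1: first-order approximation
        return grad, hess
    return objective

def train_monotone_quantiles(X, y, alphas, num_boost_round=100):
    X, y, alphas = np.asarray(X), np.asarray(y), np.asarray(alphas)
    X_s, y_s, a_col = stack_dataset(X, y, alphas)
    params = {
        # +1 on the appended alpha column makes predictions non-decreasing
        # in alpha; 0 leaves every original feature unconstrained.
        "monotone_constraints": [0] * X.shape[1] + [1],
        # LightGBM >= 4.0 takes the custom objective here; older versions
        # took it as lgb.train(..., fobj=...).
        "objective": composite_quantile_objective(a_col),
        "verbose": -1,
    }
    return lgb.train(params, lgb.Dataset(X_s, label=y_s),
                     num_boost_round=num_boost_round)
```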
I created a simple example to test the non-crossing condition.
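A rough reconstruction of such a test, reusing the hypothetical `train_monotone_quantiles` helper from the sketch above (synthetic data, illustrative only): predict on a grid of alphas, including ones never seen during training, and check that the predicted quantiles never cross.

```python
import numpy as np

# Synthetic heteroscedastic data for the test.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(1000, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3 + 0.2 * np.abs(X[:, 0]))

train_alphas = [0.1, 0.3, 0.5, 0.7, 0.9]
booster = train_monotone_quantiles(X, y, train_alphas)

# Predict at alphas NOT used in training: the monotone constraint on the
# alpha column still orders the outputs correctly.
test_alphas = np.array([0.05, 0.25, 0.5, 0.75, 0.95])
X_test = np.linspace(-3, 3, 200)[:, None]
preds = np.column_stack([
    booster.predict(np.hstack([X_test, np.full((len(X_test), 1), a)]))
    for a in test_alphas
])
assert np.all(np.diff(preds, axis=1) >= 0)  # non-crossing for every row
```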
When visualized, the result was as shown in the figure below.
![image](https://user-images.githubusercontent.com/110188257/219676440-13db0be7-e74a-45ca-a465-ae76491a1847.png)
I am very curious about how others have approached this problem and would love to hear any new ideas, insights, or feedback.
References
- Cannon, A. J. (2018). Non-crossing nonlinear regression quantiles by monotone composite quantile regression neural network, with application to rainfall extremes. Stochastic Environmental Research and Risk Assessment, 32, 3207–3225.