refactor: Output format of conformal predictions #221
base: main
would like your opinion on the work so far
It looks good in general. My understanding from our private conversation was that you wanted to change the format from long to wide as well.
I left a couple of considerations in the codebase itself, though, and I believe that type hinting may require some more attention if you care about that.
Also I would like to hear your opinion on how you would test the conformal prediction code.
As a behavioral test, I would like to see that the point prediction in regression always lies between the lower and upper bounds, which is not the case with the current implementation.
From the theoretical point of view, one could also check that the actual value falls between the bounds at least x percent of the time. This would be a flaky test though :)
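To make the two testing ideas concrete, here is a minimal sketch in plain Python. The helper names `point_within_bounds` and `empirical_coverage` are hypothetical and not part of the codebase, and the toy numbers are made up for illustration only.

```python
from typing import Sequence


def point_within_bounds(
    lower: Sequence[float], point: Sequence[float], upper: Sequence[float]
) -> bool:
    """Behavioral check: every point prediction lies inside its interval."""
    return all(lo <= p <= hi for lo, p, hi in zip(lower, point, upper))


def empirical_coverage(
    y_true: Sequence[float], lower: Sequence[float], upper: Sequence[float]
) -> float:
    """Fraction of actual values that fall inside [lower, upper]."""
    hits = sum(lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper))
    return hits / len(y_true)


# Toy data (illustrative only, not produced by any model).
lower = [0.0, 1.0, 2.0]
point = [0.5, 1.5, 2.5]
upper = [1.0, 2.0, 3.0]
y_true = [0.9, 2.4, 2.1]

behavioral_ok = point_within_bounds(lower, point, upper)
coverage = empirical_coverage(y_true, lower, upper)
```

The coverage check could be made less flaky by fixing the random seed of the data and model, and by asserting against the nominal level minus a tolerance rather than the nominal level itself.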
# Make alpha base 100
y_pred_quantiles = y_pred_quantiles.with_columns(
    (pl.col("quantile") * 100).cast(pl.Int16)
If you have more than 2 decimal places, then the casting would just truncate the decimal values 🤔
Example:
import polars as pl
pl.DataFrame({"a": [0.111, 0.11]}).with_columns((pl.col("a")*100).cast(pl.Int16))
shape: (2, 1)
┌─────┐
│ a   │
│ --- │
│ i16 │
╞═════╡
│ 11  │
│ 11  │
└─────┘
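Truncation also bites with exactly two decimal places, because of floating-point representation. The sketch below uses plain Python `int()` (which truncates toward zero, like a float-to-integer cast) to show the effect and a rounding alternative. The function names are hypothetical.

```python
def to_percent_truncated(q: float) -> int:
    """Scale a quantile to base 100 and truncate, mimicking a float-to-int cast."""
    return int(q * 100)


def to_percent_rounded(q: float) -> int:
    """Scale a quantile to base 100 and round to the nearest integer."""
    return round(q * 100)


# 0.29 * 100 is stored as 28.999999999999996, so truncation loses a unit.
truncated = to_percent_truncated(0.29)  # 28
rounded = to_percent_rounded(0.29)      # 29
```

In polars, the analogous fix would presumably be to round the expression before casting, for example with `.round(0)` ahead of `.cast(pl.Int16)`.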
That's correct; I have yet to change that bit of the code. Fortunately, this column won't exist once I'm done ✔️
    return (0.1, 0.9)
elif len(alphas) != 2:
    raise ValueError("alphas must be a list of length 2")
elif not all(0 < alpha < 1 for alpha in alphas):
Should we also check that the sequence is sorted?
IIRC the function returns the alphas sorted, so that should be OK. Am I correct?
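For reference, a self-contained sketch of a validator that returns the alphas sorted, so callers never need a separate ordering check. The name `validate_alphas`, the signature, and the last error message are assumptions for illustration. Only the default `(0.1, 0.9)`, the length check, and the range check come from the diff above.

```python
from typing import Optional, Sequence, Tuple


def validate_alphas(alphas: Optional[Sequence[float]] = None) -> Tuple[float, float]:
    """Validate a pair of quantile levels and return them in ascending order."""
    if alphas is None:
        # Default taken from the diff under review.
        return (0.1, 0.9)
    elif len(alphas) != 2:
        raise ValueError("alphas must be a list of length 2")
    elif not all(0 < alpha < 1 for alpha in alphas):
        # Hypothetical message; the original one is not shown in the diff.
        raise ValueError("alphas must be strictly between 0 and 1")
    lower, upper = sorted(alphas)
    return (lower, upper)
```

With this shape, `validate_alphas([0.9, 0.1])` yields the same result as `validate_alphas([0.1, 0.9])`, which is the behavior the reply above relies on.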
a41baa6 to 16e35e1
@FBruzzesi I am starting to work on #135; I would like your opinion on the work so far 😊 Also, I would like to hear your opinion on how you would test the conformal prediction code.
(To keep track: this is preparatory work for #39)