[bayes_nonconj] Update lecture #545
Conversation
Thanks @kp992 !
@mmcky : one for the warning checker:
@kp992 Great job, much appreciated. Are you willing to hand over to some junior RAs to finish the PR? Perhaps they can compare the code to the style guide and PEP8, plus add some discussion of the figures at the end. (CC @xuanguang-li @matheusvillasb )
I'll quickly send a commit to fix the doc reference, and then we can hand over for adding more notes.
I have run the PEP 8 and style fixes. One easy way to apply them that I use is https://pypi.org/project/jupyter-black/:

```python
import jupyter_black
jupyter_black.load()
```
@kp992 thanks again. @matheusvillasb @xuanguang-li are you willing to take over? Perhaps @mmcky can coordinate merge and opening of a new issue.
Of course @jstac. I will assign myself to the new issue related to this. |
Thanks @xuanguang-li !
- Adopt QuantEcon naming format
- Adopt one-line docstring format in accordance with PEP 8
- Change integral notation from $\int dx\, f(x)$ to $\int f(x)\, dx$
- Reorder the illustration of (`param`, `name_dist`) pairs
thanks @HumphreyYang -- great guidance. @xuanguang-li, as @HumphreyYang has suggested, longer titles are OK when necessary. The style guide provides guidance around the length (i.e., use the minimum). In this case I agree a longer title is OK to capture the necessary information. Another idea would be to use
sometimes I use this when specifying parameters in a figure, for example.
@xuanguang-li the execution error is due to
It looks like there are duplicate labels (they must be unique to work properly).
Many thanks, @mmcky @HumphreyYang!
Pull Request Overview
This PR updates the Bayesian non-conjugate lecture to remove dependencies on PyTorch and Pyro, streamlining it to use only NumPyro backed by JAX. The changes include significant code refactoring, typo corrections, and improved code styling throughout the lecture.
Key Changes
- Simplified dependencies: Removed PyTorch and Pyro code, using only NumPyro with the JAX backend (see the sketch below)
- Code refactoring: Updated the `BayesianInference` class to remove dual solver support and streamline the implementation
- Content improvements: Fixed typos, improved formatting, and enhanced code documentation
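For orientation, here is a minimal sketch of the NumPyro SVI pattern the refactor standardizes on. The Beta–Bernoulli model, guide, and data are illustrative assumptions, not the lecture's actual code:

```python
import jax
import jax.numpy as jnp
import numpyro
import numpyro.distributions as ndist
from numpyro.infer import SVI, Trace_ELBO
from numpyro.optim import Adam

def model(data):
    # prior over the success probability
    theta = numpyro.sample("theta", ndist.Beta(1.0, 1.0))
    numpyro.sample("obs", ndist.Bernoulli(theta), obs=data)

def guide(data):
    # variational family: Beta(a, b) with learnable a, b
    a = numpyro.param("a", 1.0, constraint=ndist.constraints.positive)
    b = numpyro.param("b", 1.0, constraint=ndist.constraints.positive)
    numpyro.sample("theta", ndist.Beta(a, b))

data = jnp.array([1.0, 0.0, 1.0, 1.0])  # toy observations
svi = SVI(model, guide, Adam(step_size=0.01), loss=Trace_ELBO())
result = svi.run(jax.random.PRNGKey(0), 2_000, data)
print(result.params)  # learned (a, b) of the variational Beta
```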
lectures/bayes_nonconj.md (Outdated)

```python
    base_dist, ndist.transforms.ExpTransform()
)
base_dist = ndist.TruncatedNormal(
    low=jnp.log(0), high=jnp.log(1), loc=loc, scale=scale
```
Using `jnp.log(0)` will result in negative infinity (`-inf`), which may cause numerical issues. Consider using a small positive value instead, such as `jnp.log(1e-10)`, or define an appropriate lower bound.
Suggested change:
```diff
-low=jnp.log(0), high=jnp.log(1), loc=loc, scale=scale
+low=jnp.log(1e-10), high=jnp.log(1), loc=loc, scale=scale
```
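To make the concern concrete, here is what the two bounds evaluate to (a standalone snippet, not code from the PR):

```python
import jax.numpy as jnp

print(jnp.log(0.0))    # -inf: the truncation's lower bound becomes -infinity
print(jnp.log(1e-10))  # ≈ -23.03: finite, and exp(-23.03) ≈ 1e-10, so the
                       # bound is still effectively zero after exponentiation
```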
```diff
@@ -346,17 +269,17 @@ $$
 where

 $$
-p\left(Y\right)=\int d\theta p\left(Y\mid\theta\right)p\left(Y\right).
+p\left(Y\right)=\int p\left(Y\mid\theta\right)p\left(Y\right) d\theta.
```
The integrand should be `p(Y|θ)p(θ)`, not `p(Y|θ)p(Y)`. The correct formula should be `p(Y) = ∫ p(Y|θ)p(θ) dθ`.
Suggested change:
```diff
-p\left(Y\right)=\int p\left(Y\mid\theta\right)p\left(Y\right) d\theta.
+p\left(Y\right)=\int p\left(Y\mid\theta\right)p\left(\theta\right) d\theta.
```
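As a sanity check on the corrected formula, p(Y) can be approximated by averaging the likelihood over prior draws. A minimal sketch, assuming a hypothetical Beta(2, 2) prior with Bernoulli observations (not the lecture's model):

```python
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
theta = jax.random.beta(key, a=2.0, b=2.0, shape=(100_000,))  # draws from p(θ)
Y = jnp.array([1, 0, 1, 1])                                   # toy observations
successes = Y.sum()
failures = len(Y) - successes
log_lik = successes * jnp.log(theta) + failures * jnp.log1p(-theta)
p_Y = jnp.exp(log_lik).mean()  # Monte Carlo estimate of ∫ p(Y|θ) p(θ) dθ
print(p_Y)  # ≈ 0.057; the closed form is B(5, 3) / B(2, 2)
```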
```latex
\log p(Y)&=D_{KL}(q(\theta;\phi)\;\|\;p(\theta\mid Y))+\int d\theta q_{\phi}(\theta)\log\frac{p(\theta,Y)}{q_{\phi}(\theta)}

\begin{aligned}D_{KL}(q(\theta;\phi)\;\|\;p(\theta\mid Y)) & =-\int q(\theta;\phi)\log\frac{P(\theta\mid Y)}{q(\theta;\phi)} d\theta\\
& =-\int q(\theta)\log\frac{\frac{p(\theta,Y)}{p(Y)}}{q(\theta)} d\theta\\
& =-\int q(\theta)\log\frac{p(\theta,Y)}{p(\theta)q(Y)} d\theta\\
```
The denominator should be `q(θ)p(Y)`, not `p(θ)q(Y)`. The correct expression should be `p(θ,Y)/(q(θ)p(Y))`.
Suggested change:
```diff
-& =-\int q(\theta)\log\frac{p(\theta,Y)}{p(\theta)q(Y)} d\theta\\
+& =-\int q(\theta)\log\frac{p(\theta,Y)}{q(\theta)p(Y)} d\theta\\
```
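For reference, continuing from the corrected line, the chain closes back to the log-evidence identity in the first line of this hunk:

```latex
\begin{aligned}
D_{KL}(q(\theta;\phi)\;\|\;p(\theta\mid Y))
  & = -\int q(\theta)\log\frac{p(\theta,Y)}{q(\theta)p(Y)}\, d\theta \\
  & = -\int q(\theta)\log\frac{p(\theta,Y)}{q(\theta)}\, d\theta
      + \log p(Y)\int q(\theta)\, d\theta \\
  & = \log p(Y) - \int q(\theta)\log\frac{p(\theta,Y)}{q(\theta)}\, d\theta,
\end{aligned}
```

so rearranging recovers $\log p(Y)=D_{KL}(q\;\|\;p)+\int q(\theta)\log\frac{p(\theta,Y)}{q(\theta)}\, d\theta$, i.e. log-evidence = KL + ELBO.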
lectures/bayes_nonconj.md (Outdated)

```diff
-elif self.name_dist=='vonMises':
+elif self.name_dist == "vonMises":
     # unpack parameters
     kappa = self.param
```
The parameter unpacking is inconsistent. For 'vonMises', `kappa` is assigned directly from `self.param` without tuple unpacking, but other distributions use tuple unpacking. This suggests `self.param` should be a scalar for vonMises, while the documentation shows it as a tuple.
Suggested change:
```diff
-kappa = self.param
+kappa, = self.param
```
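A tiny standalone illustration of the difference, with a made-up parameter value:

```python
param = (0.5,)   # hypothetical vonMises parameter stored as a one-element tuple
kappa, = param   # the trailing comma unpacks the tuple: kappa == 0.5
kappa2 = param   # without unpacking, kappa2 is still the tuple (0.5,)
print(kappa, kappa2)  # 0.5 (0.5,)
```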
lectures/bayes_nonconj.md (Outdated)

```python
optimizer = nAdam(step_size=lr)
svi = nSVI(
    self.model, self.truncnormal_guide, optimizer, loss=nTrace_ELBO()
)
```
The indentation is inconsistent with the surrounding code block. This line should be aligned with the `optimizer = nAdam(step_size=lr)` line above.
@HumphreyYang would you mind reviewing the
Hi @mmcky, thanks for your follow-up. Copilot's suggestions are also very helpful. I'm currently working on converting the class-based pattern to a JAX pattern and expect to complete it today. Would you prefer I continue with this refactoring, or pause to address some remaining typos first?
@xuanguang-li please feel free to continue work on this PR. We will merge updates to lectures from 26 August (as there is currently active teaching from this lecture series), so we have a little bit of time.
Thanks @mmcky. I will push the refactored version later once I resolve those unexpected bugs.
Hi @mmcky @HumphreyYang, I attempted to use [...]. After that, I tried [...]. Here is my code:
Do you have any ideas about this, or would it be better to leave [...]?
Thanks for thinking about this so carefully @xuanguang-li ! My view is that we shouldn't be too aggressive in trying to get every speed gain in every situation. For this kind of plotting code, it's fine to do ordinary for loops. In fact it might be better, because it makes the code easy to read. For code that's central to algorithms or that will be used many many times, where efficiency is really important, we should push harder and be more aggressive. |
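To illustrate the plain-loop style suggested here, a minimal sketch; the Beta draws are hypothetical stand-ins for the lecture's posterior samples:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
sample_sizes = [5, 20, 100]  # hypothetical

fig, ax = plt.subplots()
for n in sample_sizes:  # an ordinary loop: no vmap, but obvious at a glance
    draws = rng.beta(1 + n, 1 + n, size=10_000)  # stand-in posterior draws
    ax.hist(draws, bins=50, density=True, alpha=0.5, label=f"n = {n}")
ax.set_xlabel(r"$\theta$")
ax.legend()
plt.show()
```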
Thanks for your suggestion, @jstac! It's true that
Thanks @xuanguang-li ! @mmcky @HumphreyYang perhaps we should modify the manual to reflect the fact that we're not optimizing everything -- it's a matter of judgement but the basic rule is we should keep things simple and readable for code that uses the model -- e.g., to generate plots -- but isn't internal to the model or the core algorithms that solve it.
Updates the lecture with:
Fixes #524