Remove the type ParamSpaceSGD
#205
base: main
Conversation
Hi @yebai, could you check that this is what you asked for? Personally, I feel the […]
Thanks @Red-Portal -- I left some comments above. In addition, let's simplify the folder structure a bit for clarity:

- move all files in `paramspacesgd` to `algorithms`, e.g., "algorithms/paramspacesgd/constructors.jl" to "algorithms/constructors.jl"
- keep each algorithm in its own file

Also, I'd suggest we consider renaming `paramspacesgd.jl` to `interface.jl` or something along those lines:

- "algorithms/paramspacesgd/paramspacesgd.jl" to "algorithms/interface.jl"
Hi Hong, I planned to do the restructuring in a separate PR to keep things simple in this one. Though:
After the release of v0.5, we'll be adding algorithms that don't conform to the original […]

It is okay to keep all algorithms under […]

You can add more algorithms to […]

I am saying that these new algorithms can't be grouped in […]

"grouping" refers to grouping interface code together for similar VI algorithms in the proposed […]
If I understand right, this PR flattens the existing […]. Before we drop it, may I understand what concrete benefits the flattening delivers? In particular, are we planning to add other algorithm families alongside the current […]? At the moment, I haven't quite convinced myself that the flattening of the type hierarchy is necessary.
I suggested the removal of the […]
@yebai @sunxd3 Thanks for chiming in. Actually, I have a new idea. So I believe the main complaint at the moment is that the term […]. In a nutshell, the nice thing about the current […]
or something along these lines? Would that resolve your concern?
I think I see your point. But I am not sure that helps.

That probably includes every learning algorithm in ML.
Yes, that is indeed almost true! But the point is that there are a couple of important algorithms that don't quite conform to this formalism, as they result in a custom update rule. They don't fall out of a gradient estimator, but modify the parameter update step too. So this is the reason I wish to allow for two different abstraction levels. But as you said, most algorithms only require defining a gradient estimator. So the lower-level interface helps unify the code for all those algorithms.
We are at risk of premature abstraction and introducing heuristic terminology here. It is better to work with concrete algorithms, and define a union type if sharing code is needed (e.g., […]). There might be some insights we can learn by taking a unifying view of parameter-space gradient descent VI, but that is a discussion we should have offline for a review paper.
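For concreteness, the union-type route suggested here could look like the following Julia sketch (the type and function names are hypothetical, not the actual AdvancedVI.jl API):

```julia
# Two concrete algorithm types with no dedicated abstract supertype
# (hypothetical names for illustration only).
struct KLMinRepGradDescent end
struct KLMinScoreGradDescent end

# Group them with a Union only where code actually needs to be shared.
const SGDBasedAlgorithms = Union{KLMinRepGradDescent,KLMinScoreGradDescent}

# A single method then serves both members of the Union.
shared_update(::SGDBasedAlgorithms, state) = state  # shared code path would go here
```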
My main beef with using Unions here is the following: […]

With that said, do you find the solution below still unsatisfying? At least I hope that it resolves your concern that the terminology is non-standard.

If you think we should still go with an implicit interface, then I'll follow for the sake of moving forward.
Hi @yebai, what is your final verdict on what we should do here, given my last comment above?
Hi @Red-Portal, I don't think the proposed interfaces are better. For clarity, I'd suggest we remove […]. Happy to continue the discussion offline and explore whether parameter space can be useful for novel algorithmic insights.
Hi @yebai, I applied all the changes you requested. Let me know if you're satisfied with the code now. Updating the docs will be separate, since it will be a lot. (And painful on my end 😅)
Many thanks, @Red-Portal, for the contributions. I'm happy with the changes; @penelopeysm / @mhauru might want to review this too.
Benchmark Results
| Benchmark suite | Current: 9b2eabb | Previous: fb69a43 | Ratio |
|---|---|---|---|
| normal/RepGradELBO + STL/meanfield/Zygote | 3990679915 ns | 4038685577 ns | 0.99 |
| normal/RepGradELBO + STL/meanfield/ReverseDiff | 1124955886 ns | 1125086941 ns | 1.00 |
| normal/RepGradELBO + STL/meanfield/Mooncake | 1261092892.5 ns | 1251364823 ns | 1.01 |
| normal/RepGradELBO + STL/fullrank/Zygote | 3971814683.5 ns | 4006929376 ns | 0.99 |
| normal/RepGradELBO + STL/fullrank/ReverseDiff | 1621498433.5 ns | 1627490017.5 ns | 1.00 |
| normal/RepGradELBO + STL/fullrank/Mooncake | 1285149814.5 ns | 1271144046.5 ns | 1.01 |
| normal/RepGradELBO/meanfield/Zygote | 2859724373.5 ns | 2867089524.5 ns | 1.00 |
| normal/RepGradELBO/meanfield/ReverseDiff | 777886417 ns | 799659066 ns | 0.97 |
| normal/RepGradELBO/meanfield/Mooncake | 1137178738 ns | 1121597958 ns | 1.01 |
| normal/RepGradELBO/fullrank/Zygote | 2864869952 ns | 2844951835.5 ns | 1.01 |
| normal/RepGradELBO/fullrank/ReverseDiff | 954006564 ns | 954966722.5 ns | 1.00 |
| normal/RepGradELBO/fullrank/Mooncake | 1144634418 ns | 1159699376 ns | 0.99 |
| normal + bijector/RepGradELBO + STL/meanfield/Zygote | 5743469386 ns | 5572987635 ns | 1.03 |
| normal + bijector/RepGradELBO + STL/meanfield/ReverseDiff | 2419710523 ns | 2398829013 ns | 1.01 |
| normal + bijector/RepGradELBO + STL/meanfield/Mooncake | 4227822021.5 ns | 4131189735.5 ns | 1.02 |
| normal + bijector/RepGradELBO + STL/fullrank/Zygote | 5787110379 ns | 5649130276 ns | 1.02 |
| normal + bijector/RepGradELBO + STL/fullrank/ReverseDiff | 3116964174 ns | 3043240221 ns | 1.02 |
| normal + bijector/RepGradELBO + STL/fullrank/Mooncake | 4366946807.5 ns | 4230203948.5 ns | 1.03 |
| normal + bijector/RepGradELBO/meanfield/Zygote | 4464663738.5 ns | 4276506885.5 ns | 1.04 |
| normal + bijector/RepGradELBO/meanfield/ReverseDiff | 2102610294 ns | 1991487412 ns | 1.06 |
| normal + bijector/RepGradELBO/meanfield/Mooncake | 4074397467 ns | 3939831565.5 ns | 1.03 |
| normal + bijector/RepGradELBO/fullrank/Zygote | 4560198939 ns | 4343225391 ns | 1.05 |
| normal + bijector/RepGradELBO/fullrank/ReverseDiff | 2331323017 ns | 2275989579 ns | 1.02 |
| normal + bijector/RepGradELBO/fullrank/Mooncake | 4175468965.5 ns | 4061135640 ns | 1.03 |
This comment was automatically generated by a workflow using github-action-benchmark.
This PR removes the type `ParamSpaceSGD`, which provided a unifying implementation of VI algorithms that run SGD in parameter space. Instead, each parameter-space SGD-based VI algorithm becomes its own `AbstractVariationalAlgorithm`, and the code implementing `step` is shared by dispatching over their `Union`.

This addresses #204
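For readers unfamiliar with the pattern, the Union-dispatch approach described above can be sketched as follows (all type names are hypothetical placeholders, not the actual AdvancedVI.jl API):

```julia
abstract type AbstractVariationalAlgorithm end

# Each SGD-based algorithm is its own subtype; there is no shared
# ParamSpaceSGD supertype anymore (hypothetical names for illustration).
struct RepGradELBOAlgorithm <: AbstractVariationalAlgorithm end
struct ScoreGradELBOAlgorithm <: AbstractVariationalAlgorithm end

# A Union groups the algorithms that share the parameter-space SGD update.
const ParamSpaceSGDAlgorithms = Union{RepGradELBOAlgorithm,ScoreGradELBOAlgorithm}

# One method of `step` covers every member of the Union...
step(::ParamSpaceSGDAlgorithms, state) = state  # shared SGD update would go here

# ...while an algorithm with a custom update rule defines its own method.
struct CustomUpdateAlgorithm <: AbstractVariationalAlgorithm end
step(::CustomUpdateAlgorithm, state) = state  # custom update would go here
```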