STIMA Machine Learning Reading Group

The reading group is open to anyone who is interested in machine learning and who wants to meet up regularly to discuss ML research papers. If you want to get added to the mailing list or have any other questions feel free to contact Martin Andrae.

For VT 2026 we will continue with the format of sticking to papers on the same topic for two consecutive sessions. The first session (Jan 12) will be devoted to deciding all the topics for the spring.

Format

  • One or two people are designated the host for each topic (two sessions). They are responsible for choosing the papers to be discussed and for leading the discussion during the session.
  • We meet Wednesdays 11:00-12:00 on even weeks. (Change from previous years)
  • We will meet in Thomas Bayes, B-building (map).
  • To stimulate discussions we ask you to write down two observations about the paper and bring them to the session. This could be something you liked, disliked, did not understand, a connection you made or something else entirely.

Host instructions

  • Choose one paper for each session related to your topic that you think would be interesting to discuss in the reading group. In order to set a focus for the reading group we came up with the following short guidelines for how to choose papers:
    • The main topic of the paper should be core machine learning research. Try to avoid papers that just apply well known machine learning methods to specific application areas.
    • Make sure the paper is of high quality. Read through it yourself and try to gauge its quality. As a guideline, it should be of a quality publishable at a top machine learning conference (it only needs to be of that quality; it does not have to actually be a conference paper).
    • If you want a second opinion on whether a paper is suitable feel free to ask anyone who has been in the reading group previous years.
  • Think about how the two papers in your topic relate to each other. For example, it can be nice to first discuss an introductory paper and then the state of the art, or two different approaches to (or perspectives on) the same underlying problem.
  • Send out a link to the paper on the mailing list at least one week in advance.
  • As the host, it is also good to loosely lead the discussion during the session. If you want, you can give a short description of why you chose the paper, but there is no need for a proper presentation. It can be a good idea to come to the session prepared with a few discussion points, just to keep the conversation going.

Schedule VT 2026

Week 3 (Jan 12 11-12)
Location: Von Neumann
Decide on topics for the spring.

Week 4 (Jan 21)
Location: Von Neumann
Topic: Diffusion - Host: Adi, Martin
Simplified and Generalized Masked Diffusion for Discrete Data
Jiaxin Shi, Kehang Han, Zhe Wang, Arnaud Doucet, Michalis K. Titsias
https://proceedings.neurips.cc/paper_files/paper/2024/file/bad233b9849f019aead5e5cc60cef70f-Paper-Conference.pdf

Our rating: 2.5 ± 0.5

Week 6 (Feb 4)
Topic: Diffusion - Host: Adi, Martin
Branching Flows: Discrete, Continuous, and Manifold Flow Matching with Splits and Deletions
Lukas Billera, Hedwig Nora Nordlinder, Jack Collier Ryder, Anton Oresten, Aron Stålmarck, Theodor Mosetti Björk, Ben Murrell
https://arxiv.org/abs/2511.09465

Our rating: 1.5 ± 0.5

Week 8 (Feb 18)
Topic: Mechanistic Interpretability - Host: Filip, Martin
Towards Automated Circuit Discovery for Mechanistic Interpretability
Arthur Conmy, Augustine Mavor-Parker, Aengus Lynch, Stefan Heimersheim, Adrià Garriga-Alonso
https://proceedings.neurips.cc/paper_files/paper/2023/file/34e1dbe95d34d7ebaf99b9bcaeb5b2be-Paper-Conference.pdf

As optional additional reading, the paper A Primer on the Inner Workings of Transformer-based Language Models by Javier Ferrando et al. (https://arxiv.org/abs/2405.00208) provides more background on the topic and can serve as a source of (alternative) explanations of terms used in the main paper.

Our rating: 1.14 ± 0.35

Week 10 (Mar 4)
Topic: CausalML - Host: Marc, Lisa
Double/Debiased/Neyman Machine Learning of Treatment Effects
Victor Chernozhukov, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, Whitney Newey
https://pubs-aeaweb-org.e.bibl.liu.se/doi/pdfplus/10.1257/aer.p20171038

The note gives a short overview of the topic and focuses on one specific application (estimation of average treatment effects). It is based on the original paper by the same authors, which you can read if you would like more details on the theory (https://academic.oup.com/ectj/article/21/1/C1/5056401). However, that is a 60-page econometrics paper and therefore not well suited for our reading group.

Our rating: 2.41 ± 0.84

Week 12 (Mar 18)
Topic: Transformers 2026 - Host: Lisa, Louis

Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
Tri Dao, Albert Gu
https://proceedings.mlr.press/v235/dao24a.html

Our rating: 2.1 ± 1.04

Week 14 (Apr 1)
Topic: Transformers 2026 - Host: Lisa, Louis

Why Transformers Need Adam: A Hessian Perspective
Yushun Zhang, Congliang Chen, Tian Ding, Ziniu Li, Ruoyu Sun, Zhi-Quan Luo
https://arxiv.org/abs/2402.16788

Week 16 (Apr 15)
Topic: Controlled Generation - Host: Erik, Filip

Week 18 (Apr 29)
Topic: Sampling from unnormalized distributions - Host: Adi, Erik

Week 20 (May 13)
Topic: Sampling from unnormalized distributions - Host: Adi, Erik

Week 22 (May 27)
Topic: Representation Disentanglement - Host: Lisa, Louis

Week 24 (Jun 10)
Topic: Representation Disentanglement - Host: Lisa, Louis

Earlier sessions

Paper scoring scale

5: Very Strong Accept:

  • Technically flawless paper
  • with groundbreaking impact on at least one area of ML and excellent impact on multiple areas of ML,
  • with flawless evaluation, resources, and reproducibility,
  • and no unaddressed ethical considerations.

4: Strong Accept:

  • Technically strong paper, with novel ideas,
  • excellent impact on at least one area of ML or high-to-excellent impact on multiple areas of ML,
  • with excellent evaluation, resources, and reproducibility,
  • and no unaddressed ethical considerations.

3: Accept:

  • Technically solid paper,
  • with high impact on at least one sub-area of ML or moderate-to-high impact on more than one area of ML,
  • with good-to-excellent evaluation, resources, reproducibility,
  • and no unaddressed ethical considerations.

2: Weak Accept:

  • Technically solid,
  • moderate-to-high impact paper,
  • with no major concerns with respect to evaluation, resources, reproducibility, ethical considerations.

1: Borderline Accept:

  • Technically solid paper
  • where reasons to accept outweigh reasons to reject, e.g., limited evaluation.
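The per-paper ratings in the schedule above (e.g. "2.41 ± 0.84") appear to be the mean and spread of the individual 1-5 scores given by participants. As a minimal sketch of that convention (assuming the spread is the population standard deviation; the sample standard deviation would be equally plausible):

```python
from statistics import mean, pstdev

def summarize_rating(scores):
    """Summarize individual 1-5 paper scores as 'mean ± std'.

    Assumes the group's convention is mean ± population standard
    deviation; this is an inference from the posted ratings, not
    a documented rule.
    """
    m = mean(scores)
    s = pstdev(scores)  # population std; swap in stdev() for sample std
    return f"{m:.2f} ± {s:.2f}"
```

For example, two scores of 2 and 3 would be summarized as "2.50 ± 0.50", matching the format of the ratings listed above.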

About

Information and organization of the STIMA Machine Learning Reading Group
