Overview:
.
├── evaluation/
│   ├── benchmark/                      # benchmark set of 16 probabilistic models
│   │   └── generated/                  # sub-programs generated by the generate_factorisation.py script
│   ├── custom/                         # implementation of custom factorisations (GMM + LDA)
│   ├── finite/                         # implementation of the fixed-finite factorisation
│   ├── unrolled/                       # programs whose while loops can be syntactically unrolled
│   │   └── generated/                  # sub-programs generated by the generate_factorisation.py script
│   ├── models/                         # your models go here
│   ├── lmh_standard.jl                 # standard implementation of LMH
│   ├── lmh_factorised.jl               # implementation of LMH that leverages factorisation into sub-programs
│   ├── bench_lmh.jl                    # script to run the LMH benchmark for a model
│   ├── bench_smc.jl                    # script to run the SMC benchmark for a model
│   ├── bench_vi.jl                     # script to run the VI benchmark for a model
│   ├── paper_lmh_results.csv           # LMH runtime results reported in the paper
│   ├── paper_smc_results.csv           # SMC runtime results reported in the paper
│   ├── paper_vi_results.csv            # VI gradient variance results reported in the paper
│   ├── ppl.jl                          # implementation of the research PPL
│   ├── smc.jl                          # implementation of SMC (naive and iterative)
│   └── vi.jl                           # implementation of variational inference (standard and variance-reduced)
│
├── compare/
│   ├── gen/                            # benchmark set of 16 probabilistic models re-implemented in Gen.jl
│   │   ├── run_lmh.py                  # script to run combinator inference for every model
│   │   ├── paper_gen_results.csv       # Gen runtime results reported in the paper
│   │   └── geometric_recurse.jl        # demonstrates why we could not evaluate the Recurse combinator
│   ├── pyro_bbvi/                      # benchmark set re-implemented in Pyro
│   │   ├── bbvi.py                     # black-box variational inference implementation in Pyro
│   │   ├── run_vi.py                   # script to run BBVI for every model
│   │   ├── paper_vi_results.csv        # VI gradient variance results reported in the paper
│   │   └── geometric.jl                # demonstrates why we cannot reduce variance with control dependencies in Pyro
│   └── webppl/                         # benchmark set of 16 probabilistic models re-implemented in WebPPL
│       ├── smc/                        # JavaScript implementation of naive SMC
│       ├── run_lmh.py                  # script to run C3 inference for every model
│       ├── run_smc.py                  # script to run SMC inference for every model
│       ├── paper_lmh_results.csv       # WebPPL C3 runtime results reported in the paper
│       └── paper_smc_results.csv       # WebPPL SMC runtime results reported in the paper
│
├── src/
│   ├── formal/
│   │   ├── ast.jl                      # obtains the abstract syntax tree of a Julia program
│   │   ├── factorisation_builder.py    # uses model_graph.py to write sub-programs for each factor of a probabilistic program
│   │   └── formal_cfg.py               # computes the CFG of a probabilistic program
│   └── static/
│       ├── cfg.py                      # types for building the CFG
│       ├── ir.py                       # intermediate representation of a probabilistic program
│       ├── model_graph.py              # computes dependencies between sample statements as a graph
│       └── rd_bp.py                    # functionality to compute reaching definitions and branch parents
│
├── generate_factorisation.py           # script to generate sub-programs for each factor of every benchmark model
├── run_lmh_benchmark.py                # script to run bench_lmh.jl for every benchmark model
├── run_smc_benchmark.py                # script to run bench_smc.jl for every benchmark model
├── run_vi_benchmark.py                 # script to run bench_vi.jl for every benchmark model
├── generate_factorisation_for_model.py # script to generate sub-programs for your model
└── run_benchmark_for_model.py          # script to run benchmarks for your model
No special hardware is needed for installation.
Recommendations:
- Hardware: >= 3.5 GHz dual-core CPU, >= 8 GB RAM, and >= 10 GB storage
- OS: Unix-based operating system
Installation with Docker:
Install Docker.
Build the pplstaticfactor image (this may take several minutes):
docker build -t pplstaticfactor .
If the build was successful, run the Docker image:
docker run -it --name pplstaticfactor --rm pplstaticfactor
Alternatively, you can load the Docker image provided at Zenodo with
docker load -i pplstaticfactor-amd64.tar
docker load -i pplstaticfactor-arm64.tar
depending on your system. These images were saved (with Docker version 28.3.0) via
docker buildx build --platform linux/amd64 -t pplstaticfactor-amd64 .
docker image save pplstaticfactor-amd64 > pplstaticfactor-amd64.tar
docker buildx build --platform linux/arm64 -t pplstaticfactor-arm64 .
docker image save pplstaticfactor-arm64 > pplstaticfactor-arm64.tar
Run those images with
docker run -it --name pplstaticfactor-amd64 --rm pplstaticfactor-amd64
docker run -it --name pplstaticfactor-arm64 --rm pplstaticfactor-arm64
Python 3.10.12 with requirements.txt:
sexpdata==1.0.2
pandas==2.2.3
numpy==2.2.4
pyro-ppl==1.9.1
torch==2.7.1
tqdm==4.67.1
Julia 1.9.2 with Project.toml:
Distributions = "0.25.112"
JuliaSyntax = "0.4.10"
Gen = "0.4.7"
Node.js 23.10
Generate the sub-programs:
python3 generate_factorisation.py
Run benchmarks N times (we set N = 10):
python3 run_lmh_benchmark.py N
The results are written to evaluation/lmh_results.csv and aggregated in evaluation/lmh_results_aggregated.csv.
The results reported in the paper can be found in evaluation/paper_lmh_results.csv (measured on an M2 Pro CPU).
This script runs the bench_lmh.jl file, which measures the runtime of the factorised and non-factorised versions of the LMH algorithm and asserts that the samples are identical across all LMH implementations.
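For readers unfamiliar with the algorithm, the following is a minimal self-contained sketch of single-site LMH with an independence proposal on a toy coin model. It is our own illustration, not the code in lmh_standard.jl; the factorised variant additionally avoids re-evaluating sub-programs that are unaffected by a proposed change.

using Distributions

# Toy target: Beta(1,1) prior on the coin bias p; 5 heads and 3 tails observed.
logjoint(p) = logpdf(Beta(1, 1), p) + 5 * log(p) + 3 * log(1 - p)

# One LMH step: propose a new value for the single latent variable from an
# independence proposal and accept it with the Metropolis-Hastings ratio.
function lmh_step(p::Float64, proposal::Distribution)
    p_new = rand(proposal)
    log_alpha = logjoint(p_new) - logjoint(p) +
                logpdf(proposal, p) - logpdf(proposal, p_new)
    return log(rand()) < log_alpha ? p_new : p
end

function run_chain(n::Int)
    p = 0.5
    samples = Float64[]
    for _ in 1:n
        p = lmh_step(p, Beta(2, 2))
        push!(samples, p)
    end
    return samples
end

samples = run_chain(10_000)
println("posterior mean of p: ", sum(samples) / length(samples))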
For comparison, run (we set N = 10):
python3 compare/gen/run_lmh.py N
and
python3 compare/webppl/run_lmh.py N
The results are written to compare/gen/lmh_results.csv and compare/webppl/lmh_results.csv and aggregated in compare/gen/lmh_results_aggregated.csv and compare/webppl/lmh_results_aggregated.csv, respectively.
The results reported in the paper can be found in compare/gen/paper_lmh_results.csv and compare/webppl/paper_lmh_results.csv (measured on an M2 Pro CPU).
python3 print_table_1.py
will generate Table 1 of the manuscript from the paper_lmh_results.csv files.
python3 print_table_2.py
will generate Table 2 of the manuscript from the paper_lmh_results.csv files.
Run benchmarks N times (we set N = 10):
python3 run_vi_benchmark.py N
The results are written to evaluation/vi_results.csv and aggregated in evaluation/vi_results_aggregated.csv.
The results reported in the paper can be found in evaluation/paper_vi_results.csv (measured on an M2 Pro CPU).
This script runs the bench_vi.jl file, which measures the gradient variance of the factorised and non-factorised versions of the VI algorithm.
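As a point of reference, the following self-contained sketch shows what such a gradient-variance measurement looks like in principle: it draws L independent score-function (REINFORCE) ELBO gradient estimates for a toy Normal model with a Normal(mu, 1) guide and reports their empirical variance. It is our own illustration, not the estimator implemented in vi.jl.

using Distributions, Statistics

# One score-function ELBO gradient estimate w.r.t. the guide mean mu,
# for the model z ~ Normal(0, 1), x ~ Normal(z, 1) with observed x.
function score_gradient(mu::Float64, x::Float64)
    guide = Normal(mu, 1.0)
    z = rand(guide)                          # sample from the guide
    logq = logpdf(guide, z)
    logp = logpdf(Normal(0.0, 1.0), z) +     # prior term
           logpdf(Normal(z, 1.0), x)         # likelihood term
    return (z - mu) * (logp - logq)          # (d/dmu log q)(z) * ELBO integrand
end

L = 10_000
grads = [score_gradient(0.5, 1.0) for _ in 1:L]
println("empirical gradient variance: ", var(grads))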
For comparison, run (we set N = 1; note that this may take a long time to complete, ~6-8 hours, and you may lower N_ITER or L in the script for faster completion):
python3 compare/pyro_bbvi/run_vi.py N
The results are written to compare/pyro_bbvi/vi_results.csv and aggregated in compare/pyro_bbvi/vi_results_aggregated.csv.
The results reported in the paper can be found in compare/pyro_bbvi/paper_vi_results.csv (measured on an M2 Pro CPU).
python3 print_table_3.py
will generate Table 3 of the manuscript from the paper_vi_results.csv files.
Run benchmarks N times (we set N = 10):
python3 run_smc_benchmark.py N
The results are written to evaluation/smc_results.csv and aggregated in evaluation/smc_results_aggregated.csv.
The results reported in the paper can be found in evaluation/paper_smc_results.csv (measured on an M2 Pro CPU).
This script runs the bench_smc.jl file, which measures the runtime of the naive and iterative versions of the SMC algorithm.
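For intuition, here is a minimal self-contained sketch of an iterative, data-annealed SMC scheme on a toy coin model: particles are reweighted by one new observation per step and then resampled. It is our own illustration, not the smc.jl implementation.

using Distributions, Statistics, StatsBase

# Bootstrap SMC with data annealing: start from prior particles and
# incorporate one observation per step, resampling after each reweighting.
function smc_coin(flips::Vector{Int}, P::Int)
    particles = rand(Beta(1, 1), P)     # latent coin bias drawn from its prior
    for y in flips
        logw = [logpdf(Bernoulli(p), y) for p in particles]
        w = exp.(logw .- maximum(logw))                   # stabilised weights
        particles = sample(particles, Weights(w), P)      # multinomial resampling
    end
    return particles
end

particles = smc_coin([1, 0, 1, 1, 0, 1, 1, 1], 1_000)
println("posterior mean of the coin bias: ", mean(particles))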
For comparison, run (we set N = 10):
python3 compare/webppl/run_smc.py N
The results are written to compare/webppl/smc_results.csv and aggregated in compare/webppl/smc_results_aggregated.csv.
The results reported in the paper can be found in compare/webppl/paper_smc_results.csv (measured on an M2 Pro CPU).
python3 print_table_4.py
will generate Table 4 of the manuscript from the paper_smc_results.csv files.
To test our approach on your model, you may implement it in a single file in evaluation/models.
The static analysis only accepts programs written in Julia that follow the formal syntax described in the manuscript, with some minor adaptations, which we summarise in the following.
A probabilistic program is defined with a Julia function annotated with @model, with statement S and arguments A1, ..., An:
modelname = "..."
proposers = Dict{String, Distribution}(...)
@model function model_fn(ctx::SampleContext, A1, ..., An)
S
end
In general, we allow arbitrary pure Julia expressions E in statements.
Further, you may define pure Julia helper functions outside of this model block to be used in expressions.
Also, container types like arrays may only be used in an immutable fashion.
You may also give your model a name with the global variable modelname and custom proposal distributions for LMH with the global proposers dictionary.
For reference, take a look at the models in evaluation/benchmark.
Next, we describe the set of allowed statements S; a complete example model is given after the list.
- Assignment, where x is a variable:
  x = E
  Importantly, the first use of a variable has to be type-annotated, e.g.
  x::Vector{Float64} = E
  and the type of the variable has to be static throughout the program.
- Sequence of two statements:
  S1
  S2
- If statements:
  if E
      S1
  end
  if E
      S1
  else
      S2
  end
- While statement:
  while E
      S
  end
- Sample statement, where ctx is the SampleContext, x is a variable, and f(E1, ..., En) is the constructor of a distribution:
  x = sample(ctx, E, f(E1, ..., En))
We also support inlining observed values with an expression C that does not depend on random variables:
x = sample(ctx, E, f(E1, ..., En), observed=C)
You may also omit the assignment in this case:
sample(ctx, E, f(E1, ..., En), observed=C)
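Putting these pieces together, the following hypothetical model (our own example, not one of the benchmark models; we assume string-valued addresses for E) illustrates the accepted syntax, including the modelname and proposers globals, type-annotated first assignments, a while loop, and inlined observations:

modelname = "coin_bias"
proposers = Dict{String, Distribution}("p" => Beta(2.0, 2.0))

@model function coin_bias(ctx::SampleContext, flips::Vector{Int})
    # latent coin bias; the first use of the variable is type-annotated
    p::Float64 = sample(ctx, "p", Beta(1.0, 1.0))
    i::Int = 1
    while i <= length(flips)
        # observations are inlined via the observed keyword; no assignment needed
        sample(ctx, "flip_$(i)", Bernoulli(p), observed=flips[i])
        i = i + 1
    end
end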
Generate the sub-programs with
usage: generate_factorisation_for_model.py [-h] filename
positional arguments:
filename name of your program in evaluation/models/ without path but with file extension
Run a benchmark with
usage: run_benchmark_for_model.py [-h] filename algorithm repetitions
positional arguments:
filename name of your program in evaluation/models/ without path but with file extension
algorithm benchmark to run (lp|is|lmh|smc|vi)
repetitions number of experiment runs
You can configure the number of iterations for is and lmh in evaluation/N_iters.jl.
Note that in order to compare the factorised SMC implementation to a naive implementation, you have to implement a data-annealed version of your model in evaluation/smc.jl.
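A data-annealed version takes an additional step index and conditions only on the observations seen so far. A hypothetical sketch for the example model above (the names and the exact interface expected by smc.jl are ours, for illustration only):

@model function coin_bias_annealed(ctx::SampleContext, flips::Vector{Int}, t::Int)
    p::Float64 = sample(ctx, "p", Beta(1.0, 1.0))
    i::Int = 1
    while i <= t                # condition only on the first t observations
        sample(ctx, "flip_$(i)", Bernoulli(p), observed=flips[i])
        i = i + 1
    end
end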