Official PyTorch implementation of SGI: Structured 2D Gaussians for Efficient and Compact Large Image Representation.
Zixuan Pan*, Kaiyuan Tang*, Jun Xia, Yifan Qin, Lin Gu, Chaoli Wang, Jianxu Chen, Yiyu Shi
(* denotes equal contribution)
2D Gaussian Splatting has emerged as a novel image representation technique that can support efficient rendering on low-end devices. However, scaling to high-resolution images requires optimizing and storing millions of unstructured Gaussian primitives independently, leading to slow convergence and redundant parameters. To address this, we propose Structured Gaussian Image (SGI), a compact and efficient framework for representing high-resolution images. SGI decomposes a complex image into multi-scale local spaces defined by a set of seeds. Each seed corresponds to a spatially coherent region and, together with lightweight multi-layer perceptrons (MLPs), generates structured implicit 2D neural Gaussians. This seed-based formulation imposes structural regularity on otherwise unstructured Gaussian primitives, which facilitates entropy-based compression at the seed level to reduce the total storage. However, optimizing seed parameters directly on high-resolution images is a challenging and non-trivial task. Therefore, we designed a multi-scale fitting strategy that refines the seed representation in a coarse-to-fine manner, substantially accelerating convergence. Quantitative and qualitative evaluations demonstrate that SGI achieves up to 7.5
We tested our code on a server with Ubuntu 20.04.1, CUDA 11.8, and GCC 9.4.0.
- Unzip files
cd submodules
unzip diff-gaussian-rasterization.zip
unzip gridencoder.zip
unzip simple-knn.zip
unzip arithmetic.zip
cd ..
- Install environment
conda env create --file environment.yml
conda activate SGI_env
- Install gsplat2d
cd gsplat2d/gsplat2d/cuda/csrc
mkdir third_party
cd third_party
git clone https://github.com/g-truc/glm.git
cd ../../../..
python setup.py build
python setup.py install
- Install tmc3 (for GPCC)
  - Please refer to the tmc3 GitHub repository for installation.
  - Don't forget to add tmc3 to your environment variable.
  - Tips: tmc3 is commonly located at /PATH/TO/mpeg-pcc-tmc13/build/tmc3.
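Before launching the pipeline, it can help to confirm that the tmc3 binary is actually reachable. The following is a minimal sketch, assuming tmc3 is looked up through the PATH environment variable (the README does not state which variable is meant); the helper name `find_tmc3` is hypothetical and not part of this repository.

```python
import shutil


def find_tmc3(name="tmc3"):
    """Return the absolute path of the executable if it is on PATH, else None.

    Assumption: the pipeline resolves tmc3 through PATH.
    """
    return shutil.which(name)


if __name__ == "__main__":
    location = find_tmc3()
    if location is None:
        print("tmc3 was not found on PATH; add its build directory to PATH first.")
    else:
        print(f"tmc3 found at {location}")
```

If the check fails, adding the directory that contains the built binary (see the tip above) to PATH and re-running the script should report its location.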
The data structure should be organised as follows:
data/
├── dataset_name
│ ├── xxx_0.png
│ ├── xxx_1.png
│ ├── xxx_2.png
│ ├── ...
...
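As a quick sanity check of this layout, the sketch below lists the PNG files found in each dataset directory. It assumes only what the tree above shows (one sub-directory per dataset, PNG images directly inside it); the function name `list_datasets` is hypothetical and not part of this repository.

```python
from pathlib import Path


def list_datasets(data_root):
    """Map each dataset directory under data_root to its sorted PNG file names."""
    root = Path(data_root)
    layout = {}
    for dataset_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Only PNG files directly inside the dataset directory are counted.
        layout[dataset_dir.name] = sorted(f.name for f in dataset_dir.glob("*.png"))
    return layout


if __name__ == "__main__":
    data_root = Path("data")  # placeholder path, adjust to your setup
    if data_root.is_dir():
        for name, images in list_datasets(data_root).items():
            print(f"{name}: {len(images)} PNG image(s)")
```

A dataset that reports zero images usually means the files sit one level too deep or use a different extension.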
- The FGF2 dataset can be downloaded here
- The ICB dataset can be downloaded here
- The STimage datasets can be downloaded here
Set the path of tmc3 before running:
bash train.sh image_dir output_root
Notes:
- The pipeline runs training, encoding, decoding, and evaluation.
- Outputs are organized as <output_root>/<dataset_name>/<image_name>/...
- Logs are written to outputs.log under each image's output directory.
- Encoded bitstreams are saved to <model_path>/bitstreams.
- Per-image metrics are saved to <output_root>/<dataset_name>/metrics.json and <output_root>/<dataset_name>/metrics.csv.
- For very large images, use --disable_lpips to skip LPIPS and avoid potential GPU OOM during evaluation.
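To get a dataset-level summary from the per-image metrics, something like the sketch below may be useful. The column names in metrics.csv are not documented here, so the code makes no assumption about them: it averages every column whose values all parse as numbers and skips the rest. The function name `average_numeric_columns` is hypothetical and not part of this repository.

```python
import csv
from pathlib import Path


def average_numeric_columns(csv_path):
    """Average every column whose values all parse as floats; skip the others."""
    with open(csv_path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    averages = {}
    if not rows:
        return averages
    for column in rows[0].keys():
        try:
            values = [float(row[column]) for row in rows]
        except (TypeError, ValueError):
            continue  # non-numeric column, for example an image name
        averages[column] = sum(values) / len(values)
    return averages


if __name__ == "__main__":
    # Placeholder path following the output layout described above.
    metrics_file = Path("output_root") / "dataset_name" / "metrics.csv"
    if metrics_file.exists():
        for column, mean in average_numeric_columns(metrics_file).items():
            print(f"{column}: {mean:.4f}")
```

Note that a column of purely numeric image names would also be averaged, so the output is best read alongside the CSV header.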
This codebase is built upon several excellent open-source projects, including LIG, GaussianImage, gsplat, and HAC-plus. We sincerely thank the authors of these works for making their code publicly available.
If you use the SGI algorithm in your research, please cite our paper:
@misc{pan2026sgistructured2dgaussians,
title={SGI: Structured 2D Gaussians for Efficient and Compact Large Image Representation},
author={Zixuan Pan and Kaiyuan Tang and Jun Xia and Yifan Qin and Lin Gu and Chaoli Wang and Jianxu Chen and Yiyu Shi},
year={2026},
eprint={2603.07789},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2603.07789},
}
