Are there any suggestions for training a dataset with a large scene_scale using MCMC? #542

Open
a11enL opened this issue Jan 19, 2025 · 1 comment

a11enL commented Jan 19, 2025

gsplat 1.4.0

python3 examples/simple_trainer.py mcmc --use_bilateral_grid --data_factor 1 --data_dir data/test123/ --result_dir exports/test123/

Part of the logs:

...
[Parser] 120 images, taken by 120 cameras.
Scene scale: 2009.6237303032315
Model initialized. Number of GS: 159431
...
loss=0.475| sh degree=0| : 100%|██▋| 299/30000 [00:25< ......
......
loss=0.185| sh degree=3| : 25%|██████████████████████▌ | 7600/30000 [13:15<39:05, 9.55it/s]
Traceback (most recent call last):
File "/home/ubuntu/gsplat-1.4.0/examples/simple_trainer.py", line 1120, in <module>
cli(main, cfg, verbose=True)
File "/home/ubuntu/gsplat-1.4.0/gsplat/distributed.py", line 360, in cli
return _distributed_worker(0, 1, fn=fn, args=args)
File "/home/ubuntu/gsplat-1.4.0/gsplat/distributed.py", line 295, in _distributed_worker
fn(local_rank, world_rank, world_size, args)
File "/home/ubuntu/gsplat-1.4.0/examples/simple_trainer.py", line 1065, in main
runner.train()
File "/home/ubuntu/gsplat-1.4.0/examples/simple_trainer.py", line 820, in train
self.cfg.strategy.step_post_backward(
File "/home/ubuntu/gsplat-1.4.0/gsplat/strategy/mcmc.py", line 128, in step_post_backward
n_relocated_gs = self._relocate_gs(params, optimizers, binoms)
File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/ubuntu/gsplat-1.4.0/gsplat/strategy/mcmc.py", line 158, in _relocate_gs
relocate(
File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/ubuntu/gsplat-1.4.0/gsplat/strategy/ops.py", line 278, in relocate
sampled_idxs = _multinomial_sample(probs, n, replacement=True)
File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/ubuntu/gsplat-1.4.0/gsplat/strategy/ops.py", line 31, in _multinomial_sample
assert not num_elements == 0, ('_multinomial_sample weights 0')
AssertionError: _multinomial_sample weights 0

It seems gsplat will crash, or generate a PLY file containing nothing but noise if I change opacity_reg to 0.001, when the dataset's scene_scale is larger than 2000 or 10000. In all of these cases the training loss stays high. I have three datasets from an AI tool that show this behavior. These datasets all work well with COLMAP, without any errors or warnings, and I have never encountered this situation with a traditional COLMAP dataset. Thanks.
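
For context on what a scene scale around 2000 means and how it could be brought down, here is a minimal NumPy sketch. It assumes the scene scale is measured as the largest distance from any camera center to the mean camera center. That definition is an assumption for illustration, as are the helper names `scene_scale` and `rescale_scene` (neither is a gsplat function) and the random camera and point data. It is not a confirmed fix for the crash above.

```python
import numpy as np


def scene_scale(camera_positions: np.ndarray) -> float:
    """Largest distance from a camera center to the mean camera center.

    Assumed definition for illustration; hypothetical helper, not a gsplat API.
    """
    center = camera_positions.mean(axis=0)
    return float(np.linalg.norm(camera_positions - center, axis=1).max())


def rescale_scene(camera_positions, points, target_scale=1.0):
    """Translate and uniformly scale cameras and points to a target scene scale."""
    current = scene_scale(camera_positions)
    if current == 0.0:
        raise ValueError("all cameras share one position; scale is undefined")
    center = camera_positions.mean(axis=0)
    factor = target_scale / current
    # The same similarity transform is applied to cameras and points,
    # so their relative geometry is preserved.
    new_cameras = (camera_positions - center) * factor
    new_points = (points - center) * factor
    return new_cameras, new_points, factor


# Synthetic stand-in data (hypothetical): 120 cameras spread over a large range.
rng = np.random.default_rng(0)
cams = rng.normal(size=(120, 3)) * 1000.0
pts = rng.normal(size=(1000, 3)) * 1500.0

new_cams, new_pts, factor = rescale_scene(cams, pts)
print("scale before:", scene_scale(cams))
print("scale after: ", scene_scale(new_cams))
```

Only positions are translated and scaled here. Camera rotations would stay unchanged, and any depth or distance values stored alongside the dataset would need the same factor.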


a11enL commented Jan 20, 2025

Currently the problem is that the loss doesn't converge; it stalls at 0.1 or 0.2.
I have tried adjusting noise-lr, opacity_reg, and scale_reg, and filtering out the cases with zero alive GS, among other things.
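
The "zero alive GS" filtering mentioned above can be sketched roughly as follows. In the traceback, the assertion fires when the weights passed to the sampling step contain zero elements. This NumPy sketch returns an empty index array in that case instead of failing. The function name `sample_alive_indices`, the `min_opacity` threshold value, and the use of NumPy rather than the library's own sampling routine are all assumptions for illustration, not gsplat's implementation.

```python
import numpy as np


def sample_alive_indices(opacities, min_opacity, n, rng):
    """Sample n indices among 'alive' Gaussians, weighted by opacity.

    Hypothetical helper: a Gaussian counts as alive when its opacity is
    above min_opacity (assumed non-negative). Returns an empty array when
    nothing is alive, rather than sampling from empty weights.
    """
    alive = np.flatnonzero(opacities > min_opacity)
    if alive.size == 0 or n == 0:
        # Nothing to sample from: skip relocation for this step.
        return np.empty(0, dtype=np.int64)
    weights = opacities[alive]
    probs = weights / weights.sum()
    return rng.choice(alive, size=n, replace=True, p=probs)


rng = np.random.default_rng(0)

# Every Gaussian is below the threshold: the guard returns no indices.
dead_everywhere = sample_alive_indices(np.zeros(5), 0.005, 3, rng)
print(dead_everywhere.size)

# Mixed case: only indices 1 and 3 are alive and can be sampled.
mixed = sample_alive_indices(np.array([0.0, 0.5, 0.0, 0.2]), 0.005, 10, rng)
print(sorted(set(mixed.tolist())))
```

A guard like this only avoids the crash. If every Gaussian has fallen below the opacity threshold, the optimization has already collapsed, which would be consistent with the non-converging loss described here.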

This is the dataset that shows the situation above:

120 images
120 cameras
159431 points
Source: colmap image_undistorter

Image

Download (183 MB, Google Drive):
https://drive.google.com/file/d/1pn8b74AAGobQI-Bxv6arLA-B6-l8IIWQ/view?usp=sharing
