Thank you for the great work!
May I ask what GPU configuration is required for inference with HJB optimization, and whether there are any constraints on the reference image or the target video? I ran into an OOM error when running command_op_infer.sh on an H100 (80 GB), even with PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True already set. Could you please advise how to resolve this? Thank you very much!
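For completeness, this is how I confirm the allocator option is actually picked up before the run (just a small sanity-check snippet on my side, not part of the repo; the env var can equivalently be exported in the shell before launching command_op_infer.sh):

```python
import os

# PYTORCH_CUDA_ALLOC_CONF must be set before the first CUDA allocation,
# i.e. before torch initializes the CUDA context.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

import torch

print(torch.cuda.get_device_name(0))  # H100 80GB in my case
print(torch.cuda.mem_get_info(0))     # (free_bytes, total_bytes) sanity check
```

The warning and full traceback are below.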
```
<PROJECT_PATH>/animation/helper/backbones/iresnet.py:149: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
with torch.cuda.amp.autocast(self.fp16):
4%|█████ | 1/25 [00:28<11:14, 28.08s/it] current iteration: 0
4%|█████ | 1/25 [00:36<14:30, 36.26s/it]
Traceback (most recent call last):
File "<PROJECT_PATH>/inference_op.py", line 471, in <module>
video_frames = pipeline(
File "<PROJECT_PATH>/animation/pipelines/inference_pipeline_animation_pro.py", line 692, in __call__
latents = self.scheduler.step(
File "<PROJECT_PATH>/animation/pipelines/euler_discrete_pro.py", line 702, in step
pred_frames = decode_latents_scheduler_new(
latents=z0,
num_frames=num_frames,
decode_chunk_size=decode_chunk_size,
vae=vae,
device=device
)
File "<PROJECT_PATH>/animation/pipelines/euler_discrete_pro.py", line 107, in decode_latents_scheduler_new
frame = vae.decode(latents[i: i + decode_chunk_size], **decode_kwargs).sample
File "<ENV_PATH>/lib/python3.10/site-packages/diffusers/utils/accelerate_utils.py", line 46, in wrapper
return method(self, *args, **kwargs)
File "<PROJECT_PATH>/animation/modules/refined_vae.py", line 355, in decode
decoded = self.decoder(
z,
num_frames=num_frames,
image_only_indicator=image_only_indicator
)
File "<ENV_PATH>/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "<ENV_PATH>/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "<PROJECT_PATH>/animation/modules/refined_vae.py", line 107, in forward
sample = torch.utils.checkpoint.checkpoint(
custom_forward,
*inputs
)
File "<ENV_PATH>/lib/python3.10/site-packages/torch/_compile.py", line 32, in inner
return disable_fn(*args, **kwargs)
File "<ENV_PATH>/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 632, in _fn
return fn(*args, **kwargs)
File "<ENV_PATH>/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 496, in checkpoint
ret = function(*args, **kwargs)
File "<PROJECT_PATH>/animation/modules/refined_vae.py", line 91, in custom_forward
return module(*inputs)
File "<ENV_PATH>/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "<ENV_PATH>/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "<ENV_PATH>/lib/python3.10/site-packages/diffusers/models/unets/unet_3d_blocks.py", line 1000, in forward
hidden_states = resnet(hidden_states, temb)
File "<ENV_PATH>/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "<ENV_PATH>/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "<ENV_PATH>/lib/python3.10/site-packages/diffusers/models/resnet.py", line 693, in forward
hidden_states = self.spatial_res_block(hidden_states, temb)
File "<ENV_PATH>/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "<ENV_PATH>/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "<ENV_PATH>/lib/python3.10/site-packages/diffusers/models/resnet.py", line 327, in forward
hidden_states = self.norm1(hidden_states)
File "<ENV_PATH>/lib/python3.10/site-packages/torch/nn/modules/normalization.py", line 313, in forward
return F.group_norm(input, self.num_groups, self.weight, self.bias, self.eps)
File "<ENV_PATH>/lib/python3.10/site-packages/torch/nn/functional.py", line 2955, in group_norm
return torch.group_norm(input, self.num_groups, self.weight, self.bias, self.eps)
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1024.00 MiB. GPU 0 has a total capacity of 79.19 GiB, of which 913.06 MiB is free. Including non-PyTorch memory, this process has 78.29 GiB in use. Of that, 77.22 GiB is allocated by PyTorch and 374.74 MiB is reserved but unallocated.
If reserved but unallocated memory is large, try setting `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True` to avoid fragmentation.
See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
```
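From the traceback, the OOM happens inside vae.decode() while the final latents are decoded in chunks (decode_latents_scheduler_new in euler_discrete_pro.py), so I suspect the decode chunk is simply too large for a single 80 GB card at my input size. For illustration, this is the kind of change I was considering trying locally: a smaller decode_chunk_size and moving each decoded chunk off the GPU right away. The function below is only my sketch modeled on the line shown in the traceback, not the actual pipeline API, so please correct me if there is an intended way to do this:

```python
import torch

def decode_in_small_chunks(vae, latents, decode_chunk_size=1, **decode_kwargs):
    """Sketch of the chunked decode from decode_latents_scheduler_new, with a
    smaller chunk so the VAE decoder's peak memory stays lower on one GPU.
    decode_kwargs stands in for whatever the pipeline normally passes
    (e.g. num_frames); I am guessing at the exact signature."""
    frames = []
    for i in range(0, latents.shape[0], decode_chunk_size):
        with torch.no_grad():
            chunk = vae.decode(latents[i : i + decode_chunk_size], **decode_kwargs).sample
        frames.append(chunk.cpu())      # move decoded frames to CPU immediately
        torch.cuda.empty_cache()        # release freed blocks between chunks
    return torch.cat(frames, dim=0)
```

If command_op_infer.sh or inference_op.py already exposes a flag for the decode chunk size, or if the intended workaround is a lower resolution or a shorter clip, I would be happy to use that instead.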