Summary
RVC loads a selected .pth voice model with torch.load(...) and does not pass weights_only=True. The project pins torch below 2.6, where the default is weights_only=False, so loading a model performs full pickle deserialization. RVC voice models are routinely downloaded and shared, so a malicious .pth placed in the weights directory and selected in the inference dropdown executes arbitrary code on the victim's machine the moment it is loaded. Confirmed against the real load path: a crafted .pth ran a command during unpickling.
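As a minimal illustration of why full pickle deserialization amounts to code execution, the sketch below uses only the standard-library pickle module and a deliberately harmless callable (sorted). The Payload class is a hypothetical stand-in, not the crafted .pth from this report. It shows that the loaded data, not the loading program, chooses which function gets called.

```python
import pickle


class Payload:
    """Harmless stand-in: instructs pickle to call sorted([3, 1, 2]) at load time."""

    def __reduce__(self):
        # pickle stores (callable, args) and invokes callable(*args) while unpickling.
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(Payload())

# The loader never asked for sorted(); the serialized bytes selected it.
result = pickle.loads(blob)
print(result)  # the return value of the call, not a Payload instance
```

An attacker-controlled file can substitute any importable callable in the same position, which is why the default weights_only=False load path described above is unsafe for untrusted models.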
Details
infer/modules/vc/modules.py (~lines 100 to 103):
person = f'{os.getenv("weight_root")}/{sid}'
...
self.cpt = torch.load(person, map_location="cpu") # no weights_only -> pickle on torch < 2.6
sid is the value of the inference voice-model dropdown (sid0, infer-web.py ~line 819). The dropdown is wired with sid0.change(fn=vc.get_vc, ...) (infer-web.py ~lines 1107 to 1109) and is also exposed as a Gradio api_name. It lists the *.pth files in weight_root (assets/weights), which is exactly where users drop downloaded voice models. The same unsafe torch.load (no weights_only) appears elsewhere in the repo (uvr5 infer/lib/uvr5_pack/... vr.py ~33/214, rmvpe.py ~544, process_ckpt.py, and the pretrained loads in train.py); the inference dropdown is the directly user-facing path.
Loading a shared or downloaded voice model is the normal, advertised way to use RVC (the ecosystem revolves around trading .pth models), so a malicious model crosses the well-recognized trust boundary at which loading an untrusted model becomes code execution.
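One way to enumerate call sites like those listed above is a small static check. The following is a sketch under stated assumptions: the helper name find_unsafe_torch_loads is hypothetical, it only recognizes calls spelled literally as torch.load(...), and it would miss aliased imports such as from torch import load.

```python
import ast


def find_unsafe_torch_loads(source, filename="<string>"):
    """Return line numbers of torch.load(...) calls that lack weights_only=True."""
    findings = []
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match only the literal attribute access torch.load.
        is_torch_load = (
            isinstance(func, ast.Attribute)
            and func.attr == "load"
            and isinstance(func.value, ast.Name)
            and func.value.id == "torch"
        )
        if not is_torch_load:
            continue
        # A call counts as safe only if weights_only is the literal True.
        safe = any(
            kw.arg == "weights_only"
            and isinstance(kw.value, ast.Constant)
            and kw.value.value is True
            for kw in node.keywords
        )
        if not safe:
            findings.append(node.lineno)
    return findings


# Illustrative input: the source is parsed, never executed.
sample = '''
import torch
cpt = torch.load(person, map_location="cpu")
ok = torch.load(person, map_location="cpu", weights_only=True)
'''
unsafe_lines = find_unsafe_torch_loads(sample)
print(unsafe_lines)
```

Running the helper over each .py file in the repository would give a starting list to review by hand; it is not a substitute for auditing the load paths themselves.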
PoC
Impact
A user who downloads a malicious RVC voice model and selects it for inference runs attacker-controlled code with their own privileges; no further interaction is required. If the WebUI is run with --listen or share and an attacker can place a file in the weights directory, the exposure increases.
Remediation
Load with torch.load(person, map_location="cpu", weights_only=True) and read cpt["weight"] as a plain state dict; if non-tensor metadata is needed, allowlist the required globals via torch.serialization.add_safe_globals. Apply the same fix to every torch.load in the repo. Consider migrating model artifacts to safetensors.
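The remediation can be sketched as follows. This is an illustration, not the project's code: the helper load_checkpoint_safely, the file names, and the demo checkpoint contents are hypothetical, and it assumes a PyTorch version that supports the weights_only argument. A fractions.Fraction value stands in for any object outside the restricted unpickler's allowlist.

```python
import fractions
import os
import pickle
import tempfile

import torch


def load_checkpoint_safely(path):
    """Load a checkpoint with the restricted unpickler (tensors and plain containers)."""
    # weights_only=True makes torch.load refuse globals outside its allowlist
    # instead of importing and calling them.
    return torch.load(path, map_location="cpu", weights_only=True)


with tempfile.TemporaryDirectory() as tmp:
    # A checkpoint holding only tensors and plain containers loads normally.
    good = os.path.join(tmp, "good.pth")
    torch.save({"weight": {"layer.bias": torch.zeros(3)}}, good)
    cpt = load_checkpoint_safely(good)
    state_dict = cpt["weight"]  # read as a plain state dict

    # A checkpoint that references a non-allowlisted global is rejected.
    bad = os.path.join(tmp, "bad.pth")
    torch.save({"weight": {}, "extra": fractions.Fraction(1, 2)}, bad)
    try:
        load_checkpoint_safely(bad)
        rejected = False
    except pickle.UnpicklingError:
        rejected = True

print(sorted(state_dict), rejected)
```

If a legitimate checkpoint fails in this way because it carries non-tensor metadata, the specific types it needs can be allowlisted with torch.serialization.add_safe_globals in PyTorch releases that provide it, rather than falling back to weights_only=False.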