Fix path traversal when loading OneFormer image processor metadata#46270
Closed
LinZiyuu wants to merge 1 commit into
The `class_info_file` and `repo_path` fields of the OneFormer image processor config are loaded verbatim from `preprocessor_config.json` by `from_pretrained`, but they are untrusted. `class_info_file` is documented to live inside `repo_path`, yet it flows straight into `os.path.join(repo_path, class_info_file)` -> `open(...)` -> `json.load` in `load_metadata`, which runs during the processor's `__init__`. A value like `"../../secret.json"` or an absolute path escapes `repo_path`, so loading a malicious model via `AutoImageProcessor.from_pretrained(...)` (no `trust_remote_code`) reads an arbitrary local JSON file off the victim's machine. This is a sibling of the Bark (huggingface#46237) and chat-template (huggingface#46191) path traversals.

The fix verifies that the resolved metadata path stays inside `repo_path` before reading it, allowing files in subdirectories but rejecting `..` and absolute-path escapes. It is applied to both the torchvision and PIL backends, with regression tests for the escape and the allowed-subdirectory case.
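For readers who want to see the shape of such a check, here is a minimal sketch of a containment test of the kind described above. It is not the code from this PR: the helper name `resolve_inside` and its parameters are hypothetical, and it assumes that resolving symlinks with `os.path.realpath` before comparing is the desired behavior.

```python
import os


def resolve_inside(base_dir: str, relative_name: str) -> str:
    """Return the resolved path of base_dir/relative_name, or raise ValueError
    if that path would land outside base_dir. (Hypothetical helper.)"""
    base = os.path.realpath(base_dir)
    # os.path.join discards `base` when relative_name is absolute, so an
    # absolute value is caught by the same comparison as a ".." escape.
    candidate = os.path.realpath(os.path.join(base, relative_name))
    try:
        inside = os.path.commonpath([base, candidate]) == base
    except ValueError:
        # Raised e.g. for paths on different drives; treat as an escape.
        inside = False
    if not inside:
        raise ValueError(f"{relative_name!r} resolves outside {base_dir!r}")
    return candidate
```

With this sketch, a name such as `sub/info.json` resolves normally, while `../../secret.json` or an absolute path raises `ValueError` before any file is opened.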
[For maintainers] Suggested jobs to run (before merge) run-slow: oneformer
This was referenced May 29, 2026
This doesn’t seem like a real security boundary: both `repo_path` and `class_info_file` come from the same untrusted processor config, so constraining one relative to the other does not prevent a malicious config from choosing a broader local base path. I don't see how an attacker benefits from this without already having critical access. Closing for now, feel free to reopen if there's a misunderstanding.
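The objection can be illustrated with a small sketch. It assumes POSIX-style paths and uses a hypothetical `is_contained` helper standing in for the kind of check the PR proposes; the config values are made up for illustration and do not come from any real model.

```python
import os


def is_contained(repo_path: str, class_info_file: str) -> bool:
    """Hypothetical stand-in for a 'stays inside repo_path' check."""
    base = os.path.realpath(repo_path)
    target = os.path.realpath(os.path.join(base, class_info_file))
    try:
        return os.path.commonpath([base, target]) == base
    except ValueError:
        return False


# Both fields come from the same untrusted config (illustrative values).
traversal_config = {
    "repo_path": "/models/oneformer",
    "class_info_file": "../../secret.json",
}
broad_base_config = {
    "repo_path": "/",
    "class_info_file": "home/victim/secret.json",
}

# The ".." escape is rejected by the containment check...
rejected = is_contained(**traversal_config)
# ...but a config that simply picks a broader base still passes it.
accepted = is_contained(**broad_base_config)
```

Under these assumptions the first config fails the check while the second passes, which is why constraining `class_info_file` relative to an equally untrusted `repo_path` does not by itself bound what can be read.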