-
-
Notifications
You must be signed in to change notification settings - Fork 31.7k
gh-132933: Allow zipapp target to overwrite filtered source file #132934
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Jolmberg
commented
Apr 25, 2025
•
edited by bedevere-app
bot
Loading
edited by bedevere-app
bot
- Issue: zipapp disregards filter when checking if target overwrites source #132933
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
I wrongly said "no NEWS" but since the fix has been backported to 3.13, we need a NEWS entry as 3.13 is a stable release (and because we already shipped a release of 3.14; actually everything that changes from something that has been shipped in another release should have a NEWS entry when possible) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced we should be supporting the approach of putting the target in the same directory as the sources are coming from. If you can get the logic to work correctly and cleanly, I'll consider adding it, as I don't want to arbitrarily break people's workflows. But please understand, that there's a reason the existing check is simplistic, and it's clearly stated in the comments - it's to catch common errors, not to make the approach robust.
Lib/zipapp.py
Outdated
raise ZipAppError( | ||
f"The target archive {target} overwrites one of the source files.") | ||
if filter: | ||
arcname = target.relative_to(source) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a valid check. source
and target
could be on different drives (on Windows), in which case relative_to
will fail with a ValueError
.
The logic here seems backwards to me. Rather than trying to replicate the filtering done in the archive writing loop, we should be calculating files_to_add more precisely, based on the filter. That wouldn't fix the problem of target
being on a different drive, though - there's a fundamental problem with the whole idea here, as filter
is defined to work on the arcname
of the source file, and target
may not even have a defined arcname
, because logically it isn't a source file.
Ultimately, the correct answer here is what I've said elsewhere - don't put your output file in the directory you're creating the zipapp from. That's a much more straightforward approach than trying to exclude it using a filter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking the time with this! You are right of course, that check is wrong and can fail even within a single drive I believe. I've rewritten the code to filter files_to_add right away and then just use your original target check.
I can appreciate your sentiment about not writing the output file to the source directory, but I would prefer not to mess with my build system and I still think this change is for the better. I use the filter as a whitelist, precisely because I know that I can't trust the source directory to always be clean. I think it's pretty risk free™.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, thinking some more about it, I'm not sure I see how the relative_to check could fail. It's only performed after we've confirmed that target is equal to one of the paths inside source. But there could be intricacies to the Path class that I'm missing, and I think the new code is a cleaner way of doing this either way.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
files_to_add = {f: f2 for f in files_to_add | ||
if filter(f2 := f.relative_to(source))} | ||
else: | ||
files_to_add = {f: f.relative_to(source) for f in files_to_add} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comprehension doesn't add to readability, IMO. I suggest something more like:
files_to_add = []
for file in sorted(source.rglob("*")):
relative_name = file.relative_to(source)
if filter is None or filter(relative_name):
files_to_add.append((file, relative_name))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can do away with the comprehension, but building files_to_add as a list of tuples makes it more difficult to search for target in it. A simple target in files_to_add
will no longer work.
One option is to again use a dictionary for files_to_add. Another is to do the check for target as part of this loop, like this:
files_to_add = []
for file in sorted(source.rglob("*")):
relative_name = file.relative_to(source)
if filter is None or filter(relative_name):
if file == target:
raise ZipAppError(...)
files_to_add.append((file, relative_name))
I have no strong opinion. Doing the target check separately as target in files_to_add
is more idiomatic but doing it as part of the loop (as above) is probably a tiny bit more efficient.
@@ -0,0 +1 @@ | |||
The zipapp module now applies the provided filter to the list of files to be added before determining whether the target file overwrites a source file. This allows the target file to be written inside the source directory provided it is excluded by the filter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree strongly with this news item, as it suggests that writing the target into the source directory is intended usage.
Maybe
The zipapp module now applies the filter when creating the list of files to add, rather than waiting until the file is being added to the archive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I wanted to get a bit of rationale in there, but I agree that it could come off as a recommendation, and it got long-winded too. I'll change it.
@@ -154,15 +159,14 @@ def create_archive(source, target=None, interpreter=None, main=None, | |||
raise ZipAppError( | |||
f"The target archive {target} overwrites one of the source files.") | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unintended blank line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, will fix!
@@ -113,6 +113,18 @@ def test_target_overwrites_source_file(self): | |||
with self.assertRaises(zipapp.ZipAppError): | |||
zipapp.create_archive(source, target) | |||
|
|||
def test_target_overwrites_filtered_source_file(self): | |||
# The target cannot be one of the files to add. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# The target cannot be one of the files to add. | |
# If there's a filter that excludes the target, | |
# the overwrite check shouldn't trigger. |