-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Compression plugin #18314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop2
Are you sure you want to change the base?
Compression plugin #18314
Changes from 25 commits
4cd2811
156e036
c7b5dff
be30ba2
1295d4f
cc4d3ad
e245a93
f094d9a
fee5fab
452f774
c1a7320
a240c8c
cf7de74
557f8e0
f82d764
549e086
ac1fc45
181a736
3b32ec2
b52bce2
91da7a4
e96bd3c
a024baf
989fde9
f986651
8009bed
0db1a7b
dcbb29e
28e9148
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -1,4 +1,6 @@ | ||||||
| import errno | ||||||
| from pathlib import Path | ||||||
| import tempfile | ||||||
| import gzip | ||||||
| import hashlib | ||||||
| import os | ||||||
|
|
@@ -11,10 +13,13 @@ | |||||
|
|
||||||
| from contextlib import contextmanager | ||||||
|
|
||||||
| from conan.api.output import ConanOutput | ||||||
| from conan.errors import ConanException | ||||||
|
|
||||||
| _DIRTY_FOLDER = ".dirty" | ||||||
|
|
||||||
| # Name (without extension) of the tar file to be created by the compression plugin | ||||||
| COMPRESSED_PLUGIN_TAR_NAME = "__conan_plugin_compressed_contents__" | ||||||
|
|
||||||
| def set_dirty(folder): | ||||||
| dirty_file = os.path.normpath(folder) + _DIRTY_FOLDER | ||||||
|
|
@@ -256,15 +261,44 @@ def mkdir(path): | |||||
| os.makedirs(path) | ||||||
|
|
||||||
|
|
||||||
| def tar_extract(fileobj, destination_dir): | ||||||
| def tar_extract(fileobj, destination_dir, compression_plugin=None, conf=None): | ||||||
| if compression_plugin: | ||||||
| _tar_extract_with_plugin(fileobj, destination_dir, compression_plugin, conf) | ||||||
| return | ||||||
| the_tar = tarfile.open(fileobj=fileobj) | ||||||
| # NOTE: The errorlevel=2 has been removed because it was failing in Win10, it didn't allow to | ||||||
| # "could not change modification time", with time=0 | ||||||
| # the_tar.errorlevel = 2 # raise exception if any error | ||||||
| the_tar.extraction_filter = (lambda member, path: member) # fully_trusted, avoid Py3.14 break | ||||||
| the_tar.extractall(path=destination_dir) | ||||||
| the_tar.close() | ||||||
|
|
||||||
| if list(Path(destination_dir).glob(f"{COMPRESSED_PLUGIN_TAR_NAME}.*")): | ||||||
| raise ConanException(f"Error while extracting {os.path.basename(fileobj.name)}.\n" | ||||||
| "This file has been compressed using a `compression` plugin.\n" | ||||||
| "If your organization uses this plugin, ensure it is correctly installed on your environment.") | ||||||
|
|
||||||
| def _tar_extract_with_plugin(fileobj, destination_dir, compression_plugin, conf): | ||||||
| """First remove tar.gz wrapper and then call the plugin to extract""" | ||||||
| with tempfile.TemporaryDirectory() as temp_dir: | ||||||
|
||||||
| pkglist_path = os.path.join(tempfile.gettempdir(), "pkglist.json") |
(do not look at the blame, it was me!) but this case is different as it is only a pckglist extraction + removal.
I could move it also to the conan cache and may consider creating a tmp folder in it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I understand the concern.
Let me think about it.
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In reference to #18259, we'd eventually like to have our own additional files added to the top-level. Don't assume here that it's the only file. I don't think this requires anything other than clarifying the comment.
| # Get the only extracted file: the plugin tar | |
| # Extract the actual contents from the plugin tar (ignore other files present). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greetings @mrjoel,
I understand your concern.
Maybe from the perspective of the git diff it is not clear enough.
With this incoming compression plugin you could do whatever you need with the files.
You just have to implement tar_compress and tar_extract functions in the compression.py file.
In that functions, you could add whatever metadata you need.
The thing is that, to avoid issues with the compressed file extension (which could be whatever the user decides), we have decided to wrap that compressed result file into a constant conan_cache_save.tgz file name.
This wraping is being done with compress level 0 and takes a negligible overhead.
I'll put an example.
Without the plugin:
$ conan cache save "zlib/1.3.1:*"
$ ls -l
.rw-r--r--@ 1,7M perseo 11 jul 16:19 conan_cache_save.tgz
$ tar xzf conan_cache_save.tgz
drwxr-xr-x@ - perseo 11 jul 16:20 b
.rw-r--r--@ 1,7M perseo 11 jul 16:19 conan_cache_save.tgz
.rw-r--r--@ 949 perseo 11 jul 16:19 pkglist.json
drwxr-xr-x@ - perseo 11 jul 16:20 zlib204752602052d
With the plugin, using standard:
$ conan cache save "zlib/1.3.1:*"
$ ls -l
.rw-r--r--@ 1,7M perseo 11 jul 16:19 conan_cache_save.tgz
$ tar xzf conan_cache_save.tgz
.rw-r--r--@ 1,6M perseo 11 jul 16:21 __conan_plugin_compressed_contents__.zst
.rw-r--r--@ 1,6M perseo 11 jul 16:21 conan_cache_save.tgz
$
As you can see, when decompressing the conan_cache_save.tgz file, we will always get the output file of your tar_compress function.
And is that file the one which will be passed to the tar_extract function on your plugin.
In that compressed file, you could have whatever structure and metadata you need. Your tar_extract could work with that metadata as long as always decompress the files in the way conan expects.
Answering your suggestion: there will always be one file, the unwrapped compressed file from the plugin!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@perseoGI Yep, I'm tracking all of that from the diff.
My main concern was just including the assumption and comment that the outer wrapper would only contain a single file. For #18259 we'd really like it to end up being the following, where we add the custom additional files. We don't expect Conan to know or do anything about it, but do want it to not cause issues.
I'm slightly sad that custom files in the direct wrapper won't be compressed at all, but so be it, we mainly expect files of a few hundred KBs (the HTML graph reports).
$ conan cache save "zlib/1.3.1:*"
$ ls -l
.rw-r--r--@ 1,7M perseo 11 jul 16:19 conan_cache_save.tgz
...
# Other things happen externally to add files directly to conan_cache_save.tgz
...
$ tar xzf conan_cache_save.tgz
.rw-r--r--@ 1,6M perseo 11 jul 16:21 __conan_plugin_compressed_contents__.zst
.rw-r--r--@ xxxx perseo 11 jul 16:21 README.txt
.rw-r--r--@ xxxx perseo 11 jul 16:21 Cache-contents-graph-report.html
.rw-r--r--@ xxxx perseo 11 jul 16:21 Some-other-externally-embedded-file.py
.rw-r--r--@ xxxx perseo 11 jul 16:21 Some-other-externally-embedded-file.exe
.rw-r--r--@ 1,6M perseo 11 jul 16:21 conan_cache_save.tgz
$
As a separate question, why gzip -0 the generated archive at all instead of just having it be a plain native .tar file? That would certainly make it easier for us to tar -rvf conan_cache_save.tar file1.txt file2.txt file3.html instead of having to ungzip and regzip a level 0 compression layer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Further, this approach of requiring the compression plugin to be enabled in a user's Conan home means that it is considered for all projects they may be working. We'd like to be able to provide our conan package archives in a non-interference way, not requiring users of our archive to have to use our compression plugin for everything.
Would you consider an approach where an embedded compression.py (or __conan_plugin_compression_extension__.py or some variant) file in the archive level would be used if present? It could then only be used for that archive instead of needing to be installed fully in a user's Conan home?
perseoGI marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
perseoGI marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -158,9 +158,13 @@ def test_cache_save_excluded_folders(): | |||||
|
|
||||||
| # exclude source | ||||||
| c.run("cache save * --no-source") | ||||||
| # Check default compression function is being used and not compression.py plugin one | ||||||
| assert "Compressing conan_cache_save.tgz\n" in c.out | ||||||
|
||||||
| assert "Compressing conan_cache_save.tgz\n" in c.out | |
| assert "Compressing conan_cache_save.tgz" in c.out |
Uh oh!
There was an error while loading. Please reload this page.