@bstrand bstrand commented Oct 25, 2022

Partial fix for #190

NB: To realize the full reduction in repo size, the old blobs must also be stripped from the git history with git filter-repo (see the follow-up below).

Image file size reduction

  • Target: JPEG files larger than 2 MB
  • No images were deleted or resized; only recompressed
  • Recompression was done with ImageMagick with the following parameters:
    magick "$jpg_file" -strip -interlace Plane -sampling-factor 4:2:0 -quality 33%
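
For anyone wanting to repeat the recompression, here is a minimal sketch of how the command above could be applied to every JPEG over 2 MB. The function names, the use of `find` to select files, and the write-to-temp-then-replace step are assumptions for illustration; the command as quoted above does not show an output argument, so in-place replacement is a guess, not necessarily how this PR was produced.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; only the magick options come from the PR text.

# Print every JPEG larger than 2 MiB under the given root, skipping .git.
list_large_jpgs() {
  local root=${1:-.}
  find "$root" -type f \( -iname '*.jpg' -o -iname '*.jpeg' \) \
    -size +2M ! -path '*/.git/*'
}

# Recompress one file with the parameters from the PR, then replace the
# original only if ImageMagick succeeded (assumed in-place workflow).
recompress_jpg() {
  local src=$1
  local tmp="${src}.tmp"
  magick "$src" -strip -interlace Plane -sampling-factor 4:2:0 \
    -quality 33% "jpg:${tmp}" && mv "$tmp" "$src"
}

# Recompress every large JPEG under the given root.
recompress_all() {
  list_large_jpgs "${1:-.}" | while IFS= read -r jpg_file; do
    recompress_jpg "$jpg_file" || echo "failed: $jpg_file" >&2
  done
}
```

Running `list_large_jpgs .` first gives a dry-run list of what `recompress_all .` would touch.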

Results

  • .jpg file sizes reduced by 90%
  • Repo disk usage reduced by 267 MB (504 MB -> 237 MB)
  • Repo is still large because the original files remain in git history (.git grew from 161 MB to 176 MB)
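
To see which old files are keeping the history large, one option is to list the biggest blobs reachable from any ref. This is a sketch using standard git plumbing; the function name `largest_blobs` and its arguments are hypothetical and not part of this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<size in bytes> <path>" for the N largest
# blobs anywhere in history, largest first.
largest_blobs() {
  local repo=${1:-.}
  local count=${2:-10}
  git -C "$repo" rev-list --objects --all |
    git -C "$repo" cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk '$1 == "blob" { sub(/^blob /, ""); print }' |
    sort -rn |
    awk -v n="$count" 'NR <= n'
}
```

For example, `largest_blobs . 20` should show the original uncompressed JPEGs near the top even after this PR is merged, since they are still stored in earlier commits.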

Before:

code_snippets on master✔ » du -h -d 1 . | sort -rh | head -n5
504M	.
318M	./Python
161M	./.git
 24M	./Django_Blog
 28K	./Terminal

After:

code_snippets on 190-recompress-large-jpgs✔ » du -h -d 1 . | sort -rh | head -n5
237M	.
176M	./.git
 50M	./Python
 11M	./Django_Blog
 28K	./Terminal

Follow-up: filter large blobs

To realize the reduction in repo size, strip the old blobs from the history with git filter-repo. Following this guide, with a maximum blob size of 2M:
git filter-repo --strip-blobs-bigger-than 2M
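
A sketch of how the filtering step might be run safely on a throwaway copy first. The function name `strip_large_blobs` and the clone-then-filter layout are assumptions; git filter-repo is a separate tool that must be installed, and it expects to run in a fresh clone unless forced. It rewrites commit hashes, so the result has to be force-pushed and existing clones re-cloned.

```shell
#!/usr/bin/env bash
# Hypothetical helper: clone SRC into DEST, then strip every blob larger
# than LIMIT (default 2M) from all of DEST's history.
strip_large_blobs() {
  local src=$1
  local dest=$2
  local limit=${3:-2M}
  # --no-local forces a real clone even when SRC is a local path.
  git clone -q --no-local "$src" "$dest" &&
    git -C "$dest" filter-repo --strip-blobs-bigger-than "$limit"
}
```

Usage would look like `strip_large_blobs path/to/code_snippets /tmp/code_snippets-filtered`, after which `du -sh /tmp/code_snippets-filtered/.git` can be compared with the original.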

Results of running the above on my fork of the code_snippets repo:

  • Repo download (clone) reduced to 17 MB (down from 164 MB)
  • Repo total disk usage reduced to 69 MB (down from 504 MB)

Clone before:
Receiving objects: 100% (940/940), 163.91 MiB | 9.24 MiB/s, done.
Clone after:
Receiving objects: 100% (923/923), 16.68 MiB | 7.64 MiB/s, done.

Before:

code_snippets on master✔ » du -h -d 1 . | sort -rh | head -n5
504M	.
318M	./Python
161M	./.git
 24M	./Django_Blog
 28K	./Terminal

After:

code_snippets on master✔ » du -h -d 1 . | sort -rh | head -n5
 69M	.
 40M	./Python
 18M	./.git
 11M	./Django_Blog
 28K	./Terminal
