How is it decided how to optimize images? #1487
The process is complicated. In the default mode, we process the file with Ghostscript, which converts it to PDF/A. In the process, Ghostscript will optimize images and change some image formats. This can convert lossless images to lossy in some cases. This behavior can be disabled with [...].

The second-pass optimizer then reviews each image and tries a variety of optimization strategies. If an image can be quantized to 1-bit PNG without much loss, then you might see an original JPEG/TIFF converted to PNG and then to JBIG2, provided all the dependencies are available. If an attempt to optimize an image results in a larger byte count, the optimization is discarded.

If JBIG2 is available, all monochrome images get converted to JBIG2 or CCITT. All palette images get converted to PNG. Most JPEG2000 images get converted to JPEG. There are all kinds of exceptions for nonstandard images.
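To illustrate the "discard the optimization if it results in a larger byte count" rule, here is a minimal sketch. It is not OCRmyPDF's actual code: `pick_smallest`, the strategy labels, and the use of `zlib` as a stand-in encoder are all assumptions made for the example.

```python
import zlib
from typing import Mapping, Tuple


def pick_smallest(original: bytes, candidates: Mapping[str, bytes]) -> Tuple[str, bytes]:
    """Return (label, data) for the smallest encoding.

    Hypothetical helper: the original is kept unless some candidate
    is strictly smaller in byte count.
    """
    best_label, best_data = "original", original
    for label, data in candidates.items():
        if len(data) < len(best_data):
            best_label, best_data = label, data
    return best_label, best_data


# Stand-in image data: 1000 zero bytes, which compress very well.
raw = bytes(1000)

# Stand-in "strategies"; a real optimizer would try actual image encoders.
attempts = {
    "deflate": zlib.compress(raw, 9),  # much smaller than the input
    "padded": raw + b"\x00",           # larger than the input, so never chosen
}

label, data = pick_smallest(raw, attempts)
print(label, len(data))
```

With incompressible or very small input, every attempt comes out larger than the input, so the function returns the `"original"` label and the image is left untouched.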
Are BW images detected using metadata or by heuristics? Say, if it's a BW image saved in grayscale, will it be optimized "back" to BW? Or is this part done by GS?
As an example, I scan documents in grayscale TIFF, combine them with img2pdf, and then run [...]. The CLI output shows that some files are optimized as JPEG, some as JBIG2, some as PNG (although initially all were TIFFs).
How does OCRmyPDF decide how to optimize images and to which formats to convert them?