Why do files become bigger after optimization without OCR? #1486
Unanswered
homocomputeris asked this question in Q&A
Replies: 2 comments
-
It could be, all else being equal, but it is impossible to say without a reproducing test file and command line.
0 replies
-
OK, I managed to make an MWE. Here is the full file-generating chain, followed by the directory listing and the verbose output:

# Page size, OCR languages, and metadata
PAPERSIZE='A4'
LANG='jpn+eng'
TITLE='test'
AUTHOR='test'
name='test'
PARTKEYWORDS='manual'
KEYWORDS="name ${name}; ${PARTKEYWORDS}"
# Build the PDF from the scanned TIFFs
echo "Running img2pdf"
img2pdf -S "${PAPERSIZE}" --title "${TITLE}" --author "${AUTHOR}" --keywords "${KEYWORDS}" -o "./${name}.pdf" ./out/*.tif
# First pass: OCR with -O1
ocrmypdf --output-type pdf --oversample 600 -l "${LANG}" --title "${TITLE}" --author "${AUTHOR}" --keywords "${KEYWORDS}" -O1 --fast-web-view 10 "./${name}.pdf" "./${name}_O1.pdf"
# Second and third passes: no new OCR (--skip-text), higher optimization levels
ocrmypdf --verbose --skip-text --tesseract-timeout=0 --remove-background --optimize 2 "./${name}_O1.pdf" "./${name}_O2.pdf"
ocrmypdf --verbose --skip-text --tesseract-timeout=0 --remove-background --optimize 3 "./${name}_O2.pdf" "./${name}_O3.pdf"

% eza -l
drwxr-xr-x - user 22 Feb 21:30 out
.rwx------@ 719 user 22 Feb 21:58 test.ocr.zsh
.rw-r--r--@ 47M user 22 Feb 21:58 test.pdf
.rw-r--r--@ 47M user 22 Feb 22:00 test_O1.pdf
.rw-r--r--@ 3.5M user 22 Feb 22:00 test_O2.pdf
.rw-r--r--@ 3.6M user 22 Feb 22:00 test_O3.pdf

Verbose output:
% zsh ./*.ocr.zsh
Running img2pdf
Scanning contents ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 6/6 0:00:00
Start processing 6 pages concurrently ocr.py:96
4 [tesseract] lots of diacritics - possibly poor OCR tesseract.py:241
3 [tesseract] lots of diacritics - possibly poor OCR tesseract.py:241
2 [tesseract] Image too small to scale!! (2x36 vs min width tesseract.py:259
of 3)
2 [tesseract] Line cannot be recognized!! tesseract.py:259
2 [tesseract] Image too small to scale!! (2x36 vs min width tesseract.py:259
of 3)
2 [tesseract] Line cannot be recognized!! tesseract.py:259
OCR ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 6/6 0:00:00
Postprocessing... ocr.py:144
Linearizing ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 100/100 0:00:00
Recompressing JPEGs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0% 0/0 -:--:--
Deflating JPEGs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0% 0/0 -:--:--
JBIG2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 1/1 0:00:00
Image optimization ratio: 1.00 savings: 0.1% _pipeline.py:1002
Total file size ratio: 1.00 savings: -0.1% _pipeline.py:1005
ocrmypdf 16.9.0 __main__.py:59
Running: ['tesseract', '--version'] __init__.py:133
Found tesseract 5.5.0 __init__.py:345
Running: ['tesseract', '--version'] __init__.py:133
Running: ['tesseract', '--version'] __init__.py:133
Running: ['pngquant', '--version'] __init__.py:133
Found pngquant 3.0.3 __init__.py:345
Running: ['jbig2', '--version'] __init__.py:133
Found jbig2 0.30 __init__.py:345
Running: ['gs', '--version'] __init__.py:133
Found gs 10.4.0 __init__.py:345
Running: ['gs', '--version'] __init__.py:133
Running: ['tesseract', '--list-langs'] __init__.py:133
stdout/stderr = List of available languages in __init__.py:73
"/usr/local/share/tessdata/" (163):
afr
amh
ara
asm
aze
aze_cyrl
bel
ben
bod
bos
bre
bul
cat
ceb
ces
chi_sim
chi_sim_vert
chi_tra
chi_tra_vert
chr
cos
cym
dan
deu
div
dzo
ell
eng
enm
epo
equ
est
eus
fao
fas
fil
fin
fra
frk
frm
fry
gla
gle
glg
grc
guj
hat
heb
hin
hrv
hun
hye
iku
ind
isl
ita
ita_old
jav
jpn
jpn_vert
kan
kat
kat_old
kaz
khm
kir
kmr
kor
kor_vert
lao
lat
lav
lit
ltz
mal
mar
mkd
mlt
mon
mri
msa
mya
nep
nld
nor
oci
ori
osd
pan
pol
por
pus
que
ron
rus
san
script/Arabic
script/Armenian
script/Bengali
script/Canadian_Aboriginal
script/Cherokee
script/Cyrillic
script/Devanagari
script/Ethiopic
script/Fraktur
script/Georgian
script/Greek
script/Gujarati
script/Gurmukhi
script/HanS
script/HanS_vert
script/HanT
script/HanT_vert
script/Hangul
script/Hangul_vert
script/Hebrew
script/Japanese
script/Japanese_vert
script/Kannada
script/Khmer
script/Lao
script/Latin
script/Malayalam
script/Myanmar
script/Oriya
script/Sinhala
script/Syriac
script/Tamil
script/Telugu
script/Thaana
script/Thai
script/Tibetan
script/Vietnamese
sin
slk
slv
snd
snum
spa
spa_old
sqi
srp
srp_latn
sun
swa
swe
syr
tam
tat
tel
tgk
tha
tir
ton
tur
uig
ukr
urd
uzb
uzb_cyrl
vie
yid
yor
pikepdf mmap enabled helpers.py:328
os.symlink(./test_O1.pdf, helpers.py:179
/var/folders/8z/d8_btvkx7rjbr1q7zm_ynztc0000gn/T/ocrmypdf.io.k_2g
p6st/origin)
Gathering info with 1 thread workers info.py:816
pikepdf mmap enabled helpers.py:328
Scanning contents ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 6/6 0:00:00
Using Tesseract OpenMP thread limit 1 tesseract_ocr.py:199
Start processing 6 pages concurrently ocr.py:96
pikepdf mmap enabled helpers.py:328
pikepdf mmap enabled helpers.py:328
pikepdf mmap enabled helpers.py:328
1 skipping all processing on this page _pipeline.py:343
pikepdf mmap enabled helpers.py:328
pikepdf mmap enabled helpers.py:328
2 skipping all processing on this page _pipeline.py:343
pikepdf mmap enabled helpers.py:328
3 skipping all processing on this page _pipeline.py:343
4 skipping all processing on this page _pipeline.py:343
1 Text rotation: (text, autorotate, content) -> text _graft.py:152
misalignment = (0, 0, 0) -> 0
5 skipping all processing on this page _pipeline.py:343
6 skipping all processing on this page _pipeline.py:343
1 Page rotation: (content, auto) -> page = (0, 0) -> 0 _graft.py:177
2 Text rotation: (text, autorotate, content) -> text _graft.py:152
misalignment = (0, 0, 0) -> 0
2 Page rotation: (content, auto) -> page = (0, 0) -> 0 _graft.py:177
3 Text rotation: (text, autorotate, content) -> text _graft.py:152
misalignment = (0, 0, 0) -> 0
3 Page rotation: (content, auto) -> page = (0, 0) -> 0 _graft.py:177
4 Text rotation: (text, autorotate, content) -> text _graft.py:152
misalignment = (0, 0, 0) -> 0
4 Page rotation: (content, auto) -> page = (0, 0) -> 0 _graft.py:177
5 Text rotation: (text, autorotate, content) -> text _graft.py:152
misalignment = (0, 0, 0) -> 0
5 Page rotation: (content, auto) -> page = (0, 0) -> 0 _graft.py:177
6 Text rotation: (text, autorotate, content) -> text _graft.py:152
misalignment = (0, 0, 0) -> 0
6 Page rotation: (content, auto) -> page = (0, 0) -> 0 _graft.py:177
Image processing ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 6/6 0:00:00
Postprocessing... ocr.py:144
os.symlink(/var/folders/8z/d8_btvkx7rjbr1q7zm_ynztc0000gn/T/ocrmy helpers.py:179
pdf.io.k_2gp6st/graft_layers.pdf,
/var/folders/8z/d8_btvkx7rjbr1q7zm_ynztc0000gn/T/ocrmypdf.io.k_2g
p6st/fix_docinfo.pdf)
Running: ['gs', '--version'] __init__.py:133
Running: ['gs', '-dBATCH', '-dNOPAUSE', '-dSAFER', __init__.py:133
'-dCompatibilityLevel=1.6', '-sDEVICE=pdfwrite',
'-dAutoRotatePages=/None',
'-sColorConversionStrategy=LeaveColorUnchanged',
'-dPDFSTOPONERROR', '-dAutoFilterColorImages=true',
'-dAutoFilterGrayImages=true', '-dJPEGQ=95', '-dPDFA=2',
'-dPDFACompatibilityPolicy=1', '-o',
'/var/folders/8z/d8_btvkx7rjbr1q7zm_ynztc0000gn/T/ocrmypdf.io.k_
2gp6st/pdfa.pdf', '-sstdout=%stderr',
'/var/folders/8z/d8_btvkx7rjbr1q7zm_ynztc0000gn/T/ocrmypdf.io.k_
2gp6st/pdfa.ps',
'/var/folders/8z/d8_btvkx7rjbr1q7zm_ynztc0000gn/T/ocrmypdf.io.k_
2gp6st/fix_docinfo.pdf']
GPL Ghostscript 10.04.0 (2024-09-18) __init__.py:108
Copyright (C) 2024 Artifex Software, Inc. All rights reserved. __init__.py:108
This software is supplied under the GNU AGPLv3 and comes with NO __init__.py:108
WARRANTY:
see the file COPYING for details. __init__.py:108
Processing pages 1 through 6. __init__.py:108
Page 1 __init__.py:108
Page 2 __init__.py:108
Page 3 __init__.py:108
Page 4 __init__.py:108
Page 5 __init__.py:108
Page 6 __init__.py:108
PDF/A conversion ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 6/6 0:00:00
Running: ['tesseract', '--version'] __init__.py:133
Some input metadata could not be copied because it is not _metadata.py:63
permitted in PDF/A. You may wish to examine the output PDF's XMP
metadata.
The following metadata fields were not copied: _metadata.py:68
{'{http://ns.adobe.com/xap/1.0/}MetadataDate'}
Linearizing ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 100/100 0:00:00
xref 38: treating as an optimization candidate optimize.py:290
xref 40: treating as an optimization candidate optimize.py:290
xref 43: treating as an optimization candidate optimize.py:290
xref 45: treating as an optimization candidate optimize.py:290
xref 48: treating as an optimization candidate optimize.py:290
xref 50: treating as an optimization candidate optimize.py:290
XrefExt(xref=48, ext='.jpg') optimize.py:355
XrefExt(xref=50, ext='.jpg') optimize.py:355
XrefExt(xref=38, ext='.jpg') optimize.py:355
XrefExt(xref=40, ext='.jpg') optimize.py:355
XrefExt(xref=43, ext='.jpg') optimize.py:355
Optimizable images: JPEGs: 5 PNGs: 0 optimize.py:360
xref 48, jpeg, made larger - skip optimize.py:476
xref 40, jpeg, made larger - skip optimize.py:476
xref 43, jpeg, made larger - skip optimize.py:476
xref 50, jpeg, made larger - skip optimize.py:476
Recompressing JPEGs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 5/5 0:00:00
xref 38: treating as an optimization candidate optimize.py:290
xref 40: treating as an optimization candidate optimize.py:290
xref 43: treating as an optimization candidate optimize.py:290
xref 45: treating as an optimization candidate optimize.py:290
xref 48: treating as an optimization candidate optimize.py:290
xref 50: treating as an optimization candidate optimize.py:290
xref 48: marking this JPEG as deflatable optimize.py:555
xref 50: marking this JPEG as deflatable optimize.py:555
xref 38: marking this JPEG as deflatable optimize.py:555
xref 40: marking this JPEG as deflatable optimize.py:555
xref 43: marking this JPEG as deflatable optimize.py:555
Deflating JPEGs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 5/5 0:00:00
PNGs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0% 0/0 -:--:--
xref 38: treating as an optimization candidate optimize.py:290
xref 40: treating as an optimization candidate optimize.py:290
xref 43: treating as an optimization candidate optimize.py:290
xref 45: treating as an optimization candidate optimize.py:290
xref 48: treating as an optimization candidate optimize.py:290
xref 50: treating as an optimization candidate optimize.py:290
xref 48: found image compressed as /FlateDecode /DCTDecode, optimize.py:103
marked for JPEG optimization
xref 50: found image compressed as /FlateDecode /DCTDecode, optimize.py:103
marked for JPEG optimization
xref 38: found image compressed as /FlateDecode /DCTDecode, optimize.py:103
marked for JPEG optimization
xref 40: found image compressed as /FlateDecode /DCTDecode, optimize.py:103
marked for JPEG optimization
xref 43: found image compressed as /FlateDecode /DCTDecode, optimize.py:103
marked for JPEG optimization
Running: ['jbig2', '--version'] __init__.py:133
Optimizable images: JBIG2 groups: 1 optimize.py:371
Running: ['jbig2', '--pdf', '-t', '0.85', __init__.py:133
PosixPath('/var/folders/8z/d8_btvkx7rjbr1q7zm_ynztc0000gn/T/ocrm
ypdf.io.k_2gp6st/images/00000045.prejbig2.tif')]
JBIG2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 1/1 0:00:00
os.symlink(/var/folders/8z/d8_btvkx7rjbr1q7zm_ynztc0000gn/T/ocrmy helpers.py:179
pdf.io.k_2gp6st/optimize.opt.pdf,
/var/folders/8z/d8_btvkx7rjbr1q7zm_ynztc0000gn/T/ocrmypdf.io.k_2g
p6st/optimize.pdf)
Running: ['jbig2', '--version'] __init__.py:133
Running: ['pngquant', '--version'] __init__.py:133
Image optimization ratio: 1.08 savings: 7.3% _pipeline.py:1002
Total file size ratio: 13.38 savings: 92.5% _pipeline.py:1005
/var/folders/8z/d8_btvkx7rjbr1q7zm_ynztc0000gn/T/ocrmypdf.io.k _pipeline.py:1077
_2gp6st/optimize.pdf -> ./test_O2.pdf
Output file is a PDF/A-2B (as expected) _common.py:474
ocrmypdf 16.9.0 __main__.py:59
Running: ['tesseract', '--version'] __init__.py:133
Found tesseract 5.5.0 __init__.py:345
Running: ['tesseract', '--version'] __init__.py:133
Running: ['tesseract', '--version'] __init__.py:133
Running: ['pngquant', '--version'] __init__.py:133
Found pngquant 3.0.3 __init__.py:345
Running: ['jbig2', '--version'] __init__.py:133
Found jbig2 0.30 __init__.py:345
Running: ['gs', '--version'] __init__.py:133
Found gs 10.4.0 __init__.py:345
Running: ['gs', '--version'] __init__.py:133
Running: ['tesseract', '--list-langs'] __init__.py:133
stdout/stderr = List of available languages in __init__.py:73
"/usr/local/share/tessdata/" (163):
afr
amh
ara
asm
aze
aze_cyrl
bel
ben
bod
bos
bre
bul
cat
ceb
ces
chi_sim
chi_sim_vert
chi_tra
chi_tra_vert
chr
cos
cym
dan
deu
div
dzo
ell
eng
enm
epo
equ
est
eus
fao
fas
fil
fin
fra
frk
frm
fry
gla
gle
glg
grc
guj
hat
heb
hin
hrv
hun
hye
iku
ind
isl
ita
ita_old
jav
jpn
jpn_vert
kan
kat
kat_old
kaz
khm
kir
kmr
kor
kor_vert
lao
lat
lav
lit
ltz
mal
mar
mkd
mlt
mon
mri
msa
mya
nep
nld
nor
oci
ori
osd
pan
pol
por
pus
que
ron
rus
san
script/Arabic
script/Armenian
script/Bengali
script/Canadian_Aboriginal
script/Cherokee
script/Cyrillic
script/Devanagari
script/Ethiopic
script/Fraktur
script/Georgian
script/Greek
script/Gujarati
script/Gurmukhi
script/HanS
script/HanS_vert
script/HanT
script/HanT_vert
script/Hangul
script/Hangul_vert
script/Hebrew
script/Japanese
script/Japanese_vert
script/Kannada
script/Khmer
script/Lao
script/Latin
script/Malayalam
script/Myanmar
script/Oriya
script/Sinhala
script/Syriac
script/Tamil
script/Telugu
script/Thaana
script/Thai
script/Tibetan
script/Vietnamese
sin
slk
slv
snd
snum
spa
spa_old
sqi
srp
srp_latn
sun
swa
swe
syr
tam
tat
tel
tgk
tha
tir
ton
tur
uig
ukr
urd
uzb
uzb_cyrl
vie
yid
yor
pikepdf mmap enabled helpers.py:328
os.symlink(./test_O2.pdf, helpers.py:179
/var/folders/8z/d8_btvkx7rjbr1q7zm_ynztc0000gn/T/ocrmypdf.io.iyiq
01bi/origin)
Gathering info with 1 thread workers info.py:816
pikepdf mmap enabled helpers.py:328
Scanning contents ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 6/6 0:00:00
Using Tesseract OpenMP thread limit 1 tesseract_ocr.py:199
Start processing 6 pages concurrently ocr.py:96
pikepdf mmap enabled helpers.py:328
1 skipping all processing on this page _pipeline.py:343
pikepdf mmap enabled helpers.py:328
pikepdf mmap enabled helpers.py:328
pikepdf mmap enabled helpers.py:328
2 skipping all processing on this page _pipeline.py:343
pikepdf mmap enabled helpers.py:328
3 skipping all processing on this page _pipeline.py:343
pikepdf mmap enabled helpers.py:328
1 Text rotation: (text, autorotate, content) -> text _graft.py:152
misalignment = (0, 0, 0) -> 0
4 skipping all processing on this page _pipeline.py:343
5 skipping all processing on this page _pipeline.py:343
6 skipping all processing on this page _pipeline.py:343
1 Page rotation: (content, auto) -> page = (0, 0) -> 0 _graft.py:177
2 Text rotation: (text, autorotate, content) -> text _graft.py:152
misalignment = (0, 0, 0) -> 0
2 Page rotation: (content, auto) -> page = (0, 0) -> 0 _graft.py:177
3 Text rotation: (text, autorotate, content) -> text _graft.py:152
misalignment = (0, 0, 0) -> 0
3 Page rotation: (content, auto) -> page = (0, 0) -> 0 _graft.py:177
4 Text rotation: (text, autorotate, content) -> text _graft.py:152
misalignment = (0, 0, 0) -> 0
4 Page rotation: (content, auto) -> page = (0, 0) -> 0 _graft.py:177
5 Text rotation: (text, autorotate, content) -> text _graft.py:152
misalignment = (0, 0, 0) -> 0
5 Page rotation: (content, auto) -> page = (0, 0) -> 0 _graft.py:177
6 Text rotation: (text, autorotate, content) -> text _graft.py:152
misalignment = (0, 0, 0) -> 0
6 Page rotation: (content, auto) -> page = (0, 0) -> 0 _graft.py:177
Image processing ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 6/6 0:00:00
Postprocessing... ocr.py:144
os.symlink(/var/folders/8z/d8_btvkx7rjbr1q7zm_ynztc0000gn/T/ocrmy helpers.py:179
pdf.io.iyiq01bi/graft_layers.pdf,
/var/folders/8z/d8_btvkx7rjbr1q7zm_ynztc0000gn/T/ocrmypdf.io.iyiq
01bi/fix_docinfo.pdf)
Running: ['gs', '--version'] __init__.py:133
Running: ['gs', '-dBATCH', '-dNOPAUSE', '-dSAFER', __init__.py:133
'-dCompatibilityLevel=1.6', '-sDEVICE=pdfwrite',
'-dAutoRotatePages=/None',
'-sColorConversionStrategy=LeaveColorUnchanged',
'-dPDFSTOPONERROR', '-dAutoFilterColorImages=true',
'-dAutoFilterGrayImages=true', '-dJPEGQ=95', '-dPDFA=2',
'-dPDFACompatibilityPolicy=1', '-o',
'/var/folders/8z/d8_btvkx7rjbr1q7zm_ynztc0000gn/T/ocrmypdf.io.iy
iq01bi/pdfa.pdf', '-sstdout=%stderr',
'/var/folders/8z/d8_btvkx7rjbr1q7zm_ynztc0000gn/T/ocrmypdf.io.iy
iq01bi/pdfa.ps',
'/var/folders/8z/d8_btvkx7rjbr1q7zm_ynztc0000gn/T/ocrmypdf.io.iy
iq01bi/fix_docinfo.pdf']
GPL Ghostscript 10.04.0 (2024-09-18) __init__.py:108
Copyright (C) 2024 Artifex Software, Inc. All rights reserved. __init__.py:108
This software is supplied under the GNU AGPLv3 and comes with NO __init__.py:108
WARRANTY:
see the file COPYING for details. __init__.py:108
Processing pages 1 through 6. __init__.py:108
Page 1 __init__.py:108
Page 2 __init__.py:108
Page 3 __init__.py:108
Page 4 __init__.py:108
Page 5 __init__.py:108
Page 6 __init__.py:108
PDF/A conversion ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 6/6 0:00:00
Running: ['tesseract', '--version'] __init__.py:133
Some input metadata could not be copied because it is not _metadata.py:63
permitted in PDF/A. You may wish to examine the output PDF's XMP
metadata.
The following metadata fields were not copied: _metadata.py:68
{'{http://ns.adobe.com/xap/1.0/}MetadataDate'}
Linearizing ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 100/100 0:00:00
xref 574: treating as an optimization candidate optimize.py:290
xref 980: treating as an optimization candidate optimize.py:290
xref 2124: treating as an optimization candidate optimize.py:290
xref 2204: treating as an optimization candidate optimize.py:290
xref 2246: treating as an optimization candidate optimize.py:290
xref 2248: treating as an optimization candidate optimize.py:290
XrefExt(xref=980, ext='.jpg') optimize.py:355
XrefExt(xref=2246, ext='.jpg') optimize.py:355
XrefExt(xref=2248, ext='.jpg') optimize.py:355
XrefExt(xref=2124, ext='.jpg') optimize.py:355
XrefExt(xref=574, ext='.jpg') optimize.py:355
Optimizable images: JPEGs: 5 PNGs: 0 optimize.py:360
Recompressing JPEGs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 5/5 0:00:00
xref 574: treating as an optimization candidate optimize.py:290
xref 980: treating as an optimization candidate optimize.py:290
xref 2124: treating as an optimization candidate optimize.py:290
xref 2204: treating as an optimization candidate optimize.py:290
xref 2246: treating as an optimization candidate optimize.py:290
xref 2248: treating as an optimization candidate optimize.py:290
xref 980: marking this JPEG as deflatable optimize.py:555
xref 2246: marking this JPEG as deflatable optimize.py:555
xref 2248: marking this JPEG as deflatable optimize.py:555
xref 2124: marking this JPEG as deflatable optimize.py:555
xref 574: marking this JPEG as deflatable optimize.py:555
Deflating JPEGs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 5/5 0:00:00
PNGs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0% 0/0 -:--:--
xref 574: treating as an optimization candidate optimize.py:290
xref 980: treating as an optimization candidate optimize.py:290
xref 2124: treating as an optimization candidate optimize.py:290
xref 2204: treating as an optimization candidate optimize.py:290
xref 2246: treating as an optimization candidate optimize.py:290
xref 2248: treating as an optimization candidate optimize.py:290
Running: ['jbig2', '--version'] __init__.py:133
xref 980: found image compressed as /FlateDecode /DCTDecode, optimize.py:103
marked for JPEG optimization
xref 2246: found image compressed as /FlateDecode /DCTDecode, optimize.py:103
marked for JPEG optimization
xref 2248: found image compressed as /FlateDecode /DCTDecode, optimize.py:103
marked for JPEG optimization
xref 2124: found image compressed as /FlateDecode /DCTDecode, optimize.py:103
marked for JPEG optimization
xref 574: found image compressed as /FlateDecode /DCTDecode, optimize.py:103
marked for JPEG optimization
Optimizable images: JBIG2 groups: 1 optimize.py:371
Running: ['jbig2', '--pdf', '-t', '0.85', __init__.py:133
PosixPath('/var/folders/8z/d8_btvkx7rjbr1q7zm_ynztc0000gn/T/ocrm
ypdf.io.iyiq01bi/images/00002204.prejbig2.tif')]
JBIG2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 1/1 0:00:00
os.symlink(/var/folders/8z/d8_btvkx7rjbr1q7zm_ynztc0000gn/T/ocrmy helpers.py:179
pdf.io.iyiq01bi/optimize.opt.pdf,
/var/folders/8z/d8_btvkx7rjbr1q7zm_ynztc0000gn/T/ocrmypdf.io.iyiq
01bi/optimize.pdf)
Running: ['jbig2', '--version'] __init__.py:133
Running: ['pngquant', '--version'] __init__.py:133
Image optimization ratio: 1.19 savings: 15.6% _pipeline.py:1002
Total file size ratio: 0.99 savings: -0.7% _pipeline.py:1005
/var/folders/8z/d8_btvkx7rjbr1q7zm_ynztc0000gn/T/ocrmypdf.io.i _pipeline.py:1077
yiq01bi/optimize.pdf -> ./test_O3.pdf
Output file is a PDF/A-2B (as expected) _common.py:474
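
The "ratio" and "savings" figures in the log lines above look consistent with ratio = input size / output size and savings = 1 - output size / input size. That relationship is an assumption inferred from the logged numbers (for example 13.38 and 92.5%), not something confirmed in this thread. A minimal shell sketch, where `size_report` is a hypothetical helper and the byte counts are made up for illustration:

```shell
#!/bin/sh
# Sketch: reproduce "ratio" / "savings" style figures from two byte counts.
# Assumption: ratio = input / output, savings = 1 - output / input.
size_report() {
    # $1 = input size in bytes, $2 = output size in bytes (hypothetical helper)
    awk -v in_size="$1" -v out_size="$2" 'BEGIN {
        printf "ratio: %.2f savings: %.1f%%\n",
            in_size / out_size, (1 - out_size / in_size) * 100
    }'
}

# Illustrative byte counts only; not taken from the files in this thread.
size_report 1338 100
size_report 1000 1007
```

Under that assumption, a ratio below 1 with negative savings, as in the -O3 pass, means the output is slightly larger than its input.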
0 replies
-
In many cases when I run
where LEVEL is 2 or 3, the files actually become bigger, even compared to -o1. Is it a bug?
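
Whatever the cause, one way to guard against a larger result is to compare sizes after each pass and keep the smaller file. This is a minimal sketch, not an OCRmyPDF feature: `keep_smaller` is a hypothetical helper, and the file names in the usage comment only follow the naming used in the test script in this thread.

```shell
#!/bin/sh
# Sketch: if the newly optimized file is not smaller than the baseline,
# overwrite it with a copy of the baseline.
keep_smaller() {
    # $1 = candidate (output of the higher optimization level)
    # $2 = baseline (output of the previous pass)
    # $(( ... )) strips the leading whitespace some wc implementations print.
    cand_size=$(($(wc -c < "$1")))
    base_size=$(($(wc -c < "$2")))
    if [ "$cand_size" -ge "$base_size" ]; then
        cp -- "$2" "$1"
    fi
}

# Usage (hypothetical file names):
#   keep_smaller ./test_O3.pdf ./test_O2.pdf
```

The comparison uses plain byte counts, so it says nothing about image quality; it only ensures a later pass never leaves a bigger file behind.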