[Bug]: #1477

Closed
Silgrond opened this issue Feb 10, 2025 · 1 comment

@Silgrond

Describe the bug

ocrmypdf -l jpn_vert input.pdf ocroutput.pdf

Scanning contents ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0% 0/26 -:--:--
An exception occurred while executing the pipeline _common.py:296
Traceback (most recent call last):
File "/usr/lib/python3.13/site-packages/ocrmypdf/_pipelines/_common.py", line 261, in cli_exception_handler
return fn(options, plugin_manager)
File "/usr/lib/python3.13/site-packages/ocrmypdf/_pipelines/ocr.py", line 174, in _run_pipeline
pdfinfo = do_get_pdfinfo(origin_pdf, executor, options)
File "/usr/lib/python3.13/site-packages/ocrmypdf/_pipelines/_common.py", line 318, in do_get_pdfinfo
return get_pdfinfo(
pdf_path,
...<5 lines>...
check_pages=options.pages,
)
File "/usr/lib/python3.13/site-packages/ocrmypdf/_pipeline.py", line 199, in get_pdfinfo
return PdfInfo(
input_file,
...<5 lines>...
executor=executor,
)
File "/usr/lib/python3.13/site-packages/ocrmypdf/pdfinfo/info.py", line 1179, in __init__
self._pages = _pdf_pageinfo_concurrent(
~~~~~~~~~~~~~~~~~~~~~~~~^
pdf,
^^^^
...<7 lines>...
miner_state=miner_state,
^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "/usr/lib/python3.13/site-packages/ocrmypdf/pdfinfo/info.py", line 821, in _pdf_pageinfo_concurrent
executor(
~~~~~~~~^
use_threads=use_threads,
^^^^^^^^^^^^^^^^^^^^^^^^
...<12 lines>...
task_finished=update_pageinfo,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "/usr/lib/python3.13/site-packages/ocrmypdf/_concurrent.py", line 78, in __call__
self._execute(
~~~~~~~~~~~~~^
use_threads=use_threads,
^^^^^^^^^^^^^^^^^^^^^^^^
...<5 lines>...
task_finished=task_finished,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "/usr/lib/python3.13/site-packages/ocrmypdf/builtin_plugins/concurrency.py", line 144, in _execute
result = future.result()
File "/usr/lib/python3.13/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
~~~~~~~~~~~~~~~~~^^
File "/usr/lib/python3.13/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/usr/lib/python3.13/concurrent/futures/thread.py", line 59, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/lib/python3.13/site-packages/ocrmypdf/pdfinfo/info.py", line 766, in _pdf_pageinfo_sync
return PageInfo(
pdf, pageno, infile, check_pages, detailed_analysis, miner_state
)
File "/usr/lib/python3.13/site-packages/ocrmypdf/pdfinfo/info.py", line 886, in __init__
self._gather_pageinfo(
~~~~~~~~~~~~~~~~~~~~~^
pdf, pageno, infile, check_pages, detailed_analysis, miner_state
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "/usr/lib/python3.13/site-packages/ocrmypdf/pdfinfo/info.py", line 941, in _gather_pageinfo
for info in _process_content_streams(
~~~~~~~~~~~~~~~~~~~~~~~~^
pdf=pdf, container=page, shorthand=userunit_shorthand
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
):
^
File "/usr/lib/python3.13/site-packages/ocrmypdf/pdfinfo/info.py", line 669, in _process_content_streams
contentsinfo = _interpret_contents(container, initial_shorthand)
File "/usr/lib/python3.13/site-packages/ocrmypdf/pdfinfo/info.py", line 229, in _interpret_contents
_normalize_stack(parse_content_stream(contentstream, operator_whitelist))
~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.13/site-packages/pikepdf/models/_content_stream.py", line 106, in parse_content_stream
page._parse_page_contents_grouped(operators),
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
ValueError: overflow/underflow converting -215723171361689151641309809442772601559490176986273060210074108449183194518180895231888973205445516242738206423723380283420801456828955611720528865654388281399315331778890329660587808193440784196055167642885285170567907769400031970804119642013618047750651445663913174089981383732031714792640263139506394234 88 to 64-bit integer
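
For context, the value in the error is hundreds of digits long, far outside the signed 64-bit range, so any conversion to a 64-bit integer has to fail. The sketch below only illustrates that limit in plain Python. It is not how pikepdf or QPDF perform the check, the helper names `fits_int64` and `pack_int64` are hypothetical, and `huge` is a stand-in for the number in the traceback rather than the exact value.

```python
import struct

# Bounds of a signed 64-bit integer (C long long on common platforms).
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def fits_int64(value: int) -> bool:
    """Return True if value is representable as a signed 64-bit integer."""
    return INT64_MIN <= value <= INT64_MAX


def pack_int64(value: int) -> bytes:
    """Pack value as a little-endian signed 64-bit integer.

    Raises ValueError when the value is out of range, mirroring the kind
    of error reported above (the message wording here is an assumption).
    """
    if not fits_int64(value):
        raise ValueError(f"overflow/underflow converting {value} to 64-bit integer")
    return struct.pack("<q", value)


# Hypothetical stand-in for the very large negative operand in the traceback.
huge = -(10**300)

print(fits_int64(2**63 - 1))  # largest value that still fits
print(fits_int64(huge))       # far below INT64_MIN
```

Python integers have arbitrary precision, so the number itself is fine on the Python side; the failure can only arise at the point where a native 64-bit integer is required.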

Steps to reproduce

1. Run ocrmypdf -v1 ...arguments... input.pdf output.pdf
2. Open output.pdf
3. ...

Files

No response

How did you download and install the software?

PyPI (pip, poetry, pipx, etc.), source build

OCRmyPDF version

16.9.0

Relevant log output


Silgrond added the triage (Issue needs triage) label on Feb 10, 2025
@jbarlow83
Collaborator

I cannot investigate this without a reproducing example.
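
One possible way to narrow the input down to a reproducing example is to find which page's content stream fails to parse. The following is a rough sketch, assuming pikepdf is installed. `find_failing_pages` and `scan_pdf` are hypothetical helper names; `pikepdf.open` and `pikepdf.parse_content_stream` are existing pikepdf calls, the latter being the function at the bottom of the traceback above. Unlike OCRmyPDF, this sketch parses all operators rather than a whitelist, so it may flag pages for other reasons as well.

```python
from typing import Callable, Iterable, List, Tuple


def find_failing_pages(
    pages: Iterable[object], parse: Callable[[object], object]
) -> List[Tuple[int, str]]:
    """Return (1-based page number, error text) for each page where parse() raises."""
    failures: List[Tuple[int, str]] = []
    for pageno, page in enumerate(pages, start=1):
        try:
            parse(page)
        except Exception as exc:  # broad on purpose: this is only a diagnostic
            failures.append((pageno, f"{type(exc).__name__}: {exc}"))
    return failures


def scan_pdf(path: str) -> List[Tuple[int, str]]:
    """Run the check over a real PDF file. Requires pikepdf."""
    import pikepdf  # imported lazily so find_failing_pages works without it

    with pikepdf.open(path) as pdf:
        return find_failing_pages(pdf.pages, pikepdf.parse_content_stream)
```

Calling `scan_pdf("input.pdf")` would return the pages that fail to parse, if any; those pages could then be extracted into a smaller file and attached to the issue.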

jbarlow83 closed this as not planned (won't fix, can't repro, duplicate, stale) on Feb 10, 2025
github-actions bot removed the triage (Issue needs triage) label on Feb 10, 2025