2025-01-30 17:51:06,158 DEBUG - [Process 35668] - [MainThread] - urllib3.connectionpool - Starting new HTTPS connection (1): arxiv.org:443
2025-01-30 17:51:06,270 DEBUG - [Process 35668] - [MainThread] - urllib3.connectionpool - https://arxiv.org:443 "GET /pdf/2309.05406v5 HTTP/1.1" 200 5150002
2025-01-30 17:51:07,560 INFO - [Process 35668] - [MainThread] - docling.document_converter - Going to convert document batch...
2025-01-30 17:51:07,561 DEBUG - [Process 35668] - [MainThread] - urllib3.connectionpool - Starting new HTTPS connection (1): huggingface.co:443
2025-01-30 17:51:07,697 DEBUG - [Process 35668] - [MainThread] - urllib3.connectionpool - https://huggingface.co:443 "GET /api/models/ds4sd/docling-models/revision/v2.1.0 HTTP/1.1" 200 1264
2025-01-30 17:51:07,731 INFO - [Process 35668] - [MainThread] - docling.utils.accelerator_utils - Accelerator device: 'cpu'
2025-01-30 17:51:08,725 INFO - [Process 35668] - [MainThread] - docling.utils.accelerator_utils - Accelerator device: 'cpu'
2025-01-30 17:51:08,940 DEBUG - [Process 35668] - [MainThread] - docling_ibm_models.layoutmodel.layout_predictor - LayoutPredictor settings: {'safe_tensors_file': 'C:\\Users\\fagom\\.cache\\huggingface\\hub\\models--ds4sd--docling-models\\snapshots\\36bebf56681740529abd09f5473a93a69373fbf0\\model_artifacts\\layout\\model.safetensors', 'device': 'cpu', 'num_threads': 4, 'image_size': 640, 'threshold': 0.3}
2025-01-30 17:51:08,940 INFO - [Process 35668] - [MainThread] - docling.utils.accelerator_utils - Accelerator device: 'cpu'
2025-01-30 17:51:09,108 INFO - [Process 35668] - [MainThread] - docling.pipeline.base_pipeline - Processing document 2309.05406v5.pdf
2025-01-30 17:51:10,319 WARNING - [Process 35668] - [MainThread] - docling.pipeline.base_pipeline - Encountered an error during conversion of document 64901092dec5889cacddcecb334242ea2381d67ea3743a60ed0b02ce65800306:
Traceback (most recent call last):
File "F:\workspace\proj\.venv\Lib\site-packages\docling\pipeline\base_pipeline.py", line 161, in _build_document
for p in pipeline_pages: # Must exhaust!
^^^^^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\pipeline\base_pipeline.py", line 127, in _apply_on_pages
yield from page_batch
File "F:\workspace\proj\.venv\Lib\site-packages\docling\models\page_assemble_model.py", line 60, in __call__
for page in page_batch:
^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\models\table_structure_model.py", line 136, in __call__
for page in page_batch:
^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\models\layout_model.py", line 102, in __call__
for page in page_batch:
^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\models\easyocr_model.py", line 82, in __call__
for page in page_batch:
^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\models\page_preprocessing_model.py", line 25, in __call__
for page in page_batch:
^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\pipeline\standard_pdf_pipeline.py", line 179, in initialize_page
page._backend = conv_res.input._backend.load_page(page.page_no) # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\backend\docling_parse_v2_backend.py", line 239, in load_page
return DoclingParseV2PageBackend(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\backend\docling_parse_v2_backend.py", line 27, in __init__
parsed_page = parser.parse_pdf_from_key_on_page(document_hash, page_no)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: #-instructions 1 does not match expected value 2 for PDF operation: m
Traceback (most recent call last):
File "F:\workspace\proj\tests\docling_test.py", line 24, in <module>
print(DocumentConverter().convert("https://arxiv.org/pdf/2309.05406v5"))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\pydantic\_internal\_validate_call.py", line 38, in wrapper_function
return wrapper(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\pydantic\_internal\_validate_call.py", line 111, in __call__
res = self.__pydantic_validator__.validate_python(pydantic_core.ArgsKwargs(args, kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\document_converter.py", line 195, in convert
return next(all_res)
^^^^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\document_converter.py", line 216, in convert_all
for conv_res in conv_res_iter:
^^^^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\document_converter.py", line 251, in _convert
for item in map(
^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\document_converter.py", line 292, in _process_document
conv_res = self._execute_pipeline(in_doc, raises_on_error=raises_on_error)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\document_converter.py", line 315, in _execute_pipeline
conv_res = pipeline.execute(in_doc, raises_on_error=raises_on_error)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\pipeline\base_pipeline.py", line 53, in execute
raise e
File "F:\workspace\proj\.venv\Lib\site-packages\docling\pipeline\base_pipeline.py", line 45, in execute
conv_res = self._build_document(conv_res)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\pipeline\base_pipeline.py", line 196, in _build_document
raise e
File "F:\workspace\proj\.venv\Lib\site-packages\docling\pipeline\base_pipeline.py", line 161, in _build_document
for p in pipeline_pages: # Must exhaust!
^^^^^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\pipeline\base_pipeline.py", line 127, in _apply_on_pages
yield from page_batch
File "F:\workspace\proj\.venv\Lib\site-packages\docling\models\page_assemble_model.py", line 60, in __call__
for page in page_batch:
^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\models\table_structure_model.py", line 136, in __call__
for page in page_batch:
^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\models\layout_model.py", line 102, in __call__
for page in page_batch:
^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\models\easyocr_model.py", line 82, in __call__
for page in page_batch:
^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\models\page_preprocessing_model.py", line 25, in __call__
for page in page_batch:
^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\pipeline\standard_pdf_pipeline.py", line 179, in initialize_page
page._backend = conv_res.input._backend.load_page(page.page_no) # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\backend\docling_parse_v2_backend.py", line 239, in load_page
return DoclingParseV2PageBackend(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\workspace\proj\.venv\Lib\site-packages\docling\backend\docling_parse_v2_backend.py", line 27, in __init__
parsed_page = parser.parse_pdf_from_key_on_page(document_hash, page_no)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: #-instructions 1 does not match expected value 2 for PDF operation: m
Bug
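Docling fails with a RuntimeError while converting https://arxiv.org/pdf/2309.05406v5: "#-instructions 1 does not match expected value 2 for PDF operation: m".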
Steps to reproduce
Run this code: print(DocumentConverter().convert("https://arxiv.org/pdf/2309.05406v5"))
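The traceback shows the failing call as `DocumentConverter().convert(...)` on the arXiv URL, at line 24 of `tests/docling_test.py`. A minimal sketch around it, which catches the error and reports it instead of ending the script, might look like the following. The import path `docling.document_converter` is the module named in the traceback; `SOURCE`, `try_convert`, and the `make_converter` hook are hypothetical names introduced here for illustration and are not part of Docling.

```python
from typing import Callable, Optional, Tuple

# URL taken from the traceback in this report.
SOURCE = "https://arxiv.org/pdf/2309.05406v5"


def try_convert(
    source: str,
    make_converter: Optional[Callable[[], object]] = None,
) -> Tuple[bool, str]:
    """Run one conversion and report whether it raised a RuntimeError.

    `make_converter` is a hypothetical hook (not a Docling feature) that lets
    the function be exercised with a stand-in converter.
    """
    if make_converter is None:
        # Imported lazily so that defining this function does not need Docling.
        from docling.document_converter import DocumentConverter

        make_converter = DocumentConverter
    try:
        result = make_converter().convert(source)
    except RuntimeError as exc:
        # In this report the parser error surfaces as a RuntimeError.
        return False, str(exc)
    return True, type(result).__name__
```

Calling `print(try_convert(SOURCE))` with Docling 2.17.0 installed should then print the success flag together with either the error message or the result type name.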
The stack trace
Docling version
2.17.0
Python version
3.12.8
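One possible reading of the error message: in the PDF specification the path operator `m` (move to) takes two operands, `x y m`, so "#-instructions 1 does not match expected value 2" would mean the parser saw a single operand in front of `m`. The sketch below illustrates that kind of operand-count check on a toy content stream. It is an assumption about what the message means, not Docling's or docling-parse's actual implementation; `EXPECTED_OPERANDS` and `check_operands` are hypothetical names, and the message format is copied from the traceback only for comparison.

```python
# Operand counts for a few PDF path operators, as defined in the PDF specification.
EXPECTED_OPERANDS = {"m": 2, "l": 2, "c": 6, "re": 4, "h": 0}


def check_operands(content: str) -> list[str]:
    """Return one message per operator whose operand count is off.

    Simplified on purpose: tokens are whitespace-separated and every
    non-operator token counts as one operand. Real content streams also
    contain strings, names, arrays, and dictionaries.
    """
    problems = []
    operands = []
    for token in content.split():
        if token in EXPECTED_OPERANDS:
            expected = EXPECTED_OPERANDS[token]
            if len(operands) != expected:
                problems.append(
                    f"#-instructions {len(operands)} does not match "
                    f"expected value {expected} for PDF operation: {token}"
                )
            operands = []  # each operator consumes the operands before it
        else:
            operands.append(token)
    return problems
```

For example, `check_operands("10 20 m 30 40 l h")` returns an empty list, while `check_operands("10 m 30 40 l")` reports the `m` operator.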