resource tracker error #782

Open
GitHamza0206 opened this issue Jan 21, 2025 · 1 comment
Labels
bug Something isn't working

Comments

@GitHamza0206

Bug

...
/Users/mac/.pyenv/versions/3.11.5/lib/python3.11/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '

Steps to reproduce

...
output_path = os.path.splitext(file_path)[0] + '.md'
EMBED_MODEL_ID = "sentence-transformers/all-MiniLM-L6-v2"

loader = DoclingLoader(
    file_path=file_path,
    export_type=ExportType.MARKDOWN,
    chunker=HybridChunker(tokenizer=EMBED_MODEL_ID),
)

docs = loader.load()

Docling version

...

Python version

...
3.11

@GitHamza0206 GitHamza0206 added the bug Something isn't working label Jan 21, 2025
@workflowsguy

I encounter the same issue under Python 3.12 from the command line:

Steps to reproduce
docling -v /Users/guy/Playground/invoice-simple.pdf

Output

INFO:docling.document_converter:Going to convert document batch...
INFO:docling.utils.accelerator_utils:Accelerator device: 'cpu'
INFO:docling.utils.accelerator_utils:Accelerator device: 'cpu'
INFO:docling.utils.accelerator_utils:Accelerator device: 'cpu'
INFO:docling.pipeline.base_pipeline:Processing document invoice-simple.pdf
[1]    39270 segmentation fault  docling -v /Users/guy/Playground/invoice-simple.pdf
/opt/local/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

After this, Python crashes.

Docling version: 2.15.1
Docling Core version: 2.14.0
Docling IBM Models version: 3.1.2
Docling Parse version: 3.1.0
Python 3.12.8

Installed in separate virtualenv
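
In the output above, the segmentation fault is reported before the resource_tracker warning, so the leaked-semaphore message may be a side effect of the abnormal exit rather than its cause. A minimal sketch for narrowing this down, assuming the crash happens in native code: rerun the failing command with Python's built-in faulthandler enabled through the PYTHONFAULTHANDLER environment variable, so that a Python-level traceback is written to stderr when the process receives SIGSEGV. The helper names `run_with_faulthandler` and `describe_exit` are hypothetical and exist only for this illustration; nothing here is part of Docling.

```python
import os
import signal
import subprocess
import sys


def run_with_faulthandler(cmd):
    """Run cmd in a child process with faulthandler enabled.

    PYTHONFAULTHANDLER=1 makes a Python child process dump its traceback
    to stderr on fatal signals such as SIGSEGV.
    """
    env = dict(os.environ, PYTHONFAULTHANDLER="1")
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description.

    On POSIX, a negative return code -N means the child was terminated
    by signal N.
    """
    if returncode < 0:
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    # Pass the failing command on the command line, for example the
    # docling invocation from the report above.
    if len(sys.argv) > 1:
        result = run_with_faulthandler(sys.argv[1:])
        print(describe_exit(result.returncode))
        # With faulthandler enabled, stderr should include the Python
        # stack at the moment of the crash, if there was one.
        print(result.stderr)
```

If the dumped traceback ends inside a specific extension module, that points to the component worth reporting, along with the versions listed above.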
