According to the official method, will the model be downloaded locally or will it be downloaded #741

Open
mayu123mayu opened this issue Jan 14, 2025 · 4 comments
Labels
question Further information is requested

Comments

@mayu123mayu

from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions, EasyOcrOptions
from docling.document_converter import PdfFormatOption, DocumentConverter

# Reference API address

# Configure the PDF pipeline and set the path to the Docling models
pdf_artifacts_path = "/docling-models"
pdf_pipeline_options = PdfPipelineOptions(artifacts_path=pdf_artifacts_path)

# Build the converter
converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(pipeline_options=pdf_pipeline_options)
    }
)

source = "庞氏骗局 - 中国投资者网.pdf"
result = converter.convert(source)
print(result.document.export_to_markdown())

Output of python test.py:
Downloading detection model, please wait. This may take several minutes depending upon your network connection.

@mayu123mayu mayu123mayu added the bug Something isn't working label Jan 14, 2025
@dolfim-ibm dolfim-ibm added question Further information is requested and removed bug Something isn't working labels Jan 14, 2025
@dolfim-ibm
Contributor

@mayu123mayu what exactly is your question? Would this help you? https://ds4sd.github.io/docling/usage/#provide-specific-artifacts-path

@mayu123mayu
Author

> @mayu123mayu what exactly is your question? Would this help you? https://ds4sd.github.io/docling/usage/#provide-specific-artifacts-path

I manually downloaded the models and set the local path, but when I run the program it still tries to download them from Hugging Face.
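Before digging into the Docling configuration, it can help to see which OCR weight files are already on disk. The following is only a minimal sketch: it assumes the EasyOCR cache lives in ~/.EasyOCR/model (the directory named at the end of this thread), and the helper names `list_model_files` and `missing_models` are hypothetical, not part of Docling or EasyOCR.

```python
from pathlib import Path
from typing import List

# Assumption: EasyOCR's model cache directory, as mentioned in this thread.
DEFAULT_EASYOCR_DIR = Path.home() / ".EasyOCR" / "model"


def list_model_files(model_dir: Path = DEFAULT_EASYOCR_DIR) -> List[str]:
    """Return the sorted names of the .pth weight files found in model_dir."""
    if not model_dir.is_dir():
        # A missing directory simply means nothing has been cached yet.
        return []
    return sorted(p.name for p in model_dir.glob("*.pth"))


def missing_models(expected: List[str], model_dir: Path = DEFAULT_EASYOCR_DIR) -> List[str]:
    """Return the expected file names that are not present in model_dir."""
    present = set(list_model_files(model_dir))
    return [name for name in expected if name not in present]
```

For example, `missing_models(["latin_g2.pth"])` returns an empty list once that file is in place. The file name here is an assumption inferred from the archive name and may differ on your setup.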

@Gelvins

Gelvins commented Jan 15, 2025

I have the same problem. I set a local path, but it still goes to Hugging Face to download.

artifacts_path = "D:/model/docling-models"

pipeline_options = PdfPipelineOptions(artifacts_path=artifacts_path)
doc_converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
    }
)

@Gelvins

Gelvins commented Jan 16, 2025

> I have the same problem. I set a local path, but it still goes to Hugging Face to download.
>
> artifacts_path = "D:/model/docling-models"
>
> pipeline_options = PdfPipelineOptions(artifacts_path=artifacts_path)
> doc_converter = DocumentConverter(
>     format_options={
>         InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
>     }
> )

Problem solved. The requested URL is https://github.com/JaidedAI/EasyOCR/releases/download/v1.3/latin_g2.zip ; download it manually and place it in the ~/.EasyOCR/model directory.
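The manual step described above could be scripted roughly as follows. This is a sketch under stated assumptions, not an official procedure: the URL and target directory are taken from the comment above, the helper `fetch_and_unpack` is hypothetical, and it assumes the archive should be unpacked (rather than stored as a zip) and contains the weight file directly.

```python
import shutil
import urllib.request
import zipfile
from pathlib import Path
from typing import List

# URL and directory as given in the comment above.
MODEL_URL = "https://github.com/JaidedAI/EasyOCR/releases/download/v1.3/latin_g2.zip"
MODEL_DIR = Path.home() / ".EasyOCR" / "model"


def fetch_and_unpack(url: str, target_dir: Path) -> List[str]:
    """Download a zip archive from url, extract it into target_dir,
    and return the names of the extracted members."""
    target_dir.mkdir(parents=True, exist_ok=True)
    archive = target_dir / url.rsplit("/", 1)[-1]

    # Stream the download to disk instead of holding it in memory.
    with urllib.request.urlopen(url) as response, open(archive, "wb") as out:
        shutil.copyfileobj(response, out)

    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target_dir)
        names = zf.namelist()

    # Remove the downloaded archive once its contents are extracted.
    archive.unlink()
    return names
```

Running `fetch_and_unpack(MODEL_URL, MODEL_DIR)` once on a machine with network access, then copying the resulting directory to the offline machine, should have the same effect as the manual download. Other model archives may be requested as well; the same helper can be pointed at those URLs.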

3 participants