According to the official method, will the model be downloaded locally or will it be downloaded #741

mayu123mayu · 2025-01-14T07:55:21Z

from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions, EasyOcrOptions
from docling.document_converter import PdfFormatOption, DocumentConverter

# Reference API address

# Configure the PDF pipeline: set the path to the Docling models
pdf_artifacts_path = "/docling-models"
pdf_pipeline_options = PdfPipelineOptions(artifacts_path=pdf_artifacts_path)

# Build the converter
converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(pipeline_options=pdf_pipeline_options)
    }
)

source = "庞氏骗局 - 中国投资者网.pdf"
result = converter.convert(source)
print(result.document.export_to_markdown())

$ python test.py
Downloading detection model, please wait. This may take several minutes depending upon your network connection.

dolfim-ibm · 2025-01-14T08:05:52Z

@mayu123mayu what exactly is your question? Would this help you? https://ds4sd.github.io/docling/usage/#provide-specific-artifacts-path

mayu123mayu · 2025-01-14T08:12:11Z

> @mayu123mayu what exactly is your question? Would this help you? https://ds4sd.github.io/docling/usage/#provide-specific-artifacts-path

I manually downloaded the models and set the local path, but when I run the program it still goes to Hugging Face to download them.

Gelvins · 2025-01-15T14:05:58Z

I have the same problem. I set a local path, but it still goes to Hugging Face to download.

artifacts_path = "D:/model/docling-models"

pipeline_options = PdfPipelineOptions(artifacts_path=artifacts_path)
doc_converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
    }
)

Gelvins · 2025-01-16T02:38:03Z

Problem solved. The URL being requested is https://github.com/JaidedAI/EasyOCR/releases/download/v1.3/latin_g2.zip ; after downloading it locally, place it in the ~/.EasyOCR/model directory.
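
The fix above can be sketched as a small helper that unpacks a manually downloaded archive into the EasyOCR model directory and reports which model files are still missing. This is only an illustrative sketch using the Python standard library. The function names are hypothetical, and the file names inside the archive (for example latin_g2.pth) are an assumption, not something stated in this thread.

```python
import zipfile
from pathlib import Path

# Default EasyOCR model directory, as given in the comment above.
DEFAULT_MODEL_DIR = Path.home() / ".EasyOCR" / "model"


def install_model_archive(archive_path, model_dir=DEFAULT_MODEL_DIR):
    """Extract a manually downloaded model archive into model_dir.

    Returns the names of the files contained in the archive.
    (Hypothetical helper, not part of docling or EasyOCR.)
    """
    model_dir = Path(model_dir)
    model_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(model_dir)
        return archive.namelist()


def missing_models(expected_files, model_dir=DEFAULT_MODEL_DIR):
    """Return the expected model files that are not present in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in expected_files if not (model_dir / name).is_file()]
```

For example, after saving latin_g2.zip from the URL above, calling install_model_archive("latin_g2.zip") unpacks it into ~/.EasyOCR/model, and missing_models([...]) with the file names you expect shows whether anything would still have to be downloaded at run time.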

mayu123mayu added the bug label Jan 14, 2025

dolfim-ibm added the question label and removed the bug label Jan 14, 2025
