Question about pipe_ocr_mode #1585

Closed
toyn0015 opened this issue Jan 20, 2025 · 1 comment

Comments

@toyn0015

ds.apply(doc_analyze, ocr=True).pipe_ocr_mode(image_writer).dump_md(md_writer, f"{name_without_suff}.md", image_dir)

Step 1: .apply(doc_analyze, ocr=True)
Step 2: pipe_ocr_mode(image_writer)
Step 3: dump_md(md_writer, f"{name_without_suff}.md", image_dir)

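I would like to ask what the image_writer parameter does in this chain of calls:
1. Does image_writer write any files into the target directory?
2. What is the purpose of the image_writer parameter?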

I ask because, in my tests, no result files are saved to the target directory when step 3 is left out. For example:

ds.apply(doc_analyze, ocr=True).pipe_ocr_mode(image_writer)

With this call, no files are written to the target directory at all.

@myhloli
Collaborator

myhloli commented Jan 20, 2025

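Step 2 needs the image writer in order to crop images out of the pages. Your step 2 produced no output because there were no images to crop. Step 3 needs a writer object to export the Markdown, so it simply reuses the earlier image writer.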
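
The behavior described above can be illustrated with a small self-contained sketch. It does not use MinerU itself: `SketchWriter`, `crop_step`, and `dump_step` are hypothetical stand-ins invented for this example, and the assumption is only that a writer object touches the disk when some step actually calls its write method.

```python
import os
import tempfile


class SketchWriter:
    """Hypothetical stand-in for a file-based writer (not MinerU's class)."""

    def __init__(self, root):
        self.root = root  # nothing is created on disk yet

    def write(self, name, data):
        # The directory and the file only appear when write() is called.
        os.makedirs(self.root, exist_ok=True)
        with open(os.path.join(self.root, name), "wb") as f:
            f.write(data)


def crop_step(image_regions, writer):
    """Stand-in for step 2: one file per cropped image, none if the list is empty."""
    for i, region in enumerate(image_regions):
        writer.write(f"crop_{i}.jpg", region)
    return len(image_regions)


def dump_step(markdown, writer, name):
    """Stand-in for step 3: always writes the Markdown file."""
    writer.write(name, markdown.encode("utf-8"))


out_dir = tempfile.mkdtemp()
image_writer = SketchWriter(os.path.join(out_dir, "images"))
md_writer = SketchWriter(out_dir)

# A text-only document: step 2 has no images to crop, so nothing is written.
crop_step([], image_writer)
after_step2 = sorted(os.listdir(out_dir))

# Step 3 exports the Markdown, so a file finally shows up.
dump_step("# Title\n", md_writer, "doc.md")
after_step3 = sorted(os.listdir(out_dir))

print(after_step2, after_step3)
```

In this sketch the output directory stays empty after the cropping step whenever the input has no image regions, which mirrors the explanation given above for why stopping at step 2 can leave the target directory empty.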
