How can I batch-process PDF files in the magic_pdf_parse_main demo? #513
Comments
+1, same question.
+1, same question.
import os
# Set the path of the folder to process
input_directory = ''  # replace with your folder path
# Collect all PDF files
pdf_files = []
# Open the CSV file for writing
with open(csv_file_path, mode='w', newline='') as csvfile:
    ...
print(f'Processing complete. Results saved in {csv_file_path}.')
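The idea in the snippet above (collect the PDFs in a folder, then run the demo's parser on each one) can be sketched as follows. This is a minimal sketch, not the project's own batch API: `collect_pdfs` and `process_folder` are hypothetical helper names, and `parse_one` stands for whatever single-file function you already call, for example the demo's `pdf_parse_main`, assuming it accepts a PDF path as its first argument.

```python
from pathlib import Path
from typing import Callable, Dict, List


def collect_pdfs(input_directory: str) -> List[Path]:
    """Return every .pdf file directly inside input_directory, sorted by name."""
    return sorted(
        p for p in Path(input_directory).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def process_folder(input_directory: str,
                   parse_one: Callable[[str], None]) -> Dict[str, str]:
    """Call parse_one on each PDF and record 'ok' or the error text per file.

    parse_one is a hypothetical stand-in for the single-file entry point
    (assumed here to take the PDF path as its only required argument).
    """
    results: Dict[str, str] = {}
    for pdf_path in collect_pdfs(input_directory):
        try:
            parse_one(str(pdf_path))
            results[pdf_path.name] = "ok"
        except Exception as exc:
            # Keep going so that one bad file does not stop the whole batch.
            results[pdf_path.name] = f"failed: {exc}"
    return results
```

Under those assumptions, usage would look like `process_folder('/path/to/pdfs', pdf_parse_main)`; the returned dictionary can then be written to a CSV file if a per-file report is wanted.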
It looks like this re-initializes the model for every file, which makes batch runs too slow.
Not really. With the current logic, only the first init actually initializes the model; every later call reads from the cache.
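Hello, I don't quite follow this part. In the current demo each file gets its own pipe, and the do_parse part of the pipe appears to call init every time (init instantiates ModelSingleton() and then initializes the model). I don't see where the cache is read?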
A singleton is not created more than once, so every call uses the model object that was already initialized.
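To illustrate why a singleton avoids repeated initialization, here is a generic sketch of the pattern. It is not the project's actual `ModelSingleton` code: `ModelSingletonSketch`, `get_model`, and `init_count` are hypothetical names, and `object()` stands in for an expensive model load.

```python
class ModelSingletonSketch:
    """Hypothetical singleton that caches one model object per key."""

    _instance = None   # the single shared instance
    _models = {}       # cache: key -> already-initialized model
    init_count = 0     # how many times a model was really initialized

    def __new__(cls):
        # Create the instance on the first call only; later calls reuse it.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def get_model(self, key):
        # Run the expensive initialization only for keys not seen before.
        if key not in self._models:
            type(self).init_count += 1
            self._models[key] = object()  # stand-in for loading a real model
        return self._models[key]
```

With this pattern, building a new pipe per file and calling `ModelSingletonSketch().get_model(...)` inside it returns the same cached object each time, so the expensive step runs once per key rather than once per file.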
Is your feature request related to a problem? Please describe.
How can I modify the code in the magic_pdf_parse_main.py demo to batch-process local PDF files? Batch processing works from the command line, but I don't know how to do it through the API.
Describe the solution you'd like
Batch-process all PDF files in a folder.
Describe alternatives you've considered
None.
Additional context
None.