
Can model loading and the processing pipeline be decoupled, so the models don't have to be reloaded every time and things run faster? #932

Closed
charliedream1 opened this issue Nov 12, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@charliedream1

Can the model loading and the processing pipeline be split apart, so the models don't have to be reloaded every time and processing is faster?

@charliedream1 charliedream1 added the enhancement New feature or request label Nov 12, 2024
@myhloli
Collaborator

myhloli commented Nov 12, 2024

Preloading is supported: the models can be loaded ahead of time, or loaded when processing starts.
#517

@myhloli myhloli closed this as completed Nov 12, 2024
@charliedream1
Author

Thanks for the reply. I found this code:

from loguru import logger


def init_model():
    from magic_pdf.model.doc_analyze_by_custom_model import ModelSingleton
    try:
        model_manager = ModelSingleton()
        txt_model = model_manager.get_model(False, False)
        logger.info("txt_model init final")
        ocr_model = model_manager.get_model(True, False)
        logger.info("ocr_model init final")
        return 0
    except Exception as e:
        logger.exception(e)
        return -1


model_init = init_model()
logger.info(f"model_init: {model_init}")

Is there documentation explaining what the False/True arguments to get_model mean? Also, how do I load the table and layout models?

import os

from loguru import logger
from magic_pdf.pipe.UNIPipe import UNIPipe
from magic_pdf.rw.DiskReaderWriter import DiskReaderWriter


try:
    current_script_dir = os.path.dirname(os.path.abspath(__file__))
    demo_name = "demo1"
    pdf_path = os.path.join(current_script_dir, f"{demo_name}.pdf")
    with open(pdf_path, "rb") as f:
        pdf_bytes = f.read()
    jso_useful_key = {"_pdf_type": "", "model_list": []}
    local_image_dir = os.path.join(current_script_dir, 'images')
    image_dir = str(os.path.basename(local_image_dir))
    image_writer = DiskReaderWriter(local_image_dir)
    pipe = UNIPipe(pdf_bytes, jso_useful_key, image_writer)
    pipe.pipe_classify()  # decide whether the PDF is text-based or needs OCR
    pipe.pipe_analyze()   # run the analysis models (layout, formula, OCR, table)
    pipe.pipe_parse()
    md_content = pipe.pipe_mk_markdown(image_dir, drop_mode="none")
    with open(f"{demo_name}.md", "w", encoding="utf-8") as f:
        f.write(md_content)
except Exception as e:
    logger.exception(e)

After the models are preloaded, how do I integrate that with this processing pipeline?

Also, different input files need to go to different output directories, but this pipeline fixes them up front. How can I adjust that dynamically?

Looking forward to your reply, thanks :)

@myhloli
Collaborator

myhloli commented Nov 12, 2024

You can call the model-loading code at any point before the pipeline runs; the model object is a single bundle that contains all of the models (layout, formula, OCR, table, and so on).
If you need a different output directory, just change the path passed to this open() call:

    with open(f"{demo_name}.md", "w", encoding="utf-8") as f:
        f.write(md_content)
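
Putting the two together, here is a minimal sketch (illustrative only: the process_pdf wrapper, its parameters, and the file/output paths are my own placeholders, not part of the library; the ModelSingleton/UNIPipe/DiskReaderWriter calls are the ones quoted above). Preload once per process, then pass a different output directory on each call:

import os

from loguru import logger
from magic_pdf.model.doc_analyze_by_custom_model import ModelSingleton
from magic_pdf.pipe.UNIPipe import UNIPipe
from magic_pdf.rw.DiskReaderWriter import DiskReaderWriter


def init_models():
    # Preload both model bundles once per process (sketch; not an official API).
    model_manager = ModelSingleton()
    model_manager.get_model(False, False)
    model_manager.get_model(True, False)


def process_pdf(pdf_path, output_dir):
    # Hypothetical wrapper: each call may target a different output directory.
    os.makedirs(output_dir, exist_ok=True)
    with open(pdf_path, "rb") as f:
        pdf_bytes = f.read()
    local_image_dir = os.path.join(output_dir, "images")
    image_writer = DiskReaderWriter(local_image_dir)
    jso_useful_key = {"_pdf_type": "", "model_list": []}
    pipe = UNIPipe(pdf_bytes, jso_useful_key, image_writer)
    pipe.pipe_classify()
    pipe.pipe_analyze()  # reuses the models already loaded by init_models()
    pipe.pipe_parse()
    md_content = pipe.pipe_mk_markdown(os.path.basename(local_image_dir), drop_mode="none")
    md_name = os.path.splitext(os.path.basename(pdf_path))[0] + ".md"
    with open(os.path.join(output_dir, md_name), "w", encoding="utf-8") as f:
        f.write(md_content)


init_models()
for pdf_path, out_dir in [("demo1.pdf", "output/demo1"), ("demo2.pdf", "output/demo2")]:
    try:
        process_pdf(pdf_path, out_dir)
    except Exception as e:
        logger.exception(e)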

@charliedream1
Author

charliedream1 commented Nov 12, 2024

Thanks for the lightning-fast reply. A few more questions:

  1. So the pipeline code can be used unchanged, and once init_model has been called the pipeline will detect that the models are already loaded and won't load them again, right?
  2. You said the models are loaded as one bundle, so model_manager = ModelSingleton() alone is enough and it automatically loads all the models?
  3. In txt_model = model_manager.get_model(False, False), what do the False, False arguments mean?
  4. Does image_writer = DiskReaderWriter(local_image_dir) write cache files? If I start several processes that all point local_image_dir at the same directory, will the final markdown results interfere with each other?
  5. If I don't use the pipeline, how can I run a single model from the flow on its own, for example only formula recognition or only OCR?

Looking forward to your reply, thanks :)

@myhloli
Collaborator

myhloli commented Nov 12, 2024

  1. Yes.
  2. Yes.
  3. They are the switches for OCR and OCR logging; you can jump to the method implementation and check the source yourself (see the sketch below).
  4. No, they won't interfere. The design intent was to write all images flat into the same directory; every image gets its own unique name, so they can't clash.
  5. See the implementation in magic_pdf/model/pdf_extract_kit.py.
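
For point 3, a tiny illustrative snippet; the reading of the two booleans follows the reply above, and the exact parameter names would need to be confirmed against the get_model source:

from magic_pdf.model.doc_analyze_by_custom_model import ModelSingleton

# Illustrative only: per the reply above, the two positional booleans are an
# OCR switch and a logging switch; confirm the parameter names in get_model.
model_manager = ModelSingleton()
txt_model = model_manager.get_model(False, False)  # text route: OCR off, logging off
ocr_model = model_manager.get_model(True, False)   # OCR route: OCR on, logging off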

@charliedream1
Author

Thanks!

@charliedream1
Author

This approach doesn't seem to take effect. After init finishes, the model initialization messages only appear once recognition is actually invoked. That said, when running in a loop, the messages no longer show up after the first pass:

CustomVisionEncoderDecoderModel init
VariableUnimerNetModel init
VariableUnimerNetPatchEmbeddings init
VariableUnimerNetModel init
VariableUnimerNetPatchEmbeddings init
CustomMBartForCausalLM init
CustomMBartDecoder init
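
One possible workaround, sketched only and not confirmed by the maintainers: since the heavy initialization appears to happen lazily on the first real recognition call, one small warm-up document could be pushed through the pipeline right after init_model so later requests skip that cost. "warmup.pdf" and the image directory below are placeholders:

from magic_pdf.pipe.UNIPipe import UNIPipe
from magic_pdf.rw.DiskReaderWriter import DiskReaderWriter

# Warm-up sketch: run one tiny PDF through the pipeline once so that the lazy
# "... init" messages above are triggered here instead of on the first real job.
with open("warmup.pdf", "rb") as f:
    pdf_bytes = f.read()
pipe = UNIPipe(pdf_bytes, {"_pdf_type": "", "model_list": []},
               DiskReaderWriter("warmup_images"))
pipe.pipe_classify()
pipe.pipe_analyze()  # model submodules initialize on this first analysis
pipe.pipe_parse()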
