Release 0.10.6 #1213

Closed
wants to merge 35 commits into from

Conversation

myhloli
Collaborator

@myhloli myhloli commented Dec 6, 2024

Thanks for your contribution; we appreciate it a lot. Following the instructions below will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry: just make the pull request and seek help from the maintainers.

Motivation

Please describe the motivation of this PR and the goal you want to achieve through this PR.

Modification

Please briefly describe what modification is made in this PR.

BC-breaking (Optional)

Does the modification introduce changes that break the backward compatibility of the downstream repositories?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

Use cases (Optional)

If this PR introduces a new feature, it is better to list some use cases here and update the documentation.

Checklist

Before PR:

  • Pre-commit or other linting tools are used to fix the potential lint issues.
  • Bug fixes are fully covered by unit tests; the case that causes the bug should be added to the unit tests.
  • The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  • The documentation has been modified accordingly, like docstring or example tutorials.

After PR:

  • If the modification has potential influence on downstream or other related projects, this PR should be tested with those projects.
  • CLA has been signed and all committers have signed the CLA in this PR.

myhloli and others added 30 commits December 2, 2024 14:26
- Add get_concurrency_limit function to calculate concurrency limit based on VRAM
- Update clean_vram function and rename to get_vram for better clarity
- Apply concurrency limit to the to_markdown function in the Gradio app
- Update VRAM checking logic in app.py and model_utils.py
- Add None and type checks for VRAM values
- Adjust concurrency limit calculation in app.py
- Modify clean_vram function to handle cases with no VRAM information
feat(gradio_app): implement dynamic concurrency limit based on VRAM
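The commit above derives a concurrency limit from available VRAM and guards against missing or non-numeric values. A minimal sketch of that idea follows. The function name `concurrency_limit_from_vram`, the `per_task_gb` budget, and the fallback of 1 are assumptions chosen for illustration; they are not the PR's actual `get_concurrency_limit` implementation or its thresholds.

```python
import math
from typing import Optional, Union


def concurrency_limit_from_vram(
    vram_gb: Optional[Union[int, float]],
    per_task_gb: float = 8.0,  # assumed VRAM budget per concurrent task
    default: int = 1,          # assumed fallback when VRAM is unknown
) -> int:
    """Return how many tasks may run at once for the given VRAM (hypothetical helper)."""
    # None check and type check: unknown or non-numeric VRAM falls back to the default.
    if vram_gb is None or isinstance(vram_gb, bool):
        return default
    if not isinstance(vram_gb, (int, float)):
        return default
    # Reject NaN, infinity, and non-positive values.
    if not math.isfinite(vram_gb) or vram_gb <= 0 or per_task_gb <= 0:
        return default
    # One slot per full per-task budget, never fewer than the default.
    return max(default, int(vram_gb // per_task_gb))
```

A value computed this way could then be passed to whatever concurrency setting the serving layer exposes, so that smaller GPUs queue requests instead of running out of memory.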
- Introduce a lock to synchronize access to OCR model initialization
- This change improves thread safety when multiple threads access the OCR model concurrently
- The lock ensures that the OCR model is initialized only once, even in multi-threaded scenarios
perf(model): add threading lock for OCR model initialization
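The pattern this commit describes, a lock that ensures a model is initialized only once even when several threads request it, can be sketched as follows. `LazyModelHolder` and its `factory` argument are hypothetical names used for illustration only; this is not the code from the PR.

```python
import threading


class LazyModelHolder:
    """Create an expensive object at most once, even under concurrent access."""

    def __init__(self, factory):
        self._factory = factory        # callable that builds the model
        self._lock = threading.Lock()  # guards the first initialization
        self._model = None

    def get(self):
        # Fast path: skip the lock once the model exists.
        if self._model is None:
            with self._lock:
                # Re-check inside the lock: another thread may have
                # finished initialization while this one was waiting.
                if self._model is None:
                    self._model = self._factory()
        return self._model
```

Note that the lock here only protects initialization. It does not make later calls on the shared model thread-safe, which may be why subsequent commits in this PR move away from a shared instance.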
- Add condition to return existing model if already initialized
- Improve efficiency by avoiding redundant model creation
perf(model): optimize model initialization
- Remove unnecessary threading.Lock in AtomModelSingleton
- Add threading.Lock to CustomPEKModel for OCR processing
- Simplify model initialization logic in AtomModelSingleton
fix(model): simplify model initialization logic
…ion and improve threading

- Remove usage of ModelSingleton class
- Initialize model directly using custom_model_init function
- Add self._lock attribute to PDFExtractKit class for thread safety
- Replace local lock with self._lock for OCR processing
…CR model initialization

- Remove usage of AtomModelSingleton for OCR model initialization
- Add import of ocr_model_init from model_init module
- Update OCR model initialization process to use ocr_model_init function
- Remove lock for OCR processing as it's no longer needed
…or OCR model instantiation

- Remove usage of AtomModelSingleton for OCR model initialization
- Use ocr_model_init function for creating OCR model instance
- Update import statement to include ocr_model_init
- Comment out old OCR model initialization code
fix(multi-threading): Enable multi-threading support for PaddleOCR.
feat: update test case
…ation code

- Remove threading.Lock import and usage
- Delete unused model initialization comments and code
- Simplify OCR model initialization in both pdf_extract_kit.py and pdf_parse_union_core_v2.py
refactor(magic_pdf): remove unused threading lock and model initialization code
- Update `ultralytics` dependency to version >= 8.3.43
- This change ensures compatibility with yolov8 for formula detection
build(deps): specify minimum version for ultralytics
- Add threading support for OCR model initialization
- Modify AtomModelSingleton to handle thread-specific instances
- Update PDFExtractKit and PDFParseUnionCoreV2 to use new thread-safe OCR initialization
refactor(model): implement thread-safe OCR model initialization
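One common way to give each thread its own model instance, as the "thread-specific instances" bullet suggests, is `threading.local`. The sketch below assumes that approach; `PerThreadModelCache` and its `factory` callable are hypothetical, and the PR's actual change to AtomModelSingleton may differ.

```python
import threading


class PerThreadModelCache:
    """Keep a separate set of model instances for every thread."""

    def __init__(self, factory):
        self._factory = factory          # callable: key -> new model instance
        self._local = threading.local()  # attributes are private to each thread

    def get(self, key):
        # Each thread lazily gets its own dict of models.
        cache = getattr(self._local, "models", None)
        if cache is None:
            cache = self._local.models = {}
        # Build the model on first use within this thread only.
        if key not in cache:
            cache[key] = self._factory(key)
        return cache[key]
```

The trade-off is memory: every worker thread holds its own copy of the model, so this suits a small, bounded thread pool rather than one thread per request.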
… model instantiation

- Remove usage of AtomModelSingleton for OCR model creation
- Add ocr_model_init function to initialize OCR model
- Update OCR model initialization in pdf_extract_kit.py and pdf_parse_union_core_v2.py
- Modify txt_spans_extract_v2 function to accept ocr_model as a parameter
- Update parse_page_core function to use ocr_model instead of lang for OCR processing
refactor(ocr): replace AtomModelSingleton with ocr_model_init for OCR model instantiation
refactor(model): implement thread-safe OCR model initialization
@myhloli myhloli closed this Dec 6, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Dec 6, 2024
@myhloli myhloli deleted the release-0.10.6 branch December 6, 2024 12:23
2 participants