Release 0.10.6 #1213

myhloli · 2024-12-06T09:39:31Z


master->dev

- Add get_concurrency_limit function to calculate concurrency limit based on VRAM
- Update clean_vram function and rename to get_vram for better clarity
- Apply concurrency limit to the to_markdown function in the Gradio app

- Update VRAM checking logic in app.py and model_utils.py
- Add None and type checks for VRAM values
- Adjust concurrency limit calculation in app.py
- Modify clean_vram function to handle cases with no VRAM information

feat(gradio_app): implement dynamic concurrency limit based on VRAM
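The VRAM-based limit described above could look roughly like the following sketch. The function name follows the commit message, but the signature, the 8 GB-per-worker figure, and the cap of 4 are assumptions chosen for illustration, not values taken from the repository.

```python
from typing import Optional, Union


def get_concurrency_limit(
    vram_gb: Optional[Union[int, float]],
    gb_per_worker: float = 8.0,  # assumed memory budget per concurrent job
    max_limit: int = 4,          # assumed upper bound on concurrency
) -> int:
    """Derive a concurrency limit from the available VRAM in GB.

    Falls back to 1 when the VRAM value is missing or not numeric,
    mirroring the "None and type checks" mentioned in the commits.
    """
    # bool is a subclass of int, so reject it explicitly.
    if vram_gb is None or isinstance(vram_gb, bool):
        return 1
    if not isinstance(vram_gb, (int, float)) or vram_gb <= 0:
        return 1
    # At least one job, at most max_limit jobs.
    return max(1, min(max_limit, int(vram_gb // gb_per_worker)))


if __name__ == "__main__":
    for value in (None, 4, 24, 80, "16"):
        print(repr(value), "->", get_concurrency_limit(value))
```

In Gradio 4.x, event listeners such as `Button.click` accept a `concurrency_limit` argument, so a value computed this way could plausibly be passed there; how app.py actually wires it up is not shown in this conversation.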

- Introduce a lock to synchronize access to OCR model initialization
- This change improves thread safety when multiple threads access the OCR model concurrently
- The lock ensures that the OCR model is initialized only once, even in multi-threaded scenarios

perf(model): add threading lock for OCR model initialization
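A minimal sketch of lock-guarded, initialize-once behavior, assuming a generic factory callable. `ModelHolder`, `get`, and `init_count` are hypothetical names for illustration and do not correspond to classes in the repository.

```python
import threading


class ModelHolder:
    """Hypothetical holder that builds its model at most once."""

    _lock = threading.Lock()
    _instance = None
    init_count = 0  # counts how often the factory actually ran

    @classmethod
    def get(cls, factory):
        # Fast path: return the existing model without taking the lock.
        if cls._instance is not None:
            return cls._instance
        with cls._lock:
            # Re-check inside the lock: another thread may have
            # finished initialization while this one was waiting.
            if cls._instance is None:
                cls._instance = factory()
                cls.init_count += 1
        return cls._instance


if __name__ == "__main__":
    threads = [
        threading.Thread(target=ModelHolder.get, args=(object,))
        for _ in range(8)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("factory calls:", ModelHolder.init_count)
```

The early return before the lock corresponds to the "return existing model if already initialized" idea: once the model exists, callers no longer contend for the lock.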

- Add condition to return existing model if already initialized
- Improve efficiency by avoiding redundant model creation

perf(model): optimize model initialization

- Remove unnecessary threading.Lock in AtomModelSingleton
- Add threading.Lock to CustomPEKModel for OCR processing
- Simplify model initialization logic in AtomModelSingleton

fix: update notify

fix(model): simplify model initialization logic

…ion and improve threading
- Remove usage of ModelSingleton class
- Initialize model directly using custom_model_init function
- Add self._lock attribute to PDFExtractKit class for thread safety
- Replace local lock with self._lock for OCR processing

…CR model initialization
- Remove usage of AtomModelSingleton for OCR model initialization
- Add import of ocr_model_init from model_init module
- Update OCR model initialization process to use ocr_model_init function
- Remove lock for OCR processing as it's no longer needed

…or OCR model instantiation
- Remove usage of AtomModelSingleton for OCR model initialization
- Use ocr_model_init function for creating OCR model instance
- Update import statement to include ocr_model_init
- Comment out old OCR model initialization code

fix(multi-threading ):Enable multi-threading support for PaddleOCR.

feat: update test case

…ation code
- Remove threading.Lock import and usage
- Delete unused model initialization comments and code
- Simplify OCR model initialization in both pdf_extract_kit.py and pdf_parse_union_core_v2.py

refactor(magic_pdf): remove unused threading lock and model initialization code

- Update `ultralytics` dependency to version >= 8.3.43
- This change ensures compatibility with yolov8 for formula detection

build(deps): specify minimum version for ultralytics

- Add threading support for OCR model initialization
- Modify AtomModelSingleton to handle thread-specific instances
- Update PDFExtractKit and PDFParseUnionCoreV2 to use new thread-safe OCR initialization

refactor(model): implement thread-safe OCR model initialization
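One common way to get thread-specific instances is `threading.local`, sketched below. Whether AtomModelSingleton uses this mechanism is not stated in the commits; `get_thread_model` and the factory argument are hypothetical names used only for illustration.

```python
import threading

# Each thread sees its own independent attributes on this object.
_thread_state = threading.local()


def get_thread_model(factory):
    """Return the calling thread's model, creating it on first use."""
    model = getattr(_thread_state, "model", None)
    if model is None:
        model = factory()
        _thread_state.model = model
    return model


if __name__ == "__main__":
    main_model = get_thread_model(object)
    other = []
    worker = threading.Thread(
        target=lambda: other.append(get_thread_model(object))
    )
    worker.start()
    worker.join()
    # The worker thread built its own instance instead of sharing.
    print("same instance across threads:", other[0] is main_model)
```

This trades memory for isolation: every thread pays for its own model, but no lock is needed around inference because no instance is shared.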

… model instantiation
- Remove usage of AtomModelSingleton for OCR model creation
- Add ocr_model_init function to initialize OCR model
- Update OCR model initialization in pdf_extract_kit.py and pdf_parse_union_core_v2.py
- Modify txt_spans_extract_v2 function to accept ocr_model as a parameter
- Update parse_page_core function to use ocr_model instead of lang for OCR processing

refactor(ocr): replace AtomModelSingleton with ocr_model_init for OCR model instantiation

refactor(model): implement thread-safe OCR model initialization

myhloli and others added 30 commits December 2, 2024 14:26

Merge pull request #1170 from opendatalab/master

fdf4715

master->dev

fix(vram): improve VRAM checking logic

104273c

- Update VRAM checking logic in app.py and model_utils.py
- Add None and type checks for VRAM values
- Adjust concurrency limit calculation in app.py
- Modify clean_vram function to handle cases with no VRAM information

Merge pull request #1177 from myhloli/dev

41b9cbc

feat(gradio_app): implement dynamic concurrency limit based on VRAM

Merge pull request #1193 from myhloli/dev

92ad41c

perf(model): add threading lock for OCR model initialization

perf(model): optimize model initialization

ce592f8

- Add condition to return existing model if already initialized
- Improve efficiency by avoiding redundant model creation

Merge pull request #1198 from myhloli/dev

7ca7e59

perf(model): optimize model initialization

refactor(magic_pdf): optimize model initialization and threading

878f3de

- Remove unnecessary threading.Lock in AtomModelSingleton
- Add threading.Lock to CustomPEKModel for OCR processing
- Simplify model initialization logic in AtomModelSingleton

fix: update notify

494859c

update yml

eb021e5

update yml

78e84f6

update notify

c77bec7

update runner env

e674848

update runner env

8327d9d

update runner env

fc6ea7a

update runner env

cf09313

Merge pull request #1201 from dt-yy/dev

dab0798

fix: update notify

fix(model): simplify model initialization logic

a9723c6

Merge pull request #1207 from myhloli/dev

272014c

fix(model): simplify model initialization logic

Merge pull request #1208 from myhloli/dev

92c10d1

fix(multi-threading ):Enable multi-threading support for PaddleOCR.

feat: update test case

1d6000e

Merge pull request #1209 from dt-yy/dev

ebfd6fd

feat: update test case

refactor(magic_pdf): remove unused threading lock and model initializ…

a1744b7

…ation code
- Remove threading.Lock import and usage
- Delete unused model initialization comments and code
- Simplify OCR model initialization in both pdf_extract_kit.py and pdf_parse_union_core_v2.py

Merge pull request #1211 from myhloli/dev

b8aab26

refactor(magic_pdf): remove unused threading lock and model initialization code

build(deps): specify minimum version for ultralytics

1f1335c

- Update `ultralytics` dependency to version >= 8.3.43
- This change ensures compatibility with yolov8 for formula detection

Merge pull request #1212 from myhloli/dev

ec5a09d

build(deps): specify minimum version for ultralytics

myhloli added 5 commits December 6, 2024 18:40

refactor(model): implement thread-safe OCR model initialization

f2a92d5

- Add threading support for OCR model initialization
- Modify AtomModelSingleton to handle thread-specific instances
- Update PDFExtractKit and PDFParseUnionCoreV2 to use new thread-safe OCR initialization

Merge pull request #1214 from myhloli/dev

0acfce2

refactor(model): implement thread-safe OCR model initialization

Merge pull request #1215 from myhloli/dev

ef5cffc

refactor(ocr): replace AtomModelSingleton with ocr_model_init for OCR model instantiation

Merge pull request #1216 from opendatalab/dev

5940f0f

refactor(model): implement thread-safe OCR model initialization

myhloli closed this Dec 6, 2024

github-actions bot locked and limited conversation to collaborators Dec 6, 2024

myhloli deleted the release-0.10.6 branch December 6, 2024 12:23
