Commit

Merge pull request #56 from myhloli/master
Some OCR text boxes differ too much from their block boxes; lower the fill threshold to 0.7
myhloli authored Apr 23, 2024
2 parents 61267ed + ce992f2 commit 1e69067
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion magic_pdf/pre_proc/ocr_dict_merge.py
@@ -156,7 +156,7 @@ def fill_spans_in_blocks(blocks, spans):
         block_spans = []
         for span in spans:
             span_bbox = span['bbox']
-            if calculate_overlap_area_in_bbox1_area_ratio(span_bbox, block_bbox) > 0.8:
+            if calculate_overlap_area_in_bbox1_area_ratio(span_bbox, block_bbox) > 0.7:
                 block_spans.append(span)

         '''Inline formula adjustment: match the height to the text on the same line (left side first, then right)'''
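
To illustrate what the lowered threshold changes, here is a minimal sketch. It assumes, based only on its name, that `calculate_overlap_area_in_bbox1_area_ratio` returns the intersection area divided by the area of the first box; that helper's body is not part of this diff. `overlap_ratio_in_bbox1`, `spans_in_block`, and the sample boxes are hypothetical stand-ins, not code from the repository.

```python
def overlap_ratio_in_bbox1(bbox1, bbox2):
    """Hypothetical stand-in: intersection area divided by bbox1's own area."""
    # Corners of the intersection rectangle; boxes are [x0, y0, x1, y1].
    x0 = max(bbox1[0], bbox2[0])
    y0 = max(bbox1[1], bbox2[1])
    x1 = min(bbox1[2], bbox2[2])
    y1 = min(bbox1[3], bbox2[3])
    if x1 <= x0 or y1 <= y0:
        return 0.0  # no overlap (this also covers a zero-area bbox1)
    bbox1_area = (bbox1[2] - bbox1[0]) * (bbox1[3] - bbox1[1])
    return (x1 - x0) * (y1 - y0) / bbox1_area


def spans_in_block(block_bbox, spans, threshold):
    """Keep the spans whose share of area inside the block exceeds the threshold."""
    return [
        span for span in spans
        if overlap_ratio_in_bbox1(span['bbox'], block_bbox) > threshold
    ]


block_bbox = [0, 0, 100, 20]
spans = [
    # Fully inside the block: ratio 1.0.
    {'bbox': [10, 2, 50, 18], 'content': 'inside'},
    # 30 of its 40 units of width lie inside the block: ratio 0.75.
    {'bbox': [70, 0, 110, 20], 'content': 'overhang'},
]

old = [s['content'] for s in spans_in_block(block_bbox, spans, 0.8)]
new = [s['content'] for s in spans_in_block(block_bbox, spans, 0.7)]
print(old)
print(new)
```

Under these assumptions, a span whose OCR box overhangs the block by a quarter of its area is dropped at the old 0.8 threshold and assigned to the block at 0.7, which is the kind of mismatch the commit message describes.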