Missing character when saving after using multiple add_redact_annot & insert_htmlbox #3270
              
Unanswered
Nader-Khalil asked this question in Looking for help
Replies: 2 comments, 2 replies
Highly complex post - and apparently not a bug, but a call for help.
Description of the bug
Hello there,

I'm using the library for a PDF translation process.
I've managed to get a lot working, but I'm stuck on one problem.
This is an original page from a PDF that contains 9 pages:
This is the result after translation to Arabic:

You can see missing characters in many of the words (I've pointed to some with red arrows).
After some tracing and trials, I built a PDF containing only this page to make the tracing simpler:

To my surprise, the letters appear! 😅 Not all of them: 2 are still missing, but most are now shown correctly.
Note the corrections in the middle squares compared to the previous image.
I need help handling this. When a page contains fewer blocks it works well, but when a page contains many blocks the problem occurs.
I also have another question: the PDF's size increased a lot after saving, to more than 10x the original. Is there something I can do to bring it close to the original size?
Thanks in advance
How to reproduce the bug
I'm using Python code in these steps:
Python in Google Colab
Define translate_sent(sent, source, trgt) function:
Translate text from source language to target language
Define check_table(page, text, bbox) function:
Determine if the specified text in a bbox is a table by checking alignment
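The body of `check_table` is not shown here. As a rough illustration only, an alignment check of this kind could look like the sketch below. It assumes the caller has already collected the bounding boxes `(x0, y0, x1, y1)` of the text lines inside the bbox (for example from `page.get_text("dict")`); the names `group_rows` and `looks_like_table` and the tolerance values are hypothetical.

```python
def group_rows(boxes, tol=3.0):
    """Group (x0, y0, x1, y1) boxes into rows whose top edges lie within tol."""
    rows = []
    for box in sorted(boxes, key=lambda b: (b[1], b[0])):
        # Compare against the top edge of the first box in the current row.
        if rows and abs(box[1] - rows[-1][0][1]) <= tol:
            rows[-1].append(box)
        else:
            rows.append([box])
    return rows


def looks_like_table(boxes, tol=3.0, min_rows=2, min_cols=2):
    """Heuristic: several rows whose cells start at the same x positions."""
    # Left edges of every row that has enough cells to count as a table row.
    col_starts = [
        sorted(b[0] for b in row)
        for row in group_rows(boxes, tol)
        if len(row) >= min_cols
    ]
    if len(col_starts) < min_rows:
        return False
    ref = col_starts[0]  # the first multi-cell row serves as the reference
    aligned = sum(
        1
        for xs in col_starts
        if len(xs) == len(ref)
        and all(abs(a - b) <= tol for a, b in zip(xs, ref))
    )
    return aligned >= min_rows
```

A plain paragraph yields one box per row and is rejected, while two or more rows sharing the same column starts are accepted. Real tables with merged or empty cells would need a more forgiving comparison.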
Define process_table_text(page, bbox, text, src_lang, trgt_lang, css_style) function:
(This one can be ignored for now; nothing is wrong with it.)
Translate and replace table text on the page within the specified bbox using provided styles
The exact function to handle the page is as follows:
To run it:
PyMuPDF version
1.23.26
Operating system
Other
Python version
3.10