Fixed the GetUTF8Text method to return nullptr instead of asserting #4451
Description
This PR fixes a crash in the GetUTF8Text method by returning nullptr instead of asserting when best_choice is nullptr.
Background
The issue was originally reported in sirfz/tesserocr#324, where a crash/assertion failure occurred when passing an empty image to tesserocr/Tesseract. The error message was:
While reproducing the issue with a test C++ program, I encountered a similar crash in another location:
Changes
Updated GetUTF8Text to safely return nullptr if best_choice is nullptr, instead of triggering an assertion failure.
Additional Notes
I tested this with Warp2, which suggested updating multiple instances of ASSERT_HOST(best_choice != nullptr);. This PR addresses the crash in the context of GetUTF8Text.
There are still 3 other occurrences of this assertion in the following files:
src/ccmain/recogtraining.cpp
src/ccstruct/pageres.cpp
It's unclear if those should also be replaced, as they may be used in different contexts (e.g., training or layout analysis). Further review is recommended before changing them.
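For reviewers, the guard described under Changes can be sketched in isolation. This is a minimal illustration and not the actual Tesseract code: Choice, Word, and GetTextOrNull are hypothetical stand-ins, and only the null check mirrors the idea of this PR.

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical stand-ins for illustration only; not Tesseract's real types.
struct Choice {
  std::string text;
};

struct Word {
  const Choice* best_choice = nullptr;  // may stay null, e.g. for an empty image
};

// Returns a heap-allocated, NUL-terminated copy of the word's text, or
// nullptr when there is no best choice. The caller releases it with delete[].
char* GetTextOrNull(const Word& word) {
  if (word.best_choice == nullptr) {
    return nullptr;  // an assertion here would abort the process instead
  }
  const std::string& s = word.best_choice->text;
  char* out = new char[s.size() + 1];
  std::memcpy(out, s.c_str(), s.size() + 1);  // copies the trailing NUL too
  return out;
}
```

With a guard like this, callers have to check the result for nullptr before using it. In exchange, an empty input becomes a recoverable condition rather than an abort.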
Reproduction
Here is a minimal test program that reproduces the crash (when using an empty image):