Adaptive binarization is part of ABBYY's image pre-processing algorithms. The technology was optimized to improve the quality of source images before they reach the character recognizers. At the same time, the optimized binarization approach preserves more text on images of degraded quality and removes “noise” caused by 'shine-through' text from the back side of the document page.
Optimization of the binarization algorithms and its impact on the amount of 'rescued' text.
The images below show the difference: with the optimized binarization process, a significantly higher portion of the text can be extracted from a page that was scanned under adverse lighting. 'Standard binarization' retrieves text only from the areas not affected by the light.
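The effect of uneven lighting can be illustrated with a minimal sketch that compares one fixed threshold with a threshold taken from each pixel's neighbourhood. This is a generic local-mean technique, not ABBYY's algorithm, whose details are not described here. The function names, the `radius` and `offset` parameters, and the synthetic test page are all assumptions made for illustration.

```python
def global_threshold(image, threshold=128):
    """Mark a pixel as text (1) when it is darker than one fixed threshold."""
    return [[1 if px < threshold else 0 for px in row] for row in image]


def integral_image(image):
    """Summed-area table with an extra zero row and column."""
    height, width = len(image), len(image[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += image[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def local_mean_threshold(image, radius=2, offset=10):
    """Mark a pixel as text (1) when it is darker than its local mean minus offset."""
    height, width = len(image), len(image[0])
    table = integral_image(image)
    result = []
    for y in range(height):
        y0, y1 = max(0, y - radius), min(height, y + radius + 1)
        out_row = []
        for x in range(width):
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            # Sum of the window in constant time via the summed-area table.
            total = table[y1][x1] - table[y0][x1] - table[y1][x0] + table[y0][x0]
            mean = total / ((y1 - y0) * (x1 - x0))
            out_row.append(1 if image[y][x] < mean - offset else 0)
        result.append(out_row)
    return result


def make_test_page(width=20, height=9):
    """Hypothetical page: brightness rises left to right, dark strokes every 4th column."""
    page = []
    for _ in range(height):
        row = []
        for x in range(width):
            background = 60 + 10 * x
            is_stroke = x % 4 == 2
            row.append(background - 50 if is_stroke else background)
        page.append(row)
    return page


page = make_test_page()
fixed = global_threshold(page)
local = local_mean_threshold(page)
```

On this synthetic page the fixed threshold turns the whole dark side into "text" and loses the strokes on the bright side, while the local threshold marks only the stroke columns. Real scans need a larger window and a tuned offset.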
The images below show the difference when processing a low-quality image with white text on a black page. A standard binarization approach extracts text from only a very small area of the image, while the optimized binarization algorithms extract the text completely.
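One common way to handle light text on a dark page is to detect the dominant polarity and invert the image before thresholding. The sketch below shows that idea only. It is not taken from ABBYY's implementation, and the function name and the median-based rule are assumptions.

```python
from statistics import median


def normalize_polarity(image, max_level=255):
    """Return a copy in which the background is light and the text is dark.

    Assumption: the background covers more than half of the pixels, so a
    dark median grey level indicates light text on a dark page.
    """
    levels = [px for row in image for px in row]
    if median(levels) < max_level / 2:
        # Dark page: flip every grey level so the text becomes dark.
        return [[max_level - px for px in row] for row in image]
    return [row[:] for row in image]


dark_page = [[20, 20, 230, 20], [20, 20, 20, 20]]
normalized = normalize_polarity(dark_page)
```

After normalization, the same dark-text binarization step can be applied to both kinds of page.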
The images below show the difference when processing newspaper pages, which are very thin, so text on the back of the page shines through. With the new binarization approach, the shine-through text can be ignored during the recognition process, which leads to better recognition results.
- Book scanners often use strong light. After standard binarization, this can result in “ghost text lines” shining through that carry no useful information.
- ABBYY's improved binarization technology detects these lines and removes the “garbage” before text recognition is applied.
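Shine-through text is usually fainter than text printed on the front of the page, so one simple way to suppress it is to require a minimum contrast against the paper before a pixel counts as text. The sketch below illustrates that idea under stated assumptions: the paper level is estimated as the median grey level, the front ink as the darkest pixel, and `min_contrast` is a hypothetical tuning parameter. It is not ABBYY's method.

```python
from statistics import median


def suppress_show_through(image, min_contrast=0.5):
    """Mark a pixel as text (1) only if it is clearly darker than the paper.

    Assumptions: the median grey level approximates the paper, the darkest
    pixel approximates front-side ink, and back-side text is fainter than both.
    """
    levels = [px for row in image for px in row]
    paper = median(levels)
    ink = min(levels)
    # Pixels must cover at least min_contrast of the paper-to-ink range.
    cutoff = paper - min_contrast * (paper - ink)
    return [[1 if px < cutoff else 0 for px in row] for row in image]


# Paper = 220, shine-through = 170, front-side ink = 40 (illustrative values).
page = [[220, 220, 170, 220, 40, 220],
        [220, 40, 220, 170, 220, 220]]
mask = suppress_show_through(page)
```

With these values the cutoff is 130, so the faint 170-level pixels are treated as background and only the 40-level ink is kept.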
The optimized binarization was introduced in version 10 of FineReader Engine (10/2010) and version 10 of FlexiCapture Engine (10/2012).