Question
What binarization enhancements are used in SDK products?
Answer
Adaptive binarization is an innovative approach within the image pre-processing algorithms of SDK products. The technology has been optimized to improve the quality of source images passed to the character recognizers. At the same time, the optimized binarization approach preserves more text on images of degraded quality and removes “noise” caused by 'shine-through' text from the back side of the document page.
Optimization of the binarization algorithms and its impact on the amount of 'rescued' text.
The images below show the difference: with the optimized binarization process, a significantly higher portion of the text can be extracted from a page scanned under unfavorable lighting. 'Standard binarization' would retrieve text only from areas not affected by the light.
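To illustrate why a locally adaptive threshold copes better with uneven lighting than a single fixed one, here is a minimal sketch in plain Python. It is not the algorithm used in the SDK products, whose details are not described here. It is a generic local-mean threshold, and the function names, the `radius` and `offset` parameters, and the synthetic test page are all assumptions made for this example.

```python
def global_binarize(img, threshold=128):
    """Mark a pixel as ink (1) when it is darker than one fixed threshold."""
    return [[1 if p < threshold else 0 for p in row] for row in img]


def adaptive_binarize(img, radius=2, offset=20):
    """Mark a pixel as ink (1) when it is darker than its local mean minus offset.

    Generic local-mean thresholding; NOT the SDK's implementation.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the neighbourhood, clipped at the image borders.
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [img[j][i] for j in ys for i in xs]
            local_mean = sum(window) / len(window)
            out[y][x] = 1 if img[y][x] < local_mean - offset else 0
    return out


def make_test_page(width=12, height=5):
    """Hypothetical page: brightness rises left to right, with three dark strokes."""
    page = []
    for y in range(height):
        row = []
        for x in range(width):
            background = 80 + 10 * x          # shadow on the left, bright on the right
            is_stroke = x in (1, 5, 9) and 1 <= y <= 3
            row.append(background - 60 if is_stroke else background)
        page.append(row)
    return page


if __name__ == "__main__":
    page = make_test_page()
    for name, result in (("global", global_binarize(page)),
                         ("adaptive", adaptive_binarize(page))):
        print(name)
        for row in result:
            print("".join("#" if v else "." for v in row))
```

On this synthetic page, the fixed threshold turns the whole shadowed left side into ink, so the stroke there can no longer be told apart from the background, while the local threshold keeps only the three strokes.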
The images below show the difference when processing a low-quality image with white text on a black page. A standard binarization approach would extract text from only a very small area of the image, while the optimized binarization algorithms extract the text completely.
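One common way to handle light text on a dark page is to detect the polarity first and invert the image before thresholding. The sketch below shows that idea only. Whether the SDK products work this way is not stated here, and the function names and the `midpoint` parameter are assumptions for illustration.

```python
def is_light_on_dark(img, midpoint=128):
    """Guess polarity: True when most pixels are darker than the midpoint."""
    pixels = [p for row in img for p in row]
    dark = sum(1 for p in pixels if p < midpoint)
    return dark * 2 > len(pixels)


def normalize_polarity(img):
    """Return a dark-text-on-light-background copy of an 8-bit grayscale image."""
    if is_light_on_dark(img):
        # Invert so later steps can always assume dark ink on light paper.
        return [[255 - p for p in row] for row in img]
    return [row[:] for row in img]
```

After this step, any thresholding routine that assumes dark ink on light paper can be applied unchanged.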
The images below show the difference when processing newspaper pages (which are very thin), where text on the back of the page shines through. With the new binarization approach, the shine-through text can be ignored during the recognition process. This leads to better recognition results.
- Book scanners often use strong light. After standard binarization, this can result in “ghost text lines” shining through that carry no useful information.
- Improved binarization technology is able to detect this and remove the “garbage” before text recognition is applied.