Before analysing the structure of a document and identifying its blocks, an OCR program binarizes the image, i.e. it converts a colour or greyscale image into a monochrome (1-bit) one.
Some documents contain complex design elements such as textures or background images (or both).
If an OCR program uses only very simple binarization, too many excess dots are left around the characters, which has an adverse effect on recognition quality.
The same is true when background images are binarized. It is therefore crucial that the program can separate the text from the underlying textures and background images. To solve this issue, ABBYY technologies use two pre-processing procedures:
- Intelligent Background Filtering
- Adaptive Binarization
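To make the problem concrete, here is a minimal sketch of the "very simple" approach: one fixed threshold applied to every pixel. It is an illustration only, not ABBYY code; the function name `binarize_global`, the threshold of 128, and the tiny sample image are all assumptions made for this example.

```python
def binarize_global(gray, threshold=128):
    """Map each 0-255 greyscale pixel to 1 (ink) if darker than threshold, else 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


# Hypothetical 4x8 greyscale strip: a dark stroke (value 20) in columns 2-3
# on a light background (230), plus two texture speckles (value 100).
page = [
    [230, 230, 20, 20, 230, 230, 230, 230],
    [230, 100, 20, 20, 230, 230, 230, 230],
    [230, 230, 20, 20, 230, 230, 100, 230],
    [230, 230, 20, 20, 230, 230, 230, 230],
]

mono = binarize_global(page)

# Count the pixels that belong to the stroke and the ones that do not.
stroke_pixels = sum(row[2] + row[3] for row in mono)
stray_dots = sum(sum(row) for row in mono) - stroke_pixels

print(stroke_pixels, stray_dots)
```

The stroke survives, but so do the texture speckles: they fall below the same fixed threshold and end up as excess dots next to the character.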
Intelligent Background Filtering
Intelligent Background Filtering (IBF) allows the program to separate text strings from the background, even in complex layouts; the system selects the optimal binarization parameters for each individual region. Moreover, once a document has been treated with IBF, the lower-level objects such as text blocks and tables on pages with complex layouts can be detected more accurately.
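How IBF works internally is not described here, so the following is only a rough sketch of the general idea of judging each pixel against its local background rather than against a fixed value. It swaps in a simple, well-known technique (a local-maximum background estimate) and should not be read as ABBYY's algorithm. The function names, the window `radius`, and the `ratio` of 0.4 are assumptions chosen for the sample data.

```python
def estimate_background(gray, radius=2):
    """Estimate the background as the brightest pixel in a square window.

    Assumes dark text on a lighter background, with strokes thinner than
    the window, so every window contains at least one background pixel.
    """
    h, w = len(gray), len(gray[0])
    bg = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            bg[y][x] = max(
                gray[yy][xx]
                for yy in range(max(0, y - radius), min(h, y + radius + 1))
                for xx in range(max(0, x - radius), min(w, x + radius + 1))
            )
    return bg


def binarize_with_background(gray, radius=2, ratio=0.4):
    """Mark a pixel as ink (1) if it is much darker than its local background."""
    bg = estimate_background(gray, radius)
    return [
        [1 if px < ratio * b else 0 for px, b in zip(row, bg_row)]
        for row, bg_row in zip(gray, bg)
    ]


# Hypothetical strip: the left half is plain paper (230), the right half lies
# on a darker background image (110); one stroke (20) sits in each half.
row = [230, 230, 20, 230, 230, 110, 110, 20, 110, 110]
page = [row[:] for _ in range(3)]

naive = [[1 if px < 128 else 0 for px in r] for r in page]  # fixed threshold
filtered = binarize_with_background(page)

print(naive[0])
print(filtered[0])
```

With the fixed threshold, the whole background image turns black and swallows the second stroke; with the local background estimate, both strokes are kept and the background image disappears. A simple estimate like this does not remove fine textures, which is one reason a production system needs something more elaborate.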
Intelligent Background Filtering at work:
Adaptive Binarization
The presence of background images or textures is not the only factor that can impair recognition quality. Low contrast in the original document and changing background brightness can also reduce it. For such documents, the Adaptive Binarization procedure is used. It measures the brightness of the background and the saturation of the black areas along each line in order to find optimal binarization parameters for each fragment of the line. As a result, lines and words are detected correctly, which leads to higher recognition accuracy.
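The per-fragment idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual Adaptive Binarization procedure: the fragments have a fixed width, the background brightness is taken as the brightest pixel in the fragment, the ink level as the darkest, and the threshold as their midpoint. The names `binarize_line_adaptive`, `fragment_width`, and `min_contrast` are hypothetical.

```python
def binarize_line_adaptive(strip, fragment_width=4, min_contrast=30):
    """Binarize a text-line strip fragment by fragment.

    For each fragment, measure the background brightness (brightest pixel)
    and the ink level (darkest pixel), then threshold at their midpoint.
    """
    height, width = len(strip), len(strip[0])
    out = [[0] * width for _ in range(height)]
    for start in range(0, width, fragment_width):
        cols = range(start, min(start + fragment_width, width))
        values = [strip[y][x] for y in range(height) for x in cols]
        background, ink = max(values), min(values)
        if background - ink < min_contrast:
            continue  # almost uniform fragment: treat it as blank
        threshold = (background + ink) / 2
        for y in range(height):
            for x in cols:
                out[y][x] = 1 if strip[y][x] < threshold else 0
    return out


# Hypothetical line with three fragments: normal contrast (240/60),
# a shadowed area (120/40), and faded ink on grey paper (200/150).
row = [240, 60, 240, 240, 120, 40, 120, 120, 200, 150, 200, 200]
strip = [row[:], row[:]]

fixed = [[1 if px < 128 else 0 for px in r] for r in strip]  # one global threshold
adaptive = binarize_line_adaptive(strip)

print(fixed[0])
print(adaptive[0])
```

The single global threshold blackens the shadowed fragment and loses the faded stroke entirely, while the per-fragment thresholds (150, 80, and 175 for this data) recover one stroke in each fragment.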
Adaptive Binarization at work:
Adaptive Binarization - impact on characters
The following image shows why proper binarization is important for good OCR results: