This article describes how to improve recognition results by training the technology on your own user-created patterns.
The following topics are covered:
- What is recognition with pattern training
- How does pattern training work
- When pattern training is useful
What is Recognition with Pattern Training
To improve recognition accuracy, you can train the technology on your own patterns. A few pages are processed in training mode, in which the user manually corrects the recognition results and thereby trains the technology. A trained pattern gives the technology additional information on how to recognize certain characters, for example decorative fonts or ligatures (where letters appear to be 'glued' together).
After pattern training, ABBYY FineReader Engine uses this information to recognize the remaining text.
How does Pattern Training work
One or two pages are recognized in training mode, and a pattern is created from them. The pattern then serves as an additional source of information when the remaining text is recognized.
A pattern is a set of pairs “a character image - the character itself” created during pattern training.
Sometimes two or even three characters are “glued” together and cannot be separated (see Wikipedia: Ligature). If this is the case (i.e., you cannot move the frame so that it contains exactly one whole character and no parts of a neighbouring character), you can train the whole 'inseparable' character combination as a single unit. Character combinations frequently found 'glued together' include ff, fi, and fl (see picture above). Such combinations are referred to as ligatures.
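The idea of a pattern as a set of “character image - character” pairs, including ligatures trained as one unit, can be illustrated with a toy model. This is a conceptual sketch only and not the FineReader Engine API: the `Pattern` class, the `recognize` function, the tiny bitmaps, and the `generic_ocr` stand-in are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# A glyph image is modelled as a tuple of bitmap rows (hypothetical representation).
Glyph = Tuple[str, ...]


@dataclass
class Pattern:
    """Toy pattern: a set of pairs 'character image -> character(s)'."""
    pairs: Dict[Glyph, str] = field(default_factory=dict)

    def train(self, glyph: Glyph, text: str) -> None:
        # In training mode the user confirms or corrects what the image shows.
        # 'text' may hold several letters when a ligature cannot be split.
        self.pairs[glyph] = text

    def lookup(self, glyph: Glyph) -> Optional[str]:
        # Exact match only; a real engine would tolerate noise and scaling.
        return self.pairs.get(glyph)


def recognize(glyphs: List[Glyph], pattern: Pattern,
              fallback: Callable[[Glyph], str]) -> str:
    """Prefer trained pairs, fall back to the generic recognizer otherwise."""
    result = []
    for glyph in glyphs:
        trained = pattern.lookup(glyph)
        result.append(trained if trained is not None else fallback(glyph))
    return "".join(result)


# Two made-up glyph images: an inseparable 'fi' ligature and a plain 'n'.
FI_LIGATURE: Glyph = ("##.#", "#...", "##.#", "#..#")
LETTER_N: Glyph = ("#.#", "##.", "#.#")


def generic_ocr(glyph: Glyph) -> str:
    # Stand-in for recognition without a pattern: it knows 'n' only.
    return "n" if glyph == LETTER_N else "?"


pattern = Pattern()
before = recognize([FI_LIGATURE, LETTER_N], pattern, generic_ocr)

# One training step: the user labels the glued glyph as the pair "fi".
pattern.train(FI_LIGATURE, "fi")
after = recognize([FI_LIGATURE, LETTER_N], pattern, generic_ocr)
```

In this sketch the ligature is unreadable before training and resolves to "fi" afterwards, while characters the generic recognizer already handles are unaffected.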
When is Recognition with Pattern Training useful?
Extended recognition with a custom-trained pattern is useful in the following cases:
- the text is set in decorative fonts
- the text contains unusual characters (e.g. specific mathematical symbols)
- a large number of documents, or long documents, of very low print quality need to be processed (more than a hundred pages)
Note: It is recommended to use Recognition with Pattern Training only if one of the above applies. In 'standard situations', the slight increase in recognition quality will be outweighed by considerably longer processing times.
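A trained pattern is only useful for documents that share the following with the pages used for training:
- the same font
- the same font size
- the same resolution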
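The criteria given in this section for deciding whether pattern training is worthwhile can be sketched as a small check. This is a minimal illustration, not part of any ABBYY API: the `Job` fields and the function name are hypothetical, and the threshold simply encodes the "more than a hundred pages" guideline for low-quality documents.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical description of a recognition job."""
    decorative_fonts: bool = False
    unusual_characters: bool = False
    low_print_quality: bool = False
    total_pages: int = 0


# Guideline from the text: low-quality batches of more than a hundred pages.
LARGE_JOB_PAGES = 100


def pattern_training_recommended(job: Job) -> bool:
    """Return True if at least one of the listed criteria applies."""
    large_low_quality = (
        job.low_print_quality and job.total_pages > LARGE_JOB_PAGES
    )
    return job.decorative_fonts or job.unusual_characters or large_low_quality


# Example jobs: a short low-quality scan versus a large low-quality archive.
short_scan = Job(low_print_quality=True, total_pages=40)
large_archive = Job(low_print_quality=True, total_pages=500)
```

A 'standard' job with none of the flags set is not recommended for training in this sketch, which mirrors the note that the longer processing time outweighs the small quality gain in such cases.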
Learn more about this feature in the video (relevant for FineReader Engine 11 as well as 12).