Recognition with Pattern Training

This article describes how to improve recognition results by training the technology on your own, user-created patterns.


What is Recognition with Pattern Training?

To improve recognition accuracy, you can train the technology on your own patterns. In training mode, a couple of pages are recognized and the user manually corrects the recognition results, which trains the technology. A trained pattern gives the technology additional information on how to recognize certain characters, for example decorative fonts or ligatures (where letters may appear to be 'glued' together).

After pattern training, ABBYY FineReader Engine uses this information to recognize the remaining text.

How does Pattern Training work?

One or two pages are recognized in training mode, and a pattern is created from them. The pattern then serves as a source of additional information when the remaining text is recognized.

user_pattern_training-1.png
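The two phases described above can be sketched as a toy model. This is not the ABBYY FineReader Engine API: `base_recognize`, `train`, `recognize`, and the glyph identifiers are hypothetical names chosen for illustration, and consulting the pattern before the built-in recognizer is an assumption about how the additional information is applied.

```python
from typing import Callable, Dict, List

Glyph = str  # stand-in for a character image, e.g. an image hash (assumption)


def base_recognize(glyph: Glyph) -> str:
    """Toy built-in recognizer: knows two plain glyphs, returns '?' otherwise."""
    known = {"img_a": "a", "img_b": "b"}
    return known.get(glyph, "?")


def train(pages: List[List[Glyph]],
          correct: Callable[[Glyph, str], str]) -> Dict[Glyph, str]:
    """Phase 1: recognize the training pages and keep every user correction."""
    pattern: Dict[Glyph, str] = {}
    for page in pages:
        for glyph in page:
            guess = base_recognize(glyph)
            truth = correct(glyph, guess)  # the user confirms or fixes the guess
            if truth != guess:
                pattern[glyph] = truth
    return pattern


def recognize(page: List[Glyph], pattern: Dict[Glyph, str]) -> str:
    """Phase 2: use the trained pattern where it applies, else the base recognizer."""
    return "".join(
        pattern[g] if g in pattern else base_recognize(g) for g in page
    )


# Page 0 is the training page; page 1 is the "remaining text".
pages = [["img_a", "img_fancy_Q"], ["img_b", "img_fancy_Q", "img_a"]]
user_answers = {"img_fancy_Q": "Q"}  # what the user types while correcting

pattern = train(pages[:1], lambda glyph, guess: user_answers.get(glyph, guess))
print(recognize(pages[1], pattern))
```

Without the trained pattern, the decorative glyph on the second page would come back as an unknown character; with it, the correction made once on the training page carries over to the rest of the document.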

A pattern is a set of pairs created during pattern training, each consisting of a character image and the character it represents.

user_pattern_training-2.png

Sometimes two or even three characters are “glued” together and cannot be separated (see Wikipedia: Ligature). If this is the case (i.e. you cannot move the frame so that it contains exactly one whole character and no parts of a neighboring character), you can train the whole 'inseparable' character combination as a single unit. Character combinations frequently found 'glued together' include ff, fi, and fl (see picture above). Such combinations are referred to as ligatures.

user_pattern_training-3.png  
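As a conceptual illustration of the "character image - character" pairs and of ligatures trained as a single unit, here is a minimal sketch. It does not reflect the actual pattern file format or matching logic of ABBYY FineReader Engine: the `PatternEntry` type, the tiny 3x3 bitmaps, the pixel-difference metric, and the `max_distance` threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# A glyph image is modelled as rows of '#' (ink) and '.' (background).
Bitmap = Tuple[str, ...]


@dataclass(frozen=True)
class PatternEntry:
    image: Bitmap  # the character image captured during training
    text: str      # the character(s) it stands for; a ligature holds more than one


def distance(a: Bitmap, b: Bitmap) -> int:
    """Count differing pixels; this toy metric requires equal dimensions."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("bitmaps must have the same dimensions")
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def lookup(pattern: List[PatternEntry], glyph: Bitmap,
           max_distance: int = 1) -> Optional[str]:
    """Return the text of the closest trained entry, or None if none is close enough."""
    best = min(pattern, key=lambda e: distance(e.image, glyph), default=None)
    if best is None or distance(best.image, glyph) > max_distance:
        return None
    return best.text


PATTERN = [
    PatternEntry(image=("#.#", "###", "#.#"), text="H"),
    # An inseparable combination is stored as one image mapped to two characters.
    PatternEntry(image=("###", "#..", "###"), text="fi"),
]

noisy = ("###", "#..", "##.")  # one pixel differs from the trained "fi" image
print(lookup(PATTERN, noisy))
```

The point of the sketch is the shape of the data: one image may map to several characters, so a ligature never has to be split into individual letters before it can be recognized.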

When is the Recognition with Pattern Training useful?

Recognition with a custom-trained pattern is useful when:

  • the text is set in decorative fonts
  • the text contains unusual characters (e.g. specific mathematical symbols)
  • you process a large number of documents, or long documents (more than a hundred pages), of very low print quality

Note: It is recommended to use Recognition with Pattern Training only if one of the above applies. In 'standard situations', the slight increase in recognition quality will be outweighed by considerably longer processing times.

Patterns are only useful when the documents to be processed and the document used to create the user pattern share:

  • the same font
  • the same font size
  • the same resolution

Note: Pattern training is not supported for hieroglyphic or Asian languages.
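The three matching conditions listed above can be expressed as a simple pre-check before a trained pattern is reused on a new batch. This is only a sketch: `PrintProfile` and `mismatches` are hypothetical helpers, not part of ABBYY FineReader Engine, and the font name and values are made-up sample data.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class PrintProfile:
    """Hypothetical description of how a document was printed and scanned."""
    font: str
    font_size_pt: float
    resolution_dpi: int


def mismatches(trained_on: PrintProfile, target: PrintProfile) -> List[str]:
    """List the properties in which the target differs from the training document."""
    return [
        f.name
        for f in fields(PrintProfile)
        if getattr(trained_on, f.name) != getattr(target, f.name)
    ]


training_doc = PrintProfile(font="Fraktur", font_size_pt=10.0, resolution_dpi=300)
same_batch = PrintProfile(font="Fraktur", font_size_pt=10.0, resolution_dpi=300)
rescanned = PrintProfile(font="Fraktur", font_size_pt=10.0, resolution_dpi=200)

print(mismatches(training_doc, same_batch))  # empty list: the pattern can be reused
print(mismatches(training_doc, rescanned))   # names the property that differs
```

Returning the list of differing properties, rather than a plain yes/no, makes it easier to see why a pattern should not be reused for a given batch.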

Learn more about this feature in the video (relevant for FineReader Engine 11 as well as 12).
