Advantages of Multilevel Document Analysis
In many cases, documents are more complex than a page of black text printed on a white background. Modern layouts often include tables, pictures, headers and footers, background images and so on.
To recognize such documents and preserve their complex formatting, today's OCR programs first analyse the structure of a document before they start reading it.
During this process, several logical levels are singled out. At the top of the hierarchy there is only one object: the page itself. The levels, in descending order, are:
- page
- table, text block
- table cell
- paragraph, picture
- line
- word, picture within a line
- letter (character)
Any object in this hierarchy is composed of lower-level objects: letters make up words, words make up lines, and so on.
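The hierarchy can be pictured as a tree in which every object may only contain objects from a lower level. The following is a minimal sketch of that idea, not ABBYY's internal data model: the `Node` class, the `LEVEL` table, the `depth` helper and the kind name "inline picture" (standing in for "picture within a line") are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field

# Rank of each object kind, following the list of levels in this article.
# A lower number means a higher level. (Illustrative table, not an ABBYY API.)
LEVEL = {
    "page": 0,
    "table": 1, "text block": 1,
    "table cell": 2,
    "paragraph": 3, "picture": 3,
    "line": 4,
    "word": 5, "inline picture": 5,
    "character": 6,
}


@dataclass
class Node:
    kind: str
    text: str = ""
    children: list = field(default_factory=list)

    def add(self, child):
        # An object may only contain objects from a lower level.
        if LEVEL[child.kind] <= LEVEL[self.kind]:
            raise ValueError(f"{child.kind} cannot be placed inside {self.kind}")
        self.children.append(child)
        return child


def depth(node):
    # Number of levels on the longest path from this node down to a leaf.
    return 1 + max((depth(c) for c in node.children), default=0)


# Build one branch of the tree: page > text block > paragraph > line > word > characters.
page = Node("page")
block = page.add(Node("text block"))
para = block.add(Node("paragraph"))
line = para.add(Node("line"))
word = line.add(Node("word"))
for ch in "OCR":
    word.add(Node("character", text=ch))
```

Trying to nest a higher-level object inside a lower-level one, for example `word.add(Node("page"))`, raises a `ValueError` in this sketch.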
The OCR technology analyses the document from the top down: it first divides the page into larger objects, which it in turn divides into smaller objects, and so on, until it reaches the level of characters.
Once the characters have been singled out and recognized, a reverse process begins: the program assembles individual objects into larger objects, which are finally combined into the complete page. This procedure is called Multilevel Document Analysis, or MDA.
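The two passes can be illustrated with a toy example. The sketch below uses a plain string as a stand-in for a page image and assumes that paragraphs are separated by blank lines and words by single spaces. The functions `split_page`, `recognize_char` and `assemble` are hypothetical and only show the top-down split followed by the bottom-up assembly. Real OCR works on images and is far more involved.

```python
def recognize_char(glyph):
    # Placeholder for character recognition: a real engine would classify
    # an image fragment here. This sketch simply returns the glyph.
    return glyph


def split_page(page):
    # Top-down pass: page -> paragraphs -> lines -> words -> characters.
    return [
        [[list(word) for word in line.split()] for line in para.splitlines()]
        for para in page.split("\n\n")
    ]


def assemble(tree):
    # Bottom-up pass: characters -> words -> lines -> paragraphs -> page.
    paragraphs = []
    for para in tree:
        lines = []
        for line in para:
            words = ["".join(recognize_char(c) for c in word) for word in line]
            lines.append(" ".join(words))
        paragraphs.append("\n".join(lines))
    return "\n\n".join(paragraphs)


page = "Multilevel document\nanalysis\n\nSecond paragraph"
tree = split_page(page)
result = assemble(tree)
```

Because `recognize_char` returns its input unchanged, `result` is identical to `page` here. In a real system this is the step where recognition errors would enter.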
If an OCR program makes a recognition error at one of the higher levels of analysis (e.g. if it mistakes a paragraph for a picture), there is very little chance that it will come up with the right result, and the document will be recognized incorrectly.
To achieve the highest recognition accuracy, the ABBYY OCR technology analyses documents slightly differently than other OCR engines:
- When recognizing objects at any level, the ABBYY technology is guided by the IPA principles.
- It starts by forming hypotheses about the nature of the objects and then purposefully verifies them.
- At the same time, it takes into account the distinctive features it has detected in the document and saves the newly acquired information for future use (self-learning).
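The hypothesize-verify-remember loop described above can be sketched in a few lines. This is a toy illustration only and does not reflect how ABBYY implements it: the `Classifier` class, the `CHECKS` table and the region features `has_baselines` and `has_ruling_lines` are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical verification checks, one per hypothesis about a region's nature.
CHECKS = {
    "text": lambda r: r["has_baselines"] and not r["has_ruling_lines"],
    "table": lambda r: r["has_ruling_lines"],
    "picture": lambda r: not r["has_baselines"] and not r["has_ruling_lines"],
}


class Classifier:
    def __init__(self):
        # How often each kind has been confirmed so far on this document.
        self.seen = Counter()

    def classify(self, region):
        # Form hypotheses, trying the most frequently confirmed kinds first.
        hypotheses = sorted(CHECKS, key=lambda kind: -self.seen[kind])
        for kind in hypotheses:
            if CHECKS[kind](region):   # verify the hypothesis
                self.seen[kind] += 1   # keep the outcome for later regions
                return kind
        return "unknown"


clf = Classifier()
table = {"has_baselines": True, "has_ruling_lines": True}
text = {"has_baselines": True, "has_ruling_lines": False}
picture = {"has_baselines": False, "has_ruling_lines": False}
kinds = [clf.classify(r) for r in (table, text, picture)]
```

The checks in this sketch are mutually exclusive, so the stored counts only change the order in which hypotheses are tried, not the outcome.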