What is Document Analysis in SDK products?



Within document processing, the OCR software product conducts a Document Analysis step to identify the areas that contain the information to be extracted and to gather information needed for the later Document Export step.

Within the Document Analysis step, the logical structure of the document has to be analyzed and defined. For example:

  • What is the logical structure of the document?
  • Where are text blocks, paragraphs, lines?
  • Is there a table that should be reconstructed?
  • Are there any “images” on the page(s)?
  • Are there any barcodes to read?

Our technology contains several variants of Document Layout Analysis:

Automatic Document Analysis

Automatic Document Analysis (DA) searches the document images and finds the zones that should be recognized.

How it works:

  • The Document Analysis algorithms detect different elementary objects on the image, e.g.
    • words or parts of words
    • separators
    • connected components
    • color gradients, inverted text areas
    • …etc.
  • In the next step, based on these elementary objects, hypotheses about the blocks on the page are formed and checked:
    • What is the type of the block?
    • Where are the borders of the block?
    • What type of document layout could it be (magazine, newspaper, book page)?
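
To make the idea of "elementary objects" more concrete, here is a minimal sketch of one of them, connected components, found on a tiny binary image. It is an illustration only and not the SDK's actual algorithm: the function name `connected_components`, the image representation (rows of 0/1 values) and the use of 4-connectivity are all assumptions made for this example.

```python
from collections import deque


def connected_components(image):
    """Return bounding boxes (left, top, right, bottom) of 4-connected
    foreground regions in a binary image given as rows of 0/1 values.

    Hypothetical helper for illustration; not part of any SDK.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []

    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            # Flood-fill one component with a breadth-first search.
            queue = deque([(x, y)])
            seen[y][x] = True
            left, top, right, bottom = x, y, x, y
            while queue:
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes


sample = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
]
boxes = connected_components(sample)
print(boxes)
```

In a real analysis step, bounding boxes like these would be one of several inputs to the block hypotheses described above (block type, block borders, layout type).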

In the following screenshot, the detected layout elements (text, image and table blocks) are displayed on the left-hand side. These elements were precisely reconstructed in the exported document shown on the right-hand side.



A thorough document analysis is even more important for documents with complex layouts, such as multi-column magazine pages. The following screenshot shows the properly identified layout elements on the left-hand side, as well as the resulting page with precisely reconstructed columns.

Intelligent Document Analysis detecting columns:



Without the intelligent layout analysis, only one large text block would be identified. This would make the column reconstruction impossible, and the resulting document would not be readable and therefore not usable for a human.
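
The difference between one large text block and separate columns can be sketched with a simple gap-based grouping of word bounding boxes. This is a simplified illustration under stated assumptions, not the method the SDK uses: the function `split_into_columns`, the `min_gap` threshold and the box format `(left, top, right, bottom)` are hypothetical.

```python
def split_into_columns(word_boxes, min_gap=20):
    """Group word boxes (left, top, right, bottom) into columns.

    Words are swept from left to right; a new column starts whenever a
    word begins at least `min_gap` pixels to the right of everything
    seen so far. Hypothetical helper for illustration only.
    """
    columns = []
    for box in sorted(word_boxes, key=lambda b: b[0]):
        left, _top, right, _bottom = box
        if columns and left - columns[-1]["right"] < min_gap:
            # The word overlaps or nearly touches the current column.
            column = columns[-1]
            column["right"] = max(column["right"], right)
            column["words"].append(box)
        else:
            columns.append({"left": left, "right": right, "words": [box]})

    # Restore reading order inside each column: top to bottom, then left to right.
    for column in columns:
        column["words"].sort(key=lambda b: (b[1], b[0]))
    return columns


words = [
    (10, 10, 60, 20), (70, 10, 120, 20), (10, 30, 110, 40),       # left column
    (200, 10, 260, 20), (270, 10, 310, 20), (200, 30, 300, 40),   # right column
]
columns = split_into_columns(words, min_gap=20)
for column in columns:
    print(column["left"], column["right"], len(column["words"]))
```

With a very large `min_gap`, the same input collapses into a single column, which corresponds to the "one large text block" case: lines from both columns would then be read across the page and the text order would be lost.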

No detection of columns:



Document/Layout Analysis Modes

Automatic Document Analysis in the SDKs can work with different settings:

  • Text, images, tables and barcodes can be detected.
  • Text areas can be detected even if they are embedded in images, such as company logos.

It is also possible to use the SDK without applying the built-in document layout analysis. In this case, the developer can create their own blocks (recognition areas) by setting the coordinates manually. This is a common approach when only specific data should be extracted from the page and the position of this data is known (Field-Level OCR, also called Zonal OCR).
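
As a rough sketch of what manually defined recognition areas can look like, the following example describes a zone by its pixel coordinates, clips it to the page size and cuts the corresponding region out of a page. All names here (`Zone`, `clip_to_page`, `crop`) are hypothetical and do not come from the SDK; the page is modelled as a plain list of pixel rows, and the actual recognition call is intentionally left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """A manually defined recognition area in pixel coordinates.

    `right` and `bottom` are exclusive. Hypothetical type for illustration.
    """
    name: str
    left: int
    top: int
    right: int
    bottom: int


def clip_to_page(zone, page_width, page_height):
    """Return the zone limited to the page area; reject empty zones."""
    left = max(0, min(zone.left, page_width))
    right = max(0, min(zone.right, page_width))
    top = max(0, min(zone.top, page_height))
    bottom = max(0, min(zone.bottom, page_height))
    if left >= right or top >= bottom:
        raise ValueError(f"zone {zone.name!r} does not overlap the page")
    return Zone(zone.name, left, top, right, bottom)


def crop(page, zone):
    """Cut the zone out of a page given as a list of pixel rows."""
    return [row[zone.left:zone.right] for row in page[zone.top:zone.bottom]]


# A tiny 4x3 "page"; the zone for a known field extends past the page edge.
page = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
]
invoice_number = clip_to_page(Zone("invoice_number", 2, 1, 10, 3), 4, 3)
region = crop(page, invoice_number)
print(invoice_number, region)
```

Only the cropped region would then be passed on to recognition, which is why this approach fits cases where the position of the data is known in advance.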

