OCR Processing Steps

All ABBYY SDKs and other document conversion software products (such as FineReader Server or ABBYY FineReader Desktop) perform the same basic steps during the recognition process.

This article describes these steps, and the options available at each of them, in more detail for ABBYY FineReader Engine for Windows. The focus here is on “full-text OCR” and “document conversion”. Please note that some of the steps are slightly different in other scenarios, especially in data capture scenarios, where only specific data is extracted. (Note: The options described below might not be available in older versions.)

The typical document conversion process consists of the following steps:

  1. Document input
  2. Image preprocessing
  3. Document and layout analysis
  4. The actual recognition
  5. Verification of the results (optional step)
  6. Document export
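
The sequence above can be pictured as a simple pipeline in which each step hands its result to the next. The following Python sketch is only an illustration: the Job class, the stage names and run_pipeline are hypothetical and are not part of the FineReader Engine API. It shows nothing more than the order of the steps and the fact that verification can be skipped.

```python
from dataclasses import dataclass, field
from typing import List

# Stage names are illustrative labels, not FineReader Engine identifiers.
STAGES = [
    "input",
    "preprocessing",
    "analysis",
    "recognition",
    "verification",  # optional step
    "export",
]


@dataclass
class Job:
    """Hypothetical container that is passed from stage to stage."""
    source: str
    completed: List[str] = field(default_factory=list)


def run_pipeline(source: str, verify: bool = False) -> Job:
    """Run the stages in order, skipping verification unless requested."""
    job = Job(source)
    for stage in STAGES:
        if stage == "verification" and not verify:
            continue
        # A real implementation would do the stage's work here.
        job.completed.append(stage)
    return job


if __name__ == "__main__":
    print(run_pipeline("contract.tif").completed)
    print(run_pipeline("contract.tif", verify=True).completed)
```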

1) Document Input

ABBYY FineReader Engine can process documents and images from different sources and in different formats. These include a variety of image formats as well as all types of PDFs. Since version 12, the SDK can also process a number of Office document formats.

2) Image Preprocessing

Once the document pages are loaded into FineReader Engine, the SDK offers many options for image optimization. This is important to ensure the best possible image quality before the actual recognition step: the higher the quality of the image, the better the recognition results that can be achieved. Image enhancement options include, for example, garbage and noise removal and skew correction.
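
To give an idea of what noise removal means in practice, here is a minimal Python sketch that deletes isolated ink pixels ("speckles") from a binary image. It is a generic textbook-style filter written for this article, not the algorithm used by FineReader Engine, and the function name remove_isolated_pixels and the list-of-lists image representation are assumptions made for illustration.

```python
from typing import List

Image = List[List[int]]  # 1 = ink, 0 = background


def remove_isolated_pixels(image: Image) -> Image:
    """Return a copy in which ink pixels without any ink neighbour are cleared."""
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            # Look at the up to eight surrounding pixels of the original image.
            has_neighbour = any(
                image[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_neighbour:
                cleaned[y][x] = 0
    return cleaned


if __name__ == "__main__":
    page = [
        [1, 0, 0, 0],  # single speckle in the corner
        [0, 0, 0, 0],
        [0, 0, 1, 1],  # two connected pixels, kept
        [0, 0, 0, 0],
    ]
    for row in remove_isolated_pixels(page):
        print(row)
```

Real preprocessing has to be more careful than this, because small marks such as periods or the dots on "i" must survive the clean-up.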

3) Document & Layout Analysis

After the image quality has been enhanced, the document pages are analyzed by advanced algorithms that leverage artificial intelligence. The goal of this step is to “understand” the structure of the whole document as well as of each individual page.

Each page contains elements that need to be identified:

  • Paragraphs
  • Lines
  • Images
  • Barcodes

This processing step is very important because it identifies the text areas that are recognized in the next step. The information from this step is also used at the very end of the process, in the export step, when the output is produced in the same layout as the original document (for example, when converting a scan of a contract into an MS Word document).
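
One way to picture the result of this step is as a list of typed regions per page, from which only the text regions are passed on to recognition. The Python sketch below assumes a hypothetical Block record and a very simple top-to-bottom ordering that is only adequate for single-column pages; it does not reflect the data model or the analysis logic of FineReader Engine.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # left, top, right, bottom in pixels


@dataclass
class Block:
    """Hypothetical page region found by layout analysis."""
    kind: str  # "paragraph", "image" or "barcode"
    box: Box


def paragraphs_in_reading_order(blocks: List[Block]) -> List[Block]:
    """Keep only paragraph blocks, sorted by top edge and then by left edge."""
    paragraphs = [b for b in blocks if b.kind == "paragraph"]
    return sorted(paragraphs, key=lambda b: (b.box[1], b.box[0]))


if __name__ == "__main__":
    page = [
        Block("paragraph", (100, 900, 2300, 1400)),
        Block("image", (100, 300, 1100, 800)),
        Block("paragraph", (100, 100, 2300, 250)),
        Block("barcode", (1800, 3000, 2300, 3150)),
    ]
    for block in paragraphs_in_reading_order(page):
        print(block)
```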

The document analysis step is also needed to create a searchable PDF, in which the recognized text is placed as an 'invisible text layer' behind the original image. For later searching, it is important that each invisible text area is in the same position as the corresponding text in the image.
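
Placing the text layer correctly is largely a matter of converting coordinates. Image coordinates are measured in pixels from the top-left corner, while the default PDF coordinate system is measured in points (1/72 inch) from the bottom-left corner of the page. The Python sketch below shows that conversion for one word box. The function name pixel_box_to_pdf is hypothetical, and the sketch assumes that the image covers the whole page and that the page origin is at (0, 0); it is not how FineReader Engine builds its PDF output.

```python
from typing import Tuple

POINTS_PER_INCH = 72  # default PDF user-space unit is 1/72 inch


def to_points(pixels: float, dpi: float) -> float:
    """Convert a length in pixels to PDF points at the given resolution."""
    return pixels * POINTS_PER_INCH / dpi


def pixel_box_to_pdf(
    box: Tuple[int, int, int, int], page_height_px: int, dpi: float
) -> Tuple[float, float, float, float]:
    """Map (left, top, right, bottom) in pixels to (x, y, width, height) in points.

    The y axis is flipped because PDF measures upwards from the bottom edge.
    """
    left, top, right, bottom = box
    x = to_points(left, dpi)
    y = to_points(page_height_px - bottom, dpi)
    width = to_points(right - left, dpi)
    height = to_points(bottom - top, dpi)
    return x, y, width, height


if __name__ == "__main__":
    # A word one inch from the left and top edges of a 300 dpi, 11 inch tall page.
    print(pixel_box_to_pdf((300, 300, 900, 360), page_height_px=3300, dpi=300))
```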

4) Recognition

Once the recognition areas are detected and defined, each individual character and each individual word are recognized by sophisticated algorithms.

Several parameters play an important role:

  • Languages
  • Fonts
  • Print types

Developers can work with the default settings, but they also have the possibility to choose from many different options to optimize the recognition process and achieve better recognition results.

5) Verification & User Interaction

After the recognition step, FineReader Engine provides basic information such as character coordinates, which an application can use to verify the results, as well as advanced attributes such as:

  • Font and Formatting Information
  • Word and Character Recognition Hypotheses
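
Recognition hypotheses are what make a verification step practical: characters for which the best guess is not clearly ahead can be shown to a user for review. The Python sketch below illustrates the idea with a hypothetical CharResult record, made-up confidence values between 0 and 1, and an arbitrary threshold. None of these names or value ranges are taken from the FineReader Engine API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CharResult:
    """Hypothetical record for one recognized character."""
    hypotheses: List[Tuple[str, float]]  # (character, confidence) pairs
    box: Tuple[int, int, int, int]       # left, top, right, bottom in pixels

    def best(self) -> Tuple[str, float]:
        """Return the hypothesis with the highest confidence."""
        return max(self.hypotheses, key=lambda h: h[1])


def best_text(chars: List[CharResult]) -> str:
    """Join the most likely character of every position into a string."""
    return "".join(c.best()[0] for c in chars)


def needs_review(chars: List[CharResult], threshold: float = 0.8) -> List[int]:
    """Return the indices of characters whose best confidence is below the threshold."""
    return [i for i, c in enumerate(chars) if c.best()[1] < threshold]


if __name__ == "__main__":
    word = [
        CharResult([("O", 0.55), ("0", 0.45)], (0, 0, 10, 12)),  # ambiguous
        CharResult([("K", 0.98)], (11, 0, 20, 12)),
    ]
    print(best_text(word))
    print(needs_review(word))
```

The character coordinates stored with each result allow a verification interface to highlight the uncertain character on the original image.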


 

6) Export/Document Output

The last processing step is the document export. ABBYY technology offers access to the “pure text” as well as different export formats:

  • Text only
  • Editable formats like RTF and Microsoft Office
  • PDF, PDF/A Export
  • XML Export
  • Internal Engine Format


 
