How many processing steps are in the SDK products?
This article describes the processing steps and the available options for FineReader Engine for Windows in more detail. The focus here is on “full text OCR” and “document conversion”. Please note that some of the steps are slightly different in other scenarios, especially in data capture scenarios, where only specific data is extracted. (Note: The options described below might not be available in older versions.)
The typical document conversion process contains the following steps:
- Document input
- Image preprocessing
- Document and layout analysis
- The actual recognition
- Verification of the results (optional step)
- Document export
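The order of these steps, with verification as an optional stage, can be sketched as a simple pipeline. This is only an illustrative model: the `Stage` class, the `record` helper and `run_pipeline` are hypothetical names invented for this sketch and are not part of the FineReader Engine API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Document = Dict[str, object]


@dataclass
class Stage:
    """One processing step; all names here are hypothetical, not SDK objects."""
    name: str
    run: Callable[[Document], Document]
    optional: bool = False


def record(step_name: str) -> Callable[[Document], Document]:
    """Build a placeholder step that only notes that it was executed."""
    def apply(doc: Document) -> Document:
        history = list(doc.get("history", []))
        history.append(step_name)
        return {**doc, "history": history}
    return apply


# The six steps listed above, in order. Verification is the optional one.
CONVERSION_STAGES: List[Stage] = [
    Stage("document input", record("document input")),
    Stage("image preprocessing", record("image preprocessing")),
    Stage("document and layout analysis", record("document and layout analysis")),
    Stage("recognition", record("recognition")),
    Stage("verification", record("verification"), optional=True),
    Stage("document export", record("document export")),
]


def run_pipeline(stages: List[Stage], doc: Document,
                 include_optional: bool = False) -> Document:
    """Run the stages in order, skipping optional ones unless requested."""
    for stage in stages:
        if stage.optional and not include_optional:
            continue
        doc = stage.run(doc)
    return doc


if __name__ == "__main__":
    result = run_pipeline(CONVERSION_STAGES, {"source": "scan.tif"})
    print(result["history"])
```

In a real integration each placeholder would be replaced by the corresponding SDK call; the sketch only shows that the stages run in a fixed order and that verification can be left out.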
FineReader Engine can process documents and images from different sources and in different formats. These include a variety of image formats as well as all types of PDFs. Since version 12, the SDK can also process various Office document formats.
Once document pages are loaded into FineReader Engine, the SDK offers many options for image optimization. This is important to ensure the best possible image quality before the actual recognition step: the higher the quality of the image, the better the recognition results that can be achieved. Image enhancement options include, for example, garbage and noise removal and skew correction. More information about it can be found here.
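To illustrate what noise removal means in principle, the following minimal sketch clears isolated foreground pixels from a small binary image. It is a generic despeckling technique written for illustration only, not the algorithm used by FineReader Engine; the function name and the list-of-lists image representation are assumptions of this sketch.

```python
from typing import List

BinaryImage = List[List[int]]  # 1 = foreground (ink), 0 = background


def remove_isolated_pixels(image: BinaryImage) -> BinaryImage:
    """Return a copy of the image with isolated foreground pixels cleared.

    A foreground pixel counts as isolated when none of its eight
    neighbours is foreground. Generic illustration, not the SDK's method.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            has_neighbour = any(
                image[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_neighbour:
                cleaned[y][x] = 0
    return cleaned


if __name__ == "__main__":
    noisy = [
        [1, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 1, 1],
        [0, 0, 0, 0],
    ]
    for row in remove_isolated_pixels(noisy):
        print(row)
```

The single speck in the top-left corner is removed, while the two connected pixels, which could be part of a character stroke, are kept.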
Document & Layout Analysis
After the image quality has been enhanced, the document pages are analyzed by advanced algorithms that leverage artificial intelligence mechanisms. The goal of this step is to “understand” the structure of the whole document as well as of each individual page.
Each page contains elements that need to be identified:
This processing step is very important because it detects the text areas that will be recognized during the next step. The information from this step is also used at the very end of the process, in the export step, when the output is produced in the same layout as the original document (for example, when converting a scan of a contract into MS Word format).
The Document Analysis step is also needed for creating a searchable PDF, where the text is placed as an 'invisible text layer' behind the original image. For a later search, it is important that the invisible text areas are in the same place as the image of the text. More information about it can be found here.
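The alignment requirement for the invisible text layer comes down to a coordinate conversion. The sketch below assumes that a word's bounding box is given in image pixels with the origin at the top-left corner, and converts it to default PDF user space, where one unit is 1/72 inch and the origin is at the bottom-left corner of the page. The function name and the box layout are assumptions for illustration; FineReader Engine performs this placement internally during export.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def pixel_box_to_pdf_points(box: Box, dpi: float, page_height_px: float) -> Box:
    """Convert a (left, top, right, bottom) pixel box to PDF points.

    Pixels use a top-left origin; default PDF user space uses a
    bottom-left origin with 72 units per inch, so the y axis is flipped.
    Returns (x0, y0, x1, y1) with y0 as the lower edge.
    """
    left, top, right, bottom = box
    x0 = left * 72.0 / dpi
    x1 = right * 72.0 / dpi
    y0 = (page_height_px - bottom) * 72.0 / dpi
    y1 = (page_height_px - top) * 72.0 / dpi
    return (x0, y0, x1, y1)


if __name__ == "__main__":
    # A word on a US Letter page scanned at 300 dpi (2550 x 3300 pixels).
    print(pixel_box_to_pdf_points((300, 600, 900, 660), dpi=300, page_height_px=3300))
```

If the scan resolution used in the conversion does not match the resolution of the embedded image, the invisible text drifts away from the visible words, and search hits are highlighted in the wrong place.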
Once the recognition areas are detected and defined, each individual character and each individual word are recognized by sophisticated algorithms.
Several parameters play an important role:
Developers can work with the default settings, but they also have the possibility to choose from many different options to optimize the recognition process and achieve better recognition results. More information about it can be found here.
Verification & User Interaction
After the recognition step, FineReader Engine provides basic information such as character coordinates, as well as advanced attributes such as:
- Font and Formatting Information
- Word and Character Recognition Hypotheses
More information about it can be found here.
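One way such attributes can support a verification step is to flag characters whose best recognition hypothesis has low confidence, so that an operator only reviews the uncertain ones. The data model below is purely hypothetical: the class names, the 0 to 1 confidence scale and the threshold are assumptions made for this sketch and do not reflect the actual FineReader Engine object model.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CharHypothesis:
    """One candidate reading of a character (hypothetical structure)."""
    char: str
    confidence: float  # assumed scale: 0.0 (no trust) to 1.0 (certain)


@dataclass
class RecognizedChar:
    """A character position with its coordinates and candidate readings."""
    box: Tuple[int, int, int, int]  # left, top, right, bottom in pixels
    hypotheses: List[CharHypothesis]

    @property
    def best(self) -> CharHypothesis:
        return max(self.hypotheses, key=lambda h: h.confidence)


def chars_needing_review(chars: List[RecognizedChar],
                         threshold: float = 0.8) -> List[int]:
    """Return the indexes of characters whose best hypothesis is below threshold."""
    return [i for i, c in enumerate(chars) if c.best.confidence < threshold]


if __name__ == "__main__":
    word = [
        RecognizedChar((10, 5, 20, 25), [CharHypothesis("c", 0.97)]),
        RecognizedChar((21, 5, 26, 25), [CharHypothesis("l", 0.55),
                                         CharHypothesis("1", 0.45)]),
        RecognizedChar((27, 5, 37, 25), [CharHypothesis("e", 0.95)]),
    ]
    print("".join(c.best.char for c in word))
    print(chars_needing_review(word))
```

Here only the second character, where "l" and "1" are close competitors, would be passed on for manual review.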
The last processing step is the document export. The SDK offers access to the “pure text” as well as to different export formats:
- Editable formats like RTF and Microsoft Office
- PDF and PDF/A export
- Internal Engine format