Question
How many processing steps are in the SDK products?
Answer
All SDKs and other document conversion software products (such as FineReader Server or FineReader Desktop) perform the same basic steps during the recognition process.
This article describes these steps, and the options available at each one, in more detail for FineReader Engine for Windows. The focus here is on “full text OCR” and “document conversion”. Please note that some of the steps are slightly different in other scenarios, especially in data capture scenarios, where only specific data is extracted. (Note: The options described below might not be available in older versions.)
The typical document conversion process consists of the following steps:
- Document input
- Image preprocessing
- Document and layout analysis
- The actual recognition
- Verification of the results (optional step)
- Document export
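The order of the steps above can be sketched in a few lines of Python. This is only an illustration of the sequence, with verification treated as an optional step; the names `Document`, `STAGES`, and `convert` are hypothetical and are not part of the FineReader Engine API.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical stage list: (stage name, is_optional). Not FineReader Engine API.
STAGES = [
    ("document input", False),
    ("image preprocessing", False),
    ("document and layout analysis", False),
    ("recognition", False),
    ("verification", True),   # optional step
    ("document export", False),
]

@dataclass
class Document:
    source: str
    completed: List[str] = field(default_factory=list)

def convert(source: str, verify: bool = False) -> Document:
    """Run the stages in order, skipping optional ones unless requested."""
    doc = Document(source)
    for name, optional in STAGES:
        if optional and not verify:
            continue
        # A real stage would transform the document here; this sketch only records the order.
        doc.completed.append(name)
    return doc

if __name__ == "__main__":
    print(convert("contract.tif").completed)
    print(convert("contract.tif", verify=True).completed)
```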
Document Input
FineReader Engine can process documents and images from different sources and in different formats. These include a variety of image formats as well as all types of PDFs. Since version 12, the SDK can also process various Office document formats.
Image Preprocessing
Once document pages are loaded into FineReader Engine, the SDK offers many options for image optimization. This step ensures the best possible image quality before the actual recognition step: the higher the quality of the image, the better the recognition results. Image enhancement options include, for example, garbage and noise removal and skew correction. More information about it can be found here.
Document & Layout Analysis
After the image quality has been enhanced, the document pages are analyzed by advanced algorithms that leverage artificial intelligence mechanisms. The goal of this step is to “understand” the structure of the whole document as well as of each individual page.
Each page contains elements that need to be identified:
- Paragraphs
- Lines
- Images
- Barcodes
This processing step is very important because it detects the text areas that will be recognized in the next step. The information from this step is also used at the very end of the recognition process, in the export step, when the output is produced in the same layout as the original document (for example, when converting a scan of a contract into an MS Word format).
The Document Analysis step is also needed to create a searchable PDF, where the text is placed as an 'invisible text layer' behind the original image. For a later search, it is important that the invisible text areas are in the same place as the image of the text. More information about it can be found here.
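To illustrate why the positions matter, the sketch below maps a text rectangle from image pixels to PDF page coordinates. It relies on two general PDF conventions: default PDF units are points (1/72 inch), and the page origin is at the bottom-left, while image coordinates usually start at the top-left. The function name and parameters are hypothetical; FineReader Engine performs this placement internally during PDF export, and this is not its API.

```python
from typing import Tuple

POINTS_PER_INCH = 72.0  # default PDF user-space unit is 1/72 inch

def pixel_rect_to_pdf_points(
    left: int, top: int, right: int, bottom: int,
    dpi: float, page_height_px: int,
) -> Tuple[float, float, float, float]:
    """Convert an image-space rectangle (origin top-left, pixels) to a
    PDF-space rectangle (origin bottom-left, points).

    Returns (x0, y0, x1, y1) with (x0, y0) as the lower-left corner.
    """
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    x0 = left * POINTS_PER_INCH / dpi
    x1 = right * POINTS_PER_INCH / dpi
    # Flip the vertical axis: the image bottom edge becomes the lower PDF y value.
    y0 = (page_height_px - bottom) * POINTS_PER_INCH / dpi
    y1 = (page_height_px - top) * POINTS_PER_INCH / dpi
    return (x0, y0, x1, y1)

if __name__ == "__main__":
    # A word located one inch from the left and top edges of a
    # 300 dpi scan that is 3300 pixels (11 inches) tall.
    print(pixel_rect_to_pdf_points(300, 300, 900, 360, dpi=300, page_height_px=3300))
```

If the invisible text were placed with a different scale or without the vertical flip, a search hit would highlight the wrong region of the page image.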
Recognition
Once the recognition areas are detected and defined, each individual character and each individual word are recognized by sophisticated algorithms.
Several parameters play an important role:
- Languages
- Fonts
- Print types
Developers can work with the default settings, but they can also choose from many different options to optimize the recognition process and achieve better recognition results. More information about it can be found here.
Verification & User Interaction
After the recognition step, FineReader Engine provides basic information such as character coordinates, as well as advanced attributes that can support verification of the results, such as:
- Font and Formatting Information
- Word and Character Recognition Hypotheses
More information about it can be found here.
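As a rough illustration of how such per-character information could drive an optional verification step, the sketch below flags characters whose confidence falls below a threshold so that a user can review them. The `CharResult` structure, the 0 to 100 confidence scale, and the threshold are assumptions made for this example and do not reflect the FineReader Engine object model.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class CharResult:
    """Hypothetical per-character result; not a FineReader Engine type."""
    char: str
    confidence: int                      # assumed scale: 0 (worst) to 100 (best)
    rect: Tuple[int, int, int, int]      # left, top, right, bottom in pixels

def flag_for_review(chars: List[CharResult], threshold: int = 50) -> List[int]:
    """Return the indexes of characters whose confidence is below the threshold."""
    return [i for i, c in enumerate(chars) if c.confidence < threshold]

if __name__ == "__main__":
    recognized = [
        CharResult("H", 98, (10, 10, 30, 40)),
        CharResult("e", 95, (32, 10, 50, 40)),
        CharResult("1", 41, (52, 10, 60, 40)),   # possibly a misread "l"
        CharResult("l", 47, (62, 10, 70, 40)),
        CharResult("o", 99, (72, 10, 92, 40)),
    ]
    for i in flag_for_review(recognized):
        c = recognized[i]
        print(f"review character {c.char!r} at {c.rect} (confidence {c.confidence})")
```

The stored coordinates let a verification interface show the matching image region next to each uncertain character.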
Export/Document Output
The last processing step is the document export. The SDK offers access to the “pure text” as well as different export formats:
- Text only
- Editable formats like RTF and Microsoft Office
- PDF, PDF/A Export
- XML Export
- Internal Engine Format