ABBYY FineReader Engine 11 for Windows - Release 1
Release date: 24 October 2013 (public release)
Part #: 1041/17
Build #: 18.104.22.168
Classification - Image and text-based document type detection
Business Card Recognition - Single and multi-card processing, vCard export
Extended PDF Capabilities - PDF/A-2 & PDF/A-3 support, enhanced PDF processing
New and Improved OCR Technology - New OCR and ICR languages, new barcode types, improved image pre-processing
Development Improvements - 64-bit support, asynchronous scanning, new Java Native Interface (JNI) support
The two main steps of classification in detail:
You select several images of documents of each type. Representatives of each type have similar appearance (similar layout of elements). You can use these images to create the classification database. Scanned images or photos may need some pre-processing before database creation.
You can use the created database to classify documents in the document flow. You scan documents or load photographed documents and pass them to the pre-trained classification system, which uses the classification database to determine the type of each document. You may update the classification database each time you add new types of documents or change existing ones.
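The two steps above can be sketched in a few lines of Python. This is an illustration of the train-then-classify workflow only, not the FineReader Engine API: the function names `train` and `classify`, the nearest-centroid method, and the two layout features are all assumptions made for the example.

```python
from collections import defaultdict
from math import sqrt


def train(samples):
    """Step 1: build a toy 'classification database'.

    samples: iterable of (doc_type, feature_vector) pairs.
    Returns one mean feature vector (centroid) per document type.
    """
    sums = {}
    counts = defaultdict(int)
    for doc_type, features in samples:
        if doc_type not in sums:
            sums[doc_type] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[doc_type][i] += value
        counts[doc_type] += 1
    return {t: [s / counts[t] for s in sums[t]] for t in sums}


def classify(database, features):
    """Step 2: return the document type whose centroid is closest."""
    def distance(a, b):
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(database, key=lambda t: distance(database[t], features))


# Hypothetical layout features per image:
# (share of the page covered by tables, share covered by running text)
training = [
    ("invoice", [0.60, 0.30]),
    ("invoice", [0.70, 0.20]),
    ("letter", [0.00, 0.80]),
    ("letter", [0.10, 0.90]),
]
database = train(training)
print(classify(database, [0.65, 0.25]))
```

Updating the database when a new document type appears corresponds to calling `train` again with the extended sample set.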
Business Card Recognition - New Business Card recognition API
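To give an idea of what a vCard export produces, the sketch below serializes a set of recognized business card fields into vCard 3.0 text. It does not use the FineReader Engine API; the function `to_vcard` and the dictionary keys (`first_name`, `last_name`, `company`, `phone`, `email`) are hypothetical names chosen for the example.

```python
def escape(value):
    """Escape characters that are special in vCard 3.0 text values."""
    return (value.replace("\\", "\\\\")
                 .replace(";", "\\;")
                 .replace(",", "\\,")
                 .replace("\n", "\\n"))


def to_vcard(card):
    """Serialize a dict of recognized fields into a vCard 3.0 string."""
    lines = [
        "BEGIN:VCARD",
        "VERSION:3.0",
        # N is structured: family name; given name; additional; prefix; suffix
        "N:{};{};;;".format(escape(card["last_name"]),
                            escape(card["first_name"])),
        "FN:" + escape(card["first_name"] + " " + card["last_name"]),
    ]
    # Optional fields are written only when recognition produced a value.
    if card.get("company"):
        lines.append("ORG:" + escape(card["company"]))
    if card.get("phone"):
        lines.append("TEL;TYPE=WORK,VOICE:" + card["phone"])
    if card.get("email"):
        lines.append("EMAIL;TYPE=INTERNET:" + card["email"])
    lines.append("END:VCARD")
    # vCard lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"


card = {
    "first_name": "Jane",
    "last_name": "Doe",
    "company": "Example, Inc.",
    "phone": "+1 555 0100",
    "email": "jane.doe@example.com",
}
vcard = to_vcard(card)
print(vcard)
```

For multi-card processing, the same function could be applied to each recognized card and the results concatenated into one .vcf file.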
Extended PDF Capabilities
Multiple new API options give developers more control over the PDF processing steps, allowing them to fine-tune their own applications and services.
Opening (small) PDF files from memory
Up to 12% faster export speed, compared to previous technology
Ability to specify resolution for rasterization during PDF opening
Higher quality of highly compressed MRC PDFs
- Higher background image compression in Version 11 can reduce the size of output MRC PDF files by up to 50% (compared to the Version 10 implementation, based on ABBYY internal tests).
In addition to the common PDF and PDF/A-1 formats, FineReader Engine 11 now exports to PDF/A-2. The new options of the ISO standard format are:
- Support of JPEG2000 compression to generate smaller files
- PDF/A-2a – tagged & Unicode PDF/A-2
- PDF/A-2u – untagged PDF/A-2 with the ability to extract text in Unicode.
PDF/A-2 enables creation of smaller PDF files using JPEG2000 compression. For long-term archiving, this can help reduce used storage space and enable faster access when working on low bandwidth networks.
PDF/A-3 is an extension of the A-2 standard which allows inclusion of file formats such as XML, CSV, CAD, word-processing documents, spreadsheet documents, and others into PDF/A documents. Long-term archiving and readability of the PDF/A part is still guaranteed, and the binary attachments can deliver additional benefits.
The PDF/A-3 extended container capabilities make this format attractive in new areas, for example when a graphical representation of a document should be combined with some source data. The new e-invoice format defined by the Forum for Electronic Invoices Germany (FeRD) is based on PDF/A-3 and XML formats.
Read more about ZUGFeRD processing with ABBYY SDKs. (The API extension that allows adding binary attachments to the PDF/A-3 export was introduced in Release 3.)
- Native 64-bit Support
- Simplified Java Integration
FineReader Engine 11 can be used from Java on 64-bit systems either by loading into the current process (InprocLoader), or by loading into a separate process (OutprocLoader). The new ready-to-use Java classes for the Engine library cover the full API*.
- Extended Scanning Capabilities
New and updated Code Samples
New ABBYY Arabic OCR technology
Starting with Version 11, Arabic is supported as an official OCR language (previously available as a technical preview) and can be combined with other available OCR languages. Arabic now also comes with dictionary support.
Compared to the technical preview in Version 10, the number of incorrectly recognized words for Arabic OCR has dropped by 50%. At the same time, recognition is up to 3 times faster (based on ABBYY tests and test set).
New and improved language support
- New OCR languages: Turkmen (Latin) and Old Slavonic
- New ICR languages: Danish, Norwegian (Bokmål & Nynorsk), Old English, Serbian (Cyrillic), Tajik
- Latin language has full dictionary support
Improved OCR for Chinese, Japanese & Korean
- Japanese up to 2.5 times faster
- Chinese (Simplified) up to 2.5 times faster
- Chinese (Traditional) up to 4.0 times faster
- Korean up to 2.5 times faster
- User dictionaries can be created for Japanese and Korean languages
- All elements of UI and messages of FineReader Engine 11 are now available in Japanese.
Improved Image Pre-processing
Input image quality is a key factor in achieving good OCR results. As a result of improved pre-processing, recognition works faster and delivers higher accuracy. Better image quality also enables higher compression rates for MRC PDFs.
- Extended geometrical distortions correction
- Auto-splitting of double-pages
- Background lightening
- Better ISO noise removal
- Separation of colour objects on the document (planned for Release 3)
New Text Type: Receipts
Support for a new Barcode Type - MaxiCode
Synthesis & Export
Extended ABBYY XML export with the ability to save paragraph style and role information in the XML file.
Improved font management API and extended access to the fonts used during document synthesis (predefined font filters)
- FineReader Engine collections can be iterated using the for each statement in .NET
- Ability to cancel a processing operation and repeat the processing of a page with Batch Processor
- OCR Language Auto-Detection
* Indicates functionality not available immediately, but planned for release in a maintenance release of FineReader Engine 11.