ABBYY FineReader Engine 11 for Windows - Release 1

  • Release date: 24 October 2013 (public release)
  • Part #: 1041/17
  • Build #: 11.1.4.118
ABBYY FineReader Engine 11 offers a variety of new built-in features and improvements making it the ideal text recognition and document conversion SDK for your systems and applications. Highlights of the new key features include:
  • Classification - Image and text-based document type detection
  • Business Card Recognition - Single and multi-card processing, vCard export
  • Extended PDF Capabilities - PDF/A-2 & PDF/A-3 support, enhanced PDF processing
  • New and Improved OCR Technology - New OCR and ICR languages, new barcode types, improved image pre-processing
  • Development Improvements - 64-bit support, asynchronous scanning, new Java Native Interface (JNI) support

Classification

The classification API is used to create a classification database, which describes several types of documents to be classified, and to classify documents on the basis of this defined set.
This content-based classification is intended for simple scenarios where documents should be sorted by type, for example into contracts, invoices, or receipts.

The two main steps in detail:

Creating a classification database
You select several images of documents of each type. Representatives of each type should have a similar appearance (similar layout of elements). You then use these images to create the classification database. Scanned images or photos may need some pre-processing before database creation.
Classifying documents
You can use the created database to classify documents in the document flow. You scan documents, or load photographed documents, and pass them to the pre-trained classification system, which uses the classification database to determine the type of each document. You may update the classification database each time you add new types of documents or change existing ones.
The SDK also contains a code sample showing how to train and work with the new classification API.
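The two steps above (build a database from sample documents, then classify new documents against it) can be illustrated with a small, generic sketch. This is not the FineReader Engine classification API: the functions `build_database` and `classify`, the word-count features, and the sample texts are all assumptions made for illustration, and the sketch works on already-recognized text only.

```python
from collections import Counter
import math


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return "".join(c.lower() if c.isalpha() else " " for c in text).split()


def build_database(samples):
    """Step 1: merge the word counts of all sample texts for each document type."""
    database = {}
    for doc_type, texts in samples.items():
        counts = Counter()
        for text in texts:
            counts.update(tokenize(text))
        database[doc_type] = counts
    return database


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def classify(database, text):
    """Step 2: return the document type whose profile is most similar to the text."""
    words = Counter(tokenize(text))
    return max(database, key=lambda doc_type: cosine(database[doc_type], words))


# Hypothetical training samples: a few recognized texts per document type.
samples = {
    "invoice": ["Invoice number 1001 total amount due payment terms",
                "Invoice date amount due VAT total"],
    "contract": ["This agreement is made between the parties hereto",
                 "The parties agree to the terms of this contract"],
    "receipt": ["Thank you for your purchase cash change receipt",
                "Receipt store cash paid change thank you"],
}
database = build_database(samples)
print(classify(database, "Total amount due on this invoice"))
```

Adding a new document type in this sketch means adding its samples and rebuilding the database, which mirrors the update step described above.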
 

Business Card Recognition - New Business Card recognition API

FineReader Engine 11 includes a dedicated API section for extracting data from business cards.
  • Extended automatic card splitting when the scan or image contains several business cards
  • Predefined processing profile: BusinessCardsProcessing
The SDK also contains a code sample showing how to integrate business card reading.
 
Further information about Business Card Reading

Extended PDF Capabilities

Multiple new API options give developers more control over the PDF processing steps allowing them to fine-tune their own applications and services.

  • Opening (small) PDF files from memory
  • Up to 12% faster export speed, compared to previous technology
  • Ability to specify resolution for rasterization during PDF opening
  • Keep Bookmarks
  • Higher quality of highly compressed MRC PDFs
  • Higher background image compression in V11 can reduce the size of output MRC PDF files by up to 50% (compared to the version 10 implementation, based on ABBYY internal tests).

PDF/A-2 Support

In addition to the common PDF and PDF/A-1 formats, FineReader Engine 11 now exports to PDF/A-2. The new options of the ISO standard format are:

  • Support of JPEG2000 compression to generate smaller files
  • PDF/A-2a - tagged PDF/A-2 with Unicode text
  • PDF/A-2u - untagged PDF/A-2 with the ability to extract text in Unicode

PDF/A-2 enables creation of smaller PDF files using JPEG2000 compression. For long-term archiving, this can help reduce used storage space and enable faster access when working on low bandwidth networks.

PDF/A-3 Support

PDF/A-3 is an extension of the A-2 standard which allows inclusion of file formats such as XML, CSV, CAD, word-processing documents, spreadsheet documents, and others into PDF/A documents. Long-term archiving and readability of the PDF/A part is still guaranteed, and the binary attachments can deliver additional benefits.

The PDF/A-3 extended container capabilities make this format attractive in new areas, for example when a graphical representation of a document should be combined with some source data. The new e-invoice format defined by the Forum for Electronic Invoices Germany (FeRD) is based on PDF/A-3 and XML formats.

Read more about ZUGFeRD processing with ABBYY SDKs. (The API extension that allows adding binary attachments to the PDF/A-3 export was introduced in Release 3.)

Development Improvements

  • Native 64-bit Support
    FineReader Engine 11 provides C++ DLLs that can be linked directly into x64 applications without using a COM proxy. The neutral .NET interops allow .NET projects to run on 32-bit or 64-bit machines without recompilation. The new 64-bit support makes it easier to integrate and roll out ABBYY OCR technology in applications that need more than 4 GB of RAM.
  • Simplified Java Integration
    FineReader Engine 11 can be used from Java on 64-bit systems either by loading into the current process (InprocLoader), or by loading into a separate process (OutprocLoader). The new ready-to-use Java classes for the Engine library cover the full API*.
  • Extended Scanning Capabilities
    Asynchronous scanning enables recognition of scanned pages before all pages have been scanned.
    Extended access to scan settings, including access to scan source capabilities.
    Ability to specify the compression type of scanned images.
    The new code sample makes it easy to implement better and faster scanning in your application.
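
The idea behind asynchronous scanning, recognizing pages while later pages are still being scanned, is a producer/consumer pattern. The sketch below shows that pattern with Python's standard `queue` and `threading` modules. It does not use the FineReader Engine scanning API: `scan_pages`, `recognize_pages`, and the placeholder strings standing in for images and recognized text are assumptions made for illustration.

```python
import queue
import threading


def scan_pages(page_count, pages):
    """Producer: stands in for a scanner that delivers pages one at a time."""
    for number in range(1, page_count + 1):
        pages.put(f"image of page {number}")  # hypothetical scanned image
    pages.put(None)  # sentinel: scanning is finished


def recognize_pages(pages, results):
    """Consumer: processes each page as soon as it arrives in the queue."""
    while True:
        image = pages.get()
        if image is None:
            break
        results.append(f"text from {image}")  # placeholder for the OCR step


pages = queue.Queue()
results = []
scanner = threading.Thread(target=scan_pages, args=(3, pages))
recognizer = threading.Thread(target=recognize_pages, args=(pages, results))
recognizer.start()
scanner.start()
scanner.join()
recognizer.join()
print(results)
```

Because the recognizer starts consuming as soon as the first page is queued, total processing time is bounded by the slower of the two stages rather than by their sum.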

New and updated Code Samples

New: Classification, Business Card Reading, Scanning, Thread-Pool
Updated: Image preprocessing, Camera OCR

New ABBYY Arabic OCR technology

Arabic is supported as an official OCR language from Version 11 onward (it was previously available as a technical preview) and can be combined with other available OCR languages. Arabic now also comes with dictionary support.

Compared to the technical preview in Version 10, the number of incorrectly recognized words for Arabic OCR has dropped by 50%. At the same time, recognition is up to 3 times faster (based on ABBYY tests and test set).

New and improved language support

  • New OCR languages: Turkmen (Latin) and Old Slavonic
  • New ICR languages: Danish, Norwegian (Bokmål & Nynorsk), Old English, Serbian (Cyrillic), Tajik
  • Latin language has full dictionary support

Improved OCR for Chinese, Japanese & Korean

Processing speed in fast mode has been increased while maintaining the accuracy level.
  • Japanese up to 2.5 times faster
  • Chinese (Simplified) up to 2.5 times faster
  • Chinese (Traditional) up to 4.0 times faster
  • Korean up to 2.5 times faster
  • User dictionaries can be created for Japanese and Korean languages
  • All elements of UI and messages of FineReader Engine 11 are now available in Japanese.

Improved Image Pre-processing

Input image quality is a key factor in achieving good OCR results. With better input images, recognition works faster and delivers higher accuracy. Better image quality also enables higher compression rates for MRC PDFs.

  • Extended geometrical distortions correction
  • Auto-cropping
  • Auto-splitting of double-pages
  • Background lightening
  • Better ISO noise removal
  • Separation of colour objects on the document (planned for Release 3)

New Text Type: Receipts

The new, optimized text type ensures better OCR results for receipts. This is important for solutions that automate travel expense processing.

Support for a new Barcode Type - MaxiCode

MaxiCode - created and used by United Parcel Service

Synthesis & Export

Extended ABBYY XML export with the ability to save information about paragraph styles and roles in the XML file.

Improved font management API and extended access to the fonts used during document synthesis (predefined font filters).

Export of business cards to vCard format
Recreation of the logical structure of a document is an option during export to RTF, DOCX, and HTML formats
New color settings for embedded pictures in RTF, DOCX, PPTX, HTML, EPUB, and FB2 formats
Export to XPS (XML Paper Specification)
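
To make the vCard export listed above more concrete, here is a minimal sketch of what serializing business card fields to vCard 3.0 (RFC 2426) looks like. It is not the FineReader Engine export API, and the notes do not say which vCard version or properties the Engine writes. The `to_vcard` function, the dictionary keys, and the sample contact are hypothetical.

```python
def escape(value):
    """Escape characters that are special in vCard text values (RFC 2426)."""
    return (value.replace("\\", "\\\\")
                 .replace(";", "\\;")
                 .replace(",", "\\,")
                 .replace("\n", "\\n"))


def to_vcard(card):
    """Serialize a dict of business card fields as a vCard 3.0 string."""
    lines = [
        "BEGIN:VCARD",
        "VERSION:3.0",
        # N is structured: family name; given name; additional; prefix; suffix
        f"N:{escape(card['last_name'])};{escape(card['first_name'])};;;",
        f"FN:{escape(card['first_name'] + ' ' + card['last_name'])}",
    ]
    if card.get("company"):
        lines.append(f"ORG:{escape(card['company'])}")
    if card.get("phone"):
        lines.append(f"TEL;TYPE=WORK:{card['phone']}")
    if card.get("email"):
        lines.append(f"EMAIL;TYPE=INTERNET:{card['email']}")
    lines.append("END:VCARD")
    # vCard content lines are terminated with CRLF
    return "\r\n".join(lines) + "\r\n"


# Hypothetical fields, as they might be extracted from one business card.
card = {
    "first_name": "Jane",
    "last_name": "Doe",
    "company": "Example, Inc.",
    "phone": "+1 555 0100",
    "email": "jane.doe@example.com",
}
vcard = to_vcard(card)
print(vcard)
```

When a scan contains several cards, each card would be serialized as its own BEGIN:VCARD ... END:VCARD block.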

Other improvements

  • FineReader Engine collections can be iterated using the for each statement in .NET
  • Ability to cancel processing operation and repeat processing of a page with Batch Processor
  • OCR Language Auto-Detection

* Indicates functionality not available immediately, but planned for release in a maintenance release of FineReader Engine 11.
