ABBYY FineReader Engine 11 for Linux X64 for GLIBC 2.5 - Release Notes

Part number, build number

Part# 1155/6
Build#  11.1.3.458651

Software and Hardware Requirements

ABBYY FineReader Engine 11 for Linux is designed for glibc version 2.5 and above.
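The glibc requirement above can be checked on a target host before installation. The following is a minimal sketch, not part of the SDK: it relies on Python's standard platform.libc_ver() call, and the helper names parse_version and meets_minimum are hypothetical.

```python
import platform

# Minimum glibc version taken from the requirement stated above.
MINIMUM_GLIBC = (2, 5)


def parse_version(text):
    """Turn a version string such as '2.17' into a tuple of ints: (2, 17)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def meets_minimum(version_text, minimum=MINIMUM_GLIBC):
    """Return True if version_text is at least the minimum glibc version."""
    return parse_version(version_text) >= minimum


if __name__ == "__main__":
    # platform.libc_ver() reports the C library the Python interpreter is linked against.
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        print("glibc", version, "OK" if meets_minimum(version) else "too old")
    else:
        print("Could not determine the glibc version on this host")
```

The check only reports the C library that the Python interpreter itself uses, which is normally the system glibc.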

The FineReader Engine dynamic library requires the standard libstdc++.so.6 and libgcc_s.so.1 libraries.

ABBYY FineReader Engine 11 for Linux has been tested on the following operating systems:

  • Red Hat Enterprise Linux 5.9
  • Oracle Enterprise Linux 5.9
  • Ubuntu 12.04 LTS @ Amazon EC2

PC with Intel® Pentium® or compatible processor (1 GHz or higher) which supports SSE and SSE2 instruction sets.
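On Linux, the instruction sets supported by the processor are listed on the "flags" line of /proc/cpuinfo. The sketch below, with hypothetical helper names, reads that line and checks for the sse and sse2 flags. It assumes an x86 host where /proc/cpuinfo uses this layout.

```python
def cpu_flags(cpuinfo_text):
    """Collect the flags listed on the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "flags":
            return set(value.split())
    return set()


def supports_sse_and_sse2(cpuinfo_text):
    """Return True if both the sse and sse2 flags are present."""
    return {"sse", "sse2"} <= cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            print("SSE and SSE2 supported:", supports_sse_and_sse2(handle.read()))
    except OSError:
        print("/proc/cpuinfo is not available on this host")
```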

Memory:

  • for processing one-page documents: minimum 400 MB RAM, recommended 1 GB RAM
  • for processing multi-page documents: minimum 1 GB RAM, recommended 1.5 GB RAM

Hard disk space: 800 MB for library installation and 100 MB for program operation, plus an additional 15 MB for each processed page of a multi-page document.
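The disk space figures above can be combined into a rough estimate. This is a sketch only: the function name is hypothetical, and it assumes the per-page amount simply scales with the number of pages in a multi-page document.

```python
# Figures taken from the hard disk space requirement above (in MB).
LIBRARY_MB = 800      # library installation
OPERATION_MB = 100    # program operation
PER_PAGE_MB = 15      # each page of a multi-page document


def required_disk_mb(pages, include_installation=True):
    """Estimate the disk space in MB needed to process a document with the given page count."""
    if pages < 0:
        raise ValueError("pages must not be negative")
    total = OPERATION_MB + PER_PAGE_MB * pages
    if include_installation:
        total += LIBRARY_MB
    return total


if __name__ == "__main__":
    # A 10-page document: 800 + 100 + 15 * 10 MB.
    print(required_disk_mb(10))
```

For a 10-page document the estimate is 1050 MB including installation, or 250 MB of working space on a host where the library is already installed.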

New Features and Improvements

Glibc 2.5 is supported

This release adds support for glibc 2.5.

Java Samples

Hello sample for Java has been added.

This Hello sample demonstrates how to use ABBYY FineReader Engine from Java through JNI (Java Native Interface). It includes a C++ “HelloJNI” project with a sample JNI wrapper implementing the most essential methods of the FRDocument object.

The JNI wrapper was ported from FineReader Engine 11 for Windows. It was tested with JDK 1.6 and 1.7.

Fixed Bugs

There are no new fixes.

Known Issues and Workarounds

Wrong list of supported OS and glibc versions in the Help file

The list of supported operating systems and glibc versions was not updated.

BatchProcessor is not stable

BatchProcessor may skip images in the processing queue when there is not enough memory.

Warnings during parallel processing

The following warnings could appear during parallel processing:

OMP: Warning #72: KMP_AFFINITY: affinity only supported for Intel(R) processors.

OMP: Warning #71: KMP_AFFINITY: affinity not supported, using "disabled".

These warnings occur on hosts with non-Intel processors. As a workaround, it is recommended to set “export KMP_AFFINITY=disabled”.
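When the engine is started from a launcher script rather than an interactive shell, the same variable can be set in the environment of the child process. The sketch below uses only the Python standard library. The helper name is hypothetical, and the child process is a stand-in for the real application, not an SDK command.

```python
import os
import subprocess
import sys


def environment_without_affinity(base_env=None):
    """Return a copy of the environment with KMP_AFFINITY set to 'disabled'."""
    env = dict(os.environ if base_env is None else base_env)
    env["KMP_AFFINITY"] = "disabled"
    return env


if __name__ == "__main__":
    # Stand-in for the real application: a child Python process
    # that only prints the value of the variable it receives.
    child = [sys.executable, "-c", "import os; print(os.environ['KMP_AFFINITY'])"]
    subprocess.run(child, env=environment_without_affinity(), check=True)
```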

Activation script shows several unavailable modules

There are some modules which are unsupported on Linux, but they are shown in the activation script.

Linux version limitations

The following functionality of FineReader Engine 11 for Windows is not available in the Linux version:

DjVu opening

Scanning

ICR/OMR

Visual Components and other GUI elements

WDP/WIC/BITMAP input formats and other Windows-specific functionality

Samples installed under the ‘root’ account are read-only for other accounts

If the samples are installed under the ‘root’ account, they are read-only for accounts without ‘root’ privileges.

This will be fixed in the maintenance release.

IFontSet::EnablePdfStandardFonts is non-functional

It is not possible to use the standard PDF fonts during synthesis. As a result, fonts may be embedded even when the text uses one of the standard PDF fonts.

This will be fixed in the maintenance release.

OnPageProcessed comes once per document

During the opening, synthesis, and export stages, the OnPageProcessed callback arrives only once per document rather than after each page.

This will be fixed in the maintenance release.

Some API is not implemented

The following API is not implemented in FRE 10 and FRE 11:

IFootnoteSeries::IsNumberingWithSuperscript. Always returns “false”.

IFootnoteSeries::PositionOnPage. Always returns “FPPT_SingleColumnSection”.

IFootnoteSeries::PositionInDocument. Always returns “FPDT_PageEnd”.

IFootnoteSeries::HasSeparator. Always returns “true”.

ITextPicture::ColumnNumber. Always returns “0”.

ICharParams::IsWordStart. Always returns “false”. It is true only for character parameters obtained through the IWordRecognitionVariants interface.

IIncut::TextWrapping. Always returns “TW_Undefined”.

IRunningTitlesSeriesText::HasSeparator. Always returns “false”.

The implementation is not planned.
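Client code that post-processes recognition results may want to ignore these fixed values instead of treating them as real layout information. The following is an illustrative sketch: the table repeats the list above, while the is_informative helper and the use of plain strings for the values are assumptions made for the example, not part of the FineReader Engine API.

```python
# Members from the list above, mapped to the fixed value each one always returns.
STUBBED_MEMBERS = {
    "IFootnoteSeries::IsNumberingWithSuperscript": "false",
    "IFootnoteSeries::PositionOnPage": "FPPT_SingleColumnSection",
    "IFootnoteSeries::PositionInDocument": "FPDT_PageEnd",
    "IFootnoteSeries::HasSeparator": "true",
    "ITextPicture::ColumnNumber": "0",
    # Can be true for parameters obtained through IWordRecognitionVariants.
    "ICharParams::IsWordStart": "false",
    "IIncut::TextWrapping": "TW_Undefined",
    "IRunningTitlesSeriesText::HasSeparator": "false",
}


def is_informative(member, value):
    """Return False when value is just the fixed placeholder documented for member."""
    return STUBBED_MEMBERS.get(member) != value


if __name__ == "__main__":
    print(is_informative("IIncut::TextWrapping", "TW_Undefined"))
```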

Recognition of a PDF file with text in Vietnamese may fail

Recognition fails with internal program error on certain PDF files with text layer in Vietnamese when “content reuse mode” is on.

This will be fixed in the maintenance release.

PDF/A validation report

The following issues are known for PDF/A files produced by this release of FRE 11:

Adobe Acrobat 11.0.3 reports “Text cannot be mapped to Unicode” for 2% of images in CJK languages recognized and exported into PDF/A-1a or PDF/A-2a formats. However, the online validator at http://www.pdf-tools.com/pdf/validate-pdfa-online.aspx finds no issues in the same documents.

callas pdfaPilot 3.1 (156) and 4 report “Image is not valid” for a few images exported into PDF/A-2a (-2u) format. At the same time, Adobe Acrobat 10.1.4 and 11 report no issues with these files.


Part number, build number

Part# 1155/11
Build#  11.1.4.506920

New Features and Improvements

Java wrapper is included into the distribution

A Java wrapper is now included, and the entire SDK functionality is available through it, which makes it easier to integrate FRE into a Java environment.

RPM is now used in distributions

RPM software package is available.

IEngine::InjectTextLayer method was added

This is a new method which creates a copy of the input PDF file and adds the text layer which corresponds to the recognized text of the document.

Support for IntelligentMail barcodes

Starting with this release, ABBYY FineReader Engine recognizes IntelligentMail barcodes. Intelligent Mail is a height-modulated 65-bar barcode which is used on mail in the United States. It is also known as USPS 4-CB.

BT_IntelligentMail was added to the BarcodeTypeEnum enumeration constants.
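To illustrate the 65-bar, height-modulated structure described above: each bar of an Intelligent Mail barcode takes one of four states (full, ascender, descender, tracker), and a common textual representation writes them as the letters F, A, D and T. The sketch below checks only that shape. It is not how FineReader Engine represents recognized barcodes, and the function name is hypothetical.

```python
# The four bar states: Full, Ascender, Descender, Tracker.
BAR_STATES = frozenset("FADT")
BAR_COUNT = 65


def looks_like_intelligent_mail(bars):
    """Return True if bars is a string of exactly 65 characters drawn from F, A, D, T."""
    return len(bars) == BAR_COUNT and set(bars) <= BAR_STATES


if __name__ == "__main__":
    print(looks_like_intelligent_mail("FADT" * 16 + "F"))
```

The check validates the shape of the bar sequence only and does not decode or verify the encoded data.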

Predefined languages OcrA and OcrB were added

The OCR A and OCR B modules now license not only corresponding text types but also the predefined languages which are provided for these text types. Apart from regular characters the alphabet for these languages also contains special symbols from OCR_A and OCR_B standards.

Detection of vertical text for all languages

IPageAnalysisParams::DetectVerticalEuropeanText property is a new property which allows you to detect vertical text in languages other than CJK (HelpDesk request 369221).

New IClassificationTrainer::AddPage method

From now on FRE 11 supports IClassificationTrainer::AddPage() method.

This is a new method which allows you to add an already recognized page to the classification database.

The possibility to add custom features for classification

Now it is possible to add custom features that can enhance the accuracy of classification.

The following methods were implemented:

1. IClassificationTrainer::AddFeaturesForPage()

This is a new method which supports training the classification database with the help of user-defined additional features.

2. IFRPage::ClassifyEx()

This is a new method which supports classification with the help of user-defined additional features.

New predefined profile EngineeringDrawingsProcessing

This profile is specially developed for engineering drawings and maps recognition.

EngineeringDrawingsProcessing is useful for extracting text regardless of its orientation and for converting engineering drawings from PDF to searchable PDF.

Export with original layout to XLSX

The new IXLExportParams::LayoutRetentionMode property allows you to export files to XLSX with the original layout.

You can choose how table formatting is retained during export to XLSX. XLSXLayoutRetentionModeEnum lets you select the mode that best suits the task.

New properties for IBarcodeParams object

The new IBarcodeParams::EnableAdvancedExtractionMode property enables a mode that detects more barcodes but slows down processing.

The new IBarcodeParams::MinRatioToTextHeight property defines the minimum acceptable height of a barcode relative to the average letter height. This setting can help to detect barcodes of low height.

Support for embedded files in PDF

IPDFExportFeatures::WriteSourceAttachments property specifies if attachments from the original PDF file should be written to the output file. This property is ignored if the original file was not PDF.

New values PCM_Pdfa_3a and PCM_Pdfa_3u were added to the PDFAComplianceModeEnum enumeration constants for the PDF/A-3a and PDF/A-3u formats respectively.

New property IEngine::Version

IEngine::Version returns the build number of the ABBYY FineReader Engine version you are using.

Screenshot detection

ABBYY FineReader Engine can now identify screenshots in documents. The document analyzer detects screenshots as image blocks if the EnableTextExtractionMode property is set to false (the default value). Otherwise, text blocks can be detected on the screenshots (HelpDesk request 343728).

Improved key value tables detection

Detection of key value tables was significantly improved in this release.

Improved memory allocation

Memory fragmentation has been significantly reduced, which eliminates the OUT_OF_MEMORY error in many cases.

BatchProcessor acceleration on small images

Starting with this release, BatchProcessor works faster on batches of small images.

Silent mode for runtime installation is now supported

In the previous release, the installation could be launched only in interactive mode. Now it can also be launched in automatic mode, with all parameters passed via the command line.

Fixed Bugs

This section contains a list of bugs reported by customers that have been fixed.

The following terminology is used for frequency in the list:

General bugs are frequently reproduced.

Rare bugs are seldom reproduced under certain conditions.

Specific bugs are reproduced in custom images in very specific scenarios.

Frequency | Description | Subsystem | HD # | Office
General | "Segmentation fault" occurs on customer’s specific Virtual Machine | API | 408410 | UA
General | The properties Title, Producer and DocumentInformationDictionnary of DocumentContentInfo were not recorded during the opening of PDF file. Now the issue is fixed. | API | 398171 | EU
General | The size of PDF file with Jpeg2000 compression was bigger than with Jpeg compression. Now the compression works correctly and PDF files with Jpeg2000 compression are smaller than PDF files with Jpeg compression. | API | 380139 | EU
General | CommandLineInterface sample crashed after several launches of this sample. | Samples | 388270 | EU
General | An error during export in CLI sample in case of export limitations in the license. | Samples | 379964 | US
General | New information was added in the Help file that the detection of fonts depends on the installed fonts packages. | Help | 392454 | US
General | There was an error during the registration of libraries in the system by sudo Inconfig command. The problem was in the linker scripts. Now the issue is fixed. | API | 385322 | EU
General | The distributive didn’t work on Linux host with 32 physical cores. | API | 381584 | EU
General | IRTFExportParams::KeepPages property didn’t work correctly. There was a disparity between the API and the Help file. Now this property is working as described in the Help file: the value of this property is used only if the PageSynthesisMode property is set to PSM_RTFFormatParagraphs or PSM_RTFPlainText, otherwise it is ignored. | API | - | -
General | An error occurred during the call GetEngineObjectEx with the property IsSharedCPUCoresMode = true. | API | - | -
General | Samples being installed under ‘root’ account were read only for other accounts. Now every user is able to modify the samples. | Samples | - | -

Known Issues and Workarounds

Linux version limitations

The following functionality of FineReader Engine 11 for Windows is not available in the Linux version:

DjVu opening

Scanning

ICR/OMR

Visual Components and other GUI elements

WDP/WIC/BITMAP input formats and other Windows-specific functionality

OnPageProcessed comes once per document

During the opening, synthesis, and export stages, the OnPageProcessed callback arrives only once per document rather than after each page.

This will be fixed in the maintenance release.

Some API is not implemented

The following API is not implemented in FRE 10 and FRE 11:

IFootnoteSeries::IsNumberingWithSuperscript. Always returns “false”.

IFootnoteSeries::PositionOnPage. Always returns “FPPT_SingleColumnSection”.

IFootnoteSeries::PositionInDocument. Always returns “FPDT_PageEnd”.

IFootnoteSeries::HasSeparator. Always returns “true”.

ITextPicture::ColumnNumber. Always returns “0”.

ICharParams::IsWordStart. Always returns “false”. It is true only for character parameters obtained through the IWordRecognitionVariants interface.

IIncut::TextWrapping. Always returns “TW_Undefined”.

IRunningTitlesSeriesText::HasSeparator. Always returns “false”.

The implementation is not planned.

An error during the unloading of libFREngine.so on SLES 11 SP2 and SP3

An error message is shown during the unloading of libFREngine.so on SLES 11 SP2 and SP3.

Memory leak during processing of PDF files on Java

Out-of-memory asserts and internal program errors periodically occur during processing of a large number of PDF files.

We recommend using 4 GB RAM or more for parallel processing of large PDF documents.
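The installed memory of a host can be compared with this recommendation by reading the MemTotal line of /proc/meminfo. This is a minimal sketch with hypothetical helper names. Note that MemTotal is usually slightly lower than the physical RAM size, so a machine with exactly 4 GB installed may report just under the threshold.

```python
RECOMMENDED_GB = 4  # from the recommendation above


def mem_total_gb(meminfo_text):
    """Return MemTotal from /proc/meminfo text in GB (1 GB = 1024 * 1024 kB), or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kilobytes = int(line.split()[1])
            return kilobytes / (1024 * 1024)
    return None


def enough_for_parallel_pdf(meminfo_text):
    """Return True if the reported memory meets the recommended amount."""
    total = mem_total_gb(meminfo_text)
    return total is not None and total >= RECOMMENDED_GB


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as handle:
            print("Meets recommendation:", enough_for_parallel_pdf(handle.read()))
    except OSError:
        print("/proc/meminfo is not available on this host")
```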

PDF/A validation report

The following issues are known for PDF/A files produced by this release of FRE 11:

1. Adobe Acrobat 11.0.5 (Preflight 11.0.4) detects an error for PDF/A with attachments (PCM_Pdfa_3a):

“Embedded file does not have AF entity”
