ABBYY FineReader Engine 11 for Linux - Release 8

  • Release date: 21 March 2017 (public release)
  • Part #: 1161/28
  • Build #: 11.1.19.854867

Image pre-processing: Ability to remove garbage from color images

A new extended method to remove garbage from images was added. This new method works with color as well as with black-and-white images.

[Figure: Color image before removing garbage (fre11r7_garbage_removal_input2_illu.png)]
[Figure: Color image after removing garbage (fre11r7_garbage_removal_output2_illu.png)]

Note: With black-and-white images, the new method works the same way as the existing method ImageDocument::RemoveGarbage. The existing method will be deprecated in a future release.

Ability to inject a text layer into selected pages of a PDF document - extended method

A new method allows processing specific pages of “image only” or “image on text” PDF documents. With this method, it is possible to create a searchable PDF file that contains the original page image and an 'invisible' text layer created from the recognized text. The method was extended for more flexibility: it is now possible to specify individually the pages into which the text layer should be injected, and to correct the orientation and skew of an image when injecting the text layer into the PDF.
Note: Only PDF files can be processed using this method. If the input file already contains a text layer, it will be replaced. ABBYY FineReader Engine offers methods for checking whether the input file contains a text layer; see the next section.

Extension of the method for detecting a text layer in PDFs

The method for detecting a text layer in PDFs has been extended. PDFs imported from a memory stream can now be checked for a text layer without writing the stream to a temporary file, which increases the overall processing speed.

Ability to rasterize FreeText annotations

Annotations in PDF make it possible to display text, images, shapes, etc., and to easily comment on a document or highlight important areas. When processing PDF documents that contain Text Box annotations and exporting them to PDF, it is now possible to retain all information from annotations of the FreeText type in the output PDF.

Export of multi-page PDF documents with an undefined number of pages

This feature is an extension of the new API for faster export of large multi-page documents. It is no longer necessary to know the number of pages in the document when creating a recognition session. This is useful for efficient scanning when the number of pages is large but unknown in advance, and processing should start before scanning has finished.
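The idea of exporting without a known page count can be sketched generically. The following Python example does not use the FineReader Engine API; `scan_pages`, `recognize` and `export_streaming` are hypothetical names introduced only to illustrate processing pages as they arrive, with the total known only once the input ends.

```python
from typing import Callable, Iterable, Iterator, List


def scan_pages(batch: List[str]) -> Iterator[str]:
    """Hypothetical scanner feed: yields page images one at a time."""
    for image in batch:
        yield image


def recognize(image: str) -> str:
    """Placeholder for recognizing one page (not an Engine call)."""
    return f"text of {image}"


def export_streaming(
    pages: Iterable[str], write_page: Callable[[int, str], None]
) -> int:
    """Handle each page as soon as it arrives and return the final count."""
    count = 0
    for count, image in enumerate(pages, start=1):
        write_page(count, recognize(image))
    return count


exported = []
total = export_streaming(
    scan_pages(["scan_a.png", "scan_b.png", "scan_c.png"]),
    lambda number, text: exported.append((number, text)),
)
print(total, exported)
```

Because the consumer only iterates, it never asks the source for its length, so the first page can be written while later pages are still being scanned.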

Ability to adjust a time zone for PDF export

ABBYY FineReader Engine can write the creation and modification dates into the PDF file in UTC format. This feature has been extended: it is now possible to specify the time zone used for the creation and modification dates of exported documents. Several PDF viewer applications display a document's creation/modification date without using information about the user’s time zone, and in some cases this missing information can be very important. The new options allow specifying the creation and modification dates for each PDF file.
[Figure: fre11r7_adjust_time-zone_illu.png]
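To illustrate why the time zone matters, here is a minimal Python sketch of how a date with an explicit UTC offset is commonly written in PDF metadata. It does not use the FineReader Engine API, and `pdf_date` is a hypothetical helper. The layout `D:YYYYMMDDHHmmSS+HH'mm'` follows the convention of the Adobe PDF Reference; the trailing apostrophe is treated differently across revisions of the PDF standard, so take the exact form as an assumption.

```python
from datetime import datetime, timedelta, timezone


def pdf_date(moment: datetime) -> str:
    """Format a time-zone-aware datetime as a PDF-style date string."""
    offset = moment.utcoffset()
    if offset is None:
        raise ValueError("a time zone is required for an unambiguous date")
    stamp = moment.strftime("D:%Y%m%d%H%M%S")
    total_minutes = int(offset.total_seconds() // 60)
    sign = "-" if total_minutes < 0 else "+"
    hours, minutes = divmod(abs(total_minutes), 60)
    return f"{stamp}{sign}{hours:02d}'{minutes:02d}'"


# The same instant expressed in UTC and in a UTC+01:00 zone.
utc_time = datetime(2017, 3, 21, 13, 30, 0, tzinfo=timezone.utc)
local_time = utc_time.astimezone(timezone(timedelta(hours=1)))

print(pdf_date(utc_time))
print(pdf_date(local_time))
```

Both strings describe the same moment. A viewer that ignores the offset would show different clock times for them, which is the ambiguity an explicit time zone setting helps to control.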

Faster PDF printing when using MRC compression [Technical preview]

A new option in the set of MRC correction parameters allows tuning Mixed Raster Content (MRC) parameters for PDF export, which increases the PDF printing speed.
Note: At the moment, the feature is implemented as a technical preview.

Improved readability of exported XML data for users

Default paragraph style names are now generated automatically according to the paragraph's role and the modifications applied to the style. This improves the readability of XML-based text and simplifies work for operators or system administrators. To increase flexibility, users can also set a paragraph style name manually.

Ability to exclude BOM during export to TXT

A new export option specifies whether the byte order mark (BOM) should appear at the start of the text stream when the document is exported to TXT format in UTF-8 encoding. This saves Java developers from programming workarounds to discard the BOM characters at the beginning of the file.
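For context, the kind of workaround this option makes unnecessary can be sketched in a few lines of Python. The example uses only the standard library, not the FineReader Engine API; `strip_utf8_bom` is a hypothetical helper name, and the byte strings stand in for exported TXT files.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 byte order mark, if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


text = "Recognized text"
with_bom = codecs.BOM_UTF8 + text.encode("utf-8")   # export with BOM
without_bom = text.encode("utf-8")                  # export without BOM

# A plain UTF-8 decoder keeps the BOM as the character U+FEFF.
print(repr(with_bom.decode("utf-8")))

# Stripping the three BOM bytes first yields the clean text.
print(repr(strip_utf8_bom(with_bom).decode("utf-8")))

# Python's "utf-8-sig" codec accepts input with or without a BOM.
print(repr(without_bom.decode("utf-8-sig")))
```

When the BOM is excluded at export time, consumers that use a plain UTF-8 decoder need no such pre-processing step.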

Ability to simultaneously use network and standalone licenses within one installation

In some cases it is efficient to use different types of licenses - standalone and network - on one computer. To support this scenario, it is possible to define network and standalone licenses in one LicensingSettings.xml file.

Updated documentation for working with screenshots

New recommendations for processing screenshots were added to the documentation to give developers useful tips for this increasingly popular scenario.
