PDF/A Export in ABBYY Technologies

What is PDF/A

PDF/A is an ISO-standardized file format for the long-term archiving of electronic documents. It is a subset of PDF that excludes PDF features which are not suited to long-term archiving.

There are different levels of PDF/A:

  • PDF/A-1b - Level B compliance in Part 1
    PDF/A-1b has the objective of ensuring reliable reproduction of the visual appearance of the document.
  • PDF/A-1a - Level A compliance in Part 1
    PDF/A-1a includes all the requirements of PDF/A-1b and additionally requires that document structure be included (also known as being “tagged”/“Tagged PDF”), with the objective of ensuring that document content can be searched and repurposed. PDF/A-1a also requires Unicode character maps.
  • PDF/A-2 - Part 2 of the standard
    PDF/A-2 is a newer part of the standard. It is based on ISO 32000-1 (PDF 1.7) and is defined by ISO 19005-2:2011, published on June 20, 2011 under the formal name Document management – Electronic document file format for long-term preservation – Part 2: Use of ISO 32000-1 (PDF/A-2).
  • PDF/A-3
    • The standard was published in October 2012. It differs from PDF/A-2 in that it allows files of any format to be embedded, for example XML, Office formats, or raw binary data.
    • Important: long-term compatibility is guaranteed only for the PDF part of the collection. An organization that embeds other file formats gains the benefit of having those files at hand, but accepts the risk that they may no longer be usable in 100 years.
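
A file states which part and level it claims in its XMP metadata. The following is a minimal sketch of reading that claim, not something taken from this article: it assumes the PDF/A identification schema with the properties pdfaid:part and pdfaid:conformance, and the names claimed_pdfa_level and SAMPLE_XMP are hypothetical. Extracting the XMP packet from the PDF itself is out of scope here.

```python
import xml.etree.ElementTree as ET

# Namespaces of the PDF/A identification schema and of RDF (assumed, see lead-in).
PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def _prop(desc, name):
    """Read a pdfaid property stored either as an attribute or as a child element."""
    key = f"{{{PDFAID_NS}}}{name}"
    value = desc.get(key)
    if value is None:
        child = desc.find(key)
        value = child.text if child is not None else None
    return value.strip() if value else None


def claimed_pdfa_level(xmp_packet):
    """Return a label such as 'PDF/A-2u', or None if no PDF/A claim is present."""
    root = ET.fromstring(xmp_packet)
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        part = _prop(desc, "part")
        if part:
            conformance = _prop(desc, "conformance") or ""
            return f"PDF/A-{part}{conformance.lower()}"
    return None


# Hypothetical sample packet for illustration only.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
      <pdfaid:part>2</pdfaid:part>
      <pdfaid:conformance>U</pdfaid:conformance>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

print(claimed_pdfa_level(SAMPLE_XMP))
```

Note that this only reports what the file declares. Whether the file actually meets the requirements of that level has to be checked with a PDF/A validator.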

 

PDF/A Minimum Requirements

Conditions for PDF/A compliance:

  • Audio and video content are forbidden.
  • JavaScript and executable file launches are forbidden.
  • All fonts must be embedded and also must be legally embeddable for unlimited, universal rendering. This also applies to the so-called PostScript standard fonts such as Times or Helvetica.
  • Color spaces must be specified in a device-independent manner.
  • Encryption is forbidden.
  • Use of standards-based metadata is mandated.
  • External content references are forbidden.
  • LZW and JPEG 2000 image compression are forbidden in PDF/A-1, but JPEG 2000 compression is allowed in PDF/A-2.
  • Transparent objects and layers (Optional Content Groups) are forbidden in PDF/A-1, but they are supported in PDF/A-2.
  • Provisions for digital signatures in accordance with the PAdES (PDF Advanced Electronic Signatures) standard are supported in PDF/A-2.
  • Embedded files are forbidden in PDF/A-1, but PDF/A-2 offers the possibility to embed PDF/A files, allowing archiving of sets of documents in a single file.

Source: Wikipedia
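
The part-dependent rules in the list above can be written down as a small lookup. This is only an illustrative sketch: the feature names and the function violations are hypothetical labels, not part of any ABBYY API, and only the rules that the list states for PDF/A-1 and PDF/A-2 are modeled.

```python
# Features the list above forbids without naming a specific part.
ALWAYS_FORBIDDEN = {
    "audio_video",
    "javascript",
    "executable_launch",
    "encryption",
    "external_references",
}

# Features the list forbids in PDF/A-1 but permits in PDF/A-2.
ALLOWED_FROM_PART_2 = {
    "jpeg2000",
    "transparency",
    "layers",
    "embedded_pdfa_files",
}


def violations(features, part):
    """Return the sorted features of a document that the given part forbids."""
    if part not in (1, 2):
        raise ValueError("only PDF/A-1 and PDF/A-2 are modeled in this sketch")
    used = set(features)
    bad = used & ALWAYS_FORBIDDEN
    if part == 1:
        bad |= used & ALLOWED_FROM_PART_2
    return sorted(bad)


doc_features = {"jpeg2000", "transparency", "encryption"}
print(violations(doc_features, part=1))
print(violations(doc_features, part=2))
```

For the sample feature set, the JPEG 2000 and transparency entries are reported only for part 1, while encryption is reported for both parts.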

 

PDF/A Support in ABBYY Technology Products

PDF/A export (PDF/A-1b and PDF/A-1a) is available in the following ABBYY technology products:

FineReader Engine - OCR & Document Conversion

  • FineReader Engine 12 Windows & Linux & Mac
  • FineReader Engine 11 Windows & Linux & Mac
  • FineReader Engine 10 Windows
  • FineReader Engine 9.0 Windows
  • FineReader Engine 9.0 Linux
  • FineReader Engine 8.1 Windows

FlexiCapture Engine - Separation, Classification & Data Capture

  • FlexiCapture SDK
  • FlexiCapture Engine 12
  • FlexiCapture Engine 11
  • FlexiCapture Engine 10
  • FlexiCapture Engine 9.0
  • FlexiCapture Engine 8.0

FineReader Server - Solution for server-based processing and document capture

  • FineReader Server 4.0
  • Recognition Server 3.0
  • Recognition Server 2.0

FlexiCapture - Solutions for Data Capture

  • FlexiCapture 12 - all versions
  • FlexiCapture 11 - all versions
  • FlexiCapture 10 - all versions
  • FlexiCapture 9.0 - all versions
  • FlexiCapture 8 Professional

 

PDF/A-2 Support

In addition to the common PDF and PDF/A-1 formats, FineReader Engine 11 exports to PDF/A-2. The new options of the ISO standard format are:

  • Support of JPEG2000 compression to generate smaller files
  • PDF/A-2a – tagged PDF/A-2 with Unicode text
  • PDF/A-2u – untagged PDF/A-2 from which text can be extracted as Unicode
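
How the levels above relate to document properties can be sketched as a simple decision function. The function name pdfa2_level is hypothetical and does not correspond to an ABBYY API; the fallback level PDF/A-2b is not in the list above and is added here as the basic level defined by the standard.

```python
def pdfa2_level(tagged, unicode_text):
    """Pick the highest PDF/A-2 level that the document properties allow."""
    if tagged and unicode_text:
        return "PDF/A-2a"  # tagged structure plus Unicode text
    if unicode_text:
        return "PDF/A-2u"  # no tags, but text is extractable as Unicode
    return "PDF/A-2b"      # assumption: basic level, visual reproduction only


print(pdfa2_level(tagged=True, unicode_text=True))
print(pdfa2_level(tagged=False, unicode_text=True))
```

A tagged document without Unicode mapping falls back to the basic level in this sketch, since level A requires both properties.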


PDF/A-2 enables the creation of smaller PDF files using JPEG 2000 compression. For long-term archiving, this can reduce storage requirements and speed up access over low-bandwidth networks.

The general technical changes of PDF/A-2 are:

  • based on PDF 1.7 (ISO 32000-1)
  • highly efficient JPEG2000 compression allowed
  • support for transparency effects and layers
  • embedding of OpenType fonts
  • provisions for digital signatures in accordance with the PAdES (PDF Advanced Electronic Signatures) standard
  • possibility to embed PDF/A files in PDF/A-2, allowing archiving of sets of documents as individual documents in a single file

 

PDF/A-3 Support

PDF/A-3 is an extension of the PDF/A-2 standard that allows embedding PDF/A files as well as files in a variety of other formats, such as XML or Office formats. Long-term archiving and readability of the PDF/A part is still guaranteed, and the attachments can deliver additional benefits.


The PDF/A-3 extended container capabilities will make this format attractive in new areas, for example when a graphical representation of a document should be combined with some source data. The new e-invoice format defined by the Forum for Electronic Invoices Germany (FeRD) is based on PDF/A-3 and XML.

Since Release 3 of FineReader Engine 11, the API has been extended so that embedded files (attachments) can be extracted from a PDF and added to one.

 

ABBYY is a worldwide member of the PDF/A Competence Center and is committed to supporting PDF/A.

 

 
