com.abbyy.FREngine.EngineException: The PDF file has unsupported format and cannot be opened.


When we try to process a specific PDF with the ABBYY FREngine Java API, we get the following error message:

com.abbyy.FREngine.EngineException: The PDF file `invoice.pdf` has unsupported format and cannot be opened.

at com.abbyy.FREngine.IFRDocument.AddImageFile(Native Method)

Details about our installation of ABBYY FineReader Engine:
- Debian 8.11 (64-bit)
- Java 1.8.0_201 (64-bit)
- FineReader Engine

Java snippet that we use to process the PDFs:

            // Create document
            IFRDocument document = engine.CreateFRDocument();

            // If orientation detection is performed during document processing
            // (IPagePreprocessingParams::CorrectOrientation property is TRUE), you can select fast
            // orientation detection mode: set the OrientationDetectionMode property of the
            // OrientationDetectionParams object to ODM_Fast.
            IDocumentProcessingParams dpp = engine.CreateDocumentProcessingParams();   
            try {
                // Add image file to document
                document.AddImageFile( imagePath, null, null );

                // Process the full document
                document.Process( dpp );

                // Save results to pdf using 'balanced' scenario
                IPDFExportParams pdfParams = engine.CreatePDFExportParams();
                pdfParams.setScenario( PDFExportScenarioEnum.PES_Balanced );

                String pdfExportPath = inputfilename + "_ocrred.pdf";
                document.Export( pdfExportPath, FileExportFormatEnum.FEF_PDF, pdfParams );
            } finally {
                // Close document
                document.Close();
            }

Other PDFs are processed successfully; only this specific one fails.
Any suggestions?

Best regards

Koen de Leijer



1 comment

  • Koen de Leijer


    We've found out that the PDFs rejected by ABBYY FineReader Engine have one thing in common:
    they all have "PDF Producer" => "Adobe XML Form Library".

    According to the Adobe forum, these PDFs are XML documents wrapped inside a PDF.

    We need to OCR these PDFs because valuable information is sometimes contained in the company logo or in an image in the PDF footer.
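
    To recognise such files before handing them to the engine, a rough pre-check along the following lines might help. This is only a sketch under assumptions: it uses the plain Java standard library (no ABBYY calls), the class and method names are made up for illustration, and it assumes the producer string "Adobe XML Form Library" or the "/XFA" form key is stored uncompressed in the file. If that metadata sits in a compressed stream or is encoded differently, this scan will miss it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of the ABBYY API.
class XfaPdfCheck {
    // Producer value observed in the rejected files.
    private static final String PRODUCER_MARKER = "Adobe XML Form Library";
    // Assumption: the rejected files are XFA forms, which carry an /XFA key.
    private static final String XFA_KEY = "/XFA";

    /** Returns true if the raw PDF bytes contain one of the markers. */
    static boolean looksLikeXmlFormPdf(byte[] pdfBytes) {
        // ISO-8859-1 maps every byte to exactly one char, so nothing is lost.
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        return raw.contains(PRODUCER_MARKER) || raw.contains(XFA_KEY);
    }

    /** Convenience overload that reads the whole file into memory. */
    static boolean looksLikeXmlFormPdf(String pdfPath) throws IOException {
        return looksLikeXmlFormPdf(Files.readAllBytes(Paths.get(pdfPath)));
    }

    public static void main(String[] args) throws IOException {
        for (String path : args) {
            String verdict = looksLikeXmlFormPdf(path)
                    ? "marker found, handle separately"
                    : "no marker found";
            System.out.println(path + ": " + verdict);
        }
    }
}
```

    Files flagged this way could be logged or routed to a separate conversion step instead of being passed straight to AddImageFile. The check is a heuristic only, so a file without a marker can still be rejected by the engine.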

    Thanks in advance

    Koen de Leijer


