Working with multipage documents and FineReader

I would like to use FineReader to process either multipage TIFFs or multipage image-only Acrobat (PDF) documents and get a multipage .docx file as output. The reason is that we have large tables that span several pages, and we would like each of them to be recognized as a single table.

Comments

4 comments

  • Scott Chau

    Terry,

    That shouldn't be a problem. Load the files to be processed into FineReader, and it will add all the pages of each document to the batch. Once the batch has been processed, you can export all of those pages into a single .docx file.

  • Terry Carnes

    My apologies. I should have specified that I am using the FineReader Engine. When I give it a multipage TIFF or PDF file, I only get the first image/page as output.

  • Terry Carnes

    Using the Demo code:

    // Create an image source that provides access to the image files in the source folder
    ImageSourceImpl imageSource = new ImageSourceImpl( sourceFolder, engineLoader );
    if( imageSource.IsEmpty() ) {
        throw new Exception( "No images in specified folder." );
    }
    FREngine.IBatchProcessor batchProcessor = engineLoader.Engine.CreateBatchProcessor();
    // Start the batch processor for the specified image source
    batchProcessor.Start(imageSource, null, null, null, null);
    // Obtain recognized pages one at a time and export each page separately
    FREngine.FRPage page = batchProcessor.GetNextProcessedPage();
    while( page != null ) {
        // Synthesize the page before export
        page.Synthesize(null);
        // Export the page to a file named after the source image, with .docx appended
        string resultFilePath = Path.Combine(resultFolder, Path.GetFileName(page.SourceImagePath) + ".docx");
        page.Export(resultFilePath, FREngine.FileExportFormatEnum.FEF_DOCX, null);
        // Export the page to a file named after the source image, with .txt appended
        resultFilePath = Path.Combine(resultFolder, Path.GetFileName(page.SourceImagePath) + ".txt");
        page.Export(resultFilePath, FREngine.FileExportFormatEnum.FEF_TextUnicodeDefaults, null);
        // Export the page to a file named after the source image, with .pdf appended
        resultFilePath = Path.Combine(resultFolder, Path.GetFileName(page.SourceImagePath) + ".pdf");
        page.Export(resultFilePath, FREngine.FileExportFormatEnum.FEF_PDF, null);
        // Fetch the next processed page; null means the batch is finished
        page = batchProcessor.GetNextProcessedPage();
    }
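
    The loop above exports each page on its own, so every page of the same source file is written to the same output path. One possible alternative, not confirmed anywhere in this thread, is to process each file at the document level and export once. The sketch below assumes a FineReader Engine version whose IFRDocument exposes AddImageFile, Process(null), Export and Close as in the SDK's Hello sample (older versions may take different Process arguments). The class and method names are hypothetical, and the code needs the FREngine interop assembly to compile. Whether tables that continue across pages are merged into one table is not something this sketch guarantees.

    ```csharp
    using System.IO;

    static class MultipageExportSketch
    {
        // Assumption: 'engine' is an already loaded engine instance
        // (engineLoader.Engine in the demo code above).
        public static void ExportAsSingleDocx(FREngine.IEngine engine, string imagePath, string resultFolder)
        {
            FREngine.IFRDocument document = engine.CreateFRDocument();
            try
            {
                // Assumption: null page indices means "add every page" of a multipage TIFF or PDF
                document.AddImageFile(imagePath, null, null);
                // Recognize and synthesize the whole document with default parameters
                document.Process(null);
                // Write all pages into one .docx named after the source file
                string resultFilePath = Path.Combine(
                    resultFolder, Path.GetFileNameWithoutExtension(imagePath) + ".docx");
                document.Export(resultFilePath, FREngine.FileExportFormatEnum.FEF_DOCX, null);
            }
            finally
            {
                // Release the document's resources even if processing fails
                document.Close();
            }
        }
    }
    ```

    It would be called once per source file, for example ExportAsSingleDocx(engineLoader.Engine, pathToTiff, resultFolder).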
  • Yuriy Korotkevych

    Hello Terry,

    I've moved the thread to the FineReader Engine Q&A section of the Community.

    Best regards,

    Yuriy

