
Text layer extraction from PDF without recognition (Answered)

Hi All,

We are processing both PDFs with text layers and scanned PDF documents.

After checking whether a PDF has a text layer with Engine::IsPdfWithTextualContent, I want to extract only the text from that layer, without performing recognition. Are there any APIs that provide this?

Thank you.

----

Added:

Could you please send me (troublecoder@gmail.com) a C++ "hello world" sample that uses AddImageFileFromMemory?

 


Comments

1 comment

  • Ksenia Leonteva

    Hi,

    If a document contains a text layer, you can process it with the ObjectsExtractionParams::SourceContentReuseMode = CRM_ContentOnly setting. With this setting, FRE will not try to recognize the document.

    As for your second question, we'll try to help you with it as soon as possible.
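
The branching described in this thread can be sketched roughly as follows. This is only an illustration of the decision logic, not FineReader Engine code: the types below are stand-ins written for this example and do not come from the SDK headers. Only the names SourceContentReuseMode and CRM_ContentOnly are taken from the reply above; RecognizePlaceholder and chooseParams are hypothetical, and hasTextLayer stands for the result of the Engine::IsPdfWithTextualContent check mentioned in the question.

```cpp
#include <string>

// Stand-ins for illustration only; these are NOT the real FineReader Engine types.
enum class ReuseMode {
    CRM_ContentOnly,      // name from the reply above: use the existing text layer, no recognition
    RecognizePlaceholder  // hypothetical placeholder for whichever mode runs recognition
};

// Hypothetical mirror of the ObjectsExtractionParams setting named in the reply.
struct ObjectsExtractionParamsSketch {
    ReuseMode SourceContentReuseMode = ReuseMode::RecognizePlaceholder;
};

// hasTextLayer stands for the outcome of the Engine::IsPdfWithTextualContent check.
// PDFs with a text layer get CRM_ContentOnly; scanned PDFs keep the recognition path.
ObjectsExtractionParamsSketch chooseParams(bool hasTextLayer) {
    ObjectsExtractionParamsSketch params;
    if (hasTextLayer) {
        params.SourceContentReuseMode = ReuseMode::CRM_ContentOnly;
    }
    return params;
}

// Small helper so the chosen mode can be logged per document.
std::string describe(const ObjectsExtractionParamsSketch& params) {
    return params.SourceContentReuseMode == ReuseMode::CRM_ContentOnly
               ? "reuse text layer"
               : "run recognition";
}
```

In a real application the chosen value would be assigned to the engine's own parameters object before processing; how that object is created and passed is not covered in this thread, so it is left out here.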
