How to get recognized text from a document in FlexiCapture SDK?

Din Idrisov

Edited June 21, 2023 11:51

Question

I want to access all background text on recognized pages, for example, to index documents for future searches. How can I do that?

Answer

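To do this, call the ExtractTextRegions method of the page object (IPage). It returns the collection of text regions found on the page, and each region exposes its recognized text. For example: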

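// C#
// Recognize the batch, re-applying the Document Definitions.
batch.Recognize(null, RecognitionModeEnum.RM_ReApplyDocumentDefinitions, null);
foreach (IDocument document in batch.Documents) {
    // Open the document before accessing its pages.
    document.Open();
    foreach (IPage page in document.Pages) {
        // Get the text regions found on this page.
        ITextRegions textRegions = page.ExtractTextRegions();
        foreach (ITextRegion textRegion in textRegions) {
            // Read the region's recognized text as a plain string.
            string plainText = textRegion.Text.PlainText;
            System.Console.WriteLine(plainText);
        }
    }
    document.Close();
}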
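
The question mentions indexing documents for future searches. As a rough sketch only, the strings read from textRegion.Text.PlainText could be collected into a small in-memory inverted index. The SimpleTextIndex class and the document keys used here are hypothetical and are not part of the FlexiCapture SDK; a real deployment would more likely pass the text to a dedicated search engine.

```csharp
// C#
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of the FlexiCapture SDK.
// Maps each lowercase word to the keys of the documents that contain it.
public sealed class SimpleTextIndex
{
    private readonly Dictionary<string, HashSet<string>> postings =
        new Dictionary<string, HashSet<string>>();

    // Adds the words of plainText to the index under documentKey.
    // documentKey is any identifier you choose for the document.
    public void Add(string documentKey, string plainText)
    {
        foreach (string word in Tokenize(plainText))
        {
            HashSet<string> keys;
            if (!postings.TryGetValue(word, out keys))
            {
                keys = new HashSet<string>();
                postings[word] = keys;
            }
            keys.Add(documentKey);
        }
    }

    // Returns a copy of the set of document keys whose text contains the word.
    // The lookup is case-insensitive; an unknown word yields an empty set.
    public HashSet<string> Search(string word)
    {
        HashSet<string> keys;
        return postings.TryGetValue(word.ToLowerInvariant(), out keys)
            ? new HashSet<string>(keys)
            : new HashSet<string>();
    }

    // Splits text into lowercase runs of letters and digits.
    private static IEnumerable<string> Tokenize(string text)
    {
        var current = new StringBuilder();
        foreach (char c in text)
        {
            if (char.IsLetterOrDigit(c))
            {
                current.Append(char.ToLowerInvariant(c));
            }
            else if (current.Length > 0)
            {
                yield return current.ToString();
                current.Clear();
            }
        }
        if (current.Length > 0)
        {
            yield return current.ToString();
        }
    }
}
```

With such a class, the innermost loop of the sample could call index.Add(key, plainText) instead of writing to the console, where key is whatever identifier you use for the current document, and index.Search("invoice") would later return the keys of the documents containing that word.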

You can find additional information about the ExtractTextRegions method in the FlexiCapture SDK User's Guide.
