Question
I want to access all background text on recognized pages, for example, to index documents for future searches. How can I do that?
Answer
To do this, use the ExtractTextRegions method of the Page object, which returns the text regions found on a page. For example:
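// C#
// Recognize the batch; this mode re-applies the Document Definitions.
batch.Recognize(null, RecognitionModeEnum.RM_ReApplyDocumentDefinitions, null);
foreach (IDocument document in batch.Documents) {
    // A document must be opened before its pages can be accessed.
    document.Open();
    foreach (IPage page in document.Pages) {
        // Get every text region found on the page.
        ITextRegions textRegions = page.ExtractTextRegions();
        foreach (ITextRegion textRegion in textRegions) {
            // Plain text of the region, ready to be printed or indexed.
            string plainText = textRegion.Text.PlainText;
            System.Console.WriteLine(plainText);
        }
    }
    document.Close();
}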
You can find additional information about the ExtractTextRegions method in the FlexiCapture SDK User's Guide.
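If the goal is to index the text for later searches, the plain text of each region can be passed to a simple inverted index instead of being written to the console. The following is only a minimal sketch of that idea: PageTextIndex, its AddText and Search methods, and the page keys such as "doc1:page1" are hypothetical names chosen for illustration and are not part of the FlexiCapture SDK. The sample strings are made up. A real solution would more likely use a full-text search engine.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, NOT part of the FlexiCapture SDK.
// Maps each word to the set of pages on which it occurs.
public sealed class PageTextIndex
{
    // word -> page keys (for example "doc1:page1"); words are compared case-insensitively.
    private readonly Dictionary<string, HashSet<string>> postings =
        new Dictionary<string, HashSet<string>>(StringComparer.OrdinalIgnoreCase);

    // Adds the plain text of one text region to the index under the given page key.
    public void AddText(string pageKey, string plainText)
    {
        if (string.IsNullOrWhiteSpace(plainText)) return;
        foreach (string token in Tokenize(plainText))
        {
            if (!postings.TryGetValue(token, out HashSet<string> pages))
            {
                pages = new HashSet<string>();
                postings[token] = pages;
            }
            pages.Add(pageKey);
        }
    }

    // Returns the sorted page keys that contain the word, or an empty collection.
    public IReadOnlyCollection<string> Search(string word)
    {
        return postings.TryGetValue(word, out HashSet<string> pages)
            ? pages.OrderBy(p => p, StringComparer.Ordinal).ToArray()
            : Array.Empty<string>();
    }

    // Splits text into runs of letters and digits.
    private static IEnumerable<string> Tokenize(string text)
    {
        int start = -1;
        for (int i = 0; i <= text.Length; i++)
        {
            bool isWordChar = i < text.Length && char.IsLetterOrDigit(text[i]);
            if (isWordChar && start < 0)
            {
                start = i;
            }
            else if (!isWordChar && start >= 0)
            {
                yield return text.Substring(start, i - start);
                start = -1;
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var index = new PageTextIndex();
        // In the loop above, these strings would come from textRegion.Text.PlainText.
        index.AddText("doc1:page1", "Invoice No. 1042 from Acme");
        index.AddText("doc2:page1", "Acme delivery note");
        Console.WriteLine(string.Join(", ", index.Search("acme")));
        // prints "doc1:page1, doc2:page1"
    }
}
```

To combine this with the sample above, replace the System.Console.WriteLine call with a call to index.AddText, passing a key that identifies the current document and page. How that key is built is up to you.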