How to recognize text containing only numbers

There are two ways to recognize text containing only numbers:

A. Use the predefined "Digits" text language

Sample code in C#:

// load image
FRDocument document = engineLoader.Engine.CreateFRDocumentFromImage(@"D:\Demo.tif");

// parameters setting
DocumentProcessingParams documentProcessingParams = engineLoader.Engine.CreateDocumentProcessingParams();
documentProcessingParams.PageProcessingParams.RecognizerParams.SetPredefinedTextLanguage("Digits");

// process the document using the predefined language
document.Process(documentProcessingParams);

Note: The "Digits" text language also includes other symbols typical of digit-only fields, such as the decimal separator (comma or dot) and the minus sign. The full list of these symbols:

#$%()+,-./:=[]{}¢£°¼½¾—‹›€
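
Because the "Digits" language admits the symbols listed above, the recognized text may still contain characters other than 0-9. If strictly digit-only output is required, one option is to check or filter the text after recognition. The following is a minimal sketch in plain C#; the DigitFilter class and its methods are hypothetical helpers, not part of the Engine API, and the input string stands for whatever text you extract from the processed document.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not part of the Engine API): post-checks recognized text.
public static class DigitFilter
{
    // True only when the text is non-empty and every character is an ASCII digit 0-9.
    public static bool IsDigitsOnly(string text)
    {
        return !string.IsNullOrEmpty(text) && text.All(c => c >= '0' && c <= '9');
    }

    // Returns the text with every character other than 0-9 removed.
    // A null input is treated as an empty string.
    public static string KeepDigits(string text)
    {
        return new string((text ?? string.Empty).Where(c => c >= '0' && c <= '9').ToArray());
    }
}
```

For example, DigitFilter.IsDigitsOnly("-12") is false, and DigitFilter.KeepDigits("1,234.50") returns "123450". Note that stripping a decimal separator changes the numeric value, so filtering like this suits identifiers and codes better than amounts.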

B. Specify the alphabet directly

  1. Create BaseLanguage object and set its alphabet
  2. Create corresponding TextLanguage object
  3. Process the document using the created language

Sample code in C#:

// creating text language
ILanguageDatabase languageDatabase = engineLoader.Engine.CreateLanguageDatabase();
ITextLanguage textLanguage = languageDatabase.CreateTextLanguage();

// creating and setting base language
IBaseLanguage baseLanguage = textLanguage.BaseLanguages.AddNew();
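baseLanguage.LetterSet[BaseLanguageLetterSetEnum.BLLS_Alphabet] = "0123456789";

// process the document using the created language
// (document is loaded as in sample A; this assumes RecognizerParams exposes
// a TextLanguage property, so check the API reference for your Engine version)
DocumentProcessingParams documentProcessingParams = engineLoader.Engine.CreateDocumentProcessingParams();
documentProcessingParams.PageProcessingParams.RecognizerParams.TextLanguage = textLanguage;
document.Process(documentProcessingParams);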
