Hi all,
I'm trying to write a code snippet that performs field-level recognition of digit data only ([0-9],.). Here is my Java code for FineReader Engine, using a regular-expression dictionary and a custom text language.
Creating the engine configuration:
public IDocumentProcessingParams buildProcessingParams(IEngine engine) {
    IDocumentProcessingParams documentProcessingParams = engine.CreateDocumentProcessingParams();
    IPageProcessingParams pageProcessingParams = engine.CreatePageProcessingParams();
    IRecognizerParams recognizerParams = engine.CreateRecognizerParams();
    recognizerParams.setLanguageDetectionMode(ThreeStatePropertyValueEnum.TSPV_No);

    ITextLanguage textLanguage = engine.CreateLanguageDatabase().CreateTextLanguage();
    textLanguage.setInternalName("Digit");
    textLanguage.setLetterSet(TextLanguageLetterSetEnum.TLLS_ProhibitedLetters, "0123456789,.");
    IDictionaryDescription textDictionnary = textLanguage.getProhibitingDictionaries().AddNew(DictionaryTypeEnum.DT_RegularExpression);
    textDictionnary.GetAsRegExpDictionaryDescription().SetText("[0..9],\\.");

    IBaseLanguage baseLanguage = textLanguage.getBaseLanguages().AddNew();
    baseLanguage.setInternalName("Base-Digit");
    baseLanguage.setLetterSet(BaseLanguageLetterSetEnum.BLLS_Alphabet, "0123456789,.");
    baseLanguage.setAllowWordsFromDictionaryOnly(true);
    IDictionaryDescription baseDictionary = baseLanguage.getDictionaryDescriptions().AddNew(DictionaryTypeEnum.DT_RegularExpression);
    baseDictionary.GetAsRegExpDictionaryDescription().SetText("[0..9],\\.");

    recognizerParams.setTextLanguage(textLanguage);
    pageProcessingParams.setRecognizerParams(recognizerParams);
    documentProcessingParams.setPageProcessingParams(pageProcessingParams);
    return documentProcessingParams;
}
How it is used during processing:
IDocumentProcessingParams documentProcessingParams = setup.buildProcessingParams(engine);
document.Process(documentProcessingParams);
Unfortunately, this code produces nothing in the export.
Do you have any idea why?
Comments
Apparently, in your custom language you prohibit the same characters and words that you would like to recognize. Basically, the lines in your code that set TextLanguageLetterSetEnum.TLLS_ProhibitedLetters and add a dictionary through getProhibitingDictionaries() tell FREngine that there should not be symbols "0123456789,." in the recognized results, i.e. they are prohibited. Please refer to the Developer's Help articles API Reference → Language-Related Objects → TextLanguage and API Reference → Enumerations → TextLanguageLetterSetEnum for additional details.
Once you remove these lines from your code, you should be able to get the results that you expect.
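The conflict described above can be modeled outside the engine with plain Java sets. This is only an illustrative sketch: the class LetterSetConflict and its methods are hypothetical and do not call FineReader Engine. It simply assumes that a character which is both in the alphabet and in the prohibited set ends up unusable, as explained in this reply.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of the FineReader Engine API.
public class LetterSetConflict {

    // Turn a string of characters into an ordered set.
    static Set<Character> toSet(String letters) {
        Set<Character> set = new LinkedHashSet<>();
        for (char c : letters.toCharArray()) {
            set.add(c);
        }
        return set;
    }

    // Characters that stay usable once the prohibited ones are removed
    // (assumption: prohibition wins over the alphabet).
    static Set<Character> usable(String alphabet, String prohibited) {
        Set<Character> result = toSet(alphabet);
        result.removeAll(toSet(prohibited));
        return result;
    }

    public static void main(String[] args) {
        String digits = "0123456789,.";
        // Same set allowed and prohibited: nothing is left to recognize.
        System.out.println(usable(digits, digits).size()); // prints 0
        // No prohibited letters: the whole alphabet remains.
        System.out.println(usable(digits, "").size());     // prints 12
    }
}
```

With the prohibited set emptied, all twelve characters remain available, which matches the advice to drop the prohibiting lines from the configuration.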
Thanks a lot for your answer. In the meantime, I found another solution in your documentation: ABBYY already has a built-in language, 'Digits', that fits this case. Here's the code: