Field Level Recognition on digit only – Help Center

Field Level Recognition on digit only (Answered)

Written by Permanently deleted user

April 24, 2017 21:00

Hi all,

I'm trying to write a code snippet that performs field-level recognition on digit data only ([0-9], comma and period). Here is my Java code, which uses FineReader Engine with a regular-expression dictionary and a custom text language.

To create the engine configuration:

public IDocumentProcessingParams buildProcessingParams(IEngine engine) {
        IDocumentProcessingParams documentProcessingParams = engine.CreateDocumentProcessingParams();
        IPageProcessingParams pageProcessingParams = engine.CreatePageProcessingParams();
        IRecognizerParams recognizerParams = engine.CreateRecognizerParams();
        recognizerParams.setLanguageDetectionMode(ThreeStatePropertyValueEnum.TSPV_No);

        ITextLanguage textLanguage = engine.CreateLanguageDatabase().CreateTextLanguage();
        textLanguage.setInternalName("Digit");
        textLanguage.setLetterSet(TextLanguageLetterSetEnum.TLLS_ProhibitedLetters, "0123456789,.");
        IDictionaryDescription textDictionnary = textLanguage.getProhibitingDictionaries().AddNew(DictionaryTypeEnum.DT_RegularExpression);
        textDictionnary.GetAsRegExpDictionaryDescription().SetText("[0..9],\\.");

        IBaseLanguage baseLanguage = textLanguage.getBaseLanguages().AddNew();
        baseLanguage.setInternalName("Base-Digit");
        baseLanguage.setLetterSet(BaseLanguageLetterSetEnum.BLLS_Alphabet, "0123456789,.");
        baseLanguage.setAllowWordsFromDictionaryOnly(true);
        IDictionaryDescription baseDictionary = baseLanguage.getDictionaryDescriptions().AddNew(DictionaryTypeEnum.DT_RegularExpression);
        baseDictionary.GetAsRegExpDictionaryDescription().SetText("[0..9],\\.");

        

        recognizerParams.setTextLanguage(textLanguage);
        pageProcessingParams.setRecognizerParams(recognizerParams);
        documentProcessingParams.setPageProcessingParams(pageProcessingParams);
        return documentProcessingParams;
    }

How it is used during processing:

IDocumentProcessingParams documentProcessingParams = setup.buildProcessingParams(engine);
document.Process(documentProcessingParams);

Unfortunately, this code produces nothing in the export.

Do you have any idea why?


2 comments

Permanently deleted user

May 3, 2017 14:18
Apparently, in your custom language you prohibit the same characters and words that you would like to recognize. Basically, the following lines

textLanguage.setLetterSet(TextLanguageLetterSetEnum.TLLS_ProhibitedLetters, "0123456789,.");
IDictionaryDescription textDictionnary = textLanguage.getProhibitingDictionaries().AddNew
(DictionaryTypeEnum.DT_RegularExpression);
textDictionnary.GetAsRegExpDictionaryDescription().SetText("[0..9],\\.");

tell FREngine that there should not be symbols "0123456789,." in the recognized results, i.e. they are prohibited. Please refer to the Developer's Help articles API Reference → Language-Related Objects → TextLanguage and API Reference → Enumerations → TextLanguageLetterSetEnum for additional details.

Once you remove these lines from your code, you should be able to get the results that you expect.
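The effect described above can be illustrated with a toy model in plain Java. This is only a sketch and does not use the FineReader Engine API: the class `ProhibitedLettersSketch` and the helper `effectiveAlphabet` are hypothetical names, and the assumption that prohibited letters are simply subtracted from the alphabet is a simplification of whatever the engine does internally.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class ProhibitedLettersSketch {

    // Hypothetical helper (not part of FineReader Engine): models the
    // characters that remain usable as "alphabet minus prohibited letters".
    static Set<Character> effectiveAlphabet(String alphabet, String prohibited) {
        Set<Character> result = new LinkedHashSet<>();
        for (char c : alphabet.toCharArray()) {
            if (prohibited.indexOf(c) < 0) {
                result.add(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String digits = "0123456789,.";

        // Same string used as alphabet and as prohibited set, as in the question:
        // nothing is left for the recognizer to output.
        System.out.println(effectiveAlphabet(digits, digits));

        // With no prohibited letters, the whole alphabet stays available.
        System.out.println(effectiveAlphabet(digits, ""));
    }
}
```

In this model the first call leaves an empty set, which matches the empty export reported in the question; removing the prohibition restores all twelve characters.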
Permanently deleted user

May 3, 2017 14:35
Thanks a lot for your answer. In the meantime, I found another solution in your documentation: ABBYY already has a built-in predefined language, 'Digits', that fits this case. Here's the code:

@Override
public IDocumentProcessingParams buildProcessingParams(IEngine engine) {
    IDocumentProcessingParams documentProcessingParams = engine.CreateDocumentProcessingParams();
    IPageProcessingParams pageProcessingParams = engine.CreatePageProcessingParams();
    IRecognizerParams recognizerParams = engine.CreateRecognizerParams();
    recognizerParams.SetPredefinedTextLanguage("Digits");
    recognizerParams.setLanguageDetectionMode(ThreeStatePropertyValueEnum.TSPV_Yes);
    pageProcessingParams.setRecognizerParams(recognizerParams);
    documentProcessingParams.setPageProcessingParams(pageProcessingParams);
    return documentProcessingParams;
}
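Whichever language setup is used, it can be worth checking the recognized text before using it downstream. The following is a minimal sketch in plain Java using only `java.util.regex`; it is not part of the FineReader Engine API. The class `DigitFieldCheck`, the method `isDigitsOnly`, and the sample strings are hypothetical, and how the recognized text is read from the export is not shown here.

```java
import java.util.regex.Pattern;

class DigitFieldCheck {

    // One or more characters, each of which is a digit, a comma, or a period.
    private static final Pattern DIGITS_ONLY = Pattern.compile("[0-9.,]+");

    // Hypothetical helper: returns true when the trimmed field text
    // consists solely of the characters 0-9, ',' and '.'.
    static boolean isDigitsOnly(String fieldText) {
        if (fieldText == null) {
            return false;
        }
        return DIGITS_ONLY.matcher(fieldText.trim()).matches();
    }

    public static void main(String[] args) {
        // Sample values standing in for recognized field text.
        System.out.println(isDigitsOnly("1,234.56"));
        System.out.println(isDigitsOnly("12a4"));
        System.out.println(isDigitsOnly(""));
    }
}
```

An empty string is rejected on purpose, so an empty export like the one in the original question would be flagged rather than silently accepted.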
