Field-level recognition of digits only (Answered)

Hi all,

I'm trying to write a code snippet that performs field-level recognition of numeric data only (the characters [0-9], ',' and '.'). Here is my Java code for FineReader Engine, which uses a regular-expression dictionary and a custom text language.

To create the engine configuration:

public IDocumentProcessingParams buildProcessingParams(IEngine engine) {
        IDocumentProcessingParams documentProcessingParams = engine.CreateDocumentProcessingParams();
        IPageProcessingParams pageProcessingParams = engine.CreatePageProcessingParams();
        IRecognizerParams recognizerParams = engine.CreateRecognizerParams();
        recognizerParams.setLanguageDetectionMode(ThreeStatePropertyValueEnum.TSPV_No);

        ITextLanguage textLanguage = engine.CreateLanguageDatabase().CreateTextLanguage();
        textLanguage.setInternalName("Digit");
        textLanguage.setLetterSet(TextLanguageLetterSetEnum.TLLS_ProhibitedLetters, "0123456789,.");
        IDictionaryDescription textDictionnary = textLanguage.getProhibitingDictionaries().AddNew(DictionaryTypeEnum.DT_RegularExpression);
        textDictionnary.GetAsRegExpDictionaryDescription().SetText("[0..9],\\.");

        IBaseLanguage baseLanguage = textLanguage.getBaseLanguages().AddNew();
        baseLanguage.setInternalName("Base-Digit");
        baseLanguage.setLetterSet(BaseLanguageLetterSetEnum.BLLS_Alphabet, "0123456789,.");
        baseLanguage.setAllowWordsFromDictionaryOnly(true);
        IDictionaryDescription baseDictionary = baseLanguage.getDictionaryDescriptions().AddNew(DictionaryTypeEnum.DT_RegularExpression);
        baseDictionary.GetAsRegExpDictionaryDescription().SetText("[0..9],\\.");

        

        recognizerParams.setTextLanguage(textLanguage);
        pageProcessingParams.setRecognizerParams(recognizerParams);
        documentProcessingParams.setPageProcessingParams(pageProcessingParams);
        return documentProcessingParams;
    }

How it is used during processing:

IDocumentProcessingParams documentProcessingParams = setup.buildProcessingParams(engine);
document.Process(documentProcessingParams);

Unfortunately, this code produces nothing in the export.

Do you have any idea why?

Comments

2 comments

  • Permanently deleted user

    Apparently, in your custom language you prohibit the same characters and words that you would like to recognize. Basically, the following lines

    textLanguage.setLetterSet(TextLanguageLetterSetEnum.TLLS_ProhibitedLetters, "0123456789,.");
    IDictionaryDescription textDictionnary = textLanguage.getProhibitingDictionaries().AddNew
       (DictionaryTypeEnum.DT_RegularExpression);
    textDictionnary.GetAsRegExpDictionaryDescription().SetText("[0..9],\\.");

    tell FREngine that there should not be symbols "0123456789,." in the recognized results, i.e. they are prohibited. Please refer to the Developer's Help articles API Reference → Language-Related Objects → TextLanguage and API Reference → Enumerations → TextLanguageLetterSetEnum for additional details.

    Once you remove these lines from your code, you should be able to get the results that you expect.

  • Permanently deleted user

    Thanks a lot for your answer. In the meantime, I found another solution in your documentation: ABBYY already provides a predefined language, 'Digits', that fits this case. Here's the code:

    @Override
    public IDocumentProcessingParams buildProcessingParams(IEngine engine) {
        IDocumentProcessingParams documentProcessingParams = engine.CreateDocumentProcessingParams();
        IPageProcessingParams pageProcessingParams = engine.CreatePageProcessingParams();
        IRecognizerParams recognizerParams = engine.CreateRecognizerParams();
        recognizerParams.SetPredefinedTextLanguage("Digits");
        recognizerParams.setLanguageDetectionMode(ThreeStatePropertyValueEnum.TSPV_Yes);
        pageProcessingParams.setRecognizerParams(recognizerParams);
        documentProcessingParams.setPageProcessingParams(pageProcessingParams);
        return documentProcessingParams;
    }
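
Whichever configuration is used, it may be worth checking the exported field text before relying on it. The following is only a minimal sketch of such a check, using plain java.util.regex rather than anything from FineReader Engine. The class DigitFieldCheck and its two methods are hypothetical names introduced for illustration, and the allowed character set (digits, comma, period) is taken from the question.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of FineReader Engine: validates text that
// has already been recognized and exported.
public class DigitFieldCheck {

    // Allowed characters, as stated in the question: digits, comma, period.
    private static final Pattern DIGITS_ONLY = Pattern.compile("[0-9.,]+");

    /** Returns true if the trimmed text consists only of digits, commas and periods. */
    public static boolean isDigitField(String text) {
        return text != null && DIGITS_ONLY.matcher(text.trim()).matches();
    }

    /** Removes every character outside the allowed set. */
    public static String keepDigitChars(String text) {
        return text == null ? "" : text.replaceAll("[^0-9.,]", "");
    }

    public static void main(String[] args) {
        System.out.println(isDigitField("1,234.56"));               // true
        System.out.println(isDigitField("12O4"));                   // false (letter O, not zero)
        System.out.println(keepDigitChars("Total: 1,234.56 EUR"));  // 1,234.56
    }
}
```

A check like this does not replace restricting the recognition language. It only flags fields where other characters slipped through, for example a letter O read in place of a zero.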