How to get confidence with FREngine 11 for Linux? (C++)

I've got the following code:

void performOCR(std::string fp) {
  try {
    clock_t t;
    t = clock();

    std::wstring file_wstring = std::wstring(fp.begin(), fp.end());
    const wchar_t* file_path = file_wstring.c_str();

    // Load ABBYY FineReader Engine
    std::cout << "Initializing Engine for file " << fp << std::endl;
    LoadFREngine();

    // Set the recognition language
    CSafePtr<IDocumentProcessingParams> documentProcessingParams;
    CheckResult( FREngine->CreateDocumentProcessingParams( &documentProcessingParams ) );
    CSafePtr<IPageProcessingParams> pageProcessingParams;
    CheckResult( documentProcessingParams->get_PageProcessingParams( &pageProcessingParams ) );
    CSafePtr<IRecognizerParams> recognizerParams;
    CheckResult( pageProcessingParams->get_RecognizerParams( &recognizerParams ) );
    CheckResult( recognizerParams->SetPredefinedTextLanguage( L"Spanish" ) );

    std::cout << "Loading PDF..." << std::endl;
    CBstr imagePath = Concatenate(L"./data/", file_path);
    CSafePtr<IFRDocument> frDocument = 0;
    CheckResult( FREngine->CreateFRDocumentFromImage( imagePath, 0, frDocument.GetBuffer() ) );

    // Recognize document
    std::cout << "Recognizing..." << std::endl;
    //CheckResult( frDocument->Process());
    CheckResult( frDocument->Process( documentProcessingParams ) );

    // Save results
    fp += ".rtf";
    std::wstring output_wstring = std::wstring(fp.begin(), fp.end());
    const wchar_t* output_wchar = output_wstring.c_str();
    std::cout << "Saving Results..." << std::endl;
    CBstr exportPath = Concatenate(L"./output/", output_wchar);
    CheckResult( frDocument->Export( exportPath, FEF_RTF, 0 ) );

    // Unload ABBYY FineReader Engine
    std::cout << "Deinitializing Engine..." << std::endl;
    UnloadFREngine();

    t = clock() - t;
    std::cout << "Time processing: " << float(t)/CLOCKS_PER_SEC << " seconds" << std::endl;
    std::cout << std::endl;
  }
  catch( CAbbyyException& e ) {
    wprintf(e.Description());
  }
}

 

I would like to add a threshold for when a word's confidence is not good enough. How can I achieve this?

In the documentation there are references to IRecognizerParams::ExactConfidenceCalculation, but when I compile something like recognizerParams->ExactConfidenceCalculation(1) as a test, I get:

‘struct IRecognizerParams’ has no member named ‘ExactConfidenceCalculation’

Thanks in advance

Comments

  • Diana Khammatova

    Could you please try replacing recognizerParams->ExactConfidenceCalculation(1) with recognizerParams->put_ExactConfidenceCalculation( VARIANT_TRUE )?
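
To illustrate the accessor convention behind this suggestion, here is a minimal sketch of a property exposed as a put_/get_ pair. Everything in it is a hypothetical stand-in written for this example (MockRecognizerParams, Status, Flag, enableExactConfidence); it does not use the FREngine headers and only mimics the call shape, with kTrue set to -1 in the way VARIANT_TRUE is defined in COM.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; NOT the real FREngine declarations.
using Status = int;                  // plays the role of HRESULT
constexpr Status kSuccess = 0;
constexpr Status kNullPointer = -1;  // made-up error code for a null out-parameter
using Flag = short;                  // plays the role of VARIANT_BOOL
constexpr Flag kTrue = -1;           // VARIANT_TRUE is -1 in COM
constexpr Flag kFalse = 0;

// A mock object exposing one property, "ExactConfidenceCalculation",
// through a setter (put_) and a getter (get_) that both return a status code.
class MockRecognizerParams {
public:
    Status put_ExactConfidenceCalculation(Flag value) {
        exact_ = (value != kFalse);
        return kSuccess;
    }

    Status get_ExactConfidenceCalculation(Flag* result) const {
        if (result == nullptr) {
            return kNullPointer;     // the value comes back through the pointer
        }
        *result = exact_ ? kTrue : kFalse;
        return kSuccess;
    }

private:
    bool exact_ = false;
};

// Turns the property on, reads it back, and reports whether it is enabled.
bool enableExactConfidence(MockRecognizerParams& params) {
    if (params.put_ExactConfidenceCalculation(kTrue) != kSuccess) {
        return false;
    }
    Flag value = kFalse;
    if (params.get_ExactConfidenceCalculation(&value) != kSuccess) {
        return false;
    }
    assert(value == kTrue || value == kFalse);
    return value != kFalse;
}
```

The point of the sketch is the shape of the calls: the property name never appears as a member on its own, the setter takes the new value, and the getter writes the current value through an out-parameter while the return value carries only success or failure.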

  • Omar López Rubio

    Thanks! I will try. And how can I access that confidence? I'm searching the documentation and cannot find anything about put_ExactConfidenceCalculation. I suppose all setters and getters follow the same syntax, then (put_... and get_...)?

  • Diana Khammatova

    There are two properties in FREngine that represent the recognition confidence of each character:

    • CharacterRecognitionVariant::CharConfidence is a numerical estimate of the probability that the image does in fact represent this character. Every character recognition variant has this property.
    • CharParams::IsSuspicious is a property of a single character. If it is set to TRUE, the character was recognized unreliably; otherwise it is set to FALSE.

    As for your question about getters and setters, you are absolutely right: in C++ code it is necessary to use the put_/get_ prefixes.
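
As a rough sketch of how the word-level threshold from the original question could be built on these two per-character values, the following groups characters into words and rejects a word if its weakest character falls below the threshold or if any character is marked suspicious. It assumes the confidence and suspicious flag have already been read out of the engine into a plain struct; RecognizedChar, WordReport and flagLowConfidenceWords are hypothetical names for this example, the "higher is more reliable" integer scale is an assumption, and no FREngine calls are shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cwctype>
#include <limits>
#include <string>
#include <vector>

// Hypothetical record for one recognized character. The values are assumed
// to have been copied out of the engine beforehand; the integer scale
// (higher = more reliable) is an assumption of this sketch.
struct RecognizedChar {
    wchar_t symbol;
    int confidence;
    bool suspicious;
};

// Result for one word: its text, its weakest character, and the verdict.
struct WordReport {
    std::wstring text;
    int minConfidence;
    bool accepted;
};

// Splits the character stream on whitespace and accepts a word only if
// every character reaches the threshold and none is flagged as suspicious.
std::vector<WordReport> flagLowConfidenceWords(
        const std::vector<RecognizedChar>& chars, int threshold) {
    assert(threshold >= 0);

    std::vector<WordReport> words;
    std::wstring current;
    int lowest = std::numeric_limits<int>::max();
    bool anySuspicious = false;

    // Emits the word collected so far (if any) and resets the accumulators.
    auto flush = [&]() {
        if (current.empty()) {
            return;
        }
        const bool accepted = (lowest >= threshold) && !anySuspicious;
        words.push_back({current, lowest, accepted});
        current.clear();
        lowest = std::numeric_limits<int>::max();
        anySuspicious = false;
    };

    for (const RecognizedChar& c : chars) {
        if (std::iswspace(static_cast<wint_t>(c.symbol))) {
            flush();                 // whitespace ends the current word
            continue;
        }
        current.push_back(c.symbol);
        lowest = std::min(lowest, c.confidence);
        anySuspicious = anySuspicious || c.suspicious;
    }
    flush();                         // emit the trailing word
    return words;
}
```

Using the minimum rather than the average is a deliberate choice in this sketch: one badly recognized character is enough to make a word wrong, and an average would hide it in long words. Swapping in an average or dropping the suspicious check is a one-line change if that policy is too strict.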

  • Omar López Rubio

    Thanks a lot, Diana. I'm really confused by the documentation. I have a CSafePtr<IFRDocument> object, which I use to process the document. So CharConfidence is a property of PlainText and IsSuspicious is a property of CharParams. How can I get the PlainText object from the IFRDocument (if it has one)?

  • Omar López Rubio

    I tried the following:

    CSafePtr<IPlainText> plainText = 0;

    frDocument->getplainText(&plainText);

    OLECHAR**  text = 0;

    plainText->get_Text(text);

    I would like to print the word (or char) and the confidence.

    Thanks!

     

  • Diana Khammatova

    Please try to get PlainText as follows:

    CSafePtr<IPlainText> plainText;

    frDocument->get_PlainText(&plainText);
           
    BSTR text;

    plainText->get_Text(&text);

    More information about the properties of the FRDocument object can be found in Help → API Reference → Document-Related Objects → Document Organization → FRDocument.

    In addition, please see the article in our knowledge base that describes how to get the recognized text.

     

     
