I've got the following code:
#include <cstdio>
#include <ctime>
#include <iostream>
#include <string>

void performOCR(std::string fp) {
    try {
        clock_t t;
        t = clock();
        // Widen the file name (assumes it contains only ASCII characters)
        std::wstring file_wstring = std::wstring(fp.begin(), fp.end());
        const wchar_t* file_path = file_wstring.c_str();

        // Load ABBYY FineReader Engine
        std::cout << "Initializing Engine for file " << fp << std::endl;
        LoadFREngine();

        // Set the recognition language through the processing parameters
        CSafePtr<IDocumentProcessingParams> documentProcessingParams;
        CheckResult( FREngine->CreateDocumentProcessingParams( &documentProcessingParams ) );
        CSafePtr<IPageProcessingParams> pageProcessingParams;
        CheckResult( documentProcessingParams->get_PageProcessingParams( &pageProcessingParams ) );
        CSafePtr<IRecognizerParams> recognizerParams;
        CheckResult( pageProcessingParams->get_RecognizerParams( &recognizerParams ) );
        CheckResult( recognizerParams->SetPredefinedTextLanguage( L"Spanish" ) );

        std::cout << "Loading PDF..." << std::endl;
        CBstr imagePath = Concatenate(L"./data/", file_path);
        CSafePtr<IFRDocument> frDocument = 0;
        CheckResult( FREngine->CreateFRDocumentFromImage( imagePath, 0, frDocument.GetBuffer() ) );

        // Recognize document
        std::cout << "Recognizing..." << std::endl;
        //CheckResult( frDocument->Process());
        CheckResult( frDocument->Process( documentProcessingParams ) );

        // Save results
        fp += ".rtf";
        std::wstring output_wstring = std::wstring(fp.begin(), fp.end());
        const wchar_t* output_wchar = output_wstring.c_str();
        std::cout << "Saving Results..." << std::endl;
        CBstr exportPath = Concatenate(L"./output/", output_wchar);
        CheckResult( frDocument->Export( exportPath, FEF_RTF, 0 ) );

        // Unload ABBYY FineReader Engine
        std::cout << "Deinitializing Engine..." << std::endl;
        UnloadFREngine();

        t = clock() - t;
        std::cout << "Time processing: " << float(t)/CLOCKS_PER_SEC << " seconds" << std::endl;
        std::cout << std::endl;
    }
    catch( CAbbyyException& e ) {
        // Print the description as data, never as the format string
        wprintf(L"%ls\n", e.Description());
    }
}
I would like to apply a threshold so that words whose recognition confidence is too low can be rejected. How could I achieve this?
In the documentation there are references to IRecognizerParams::ExactConfidenceCalculation, but when I compile something like recognizerParams->ExactConfidenceCalculation(1) as a test, I get:
‘struct IRecognizerParams’ has no member named ‘ExactConfidenceCalculation’
Thanks in advance
Comments
Could you please try replacing recognizerParams->ExactConfidenceCalculation(1) with recognizerParams->put_ExactConfidenceCalculation( VARIANT_TRUE )?
Thanks! I will try. And how can I access that confidence? I'm searching the documentation and I cannot find anything about put_ExactConfidenceCalculation. I suppose that all setters and getters follow the same syntax, then (put_... and get_...)?
There are two properties in FREngine that represent the recognition confidence of each character: CharConfidence and IsSuspicious.
As for your question about getters and setters, you are absolutely right: it is necessary to use the put_/get_ prefixes in C++ code.
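To illustrate the put_/get_ convention, here is a minimal sketch of how a property typically appears in C++ for a COM-style interface: the getter returns the value through an out-parameter, the setter takes the value directly, and both return a status code. MockRecognizerParams and the HResult/VariantBool aliases are stand-ins invented for this example so that it compiles without the FineReader Engine headers; they are not the engine's actual declarations.

```cpp
#include <cstdint>

// Stand-ins for the COM types (assumption: the real headers define their own
// HRESULT and VARIANT_BOOL; these aliases exist only for this sketch).
using HResult = std::int32_t;
using VariantBool = short;
constexpr HResult kOk = 0;               // plays the role of S_OK
constexpr HResult kInvalidPointer = -1;  // hypothetical error code for this sketch
constexpr VariantBool kVariantTrue = -1; // VARIANT_TRUE is -1, not 1
constexpr VariantBool kVariantFalse = 0;

// Hypothetical class exposing one boolean property the COM way.
class MockRecognizerParams {
public:
    // Property read: the value comes back through an out-parameter.
    HResult get_ExactConfidenceCalculation(VariantBool* result) const {
        if (result == nullptr) return kInvalidPointer;
        *result = value_;
        return kOk;
    }

    // Property write: the new value is passed by value.
    HResult put_ExactConfidenceCalculation(VariantBool value) {
        value_ = value;
        return kOk;
    }

private:
    VariantBool value_ = kVariantFalse;
};

// Reads the property back and converts it to a plain bool.
bool exactConfidenceEnabled(const MockRecognizerParams& params) {
    VariantBool value = kVariantFalse;
    return params.get_ExactConfidenceCalculation(&value) == kOk
        && value != kVariantFalse;
}
```

With the real interface, the equivalent calls would be wrapped in CheckResult, as in the code from the question.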
Thanks a lot, Diana. I'm really confused by the documentation. I have a CSafePtr<IFRDocument> object, with which I process the document. So CharConfidence is a property of PlainText and IsSuspicious is a property of CharParams. How can I get the PlainText object from the IFRDocument (if it has one)?
I tried the following:
I would like to print the word (or char) and the confidence.
Thanks!
Please try to get PlainText as follows:
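CSafePtr<IPlainText> plainText;
CheckResult( frDocument->get_PlainText( &plainText ) );
BSTR text = 0;
CheckResult( plainText->get_Text( &text ) );
// The caller owns the returned BSTR and is responsible for freeing it.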
You can find more information about the properties of the FRDocument object in Help → API Reference → Document-Related Objects → Document Organization → FRDocument.
In addition, please have a look at the article in our knowledge base that describes how to get the recognized text.
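Once the recognized text and a per-character confidence are available, the threshold asked about in the original question could be applied along these lines. This is only a sketch under assumptions: it takes the confidences as a plain vector with one entry per character (how they are read from the engine is not shown here), it scores a word by its weakest character, and the WordConfidence, scoreWords and belowThreshold names are made up for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// A word together with the lowest confidence among its characters.
struct WordConfidence {
    std::wstring word;
    int confidence;
};

// Splits text on whitespace and scores each word by its weakest character.
// Assumption: charConfidence holds one value per character of text.
std::vector<WordConfidence> scoreWords(const std::wstring& text,
                                       const std::vector<int>& charConfidence) {
    std::vector<WordConfidence> words;
    std::wstring current;
    int lowest = 0;
    for (std::size_t i = 0; i < text.size() && i < charConfidence.size(); ++i) {
        const wchar_t c = text[i];
        const bool isSpace = (c == L' ' || c == L'\t' || c == L'\n' || c == L'\r');
        if (isSpace) {
            // A whitespace character closes the word collected so far.
            if (!current.empty()) {
                words.push_back({current, lowest});
                current.clear();
            }
            continue;
        }
        // The first character of a word starts a new minimum.
        lowest = current.empty() ? charConfidence[i]
                                 : std::min(lowest, charConfidence[i]);
        current += c;
    }
    if (!current.empty()) {
        words.push_back({current, lowest});
    }
    return words;
}

// Returns the words whose confidence falls below the given threshold.
std::vector<WordConfidence> belowThreshold(const std::vector<WordConfidence>& words,
                                           int threshold) {
    std::vector<WordConfidence> rejected;
    for (const WordConfidence& w : words) {
        if (w.confidence < threshold) {
            rejected.push_back(w);
        }
    }
    return rejected;
}
```

Taking the minimum is a conservative choice: one badly recognized character is enough to flag the whole word. An average would be more forgiving, and the right threshold depends on the confidence scale the engine reports.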