Question
How can I calculate a confidence level for a page or an entire document?
Answer
FineReader Engine does not have a Confidence Property for the frPage or the frDocument Objects.
The confidence level of a page or an entire document can be calculated from the confidence levels of the individual characters recognized within that page or document.
You can access the confidence value of each character via the CharConfidence Property of the PlainText Object. The PlainText Object is available for both the frPage and the frDocument Objects.
    CSafePtr<IPlainText> PlainText;
    CheckResult(frPage->get_PlainText(&PlainText));

    int SymbolsCount = 0;
    CheckResult(PlainText->get_SymbolsCount(&SymbolsCount));

    for (int CurrentSymbol = 0; CurrentSymbol < SymbolsCount - 1; CurrentSymbol++)
    {
        int Confidence = 0;
        CheckResult(PlainText->get_CharConfidence(CurrentSymbol, &Confidence));
        //...
        //Custom aggregator function (Confidence);
        //...
    }
C++ code snippet
    FREngine.IPlainText PlainText = frPage.PlainText;

    for (int CurrentSymbol = 0; CurrentSymbol < PlainText.SymbolsCount - 1; CurrentSymbol++)
    {
        int Confidence = PlainText.CharConfidence[CurrentSymbol];
        //...
        //Custom aggregator function (Confidence);
        //...
    }
C# code snippet
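The custom aggregator function in the snippets above is left to the application. A minimal sketch of one possible aggregator is shown here, assuming the per-character values have first been collected into a std::vector<int>. The names ConfidenceSummary, SummarizeConfidence and lowThreshold are hypothetical and are not part of the FineReader Engine API, and the default threshold of 50 is only an illustrative value.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical summary of per-character confidence values.
// This type is an illustration only; it is not part of the FineReader Engine API.
struct ConfidenceSummary {
    std::size_t count = 0;     // number of characters aggregated
    double mean = 0.0;         // arithmetic mean of the confidence values
    int minimum = 0;           // lowest confidence seen (0 if there were no characters)
    std::size_t lowCount = 0;  // number of characters below the chosen threshold
};

// Aggregates confidence values that were read one by one from CharConfidence.
// lowThreshold is an assumed, application-specific cut-off, not an Engine recommendation.
ConfidenceSummary SummarizeConfidence(const std::vector<int>& confidences,
                                      int lowThreshold = 50)
{
    ConfidenceSummary summary;
    if (confidences.empty()) {
        return summary;  // nothing to aggregate; all fields stay at zero
    }

    long long total = 0;
    summary.minimum = confidences.front();
    for (int confidence : confidences) {
        total += confidence;
        summary.minimum = std::min(summary.minimum, confidence);
        if (confidence < lowThreshold) {
            ++summary.lowCount;
        }
    }

    summary.count = confidences.size();
    summary.mean = static_cast<double>(total) / static_cast<double>(summary.count);
    return summary;
}
```

In the loops above, each Confidence value would be appended to the vector, and SummarizeConfidence would be called once after the loop. Reporting the minimum and the number of low-confidence characters alongside the mean may be more informative than the mean alone, since a few poorly recognized characters can be hidden by an otherwise high average.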
Note. Confidence estimates the probability that a recognition variant is correct. It should not be understood as a general measure of recognition quality: the only safe use of confidence is for comparing recognition variants of the same character. Characters extracted from the source PDF file without recognition have their confidence set to 100. To calculate character confidence more accurately, set the IRecognizerParams::ExactConfidenceCalculation property to TRUE.
See also What is the difference between the CharConfidence and the IsSuspicious properties?