The CharConfidence property of the ExtendedRecAttributes, PlainText, and CharacterRecognitionVariant objects is a read-only long property that stores the confidence value of a character.
Its value lies in the range from 0 to 100; the special value 255 indicates that the confidence is undefined. The value is an estimate of the recognition confidence of a character, expressed as a percentage: the greater the value, the greater the confidence.
Character confidence can be undefined, for example, for characters which were added during text editing.
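Because 255 is a sentinel rather than a percentage, client code usually needs to treat it separately before comparing or averaging confidence values. The following is a minimal sketch of that check, not part of the API: the helper name `normalize_confidence` and the constant `UNDEFINED_CONFIDENCE` are hypothetical, and the input is assumed to be the raw long value read from CharConfidence.

```python
from typing import Optional

# Sentinel value that marks an undefined confidence (see the text above).
UNDEFINED_CONFIDENCE = 255


def normalize_confidence(raw: int) -> Optional[int]:
    """Return the confidence as a percentage, or None if it is undefined.

    Hypothetical helper: `raw` is assumed to be the value read from
    a CharConfidence property.
    """
    if raw == UNDEFINED_CONFIDENCE:
        return None
    if 0 <= raw <= 100:
        return raw
    # Anything else is outside the documented range.
    raise ValueError(f"unexpected CharConfidence value: {raw}")
```

Returning `None` for the undefined case keeps characters that were added during editing from being mistaken for characters with a very high confidence.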
The recognition confidence of a character image is a numerical estimate of the similarity between this image and an “ideal character” whose recognition confidence would be 100%. When recognizing a character, the program produces several recognition variants, which are ranked by their confidence values. For example, an image of the letter “e” may be recognized as:
- “e” with a confidence of 95%
- “c” with a confidence of 85%
- “o” with a confidence of 65%
The confidence values of all the recognition variants of a character need not add up to 100%. As a rule, the hypothesis with the highest confidence rating is selected as the recognition result. However, the choice also depends on the context (i.e., the word to which the character belongs) and on the results of a differential comparison. For example, if the word containing the “e” hypothesis is not a dictionary word while the word containing the “c” hypothesis is, the latter is selected as the recognition result, and its confidence rating is 85%. The remaining recognition variants can be obtained as hypotheses.
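The interplay between confidence ranking and dictionary context can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the engine's actual algorithm: it ignores differential comparison entirely, and the function `pick_variant`, the `make_word` callback, the sample word, and the dictionary contents are all hypothetical.

```python
from typing import Callable, List, Set, Tuple

Variant = Tuple[str, int]  # (character, confidence in percent)


def pick_variant(
    variants: List[Variant],
    make_word: Callable[[str], str],
    dictionary: Set[str],
) -> Variant:
    """Pick the best-ranked variant whose word is in the dictionary.

    Falls back to the highest-confidence variant when none of the
    candidate words is a dictionary word. Toy model only.
    """
    if not variants:
        raise ValueError("at least one variant is required")
    # Rank the variants by confidence, highest first.
    ranked = sorted(variants, key=lambda v: v[1], reverse=True)
    for char, confidence in ranked:
        if make_word(char) in dictionary:
            return char, confidence
    return ranked[0]


# Variants from the example above; the third letter of the word is uncertain.
variants = [("e", 95), ("c", 85), ("o", 65)]
dictionary = {"macro", "micro"}  # made-up miniature dictionary
result = pick_variant(variants, lambda ch: "ma" + ch + "ro", dictionary)
```

Here `result` is `("c", 85)`: “maero” is not in the dictionary, so the lower-ranked “c” wins and keeps its own confidence of 85.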
The IsSuspicious property of the CharParams object is a Boolean property. When it is set to TRUE, the character was recognized unreliably. Its value is determined by an algorithm that takes into account a number of parameters, such as the recognition confidence of the character, nearby characters and their recognition confidence, hypotheses and their recognition confidence, the geometric parameters of the character, the context (i.e., the word to which the character belongs), etc.
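Since IsSuspicious depends on more than the confidence value, it is not equivalent to a simple confidence threshold. The sketch below illustrates the difference on invented data: the `RecognizedChar` record is a hypothetical container for values already read from the engine, and the sample confidences and flags are made up for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RecognizedChar:
    """Hypothetical record holding values read for one character."""
    char: str
    confidence: int      # 0-100, or 255 if undefined
    is_suspicious: bool  # flag as reported for this character


def suspicious_positions(chars: List[RecognizedChar]) -> List[int]:
    """Indices of characters flagged as unreliably recognized."""
    return [i for i, c in enumerate(chars) if c.is_suspicious]


def low_confidence_positions(chars: List[RecognizedChar], threshold: int) -> List[int]:
    """Indices whose defined confidence is below the threshold."""
    return [
        i for i, c in enumerate(chars)
        if c.confidence != 255 and c.confidence < threshold
    ]


# Invented sample: the digit "1" inside a word is flagged despite its
# fairly high confidence, while the low-confidence "a" is not flagged.
sample = [
    RecognizedChar("c", 90, False),
    RecognizedChar("1", 88, True),
    RecognizedChar("e", 95, False),
    RecognizedChar("a", 60, False),
    RecognizedChar("r", 93, False),
]
flagged = suspicious_positions(sample)
below_70 = low_confidence_positions(sample, 70)
```

With this sample, `flagged` is `[1]` and `below_70` is `[3]`, so the two criteria select different characters.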
See also OCR Voting API