[FR Engine 11 SDK] How to exclude some characters after setting the recognition language? Answered

Written by Permanently deleted user

March 27, 2015 17:45

Hello,

I need to perform some text extraction. After selecting the predefined recognition language, I would like to exclude some characters.

Example: I would like to exclude the & ( ) | characters.

How can I do that?

After reading the documentation, I'm not sure whether I need to create a dictionary or modify the letter set of the language. The documentation is not very clear on this subject.

Permanently deleted user

March 30, 2015 09:31
Hello,

Yes, you can specify the letter set. Please see this article on how to do it with the help of regular expressions: http://knowledgebase.ocrsdk.com/article/1188.

Alternatively, you can iterate over the layout and remove all unwanted characters with the Remove method of the Paragraph object. A similar example (which only replaces curly quotes) can be found here: http://knowledgebase.ocrsdk.com/article/1468. This approach looks simpler; however, if the documents contain a large number of characters, iterating may take more time.
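
As a rough illustration of the second idea, here is a minimal Python sketch that strips the unwanted characters from text that has already been recognized and exported as a plain string. It does not call the FineReader Engine API at all: the function name `strip_excluded` and the constant `EXCLUDED` are hypothetical, and the sketch assumes you work on the exported text rather than on the layout's Paragraph objects.

```python
# Characters from the question that should not appear in the final text.
# EXCLUDED and strip_excluded are illustrative names, not part of any SDK.
EXCLUDED = "&()|"


def strip_excluded(text: str, excluded: str = EXCLUDED) -> str:
    """Return `text` with every character listed in `excluded` removed."""
    # str.maketrans with a third argument maps each listed character to None,
    # so translate() deletes it in a single pass over the string.
    return text.translate(str.maketrans("", "", excluded))


if __name__ == "__main__":
    recognized = "Smith & Sons (est. 1990) | London"
    print(strip_excluded(recognized))
```

Note that this only cleans up the output; unlike restricting the letter set, it does not stop the engine from recognizing those characters in the first place, so a misread such as a letter recognized as `|` would simply be deleted rather than corrected.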

Hope it helps!
