Community

How to ignore special characters in OCR recognition

Written by TASK eDoc Services GmbH - Thomas Berg

March 28, 2022 13:48

Good day.
Is there a way to simply switch off or ignore some characters during recognition? We transfer text from the recognised PDF into an XML and special characters like "□" cause problems there.
Therefore, I would like to configure the job on the server so that certain characters are not replaced by "?" or "□", but simply ignored.
The same applies to other ASCII and UTF-8 characters, e.g.
Ͱ ˩ ˥ ˦ ˧ ˾ ┌ ┐ └ ┘ ⌐ ■ □ ¬

Thank you for your support.

Regards
Thomas Berg
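
If the server job itself cannot be configured to drop these characters, one possible workaround is to filter the extracted text just before it is written into the XML. The following is only a minimal sketch in Python under that assumption; it is not a feature of the OCR server. The names `UNWANTED` and `strip_unwanted` are hypothetical, and the character list is simply the one from the post above. The sketch also removes code points that XML 1.0 does not permit at all, since those tend to cause similar trouble.

```python
import re

# Characters from the post that should be dropped rather than replaced.
# (Hypothetical list: extend or shorten it to match your own documents.)
UNWANTED = "Ͱ˩˥˦˧˾┌┐└┘⌐■□¬"

# Translation table: mapping a code point to None deletes it.
_DELETE_TABLE = {ord(ch): None for ch in UNWANTED}

# Code points that XML 1.0 forbids: control characters other than
# tab, line feed and carriage return, surrogates, U+FFFE and U+FFFF.
_XML_ILLEGAL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ud800-\udfff\ufffe\uffff]")


def strip_unwanted(text: str) -> str:
    """Return text with the unwanted and XML-illegal characters removed."""
    text = text.translate(_DELETE_TABLE)
    return _XML_ILLEGAL.sub("", text)


if __name__ == "__main__":
    print(strip_unwanted("Invoice □ 42 ■┌"))
```

The characters are deleted outright, so surrounding whitespace is left as it was; collapsing double spaces afterwards would be a separate step.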
