How to ignore special characters in OCR recognition

Good day.
Is there a way to switch off or ignore certain characters during recognition? We transfer the text from the recognised PDF into an XML file, and special characters like "□" cause problems there.
Therefore, I would like to configure the job on the server so that such characters are not replaced by "?" or "□", but are simply left out of the output.
The same applies to other ASCII and UTF-8 characters, e.g.:
Ͱ ˩ ˥ ˦ ˧ ˾ ┌ ┐ └ ┘ ⌐ ■ □ ¬

Thank you for your support.

Regards
Thomas Berg
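
Independently of any server-side setting, one possible workaround is to filter the recognised text just before it is written to the XML. The following is only a minimal sketch in Python, not a feature of the OCR server: the function name `strip_unwanted` and the constant `UNWANTED` are hypothetical, and the character set is simply the list from the post above.

```python
# Hypothetical post-processing step: runs on the extracted text,
# after recognition and before the XML is generated.

# Characters copied from the list in the post; extend as needed.
UNWANTED = "Ͱ˩˥˦˧˾┌┐└┘⌐■□¬"

# In a str.translate table, mapping a code point to None deletes it.
_DELETE_TABLE = {ord(ch): None for ch in UNWANTED}


def strip_unwanted(text: str) -> str:
    """Return text with every character in UNWANTED removed."""
    return text.translate(_DELETE_TABLE)


if __name__ == "__main__":
    sample = "Invoice□No┌123┐"
    print(strip_unwanted(sample))
```

Note that this removes the characters without inserting anything in their place, so words that were separated only by such a character end up joined. If that is not wanted, the table could map the characters to a space instead of `None`.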
