Community

junk chars are detected

Written by Permanently deleted user

February 3, 2014, 20:16

mpraining: Some junk characters are detected, e.g. "ï»¿PINOT NOIR", which is the first line of the result for the attached image. Another example is "Joan dâ€™Anguera". We need the text with such junk characters removed. Is there any option to avoid them?

Image
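
Editor's note: the two samples above look like UTF-8 bytes displayed as Windows-1252. "ï»¿" is how the UTF-8 byte order mark (EF BB BF) renders in that code page, and "â€™" is how the right single quotation mark (E2 80 99) renders. The following is a minimal Python sketch of two client-side workarounds, assuming the result really is UTF-8; the function names are hypothetical and nothing here is part of the OCR service's API.

```python
def decode_ocr_bytes(raw: bytes) -> str:
    """Decode downloaded result bytes as UTF-8, dropping a leading BOM."""
    # "utf-8-sig" decodes UTF-8 and strips a byte order mark if one is present.
    return raw.decode("utf-8-sig")


def fix_mojibake(text: str) -> str:
    """Best-effort repair of UTF-8 text that was mis-decoded as Windows-1252."""
    try:
        # Reverse the wrong decoding step, then decode the bytes correctly.
        repaired = text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text does not round-trip, so assume it was not mojibake.
        return text
    # Drop the byte order mark that the repair may have uncovered.
    return repaired.lstrip("\ufeff")


if __name__ == "__main__":
    print(decode_ocr_bytes(b"\xef\xbb\xbfPINOT NOIR"))
    print(fix_mojibake("Joan d\u00e2\u20ac\u2122Anguera"))
```

Decoding the raw bytes correctly in the first place is the more reliable route; `fix_mojibake` is only a fallback for text that has already been stored in the damaged form.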


3 comments

Permanently deleted user

February 3, 2014, 20:24

We could not reproduce the issue on our side. We recommend recognizing your image with the URL "http://cloud.ocrsdk.com/processImage?language=english,french&profile=textextraction&exportFormat=txt". In this case the result is:

PINOT NOIR
BURGUNDY
A1020 Roblet-Monnot “Vieilles Vignes" 2010
72
Al 021 Paul Pernot et ses Fils 2008
122
Pommard-Noizons
A1022 Domaine Antonin Guyon 2009
Clos de la Chaume Gaufriot, Beaune
A1023 Domaine Ardhuy 2009
Gevrey-Chambertin
U5
172
C10-24 Domaine de Lambrays Grand Cru 2009
Clos des Lambrays, Morey
260
C1025 Camille Giroud Grand Cru 2008
Chapelle-Chambertin
430
an 18% gratuity is included on all checks

Permanently deleted user

February 5, 2014, 23:17
Hello Anastasia, thanks for your feedback. I got it working better, but there is still one thing I do not understand. Please check the following entry from my result:

A1022 Domaine Antonin Guyon 2009 Clos de la Chaume Gaufriot, Beaune A1023 Domaine Ardhuy 2009 Gevrey-Chambertin 145 172

What we actually expect is something like this:

A1022 Domaine Antonin Guyon 2009 Clos de la Chaume Gaufriot, Beaune 145 A1023 Domaine Ardhuy 2009 Gevrey-Chambertin 172

But the result is different. Can you please check why this is happening? Otherwise my algorithm for detecting this line will fail because of this OCR mistake. I also checked the XML format, but it is not suitable for us. I'm just expecting the contents as they appear in the image. Please check and help me.
Permanently deleted user

February 7, 2014, 18:42
The automatic analysis recognizes this picture as several separate areas, which is why the text order is not left to right and top to bottom. Unfortunately, it is currently impossible to export the text in this order automatically, so the only way to get this order is to sort the words by their coordinates on your side.
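
Editor's note: a minimal Python sketch of the coordinate sort described above. It assumes each word's bounding box has already been extracted from a coordinate-bearing export; the `Word` class, its field names, the `tolerance` parameter, and the sample coordinates are all hypothetical and do not reflect the service's actual output schema.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One recognized word with a hypothetical pixel bounding box."""
    text: str
    left: int
    top: int
    right: int
    bottom: int


def words_to_lines(words, tolerance=0.5):
    """Group words into visual rows, then order each row left to right."""
    rows = []  # each row: {"center": vertical center, "words": [Word, ...]}
    # Visit words from top to bottom by the vertical center of their box.
    for word in sorted(words, key=lambda w: (w.top + w.bottom) / 2):
        center = (word.top + word.bottom) / 2
        height = word.bottom - word.top
        # A word joins the current row if its center is close to the row's.
        if rows and abs(center - rows[-1]["center"]) <= tolerance * height:
            rows[-1]["words"].append(word)
        else:
            rows.append({"center": center, "words": [word]})
    return [
        " ".join(w.text for w in sorted(row["words"], key=lambda w: w.left))
        for row in rows
    ]


# Made-up sample: the price column arrives after the description column,
# as it would if the two columns were detected as separate areas.
sample = [
    Word("A1022", 10, 100, 60, 120), Word("Guyon", 70, 100, 130, 120),
    Word("Beaune", 70, 125, 140, 145),
    Word("A1023", 10, 150, 60, 170), Word("Ardhuy", 70, 150, 140, 170),
    Word("Gevrey-Chambertin", 70, 175, 240, 195),
    Word("145", 500, 126, 530, 146), Word("172", 500, 174, 530, 194),
]

if __name__ == "__main__":
    for line in words_to_lines(sample):
        print(line)
```

With this sample the prices end up on the same rows as "Beaune" and "Gevrey-Chambertin". On a real scan, `tolerance` would probably need tuning, and skewed images may need deskewing before rows can be grouped this simply.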

