Community

Capturing whole page text

Written by Permanently deleted user

May 08, 2014 21:26

Just wondering, what is the best way to capture the text of a whole page using FlexiCapture 10? We have Distributed FlexiCapture 10 for invoice processing and now want to use it for another task, where the whole page needs to be converted into a string by eliminating all separators, images, and special characters.

Comments

5 comments

Permanently deleted user

May 09, 2014 20:31
farukhali,

So you want to do a full-text OCR of the page, i.e. like Recognition Server or FineReader retail, where you can convert an image to a TXT file? If that is the case, it would be better to use one of those products, but you could sort of do this by creating one large capture area for a single field: in FlexiLayout Studio, use Region and have it capture the whole page area. Then, on the FlexiCapture side, you can use the autocorrect function to remove all the characters you don't want. From there I would create a special export that exports only this one field value.

Permanently deleted user

May 09, 2014 20:44
Thanks for the reply. How about using Paragraph instead of Region? I tried it, and it seems to be working pretty well.

BTW, is there any way to detect a photo of a person or a particular logo on an image?

Permanently deleted user

May 09, 2014 20:51
One more thing: where can I use the autocorrect function? Is there any documentation on this? I appreciate your help!

Permanently deleted user

May 10, 2014 00:26
Farukhali,

Paragraph can work too, but I worry about an image or something similar getting picked up in the middle of the document, so that you end up missing some of the other text. Region will get everything. Otherwise, if you are worried about photos being OCRed as bad data, you could just use the Paragraph and put it in a repeatable group.

As for detecting logos, it can't do that. Basically, you can detect the size of a picture, but it can't distinguish between different types of photo.

The Autocorrection function is in FlexiCapture. On the field, go to the properties and select Data Type. Autocorrection is the middle option. It lets you replace the characters you don't want with empty values.
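
If you would rather do the same clean-up after export instead of inside FlexiCapture, the idea can be sketched in a few lines of Python. This is only an illustration of "replace unwanted characters with empty values": it is not the Autocorrection feature or any FlexiCapture API, the function name page_text_to_string is made up for this example, and it assumes the single field value has already been exported as plain text.

```python
import re
import unicodedata


def page_text_to_string(raw: str) -> str:
    """Collapse the OCR text of a whole page into one plain string.

    Hypothetical helper for illustration; not part of FlexiCapture.
    """
    # Fold compatibility forms (ligatures, full-width characters) to plain ones.
    text = unicodedata.normalize("NFKC", raw)
    # Replace everything that is not a letter, digit, or whitespace with nothing.
    # \w also matches "_", so the underscore is removed explicitly.
    text = re.sub(r"[^\w\s]|_", "", text)
    # Turn line breaks and runs of spaces into single spaces.
    return " ".join(text.split())


if __name__ == "__main__":
    sample = "Invoice No.: 12-345\n| Total | $1,200.00 |"
    print(page_text_to_string(sample))
```

Note that removing punctuation outright also merges things like "12-345" into "12345"; if that is not wanted, replace with a space instead of an empty string.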

Permanently deleted user

May 10, 2014 00:58
Thanks Sushi. I'll check these.

