Just wondering, what is the best way to capture the text of a whole page using FlexiCapture 10? We have FlexiCapture 10 Distributed for invoice processing and now want to use it for another task, where the whole page needs to be converted into a string with all separators, images and special characters removed.
Capturing whole page text
Comments
farukhali,
So you want to do full-text OCR of the page, i.e. like Recognition Server or the retail FineReader, where you can convert an image to a TXT file? If that is the case, it would be better to use one of those products, but you could sort of do this in FlexiCapture by creating one large capture area for a single field. In FlexiLayout Studio, use the Region element and have it capture the whole page area. Then, on the FlexiCapture side, you can use the autocorrection function to remove all the characters you don't want. From there I would create a special export that exports only this one field value.
BTW, is there any way to detect a photo of a person or a particular logo in an image?
The Paragraph element can work too, but I'm worried about an image or something else getting picked up in the middle of the document, so that you end up missing some of the other text. Region will get everything. Otherwise, if you are worried about photos being OCR'd as bad data, you could just use Paragraph and put it in a repeating group.
As for detecting logos, it can't do that. Basically, it can detect a picture and its size, but it can't distinguish between different types of pictures.
The autocorrection function is in FlexiCapture. On the field, go to the properties and select Data Type. Autocorrection is the middle option. It lets you replace the characters you don't want with empty values.
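If it helps to see what that replacement amounts to, here is a rough sketch in plain Python. It is not FlexiCapture script or any ABBYY API, just an illustration of the same idea applied to a string after export. The function name `flatten_page_text` and the choice of what counts as an unwanted character (anything that is not a letter, digit or whitespace) are my own assumptions; adjust the pattern to your documents.

```python
import re


def flatten_page_text(raw: str) -> str:
    """Turn OCR'd page text into one line with special characters removed.

    Hypothetical helper for illustration only; not part of FlexiCapture.
    """
    # Replace every character that is not a letter, digit or whitespace
    # with an empty value. "\w" also matches "_", so drop it explicitly.
    cleaned = re.sub(r"[^\w\s]|_", "", raw)
    # Collapse line breaks, tabs and repeated spaces into single spaces.
    return " ".join(cleaned.split())


if __name__ == "__main__":
    sample = "Invoice #123\n----\nTotal: $45.00 | Paid"
    print(flatten_page_text(sample))
```

Note that replacing with empty values glues together whatever sat on either side of a removed character, so "45.00" becomes "4500". If you would rather keep those apart, replace with a space instead of an empty string.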