I'm a newbie investigating a solution that meets two requirements:
- Can FlexiCapture automatically detect a specific table (occupying one page) in a multi-page PDF file, extract the data from that table, and ignore the remaining pages to speed up OCR?
- There are more than 10 thousand types of PDF files, so creating a document definition for each type would be a formidable task. Can FlexiLayout or some other tool automatically recognize common fields such as name, address, number, and line items to reduce the work, so that I only need to add a new field to the template incrementally when a new type falls outside that scope?
Best regards.
2 comments
Hello,
You can search for the specific table at the FlexiLayout level using unique identifiers, then treat all pages that do not contain these identifiers as non-recognizable annexes.
As for out-of-the-box auto-classification and extraction of some standard fields: this functionality is available for invoice-type documents (see the FlexiCapture for Invoices description). Otherwise, you'll need to prepare your own layout.
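To illustrate the identifier idea outside of FlexiCapture, here is a minimal Python sketch. It does not use any FlexiCapture or FlexiLayout API; it assumes you already have the text of each page as a string, and the function name `find_table_pages`, the sample pages, and the identifier list are all hypothetical. A page counts as the table page only if it contains every identifier; all other pages would be treated as annexes and skipped.

```python
from typing import List, Sequence


def find_table_pages(page_texts: Sequence[str], identifiers: Sequence[str]) -> List[int]:
    """Return 0-based indices of pages whose text contains every identifier.

    Matching is case-insensitive plain substring search. This is only an
    illustration of the idea, not how FlexiLayout matches elements.
    """
    wanted = [ident.casefold() for ident in identifiers]
    hits = []
    for index, text in enumerate(page_texts):
        folded = text.casefold()
        if all(ident in folded for ident in wanted):
            hits.append(index)
    return hits


# Hypothetical sample: three pages of already-extracted text.
pages = [
    "Cover letter\nDear customer ...",
    "Item  Qty  Unit price  Amount\nWidget  2  5.00  10.00",
    "Terms and conditions ...",
]

# Hypothetical identifiers: column headers that only the table page has.
table_pages = find_table_pages(pages, ["Qty", "Unit price", "Amount"])
print(table_pages)
```

The more distinctive the identifiers (for example, a combination of column headers rather than a single common word), the less likely an annex page is picked up by mistake.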
Dear Ekaterina
Thank you. I will try using identifiers first.