How can I set up a FlexiLayout template to capture data from a document where, on some pages, the data is organised in 2 columns? In each column I need to capture table data, and the data has to be read left column first. Tables can continue into the next column (on the same page or on the next page). Thanks
Multi Columns Documents
Comments
1. Distinguish the column regions in a repeating group. For example, scan from left to right for vertical white gaps that span the whole page height. The areas between the white gaps are the columns.
2. Loop from the first column to the last and find the header and footer of each table.
3. Form the table search areas as RectArrays. For example, if a table header is in column N and its footer is in column N+2, the RectArray will contain 3 rectangles: the 1st runs from the header to the bottom of column N, the 2nd is the whole of column N+1, and the 3rd runs from the top of column N+2 to the footer.
4. Find the table instances in the corresponding areas.
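To make step 3 concrete, here is a minimal sketch of the rectangle arithmetic in plain Python. It is not FlexiLayout script: `Rect`, `table_search_rects` and the parameter names are hypothetical and only illustrate how the per-column search areas could be derived once the column regions and the header and footer positions are known.

```python
from typing import List, NamedTuple


class Rect(NamedTuple):
    # Hypothetical rectangle type; coordinates grow to the right and downward.
    left: int
    top: int
    right: int
    bottom: int


def table_search_rects(columns: List[Rect], header_col: int, header_y: int,
                       footer_col: int, footer_y: int) -> List[Rect]:
    """Return one search rectangle per column the table passes through.

    columns    -- column regions in reading order (left to right)
    header_col -- index of the column that contains the table header
    header_y   -- y coordinate at which the search starts in that column
    footer_col -- index of the column that contains the table footer
    footer_y   -- y coordinate at which the search ends in that column
    """
    if footer_col < header_col:
        raise ValueError("footer column must not precede header column")
    rects = []
    for i in range(header_col, footer_col + 1):
        col = columns[i]
        # First column starts at the header, later columns at their own top.
        top = header_y if i == header_col else col.top
        # Last column ends at the footer, earlier columns at their own bottom.
        bottom = footer_y if i == footer_col else col.bottom
        rects.append(Rect(col.left, top, col.right, bottom))
    return rects
```

If the header and footer are in the same column, the result is a single rectangle running from the header to the footer. To cover tables that continue on the next page, the same idea could be applied to a list of columns that spans several pages, assuming each rectangle also records its page.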
Thanks for your tips. I am quite new to ABBYY, especially on the scripting side. My document is divided into exactly 2 columns and I know the exact coordinates (area) of each column, so I do not need to detect the number of columns using white gaps. I have created a repeating group to detect the header and footer, but I do not seem to be able to force the search to follow the column order, and I have difficulty pairing them (detecting the header and footer of the same table), especially if the table spans 2 columns (which could also be on different pages). This is probably what you were suggesting in step 2. How do I run a loop from the first to the second column for each page? Can I do this using the search area? Once the header and footer are detected, how can I create the RectArrays for searching the table? Thank you for your help
Probably the easiest way in the situation is to do the following:
1. Create Repeating Group Element
Use the “Page mode” and specify the number of repetitions per page
Use the Absolute search area constraint; define the coordinates of the two regions of the page (left and right halves, excluding the gap between them)
2. Create a Table within the Repeating Group Element, check “Look for header / footer” and uncheck the “Header / Footer is on each page” options;
3. When you describe the position of the Table block by specifying the source element (the Repeating Group Element), check “has repeating instances” and choose “Left to right” as the instance sort order to make sure that data is read left column first
Will that suit you?
Thanks for your help. Unfortunately I have tried this, but it does not resolve my problem. The number of instances per page is not known, and tables can run across columns and pages. I would be glad to hear any other tips you might have. Thank you.
Susanna,
Is there a sample image that you can provide so I can try on my end?
I can certainly provide you with the document. I have attached a scanned version of the 5 pages of the document we would like to capture data from. For each table you will see a table name (e.g. 1st table = Chiamate Locali), the total number of rows (the table is a call list: Numero totale chiamate) and, at the end of the table, a footer that gives the total call duration (total of the 3rd column: Durata totale chiamate), the average duration (Durata media chiamate) and the total amount (total of column 5).
We have tried some scripting ourselves, but we do not seem able to implement the logic of sequence between columns and pages (first Left Column page 1, then right column page 1, then left column page 2 and so on).
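The sequence described here (left column of page 1, then right column of page 1, then left column of page 2, and so on) amounts to sorting by page, then column, then vertical position. The sketch below shows that ordering, plus one possible way to pair each header with the next footer, in plain Python. It is not FlexiLayout script, and `Marker`, `reading_key` and `pair_tables` are hypothetical names. It assumes every detected header or footer carries its page number, column index and top coordinate.

```python
from typing import List, NamedTuple, Tuple


class Marker(NamedTuple):
    # Hypothetical record for one detected table header or footer.
    kind: str    # "header" or "footer"
    page: int    # page number
    column: int  # 0 = left column, 1 = right column
    top: int     # vertical position within the column


def reading_key(m: Marker) -> Tuple[int, int, int]:
    """Sort key: page first, then left column before right, then top to bottom."""
    return (m.page, m.column, m.top)


def pair_tables(markers: List[Marker]) -> List[Tuple[Marker, Marker]]:
    """Pair each header with the first footer that follows it in reading order."""
    pairs = []
    open_header = None
    for m in sorted(markers, key=reading_key):
        if m.kind == "header":
            open_header = m
        elif m.kind == "footer" and open_header is not None:
            pairs.append((open_header, m))
            open_header = None
    return pairs
```

A footer with no preceding header is skipped, and a header followed by another header is replaced by the later one. Whether that is the right behaviour depends on the documents, so treat it as an assumption.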
Thank you for any help you will be able to provide.
On the issue of multipage Document Definitions: I tried to follow the guidelines in the tutorial publication and keep getting the error "Several fields are exported to same 'ROW_INDEX'". I defined sections in the document and also defined the footer as the telltale sign of the end of the document. I assembled the sections using a key field that appears on all pages. I am managing to export to desktop, but not into the database, which is where it matters most.
Thank you for the images. I had training last week so I couldn't look at them much. Looking at these images, the way I would approach this is definitely with a repeatable group. I would probably create one repeatable group for the left side and another for the right side. On the FlexiCapture side, I would then write a script to merge the exports together. The hard part is that the table will jump from the left side to the right side. I suspect I would have to create some non-recognized repeatable group and remove the labels I don't want from the search area.
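As a rough illustration of the merge step, the sketch below interleaves rows captured by a left-side group and a right-side group so that they come out in reading order. It is plain Python rather than a FlexiCapture export script. The function name `merge_in_reading_order` and the row keys `page` and `top` are assumptions about what each captured row would carry.

```python
from typing import Any, Dict, List


def merge_in_reading_order(left_rows: List[Dict[str, Any]],
                           right_rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Merge rows from two column groups into a single reading-order list.

    Each row is assumed to be a dict with at least "page" and "top" keys.
    Rows are ordered by page, then left column before right, then top to bottom.
    """
    tagged = [(row["page"], 0, row["top"], row) for row in left_rows]
    tagged += [(row["page"], 1, row["top"], row) for row in right_rows]
    # Sort only on the (page, column, top) prefix so the dicts are never compared.
    tagged.sort(key=lambda t: t[:3])
    return [t[3] for t in tagged]
```

Once the rows are in this order, a table that jumps from the left side to the right side appears as one continuous run, which could then be split on the header and footer rows.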