Hello,
I am trying to capture an address in a document. I search for the zip code first with N{5}[\-]N{4}, then take the 2 lines closest to the zip code. It's not ideal, but it should do the trick for most of the documents. However, I can only get 1 line instead of 2. If I increase the count to 3 or 4 lines, then I capture all of them. How do I make sure that it gets only the last 2 lines? Please help.
Thanks
Comments
Because you specified in the element settings that from 1 to 2 lines are allowed for CustomerAddress, FlexiCapture may take any reliable text within the search area (which now includes three text lines).
If you need exactly the last 2 lines, try setting Min line count to 2. Setting up the search area below the CustomerName element + the height of the next line ("FEEE") would probably also help improve the result.
Please note that the standard FlexiLayout distribution includes a sample named invoices, which describes the logic of accurate address searching for a typical invoice. This sample is also described in FlexiLayout Studio Help, section Tutorial->Sample 3. Please look into the documentation, as many useful tips can be found there.
If you would like forum users to assist you with your FlexiLayout, please at least provide a real image (or a couple of images; any sensitive information can be replaced by some "fake" text) and specify what information remains unchanged across real documents and can be used as a reference to extract data. It would also be good to know which FlexiCapture version you are currently using.
I am trying to capture the address and the customer name. Nothing is static, and the name can be 2 lines long, so I am using a paragraph for the name as well. That means I should get the address first, and then anything above the address is the customer name. I found the zip code but can't get the 2 lines closest to it. I also tried capturing the street number, but the customer name may contain a number as well. The one thing I don't understand is why it is not getting the 2 lines closest to the zip code when that's what I specified, if it's not a bug.
No, it's not a bug. It seems that FlexiLayout Studio has trouble capturing the CustomerName element (e.g. the relations you set are not enough to identify it reliably), and this affects the CustomerAddress search quality.
I set up a flexible description with your sample image without creating a CustomerName element, only kwZip, and FLS successfully captured the two lines nearest to the ZIP code into CustomerAddress (see the attachment). Please note that I additionally excluded the ZIP-code region from the CustomerAddress search area.
So you can try setting the CustomerAddress parameters without reference to CustomerName. If you want to get CustomerName too, then please try setting it up more carefully. Again, looking into the standard samples may help you build the FlexiLayout properly (e.g., there you can see how white gaps and separators are used to find text parts in an unstructured invoice).
But let's say I don't want the customer name, just the address. Then I think it's a bug that it doesn't capture the 2 closest lines unless the zip code is omitted.
Again, it isn't. The line that contains an element on its right side is the nearest line to that element, isn't it? If you would like to exclude the ZIP code from the search, you could either do what I suggested before, or additionally specify that you are looking for the address to the left of the ZIP code (i.e. add a Left of relation).
From your previous explanations I understood that you search for the ZIP code separately; sorry if I got that wrong. Of course you can capture the ZIP code within the address: just remove the corresponding search constraint. In my test project I specified kwZip in the "Exclude regions of elements" field on the Search Constraints tab, so if I remove this restriction, the ZIP code is included in the CustomerAddress element.
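To illustrate the "nearest line" point outside of FlexiLayout Studio, here is a rough Python sketch. It is only a conceptual model and not FlexiLayout code or a description of how FlexiCapture works internally. The regular expression is a plain-regex stand-in for the ZIP+4 pattern from the question, and the function name `nearest_lines`, the `exclude_zip` flag, and the sample address are all made up for the example. It only looks at the ZIP line and the lines above it.

```python
import re

# Plain-regex stand-in for the ZIP+4 pattern from the question (an assumption).
ZIP_RE = re.compile(r"\b\d{5}-\d{4}\b")


def nearest_lines(lines, count=2, exclude_zip=False):
    """Return the `count` lines nearest to the first ZIP+4 code.

    The line that contains the ZIP code is itself the nearest line,
    so it always takes up one of the `count` slots. With
    `exclude_zip=True` only the ZIP text is stripped from that line.
    """
    zip_idx = next(
        (i for i, line in enumerate(lines) if ZIP_RE.search(line)), None
    )
    if zip_idx is None:
        return []  # no ZIP code found, nothing to anchor on
    start = max(0, zip_idx + 1 - count)
    picked = lines[start:zip_idx + 1]  # slice is a copy, safe to modify
    if exclude_zip:
        picked[-1] = ZIP_RE.sub("", picked[-1]).rstrip()
    return picked


# Hypothetical text lines, top to bottom; not taken from the real document.
sample = [
    "John Q. Public",
    "Apt 12",
    "123 Main Street",
    "Springfield, IL 62704-1234",
]

print(nearest_lines(sample))
print(nearest_lines(sample, exclude_zip=True))
```

In this model the first call returns the street line plus the full ZIP line, and the second call returns the same two lines with the ZIP text removed from the last one, which mirrors the difference between leaving kwZip out of "Exclude regions of elements" and adding it there.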