I have documents that often contain two tables with no whitespace between them. It could also be viewed as one table whose rows don't all share the same columns. The FineReader SDK gets confused because it tries to treat this as a single table, and when I extract the data I can't tell where one row ends and the next begins. For example:
The first two rows are divided into 7 columns. The second set of rows is divided into 8 columns. It appears that the SDK is trying to treat it as one large table, adding invisible separators in the areas where the columns don't extend. Like this:
It's obvious why the engine would get confused by this. If I split the tables visually using Photoshop, they get parsed perfectly. Any tips on how to handle this situation? I could hardcode the number of columns per document type, but that seems messy and I'd like to keep it more generic.
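To make the question concrete, here is a rough sketch of doing the Photoshop split programmatically, without hardcoding column counts. It is not FineReader SDK code. It assumes the page is already binarized and deskewed (a grid of 0/1 pixels, 1 = dark) and that the tables have visible ruling lines. All function names, thresholds, and the demo image are hypothetical. The idea is to find the horizontal rules, read the vertical separator positions in each band between them, and cut wherever that layout changes from one band to the next.

```python
from typing import List, Sequence, Tuple

Image = Sequence[Sequence[int]]  # assumed input: 1 = dark pixel, 0 = background


def _runs(flags: Sequence[bool]) -> List[Tuple[int, int]]:
    """Collapse consecutive True flags into (start, end) runs, end exclusive."""
    runs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(flags)))
    return runs


def horizontal_rules(img: Image, min_fill: float = 0.9) -> List[Tuple[int, int]]:
    """Pixel rows that are almost entirely dark, grouped into rule lines."""
    width = len(img[0])
    return _runs([sum(row) >= min_fill * width for row in img])


def vertical_rules(img: Image, top: int, bottom: int,
                   min_fill: float = 0.9) -> List[Tuple[int, int]]:
    """Pixel columns that are almost entirely dark within rows top..bottom-1."""
    height = bottom - top
    flags = [sum(img[y][x] for y in range(top, bottom)) >= min_fill * height
             for x in range(len(img[0]))]
    return _runs(flags)


def _same_layout(a: List[int], b: List[int], tolerance: int) -> bool:
    """Two bands match if they have the same separators within a pixel tolerance."""
    return len(a) == len(b) and all(abs(x - y) <= tolerance for x, y in zip(a, b))


def split_points(img: Image, tolerance: int = 2) -> List[int]:
    """Y-coordinates of horizontal rules where the column layout changes."""
    rules = horizontal_rules(img)
    cuts, previous = [], None
    for (rule_start, rule_end), (next_start, _) in zip(rules, rules[1:]):
        # The band between two consecutive horizontal rules is one table row.
        layout = [s for s, _ in vertical_rules(img, rule_end, next_start)]
        if previous is not None and not _same_layout(previous, layout, tolerance):
            cuts.append((rule_start + rule_end) // 2)
        previous = layout
    return cuts


def make_demo_image() -> List[List[int]]:
    """Synthetic 41x21 grid: two rows with 5 separators, then two rows with 6."""
    top_xs, bottom_xs = {0, 10, 20, 30, 40}, {0, 8, 16, 24, 32, 40}
    img = []
    for y in range(21):
        if y % 5 == 0:
            img.append([1] * 41)  # horizontal rule
        else:
            xs = top_xs if y < 10 else bottom_xs
            img.append([1 if x in xs else 0 for x in range(41)])
    return img


if __name__ == "__main__":
    print(split_points(make_demo_image()))
```

Each returned y-coordinate is a place to crop the page image, so that every piece can be recognized as its own table. Tables without ruling lines would need a different signal, such as whitespace gaps between text columns.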