I'm using Cloud OCR to read some files in PDF format. It works fine, but I have a small problem.
These PDFs always have their information in the same positions when we open them. However, after they are processed by Cloud OCR, the positions in the TXT output are not the same.
Even when I receive two very similar PDFs whose fields are in the same position, those fields end up in different positions in the TXT or XML output.
Is there any way to get TXT or XML output that gives the correct, exact row/column information for the fields?
This is an image of the first PDF file and the TXT returned by Cloud OCR. The field marked with the red box is returned at line 3, position 369.
This is an image of the second PDF file and the TXT returned by Cloud OCR. The field marked with the red box is returned at line 4, position 260.
When looking at the PDF files, the position of the field is the same; in the returned TXT, however, it is not. Why does this happen? Would it be possible to always return the field at the same position?