Hi,
I'm working with the processFields method, and everything works fine with JPG (coordinate-wise). JPGs don't handle multiple pages, though, so I'd like to use PDF. When I feed the PDF the same coordinates I used for the JPG version, it doesn't work: the values come back empty in the output XML. Is there something special I have to do to get PDF coordinates, since PDF is sort of vectorish and the result will depend on the dpi your engine is using?
What do you recommend?
Thanks, Adam
Comments
Please provide samples of the JPG and PDF files.
http://wevito.com/ocr/ocrtestscanned.jpg
http://wevito.com/ocr/ocrtestscanned.pdf
Could you please provide the fields settings XML and the request parameters (language, etc) too?
Here is the field settings XML; the parameters are in there too:
http://wevito.com/ocr/ocrparams2.xml
Thanks, Adam
When the actual resolution is not defined in a file, it is internally set to 300 dpi. This is exactly why no data is extracted: the JPEG file has its resolution set to 150 dpi, while the PDF apparently has no resolution defined and is processed at 300 dpi, so coordinates measured against the JPEG file do not match the PDF file.
The solution would be to fix the software that was used to create the PDF files, so that it carries over the correct DPI settings from the JPEG files.
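If regenerating the PDFs is not an option, one possible workaround is to rescale the field coordinates from the JPEG resolution (150 dpi) to the resolution assumed for the PDF (300 dpi). Below is a minimal sketch in Python, assuming the PDF page has the same physical size as the JPEG scan and that field coordinates are pixel rectangles given as (left, top, right, bottom). The function name scale_rect and the sample rectangle are hypothetical and not part of the OCR API.

```python
def scale_rect(rect, src_dpi, dst_dpi):
    """Scale a (left, top, right, bottom) pixel rectangle from src_dpi to dst_dpi."""
    factor = dst_dpi / src_dpi
    # Round to whole pixels, since field coordinates are integer pixel positions.
    return tuple(round(value * factor) for value in rect)


# Hypothetical field rectangle measured on the 150 dpi JPEG.
jpeg_rect = (100, 200, 400, 260)

# Same region expressed in pixels at the assumed 300 dpi of the PDF.
pdf_rect = scale_rect(jpeg_rect, src_dpi=150, dst_dpi=300)
print(pdf_rect)  # (200, 400, 800, 520)
```

With 150 dpi and 300 dpi the factor is simply 2, so every coordinate in the field settings XML would be doubled before submitting the PDF.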