An image of one phonebook page column is here: http://digitalfire.com/culiacan/pictures/309.jpg. It was scanned at 600 dpi and then resized in Photoshop to 300 dpi (without resampling). We are passing the parameter to read Spanish.
The recognizer is not getting the phone numbers right on the last 30 or 40 lines: they are chopped off on the right, and more digits are missing the nearer a number is to the bottom. We are also seeing frequent errors (in the 60 or so OCRs we have tested so far) where it reads '-0' as '4)' (for example, 7494)211 instead of 749-0211), 'LI' as 'U', and '96' as '%'. The same image reads with far fewer errors using another recognition service. The recognizer is also failing to interpret the period-tab as a tab.
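For reference, here is a minimal sketch of the kind of stopgap cleanup that could undo two of the substitutions described above. It assumes every phone number has the form NNN-NNNN and that it is applied only to the phone-number column; the function name `repair_phone` is hypothetical. The 'LI' to 'U' confusion is left out because it cannot be reversed safely in names.

```python
import re

# Hypothetical stopgap, assuming phone numbers always look like NNN-NNNN.
# Matches three digits, the misread "4)", then three more digits.
_DASH_ZERO = re.compile(r"\b(\d{3})4\)(\d{3})\b")


def repair_phone(token: str) -> str:
    """Restore '%' to '96' and '4)' to '-0' in a phone-number token."""
    # '%' is not expected in a phone number, so treat it as a misread '96'.
    token = token.replace("%", "96")
    # Rebuild NNN-0NNN from the misread NNN4)NNN form.
    return _DASH_ZERO.sub(r"\1-0\2", token)


if __name__ == "__main__":
    print(repair_phone("7494)211"))
```

This only patches the output after the fact; it does nothing for the digits that are chopped off on the right, which is the bigger problem.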
Any ideas? Thanks.