Blurry OCR leftovers in MRC-compressed PDF


I'm not sure this is the right place to ask about a non-Server edition of FineReader, but if anyone has advice on what must be a common issue that the help pages don't seem to cover, I would be very grateful.

When MRC compression is used, is it inevitable that character parts the OCR failed to pick up come out blurred in the saved searchable PDF? I attached an example: it can be diacritics, punctuation, or specks of pixels left behind. There is always something the OCR will not pick up. I would not be concerned with such details as long as the visual output were not affected.

I have tried various compression settings, but unless I skip MRC entirely, the blurry bits seem to be there to stay. To be completely clear, when the image is viewed in FineReader itself, there is no blur and the contrast is perfectly sharp.

It may seem a fair trade-off for the huge drop in file size, but I do wonder whether the quality of the original images really has to be affected by the OCR layer.

If there is a known fix for this, I would hugely appreciate hearing about it.


Edit: thank you for the reply with links, that's helpful. I have also found a solution in the meantime, which is to convert images to black-and-white before launching OCR.
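
For anyone who wants to script that pre-processing step outside FineReader, here is a minimal sketch of what a black-and-white conversion amounts to. It is only an illustration: the function name `binarize`, the fixed threshold of 128, and the toy pixel values are all assumptions, and FineReader's own conversion may well work differently (for example with an adaptive threshold).

```python
def binarize(gray_rows, threshold=128):
    """Map 0-255 grayscale values to pure black (0) or pure white (255).

    `threshold` is an assumed fixed cut-off, not a FineReader setting.
    """
    return [
        [255 if value >= threshold else 0 for value in row]
        for row in gray_rows
    ]


# A tiny 2x4 "scan": dark ink, mid-grey specks, and light paper.
scan = [
    [12, 120, 200, 250],
    [30, 140, 180, 90],
]

bw = binarize(scan)
for row in bw:
    print(row)
```

After this step every pixel is either ink or paper, so there are no intermediate grey values left for a lossy compression step to smear. On real scans an image library would do the same thing on the actual files, and the threshold would need tuning per document.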


