How To Edit OCR Text WITHOUT Re-recognizing Existing OCRed Text

So I am importing a bunch of pictures to make a searchable text PDF, and of course the OCR takes a while to process. When it is done, it opens the result so I can READ it, but I cannot edit the text. To edit it, I need to RE-recognize the already OCRed text...which is counterproductive.

I should also point out that if I just try to hit the Edit tab at the top, it warns me "This page contains a text layer under the image. Editing text on this page will change the text layer and make the edited fragment appear on the page".

Also, whenever I have an ALREADY OCRed document that just has ONE typo in the OCRed text, how can I import that and edit just that little blurb of text without RE-OCRing the whole thing?


Comments


  • Yuriy Korotkevych

    Hi!

    FineReader PDF does different kinds of processing when converting images of documents to a searchable PDF vs. when preparing an existing PDF for editing. Because PDF as a format is not editable, FineReader must prepare any PDF for editing, even if the PDF already contains text, and even if FineReader itself created it just seconds ago. This preparation uses OCR, but for a PDF that already contains text, OCR is used not to capture the text from the image but to analyze the page layout and then embed the edits into the document. You can read more about this in our blog article.

    FineReader always starts preparation for editing from the page you are currently on. So if you spot a typo on a page and click "Edit document", you can fix that typo on that page within a few seconds, without waiting for FineReader to finish preparing the other pages in the document. After that, just continue working with the document, for example by clicking another tab, or by saving and closing it.

    The warning you mentioned is only there to make users aware that changes made when editing a searchable PDF will affect the text layer as well, and may also affect the visual appearance of the document.

    To eliminate typos in the OCRed text in your particular case, given the workflow you described, I would recommend using the Verification tool in OCR Editor at the stage of converting the document images to a searchable PDF, before saving the results to the PDF. Here is the Help section about the tool.

    Hope it helps!

    Yuriy

     

     
  • Victoria Dvornikova

    Hello,

    Please try editing the text in the PDF Editor window as described in our online help: https://help.abbyy.com/en-us/finereader/16/user_guide/edittext/.

    If you have any questions related to a specific document, please create a support ticket and provide us with a description of the desired scenario and the file.
