Editing files during or after verification

Hello,

Is it possible to edit the recognized files during or after verification? For example: read a PDF -> put a specific stamp (date, company, signature) on the PDF -> export.

Thanks a lot

Comments

21 comments

  • Alexey Efremov

    Hello Lennart,

    The answer is yes, you can do it. You can access the IPictureObject of the page (its Picture property), get the HBITMAP handle of the image, change the image, and then replace it on the page:

    OLE_HANDLE handle = Document.Pages[0].Picture.Handle;
    System.Drawing.Bitmap bitmap = System.Drawing.Image.FromHbitmap( handle );
    //<do something with bitmap>
    IPictureObject FinalPicture = FCTools.PictureFromHBitmap( bitmap.GetHbitmap().ToInt32(), 300 );
    Document.Pages[0].ReplaceImage(FinalPicture);

    You can do this in the export script or create a custom document processing stage.

     

    Hope this helps.

    Alexey

     

  • Fritz

    1. After registration I want to come back to the discussion page I had opened before. Please fix that.

    2. Now the question. I want to edit the OCR text layer behind a PDF page without changing the visible document. I scan old Fraktur pages and would like to edit only the OCR results to make machine search easier, for example replacing the original long s (ſ) with a round s, e.g. buſt to bust. The original must remain untouched. How can I do that? Thank you.

  • Alexey Efremov

    Hello Fritz,

    Unfortunately, FlexiCapture does not have such capabilities by design.

    Alexey.

     

  • Fritz

    But I vaguely remember that ABBYY has some software that can do that? It’s a frequent problem. Fritz

  • Alexey Efremov

    The ABBYY policy is that the software functions as a "black box": you input an image and get text as output.

    The closest you can get to changing the text is to use the Remove(fromPos, toPos) and Insert(position, insertString, charParams) methods of the Paragraph object in ABBYY FineReader Engine. But these methods are not meant for wide use because of the processing time they need (negligible for one document, but noticeable at scale).

    We recommend using third-party tools for the task you describe.

    Alexey

  • Fritz

    See https://stackoverflow.com/questions/32914609/how-can-i-edit-the-search-text-of-a-searchable-pdf
       “I'm using ABBYY FineReader 12 Professional. (not open source). Just open a scanned image or scanned pdf and press Verify Text (or Ctrl + F7), then you go over all the spelling errors or low-confidence characters and fix them.
       The program is very good, it shows you the exact place in image/pdf to correct and the OCR guessing side by side for convenience. It iterates all of them.
       [By the way, I'm using the shortcuts to speed up things: Alt+Enter to add the unrecognized word to dictionary. Ctrl+Delete to skip word or confirm in case you fixed it.]
       Then save the document as a pdf file, Menu: File>Save Document As> PDF File, and you can search it on every pdf reader. The saved file looks the same as the scanned one, but 'behind' it there [is the corrected] text.
       It's weird you tried ABBYY with no luck... it's working great for me. Maybe you didn’t try the Professional version.”

    Might that work, Alexey? I do not have FineReader; I am just enquiring with great interest as a tech journalist, see www.Joern.De/Presseausweis (can you give me a free version to try? Fritz@Joern.De).

  • Alexey Efremov

    I thought you were asking about the business/industrial solutions.

    Yes, the desktop solutions have this capability. You can request a trial here:

    https://www.abbyy.com/en-eu/download/finereader/

     

  • Ola Thuresson

    Dear Alexey,

    I have managed to get the script working. I was missing some assemblies and .NET references, the OLE_HANDLE type was wrong, and there was also the issue of the bitmap being indexed. But the script won't work because it is being run as an export script?

    Assemblies referenced:

    System.Drawing.dll
    PresentationCore.dll

    using System.Drawing;
    using System.Drawing.Drawing2D;
    using System.Drawing.Text;
    using System.Windows.Media.Imaging;
    using System.Windows.Forms;

    int handle = Document.Pages[0].Picture.Handle; // the OLE_HANDLE value is read as an int
    IntPtr xAsIntPtr = new IntPtr(handle);
    System.Drawing.Bitmap bitmap = System.Drawing.Image.FromHbitmap( xAsIntPtr );

    // Copy the page image into a new, non-indexed bitmap so that it can be drawn on
    Bitmap newBitmap = new Bitmap(bitmap.Width, bitmap.Height);
    System.Drawing.Graphics g = Graphics.FromImage(newBitmap);
    g.DrawImage(bitmap, 0, 0);

    // Add fakturanr (the invoice number) to the bitmap
    RectangleF rectf = new RectangleF(70, 90, 90, 50);
    g.SmoothingMode = SmoothingMode.AntiAlias;
    g.InterpolationMode = InterpolationMode.HighQualityBicubic;
    g.PixelOffsetMode = PixelOffsetMode.HighQuality;
    g.TextRenderingHint = TextRenderingHint.AntiAliasGridFit;

    // Create string formatting options (used for alignment)
    StringFormat format = new StringFormat()
    {
        Alignment = StringAlignment.Far,
        LineAlignment = StringAlignment.Near
    };
    g.DrawString(fakturanr, new Font("Tahoma",16), Brushes.Black, rectf,format);
    g.Flush();

    newBitmap.Save(@"D:\Apps\Abbyy\Fakturor\GarpFakturor\bitamp.bmp");

    // Replace the old bitmap with the new bitmap
    IPictureObject FinalPicture = FCTools.PictureFromHBitmap( newBitmap.GetHbitmap().ToInt32(), 300 );
    Document.Pages[0].ReplaceImage(FinalPicture);

     

    Processing Server log:

    -1 10 6/21/2018 10:49:42 AM Document 1: European Invoice Export: System.Runtime.InteropServices.COMException (0x80004005): Cannot modify object data from this script.
       at ABBYY.FlexiCapture.IPage.ReplaceImage(IPictureObject _newPicture)
       at Main.AddTextToBitMap(String fakturanr, IDocument Document)
       at Main.Execute(IDocument Document, IProcessingCallback Processing)

    So I assume this won't work in export scripts?

     

    Regards,

    Ola

  • Alexey Efremov

    I will post our discussion from private messages here:

    Dear Alexey,

    You wrote this in the forum recently. I tried using it but could not get it to work.

    Hello Lennart,

    The answer is yes, you can do it. You can access the IPictureObject of the page (its Picture property), get the HBITMAP handle of the image, change the image, and then replace it on the page:

    OLE_HANDLE handle = Document.Pages[0].Picture.Handle;
    System.Drawing.Bitmap bitmap = System.Drawing.Image.FromHbitmap( handle );
    //<do something with bitmap>
    IPictureObject FinalPicture = FCTools.PictureFromHBitmap( bitmap.GetHbitmap().ToInt32(), 300 );
    Document.Pages[0].ReplaceImage(FinalPicture);

     

    My non-working code:

    AddTextToBitMap(fakturanr,Document);

    public static bool AddTextToBitMap(string fakturanr, IDocument Document)
    {
        try {
            // Get handle and bitmap
            int handle = Document.Pages[0].Picture.Handle; // int holds the OLE_HANDLE value; declaring it as "OLE_HANDLE" generates an error
            IntPtr xAsIntPtr = new IntPtr(handle);
            System.Drawing.Bitmap bitmap = System.Drawing.Image.FromHbitmap( xAsIntPtr );

            // Add fakturanr to the bitmap
            RectangleF rectf = new RectangleF(70, 90, 90, 50);
            System.Drawing.Graphics g = Graphics.FromImage(bitmap);
            //g.SmoothingMode = SmoothingMode.AntiAlias;
            //g.InterpolationMode = InterpolationMode.HighQualityBicubic;
            //g.PixelOffsetMode = PixelOffsetMode.HighQuality;

            // Create string formatting options (used for alignment)
            StringFormat format = new StringFormat()
            {
                Alignment = StringAlignment.Far,
                LineAlignment = StringAlignment.Near
            };
            g.DrawString(fakturanr, new Font("Tahoma",8), Brushes.Black, rectf,format);
            g.Flush();

            // Replace the old bitmap with the new bitmap
            IPictureObject FinalPicture = FCTools.PictureFromHBitmap( bitmap.GetHbitmap().ToInt32(), 300 );
            Document.Pages[0].ReplaceImage(FinalPicture);
            return true;
        }
        catch {}

        return false;
    }

    Regards,

    Ola

    ---------------------



    Dear Ola,

    Nice to hear from you. 

    Could you please create a new topic or add your reply to the topic mentioned? (This is a bureaucratic issue; otherwise, I cannot log the time.)

    Thanks in advance.

    My advice is to check whether System.Drawing was added to the project as a .NET assembly.

    My questions are:

    What error messages are you receiving?

    Is your machine 32-bit or 64-bit? (Please check the size of the IntPtr handle at runtime.)

    Is this FC12?

    Are you sure you are using the method in a script workflow stage? (Otherwise it will not work.)

    Could you please check the task log in your Processing Server Monitor?

     

     

     

    Regards,

    Alexey
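
On the 32-bit or 64-bit question above: the snippets in this thread pass bitmap handles through a 32-bit int (`new IntPtr(handle)`, `GetHbitmap().ToInt32()`). `IntPtr.ToInt32()` throws an OverflowException in a 64-bit process when the value does not fit, so a quick runtime check may help with diagnosis. A minimal sketch; `HandleCheck`, `ProcessBitness` and `FitsInInt32` are hypothetical helper names and not part of the FlexiCapture API:

```csharp
using System;

public static class HandleCheck
{
    // "64-bit" when IntPtr is 8 bytes wide in the current process, otherwise "32-bit".
    public static string ProcessBitness()
    {
        return IntPtr.Size == 8 ? "64-bit" : "32-bit";
    }

    // True when the handle value fits into a 32-bit int,
    // i.e. when handle.ToInt32() would not throw OverflowException.
    public static bool FitsInInt32(IntPtr handle)
    {
        long value = handle.ToInt64();
        return value >= int.MinValue && value <= int.MaxValue;
    }
}
```

Checking the IntPtr returned by `GetHbitmap()` with `HandleCheck.FitsInInt32` before calling `ToInt32()` on it would show whether the conversion is safe on a given machine.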



     


  • Alexey Efremov

    Dear Ola,

    Yes, that is correct: the IPage::Picture object is read-only everywhere except in workflow scripts.

    You can create a document processing script stage right before export and place your code here.
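
The drawing part of the code posted earlier in this thread does not depend on FlexiCapture, so it can be kept in a small helper that such a script stage calls. A minimal sketch, assuming System.Drawing is referenced; `PageStamp` and `DrawStamp` are hypothetical names, and the result would still have to be handed to `FCTools.PictureFromHBitmap` and `ReplaceImage` as in the posts above:

```csharp
using System;
using System.Drawing;
using System.Drawing.Drawing2D;
using System.Drawing.Imaging;
using System.Drawing.Text;

public static class PageStamp
{
    // Returns a new 32-bit copy of `source` with `text` drawn inside `area`.
    // The source bitmap itself is left unchanged.
    public static Bitmap DrawStamp(Bitmap source, string text, RectangleF area)
    {
        // Graphics.FromImage throws for indexed pixel formats,
        // so the page image is copied onto a fresh non-indexed bitmap first.
        var result = new Bitmap(source.Width, source.Height, PixelFormat.Format32bppArgb);
        result.SetResolution(source.HorizontalResolution, source.VerticalResolution);

        using (var g = Graphics.FromImage(result))
        using (var font = new Font("Tahoma", 16))
        using (var format = new StringFormat())
        {
            // Right-aligned, top-aligned text, as in the code posted in the thread
            format.Alignment = StringAlignment.Far;
            format.LineAlignment = StringAlignment.Near;

            // Copy the original pixels one-to-one, then draw the text on top
            g.DrawImage(source, 0, 0, source.Width, source.Height);
            g.SmoothingMode = SmoothingMode.AntiAlias;
            g.TextRenderingHint = TextRenderingHint.AntiAliasGridFit;
            g.DrawString(text, font, Brushes.Black, area, format);
        }
        return result;
    }
}
```

Note that `GetHbitmap()` creates a GDI bitmap object that the caller is responsible for freeing with the Win32 `DeleteObject` function once nothing uses the handle any more.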

     

     

  • Ola Thuresson

    Dear Alexey,

     

    Thanks for confirming this. I assume, then, that there is no way to modify the PDF or add text and/or a picture to it during export. The problem is that I fetch the GL voucher number from the ERP during export. The invoice is then printed as a hard copy, but without any indication of which voucher number it belongs to. Is it possible to open the IPage::Picture object for reading and writing in the export script stage by means of a system parameter? Is there a workaround within the export script stage?

    I could always use iTextSharp to modify the PDF after it has been saved, but I'm not too keen on using an external assembly for this.

     

    Regards

    Ola

  • Alexey Efremov

    Dear Ola,

    Unfortunately, I do not entirely understand your situation. Could you please let me know why the solution

    "You can create a document/batch processing script stage right before export stage and place your code here."

    does not work for you?

    You can place all the code of your custom export in that stage if you want.

    For instructions on how to create a script processing stage, please see the Note in the article "Creating processing stages" of the Developer's Help.

    Yours sincerely,

    Alexey

     

     

     

  • Ola Thuresson

    Dear Alexey,

     

    You are right, of course: I can use a workflow script stage right before the export. I just didn't understand it at first. I can actually place all exports there, and I don't have to save the document definition over and over again while testing.

    Thank you for the guidance. I now think I can make the export work.

     

    Regards,

    Ola

  • Ola Thuresson

    Dear Alexey,

    My export script works brilliantly, but...

    Everything works except the last stage in the workflow. After I have replaced the picture object, the document suddenly reaches the Export stage as an unprocessed, non-analyzed document:

    -1 2 7/3/2018 2:02:05 PM Document 1: Unable to export a non-analyzed document

    Stage 1: Verification - check

    Stage 2: Custom export script stage - check

    Stage 3: ABBYY standard Export stage - here ABBYY throws the exception

    Stage 4: Training -

     

    Regards,

    Ola

  • Alexey Efremov

    Hello Ola,

    I have consulted the developers, and it turns out that after calling

    Document.Pages[0].ReplaceImage(FinalPicture);

    the document always becomes unrecognized, and you have to run recognition again. (Note that the license counter will be decreased again.)

    Because I do not know your export in detail, I cannot suggest a proper workaround.

    The one I see is to run the process twice, which is most likely not suitable for you.

    You can also customize it to save the data before the image replacement and restore the data after it.

     

    Yours sincerely,

    Alexey

     

     

     

     

  • Ola Thuresson

    Dear Alexey,

    Thank you for the answer. I worked around it by excluding the ABBYY standard export and using only the custom export.

    The training seems to work OK anyway, with no errors, but does ABBYY learn anything?

    regards,

    Ola

  • Alexey Efremov

    Dear Ola,

    May I suggest placing your custom export after the training? Will this work for you?

    Regarding your question: I have to ask the developers, but probably not.

    Yours sincerely,

    Alexey

     

     

  • Ola Thuresson

    Dear Alexey,

    Sure, that would work.

    Out of curiosity, it would be nice to know :)

    Regards,

    Ola

     

  • Alexey Efremov

    Dear Ola,

     

    Could you please check the log for the Training task in the Processing Server Monitor?

    What does it say?

     

    Yours sincerely,

    Alexey

  • Ola Thuresson

    Dear Alexey,

    Log with the export script placed after training:

    -1 1 7/9/2018 3:58:26 PM Task processing is started

    -1 2 7/9/2018 3:58:26 PM Document 1: Trying to use the document for training...

    -1 3 7/9/2018 3:58:27 PM Document 1: Field training is not available. Document layout has not been modified.

    -1 4 7/9/2018 3:58:27 PM Task processing is completed

    Log with the export script placed before training:

    -1 1 7/9/2018 4:07:24 PM Task processing is started

    -1 2 7/9/2018 4:07:25 PM Document 1: Trying to use the document for training...

    -1 3 7/9/2018 4:07:25 PM Document 1: Field training is not available. There is no Document Definition to train.

    -1 4 7/9/2018 4:07:26 PM Task processing is completed

    Regards,

    Ola

  • Alexey Efremov

    Dear Ola,

    With the export script placed after training, everything works; you just have to move the regions of the fields for training to take place.

    With the export script placed before training, the training stage actually fails. By default, the properties of the training stage in the workflow define only one exit route, to Processed. There is no route to Exceptions, so the error with the document is ignored.

    Kind regards,

    Alexey
