
Signature Extraction

Hi,

I want to extract a signature from a PDF. Please help me achieve this.

 

Thanks,


Comments

5 comments

  • Helen Osetrova

    Hello!

     

    Could you please tell us more about your processing scenario? What kind of signatures do you need to extract: handwritten, digital or something else? It would be great if you attach a sample document to your message.

  • Permanently deleted user

    Hi Helen,

    I need to extract a handwritten signature. Is this possible with ABBYY?

     

    Thanks,

    Vidhya.

  • Helen Osetrova

    Hi Vidhya,

     

     

    ABBYY OCR technologies can recognize handwritten text only when the characters are separated and written individually (so-called "handprinted text"). The letters in a signature are usually distorted and joined together, so they cannot be recognized, and signatures appear in the result file as pictures.

     

    Hope this answers your query.

     

  • Permanently deleted user

    Hello,

     

    Can I extract an image from a PDF document using the OCR SDK API?

  • Helen Osetrova

    Hello Ibrahim, 

     

    To extract only the images from your document, follow these steps:

    • perform document preprocessing and document analysis;
    • obtain the actual page layout as an ILayout object;
    • check its blocks one by one and remove those whose type is not BT_RasterPicture.

     

    Please find below the code snippet that illustrates the suggested approach: 

     

     

    // Java code sample
    // We assume that FREngine is loaded and document analysis is performed

    IFRPages frPages = document.getPages();
    int pagesCount = frPages.getCount();

    for (int j = 0; j < pagesCount; j++) {

        IFRPage page = frPages.getElement(j);
        ILayout pageLayout = page.getLayout();
        ILayoutBlocks blocks = pageLayout.getBlocks();
        int blocksCount = blocks.getCount();
        int i = 0;

        while (i < blocksCount) {

            IBlock block = blocks.getElement(i);
            displayMessage("Checking blocks");
            BlockTypeEnum blockType = block.getType();

            if (blockType != BlockTypeEnum.BT_RasterPicture) {

                displayMessage("Delete block");
                blocks.DeleteAt(i);
                blocksCount = blocks.getCount();
                continue;

            }

            i++;
        } // iterating blocks

        displayMessage("Recognize page...");
        page.Recognize(null, null);

    } // iterating pages

    ...
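
    One detail of the loop above is easy to miss: after a block is deleted, the index is not advanced, because the following blocks shift down by one position. The sketch below shows the same pattern on a plain Java list so it can be run without the engine. The PictureBlockFilter class, its BlockType enum and the keepOnlyPictures method are hypothetical stand-ins for illustration only and are not part of FineReader Engine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PictureBlockFilter {

    // Hypothetical stand-in for the engine's block types (not the real BlockTypeEnum).
    public enum BlockType { TEXT, TABLE, RASTER_PICTURE }

    // Removes every element that is not a raster picture, modifying the list in place.
    public static void keepOnlyPictures(List<BlockType> blocks) {
        int i = 0;
        while (i < blocks.size()) {
            if (blocks.get(i) != BlockType.RASTER_PICTURE) {
                // Later elements shift down by one, so the same index
                // must be checked again: do not advance i here.
                blocks.remove(i);
                continue;
            }
            i++;
        }
    }

    public static void main(String[] args) {
        List<BlockType> blocks = new ArrayList<>(Arrays.asList(
                BlockType.TEXT, BlockType.TEXT,
                BlockType.RASTER_PICTURE, BlockType.TABLE));
        keepOnlyPictures(blocks);
        System.out.println(blocks); // prints [RASTER_PICTURE]
    }
}
```

    If the index were incremented after every deletion, the second of two adjacent non-picture blocks would be skipped and left in the layout.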

     

    Please also see the topic on table extraction, where the same approach is used: https://forum.ocrsdk.com/thread/tables-only-from-pdf/. You can find more details on the different stages of document processing in FineReader Engine in the Developer's Help → Guided Tour → Advanced Techniques → Tuning Parameters of Page Preprocessing, Analysis, Recognition, and Synthesis article.

     

    I hope this information is helpful!

     

