Parse image from pdf

Hello, I convert a PDF to XML using the processImage method, and that works fine. However, the PDF contains images. How can I extract these images?

Comments

  • Oksana Serdyuk

    In the XML export format, the OCR result is presented as a hierarchy: document > page > block > region, and so on. The block tag has a blockType attribute that denotes the type of the block: Text, Table, Picture, Barcode, Separator, or SeparatorsBox.


    For blocks of the Picture type, only the coordinates of the regions are included; the pictures themselves are not saved. The XML document is described by the following XML schema.
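As a rough illustration of how the Picture coordinates could be read from the exported XML, here is a minimal Python sketch. It assumes that each block element carries its bounding box in `l`, `t`, `r`, `b` attributes; those attribute names, the sample document, and the helper names `local_name` and `find_picture_blocks` are assumptions made for this example, so check them against the XML schema of your actual output.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of the exported XML. The l/t/r/b coordinate
# attributes are an assumption; verify them against the real schema.
SAMPLE_XML = """<document>
  <page width="1000" height="1400">
    <block blockType="Text" l="50" t="40" r="950" b="200"/>
    <block blockType="Picture" l="100" t="250" r="600" b="700"/>
  </page>
  <page width="1000" height="1400">
    <block blockType="Picture" l="20" t="30" r="420" b="330"/>
  </page>
</document>"""


def local_name(tag):
    """Drop an XML namespace, e.g. '{uri}block' -> 'block'."""
    return tag.rsplit("}", 1)[-1]


def find_picture_blocks(xml_text):
    """Return (page_index, (l, t, r, b)) for every Picture block."""
    root = ET.fromstring(xml_text)
    pages = [el for el in root.iter() if local_name(el.tag) == "page"]
    results = []
    for page_index, page in enumerate(pages):
        for el in page.iter():
            if local_name(el.tag) != "block":
                continue
            if el.get("blockType") != "Picture":
                continue
            coords = [el.get(key) for key in ("l", "t", "r", "b")]
            if any(value is None for value in coords):
                continue  # skip blocks without a complete bounding box
            results.append((page_index, tuple(int(v) for v in coords)))
    return results


pictures = find_picture_blocks(SAMPLE_XML)
for page_index, box in pictures:
    print(page_index, box)
```

Since the XML holds only the coordinates, the resulting rectangles would still have to be applied to a page image obtained separately (for example, by rendering the PDF page and cropping it), provided the coordinate units match.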
