
Parse image from pdf

Written by Permanently deleted user

May 02, 2017 14:54

Hello, I convert a PDF to XML using the processImage method, and that works fine. But the PDF contains images. How can I get these images?


Comments


Permanently deleted user

May 11, 2017 12:09
In the XML export format, the OCR result is presented as a hierarchy: document > page > block > region, and so on. The block tag has a blockType attribute that denotes the type of the block: Text, Table, Picture, Barcode, Separator, or SeparatorsBox.

For blocks of the Picture type, only the coordinates of the regions are included; the pictures themselves are not saved. The XML document is described with the help of the following XML schema.
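Since only the coordinates are exported, one way to locate the pictures is to scan the XML for blocks whose blockType is Picture and read their bounding boxes. Below is a minimal Python sketch of that idea, not an official sample. It assumes that each block carries its rectangle in `l`, `t`, `r`, `b` attributes (left, top, right, bottom); check the schema for the exact attribute names. The function name `picture_boxes`, the sample document, and its namespace URL are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the document > page > block shape described above.
# The namespace URL and the l/t/r/b attribute names are assumptions.
SAMPLE = """<document xmlns="http://example.com/ocr-schema">
  <page width="1000" height="1400">
    <block blockType="Text" l="50" t="40" r="900" b="200"/>
    <block blockType="Picture" l="100" t="250" r="500" b="650"/>
  </page>
  <page width="1000" height="1400">
    <block blockType="Picture" l="10" t="20" r="30" b="40"/>
  </page>
</document>"""


def local_name(tag):
    """Strip a '{namespace}' prefix so matching works with any schema version."""
    return tag.rsplit("}", 1)[-1]


def picture_boxes(xml_text):
    """Return (page_index, left, top, right, bottom) for every Picture block."""
    root = ET.fromstring(xml_text)
    boxes = []
    page_index = -1
    # iter() walks the tree in document order, so each page element is
    # visited before the blocks it contains.
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "page":
            page_index += 1
        elif name == "block" and elem.get("blockType") == "Picture":
            boxes.append((
                page_index,
                int(elem.get("l")),
                int(elem.get("t")),
                int(elem.get("r")),
                int(elem.get("b")),
            ))
    return boxes


if __name__ == "__main__":
    for box in picture_boxes(SAMPLE):
        print(box)
```

With the rectangles in hand, the pictures could then be cropped from a rendering of the corresponding PDF page using an imaging library of your choice. The coordinates may need to be scaled if the page is rendered at a different resolution than the one used for recognition.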
