Hello, I convert a PDF to XML using the processImage method, and that works fine. However, the PDF contains images. How can I get these images?
Parse image from pdf
Comments
1 comment
In the XML export format, the OCR result is presented as a hierarchy: document > page > block > region, and so on. The block tag has a blockType attribute that denotes the type of the block: Text, Table, Picture, Barcode, Separator, or SeparatorsBox.
For blocks of the Picture type, only the coordinates of the regions are included; the pictures themselves are not saved. The XML document is described by the following XML schema.
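
Since only the coordinates are exported, one way to work with them is to collect the bounding box of every Picture block from the XML. The snippet below is a minimal sketch, not an official example. It assumes that each block tag carries its bounding box in attributes named l, t, r and b (left, top, right, bottom); check the XML schema for the actual names. The function names and the sample XML, including its namespace, are made up for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def picture_regions(xml_text):
    """Return (page_index, left, top, right, bottom) for every Picture block."""
    root = ET.fromstring(xml_text)
    regions = []
    pages = [el for el in root.iter() if local_name(el.tag) == "page"]
    for page_index, page in enumerate(pages):
        for el in page.iter():
            if local_name(el.tag) == "block" and el.get("blockType") == "Picture":
                # Assumption: the bounding box is stored in l/t/r/b attributes.
                box = tuple(int(el.get(name)) for name in ("l", "t", "r", "b"))
                regions.append((page_index, *box))
    return regions


# Hypothetical sample; real output uses the namespace defined by the schema.
SAMPLE = """<document xmlns="http://example.com/ocr-schema">
  <page width="1000" height="1400" resolution="300">
    <block blockType="Text" l="50" t="40" r="950" b="200"/>
    <block blockType="Picture" l="100" t="250" r="600" b="700"/>
  </page>
  <page width="1000" height="1400" resolution="300">
    <block blockType="Picture" l="10" t="20" r="30" b="40"/>
  </page>
</document>"""

print(picture_regions(SAMPLE))
```

To obtain the images themselves, you would still need to render each PDF page to a bitmap with a tool of your choice and crop it to these rectangles. That step is not shown here, and it presumably requires rendering at the same resolution that the coordinates refer to.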