Question
How can I extract pictures from a document?
Answer
You can save every picture from a document as a separate file. To do so, export the document to HTML format. The pictures are written as "export-name-1.jpg", "export-name-2.jpg", ... "export-name-n.jpg", where export-name is the name of your HTML export file and 1, 2, ... n is the picture number.
The resulting HTML file "export-name.html" is not needed afterwards and can be deleted.
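Once the export has finished, the pictures can be picked up from the export folder by their names. The following is a minimal sketch in Python, not part of the FineReader Engine API: the function name collect_exported_pictures and its parameters are hypothetical, and it only assumes the "export-name-N" naming pattern described above. The "png" extension is included because, as the comment below reports, some pictures may be exported as .png.

```python
import re
from pathlib import Path


def collect_exported_pictures(export_dir, export_name,
                              extensions=("jpg", "png"), remove_html=False):
    """Return pictures written next to an HTML export, ordered by picture number.

    Hypothetical helper: it only inspects file names on disk and does not
    call the recognition engine.
    """
    export_dir = Path(export_dir)
    # Matches "<export-name>-<number>.<ext>", e.g. "export-name-12.jpg".
    ext_group = "|".join(re.escape(ext) for ext in extensions)
    pattern = re.compile(
        rf"^{re.escape(export_name)}-(\d+)\.({ext_group})$", re.IGNORECASE
    )

    numbered = []
    for path in export_dir.iterdir():
        match = pattern.match(path.name)
        if path.is_file() and match:
            numbered.append((int(match.group(1)), path))

    # Sort numerically so that picture 10 comes after picture 2.
    numbered.sort(key=lambda item: item[0])

    if remove_html:
        # The HTML file itself is not needed once the pictures are on disk.
        html_file = export_dir / f"{export_name}.html"
        if html_file.exists():
            html_file.unlink()

    return [path for _, path in numbered]
```

For example, collect_exported_pictures("C:/out", "export-name", remove_html=True) would return the picture paths in picture order and delete "export-name.html" from that folder, if it exists.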
To speed up the processing:
If you only need to extract the pictures and want to skip recognition, note that
FRDocument.Process() is equivalent to calling the following methods in sequence:
document.Preprocess();
document.Analyze();
document.Recognize();
document.Synthesize();
Because you do not need the recognition results, you can replace the FRDocument.Process() call in your code with the following calls:
document.Preprocess();
document.Analyze();
document.Synthesize();
This skips recognition, which is one of the most time-consuming steps. Note, however, that the document analysis stage cannot be omitted, because this is the step in which the Engine determines where the pictures are located in the document.
To adjust picture format/filesize:
Use the HTMLExportParams.PictureExportParams object.
Comments
1 comment
S Lieberam
I am using the export to HTML function in order to extract pictures in FineReader for Windows. Most of the pictures are exported in .jpg format, which is okay. Some pictures are exported in .png format in very poor quality. How can I force the export to be in .jpg?