Export formats

ABBYY Cloud OCR SDK allows you to export recognized text in the following formats:

| Format | exportFormat parameter of the processing method | Comments |
|---|---|---|
| TXT | txt | |
| RTF | rtf | |
| DOCX | docx | |
| XLSX | xlsx | |
| PPTX | pptx | |
| PDF | pdfSearchable | The entire image is saved as a picture, with recognized text put under the image. |
| PDF | pdfTextAndImages | The recognized text is saved as text, and the pictures are embedded as images. |
| PDF/A-1b | pdfa | The file is saved in PDF/A-1b-compliant format, with the entire image saved as a picture and recognized text put under it. |
| XML | xml | All coordinates are saved relative to the original image. See Output XML Document for the description of tags. If you select this export format, barcodes are recognized on the image and saved to output XML no matter which profile is used for recognition. |
| XML | xmlForCorrectedImage | All coordinates are saved relative to the image after geometry correction. See Output XML Document for the description of tags. If you select this export format, barcodes are recognized on the image and saved to output XML no matter which profile is used for recognition. |
| ALTO | alto | |
| vCard | vCard | This format is only available with the processBusinessCard method. |
| CSV | csv | This format is only available with the processBusinessCard method. |

You can use any of these formats except vCard and CSV to export recognized text with the processImage and processDocument methods. These methods also let you specify up to three export formats in one task at no additional cost.
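
The rules above (no vCard or CSV, at most three formats per task) can be checked on the client side before a request is sent. The following is a minimal sketch, not an official client: the helper name `build_request_url` and the placeholder host `cloud.ocrsdk.example` are hypothetical, and the sketch assumes that several formats are passed as a comma-separated value of the exportFormat parameter. Check the method reference for the exact request syntax your service version expects.

```python
from urllib.parse import urlencode

# exportFormat values from the table above that processImage and
# processDocument accept.
GENERAL_FORMATS = {
    "txt", "rtf", "docx", "xlsx", "pptx",
    "pdfSearchable", "pdfTextAndImages", "pdfa",
    "xml", "xmlForCorrectedImage", "alto",
}
# Values available only with the processBusinessCard method.
BUSINESS_CARD_ONLY = {"vCard", "csv"}
MAX_FORMATS_PER_TASK = 3

# Assumption: placeholder host, replace it with your actual service URL.
BASE_URL = "https://cloud.ocrsdk.example"


def build_request_url(method, export_formats, base_url=BASE_URL):
    """Validate the export formats and return a request URL (hypothetical helper)."""
    if method not in ("processImage", "processDocument"):
        raise ValueError(f"unsupported method for this helper: {method}")
    if not export_formats:
        raise ValueError("at least one export format is required")
    if len(export_formats) > MAX_FORMATS_PER_TASK:
        raise ValueError(f"at most {MAX_FORMATS_PER_TASK} export formats per task")
    for fmt in export_formats:
        if fmt in BUSINESS_CARD_ONLY:
            raise ValueError(f"{fmt} is only available with processBusinessCard")
        if fmt not in GENERAL_FORMATS:
            raise ValueError(f"unknown export format: {fmt}")
    # Assumption: multiple formats are joined with commas in one parameter.
    query = urlencode({"exportFormat": ",".join(export_formats)}, safe=",")
    return f"{base_url}/{method}?{query}"


if __name__ == "__main__":
    print(build_request_url("processImage", ["txt", "pdfSearchable", "xml"]))
```

Validating locally keeps an invalid combination from consuming a request; the service remains the authority on which values are accepted.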

Note that the field processing methods always return recognition results in XML format, which contains extended information about the recognized characters.
