ABBYY Cloud OCR SDK allows you to export recognized text in the following formats:
Format | exportFormat parameter of the processing method | Comments |
---|---|---|
TXT | txt | |
RTF | rtf | |
DOCX | docx | |
XLSX | xlsx | |
PPTX | pptx | |
PDF | pdfSearchable | The entire image is saved as a picture, with recognized text put under the image. |
PDF | pdfTextAndImages | The recognized text is saved as text, and the pictures are embedded as images. |
PDF/A-1b | pdfa | The file is saved in PDF/A-1b-compliant format, with the entire image saved as a picture and recognized text put under it. |
XML | xml | All coordinates are saved relative to the original image. See Output XML Document for the description of tags. If you select this export format, barcodes are recognized on the image and saved to output XML no matter which profile is used for recognition. |
XML | xmlForCorrectedImage | All coordinates are saved relative to the image after geometry correction. See Output XML Document for the description of tags. If you select this export format, barcodes are recognized on the image and saved to output XML no matter which profile is used for recognition. |
ALTO | alto | |
vCard | vCard | This format is only available with the processBusinessCard method. |
CSV | csv | This format is only available with the processBusinessCard method. |
You can use any of these formats except vCard and CSV to export recognized text with the processImage and processDocument methods. These methods also let you specify up to three export formats in one task at no additional cost.
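The rules above (any format except vCard and CSV, at most three formats per task) can be checked on the client side before a request is sent. The following is a minimal sketch, not part of the SDK: `build_export_query` is a hypothetical helper, and joining several formats with commas in a single `exportFormat` value is an assumption that should be confirmed against the method reference.

```python
from urllib.parse import urlencode

# exportFormat values taken from the table above, excluding the
# formats that are only available with processBusinessCard.
GENERAL_FORMATS = frozenset({
    "txt", "rtf", "docx", "xlsx", "pptx",
    "pdfSearchable", "pdfTextAndImages", "pdfa",
    "xml", "xmlForCorrectedImage", "alto",
})
BUSINESS_CARD_ONLY = frozenset({"vCard", "csv"})
GENERAL_METHODS = frozenset({"processImage", "processDocument"})
MAX_FORMATS_PER_TASK = 3


def build_export_query(method, formats):
    """Hypothetical helper: validate formats and return a query-string fragment."""
    if method not in GENERAL_METHODS:
        raise ValueError(f"this sketch only covers {sorted(GENERAL_METHODS)}")
    formats = list(formats)
    if not 1 <= len(formats) <= MAX_FORMATS_PER_TASK:
        raise ValueError("specify between one and three export formats")
    for fmt in formats:
        if fmt in BUSINESS_CARD_ONLY:
            raise ValueError(f"{fmt} is only available with processBusinessCard")
        if fmt not in GENERAL_FORMATS:
            raise ValueError(f"unknown export format: {fmt}")
    # ASSUMPTION: multiple formats are passed as one comma-separated value.
    return urlencode({"exportFormat": ",".join(formats)}, safe=",")
```

For example, `build_export_query("processImage", ["txt", "pdfSearchable", "xml"])` returns `exportFormat=txt,pdfSearchable,xml`, while passing `vCard` or more than three formats raises `ValueError`.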
Note that the field processing methods always return recognition results in XML format, which contains extended information about the recognized characters.