Does cloud OCR process both outline fonts and general fonts in PDF?
At present, the XML stream returned by the cloud OCR service API does not contain font information. Is there a way to extract font information using the cloud OCR service?
Thanks, - Shailesh
Comments
Hi Shailesh,
The service can certainly process 'general' fonts, regardless of the source image format.
Outline fonts are generally not recognized without a 'user pattern training' process, which is not available in the service.
A PDF file can itself contain either images or commands that draw the page image. In the latter case, the textual information is present in the PDF file and can be extracted without applying OCR. To some extent this feature is supported in the offline OCR SDK, but it is not available in the service.
Best regards, Dmitry.
Hi Shailesh!
We are happy to inform you that the requested functionality has recently been implemented. It is now possible to get information about paragraph and character styles in the XML export format. To do so, set the xml:writeFormatting parameter of the processImage or processDocument method to true (it is false by default).
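As a rough illustration of the parameter described above, here is a minimal Python sketch that only builds a request URL with xml:writeFormatting set to true; it does not send anything. The host `https://ocr.example.com`, the `exportFormat` parameter name, and the helper `build_process_image_url` are assumptions made for this example and are not taken from this thread, so please check them against the service documentation.

```python
from urllib.parse import urlencode


def build_process_image_url(base_url, write_formatting=True, export_format="xml"):
    """Build a processImage URL that asks for formatting info in the XML output.

    base_url and the exportFormat parameter name are assumptions for this
    sketch; only xml:writeFormatting and processImage come from the thread.
    """
    params = {
        "exportFormat": export_format,  # assumed parameter name
        "xml:writeFormatting": "true" if write_formatting else "false",
    }
    # safe=":" keeps the colon in "xml:writeFormatting" readable in the URL.
    query = urlencode(params, safe=":")
    return f"{base_url.rstrip('/')}/processImage?{query}"


if __name__ == "__main__":
    # Hypothetical host, used only to show the resulting URL shape.
    print(build_process_image_url("https://ocr.example.com"))
```

The same query string should apply to processDocument, since the reply names both methods; authentication and uploading the image are left out here on purpose.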