How to work with Cloud OCR SDK

Processing of any image with Cloud OCR SDK includes the following main steps:

If you have not yet registered with ABBYY Cloud OCR SDK, please follow this link to register. During registration you will receive an Application ID and Application Password, which must be passed to the processing server with each request you send.
Select the processing method to be used:
- To recognize a document, you can use either the processImage method or the pair of submitImage and processDocument methods. processImage starts processing the uploaded image immediately with the selected parameters. The submitImage and processDocument pair lets you assemble a document from several images and obtain the recognition results in a single file.
- To recognize a business card, use the processBusinessCard method.
- To recognize a small text fragment, barcode field, or checkmark field, use the processTextField, processBarcodeField, or processCheckmarkField method, respectively. You can also use the processFields method to process several fields at a time.
We'll use the processImage method as an example.
Specify the necessary parameters of the selected method. For example, for the processImage method you can specify the recognition language, a suitable processing profile, the output file format, the password for a password-protected image, and a description of the task.
Pass these parameters to the selected method together with the necessary authentication information. The server address is http://<PROCESSING_LOCATION_ID>.ocrsdk.com, where the location ID depends on the processing location of your application. See Data processing location for details.
```
POST http://<PROCESSING_LOCATION_ID>.ocrsdk.com/processImage?language=english&exportformat=pdfSearchable
```
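
As a rough illustration of the request above, the following Python sketch builds the processImage call with the standard library only. It assumes that the Application ID and Application Password are sent as HTTP Basic credentials and that the image is sent as the raw request body; both are assumptions to check against the API reference. The function name and its arguments are hypothetical.

```python
import base64
import urllib.parse
import urllib.request


def build_process_image_request(location_id, app_id, app_password, image_bytes,
                                language="english", export_format="pdfSearchable"):
    """Build (but do not send) a POST request for the processImage method."""
    query = urllib.parse.urlencode(
        {"language": language, "exportformat": export_format}
    )
    url = f"http://{location_id}.ocrsdk.com/processImage?{query}"

    # Assumption: credentials travel as HTTP Basic authentication.
    credentials = f"{app_id}:{app_password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")

    request = urllib.request.Request(url, data=image_bytes, method="POST")
    request.add_header("Authorization", f"Basic {token}")
    # Assumption: the image is uploaded as the raw request body.
    request.add_header("Content-Type", "application/octet-stream")
    return request


# Sending the request is a single call once the placeholders are filled in:
# with urllib.request.urlopen(request) as response:
#     response_xml = response.read().decode("utf-8")
```

Keeping the request construction separate from the network call makes it easy to inspect the URL and headers in an HTTP debugger before anything is sent.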

Get the response as XML. A successful response has the following format:

<response>
    <task id="c3187247-7e81-4d12-8767-bc886c1ab878"
        registrationTime="2012-02-16T06:42:09Z"
        statusChangeTime="2012-02-16T06:42:09Z"
        status="Queued"
        filesCount="1"
        credits="0"
        estimatedProcessingTime="1"
        description="Image.JPG to .pdf" />
</response>

Read the task ID from the response XML. It is contained in the id attribute of the task element. See details in HTTP Status Codes and Response Formats.
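
One possible way to read the task attributes is shown below, using Python's standard xml.etree.ElementTree module. The helper name read_task is hypothetical; the sample XML is the response shown above.

```python
import xml.etree.ElementTree as ET


def read_task(response_xml):
    """Return the attributes of the <task> element as a dictionary."""
    root = ET.fromstring(response_xml)
    task = root.find("task")
    if task is None:
        raise ValueError("response contains no <task> element")
    return dict(task.attrib)


sample_response = """
<response>
    <task id="c3187247-7e81-4d12-8767-bc886c1ab878"
        registrationTime="2012-02-16T06:42:09Z"
        statusChangeTime="2012-02-16T06:42:09Z"
        status="Queued"
        filesCount="1"
        credits="0"
        estimatedProcessingTime="1"
        description="Image.JPG to .pdf" />
</response>
"""

task = read_task(sample_response)
task_id = task["id"]  # the value to pass to getTaskStatus
```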

Monitor the task status in a loop by calling the getTaskStatus method with the task ID as the parameter.

If your application has started several tasks, we recommend calling the listFinishedTasks method to find out which of them have already been processed.

GET http://<PROCESSING_LOCATION_ID>.ocrsdk.com/getTaskStatus?taskId=c3187247-7e81-4d12-8767-bc886c1ab878

An active task passes through the Queued and InProgress statuses in turn. Keep requesting the task information until the task is processed.

Important! The task status changes no more often than once every 2-3 seconds, so these methods should not be called more frequently than that.

<response>
    <task id="c3187247-7e81-4d12-8767-bc886c1ab878" 
        registrationTime="2012-02-16T06:42:09Z" 
        statusChangeTime="2012-02-16T06:42:09Z" 
        status="InProgress" 
        filesCount="1" 
        credits="0" 
        estimatedProcessingTime="1" 
        description="Image.JPG to .pdf" />
</response>
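
The monitoring loop might be sketched in Python as follows. The fetch_status_xml argument is a hypothetical callable that performs the authenticated getTaskStatus request and returns the response XML as a string; the 3-second default interval follows the note above, and the max_polls limit is an added safeguard rather than something the service requires.

```python
import time
import xml.etree.ElementTree as ET

# Statuses that the article describes for a task that is still active.
ACTIVE_STATUSES = {"Queued", "InProgress"}


def wait_for_task(task_id, fetch_status_xml, poll_interval=3.0,
                  max_polls=200, sleep=time.sleep):
    """Poll getTaskStatus until the task leaves the active statuses.

    Returns the attributes of the final <task> element as a dictionary.
    """
    for _ in range(max_polls):
        task = ET.fromstring(fetch_status_xml(task_id)).find("task")
        if task is None:
            raise ValueError("response contains no <task> element")
        if task.get("status") not in ACTIVE_STATUSES:
            return dict(task.attrib)
        # Wait between calls; the status does not change faster than this.
        sleep(poll_interval)
    raise TimeoutError(f"task {task_id} still active after {max_polls} polls")
```

The loop stops on any status other than Queued or InProgress, so the caller should still check that the returned status is Completed before looking for the result.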

When the task is processed, the XML response contains the Completed status and a reference to the result of processing:

<response>
    <task id="c3187247-7e81-4d12-8767-bc886c1ab878" 
        registrationTime="2012-02-16T06:42:09Z" 
        statusChangeTime="2012-02-16T06:42:19Z" 
        status="Completed" 
        filesCount="1" 
        credits="0" 
        resultUrl="https://ocrsdk.blob.core.windows.net/files/xxx.result" 
        description="Image.JPG to .pdf" />
</response>

Extract the URL of the processing result. It is contained in the resultUrl attribute of the task element.
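
A minimal sketch of this step in Python, again with xml.etree.ElementTree. The helper name result_url is hypothetical; it refuses to return a URL unless the task status is Completed.

```python
import xml.etree.ElementTree as ET


def result_url(response_xml):
    """Return the resultUrl attribute of a completed task."""
    task = ET.fromstring(response_xml).find("task")
    if task is None:
        raise ValueError("response contains no <task> element")
    status = task.get("status")
    if status != "Completed":
        raise RuntimeError(f"task is not completed (status={status!r})")
    url = task.get("resultUrl")
    if not url:
        raise ValueError("completed task has no resultUrl attribute")
    return url
```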

Download the recognition result. This request should not include any authorization headers.
```
GET https://<PROCESSING_LOCATION_ID>.blob.core.windows.net/files/xxx.result
```
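
The download can be sketched in Python with a plain urllib.request.urlopen call, which sends no Authorization header unless one is added explicitly. The function name, the destination path, and the 60-second timeout are illustrative assumptions.

```python
import shutil
import urllib.request


def download_result(result_url, destination_path, timeout=60):
    """Save the file at result_url to destination_path.

    The URL is opened as-is, without any authorization headers.
    """
    with urllib.request.urlopen(result_url, timeout=timeout) as response, \
            open(destination_path, "wb") as output_file:
        # Stream the body to disk instead of loading it all into memory.
        shutil.copyfileobj(response, output_file)
    return destination_path


# Example call, using the resultUrl value taken from the task element:
# download_result(url_from_task, "result.pdf")
```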

You can find an implementation of this procedure in the code samples for various programming languages and platforms.

To check that your application works correctly, use an HTTP debugger such as Fiddler.

If you are going to process small text fragments or text fields, see How to Recognize Text Fields.

Sergey Pilipchuk
