How to process large files

The API described in this section is for testing purposes only. Its parameters, functionality, and availability are not guaranteed in the future.

Files larger than 30 MB cannot be processed with the processImage or submitImage methods. The limit is a practical one: uploading large images in the request body takes too much time.

However, there is another procedure you can use to process large images with the Cloud OCR service: upload your images to any image-hosting website and pass the image URLs to the Cloud OCR service. Note that the size of a single image still must not exceed 200 MB.

To create a processing task for an image located on another website, use the processRemoteImage method. It requires a source parameter containing the image URL; all other parameters are the same as for the processImage method. Here is an example of a processing request using this method:

GET https://cloud.ocrsdk.com/processRemoteImage?source=https%3A%2F%2Fgithub.com%2Fabbyysdk%2Focrsdk.com%2Fblob%2Fmaster%2FSampleData%2FPage_08.tif%3Fraw%3Dtrue&language=English&exportFormat=txt
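The source value has to be URL-encoded before it is placed in the query string. The following is a minimal Java sketch of assembling such a request URL, assuming Java 10 or later. The class and method names (RemoteImageRequest, buildUrl) are hypothetical and not part of the Cloud OCR SDK; only the endpoint and the parameter names are taken from the example request above. Authentication and sending the request are left out.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of the SDK): builds a processRemoteImage request URL.
// Only the endpoint and parameter names come from the example request above.
final class RemoteImageRequest {
    static final String ENDPOINT = "https://cloud.ocrsdk.com/processRemoteImage";

    static String buildUrl(String imageUrl, String language, String exportFormat) {
        return ENDPOINT
                + "?source=" + encode(imageUrl)
                + "&language=" + encode(language)
                + "&exportFormat=" + encode(exportFormat);
    }

    private static String encode(String value) {
        // Percent-encodes reserved characters such as ':', '/', '?' and '='.
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Prints a request URL for the sample image used in the example above.
        System.out.println(buildUrl(
                "https://github.com/abbyysdk/ocrsdk.com/blob/master/SampleData/Page_08.tif?raw=true",
                "English",
                "txt"));
    }
}
```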

After a successful method call, you receive the task description, just as you do with processImage, and the server then attempts to retrieve the image from the specified URL. If any error occurs at this stage, the task receives the ProcessingFailed status.

Next, monitor the task status and obtain the result URL once the task is completed. The procedure is the same as for the other processing methods and is described in How to Work with Cloud OCR SDK.
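Monitoring a task usually comes down to a polling loop. The sketch below shows only the loop itself and is not taken from the SDK: the class TaskPoller is hypothetical, the status lookup is injected as a Supplier so that the actual status request can be whatever How to Work with Cloud OCR SDK prescribes, and "Completed" is an assumed name for the success status (only ProcessingFailed is named in this article).

```java
import java.util.function.Supplier;

// Hypothetical polling helper. The status lookup is injected so the sketch
// does not depend on how the task status is actually requested.
final class TaskPoller {

    // Calls statusSource until it reports a terminal status or maxAttempts is reached.
    // "Completed" is an assumed status name; "ProcessingFailed" is named in this article.
    static String waitForTerminalStatus(Supplier<String> statusSource,
                                        int maxAttempts,
                                        long delayMillis) {
        String status = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            status = statusSource.get();
            if ("Completed".equals(status) || "ProcessingFailed".equals(status)) {
                return status;
            }
            if (attempt + 1 < maxAttempts) {
                try {
                    Thread.sleep(delayMillis); // wait before asking again
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return status;
                }
            }
        }
        return status; // last status seen; not terminal if attempts ran out
    }
}
```

A caller would pass a supplier that performs the real status request and then check whether the returned value is the success status before downloading the result.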

Java sample for processing remote files

You can use the Java sample (available on GitHub) to process multiple files located on a remote server. Follow these steps:

  1. Upload your image files to an image-hosting website and acquire the download URLs.
  2. Create a text file listing the download URLs, each URL on a separate line.
  3. Modify the ClientSettings.java file to include your Application ID and password in the following lines:
    // Name of the application you created
    public static final String APPLICATION_ID = "";
    // Password sent to your e-mail after the application was created
    public static final String PASSWORD = "";
  4. Compile the sample:
    javac ProcessManyFiles.java

  5. Run the sample:
    java ProcessManyFiles remote --lang=<language of the documents> --format=<export format> <path to the text file with links> <path to the results folder>

After the sample application finishes its work, the specified folder will contain recognized documents.
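For reference, the text file from step 2 is a plain list with one URL per line. The sketch below shows one way such a file could be read in Java. It is an illustration only and not the code used by ProcessManyFiles; the class UrlListReader and its methods are hypothetical, and skipping blank lines is an assumption made here for convenience.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the ProcessManyFiles sample.
final class UrlListReader {

    // Trims each line and drops blank ones, keeping the original order.
    static List<String> parseUrls(List<String> lines) {
        return lines.stream()
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .collect(Collectors.toList());
    }

    // Reads the list file (UTF-8) and returns the URLs it contains.
    static List<String> readUrls(Path listFile) {
        try {
            return parseUrls(Files.readAllLines(listFile));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```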

See also the Processing Large Files Using Dropbox tutorial.
