This is driving me nuts and I hope someone can help me. Here is what I have done so far using PHP/cURL: I upload an image using the submitImage method. The call succeeds (HTTP 200) and returns a taskId. I then try to use processFields to recognize text areas in the image. The code looks like this:
    $url = "http://cloud.ocrsdk.com/processFields?taskId=$taskid";
    $curlHandle = curl_init();
    curl_setopt($curlHandle, CURLOPT_VERBOSE, 1);
    curl_setopt($curlHandle, CURLOPT_URL, $url);
    curl_setopt($curlHandle, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curlHandle, CURLOPT_USERPWD, "$applicationId:$password");
    curl_setopt($curlHandle, CURLOPT_POST, 1);
    curl_setopt($curlHandle, CURLOPT_USERAGENT, "PHP Cloud OCR SDK Sample");
    curl_setopt($curlHandle, CURLOPT_HTTPHEADER, Array("Content-Type: text/xml"));
    curl_setopt($curlHandle, CURLOPT_HEADER, 1);
    curl_setopt($curlHandle, CURLOPT_POSTFIELDS, array("my_file"=>"@".$local_directory."/fields.xml"));
    $response = curl_exec($curlHandle);
The fields.xml looks like this.
    <?xml version="1.0" encoding="utf-8"?>
    <document xmlns="http://ocrsdk.com/schema/taskDescription-1.0.xsd"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xsi:schemaLocation="http://ocrsdk.com/schema/taskDescription-1.0.xsd http://ocrsdk.com/schema/taskDescription-1.0.xsd">
      <page applyto="0">
        <text id="Google" left="64" top="44" right="110" bottom="103">
          <regExp>[0-9]+\.[0-9+]%</regExp>
        </text>
      </page>
    </document>
The call to processFields returns error 450 with the message "Data at the root level is invalid. Line 1, position 1." According to the documentation, a 450 indicates one of the following:
- The identifier of the task has not been specified.
- Incorrect XML file has been transmitted.
- The task with the specified identifier cannot be started (e.g. it has already been started).
- Task description length exceeds 255 characters.
The taskId is correct (1), and the task shouldn't have been started already (3), since I get a new identifier every time I run my script. I do not use a description (4). That leaves an incorrect XML file (2) as the most likely cause.
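One way to rule the file itself in or out is to check the payload locally before sending it. The following is only a minimal sketch: `checkXmlPayload` and the `fields.xml` path are hypothetical names, not part of the OCR SDK. A message like "Line 1, position 1" is commonly reported by XML parsers when the first bytes of the input are not XML (for example a byte order mark or non-XML content in front of the document); whether that is what happens on this service is an assumption.

```php
<?php
// Hypothetical helper: lists reasons why a payload might be rejected as XML.
// An empty array means no problem was found locally.
function checkXmlPayload(string $xml): array
{
    if ($xml === '') {
        return ['payload is empty'];
    }

    $problems = [];
    if (strncmp($xml, "\xEF\xBB\xBF", 3) === 0) {
        // Some parsers choke on a BOM; whether this service does is an assumption.
        $problems[] = 'payload starts with a UTF-8 byte order mark';
    } elseif ($xml[0] !== '<') {
        $problems[] = 'first byte is not "<"';
    }

    // Well-formedness check via libxml, skipped if the DOM extension is missing.
    if (class_exists('DOMDocument')) {
        $previous = libxml_use_internal_errors(true);
        $doc = new DOMDocument();
        if (!$doc->loadXML($xml)) {
            foreach (libxml_get_errors() as $error) {
                $problems[] = sprintf(
                    'line %d, column %d: %s',
                    $error->line,
                    $error->column,
                    trim($error->message)
                );
            }
        }
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
    }

    return $problems;
}

// Usage: inspect the file that would be uploaded (path is an assumption).
$path = __DIR__ . '/fields.xml';
if (is_readable($path)) {
    print_r(checkXmlPayload(file_get_contents($path)));
}
```

If this reports nothing for fields.xml, the file is probably fine and the problem is more likely in how the request body is built.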
I tried omitting the <?xml version="1.0" encoding="utf-8"?> declaration but still get a 450 with "Data at the root level is invalid". I removed the regExp, still 450. I changed the encoding from UTF-8 to Latin1, still 450. I changed the line endings from CRLF to LF, still 450. I changed <page applyto="0"> to just <page>, still 450.
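Since none of these content changes affect the error, the remaining variable is how the file is transmitted. When CURLOPT_POSTFIELDS is given an array, PHP's cURL extension encodes the body as multipart/form-data, so the first bytes on the wire are a boundary line rather than XML, regardless of the Content-Type header set by hand. (The "@" file prefix is also no longer supported as of PHP 7, where CURLFile replaces it.) Below is a sketch of the alternative: passing the XML as a string so it is sent verbatim. That processFields wants the raw XML as the request body is an assumption here, and `buildProcessFieldsOptions` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: builds cURL options that send the XML itself as the
// request body instead of a multipart/form-data file upload.
function buildProcessFieldsOptions(
    string $taskId,
    string $xml,
    string $applicationId,
    string $password
): array {
    return [
        CURLOPT_URL            => 'http://cloud.ocrsdk.com/processFields?taskId=' . rawurlencode($taskId),
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_USERPWD        => $applicationId . ':' . $password,
        CURLOPT_POST           => true,
        CURLOPT_USERAGENT      => 'PHP Cloud OCR SDK Sample',
        CURLOPT_HTTPHEADER     => ['Content-Type: text/xml'],
        // A string (not an array) makes cURL send these bytes unchanged.
        CURLOPT_POSTFIELDS     => $xml,
    ];
}

// Usage (not executed here; the variables are the ones from the script above):
// $xml = file_get_contents($local_directory . '/fields.xml');
// $curlHandle = curl_init();
// curl_setopt_array($curlHandle, buildProcessFieldsOptions($taskid, $xml, $applicationId, $password));
// $response = curl_exec($curlHandle);
```

Comparing the verbose cURL output of both variants should show whether the request goes out as multipart/form-data or as plain text/xml.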
Searching this forum for "data root level" and the like yielded no results at all. Is there anyone out there who can help me?
Many thanks in advance! Hans