ProcessTextField API is not working properly

I am trying to extract the value of a specific field (say, Discount) from a sample TIFF image (Picture_010.tif, which is part of the Picture_samples set provided by ABBYY: http://ocrsdk.com/help/picture_samples.zip).

This is the URL sent to the server: http://cloud.ocrsdk.com/processTextField?language=English&textType=normal,handprinted&oneTextLine=true&regExp=Discount

The output is shown below; it does not contain the intended result. I went through all the previous posts on this tag but did not find much help there:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<document xmlns="@link" xmlns:xsi="@link" xsi:schemaLocation="@link" version="1.0">
  <field left="0" top="0" right="2500" bottom="3521" type="text">
    <value encoding="utf-16">ts^</value>
    <line left="216" top="376" right="2104" bottom="1560">
      <char left="216" top="744" right="584" bottom="1544" confidence="12" suspicious="true">t</char>
      <char left="584" top="728" right="1016" bottom="1560" confidence="13" suspicious="true">s</char>
      <char left="1016" top="376" right="2104" bottom="1560" confidence="-1" suspicious="true">?</char>
    </line>
  </field>
</document>
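
A response like the one above can be checked programmatically before trusting its value. The following is a minimal sketch using only the Python standard library; it assumes the response layout shown in this post (a `value` element plus `char` elements carrying `confidence` and `suspicious` attributes). The function name `summarize_field` is made up for illustration, and the sample is trimmed (no XML declaration, no xsi attributes). Because the namespace appears only as "@link" here, elements are matched by local name.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response from the post. The namespace is shown as
# "@link" in the thread, so the code below ignores namespaces entirely.
SAMPLE_RESPONSE = """<document xmlns="@link" version="1.0">
  <field left="0" top="0" right="2500" bottom="3521" type="text">
    <value encoding="utf-16">ts^</value>
    <line left="216" top="376" right="2104" bottom="1560">
      <char left="216" top="744" right="584" bottom="1544" confidence="12" suspicious="true">t</char>
      <char left="584" top="728" right="1016" bottom="1560" confidence="13" suspicious="true">s</char>
      <char left="1016" top="376" right="2104" bottom="1560" confidence="-1" suspicious="true">?</char>
    </line>
  </field>
</document>"""


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def summarize_field(xml_text):
    """Return (value, per-character confidences, count of suspicious chars)."""
    root = ET.fromstring(xml_text)
    elements = list(root.iter())
    value = next(
        (el.text or "" for el in elements if local_name(el.tag) == "value"), ""
    )
    chars = [el for el in elements if local_name(el.tag) == "char"]
    confidences = [int(c.get("confidence")) for c in chars]
    suspicious = sum(1 for c in chars if c.get("suspicious") == "true")
    return value, confidences, suspicious


value, confidences, suspicious = summarize_field(SAMPLE_RESPONSE)
print(value, confidences, suspicious)
```

In this response every character is flagged as suspicious, and the field box runs from 0,0 to 2500,3521, which looks like the whole image rather than a single field.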

I am running the Java sample for processTextField as follows:

java TestApp textField --options="regExp=Discount" "C:\\Picture_samples\\English\\Scanned_documents\\Picture_010.tif" "C:\\Picture_010_1.xml"

I look forward to your help in resolving this issue.


Comments

6 comments

  • Permanently deleted user

    regExp=Discount is the wrong parameter for this. Regular expressions are not suitable for this task; you can find the details about regExp here.

    You can recognize the Discount field using a URL like this:

    http://cloud.ocrsdk.com/processTextField?language=English&textType=normal&region=left,top,right,bottom

    where left, top, right, and bottom are the coordinates of the text, measured in pixels relative to the top-left corner of the image. Replace them with actual numbers.

    If you don't want to specify the text coordinates directly, you can get all the text with its coordinates by using the processImage method with export to XML. You can then extract the necessary information on your side, since the numbers you need have almost the same vertical coordinates as the word "Discount".

  • Permanently deleted user

    I tried the URL you suggested, but I get HTTP 500 (Internal Server Error) without any further information. I tried the following two URLs (with and without regExp) for the Discount field:

    http://cloud.ocrsdk.com/processTextField?language=English&textType=normal&region=left,top,right,bottom

    http://cloud.ocrsdk.com/processTextField?language=English&textType=normal&region=left,top,right,bottom&regExp=D[a-z]+

  • Permanently deleted user

    If I want to get the data for a particular field (say, Discount) in the document, with the region given as (left,top,right,bottom), can I also provide the field name (such as Discount) in some other parameter of your API, or is regExp the only way?

  • Permanently deleted user

    RegExp is generally not suitable for this task. It is meant for cases where, for example, you want to recognize the text "olo 123" but for some reason it is recognized as "010 123".

    Unfortunately, we do not have a parameter that lets you recognize a field by a specified word (like "Discount") next to it. You can only do this on your side: use the processImage method, export to XML, and extract the necessary information using the text and its coordinates.

    The URL should look like this:

    http://cloud.ocrsdk.com/processTextField?language=English&textType=normal&region=0,0,200,200

    (where 0,0,200,200 are the text coordinates). Could you please check whether the error still occurs with this URL?

  • Permanently deleted user

    Here is the XML output for this URL: http://cloud.ocrsdk.com/processTextField?language=English&textType=normal&region=0,0,200,200

    <?xml version="1.0" encoding="UTF-8"?>
    <document xmlns="@link" xmlns:xsi="@link" xsi:schemaLocation="@link" version="1.0">
        <field left="0" top="0" right="200" bottom="200" type="text">
            <value encoding="utf-16" />
            <line left="0" top="0" right="0" bottom="0" />
        </field>
    </document>
    
  • Permanently deleted user

    So the error does not occur. Note that you should specify your own text coordinates (0,0,200,200 is just an example). For the Discount field in Picture_010.tif, you can specify 1473,1992,1628,2029.
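
To avoid sending the literal placeholder `left,top,right,bottom`, which is what the two URLs that returned HTTP 500 contained, the request URL can be assembled from numeric coordinates. This is a rough sketch in Python: the endpoint and the `language`, `textType`, and `region` parameters are taken from this thread, while the helper name `build_text_field_url` and its validation rules are assumptions for illustration. It only builds the URL; uploading the image and sending the request are left to whatever client you already use, such as the Java sample.

```python
from urllib.parse import urlencode

# Endpoint as quoted in the thread.
BASE_URL = "http://cloud.ocrsdk.com/processTextField"


def build_text_field_url(region, language="English", text_type="normal"):
    """Build a processTextField URL for a pixel region (left, top, right, bottom).

    Hypothetical helper: it rejects non-numeric placeholders and empty or
    inverted boxes before they reach the server.
    """
    if len(region) != 4 or not all(isinstance(v, int) for v in region):
        raise TypeError("region must be four integers: left, top, right, bottom")
    left, top, right, bottom = region
    if left < 0 or top < 0 or right <= left or bottom <= top:
        raise ValueError("region must satisfy 0 <= left < right and 0 <= top < bottom")
    params = {
        "language": language,
        "textType": text_type,
        "region": f"{left},{top},{right},{bottom}",
    }
    # safe="," keeps the commas readable instead of encoding them as %2C.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


# Coordinates suggested in the thread for the Discount field.
print(build_text_field_url((1473, 1992, 1628, 2029)))
```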

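
The client-side approach suggested in the replies (recognize the whole page with processImage, export to XML, then pick the text whose vertical position matches the word "Discount") could look roughly like the sketch below. It assumes the XML has already been parsed into word boxes; the `WordBox` type, the function names, and the overlap threshold are hypothetical, and the sample boxes are invented for illustration. Only the 1473,1992,1628,2029 box comes from this thread, and the text "10%" inside it is a made-up stand-in.

```python
from dataclasses import dataclass


@dataclass
class WordBox:
    """A recognized word with its pixel bounding box (hypothetical structure)."""
    text: str
    left: int
    top: int
    right: int
    bottom: int


def vertical_overlap(a, b):
    """Fraction of the shorter box's height that the two boxes share vertically."""
    shared = min(a.bottom, b.bottom) - max(a.top, b.top)
    shorter = min(a.bottom - a.top, b.bottom - b.top)
    return max(0, shared) / shorter if shorter > 0 else 0.0


def values_right_of_label(words, label, min_overlap=0.5):
    """Return the texts on the same row as `label`, to its right, left to right."""
    matches = [w for w in words if w.text.strip(":").lower() == label.lower()]
    if not matches:
        return []
    anchor = matches[0]
    same_row = [
        w for w in words
        if w is not anchor
        and w.left >= anchor.right
        and vertical_overlap(anchor, w) >= min_overlap
    ]
    return [w.text for w in sorted(same_row, key=lambda w: w.left)]


# Invented sample data; only the second box's coordinates come from the thread.
words = [
    WordBox("Discount", 1100, 1994, 1300, 2030),
    WordBox("10%", 1473, 1992, 1628, 2029),
    WordBox("Total", 1100, 2060, 1250, 2096),
    WordBox("90.00", 1473, 2058, 1628, 2095),
]
print(values_right_of_label(words, "Discount"))
```

Matching on vertical overlap rather than exact top coordinates tolerates the small offsets between a label and its value, which "almost the same vertical coordinates" implies. The threshold of 0.5 is an arbitrary starting point.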
