processTextField Method

This method extracts the value of a text field from an image. It loads the image, creates a processing task for the image with the specified parameters, and submits the task for processing.

Customize the following request URL according to your application's processing location:

[POST] http(s)://<PROCESSING_LOCATION_ID>.ocrsdk.com/processTextField

The image file is transmitted in the request body. See the list of supported input formats.
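
As a rough illustration of the request described above, the following Python sketch builds (but does not send) a POST request with the image bytes in the body. It assumes the method parameters are passed as URL query parameters; the location ID "example-location" and the helper name build_process_text_field_request are placeholders invented for this example. Authentication is not covered in this section and is left out.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_process_text_field_request(location_id, image_bytes, params=None, secure=True):
    """Build (but do not send) a POST request for processTextField.

    Hypothetical helper: passing parameters in the query string is an assumption.
    """
    scheme = "https" if secure else "http"
    url = f"{scheme}://{location_id}.ocrsdk.com/processTextField"
    if params:
        url += "?" + urlencode(params)
    # The image file is transmitted in the request body.
    return Request(url, data=image_bytes, method="POST")


req = build_process_text_field_request(
    "example-location",  # placeholder for <PROCESSING_LOCATION_ID>
    b"<image bytes>",    # placeholder for the real image file contents
    {"language": "English,French", "oneTextLine": "true"},
)
print(req.get_method(), req.full_url)
```

The request object can then be passed to urllib.request.urlopen once the credentials required by your application have been added.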

The fieldLevelRecognition profile is used for processing.

The result of recognition is returned in XML format.

See How to Recognize Text Fields to learn how to tune the parameters.

Parameters

Parameter	Is required	Default value	Description
region	No	"-1,-1,-1,-1"	Specifies the region of the text field on the image. The coordinates of the region are measured in pixels relative to the top-left corner of the image and are specified in the following order: left, top, right, bottom. By default, the whole image is used as the region.
language	No	"English"	Specifies the recognition language of the document. This parameter can contain several language names separated with commas, for example "English,French,German". See the list of available recognition languages. Note that not all languages are available for handprint recognition. The languages that are available for handprint recognition are marked with a special comment.
letterSet	No	""	Specifies the letter set to use during recognition. The value is a string containing the letter set characters, for example "ABCDabcd'-.". By default, the letter set of the language specified in the language parameter is used.
regExp	No	""	Specifies a regular expression that defines which words are allowed in the field. See the description of regular expressions. By default, the set of allowed words is defined by the dictionary of the language specified in the language parameter. Note that a regular expression does not strictly limit the characters in the output: the recognized value may contain characters that the expression does not include. During recognition, every hypothesis for a word is checked against the regular expression. A variant that conforms to the expression has a higher probability of being selected as the final result, but if no variant matches, the result will not conform to the expression. To limit the set of characters that can be recognized, use the letterSet parameter instead.
textType	No	"normal"	Specifies the type of the text in the field. This parameter may also contain several text types separated with commas, for example "normal,matrix". The following values can be used: normal, typewriter, matrix, index, handprinted, ocrA, ocrB, e13b, cmc7, gothic.
oneTextLine	No	"false"	Specifies whether the field contains only one text line. The value should be true, if there is one text line in the field; otherwise it should be false.
oneWordPerTextLine	No	"false"	Specifies whether the field contains only one word in each text line. The value should be true, if no text line contains more than one word (so the lines of text will be recognized as a single word); otherwise it should be false.
markingType	No	"simpleText"	This parameter applies only to handprint recognition. Specifies the type of marking around letters (for example, underline, frame, box, etc.). By default, there is no marking around letters. The value can be one of the following: simpleText, underlinedText, textInFrame, greyBoxes, charBoxSeries, simpleComb, combInFrame, partitionedFrame. Note: For correct handprint recognition, specify the value of the placeholdersCount parameter.
placeholdersCount	No	"1"	Specifies the number of character cells for the field. This parameter is meaningful only for the field marking types (the markingType parameter) that imply splitting the text into cells. The default value is 1, but you should set the appropriate value to recognize the text correctly.
writingStyle	No	"default"	Provides additional information about the writing style of handprinted letters. It can be one of the following: default, american, german, russian, polish, thai, japanese, arabic, baltic, british, bulgarian, canadian, czech, croatian, french, greek, hungarian, italian, romanian, slovak, spanish, turkish, ukrainian, common, chinese, azerbaijan, kazakh, kirgiz, latvian.
description	No	""	Contains the description of the processing task. Must contain no more than 255 characters.
pdfPassword	No	""	Contains a password for accessing password-protected images in PDF format.
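
The constraints in the table above can be checked on the client before a request is sent. The sketch below is only an illustration: the helper build_field_params and its argument names are invented for this example, and it covers a subset of the parameters (region, textType, markingType, placeholdersCount, description). The allowed values are copied from the table.

```python
# Allowed values copied from the parameter table.
TEXT_TYPES = {"normal", "typewriter", "matrix", "index", "handprinted",
              "ocrA", "ocrB", "e13b", "cmc7", "gothic"}
MARKING_TYPES = {"simpleText", "underlinedText", "textInFrame", "greyBoxes",
                 "charBoxSeries", "simpleComb", "combInFrame", "partitionedFrame"}


def build_field_params(region=None, text_types=None, marking_type=None,
                       placeholders_count=None, description=None):
    """Hypothetical helper: validate values and return a dict of request parameters."""
    params = {}
    if region is not None:
        left, top, right, bottom = region  # order required by the region parameter
        params["region"] = f"{left},{top},{right},{bottom}"
    if text_types is not None:
        unknown = [t for t in text_types if t not in TEXT_TYPES]
        if unknown:
            raise ValueError(f"unknown text type(s): {unknown}")
        params["textType"] = ",".join(text_types)  # several types are comma-separated
    if marking_type is not None:
        if marking_type not in MARKING_TYPES:
            raise ValueError(f"unknown marking type: {marking_type}")
        params["markingType"] = marking_type
    if placeholders_count is not None:
        if placeholders_count < 1:
            raise ValueError("placeholdersCount must be a positive number")
        params["placeholdersCount"] = str(placeholders_count)
    if description is not None:
        if len(description) > 255:
            raise ValueError("description must contain no more than 255 characters")
        params["description"] = description
    return params


params = build_field_params(region=(10, 20, 300, 60),
                            text_types=["handprinted"],
                            marking_type="charBoxSeries",
                            placeholders_count=8)
print(params)
```

Checking values locally mirrors the conditions that the method reports with status code 450, so obvious mistakes can be caught without a round trip to the server.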

Status codes and response format

General status codes and response format of the method are described in HTTP Status Codes and Response Formats.

The following status codes can be returned when this method is called:

Code	Description
200	Successful method call.
450	Incorrect parameters have been passed. One of the following errors occurred: image file has not been specified; incorrect region of the field has been specified; incorrect recognition language has been specified; incorrect regular expression has been specified; incorrect text type has been specified; handprinted text type is not supported for the specified languages; incorrect field marking type has been specified; the number of cells in the field is not a positive number; incorrect writing style has been specified; task description length exceeds 255 characters; incorrect password for accessing a password-protected image file has been specified; exceeded quota to add images. The quota error is returned if the number of images you have uploaded exceeds the number of images you can process with the credits available on your account plus some threshold. You can resolve this issue by topping up your account or by removing the tasks that have been submitted but have not been processed.
550	An internal program error occurred while processing the image.
551	An error occurred on the server side. One of the following errors occurred: the format of the image file passed for processing is not supported; the PDF file passed for processing has restrictions on creating raster images.
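
One possible way to act on these codes in client code is sketched below. The exception classes and the check_status function are hypothetical names introduced for illustration; only the codes and their descriptions come from the table above.

```python
# Codes and short descriptions taken from the status code table.
STATUS_DESCRIPTIONS = {
    200: "Successful method call.",
    450: "Incorrect parameters have been passed.",
    550: "An internal program error occurred while processing the image.",
    551: "An error occurred on the server side.",
}


class InvalidRequestError(ValueError):
    """Hypothetical exception: the request itself must be corrected (code 450)."""


class ServerProcessingError(RuntimeError):
    """Hypothetical exception: processing failed on the server (codes 550 and 551)."""


def check_status(code):
    """Return the description for 200, raise a matching exception otherwise."""
    description = STATUS_DESCRIPTIONS.get(code, "Status code not listed for this method.")
    if code == 200:
        return description
    if code == 450:
        raise InvalidRequestError(f"{code}: {description}")
    if code in (550, 551):
        raise ServerProcessingError(f"{code}: {description}")
    raise RuntimeError(f"{code}: {description}")


print(check_status(200))
```

Separating code 450 from codes 550 and 551 lets the caller distinguish problems it can fix by changing the request from failures that occurred during processing.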

Output file format

The output XML file has the following format:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<document xmlns="@link" xmlns:xsi="@link" xsi:schemaLocation="@link" version="1.0">
    <field left="0" top="0" right="199" bottom="100" type="text">
        <value encoding="UTF-16">Data Capture Sample Text Data</value>
        <line left="0" top="0" right="199" bottom="100">
            <char left="0" top="0" right="199" bottom="100" confidence="98">
            D
            </char>
            ...
        </line>
        ...
    </field>
</document>
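
A response in this format could be read with Python's standard xml.etree.ElementTree module, as in the sketch below. The sample document is a shortened, made-up instance of the format above: its namespace "urn:example:placeholder", coordinates, and confidence values are invented, and the field value is assumed to be stored as plain text. The helper names local_name and parse_field are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up sample following the structure shown above; the namespace is a placeholder.
SAMPLE = """<document xmlns="urn:example:placeholder" version="1.0">
    <field left="0" top="0" right="199" bottom="100" type="text">
        <value encoding="UTF-16">Da</value>
        <line left="0" top="0" right="199" bottom="100">
            <char left="0" top="0" right="40" bottom="100" confidence="98">
            D
            </char>
            <char left="41" top="0" right="80" bottom="100" confidence="42">
            a
            </char>
        </line>
    </field>
</document>"""


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_field(xml_text):
    """Collect the field value and the (character, confidence) pairs."""
    root = ET.fromstring(xml_text)
    result = {"value": None, "chars": []}
    for element in root.iter():
        name = local_name(element.tag)
        if name == "value":
            result["value"] = (element.text or "").strip()
        elif name == "char":
            # strip() removes the surrounding line breaks and indentation;
            # note that it would also turn a recognized space into "".
            character = (element.text or "").strip()
            result["chars"].append((character, int(element.get("confidence"))))
    return result


parsed = parse_field(SAMPLE)
low_confidence = [ch for ch, conf in parsed["chars"] if conf < 50]
print(parsed["value"], low_confidence)
```

Matching on the local tag name keeps the sketch independent of the actual namespace URI, which is not shown in the sample above.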

Nikolai Kromm
