Recognize fields in a multipage document

I have a PDF document where every page has the same template. I need to recognize a single field from each page.

Do I need to process the whole page, or can text field recognition work with multipage documents?

Comments

  • Oksana Serdyuk

    For your scenario, you can use the processFields method. It lets you specify the coordinates of each field, page by page, in an XML file. For example:

    <?xml version="1.0" encoding="utf-8"?>
    <document xmlns="http://ocrsdk.com/schema/taskDescription-1.0.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://ocrsdk.com/schema/taskDescription-1.0.xsd http://ocrsdk.com/schema/taskDescription-1.0.xsd">
      <fieldTemplates />
      <page applyTo="0,1">
        <text id="Field1" left="395" top="105" right="1047" bottom="157">
          <language>English</language>
          <textType>normal</textType>
          <oneTextLine>true</oneTextLine>
        </text>
      </page>
      <page applyTo="2">
        ...
      </page>
      ...
      <page applyTo="N">
        ...
      </page>
    </document>
    
  • danyolgiax

    I considered it, but I don't know how many pages the document has. Do I need to dynamically generate the configuration XML file by reading the total page count from the PDF document?

  • Oksana Serdyuk

    Yes, you should, because the "applyTo" attribute is mandatory for the "page" element.
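
    Generating the file could look roughly like the sketch below. It is only an illustration under a few assumptions: page indices are zero-based and may be listed comma-separated in "applyTo" (as the "0,1" in the sample above suggests), the same field rectangle applies to every page, and the page count is already known. The function name build_task_xml and its parameters are hypothetical, not part of the service.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the sample task description in this thread.
NS = "http://ocrsdk.com/schema/taskDescription-1.0.xsd"
XSI = "http://www.w3.org/2001/XMLSchema-instance"


def build_task_xml(page_count, left, top, right, bottom, field_id="Field1"):
    """Return (as UTF-8 bytes) a task description with one field on every page.

    Hypothetical helper: the caller supplies page_count, for example from a
    PDF library such as pypdf via len(PdfReader(path).pages).
    """
    if page_count < 1:
        raise ValueError("page_count must be at least 1")

    # Serialize NS as the default namespace and XSI with the "xsi" prefix.
    ET.register_namespace("", NS)
    ET.register_namespace("xsi", XSI)

    document = ET.Element(f"{{{NS}}}document")
    document.set(f"{{{XSI}}}schemaLocation", f"{NS} {NS}")
    ET.SubElement(document, f"{{{NS}}}fieldTemplates")

    # Assumption: zero-based indices, comma-separated, as in applyTo="0,1".
    pages = ",".join(str(i) for i in range(page_count))
    page = ET.SubElement(document, f"{{{NS}}}page", {"applyTo": pages})

    text = ET.SubElement(page, f"{{{NS}}}text", {
        "id": field_id,
        "left": str(left),
        "top": str(top),
        "right": str(right),
        "bottom": str(bottom),
    })
    # Field settings mirror the sample above.
    ET.SubElement(text, f"{{{NS}}}language").text = "English"
    ET.SubElement(text, f"{{{NS}}}textType").text = "normal"
    ET.SubElement(text, f"{{{NS}}}oneTextLine").text = "true"

    return ET.tostring(document, encoding="utf-8", xml_declaration=True)
```

    Calling build_task_xml(3, 395, 105, 1047, 157) yields a single "page" element whose "applyTo" covers pages 0 to 2. If the service turns out to expect one "page" element per index instead, the same loop can emit a separate element for each page.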
