Scan not recognized as text (Answered)

Hi!

What can I do if a single page of a PDF invoice is not fully recognized as text? The XML output contains a large Picture block:

<block blockType="Text" blockName="" l="278" t="2872" r="360" b="2926"><region><rect l="353" t="2872" r="360" b="2878"/><rect l="278" t="2878" r="360" b="2919"/><rect l="353" t="2919" r="360" b="2926"/></region>
<text>
<par lineSpacing="-1"></par>
</text>
</block>
<block blockType="Text" blockName="" l="539" t="198" r="567" b="233"><region><rect l="560" t="198" r="567" b="199"/><rect l="539" t="199" r="567" b="232"/><rect l="539" t="232" r="560" b="233"/></region>
<text>
<par lineSpacing="940">
<line baseline="226" l="546" t="206" r="560" b="226"><formatting lang="EnglishUnitedStates">
<charParams l="546" t="206" r="560" b="226" suspicious="1">1</charParams></formatting></line></par>
</text>
</block>
<block blockType="Picture" blockName="" l="395" t="235" r="2867" b="1492"><region><rect l="1614" t="235" r="1851" b="236"/><rect l="559" t="236" r="2867" b="237"/><rect l="395" t="237" r="2867" b="1014"/><rect l="395" t="1014" r="2866" b="1487"/><rect l="395" t="1487" r="2867" b="1489"/><rect l="395" t="1489" r="2867" b="1490"/><rect l="826" t="1490" r="2507" b="1491"/><rect l="2252" t="1491" r="2408" b="1492"/></region>
</block>
<block blockType="Text" blockName="" l="394" t="1609" r="2270" b="1675"><region><rect l="1016" t="1609" r="1916" b="1610"/><rect l="394" t="1610" r="2119" b="1611"/><rect l="394" t="1611" r="2260" b="1612"/><rect l="394" t="1612" r="2270" b="1673"/><rect l="394" t="1673" r="2270" b="1674"/><rect l="822" t="1674" r="2270" b="1675"/></region>

But the area reported as a Picture block contains a table with text (invoice elements), not an unrecognizable image.
What can I do to fix this in the future? I am currently using processImage?language=german,latin,english&exportFormat=xml.

Best wishes

Marc

Comments

3 comments

  • Oksana Serdyuk

    Could you please share your source PDF file or send it to CloudOCRSDK@abbyy.com, so that we can find more appropriate recognition settings for your scenario?

  • Oksana Serdyuk

    Hi Marc,

    Thank you for providing the document. Please try the following recognition settings:

    processImage?language=German,English&profile=textExtraction
  • Marc Bachmann

    Hi Oksana,

    Thank you for the new parameters.
    They work well.

    Thank you.

    Best wishes

    Marc

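For readers who build the request in code, the settings suggested in the thread (language=German,English and profile=textExtraction) could be assembled roughly as follows. This is only a sketch: the helper name build_process_image_url and the host ocr.example.invalid are hypothetical placeholders, and keeping exportFormat=xml from the original request is an assumption, since the reply does not mention it.

```python
from urllib.parse import urlencode


def build_process_image_url(base_url, languages, profile=None, export_format=None):
    """Return a processImage URL with the given recognition settings.

    Hypothetical helper for illustration; it only builds the query string
    and does not send anything to the service.
    """
    params = {"language": ",".join(languages)}
    if profile:
        params["profile"] = profile
    if export_format:
        params["exportFormat"] = export_format
    # safe="," keeps the comma-separated language list unescaped in the URL.
    return base_url.rstrip("/") + "/processImage?" + urlencode(params, safe=",")


# Placeholder host (assumption): substitute the service URL of your own application.
url = build_process_image_url(
    "https://ocr.example.invalid",
    ["German", "English"],
    profile="textExtraction",
    export_format="xml",  # assumption: carried over from the original request
)
print(url)
```

Building the query from a dictionary makes it easy to switch profiles or languages per document type without editing a hand-written URL.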

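To notice this kind of problem automatically, one option is to scan the XML result for blocks with blockType="Picture" and flag unexpectedly large ones, like the block in the question. The sketch below uses Python's standard xml.etree.ElementTree; the function name picture_blocks and the <page> wrapper in the sample are assumptions made for illustration, and only the block attributes (blockType, l, t, r, b) are taken from the output shown above.

```python
import xml.etree.ElementTree as ET


def picture_blocks(xml_text):
    """Return (l, t, r, b) bounds of every block whose blockType is Picture."""
    root = ET.fromstring(xml_text)
    found = []
    for element in root.iter():
        # Drop a namespace prefix such as "{uri}block" if one is present.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "block" and element.get("blockType") == "Picture":
            found.append(tuple(int(element.get(k)) for k in ("l", "t", "r", "b")))
    return found


# Trimmed-down sample using block bounds from the question; the <page>
# root element is a hypothetical wrapper so the fragment parses on its own.
sample = """<page>
<block blockType="Text" l="539" t="198" r="567" b="233"/>
<block blockType="Picture" l="395" t="235" r="2867" b="1492"/>
</page>"""

for l, t, r, b in picture_blocks(sample):
    print(f"Picture block {r - l} x {b - t} at ({l}, {t})")
```

When reading a real result file, passing the raw bytes to ET.fromstring avoids problems with an XML declaration that names an encoding.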