
How to convert semi-structured and unstructured OCR output into structured information

We tried to process a document that has a table-like layout but is not an actual table. The extracted data treats the first column as one text block and the second column as a separate text block. We need to store the first-column values as keys and the second-column values as values.

First Field Value
Price 50
Stock 20


The coordinates coming from Cloud OCR do not match. The field names come back like this:


<line baseline="583" l="289" t="559" r="695" b="583" xmlns="http://www.abbyy.com/FineReader_xml/FineReader10-schema-v1.xml">
<formatting lang="EnglishUnitedStates">
  <charParams l="289" t="559" r="306" b="582">P</charParams>
  <charParams l="307" t="559" r="319" b="583">R</charParams>
  <charParams l="320" t="559" r="339" b="583">I</charParams>
  <charParams l="341" t="559" r="361" b="582">C</charParams>
  <charParams l="364" t="560" r="381" b="583">E</charParams>
</formatting>
</line>


The values come back like this:


 <line baseline="2808" l="1801" t="2774" r="1871" b="2808" xmlns="http://www.abbyy.com/FineReader_xml/FineReader10-schema-v1.xml">
<formatting lang="EnglishUnitedStates">
  <charParams l="1801" t="2775" r="1819" b="2806">7</charParams>
  <charParams l="1850" t="2774" r="1871" b="2807">0</charParams>
</formatting>
</line>

None of the coordinates match between a field and its value. How can we handle this so that the values are mapped correctly?
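
For reference, here is a minimal sketch of one possible workaround, not a confirmed ABBYY method. It assumes that both columns contain the same number of lines in the same top-to-bottom order, so fields and values can be paired by position rather than by matching coordinates. The function names and the `split_x` threshold (the x position separating the two columns) are hypothetical, and the sample XML wraps the lines in a made-up `<page>` root just to keep the example self-contained.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the xmlns attribute in the OCR output above.
NS = "{http://www.abbyy.com/FineReader_xml/FineReader10-schema-v1.xml}"


def read_lines(xml_text):
    """Return (text, left, top, right, bottom) for every <line> element."""
    root = ET.fromstring(xml_text)
    lines = []
    for line in root.iter(NS + "line"):
        # Rebuild the line text from its individual characters.
        text = "".join(cp.text or "" for cp in line.iter(NS + "charParams"))
        box = tuple(int(line.get(k)) for k in ("l", "t", "r", "b"))
        lines.append((text.strip(), *box))
    return lines


def pair_by_order(lines, split_x):
    """Pair the n-th line left of split_x with the n-th line right of it.

    Assumption: both columns list their rows in the same vertical order.
    split_x is a hypothetical threshold chosen by inspecting the page.
    """
    keys = sorted((ln for ln in lines if ln[1] < split_x), key=lambda ln: ln[2])
    values = sorted((ln for ln in lines if ln[1] >= split_x), key=lambda ln: ln[2])
    return {k[0]: v[0] for k, v in zip(keys, values)}


# Made-up sample: two field lines on the left, two value lines on the right,
# with top/bottom coordinates that deliberately do not line up.
SAMPLE = (
    '<page xmlns="http://www.abbyy.com/FineReader_xml/FineReader10-schema-v1.xml">'
    '<line l="289" t="559" r="381" b="583"><formatting>'
    '<charParams l="289" t="559" r="306" b="582">P</charParams>'
    '<charParams l="307" t="559" r="319" b="583">R</charParams>'
    '<charParams l="320" t="559" r="339" b="583">I</charParams>'
    '<charParams l="341" t="559" r="361" b="582">C</charParams>'
    '<charParams l="364" t="560" r="381" b="583">E</charParams>'
    '</formatting></line>'
    '<line l="289" t="640" r="330" b="664"><formatting>'
    '<charParams l="289" t="640" r="306" b="664">Q</charParams>'
    '<charParams l="307" t="640" r="330" b="664">T</charParams>'
    '</formatting></line>'
    '<line l="1801" t="2774" r="1871" b="2808"><formatting>'
    '<charParams l="1801" t="2775" r="1819" b="2806">7</charParams>'
    '<charParams l="1850" t="2774" r="1871" b="2807">0</charParams>'
    '</formatting></line>'
    '<line l="1801" t="2860" r="1840" b="2894"><formatting>'
    '<charParams l="1801" t="2860" r="1819" b="2894">2</charParams>'
    '<charParams l="1822" t="2860" r="1840" b="2894">0</charParams>'
    '</formatting></line>'
    '</page>'
)

if __name__ == "__main__":
    print(pair_by_order(read_lines(SAMPLE), split_x=1000))
```

Pairing by position breaks as soon as one column has a missing or extra line, so it is worth checking that both columns yield the same number of lines before trusting the result.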


Comments

1 comment

  • Permanently deleted user

    Hi, please send your source image to CloudOCRSDK@abbyy.com for our tests.
