
Column data is merged in ABBYY 10.5

I am reading a table as text data using the ABBYY engine. Item (say 1001) and material code (20015689) are two columns. When ABBYY extracts the data, it merges both values into a single cell, as shown below:

 

<cell colSpan="2" leftBorder="White" rightBorder="White" bottomBorder="White" width="91" height="19">

<text>

<par>

<line baseline="328" l="38" t="321" r="112" b="329"><formatting lang="EnglishUnitedKingdom">1001  20015689</formatting></line></par>

</text></cell>
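
As a stopgap on the Java side, the merged value could be split after export. The following is only a minimal sketch, assuming the two values are always separated by whitespace and that neither value contains spaces itself. `MergedCellSplitter`, `splitCell` and `splitCellXml` are hypothetical names for illustration and are not part of the ABBYY API; only the standard Java XML parser is used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class MergedCellSplitter {

    // Splits merged cell text such as "1001  20015689" on runs of whitespace.
    // Assumption: the individual column values never contain spaces.
    public static List<String> splitCell(String cellText) {
        String trimmed = cellText.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // Reads every <formatting> element of one exported <cell> fragment
    // and splits its text content into separate values.
    public static List<String> splitCellXml(String cellXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(cellXml)));
        NodeList runs = doc.getElementsByTagName("formatting");
        List<String> values = new ArrayList<>();
        for (int i = 0; i < runs.getLength(); i++) {
            values.addAll(splitCell(runs.item(i).getTextContent()));
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Shortened copy of the cell fragment shown above.
        String cellXml = "<cell colSpan=\"2\"><text><par><line>"
                + "<formatting lang=\"EnglishUnitedKingdom\">1001  20015689</formatting>"
                + "</line></par></text></cell>";
        System.out.println(splitCellXml(cellXml));
    }
}
```

This only repairs the exported text; it does not change how the engine detects the table, and it would break for columns whose values legitimately contain spaces.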

I have seen some suggestions on the Internet (http://knowledgebase.abbyy.com/article/698), which I tried to apply but could not get to work.

I am calling ABBYY OCR from Java.

 

Can you please help me with this?

Thanks a lot.


Comments

3 comments

  • Avatar
    Permanently deleted user

    Hi,

    Please try enabling the AggressiveTableDetection property of the PageAnalysisParams object. If you set it to TRUE, FineReader Engine tries to find as many tables as possible on the page.

    If this setting does not help, please send:

    • your serial number
    • the build number of FineReader Engine 10.5
    • the image illustrating the issue

    to your regional Technical Support. You can find all ABBYY contacts here: https://www.abbyy.com/contacts/

  • Avatar
    Permanently deleted user

    Hi Oksana,

    Thanks for your reply.

    I will have to get permission from my client before providing you with a sample and the license. Once I have that, I will send you the sample and serial number.

    I have tried the AggressiveTableDetection property, but it does not work either, because the table has no vertical lines separating the cells. If the data in two cells is very close together, ABBYY merges them.

    Do you have any suggestions for this?

    Just to let you know: when I used the Visual Components with vertical lines separating the table cells, all the data was extracted perfectly.

    I have a couple of questions here:

    1. Are the ABBYY Visual Components and the ABBYY engine used together for good extraction?

    2. Do companies use both of them together for extraction, for example first pre-processing the image with the Visual Components and then passing it to the ABBYY engine, or can the same result be achieved using only the ABBYY engine programmatically?

     

    Thanks a lot.

    Regards,

    Anurag

     

     

     

     

  • Avatar
    Permanently deleted user

    Hi,

    1. You may use visual components to get better recognition results, for example to get your table extracted. However, there are also other purposes for the visual components: viewing the list of document pages, editing images, and editing or validating recognized text. You can find more information about visual components in Help → Visual Components Reference.

    2. As each company has its own scenario, the methods they use to recognize documents are usually very different.

    You may achieve the same results as with visual components using only the ABBYY engine programmatically, by adding a table block to the layout. However, you will need to know the exact region of the table on your page. You can find more information about this method in Help → Guided Tour → Advanced Techniques → Working with Layout and Blocks.

