
How to extract data from a Microsoft Word document using ABBYY?

Hi there,

I have a Word document with a standard template and want to extract a certain set of fields.

How can we achieve this using ABBYY? Can you please suggest an approach or offer guidance on this requirement?

Thanks in advance


Comments


  • Scott Chau

    Are you using FlexiCapture Engine or FlexiCapture SDK? The reason I'm asking is that FC SDK has native input for Office documents. With FC Engine, you will need to build your own converter to a format we support. If you're on FCE, you might want to check with your sales rep about switching over to FC SDK.
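
    For the converter on the FC Engine side, here is a minimal sketch. It assumes LibreOffice is installed with its `soffice` binary on PATH, and that PDF is an acceptable target format in your setup. This is not an ABBYY API; `build_convert_command` and `convert_to_pdf` are hypothetical helper names used only for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(src: Path, out_dir: Path, soffice: str = "soffice") -> list[str]:
    """Build the LibreOffice command line that renders `src` to PDF in `out_dir`."""
    return [
        soffice,
        "--headless",            # run without opening a window
        "--convert-to", "pdf",   # target format (assumption: PDF suits your pipeline)
        "--outdir", str(out_dir),
        str(src),
    ]


def convert_to_pdf(src: Path, out_dir: Path) -> Path:
    """Convert one Word document to PDF and return the path of the result."""
    soffice = shutil.which("soffice")
    if soffice is None:
        raise RuntimeError("LibreOffice (soffice) was not found on PATH")
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_convert_command(src, out_dir, soffice), check=True)
    # LibreOffice names the output after the input file's stem.
    pdf_path = out_dir / (src.stem + ".pdf")
    if not pdf_path.exists():
        raise RuntimeError(f"conversion produced no output for {src}")
    return pdf_path
```

    Calling `convert_to_pdf(Path("form.docx"), Path("converted"))` would then give you a PDF to feed into the same import path you already use for PDFs.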

  • Avinash

    Scott Chau: Thanks for your response.

    I am using the FlexiCapture Engine.

    By the way, one general question. I have a couple of structured PDF documents that I want to process. In the Project Setup Station, I am using the Forms option while creating the Document Definition. I just wanted to confirm that this is the correct option/approach for this.

  • Scott Chau

    Yes, if they are structured/fixed forms, you can use that option. Just keep in mind that fixed forms aren't always fixed. Some people print them in a shrunken format, e.g. a Legal-size document printed on Letter-size paper. Then you would need to account for both formats even though they are the same form. Others switch to Semi-Structured, which would account for both formats.
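
    To put a number on the shrink, here is a rough sketch. It assumes the printer scales the page uniformly to fit and keeps it anchored at the top-left corner; many drivers center the page instead, so treat the offsets as illustrative. `fit_scale` and `scale_region` are made-up names for this example, and the field region is invented.

```python
# Page sizes in inches as (width, height).
LEGAL = (8.5, 14.0)
LETTER = (8.5, 11.0)


def fit_scale(source, target):
    """Uniform scale factor that fits the source page inside the target page."""
    return min(target[0] / source[0], target[1] / source[1])


def scale_region(region, scale):
    """Scale a (left, top, width, height) field region, rounded to 3 decimals."""
    return tuple(round(value * scale, 3) for value in region)


scale = fit_scale(LEGAL, LETTER)       # 11/14, roughly 0.786
field_on_legal = (1.0, 7.0, 3.0, 0.5)  # hypothetical field position on the Legal page
field_on_letter = scale_region(field_on_legal, scale)
print(scale, field_on_letter)
```

    A field that sits halfway down the Legal page ends up noticeably higher and smaller on the Letter printout, which is why a single fixed layout will not match both.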

  • Avinash

    Scott Chau: Nice, very helpful, and thank you for the quick response.

  • Avinash

    Scott Chau: Hi Scott, good day.

    I have a few queries, and it would be really helpful if you could offer your help or suggestions.

    1. To read/train MS Word structured documents: are there any prerequisites, initial steps, or settings that I need to configure, either in the Project Setup Station or in the corresponding Document Definition? If yes, can you please provide details?

    2. In one of the fields in the Word document I need to extract the signature. Which field type should I use here, or do you have any suggestions for picking up this value? The signature is an image: when I click on it, Word selects it as an image. Can we extract this, or is it not feasible?

    I would really appreciate your help.

    Thanks

