Number of words in a document

Hi,

Is there a way to get the number of words in a PDF document?
If not, is there a way to get it by developing an add-on?

Thanks for your help

Regards.

Comments

7 comments

  • Nikolai Kromm

    Hi,

    You can use the Words object of each Paragraph to calculate the word count.

    Please note that a word in this object is an internal entity. It is not guaranteed to coincide with a word as understood in natural language, with a word as defined by a regular expression, or with a sequence of characters separated from other words by spaces. The main purpose of the Word object is to provide recognition variants for the word.

    Another possibility is to get the PlainText::Text of the FRDocument object and write a custom word-counting algorithm based, for example, on counting spaces.

    Nikolai
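
    The second approach can be sketched as follows. This is a minimal Python illustration, not ABBYY sample code: it assumes the recognized text has already been exported to an ordinary string (for example, from PlainText::Text) and makes no FineReader Engine calls. The function names and the word pattern are hypothetical choices made for this sketch.

```python
import re

# Assumption: `text` is the plain text already exported from the engine.
# Nothing below calls the FineReader Engine API.

def count_words_by_whitespace(text: str) -> int:
    """Count tokens separated by any run of whitespace."""
    return len(text.split())

# Hypothetical definition of a "word": one or more letters, optionally
# joined by an apostrophe or hyphen (e.g. "don't", "two-line").
# Digits and stand-alone punctuation are not counted.
_WORD_RE = re.compile(r"[^\W\d_]+(?:['’\-][^\W\d_]+)*")

def count_words_by_pattern(text: str) -> int:
    """Count matches of the letter-based word pattern above."""
    return len(_WORD_RE.findall(text))

sample = "Hello, world!  This is a two-line\ntest - with 3 numbers."
whitespace_count = count_words_by_whitespace(sample)
pattern_count = count_words_by_pattern(sample)
```

    The two counts differ on the sample because the whitespace split also counts the stand-alone dash and the number as tokens, while the pattern does not. Which definition is right depends on what "word" should mean for your documents.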

  • Frederic Bernier

    Hi Nikolai,
    Thanks for this answer. I am an experienced developer but totally new to ABBYY and its SDK. Can you point me to an example or starter kit to save me some time?

    Thanks for your help

    Regards

  • Nikolai Kromm

    Hi,

    Could you please tell me your preferred programming language?

  • Frederic Bernier

    No preference; I learn languages as they become necessary. Visual Basic is the easiest for me, but I also use others such as Java and Python.

    Regards.

  • Nikolai Kromm

    Hi,

    I have written an article with Java code samples as a starting point, for future reference: Number of words in a document.

    Please let me know if this information was helpful.

  • Frederic Bernier

    This is great, thanks. I suppose I have to reference the ABBYY libraries, right?

  • Nikolai Kromm

    Hi,

    Sorry, I missed this step.

    You are correct. You can use the Hello sample as a starting point. It is part of the FineReader Engine 12 Developer installation.

    The Hello sample is available in C#, C++ (native COM support), raw C++, Java, Delphi, VBScript, JavaScript, Perl, Visual Basic .NET, and .NET Core.
