Different char coordinates with SourceContentReuseMode = CRM_ContentOnly and SourceContentReuseMode = CRM_DoNotReuse

Question

Why do character coordinates obtained with IObjectsExtractionParams::SourceContentReuseMode = CRM_ContentOnly differ from those obtained with SourceContentReuseMode = CRM_DoNotReuse or SourceContentReuseMode = CRM_Auto?

Answer

When IObjectsExtractionParams::SourceContentReuseMode is set to CRM_ContentOnly, FineReader Engine does not rasterize the PDF. In this case, character coordinates can be obtained only from the command descriptions in the PDF itself, and these often describe rectangles that are larger than the visible text. You can see this by opening the PDF file in Adobe Reader and highlighting part of the text: the highlighted area (the rectangle in the PDF) is larger than the visible text (the rectangle in FREngine). This is how PDF is designed, so the behavior is expected and can occur in various PDF files.
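
The relationship between the two rectangles can be illustrated with a minimal sketch. It does not call FineReader Engine: the Rect type, the helper functions, and the coordinate values are all hypothetical and chosen only to show a rectangle taken from the PDF content enclosing the tighter rectangle around the visible glyph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in image coordinates (top < bottom)."""
    left: int
    top: int
    right: int
    bottom: int


def contains(outer: Rect, inner: Rect) -> bool:
    """Return True if inner lies entirely within outer."""
    return (outer.left <= inner.left and outer.top <= inner.top
            and outer.right >= inner.right and outer.bottom >= inner.bottom)


def padding(outer: Rect, inner: Rect) -> tuple[int, int, int, int]:
    """Extra space of outer around inner as (left, top, right, bottom)."""
    return (inner.left - outer.left, inner.top - outer.top,
            outer.right - inner.right, outer.bottom - inner.bottom)


# Hypothetical coordinates for the same character in the two modes.
content_only_rect = Rect(100, 200, 118, 230)  # assumed: taken from PDF content
rasterized_rect = Rect(101, 207, 117, 224)    # assumed: found on the rasterized page

print(contains(content_only_rect, rasterized_rect))
print(padding(content_only_rect, rasterized_rect))
```

If your application needs to match coordinates from both modes, a containment or overlap check of this kind is likely to be more robust than comparing the rectangles for equality.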
