Different char coordinates with SourceContentReuseMode = CRM_ContentOnly and SourceContentReuseMode = CRM_DoNotReuse

Question

Why do char coordinates obtained with IObjectsExtractionParams::SourceContentReuseMode = CRM_ContentOnly differ from the coordinates obtained with IObjectsExtractionParams::SourceContentReuseMode = CRM_DoNotReuse or IObjectsExtractionParams::SourceContentReuseMode = CRM_Auto?

Answer

When IObjectsExtractionParams::SourceContentReuseMode is set to CRM_ContentOnly, FineReader Engine does not rasterize the PDF. Char coordinates can then be taken only from the text commands stored in the PDF file itself, and the rectangles these commands describe are often bigger than the visible text. You can see this by opening the file in any PDF viewer and highlighting part of the text: the highlighted area (the rectangle in the PDF) is bigger than the visible text (the rectangle FineReader Engine reports when it rasterizes the page). This follows from the design of the PDF format, so the behavior is expected and may occur in different PDF files.
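
To check how much the two rectangles differ for a given character, it may help to compare them directly. The sketch below does not call FineReader Engine: the Rect class, the padding helper, and the coordinate values are hypothetical and only stand in for the rectangles you would read from the recognition results in each mode. It assumes image-style coordinates, where top is less than bottom.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in image coordinates (top < bottom)."""
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, other: "Rect") -> bool:
        # True if `other` lies entirely inside this rectangle.
        return (self.left <= other.left and self.top <= other.top
                and self.right >= other.right and self.bottom >= other.bottom)


def padding(outer: Rect, inner: Rect) -> dict:
    """Extra space on each side of `inner` within `outer`."""
    return {
        "left": inner.left - outer.left,
        "top": inner.top - outer.top,
        "right": outer.right - inner.right,
        "bottom": outer.bottom - inner.bottom,
    }


# Made-up values for one character, for illustration only.
content_only_rect = Rect(100, 200, 112, 224)  # as described in the PDF
rasterized_rect = Rect(101, 206, 111, 219)    # tight box around the visible glyph

print(content_only_rect.contains(rasterized_rect))
print(padding(content_only_rect, rasterized_rect))
```

If the CRM_ContentOnly rectangle consistently contains the other one, with most of the extra space above and below the glyph, that matches the behavior described above and does not indicate a recognition problem.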
