Hello,
I have a problem with my parameters in the FineReader Engine. The Engine doesn't detect tables in our files, only the text, and I can't find the cause in the parameters. We don't use a profile.
Our parameters are:
private FREngine.DocumentProcessingParams setDPP() {
    FREngine.DocumentProcessingParams processingParams = engineLoader.Engine.CreateDocumentProcessingParams();

    // Engine-wide multiprocessing settings
    engineLoader.Engine.MultiProcessingParams.MultiProcessingMode = FREngine.MultiProcessingModeEnum.MPM_Parallel;
    engineLoader.Engine.MultiProcessingParams.RecognitionProcessesCount = Environment.ProcessorCount;

    // Recognition
    processingParams.PageProcessingParams.RecognizerParams.LowResolutionMode = true;
    processingParams.PageProcessingParams.RecognizerParams.OneWordPerLine = true;
    processingParams.PageProcessingParams.RecognizerParams.DetectLanguage = false;
    processingParams.PageProcessingParams.RecognizerParams.TextLanguage = fillLangDatabase();

    // Page analysis
    processingParams.PageProcessingParams.PageAnalysisParams.DetectPictures = false;
    processingParams.PageProcessingParams.PageAnalysisParams.DetectVectorGraphics = false;
    processingParams.PageProcessingParams.PageAnalysisParams.DetectVerticalEuropeanText = true;
    processingParams.PageProcessingParams.PageAnalysisParams.EnableTextExtractionMode = false;
    processingParams.PageProcessingParams.PageAnalysisParams.DetectTables = true;
    processingParams.PageProcessingParams.PageAnalysisParams.AggressiveTableDetection = true;

    // Preprocessing and objects extraction
    processingParams.PageProcessingParams.PagePreprocessingParams.CorrectShadowsAndHighlights = FREngine.ThreeStatePropertyValueEnum.TSPV_Yes;
    processingParams.PageProcessingParams.ObjectsExtractionParams.RemoveGarbage = true;
    //processingParams.PageProcessingParams.ObjectsExtractionParams.EnableAggressiveTextExtraction = true;
    //processingParams.PageProcessingParams.ObjectsExtractionParams.DetectTextOnPictures = true;

    return processingParams;
}
...
Document.Process(setDPP());
Hi
My only suggestion is that it could have something to do with the OneWordPerLine parameter.
What's the result when you set it to false?
From the help:
OneWordPerLine VARIANT_BOOL
This property set to TRUE tells ABBYY FineReader Engine to presume that no text line may contain more than one word, so the lines of text will be recognized as a single word. By default this property is FALSE.
Otherwise, try narrowing it down by removing the properties one at a time; a property that is not set automatically uses its default.
Best regards
Koen de Leijer
Please also try using only the DocumentConversion_Accuracy predefined profile, without any of your own settings. Is the table block detected? If not, please send your source image to SDK_Support@abbyy.com for testing on our side. Thank you!
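For readers who want to try this suggestion, here is a minimal sketch of a profile-only run. It is not from the thread: the method name ProcessWithProfileOnly is hypothetical, `engine` and `document` are assumed to be an initialized engine and a document that already contains the test image, and the block loop assumes the Layout.Blocks collection and BlockTypeEnum.BT_Table as described in the Engine help. Check the names against the help for your Engine version.

```csharp
// Sketch only: needs the ABBYY FineReader Engine interop assembly (FREngine).
// ProcessWithProfileOnly is a hypothetical helper, not part of the SDK.
static bool ProcessWithProfileOnly(FREngine.IEngine engine, FREngine.IFRDocument document)
{
    // Load the predefined profile named in the reply above.
    engine.LoadPredefinedProfile("DocumentConversion_Accuracy");

    // Pass null instead of custom DocumentProcessingParams so that
    // none of the settings from setDPP() are applied.
    document.Process(null);

    // Report whether any page contains a table block (assumed API, see lead-in).
    for (int i = 0; i < document.Pages.Count; i++)
    {
        FREngine.ILayoutBlocks blocks = document.Pages[i].Layout.Blocks;
        for (int j = 0; j < blocks.Count; j++)
        {
            if (blocks[j].Type == FREngine.BlockTypeEnum.BT_Table)
            {
                return true;
            }
        }
    }
    return false;
}
```

If this returns true with the profile alone, the custom parameters are the likely cause; if it returns false, the input image itself is the more likely suspect.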
Hello, everyone,
Thank you for your help. Unfortunately, that didn't help either. The current status is very curious:
Original PDF file -> tables are detected
PDF converted to BMP -> no tables are recognized in the BMP
A section of the BMP (cropped with Paint) -> the tables are recognized
Is there a maximum pixel and file size for images?
Br
Tobi
This is possible: when converting from PDF to BMP, the resolution of your document could have changed, and as a result you get different recognition results. You can try changing the resolution of your BMP file via the API: Developer's Help → API Reference → Image-Related Objects → PrepareImageMode → the Resolution overwriting section. If you would like additional recommendations, kindly send your PDF and BMP files to the ABBYY Technical Support Team for your region for further investigation.
That's it! Thanks for the solution.
But is there another way to create an image from the PDF? Or can I increase the resolution? I get an image of about 100 dpi from a 300 dpi PDF.
My code:
Hi Tobi,
We have successfully downloaded your documents, thank you!
You can convert your PDF file to an image format by using the FREngine API, i.e. use the WriteToFile method of the Image object as shown below in the C# code snippet:
You can choose any image format listed among the ImageFileFormatEnum values.
It is also possible to change the resolution of your image during the image preprocessing stage, before recognition. You can also do this via the FREngine API by using the additional PrepareImageMode settings:
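The original snippet is not preserved in this thread. As a stand-in, here is a minimal sketch of resolution overwriting with PrepareImageMode. It is an illustration under assumptions, not the support team's code: the method name AddImageWithResolution and its parameters are hypothetical, and the AddImageFile argument list shown here may differ between Engine versions.

```csharp
// Sketch only: needs the ABBYY FineReader Engine interop assembly (FREngine).
// AddImageWithResolution is a hypothetical helper, not part of the SDK.
static void AddImageWithResolution(
    FREngine.IEngine engine,
    FREngine.IFRDocument document,
    string imagePath,   // e.g. the BMP produced from the PDF
    int dpi)            // the resolution the engine should assume, e.g. 300
{
    FREngine.IPrepareImageMode prepareMode = engine.CreatePrepareImageMode();

    // Tell the engine to ignore the resolution stored in the file
    // and use the given values instead.
    prepareMode.OverwriteResolution = true;
    prepareMode.XResolutionToOverwrite = dpi;
    prepareMode.YResolutionToOverwrite = dpi;

    // Third argument: page indices; null is assumed to mean "all pages".
    document.AddImageFile(imagePath, prepareMode, null);
}
```

Note that overwriting only changes the resolution value the engine assumes for the image; it does not add pixels. If the BMP was really rendered at about 100 dpi, exporting the image at a higher resolution in the first place is likely to give better results than overwriting the value.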