We have a developer trial licence for the ABBYY engine, and according to the licence manager, it supports Arabic. I tried our development product with some Arabic text and it did not recognise it. Obviously our product works fine with Latin text. Do we have to specifically configure the ABBYY OCR engine to recognise Arabic, or does it do it by default? If so, how please? We are using the C++ interface, creating an instance of FREngine::IEngine etc.
Thank you. Leif
Comments
I think I have solved this one myself:
FREngine::IRecognizerParamsPtr recognizerParamsPtr = enginePtr->CreateRecognizerParams();
recognizerParamsPtr->SetPredefinedTextLanguage("Arabic");
It appears to work. :)
Incidentally, the text recognition accuracy depends on the font. I find that it does not work with Iranian and Saudi ID cards, but it did with a sample in a different Arabic font.
I also find that for text on a non-uniform background, we need these settings:
void SetParameters(FREngine::IObjectsExtractionParamsPtr objectsExtractionParamsPtr)
{
    objectsExtractionParamsPtr->DetectTextOnPictures = VARIANT_TRUE;
    objectsExtractionParamsPtr->RemoveGarbage = VARIANT_TRUE;
    objectsExtractionParamsPtr->EnableAggressiveTextExtraction = VARIANT_TRUE;
}
Please let me know if you find any 'tricks'. And I hope you are not one of our competitors. ;)
You can also try to use the TextExtraction_Accuracy predefined profile. Please read more about how to work with profiles in the Developer's Help→Guided Tour→Advanced Techniques→Working with Profiles and Specifications→Predefined Profiles Specification.
We are facing the same issue: the ABBYY OCR engine is not recognizing Arabic. We tried the code below, but with no luck.
engine.LoadPredefinedProfile( "DocumentConversion_Accuracy" );
engine.CreateRecognizerParams().SetPredefinedTextLanguage("Arabic");
Please suggest
Sharath
Check your ABBYY licence manager to see that you are licenced to use Arabic. The licence manager is in the ABBYY installation folder.
Thanks for the quick response.
Yes, the license we received is provisioned for "Arabic".
Make sure you use the recogniser params! For example:
FREngine::IPageAnalysisParamsPtr pageAnalysisParamsPtr = enginePtr->CreatePageAnalysisParams();
SetParameters(pageAnalysisParamsPtr);
FREngine::IRecognizerParamsPtr recognizerParamsPtr = enginePtr->CreateRecognizerParams();
if (extractTextParameters._languages.length() > 0)
{
    recognizerParamsPtr->SetPredefinedTextLanguage(extractTextParameters._languages.c_str());
}
FREngine::IRegionPtr regionPtr = enginePtr->CreateRegion();
RECT& rect = regions.back().Rect;
regionPtr->AddRect(rect.left, rect.top, rect.right, rect.bottom);
documentPtr->Pages->Item(0)->AnalyzeRegion(regionPtr, pageAnalysisParamsPtr, NULL, recognizerParamsPtr);
Alternatively:
FREngine::IPageProcessingParamsPtr pageProcessingParamsPtr = enginePtr->CreatePageProcessingParams();
if (extractTextParameters._languages.length() > 0)
{
    pageProcessingParamsPtr->RecognizerParams->SetPredefinedTextLanguage(extractTextParameters._languages.c_str());
}
// Set other parameters here ...
// First page
documentPtr->Pages->Item(0)->PreprocessAnalyzeRecognize(pageProcessingParamsPtr);
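To illustrate why the two-line attempt earlier in the thread probably had no effect: a params object returned by CreateRecognizerParams() only matters if it is handed to the call that does the recognition. Below is a minimal sketch of that pattern. The types RecognizerParams and Engine and the function Recognize are hypothetical stand-ins written for this illustration, not the FineReader Engine API, and the "English" default is an assumption made only so the difference is visible.

```cpp
#include <string>

// Hypothetical stand-ins for illustration only; NOT the FineReader Engine API.
struct RecognizerParams {
    std::string language = "English";  // assumed default, for this sketch only
    void SetPredefinedTextLanguage(const std::string& name) { language = name; }
};

struct Engine {
    // Every call hands back a new, independent params object.
    RecognizerParams CreateRecognizerParams() const { return RecognizerParams{}; }
};

// Stand-in for a recognition call: it only sees the params it is given.
std::string Recognize(const RecognizerParams& params) { return params.language; }

// Mistake: the language is set on a temporary that is discarded straight away,
// so the later recognition call never sees it.
std::string LanguageWhenParamsDiscarded() {
    Engine engine;
    engine.CreateRecognizerParams().SetPredefinedTextLanguage("Arabic");
    return Recognize(engine.CreateRecognizerParams());
}

// Fix: keep the params object and pass that same object to the recognition call.
std::string LanguageWhenParamsPassed() {
    Engine engine;
    RecognizerParams params = engine.CreateRecognizerParams();
    params.SetPredefinedTextLanguage("Arabic");
    return Recognize(params);
}
```

In this sketch the first function still recognises with the default language, while the second uses Arabic. Both real snippets above follow the second pattern: the params are either passed to AnalyzeRegion or set on the page processing params that are then given to PreprocessAnalyzeRecognize.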
Thank you, Leif. It worked with the code below. OCR now works for the "Arabic" language, but the results are not accurate.
Do let me know if I am missing something here.
//Set Language
IPageProcessingParams oIPagePreprocessingParams = engine.CreatePageProcessingParams();
if(engine.getPredefinedLanguages().getCount()>0)
{
    oIPagePreprocessingParams.getRecognizerParams().SetPredefinedTextLanguage(predefinedTextLanguage);
}
IDocumentProcessingParams oIDocumentProcessingParams = engine.CreateDocumentProcessingParams();
oIDocumentProcessingParams.setPageProcessingParams(oIPagePreprocessingParams);
// Process document
logger.info( "Process..." );
document.Process( oIDocumentProcessingParams );
Sharath
You might be doing fine. Arabic character recognition is very variable, as ABBYY acknowledge. It works great with some fonts, not so well with others, and it is affected by the background too.
Hi Sharath,
I would recommend that you contact your regional Technical Support and discuss your situation in more detail by email. All ABBYY contacts are available here. Please attach a sample image that reproduces the issue to your message, and specify the build number of FineReader Engine.