Arabic text recognition (Answered)

We have a developer trial licence for the ABBYY engine, and according to the licence manager, it supports Arabic. I tried our development product with some Arabic text and it did not recognise it. Obviously our product works fine with Latin text. Do we have to specifically configure the ABBYY OCR engine to recognise Arabic, or does it do it by default? If so, how please? We are using the C++ interface, creating an instance of FREngine::IEngine etc.

Thank you. Leif

Comments

10 comments

  • Permanently deleted user

    I think I have solved this one myself:


            FREngine::IRecognizerParamsPtr recognizerParamsPtr = enginePtr->CreateRecognizerParams();
            recognizerParamsPtr->SetPredefinedTextLanguage("Arabic");

    It appears to work. :)
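
    If more than one language is needed (ID cards often mix Arabic with Latin text), my understanding from the Developer's Help is that SetPredefinedTextLanguage accepts several predefined language names in one comma-separated string. Treat that as an assumption and check it against your build. A minimal sketch of a helper that builds such a string (JoinLanguages is a hypothetical name, not part of the ABBYY API):

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of FREngine: joins language names into the
// comma-separated form assumed to be accepted by SetPredefinedTextLanguage.
std::string JoinLanguages(const std::vector<std::string>& names)
{
    std::string joined;
    for (const std::string& name : names)
    {
        if (name.empty())
        {
            continue; // skip blank entries so no stray commas are produced
        }
        if (!joined.empty())
        {
            joined += ',';
        }
        joined += name;
    }
    return joined;
}
```

    The result would then be passed as recognizerParamsPtr->SetPredefinedTextLanguage(JoinLanguages({"Arabic", "English"}).c_str()), assuming the licence covers every language listed.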

  • Permanently deleted user

    Incidentally, the text recognition accuracy depends on the font. I find that it does not work with Iranian and Saudi ID cards, but it did with a sample in a different Arabic font.

    I also find that for text on a non-uniform background, we need these settings:


    void SetParameters(FREngine::IObjectsExtractionParamsPtr objectsExtractionParamsPtr)
    {
        objectsExtractionParamsPtr->DetectTextOnPictures = VARIANT_TRUE;
        objectsExtractionParamsPtr->RemoveGarbage = VARIANT_TRUE;
        objectsExtractionParamsPtr->EnableAggressiveTextExtraction = VARIANT_TRUE;
    }

    Please let me know if you find any 'tricks'. And I hope you are not one of our competitors. ;)

  • Permanently deleted user

    You can also try to use the TextExtraction_Accuracy predefined profile. Please read more about how to work with profiles in the Developer's Help→Guided Tour→Advanced Techniques→Working with Profiles and Specifications→Predefined Profiles Specification.

  • Permanently deleted user

    We are facing the same issue: the ABBYY OCR engine is not recognizing Arabic. We even tried the code below, but with no luck.

     engine.LoadPredefinedProfile( "DocumentConversion_Accuracy" );
     engine.CreateRecognizerParams().SetPredefinedTextLanguage("Arabic");

     

    Please advise.

  • Permanently deleted user

    Sharath,

    Check your ABBYY licence manager to confirm that you are licensed to use Arabic. The licence manager is in the ABBYY installation folder.

  • Permanently deleted user

    Thanks for the quick response.

    Yes, the license we received is provisioned for Arabic.

  • Permanently deleted user

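    Make sure you actually pass the recogniser params to the call that does the recognition. Creating them and setting the language is not enough on its own. For example: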

            FREngine::IPageAnalysisParamsPtr pageAnalysisParamsPtr = enginePtr->CreatePageAnalysisParams();
            SetParameters(pageAnalysisParamsPtr);

            FREngine::IRecognizerParamsPtr recognizerParamsPtr = enginePtr->CreateRecognizerParams();
            if (extractTextParameters._languages.length() > 0)
            {
                recognizerParamsPtr->SetPredefinedTextLanguage(extractTextParameters._languages.c_str());
            }

            FREngine::IRegionPtr regionPtr = enginePtr->CreateRegion();
            RECT& rect = regions.back().Rect;
            regionPtr->AddRect(rect.left, rect.top, rect.right, rect.bottom);
            documentPtr->Pages->Item(0)->AnalyzeRegion(regionPtr, pageAnalysisParamsPtr, NULL, recognizerParamsPtr);

    Alternatively:

        FREngine::IPageProcessingParamsPtr pageProcessingParamsPtr = enginePtr->CreatePageProcessingParams();

        if (extractTextParameters._languages.length() > 0)
        {
            pageProcessingParamsPtr->RecognizerParams->SetPredefinedTextLanguage(extractTextParameters._languages.c_str());
        }

        // Set other parameters here ...

        // First page
        documentPtr->Pages->Item(0)->PreprocessAnalyzeRecognize(pageProcessingParamsPtr);
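
    To illustrate why calling engine.CreateRecognizerParams().SetPredefinedTextLanguage("Arabic") on its own has no effect, here is a toy model. ToyEngine and ToyRecognizerParams are hypothetical stand-ins, not FREngine types, and the "English" default is an assumption made only for this sketch. It shows the general pattern: a Create... method hands back a fresh object each time, so a setting applied to an object that is never passed on is simply lost.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for a recogniser parameters object (not an FREngine type).
struct ToyRecognizerParams
{
    std::string language = "English"; // default assumed for this sketch only
};

// Hypothetical stand-in for the engine (not an FREngine type).
struct ToyEngine
{
    // Each call returns a new, independent parameters object.
    std::shared_ptr<ToyRecognizerParams> CreateRecognizerParams() const
    {
        return std::make_shared<ToyRecognizerParams>();
    }

    // Returns the language a recognition call would use:
    // the one in the supplied params, or the default when none are supplied.
    std::string Recognize(const std::shared_ptr<ToyRecognizerParams>& params) const
    {
        if (params)
        {
            return params->language;
        }
        return ToyRecognizerParams().language;
    }
};

// Sets the language on a temporary object that is dropped immediately.
std::string DiscardedParams(const ToyEngine& engine)
{
    engine.CreateRecognizerParams()->language = "Arabic";
    return engine.Recognize(nullptr); // the setting above never reaches this call
}

// Keeps the object and hands it to the recognition call.
std::string PassedParams(const ToyEngine& engine)
{
    std::shared_ptr<ToyRecognizerParams> params = engine.CreateRecognizerParams();
    params->language = "Arabic";
    return engine.Recognize(params);
}
```

    In the toy, DiscardedParams still recognises with the default language while PassedParams uses Arabic, which mirrors the difference between the two-line attempt above and the snippets in this comment.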

  • Permanently deleted user

    Thank you, Leif. It worked with the code below. OCR now runs for Arabic, but the results are not accurate.

    Do let me know if I am missing something here.

    // Set language
    IPageProcessingParams oIPagePreprocessingParams = engine.CreatePageProcessingParams();
    if (engine.getPredefinedLanguages().getCount() > 0)
    {
        oIPagePreprocessingParams.getRecognizerParams().SetPredefinedTextLanguage(predefinedTextLanguage);
    }
    IDocumentProcessingParams oIDocumentProcessingParams = engine.CreateDocumentProcessingParams();
    oIDocumentProcessingParams.setPageProcessingParams(oIPagePreprocessingParams);

    // Process document
    logger.info( "Process..." );
    document.Process( oIDocumentProcessingParams );

  • Permanently deleted user

    Sharath

    You might be doing fine. Arabic character recognition is very variable, as ABBYY acknowledge. It works great with some fonts, not so well with others, and it is affected by the background too.

  • Permanently deleted user

    Hi Sharath,

    I would recommend contacting your regional Technical Support and discussing your situation in more detail by email. All ABBYY contacts are available here. Please attach a sample image that reproduces the issue and specify the build number of your FineReader Engine.
