i'm trying to find a way to use the api methods to search an image for certain signal words and get their positions - ideally with regular expressions. this is for extracting relevant data from invoices etc.
example: i want to check if the word "total" appears somewhere within the image and get its coordinates. then i check the image for any occurrences of decimal values, get their coordinates and select the one that is closest to "total". any ideas?
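to make the idea concrete, here is a rough python sketch of the "closest decimal value" step. it assumes the recognized words are already available as (text, left, top, right, bottom) tuples; that layout, the sample data and the names `closest_decimal`, `KEYWORD` and `DECIMAL` are made up for illustration and are not part of the api.

```python
import math
import re

# hypothetical patterns: the signal word and a simple decimal amount like 1,234.56
KEYWORD = re.compile(r"total:?", re.IGNORECASE)
DECIMAL = re.compile(r"\d{1,3}(?:[.,]\d{3})*[.,]\d{2}")


def center(box):
    """Return the center point of a (left, top, right, bottom) box."""
    left, top, right, bottom = box
    return ((left + right) / 2, (top + bottom) / 2)


def closest_decimal(words, keyword=KEYWORD, decimal=DECIMAL):
    """Return the decimal word nearest to the first keyword hit, or None."""
    anchors = [w for w in words if keyword.fullmatch(w[0])]
    values = [w for w in words if decimal.fullmatch(w[0])]
    if not anchors or not values:
        return None
    ax, ay = center(anchors[0][1:])

    def distance(word):
        cx, cy = center(word[1:])
        return math.hypot(cx - ax, cy - ay)

    return min(values, key=distance)


# made-up sample data: (text, left, top, right, bottom) per recognized word
words = [
    ("Subtotal", 300, 700, 380, 720),
    ("100.00", 450, 700, 520, 720),
    ("Tax", 300, 730, 340, 750),
    ("19.00", 460, 730, 520, 750),
    ("Total", 300, 760, 350, 780),
    ("119.00", 450, 760, 520, 780),
]

match = closest_decimal(words)
```

with this sample data `match[0]` is "119.00", because that value sits on the same line as "Total". plain center-to-center distance is only one possible choice; preferring values to the right of or below the keyword might work better on real invoices.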
of course i could parse the xml output of processImage myself in php with regular expressions and use the coordinates of the first and last character of each hit. but this wouldn't work if "total", for example, was recognized as "tota 1" or something similar, so i was thinking there might be a way to tell the ocr directly that it should be looking for "total" and thus make it more likely to return "total" than "tota 1". hope i described my problem clearly, appreciate any thoughts! cheers
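for reference, the fallback i can think of is fuzzy matching after recognition, which does not tell the ocr anything and only tolerates its mistakes. a minimal python sketch, assuming the recognized words are available as a plain list of strings; the confusion table, the threshold and the names `find_fuzzy` and `similarity` are my own guesses, not something the api provides.

```python
from difflib import SequenceMatcher

# assumed table of characters the ocr tends to confuse (digit one / pipe vs. letter l, zero vs. o)
CONFUSIONS = str.maketrans({"1": "l", "|": "l", "0": "o"})


def normalize(text):
    """Lowercase the text and map commonly confused characters to one form."""
    return text.lower().translate(CONFUSIONS)


def similarity(a, b):
    """Similarity ratio between 0.0 and 1.0 after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_fuzzy(tokens, target, threshold=0.8, max_span=2):
    """Find the best run of up to max_span adjacent tokens resembling target.

    Returns (start, end, score) with end exclusive, or None if nothing
    reaches the threshold. Joining adjacent tokens covers words that the
    ocr split in two, such as "tota" followed by "1".
    """
    best = None
    for start in range(len(tokens)):
        last = min(start + max_span, len(tokens))
        for end in range(start + 1, last + 1):
            candidate = "".join(tokens[start:end])
            score = similarity(candidate, target)
            if score >= threshold and (best is None or score > best[2]):
                best = (start, end, score)
    return best


tokens = ["invoice", "tota", "1", "119.00"]
hit = find_fuzzy(tokens, "total")
```

here `hit` covers the tokens "tota" and "1", so the coordinates of the first character of "tota" and the last character of "1" would give the position of the signal word. the confusion table should only be applied when matching keywords, not when reading the decimal values themselves.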