Hi,
We currently OCR lots of documents using Recognition Server 4.0, which works fine, but I was wondering whether it is possible to OCR documents and, when Recognition Server finds the word 'TEST' within a document, output that document to a different folder called 'TEST'?
We currently have an Input folder and an Output folder, but I'd like documents with the word 'TEST' in them to go into the 'TEST' folder instead.
Thanks.
Comments
The recognized text is available at the Document Separation or Indexing stage. You can create a script that checks the document text (e.g. via RecognizedPage.Text) and looks for the word "TEST" in it. If the word is found on the page, assign a value to CustomText (e.g. set it to "TEST").
At the export stage, create another script that checks the CustomText value and chooses the appropriate destination for the document.
For more details and samples, please refer to Recognition Server 4.0 Help > Remote Administration Console > How to... > Use scripts (Document Separation for setting CustomText, and Export Handling for using the CustomText value to put the result into a certain folder).
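The routing decision itself can be sketched as a small helper that does not depend on any Recognition Server objects. This is only an illustration: chooseOutputFolder, the routes map, and the folder paths are hypothetical names. How the export script reads CustomText and sets the destination is covered by the Help topic above and is not shown here.

```javascript
// Hypothetical helper: map a CustomText value to an output folder.
// Written in ES3 style so that a JScript engine should also accept it.
function chooseOutputFolder(customText, routes, defaultFolder) {
    // hasOwnProperty avoids accidental matches on inherited names such as "toString".
    if (customText && Object.prototype.hasOwnProperty.call(routes, customText)) {
        return routes[customText];
    }
    // No keyword was recorded for this document: use the normal output folder.
    return defaultFolder;
}

// Example folder layout (assumed paths, adjust to your own workflow).
var routes = { "TEST": "C:\\Output\\TEST" };
var defaultFolder = "C:\\Output";
```

Keeping the keyword-to-folder mapping in one object means that adding a second keyword is a one-line change rather than another if block.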
Hi Katja,
I have nearly got my script working, but I find that smaller page files overwrite the output txt file. One way to combat this would be to name the ABBYY.txt file after the actual input filename.
At the Document Separation stage, what is the method used to get the filename and extension?
I cannot seem to find any documentation on this, so I'd be grateful if you could let me know.
My Document Separation code:
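// Text of the current page, as exposed to the Document Separation script.
var pageText = Text;
var n = "test1";
var v = "test2";
// First keyword found: tag the document and log the match.
if ( pageText.indexOf(n) > -1 )
{
    CustomText = n;
    var fso = new ActiveXObject("Scripting.FileSystemObject");
    // The second argument (true) overwrites any existing file.
    var a = fso.CreateTextFile("C:\\ABBYY.txt", true);
    a.WriteLine(n);
    a.Close();
}
// Second keyword found: same steps; this replaces CustomText and the log file.
if ( pageText.indexOf(v) > -1 )
{
    CustomText = v;
    var fso = new ActiveXObject("Scripting.FileSystemObject");
    var a = fso.CreateTextFile("C:\\ABBYY.txt", true);
    a.WriteLine(v);
    a.Close();
}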
Hello,
In RS scripts, the RecognizedPage object is the one used at the Document Separation stage. This object has an InputFileProperties property, which can be used to get FileExtension and FileName as strings.
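A minimal sketch of how that property could address the overwriting problem, by naming the text file after the input file. The helper buildLogPath and the C:\Logs folder are hypothetical. The commented-out wiring assumes that InputFileProperties.FileName can be reached from the Document Separation script in the same way as Text and CustomText; whether FileName already includes the extension should be checked against the Help.

```javascript
// Hypothetical helper: build a per-document log path from the input file name.
function buildLogPath(folder, fileName) {
    // Drop a trailing extension such as ".tif" if one is present.
    var dot = fileName.lastIndexOf(".");
    var baseName = dot > 0 ? fileName.substring(0, dot) : fileName;
    return folder + "\\" + baseName + ".txt";
}

// Inside the Document Separation script (assumed wiring, not verified):
// var path = buildLogPath("C:\\Logs", InputFileProperties.FileName);
// var fso = new ActiveXObject("Scripting.FileSystemObject");
// var a = fso.CreateTextFile(path, true);
// a.WriteLine(CustomText);
// a.Close();
```

With one log file per input document, a short file processed later no longer replaces the result written for an earlier one.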