ABBYY is interested in what features you would like to see in our software. If you have an idea for improving data capture software, please submit your suggestion in this thread.
Please note that we can’t guarantee that all your proposed features will be implemented, but we take all your proposals into consideration.
Your ideas and suggestions are very important to us: they will help us develop better data capture solutions and help you bring your business to the next level.
Comments
29 comments
Is this discussion still open?
If so, how about adding an "Append if file exists" feature to RS 4 (Workflow Settings -> 6. Output)? I saw a similar feature mentioned in a post about FlexiCapture.
Sometimes I use settings where one input file is processed in two workflows, and in this case the file is counted twice against the PPM license. The possibility of counting it just once per source file would be really welcome.
Sorry for being so late with my response. Could you please describe your scenario? For what purpose do you need to process a document in two workflows?
Kind regards,
Anna
\.
Unfortunately, only one backslash can be used. We would like to use at least two backslashes, like this:
\\.
so that a subdirectory is created automatically for the year and for each month.
Kind regards
Klaus
Here's a new request:
Please allow Verification Station to utilize user pattern recognition sets and custom dictionaries when re-OCRing text.
Thanks!
Hello chener,
Thank you for your request.
As for custom dictionaries, they can be used to improve recognition of each field, and they help a lot to increase recognition quality because they tell the program exactly what can be expected in each field. Custom dictionaries can be added in the Document Definition Editor: open the field properties, go to the Data tab of the field properties dialog box, and click the Edit button.
If you see a need to use patterns and custom dictionaries for full-text OCR, could you please describe a use case in which this feature would help to increase OCR quality? Do you use some specific characters and words?
Best regards,
Irina.
Let me clarify. Recognition Server already allows the use of patterns and custom dictionaries. However, those patterns and dictionaries are only used during the "Process" stage of the workflow. During "Quality Control" / Verification, they are ignored, which makes re-OCR of any text in Verification Station much less accurate. Does that make sense?
Would it be possible to add user-definable auto-correction to the workflow during the Process stage?
Would it be possible to restrict usage of only certain Area Types (text, table, picture, etc.) during Page Analysis? For example, there are some documents that should never have tables as Area Types, but Recognition Server still tries to make certain parts of the document tables, so we would like to prevent that from happening.
Thank you.
Hi chener, thank you for your question!
It is possible to influence the analysis process by changing parameters in the Configuration.xml file (the Recognition Server settings file). The ProhibitTableDetection="false" and ProhibitPictureDetection="false" parameters can help to detect text instead of tables and to ignore picture objects. By default their values are set to "false"; please change them to "true".
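For anyone who prefers to script this change, here is a minimal Python sketch. It assumes only that the two parameters appear as XML attributes somewhere in Configuration.xml; the element names in the sample fragment are hypothetical, and the real file layout may differ. Back up the file before changing it.

```python
import xml.etree.ElementTree as ET

# Attribute names taken from the post above.
FLAGS = ("ProhibitTableDetection", "ProhibitPictureDetection")

def enable_prohibit_flags(xml_text, flags=FLAGS):
    """Set each listed attribute from "false" to "true" wherever it occurs.

    Returns the updated XML text and the number of attributes changed.
    """
    root = ET.fromstring(xml_text)
    changed = 0
    for elem in root.iter():
        for flag in flags:
            if elem.get(flag) == "false":
                elem.set(flag, "true")
                changed += 1
    return ET.tostring(root, encoding="unicode"), changed

# Hypothetical fragment for illustration only; not the real file structure.
sample = (
    '<Configuration>'
    '<Analysis ProhibitTableDetection="false" ProhibitPictureDetection="false"/>'
    '</Configuration>'
)
updated, count = enable_prohibit_flags(sample)
print(count, "attribute(s) changed")
```

To apply it to a real file, read the file contents into a string, pass it to `enable_prohibit_flags`, and write the result back.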
Could you please clarify what is required by giving an example?
Thank you! I will give it a try!
Let me provide a basic example:
There are certain characters the OCR engine has difficulty discerning, e.g. I, l, and 1.
So certain words like Ill. (the abbreviation for the state of Illinois) will be misinterpreted as 111.
In this example, we would like the ability to auto-correct all recognized "111" words as "Ill."
Hope that makes sense.
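Until something like this is built in, the same idea can be sketched as a post-processing step on the exported text. This is only an illustration in Python, not a Recognition Server feature: the `CORRECTIONS` table and the `autocorrect` function are hypothetical names, and the rule is context-blind, so a genuine "111." at the end of a sentence would also be replaced.

```python
import re

# Hypothetical user-defined correction table: misrecognized token -> intended token.
CORRECTIONS = {"111.": "Ill."}

def autocorrect(text, corrections=CORRECTIONS):
    """Replace whole tokens listed in `corrections`; leave everything else alone."""
    if not corrections:
        return text
    # Longest keys first so that overlapping entries prefer the longer match.
    keys = sorted(corrections, key=len, reverse=True)
    # The lookarounds keep the match from starting or ending inside a longer word/number.
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(k) for k in keys) + r")(?!\w)"
    )
    return pattern.sub(lambda m: corrections[m.group(1)], text)

print(autocorrect("Chicago, 111. 60601"))
```

A number such as 1111. or 60111.5 is left unchanged, because the match may not begin or end in the middle of a longer run of digits.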
Hi chener,
Custom dictionaries are NOT ignored when doing re-OCR at the Verification Station. Please check, they should work.
And thank you for the suggestion regarding use of patterns during re-recognition at verification stage! We will note this for future versions.
In most cases this problem can be solved by using custom dictionaries. In the next version we plan to allow operators to add words to the dictionary used for OCR by the server and the operator stations.
How do I add some patterns which should not be read and should be fully ignored, without any spaces?
Also, is there any way to add my own font and train ABBYY on it...
Hi! You can use FineReader 12 Professional/Corporate to train a user pattern for your font. The resulting file (with the *.fbt extension) can be uploaded in the Recognition Server workflow properties (2.Process > Advanced Processing Settings... > Apply user patterns).
In Recognition Server with FineReader XIX module, Advanced Language Properties for Old German, Old French, etc. shows the following:
Punctuation marks adjoining the beginning of a word: "'(-.[{©«»—‘“„•■□▲△►▻▼▽◄◅◊◎◦★☆♦✓❖
Punctuation marks adjoining the end of a word: !"')*,-.:;?]}©«®»—’“”™
Standalone punctuation marks: !"#$%&'()*+,-./:;<=>?[]_{}£¥§©«°»—’“”„•€■□▲△►▻▼▽◄◅◊◎◦★☆♦✓❖
Some of these characters did not exist at the time the Old German, Old French, etc. languages were used! Including the characters reduces recognition accuracy. For example, Old German letter "G" or "S" in Fraktur font is sometimes mistakenly recognized as ©!
In Recognition Server, it does not seem to be possible to create a new language based on Old German and remove these "bad" characters. It is possible to create new languages in FineReader, but not in Recognition Server.
Suggestion 1) Make it possible to create new languages in Recognition Server.
Suggestion 2) In Recognition Server with the FineReader XIX module, remove the "bad" characters from the punctuation list by default and add them to the "prohibited characters" list. This is an easy way to increase recognition accuracy for FineReader XIX users. When recognizing Old German, should we really see ©, ®, ™, etc. in the results? (In my experience with other languages and old documents, recognition accuracy also increases if we prohibit ^$@.)
Thank you.
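To get a rough idea of how often these characters slip into results, the exported text can be scanned after recognition. The Python sketch below is an illustration only: the `SUSPECT` set is taken from the characters named in this post, the function name is hypothetical, and it merely counts occurrences without changing how Recognition Server recognizes anything.

```python
from collections import Counter

# Characters named in the post as anachronistic for Old German, Old French, etc.
SUSPECT = set("©®™^$@")

def suspect_counts(text, suspects=SUSPECT):
    """Count how many times each suspect character occurs in recognized text."""
    return Counter(ch for ch in text if ch in suspects)

# Made-up sample where a Fraktur "G" and "S" were read as the copyright sign.
counts = suspect_counts("©ott mit un© ®")
for ch, n in counts.most_common():
    print(ch, n)
```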
I would like to see more intelligent automatic numbering of fields in Flexicapture when they are duplicated by copy/paste or dragging.
In the current editor, numbers are automatically appended to the names of fields when they are duplicated, but the numbering ignores all previous numbers and just adds a digit to the end. This means you get different results depending on what technique you use for creating large numbers of fields.
Example: if you create a checkbox group, call it F01, and then duplicate it by copying and pasting it repeatedly, the incrementing digits start with 2, not 1.
So copying and pasting five times, I get this:
F01, F012, F013, F014, F015
However if I create the copies by repeatedly using CTRL-drag to create new copies, I get this:
F01, F012, F0122, F01222, F012222
Furthermore, if you try to create a large range of fields by selecting multiple fields and copying or dragging them at once, the numbering gets even more confusing.
Creating a single field, copying it four times, then selecting the group of four and copying that repeatedly, produces this numbering:
F01, F012, F0122, F01222, F013, F0123, F01223, F012222, F014, F0124, F01224, F012223
There should be options that control how numbering is done, such as an option to ensure that leading zeroes are always applied, or to control the incrementing.
This could be done dynamically during field creation, but it would probably be easier to do after creation, with a tool that allows you to select a group of fields and then apply numbering to them based on their position on the page (top down, left to right, etc.).
great
Hi, thank you for the suggestion.
To help us understand the priority, may I ask how often you use automatic naming? How often do you need to copy fields? Which kinds of documents require that? I guess it might be an application form with a lot of checkmarks?
Best regards,
Irina.
Yes, working with a form with many checkmarks is exactly when I encountered the problem. I've spent many hours on document definitions over the past month, and a good percentage of that time went into renaming and renumbering copied fields.
We have hundreds of assessment tests that can have up to 200 multiple choice questions. The exported fields need to have consistent sequentially numbered names for programmatic reading.
Creating the boxes automatically does not give them appropriate names and groupings, and creating them one at a time is very time consuming.
Using copy-paste or drag-and-drop is much more efficient, but the automatically generated names then have to be edited manually, which is still error prone and slow.
I wrote myself a simple AutoIt script that automates the renaming process for me, which saves time and errors, but it still requires some manual setup to use. I believe this should be a built-in feature.
Select a range of fields, and apply sequential numbering to them.
Allow use of a prefix string and add sequential numbers. Allow choice of using leading zeros or not.
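The requested behavior can be sketched in a few lines of Python. This is not FlexiCapture code and not the AutoIt script mentioned above; the function names and the `(name, x, y)` field tuples are hypothetical and only illustrate the naming rule: a prefix, sequential numbers, optional leading zeros, and ordering by position on the page.

```python
def sequential_names(count, prefix="F", start=1, min_width=1):
    """Return `count` names: prefix + zero-padded sequential number."""
    last = start + count - 1
    # Pad to at least `min_width` digits, and always enough to fit the largest number.
    width = max(min_width, len(str(last)))
    return [f"{prefix}{n:0{width}d}" for n in range(start, last + 1)]

def renumber_by_position(fields, prefix="F", min_width=2):
    """Map old field names to new sequential names, ordered top down, then left to right.

    `fields` is a list of (name, x, y) tuples; y grows downward on the page.
    """
    ordered = sorted(fields, key=lambda f: (f[2], f[1]))
    new_names = sequential_names(len(ordered), prefix, min_width=min_width)
    return {old[0]: new for old, new in zip(ordered, new_names)}

# Names as produced by repeated CTRL-drag, with made-up coordinates.
fields = [("F012222", 10, 300), ("F01", 10, 100), ("F0122", 10, 200)]
mapping = renumber_by_position(fields)
print(mapping)
```

Setting `min_width=1` gives names without leading zeros for short ranges, and a 200-question form automatically gets three-digit numbers.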
I installed FRE 11 to enable the language editor option. I would like to know if an image could be used as a new pattern.
The circled 11 and 1L are interpreted as @.
I clicked the button next to "Alphabet" button.
Since I could not type the circled 11 and 1L (for example), I tried adding a circled 11 and 1L image.
Is this possible?
Is my understanding correct?
Please do the needful.
In the FineReader editor, if a paragraph at the end of a page continues on the next page, this is currently not detected and the paragraph is split. Please recognize that the first lines on the next page, if not indented, belong to the paragraph at the end of the previous page.
Dear Isaac Ben Harush, thank you for your message. Could you please write your product name (FineReader PDF or FineReader Server) and version?
Thank you.
Best Regards,
Oleksandr Musatkin
Product Marketing Manager
Thank you for replying. My version is FineReader PDF 15 Standard. This was also observed in version 14.
After recognizing all pages, I save as TXT - Plain Text. In the options I select "create a single file for all pages". The resulting text file has separate paragraphs wherever a paragraph in the source images continued on the next page. (The first line on the next page is not indented, and therefore belongs to the paragraph at the end of the previous page.)
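As a possible workaround, the rule described above can be sketched as a post-processing step in Python. It assumes the text is available page by page, that paragraphs within a page are separated by blank lines, and that a new paragraph starts with an indented line; none of this is guaranteed by the TXT export, and the function name is hypothetical. The heuristic also does not rejoin words hyphenated across the page break, and it would wrongly merge an unindented heading at the top of a page.

```python
def merge_across_pages(pages):
    """Join a page's first paragraph to the previous page's last one if it is not indented."""
    paragraphs = []
    for page in pages:
        page_paras = [p.strip("\n") for p in page.split("\n\n") if p.strip()]
        for i, para in enumerate(page_paras):
            # Only the first paragraph of a page can be a continuation.
            continues = i == 0 and bool(paragraphs) and not para[0].isspace()
            if continues:
                paragraphs[-1] = paragraphs[-1].rstrip() + " " + para.lstrip()
            else:
                paragraphs.append(para)
    return paragraphs

# Made-up two-page sample.
pages = [
    "    First paragraph starts\nand continues",
    "on the next page.\n\n    Second paragraph.",
]
result = merge_across_pages(pages)
print(len(result), "paragraphs")
```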
Hi Isaac Ben Harush,
Thank you for the additional details. I've created a support ticket based on your request. Please await a reply from the Customer Support team.
Hi, it'd be great if FlexiLayout Studio had Git integration. It currently has a backup feature, but that only keeps a local backup. Perhaps the backup feature could have an option to use a Git repository.
I collaborate with colleagues, meaning FlexiLayouts don't just stay on one machine. We currently use OneDrive, which is slow to sync and can cause lost work, because syncing bugs sometimes overwrite new files with the old version.