Modern CPUs contain more than one processing unit (CPU core) so that they can execute more tasks in parallel. Modern operating systems can distribute running processes, tasks and threads efficiently across multiple CPU cores. Even “virtual” CPU cores can increase throughput.
What is Hyper-threading
Hyper-threading (HT) is a technology that lets one physical CPU core appear to the operating system as two logical cores. It improves the parallelization of computations (execution of multiple tasks at the same time) performed on PC microprocessors.
See more: Hyper-threading
- A CPU with only one physical CPU core executes an OCR or text analytics task. The PC task manager displays one CPU core with a high load.
- A CPU with one physical CPU core that supports hyper-threading executes the same task. The PC task manager displays 2 logical CPU cores, each with only 50% load.
Technically, an HT CPU can work more efficiently, especially when multiple applications are running (e.g. Windows, Outlook, Office applications and browsers). Most “normal” applications/processes do not need much CPU power, which is why a hyper-threading CPU can make switching between the different tasks more efficient.
High Load Processes and HT
When running optical character recognition (OCR), a very CPU-intensive process, hyper-threading can make a real difference in processing speed.
A single OCR process can use almost 100% of the capacity of a physical CPU core.
⇒ The second “logical” HT core therefore cannot deliver another 80-90%, as it would when running several less CPU-hungry applications in parallel.
Important: A simple (arithmetic) average can give the wrong impression.
When all the physically available processing power is used by one or more (OCR) processes, the task manager shows a load of almost 100%.
If you now double the number of cores by enabling hyper-threading, the same load appears as only 50%, because it is averaged over twice as many logical cores. This does not reflect reality.
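The averaging effect can be illustrated with a few lines of Python. This is only a sketch of the arithmetic: the per-core load values are made-up illustration numbers, and `average_load` is a hypothetical helper, not something the task manager or any SDK exposes.

```python
def average_load(per_core_loads):
    """Arithmetic mean of the per-core load values, in percent."""
    return sum(per_core_loads) / len(per_core_loads)


# Assumption: one physical core, fully occupied by a single OCR process.
without_ht = [100.0]

# Same hardware and same work with hyper-threading enabled: the operating
# system now reports two logical cores, but only one of them carries the
# OCR thread at any moment.
with_ht = [100.0, 0.0]

print(f"Reported load without HT: {average_load(without_ht):.0f}%")
print(f"Reported load with HT:    {average_load(with_ht):.0f}%")
```

The physical core is equally busy in both cases; only the number of entries in the average has changed.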
To use an example from real life: hyper-threading can be compared to putting a spoiler on a car. In certain driving situations it might improve the experience, but the engine itself has not gained any horsepower to go faster.
Influence of Hyper-threading
- Hyper-threading CPUs therefore influence the performance of computers running 'standard' applications like Outlook, browsers, Office, etc. Here the user can often experience almost double the performance.
- Hyper-threading CPUs do not have such a strong effect on speed for CPU-intensive tasks like OCR processing. Here the gain is at most 20-30% 1).
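As a rough worked example, the two expectations can be compared in Python. The baseline of 32 pages per minute is taken from the 4-process result in the FineReader Engine test further down. Treating that value as a "physical cores only" baseline is an assumption made for illustration, and `ht_upper_bound` is a hypothetical helper name.

```python
def ht_upper_bound(baseline_ppm, min_gain=0.20, max_gain=0.30):
    """Return the (low, high) upper-bound estimate in pages per minute
    for a CPU-intensive task, using the 20-30% figure quoted above."""
    return baseline_ppm * (1 + min_gain), baseline_ppm * (1 + max_gain)


# Assumption: 32 pages per minute with 4 OCR processes on 4 physical cores.
baseline = 32

naive = baseline * 2                 # what "twice as many cores" suggests
low, high = ht_upper_bound(baseline)

print(f"Naive doubling:     {naive} pages per minute")
print(f"With a 20-30% gain: {low:.1f} to {high:.1f} pages per minute")
```

Under these assumptions the estimate is 38.4 to 41.6 pages per minute instead of 64, which is why logical cores should not be counted like physical ones when sizing an OCR system.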
Hyper-threading in ABBYY SDKs
ABBYY FineReader Engine and FlexiCapture SDK come with code samples that show how to use multiple CPU cores.
FineReader Engine Processing Pool
A simple test was made with FineReader Engine 11 Release 5 on a laptop (2012): quad-core i7-3720QM, 2.6 GHz, Windows 7, 16 GB RAM, 64 bit 2).
|Threads/processes running in the background|1|2|3|4|5|6|7|8|
|Throughput, pages per minute|11|22|26|32|37|37|36|29|
Adding OCR processes increases the throughput as expected, up to a point.
The maximum page throughput is reached when the number of processes is “number of physical CPU cores + 1”.
Here: 4 physical cores + 1 additional process = 5.
If the computer contains more cores, you might see the throughput increase when starting even 2-3 more processes, but the final result also depends on the document size and the OCR scenario you perform.
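The measured values can be checked with a short Python sketch. It only re-evaluates the numbers from the table above; the names `throughput`, `best_process_count` and `speedup` are hypothetical and not part of FineReader Engine.

```python
# Pages per minute for 1..8 parallel OCR processes, copied from the table.
throughput = {1: 11, 2: 22, 3: 26, 4: 32, 5: 37, 6: 37, 7: 36, 8: 29}
PHYSICAL_CORES = 4  # quad-core i7-3720QM


def best_process_count(results):
    """Smallest process count that reaches the peak throughput."""
    peak = max(results.values())
    return min(n for n, ppm in results.items() if ppm == peak)


def speedup(results, n):
    """Throughput with n processes relative to a single process."""
    return results[n] / results[1]


best = best_process_count(throughput)
print(f"Peak is first reached with {best} processes")
for n in sorted(throughput):
    print(f"{n} processes: speedup {speedup(throughput, n):.2f}x")
```

For this data set the peak is first reached with 5 processes, which matches the "physical cores + 1" rule. The speedup over a single process stays below 4x even at the peak and drops again at 8 processes.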
Screenshots for 1,4,5 and 8 OCR Processes
- If you have FineReader Engine installed, you can easily reproduce the test with the sample 'FineReader Engines Pool - Multithreading Sample (Windows)'.
- The sample 'Multi-Processing Recognition - Code Sample (Windows)' lets you test multiple core scalability without the process-pool.