Converting a book, magazine or any other printed document to Word is not quite as simple as applying an OCR process to the document.

Scantronics employ high-quality tools with world-class recognition engines to produce the most accurate output possible from the documents provided to us. Despite this, because the originals are printed pages, some errors of formatting or text will creep in, depending on the print quality and age of the original.

These errors have to be eliminated by operator intervention, which can be time-consuming and ultimately costly.

As an indication, scanning a page of text to a searchable PDF starts from as little as two pence per page. By comparison, ‘cleaning up’ a document’s OCR’ed output ready for Word or other applications starts from eighty pence per page.
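To put the two starting rates side by side, the following is a minimal sketch of the arithmetic only. It assumes the "from" rates quoted above apply to every page; the 300-page book and the function name are hypothetical, and an actual quote may differ.

```python
# Illustrative only: uses the "from" rates quoted above, in pence per page.
SEARCHABLE_PDF_PENCE = 2   # scan to searchable PDF, starting rate
CLEANED_WORD_PENCE = 80    # hand-cleaned OCR output for Word, starting rate


def cost_in_pounds(pages: int, pence_per_page: int) -> float:
    """Return the minimum cost in pounds for a job of `pages` pages."""
    return pages * pence_per_page / 100


pages = 300  # hypothetical book length, chosen for illustration
pdf_cost = cost_in_pounds(pages, SEARCHABLE_PDF_PENCE)
word_cost = cost_in_pounds(pages, CLEANED_WORD_PENCE)
print(f"Searchable PDF: from £{pdf_cost:.2f}")
print(f"Cleaned for Word: from £{word_cost:.2f}")
```

On these starting rates, a 300-page book would cost from £6.00 as a searchable PDF and from £240.00 cleaned up for Word, a forty-fold difference.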

Scantronics hope that the rates we quote are some of the lowest in the industry, but any process that is labour-intensive will be reflected in the prices we charge for the service.

Once the output has been processed by hand, we can offer an accuracy level of 99.9% for text and layout, and can offer a 100% service at a premium to our normal service.

OCR’ing a book, magazine or any other document is only as good as the quality of the original. If there are marks or extraneous items on the page, the OCR engine can interpret these as characters or as page commands such as line breaks.

If the text on your page is faded or of poor quality, this will also affect the accuracy of the OCR engine.

In the main, we find after production that, if the aim of the OCR process is a searchable PDF, the quality of our scanning and processes will make 95% or more of the text recognisable to the search engine.

For text that is to be cut and pasted or replicated in a third-party application such as Word, a degree of ‘cleaning’ of both the format and the content of the text is required, and this can increase the cost of the scan output to our clients.

We normally recommend keeping the scanned image as a text-searchable PDF, as this is the most cost-effective path.