Scan and OCR
To create an accessible digital copy from a paper book, you scan or take a digital photograph of it, and then convert it into editable text. When you scan or take a photograph of the page you get an image of the page - you can't edit the text, or read it out with text-to-speech software, because it's just a bunch of black dots and lines on the page. The process of converting the scanned or photographed image into editable, readable text is called 'Optical Character Recognition' or OCR.
Scan the paper copy (preferably with a multi-sheet scanner that produces an 'image' PDF)
OCR to convert the PDF image file into editable text; check and correct any errors
Save as PDF if you want a digital copy that looks exactly like the paper book
Save to Word (DOC) if you want to:
- edit the font or size to make Adapted or Large Print;
- make a Braille copy;
- make a digital LIT or Daisy version;
- create a synthetic audio version;
- make a Clicker version - copy and paste from Word to Clicker;
- make a symbolised version - copy and paste from Word to Communicate: In Print or Boardmaker.
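The choice between the two save formats above can be written as a small lookup. This is only an illustrative sketch in Python: the function name `choose_save_format` and the goal labels are made up for this example and are not part of any OCR program.

```python
# Hypothetical helper: the goal labels and function name are illustrative only.
# Goals that need editable text, so the OCRed book should be saved to Word (DOC).
WORD_GOALS = {
    "adapted print",
    "large print",
    "braille",
    "lit",
    "daisy",
    "synthetic audio",
    "clicker",
    "symbolised",
}


def choose_save_format(goal: str) -> str:
    """Return 'PDF' or 'DOC' for a given goal, following the list above."""
    key = goal.strip().lower()
    if key == "exact copy":
        # A digital copy that looks exactly like the paper book.
        return "PDF"
    if key in WORD_GOALS:
        return "DOC"
    raise ValueError(f"Unknown goal: {goal!r}")


print(choose_save_format("Braille"))
print(choose_save_format("exact copy"))
```

If you need both an exact copy and an adapted version, save the OCRed book twice, once in each format.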
A flatbed scanner is fine for small quantities, but scanning a two-hundred-page novel one page at a time is very time-consuming, so if you plan to scan a lot of material consider buying a multi-sheet scanner. Alternatively, if you have access to a modern networked printer/photocopier, you may find it has a scanning facility: you break the spine of the book, separate the pages, stack them on the printer/photocopier/scanner and, instead of photocopying them, it scans the pages and produces an electronic file (usually PDF, sometimes DOC). If you don't have access to a multi-sheet scanner you can get the book scanned by a company specialising in document scanning. We have had good results from DDSR in Wishaw: you send them the books and they send you back a CD with the scanned PDF file in a few days, at an approximate cost of 6p per page.
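As a rough guide to what outsourcing might cost, the per-page price can be multiplied out. The sketch below assumes the approximate 6p per page quoted above and ignores postage or any minimum charge; the function name is hypothetical.

```python
def scanning_cost_pounds(pages: int, pence_per_page: int = 6) -> float:
    """Estimate the cost in pounds of having a book scanned by a scanning company.

    pence_per_page defaults to the approximate 6p figure quoted in the text;
    actual prices will vary.
    """
    if pages < 0:
        raise ValueError("pages must not be negative")
    return pages * pence_per_page / 100


# A two-hundred-page novel at about 6p per page.
print(f"£{scanning_cost_pounds(200):.2f}")  # prints £12.00
```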
An alternative to scanning is to take digital photographs of each page: most of the OCR programs can recognise text from camera images. Colour images from photos are higher resolution than those from scanners and so this works particularly well for books with lots of illustrations, like story books for young readers. The main issue with using a camera is positioning it and the book so that the images are consistent. See your OCR software manual or the TopOCR tutorial for advice on OCR from camera images.
OCR ('Optical Character Recognition')
Scanners are often supplied with OCR software, but it is usually pretty basic, and if you intend to scan a lot of books you are better off buying software that is designed for the task, such as FineReader, OmniPage or ReadIris. We like FineReader but OmniPage and ReadIris are also good. Which OCR program you get depends in part on what type of accessible copy you want to make (see the Using Books section for more on different formats) and how much money you have. A few observations:
|I want to..||CALL recommendations|
|..save myself time and hassle|| |
|..scan a book or document that's mainly text, with a simple page layout||Most OCR programs (see below) do a pretty good job of OCRing simple documents and books into Word or another text editor. Check and edit the text and then save it in your preferred format.|
|..scan a book or document with a fairly complex layout, with lots of images and text boxes|| |
|..have good control over what gets OCRed||Get one of the professional OCR programs like FineReader, OmniPage or ReadIris. These let you manually 'zone' the areas of the page you want to OCR and correct any recognition errors. They can also save the scanned document in lots of different formats (e.g. PDF, DOC, RTF, HTML etc). If you intend to scan a lot of materials we strongly recommend using FineReader.|
|..make a simple 'talking book' for a young reader||Use your digital camera to take photos of each page (don't use the flash), import them into PowerPoint, PhotoStory, SwitchIt! Maker or another 'slide show' style program, and record your narration of the story.|
|..spend as little money as possible||Use the OCR software that came with your scanner, or buy FineReader, OmniPage or ReadIris, then: |
Scanning & OCR software
Some of the many OCR programs are listed below. EasyConverter, FineReader, OmniPage, ReadIris and TopOCR are programs specifically for scanning and OCR: they do the best job of converting the image into different digital formats and they let you check and edit the text once it is OCRed. If you want to create accessible books in several different formats we recommend using these programs to create PDF books that look exactly like the original, and also editable Word files from which you can make Large Print in various sizes, Braille, synthetic audio or a variety of digital formats.
The main difference between EasyConverter and the other OCR programs is that EasyConverter has tools for converting your scanned and edited Word file easily into Large Print, Braille, MP3 and Daisy. EasyConverter can't save as PDF though.
Programs like ClaroRead and Read and Write Gold are primarily text-to-speech packages with OCR built in: the reader OCRs the page and then uses text-to-speech to access it.
|Scanning / OCR Programs||Comment||Approximate cost|
|Adobe Acrobat X Pro||With Acrobat Pro you can scan and create a PDF direct from a scanner. Accuracy is quite good and with the latest version, Acrobat X Pro, you can correct misrecognised words. It's not as quick or as flexible as the specialist OCR programs, but it's pretty good if you want to create PDFs.||£80 for Scottish schools, from LTS website|
|EasyConverter||EasyConverter OCRs from a scanner or from digital files (e.g. PDF) into Word and converts into: DOC, TXT, RTF, Large Print, Braille, audio (MP3 & Daisy).||from £890|
|FineReader 10 Pro||Professional OCR program for OCRing from scanner, camera and files. Accurate; recognition errors can be corrected. Saves as PDF, RTF, DOC, HTML etc. Free demo copy from ABBYY web site.||£60 single user from LTS.|
|Microsoft Office Document Scanning||Basic OCR supplied as part of MS Office; scans into Word.||supplied with MS Office|
|TopOCR||Free OCR software designed to OCR images from cameras.||Free|
|OmniPage 17||Professional OCR program for OCRing from scanner, camera and files. Accurate; recognition errors can be corrected. Saves as PDF, RTF, DOC, HTML etc.||£45 for Standard version, about £180 for Pro version|
|ReadIris 12||Professional OCR program for OCRing from scanner, camera and files. Accurate; recognition errors can be corrected. Saves as PDF, RTF, DOC, HTML etc. Free demo copy from Iris web site.||about £100, single user|
|ClaroRead Plus||Incorporates OmniPage OCR to scan books and OCR files into Word. ClaroRead has text to speech and other tools to support reading and writing.||£159 single user|
|Kurzweil 3000||Scan and OCR from books and files. Kurzweil saves as text or in its own KES format, which looks like the original page (i.e. like PDF), but you need the Kurzweil Reader (£185) to open the KES files.||£725 single user|
|Read and Write Gold 9||Incorporates FineReader OCR to scan books and OCR files into Word/PDF/HTML. Read and Write Gold has text to speech and other tools to support reading and writing.||£320 single user|
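Since the choice of program depends partly on how much money you have, the approximate costs in the table can be filtered by budget. The Python sketch below is only an illustration: the prices are the lowest approximate figures listed above (some are education or single-user prices), Microsoft Office Document Scanning is left out because it has no separate price, and the function name is hypothetical.

```python
# Approximate lowest listed prices in pounds, copied from the table above.
# Treat them as rough guides: some are education or single-user prices.
APPROX_COST_GBP = {
    "Adobe Acrobat X Pro": 80,   # price for Scottish schools
    "EasyConverter": 890,        # listed as "from £890"
    "FineReader 10 Pro": 60,
    "TopOCR": 0,
    "OmniPage 17": 45,           # Standard version
    "ReadIris 12": 100,
    "ClaroRead Plus": 159,
    "Kurzweil 3000": 725,
    "Read and Write Gold 9": 320,
}


def programs_within_budget(budget_gbp: float) -> list[str]:
    """Return the program names costing no more than the budget, cheapest first."""
    affordable = [
        (cost, name)
        for name, cost in APPROX_COST_GBP.items()
        if cost <= budget_gbp
    ]
    return [name for cost, name in sorted(affordable)]


print(programs_within_budget(100))
```

Price is only one factor: check the Comment column to confirm that a program saves in the formats you need.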
Quick Guides on scanning
- Scanning into Word with FineReader 10
- FineReader 10 Guide
- Scanning into Adobe Acrobat 9
- Better ways to scan than using a page at a time flatbed scanner