Automated Book Reader for People with Visual Impairments and Blindness

Published: Jun 27, 2007

Malek Adjouadi of the Center for Advanced Technology and Education, Electrical and Computer Engineering, Florida International University, Miami, presented research on an assistive technology solution named Automated Book Reader for people with visual impairments and blindness at the 10th International Conference on Computers Helping People with Special Needs (ICCHP), held July 12-14 at the University of Linz, Austria.

The study focused on developing an assistive technology interface that reads a digitized document aloud for people with a visual disability or blindness. The equipment includes two cameras: a lateral camera and another placed above the book or document. A book support holds the book for capture, and a background of paper or cloth in a solid color such as black lets the lateral camera extract the book's curvature.

The support box acts as a reference frame for the book, and the cameras capture its lateral and top images. The starting and ending points of the two images, along with the corresponding points in each image, are noted for processing. Edge detection using the Laplacian of Gaussian operator on the lateral image helps assess page curvature. Image acquisition also involves barrel distortion correction, perspective transformation, and character extension for text projection to compensate for curvature and interpolation.
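As a rough illustration of the edge detection step, the sketch below builds a Laplacian of Gaussian kernel, applies it to a small synthetic image, and marks the zero crossings of the response as edges. It is a minimal sketch, not the team's implementation: the kernel size, sigma, zero-crossing threshold and the function names `log_kernel`, `convolve` and `zero_crossings` are all assumptions made for the example.

```python
import math

def log_kernel(size=5, sigma=1.0):
    """Build a size x size Laplacian of Gaussian kernel (size must be odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            s = (x * x + y * y) / (2.0 * sigma ** 2)
            # LoG(x, y) = -(1 / (pi * sigma^4)) * (1 - s) * exp(-s)
            row.append(-(1.0 / (math.pi * sigma ** 4)) * (1.0 - s) * math.exp(-s))
        kernel.append(row)
    # Shift the kernel so it sums to zero; flat regions then give a near-zero response.
    mean = sum(sum(row) for row in kernel) / (size * size)
    return [[v - mean for v in row] for row in kernel]

def convolve(image, kernel):
    """Apply the (symmetric) kernel to the image, replicating border pixels."""
    h, w = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(-half, half + 1):
                for kx in range(-half, half + 1):
                    yy = min(max(y + ky, 0), h - 1)
                    xx = min(max(x + kx, 0), w - 1)
                    acc += image[yy][xx] * kernel[ky + half][kx + half]
            out[y][x] = acc
    return out

def zero_crossings(response, eps=1e-6):
    """Mark pixels whose response changes sign against the right or lower neighbour."""
    h, w = len(response), len(response[0])
    edges = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0)):
                yy, xx = y + dy, x + dx
                if yy < h and xx < w:
                    a, b = response[y][x], response[yy][xx]
                    # Ignore sign flips caused by rounding noise in flat regions.
                    if a * b < 0 and min(abs(a), abs(b)) > eps:
                        edges[y][x] = True
    return edges

# Synthetic lateral view: dark background (0) on the left, bright page (1) on the right.
image = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
response = convolve(image, log_kernel(size=5, sigma=1.0))
edges = zero_crossings(response)
edge_columns = sorted({x for row in edges for x, hit in enumerate(row) if hit})
print(edge_columns)  # prints [3]
```

On a real lateral image, the edge position found in each column would trace the profile of the page against the dark background, which is the curve the later corrections need.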

Once image acquisition is complete, the next step is character extraction and recognition. A connected components algorithm is used to find characters in the image. As the algorithm scans each pixel, pixels with a value above a predetermined limit are assigned a label. Connected components are then formed from classes of pixels through an iterative process. Next, characters, words and lines of text are located within the image. Finally, disjointed characters such as the letters ‘i’ and ‘j’, whose body and dot would otherwise be recognized with separate boundaries, are joined.
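The character extraction step can be sketched as follows. This example labels pixels above a threshold with a breadth-first flood fill, which is a simpler stand-in for the iterative label merging the article describes, and then joins a dot to the stem below it by comparing bounding boxes. The function names, the 8-connectivity and the `max_gap` parameter are assumptions made for illustration.

```python
from collections import deque

def label_components(image, threshold):
    """Label 8-connected groups of pixels whose value exceeds the threshold."""
    h, w = len(image), len(image[0])
    labels = [[0] * w for _ in range(h)]
    boxes = {}  # label -> [min_x, min_y, max_x, max_y]
    next_label = 0
    for y in range(h):
        for x in range(w):
            if image[y][x] <= threshold or labels[y][x]:
                continue
            next_label += 1
            labels[y][x] = next_label
            box = [x, y, x, y]
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                box = [min(box[0], cx), min(box[1], cy),
                       max(box[2], cx), max(box[3], cy)]
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and not labels[ny][nx]
                                and image[ny][nx] > threshold):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
            boxes[next_label] = box
    return labels, boxes

def merge_dots(boxes, max_gap=2):
    """Join a box to one just above it when the two overlap horizontally."""
    merged = []
    for box in sorted(boxes.values(), key=lambda b: b[1]):  # top to bottom
        for m in merged:
            overlaps = box[0] <= m[2] and m[0] <= box[2]
            gap = box[1] - m[3] - 1  # blank rows between the two boxes
            if overlaps and 0 <= gap <= max_gap:
                m[0], m[2] = min(m[0], box[0]), max(m[2], box[2])
                m[3] = max(m[3], box[3])
                break
        else:
            merged.append(list(box))
    return merged

# A tiny page: an 'i' (dot plus stem) in column 1 and an 'l' in column 4.
page = [
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
]
labels, boxes = label_components(page, threshold=0)
characters = merge_dots(boxes)
print(len(boxes), characters)  # prints 3 [[1, 0, 1, 5], [4, 0, 4, 5]]
```

The dot and stem of the 'i' start out as two components and end up in one bounding box, while the neighbouring 'l' is left alone because its box does not overlap them horizontally.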

The final step in the development of this assistive technology solution is character identification using a multilayer feed-forward neural network trained with the back-propagation algorithm. The network identifies the characters in a cross-validation set, a subset of selected uppercase and lowercase letters. The weights acquired during the neural network’s training are saved and used later to identify unknown characters.
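To make the training and reuse of weights concrete, here is a minimal sketch of a feed-forward network with one hidden layer trained by back-propagation on three 3x3 toy glyphs. The glyphs, layer sizes, learning rate, epoch count, JSON storage and the `TinyNet` class are all hypothetical; the article does not describe the team's actual network architecture or training data.

```python
import json
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyNet:
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # One row of weights per neuron; the last entry of each row is the bias.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                      for _ in range(n_out)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0])))
                  for row in self.w_hidden]
        out = [sigmoid(sum(w * v for w, v in zip(row, hidden + [1.0])))
               for row in self.w_out]
        return hidden, out

    def train_step(self, x, target, lr=0.5):
        hidden, out = self.forward(x)
        # Output deltas: squared-error gradient passed through the sigmoid.
        d_out = [(o - t) * o * (1 - o) for o, t in zip(out, target)]
        # Hidden deltas: output deltas propagated back through the output weights.
        d_hidden = [h * (1 - h) * sum(d * self.w_out[k][j] for k, d in enumerate(d_out))
                    for j, h in enumerate(hidden)]
        for k, d in enumerate(d_out):
            for j, v in enumerate(hidden + [1.0]):
                self.w_out[k][j] -= lr * d * v
        for j, d in enumerate(d_hidden):
            for i, v in enumerate(x + [1.0]):
                self.w_hidden[j][i] -= lr * d * v

    def loss(self, samples):
        total = 0.0
        for x, target in samples:
            _, out = self.forward(x)
            total += sum((o - t) ** 2 for o, t in zip(out, target))
        return total / len(samples)

    def predict(self, x):
        _, out = self.forward(x)
        return out.index(max(out))

# Hypothetical 3x3 glyphs, flattened row by row (1 = ink).
GLYPHS = {
    "I": [0, 1, 0, 0, 1, 0, 0, 1, 0],
    "L": [1, 0, 0, 1, 0, 0, 1, 1, 1],
    "T": [1, 1, 1, 0, 1, 0, 0, 1, 0],
}
names = sorted(GLYPHS)
samples = [([float(v) for v in GLYPHS[n]],
            [1.0 if i == k else 0.0 for i in range(len(names))])
           for k, n in enumerate(names)]

net = TinyNet(n_in=9, n_hidden=6, n_out=3, seed=0)
loss_before = net.loss(samples)
for _ in range(3000):
    for x, target in samples:
        net.train_step(x, target)
loss_after = net.loss(samples)

# Save the trained weights, then load them into a fresh network for recognition.
saved = json.dumps({"hidden": net.w_hidden, "out": net.w_out})
restored = TinyNet(n_in=9, n_hidden=6, n_out=3, seed=1)
weights = json.loads(saved)
restored.w_hidden, restored.w_out = weights["hidden"], weights["out"]
recognized = [names[restored.predict(x)] for x, _ in samples]
print(loss_before > loss_after, recognized)
```

Separating training from recognition in this way means the costly back-propagation pass runs once, and later character identification only needs the forward pass with the stored weights.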

Different techniques to create a better base product for OCR

The team from Florida International University analyzed the results from all their test cases. The first test case used the training set as input, while the second used a digitized, flattened page torn from a book. The third used an image from a digital camera, which was found to require additional image processing to correct curvature distortion. The fourth applied perspective transformation correction, and the fifth applied barrel distortion correction. The final test case applied the spell check from Microsoft Word.
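The two geometric corrections named in the fourth and fifth test cases can be sketched with standard textbook models. The example below assumes a one-parameter radial model for barrel distortion and a 3x3 homography for the perspective transformation; the coefficient `k`, the matrix values and the function names are hypothetical, since the article does not give the team's actual formulas or calibration values.

```python
def undistort_point(x, y, cx, cy, k):
    """Move a point radially away from the centre (cx, cy) by the factor 1 + k * r^2.

    A positive k pushes points outward, which counteracts barrel distortion
    under this simple one-parameter model.
    """
    dx, dy = x - cx, y - cy
    scale = 1.0 + k * (dx * dx + dy * dy)
    return cx + dx * scale, cy + dy * scale

def apply_homography(H, x, y):
    """Map (x, y) through a 3x3 perspective matrix H given as nested lists."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return u, v

# The image centre stays fixed; a point 10 pixels to its right moves outward.
center = undistort_point(50, 50, cx=50, cy=50, k=1e-4)
shifted = undistort_point(60, 50, cx=50, cy=50, k=1e-4)

# Example matrix chosen only to show the division by the third coordinate.
H = [[2.0, 0.0, 1.0],
     [0.0, 2.0, 1.0],
     [0.0, 0.0, 2.0]]
mapped = apply_homography(H, 1.0, 1.0)
print(center, mapped)  # prints (50.0, 50.0) (1.5, 1.5)
```

In practice each output pixel would be mapped back into the source image and interpolated, rather than pushing individual points forward as this sketch does.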

The book reader system offers an inexpensive and effective read-aloud assistive technology product, one whose accuracy has been improved by mathematical developments in the key areas of page curvature, perspective transformation, barrel distortion, character recognition and spell check.

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.
