As it turns out, building a simple digit recognition program (a basic form of OCR) is rather easy. As a project for CMPT 310 (Artificial Intelligence), a friend of mine and I jointly wrote the MATLAB code for it. With the training done so far, it achieves ~90% accuracy.
The method used is called a neural network. By training the network on thousands and thousands of images over a few days, the network finds out by itself which pixels in an image are important features for distinguishing the digits and puts more weight on those pixels.
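To give a rough feel for what "putting more weight on pixels" means, here is a minimal sketch of a single-layer scoring step. It is not the network from the download: the weight matrix W, the bias b, the single-layer layout, and the hand-made weights are all assumptions for illustration. A real network learns its weights from training images.

```matlab
% Illustration only: hypothetical weights, not the trained ones from the download.
img = zeros(28, 28);            % blank 28x28 binary image
img(5:24, 14) = 1;              % draw a crude vertical stroke (20 pixels)

W = zeros(10, 784);             % one row per digit 0..9, one column per pixel
strokeMask = zeros(28, 28);
strokeMask(5:24, 14) = 1;
W(2, :) = strokeMask(:)';       % row 2 stands for digit 1: reward the stroke pixels
b = zeros(10, 1);               % hypothetical bias, left at zero here

scores = W * img(:) + b;        % weighted sum of the pixels for each digit
[maxScore, idx] = max(scores);  % pick the digit with the highest score
digit = idx - 1;                % rows are 1-based, digits are 0-based
```

Here only the row for digit 1 has non-zero weights, so the vertical stroke scores highest for that digit. Training amounts to adjusting W automatically so that the right row wins for each training image.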
Training script / Recognition script
The network is trained with a separate script that is not included in the download. The trained "weightings" are included in the downloadable script, which can be used to recognize digits.
How to use the script
Once you have extracted all the files in the zip file, load digitrecognition.mat. (If you're using a version of MATLAB that predates version 7, you will get a file corruption error. You should load digitrecognitionv6.mat instead.)
1. Type recognizeDigits(images(:,:,1)) to recognize the first image as an example. In general, you can call recognizeDigits(images(:,:,i)), where i is an integer from 1 to 10.
2. Typing labels(1) shows what image #1 is supposed to be (as recognized by a human).
With the sample images included, machine recognition of images 1-9 is correct. (Note that image 1 does not correspond to digit 1; the 10 sample images contain random digits.)
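If you would rather check all the samples in one go than type each call by hand, a small helper like the one below can do the counting. This is only a sketch: countCorrect is a hypothetical name that is not part of the download, and it assumes the recognizer returns the digit as a number that can be compared directly with the matching entry of labels.

```matlab
function numCorrect = countCorrect(recognizer, images, labels)
% COUNTCORRECT  Count how many images a recognizer labels correctly.
%   recognizer : function handle, e.g. @recognizeDigits
%   images     : H x W x N array of images
%   labels     : N expected digits
% Assumption: recognizer returns a numeric digit.
numImages = size(images, 3);
numCorrect = 0;
for i = 1:numImages
    predicted = recognizer(images(:,:,i));
    if isequal(double(predicted), double(labels(i)))
        numCorrect = numCorrect + 1;
    end
end
end
```

Save it as countCorrect.m, load the .mat file as described above, and call countCorrect(@recognizeDigits, images, labels).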
If you have images of handwritten digits, you can use the script to automatically recognize them as well. The images must first be loaded into MATLAB, and they must be binary (black and white only) and 28 x 28 pixels in size.
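One possible way to get an arbitrary image into that shape is sketched below. prepareDigit is a hypothetical helper, not something shipped in the zip, and the 0.5 threshold is an arbitrary choice. It uses imresize, which needs the Image Processing Toolbox.

```matlab
function bw = prepareDigit(raw)
% PREPAREDIGIT  Convert an image array to a 28x28 binary image.
%   raw : grayscale (H x W) or RGB (H x W x 3) array, e.g. from imread
if size(raw, 3) == 3
    raw = rgb2gray(raw);            % collapse colour to a single channel
end
gray = im2double(raw);              % scale intensities to the range [0, 1]
small = imresize(gray, [28 28]);    % resample to 28x28 pixels
bw = small > 0.5;                   % fixed threshold gives a logical array
end
```

A typical call would be raw = imread('mydigit.png'); followed by recognizeDigits(prepareDigit(raw)), where mydigit.png is a made-up file name. Two things to check against the sample images: whether the digits are white on black or the reverse (invert with ~bw if needed), and whether the script wants a numeric array rather than a logical one (wrap it in double(bw) if so).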
To view an image, type figure; imshow(images(:,:,i)); where i is a number from 1 to 10 (the index into the image array).
Pretty nice eh? Drop me a line and tell me what you think about it.