OCRopus is an OCR system. I wanted to see how well it handles handwriting, so I tried it out by installing it on Ubuntu 8.04. To get started, I used Synaptic to install the following required software:
I also installed one of the optional packages, libaspell-dev. Beyond that, I installed build-essential for the compilers needed to build from source.
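Before starting the build, it can help to confirm that the command-line tools used in the steps below are on the PATH. This is only a minimal sketch: missing_tools is a helper name I made up for illustration, and the list of tools is taken from the commands in this post (gcc, g++, and make come from build-essential).

```shell
#!/bin/sh
# Hypothetical helper: print each given command that is not on PATH.
# Returns non-zero if at least one command is missing.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return $rc
}

# Example call with the tools used in this walkthrough:
#   missing_tools gcc g++ make svn patch jam
```

If the call prints nothing, every tool was found; otherwise each printed name still needs to be installed.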
Next, I checked out tesseract-ocr from Google:
svn co http://tesseract-ocr.googlecode.com/svn/trunk/ tesseract-ocr
Before building, I applied a patch to the makefile:
patch java/makefile < java
Here, java is the patch file and java/makefile is the makefile in the tesseract-ocr/java directory. After applying the patch, I continued with the build:
./configure
make
sudo make install
Now that I had all the required software, I was ready to install OCRopus:
svn co http://ocropus.googlecode.com/svn/trunk/
cd trunk
./configure
jam
sudo jam install
At this point, the basic ocropus installation is done.
One thing I noticed after the initial install: I needed to create the /usr/local/ocroscript directory and then create the following two soft links inside it:
ocroscript -> ../bin/ocroscript
scripts -> ../share/ocropus/script
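The directory and the two links listed above can be created with mkdir and ln. The following is a rough sketch rather than an exact record of what I typed: link_ocroscript is a made-up function name, the prefix argument is my addition so the commands can be tried in a scratch directory first, and the -n option assumes GNU ln as shipped with Ubuntu.

```shell
#!/bin/sh
# Hypothetical helper: create <prefix>/ocroscript and the two relative
# soft links inside it. The prefix defaults to /usr/local, as in the post.
link_ocroscript() {
    prefix=${1:-/usr/local}
    mkdir -p "$prefix/ocroscript" || return 1
    # -s: symbolic link, -f: replace an existing link,
    # -n: treat an existing link to a directory as a plain file
    ln -sfn ../bin/ocroscript "$prefix/ocroscript/ocroscript" || return 1
    ln -sfn ../share/ocropus/script "$prefix/ocroscript/scripts" || return 1
}

# Writing under /usr/local normally needs root, so run this from a root shell:
#   link_ocroscript /usr/local
```

Because the link targets are relative, they resolve against the ocroscript directory itself, so the same commands work under any prefix.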
To test the software, I used the sample image that came with it:
/usr/local/bin/ocrocmd /data/pages/alice_1.png |less
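To try more than one image, the same command can be wrapped in a loop. This is just a sketch under a couple of assumptions: ocr_dir is a name I invented, the .out file naming is arbitrary, and it relies on ocrocmd writing its result to standard output, as the pipe to less above suggests.

```shell
#!/bin/sh
# Hypothetical helper: run an OCR command on every PNG in a directory and
# save whatever it prints next to the image as <name>.out.
# The command defaults to the ocrocmd path used above.
ocr_dir() {
    dir=$1
    cmd=${2:-/usr/local/bin/ocrocmd}
    for img in "$dir"/*.png; do
        [ -e "$img" ] || continue   # the glob matched no files
        "$cmd" "$img" > "${img%.png}.out" || echo "failed: $img" >&2
    done
}

# Example:
#   ocr_dir ~/scans
```

Passing the command as a second argument makes it easy to swap in a different OCR program for comparison.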
The default test case above worked for me. Next, I took out my camera and took a picture of my handwriting. I uploaded the image and ran it through the OCR software. I was disappointed to find that ocropus couldn’t recognize my handwriting very well. Is there something that can do better?