Thursday, November 06, 2008

Evince in Ubuntu Does OCR

Wow! I did it without thinking, then suddenly realized I had selected and copied text from a scanned PDF document I was viewing and pasted it into an email message!

The document viewer is Evince. You'd never know that's its name without looking, but it's the program you run if you open a PDF file on Ubuntu. It does OCR of text on the fly if you copy and paste it. I had no idea!

I wonder if the Adobe Acrobat viewer does the same thing? I'll have to check.

As usual, the OCR wasn't perfect. It interpreted a decimal point as a comma, or maybe it was just converting to European…