By far the most visited post on this blog is from 2010. It covers OCRing (Optical Character Recognition) a PDF in GNU/Linux and contains a small shell script that others have improved several times. After buying a new flatbed scanner, I re-investigated how to scan and OCR PDFs, how to produce DjVu files that are incredibly small, and how to get the metadata right. It turns out that what I had really wanted all along was to create PDF/A-compliant documents (I just didn't know what PDF/A was before). But let me explain the details after presenting the quick solution. At the end, I have a shell script that scans directly to PDF/A.
Below, you can see a preview of the Unix History diagram:
This is a simplified diagram of Unix history. There are numerous derivative systems not listed in this chart, maybe 10 times more! In the recent past, many electronics companies had their own Unix releases. This diagram is only the tip of an iceberg, with a penguin on it ;-).
evolt.org Browser Archive