You get ADFs with special purpose document scanners for over 500€, or combined with a multifunction printer/scanner/fax/copier starting at 90€. My price point for home was closer to 100€ than 500€, including the other multifunction features. For office, a separate document scanner is better.
Multifunction device makers advertise ADFs by tray size. Tray size is a nice number to print and compare. Bigger is better, but you shouldn't give it too much weight. If size is the only information you get while comparing products, then go with the largest. But I suggest searching the internet for user experiences. An ADF is useless if it jams frequently, no matter how large a tray it has. Unfortunately, there is no good data on ADF quality.
Hope this helps a bit: I have now scanned approximately 1000 pages with the Lexmark X9575 ADF, with 1 to 50 docs in the tray. [Dec. 15. 2008 - Jan. 4. 2009]
A tip: don't scan with your primary machine. Scanning is a perfect background job. Right now I am writing this with a laptop while the scanner chews out a 50 document load to a desktop box. OCR and PDF generation can take longer than scanning on older machines.
A tip: You can't always use the ADF, for example when scanning receipts, photos, or anything small. In these cases the scanner needs to be near the computer, and you have to open and close the scanner lid frequently. Put the scanner in a place where it does not bother you or your loved ones AND where you can have a computer nearby and operate the scanner. With WLAN in the multifunction device, your scanning computer can easily be a laptop.
2. How to be sure your documents will be readable forever?
You want to avoid a personal digital dark age, sure. I don't know how paranoid you are, but I am willing to trust two formats: JPEG images and PDF/A documents. JPEG is ok. The documents I scan need to be readable but not reproducible in fine detail. However, I like to print scans as original-looking images, not as OCR-generated Word documents. With PDF you can get both image and text, which is all you need.
Choosing the best PDF format is not simple. You want a PDF that has the scanned image on top of OCR-generated searchable text. One way to do this is with PDF/A, a format specifically designed for archiving purposes. Basically PDF/A is more conservative than a normal PDF with respect to size optimizations. You get pictures and fonts embedded inside the document. I have no idea if this will be the format of personal archives, but that is a problem for another day, one that hopefully will never come.
Bad news: PDF/A support is not included with the default software, at least not with the Lexmark X9575. With the bundled ABBYY FineReader Sprint Plus, the Lexmark can produce PDF 1.3 with text below image. That may very well be good enough for you. I had issues with this "scan to PDF" feature with over 30 documents in the tray. FineReader crashed with "Out of memory", with a 1.5 GB process size! This is a sound reason to switch to 64-bit / 8GB+ machines.
ABBYY FineReader Pro costs 159€, handles PDF/A, and produces smaller documents than Sprint. Here's a feature matrix. A caveat: FineReader Pro needs TWAIN or WIA drivers and does not work over Lexmark WiFi.
A tip [Jan. 4. 2009]: If you have an ADF, scanning is more about software than hardware. You should evaluate alternatives beyond the scanner's default software. I couldn't stick with the Lexmark default apps for anything other than simple one-document tasks. Scanning multiple documents directly to PDFs is not simple. On the other hand, my previous Canon MP150 scanner had a decent MP Navigator application for image scanning. It was a lot better than the Lexmark Productivity Suite, which cannot even handle tasks with multiple images. With Canon I only needed Picasa to complement it for document renaming. With Lexmark I'm currently testing Hamrick VueScan for image scanning/naming tasks.
3. How to search your scanned documents
Answer: With a good file naming convention and indexing based on PDF text.
I have used the following file naming convention:
yyyy.MM.dd. Sender. Description.
Example: 2007.06.30. Microsoft. MCPD Certificate.jpg
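The convention above is easy to generate from its parts. Here is a minimal Python sketch, not a tool I actually use; the function name `scan_filename` and its parameters are made up for illustration.

```python
from datetime import date


def scan_filename(day: date, sender: str, description: str, ext: str = "pdf") -> str:
    """Build a 'yyyy.MM.dd. Sender. Description.ext' file name from its parts."""
    stamp = day.strftime("%Y.%m.%d")  # zero-padded, so names sort chronologically
    return f"{stamp}. {sender.strip()}. {description.strip()}.{ext.lstrip('.')}"


# Reproduces the example name from the text.
print(scan_filename(date(2007, 6, 30), "Microsoft", "MCPD Certificate", "jpg"))
```

The zero-padded year-first date is what makes a plain alphabetical sort in the file browser come out chronological.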
I need to find this document later, and I might use Windows Search and try something like ".net certification". The result would be a miss if the search were looking for complete words. With thousands of documents you will have trouble finding what you want. That is why it is useful to have chronologically sortable names. In my opinion, backed by cognitive psychology, it is natural for humans to search linearly in time. You can certainly remember that the cert exam was in 2007, perhaps in late spring, and start looking in your document folder. And it makes naming folders easier too. You just name your document folder by year:
If you store other people's documents, you have a separate root folder for each. Documents belonging to multiple persons should be copied to each one's folder.
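The folder layout described above (one root folder per person, one subfolder per year) can be derived from the file name alone. A rough Python sketch under my own assumptions: the names `target_folder`, `scans`, `mika`, and the `undated` fallback folder are hypothetical, not part of my actual setup.

```python
import re
from pathlib import PurePosixPath

# The file name is assumed to start with a four-digit year followed by a dot.
YEAR_PREFIX = re.compile(r"^(\d{4})\.")


def target_folder(root: str, owner: str, filename: str) -> PurePosixPath:
    """Return root/owner/year for a file named 'yyyy.MM.dd. Sender. Description.ext'."""
    match = YEAR_PREFIX.match(filename)
    # Assumption: files without a leading year go to a catch-all folder.
    year = match.group(1) if match else "undated"
    return PurePosixPath(root) / owner / year


print(target_folder("scans", "mika", "2007.06.30. Microsoft. MCPD Certificate.jpg"))
```

A document that belongs to several people would simply be passed through this once per owner and copied to each resulting folder.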
But hey, this is data management from the 80's. Today, you want to search the contents of your scanned documents, and OCR with PDF content indexing should do this. In the example case, I remember a funny little detail. The cert was printed with a Bill Gates signature.
See, maybe Bill was busy sending certs around the world at the time. So if I search for "Gates", the document should be found. With good OCR even the "Bill Gates" signature should be indexed without any manual effort.
Help needed: What is the best way to rename a PDF while you can see its contents, without locking the PDF file? I have tried Presto! PageManager and Nuance PaperPort, and they kinda suck. Thumbnails are too small. Picasa is great for naming JPGs. I want a similar tool for PDFs.
A tip: If the doc doesn't show a date, guess it and mark the unknown parts, like 2006.12.XX or 1980.XX.YY.
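If you ever want to process these names with a script, the placeholder dates need special handling. Below is a minimal Python sketch of a parser that accepts both real and guessed dates. The names `parse_scan_name` and `ScanName` are invented for this example, and the pattern assumes the sender contains no period (so "Acme Inc." would not parse).

```python
import re
from typing import NamedTuple, Optional

# Month and day are either two digits or a two-letter placeholder such as XX or YY.
NAME_PATTERN = re.compile(
    r"^(?P<year>\d{4})\.(?P<month>\d{2}|[A-Z]{2})\.(?P<day>\d{2}|[A-Z]{2})\. "
    r"(?P<sender>[^.]+)\. (?P<description>.+)\.(?P<ext>[A-Za-z0-9]+)$"
)


class ScanName(NamedTuple):
    year: int
    month: Optional[int]  # None when the month was a placeholder
    day: Optional[int]    # None when the day was a placeholder
    sender: str
    description: str


def parse_scan_name(filename: str) -> Optional[ScanName]:
    """Split a file name into its parts; return None if it does not follow the convention."""
    m = NAME_PATTERN.match(filename)
    if m is None:
        return None

    def number_or_none(text: str) -> Optional[int]:
        return int(text) if text.isdigit() else None

    return ScanName(
        int(m["year"]),
        number_or_none(m["month"]),
        number_or_none(m["day"]),
        m["sender"],
        m["description"],
    )


# Hypothetical file names, one with a guessed day and one off-convention.
print(parse_scan_name("2006.12.XX. Lexmark. Warranty card.pdf"))
print(parse_scan_name("scan0001.pdf"))
```

Running this over a folder is a quick way to spot files that were never renamed after scanning.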
A tip: Scan each document to a single PDF but combine them later to logical multi-page PDFs.
4. How to dispose of your scanned originals
Quick answer: you shred them and recycle. Keep only the most important.
I have a small shredder. It is useless. Waste of time and energy for anything except must-destroy docs like the ones you take home from the office.
Instead of shredding, I use a cardboard box where I throw everything to be destroyed and recycled later. You must have a place for centralized, secure document disposal. For example, sneak the documents into your workplace and use their disposal facilities.
Final words
I have been scanning documents for three years now. First with a Canon Pixma MP150 and now with the Lexmark X9575. It takes time and you should value your time properly. Don't go into it yet if you're not willing to fight technical issues. In a few years you will probably get your documents picked up from your house anyway, with scans delivered, secured and backed up somehow.
So why bother? Because of the thrill, man. No matter what the business cycle is today, we as a people are finally getting rid of paper, for good. And only by understanding the details involved do you prepare yourself for the future.
So, how about the Lexmark X9575? Here are my conclusions:
What I wanted
- Reliable, fast ADF
- No-hassle automatic scan-to-pdf. Result pdfs should be "image with text below" and wait in a folder to be renamed manually later.
- Reliable software with automation features
- Silent operation, or wlan to put the machine away from me
What I got
- Good enough ADF. My scan machine is old enough to be the speed bottleneck, not the scanner. Actually, the ADF is reliable enough that it could be bigger, like 100 docs.
- Hassle with scanning A4 size docs with the default software. Letter/Legal scanning should be ok. No points for Lexmark i18n. Shame on them.
- Buggy default software. But this is what you get for less than 500€. You just have to learn to deal with it. Scan-to-PDF automation didn't work with a full ADF load. Losing 50 scans sucks.
- Silent operation AND wlan, good
-mika-
P.S. I intend to add more tips for the scanning process as I get more experience with the current tools.