Equipment
- Windows XP box with 2GB RAM, AMD Athlon XP 2700+
- HP Scanjet 3670
- Plustek OpticBook 3600
Software
- HP Scanjet software (bundled)
- Adobe Acrobat Professional 7.0
- Adobe Photoshop CS 8.0
- Abbyy FineReader 8.0 Professional
- OpticBook 3600 driver
Procedure
HP Scanjet (obsolete)
- All Programs...Hewlett-Packard...Scanjet Scanner 36X0 Series...Photo & Imaging Director
- Choose "Scan Document"
- Select - Scan for editable text (OCR)? Yes
- Select - Original contains graphics? No
- Scan to: "Save to file"
- Click "Scan"
- Rotate as appropriate with the buttons on the left-hand side
- Click "Accept"
- Click "No" when asked to scan another image
- Save file as <page numbers>.pdf (e.g., 366-367.pdf)
- Open the PDF file in Acrobat
- Tools...Advanced Editing...TouchUp Object Tool
- Left-click on the page to select it, then right-click and choose "Edit Image" (the image opens in Photoshop)
- Ignore the warning about flattening the image (check "Don't show again" and click "OK")
- In Photoshop: File...Save for Web...
- Choose "JPEG Low" preset
- Adjust image size to 75%, click "Apply"
- Click "Save"
- Save file as <page numbers>.jpg (e.g., 366-367.jpg)
wiki upload
- Upload jpg file to wiki
- Description: Reports of Committee on Foreign Relations 1789-1901 Volume 6 pp<xxx>-<xxx>
- Navigate to the page in the wiki and put in the stub code below
- For the first page, set previous=Main Page; for the last page, set next=Main Page
{{Double Page|previous=<xxx>-<xxx>|current=<xxx>-<xxx>|next=<xxx>-<xxx>}}
- Click on the "Template:<xxx>-<xxx>" link and paste in the text copied from the PDF
- Spell-check and copy edit the text
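The stub code for each page can be generated rather than typed by hand. The following is only a sketch, assuming every scan is an even-odd pair of pages (as in the 366-367 example) and that the range 362-1169 has no gaps. The function names spread_labels and double_page_stub are made up for illustration.

```python
MAIN_PAGE = "Main Page"


def spread_labels(first_page, last_page):
    """Return labels such as '366-367', one per two-page scan.

    Assumes first_page is the left-hand page of the first scan and
    last_page is the right-hand page of the last scan.
    """
    return ["%d-%d" % (p, p + 1) for p in range(first_page, last_page, 2)]


def double_page_stub(labels, index):
    """Build the Double Page stub for the scan at position `index`.

    The first scan links back to Main Page and the last scan links
    forward to Main Page, as described in the procedure.
    """
    previous = labels[index - 1] if index > 0 else MAIN_PAGE
    following = labels[index + 1] if index < len(labels) - 1 else MAIN_PAGE
    return "{{Double Page|previous=%s|current=%s|next=%s}}" % (
        previous,
        labels[index],
        following,
    )


labels = spread_labels(362, 1169)
print(double_page_stub(labels, 0))
print(double_page_stub(labels, 2))
print(double_page_stub(labels, len(labels) - 1))
```

Each printed line can be pasted into the matching wiki page as-is. The same labels also give the expected file names (for example 366-367.pdf and 366-367.jpg).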
Notes
- Scanning in at 300ppi, 256 gray shades (8-bit grayscale)
- I'm not uploading the raw PDF files, since they're about 5 times as large as the jpgs
- Scanning pages 362-1169 (808 pages, or 404 two-page scans)
- Each two-page scan takes roughly 5 minutes to scan, convert, upload and add text
- 2020 minutes total required
- about 33.7 hours required
- approximately 1 hour/day available
- about 40 days total
- 14 pages already scanned (but not proofed)
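The time estimate can be rechecked with a few lines of arithmetic. This is a rough sketch, assuming both end pages are counted (so 362-1169 is 808 pages, or 404 two-page scans), 5 minutes per scan, and exactly 1 hour of work per day. The constant names are chosen here for illustration.

```python
import math

FIRST_PAGE = 362
LAST_PAGE = 1169
MINUTES_PER_SCAN = 5   # one scan covers two facing pages
HOURS_PER_DAY = 1      # assumed exact; the notes say "approximately"

pages = LAST_PAGE - FIRST_PAGE + 1          # count both end pages
scans = pages // 2                          # two pages per scan
total_minutes = scans * MINUTES_PER_SCAN
total_hours = total_minutes / 60
days = math.ceil(total_hours / HOURS_PER_DAY)

print("pages:", pages)
print("scans:", scans)
print("minutes:", total_minutes)
print("hours: %.2f" % total_hours)
print("days:", days)
```

At exactly one hour per day this comes to about 34 days, a little under the 40 days quoted above.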