Difference between revisions of "TheMorganReport:Community Portal"

From TheMorganReport
Revision as of 12:11, 11 December 2005

Equipment

  • Windows XP box with 2GB RAM, AMD Athlon XP 2700+
  • Plustek OpticBook 3600

Software

  • Adobe Acrobat Professional 7.0
  • Adobe Photoshop CS 8.0
  • Abbyy FineReader 8.0 Professional
  • OpticBook 3600 driver

Procedure

scan pages with the OpticBook 3600

300 dpi, grayscale

batch rename pages

Rename each scan from "Image xxxx.jpg" to "xxxx-xxxx.jpg" so the file name reflects the page numbers it contains
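
The renaming rule can be sketched in Python. This is only a sketch under stated assumptions: that the scanner numbers its files sequentially starting at "Image 0001.jpg", that every scan holds two facing pages, that the first scan starts at page 362 (the first page listed in the Notes), and that page numbers are written without zero padding. The names `spread_name` and `rename_plan` are illustrative and not part of the original workflow.

```python
import re

def spread_name(scan_name, first_page=362, first_index=1):
    """Map a scanner file name such as 'Image 0003.jpg' to a page-range name.

    Assumes scan number `first_index` holds pages first_page and first_page + 1,
    and that each later scan advances by two pages.
    """
    match = re.fullmatch(r"Image (\d+)\.jpg", scan_name)
    if match is None:
        raise ValueError(f"unexpected scan name: {scan_name!r}")
    index = int(match.group(1))
    left = first_page + 2 * (index - first_index)
    return f"{left}-{left + 1}.jpg"

def rename_plan(scan_names, **kwargs):
    """Return (old_name, new_name) pairs without touching any files."""
    return [(name, spread_name(name, **kwargs)) for name in scan_names]
```

Building the plan first and reviewing it before calling `os.rename` on each pair avoids mislabeling a whole batch if one scan was skipped or repeated.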

batch resize pages

Use Photoshop automation with Save for Web, the JPEG Low preset, and the image scaled to 38%
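
To see what the 38% scale means for the uploaded images, the pixel dimensions and effective resolution can be worked out. A small sketch: the 300 ppi and 38% figures come from this page, while the function names and the 2550 x 3300 pixel example are made up for illustration.

```python
def scaled_size(width_px, height_px, scale=0.38):
    """Return (width, height) in pixels after uniform scaling, rounded to whole pixels."""
    return round(width_px * scale), round(height_px * scale)

def effective_ppi(scan_ppi=300, scale=0.38):
    """Resolution of the scaled image relative to the original paper size."""
    return scan_ppi * scale

# Hypothetical scan of 2550 x 3300 pixels (8.5 x 11 inches at 300 ppi).
width, height = scaled_size(2550, 3300)
```

At 38% of a 300 ppi scan, the web images end up at roughly 114 ppi.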

wiki upload

Batch upload the jpg files to the wiki

    • Description: Reports of Committee on Foreign Relations 1789-1901 Volume 6 pp<xxx>-<xxx>

wiki stubs

  • Navigate to the page in the wiki and insert the stub code
    • For the first page, set previous=Main Page; for the last page, set next=Main Page
{{Double Page|previous=<xxx>-<xxx>|current=<xxx>-<xxx>|next=<xxx>-<xxx>}}
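
Filling in the stub by hand for every spread is repetitive, so the stub text could be generated. The following is a minimal sketch, assuming each wiki page covers two consecutive book pages labeled "left-right" (for example 366-367) and that the page title equals that label; `double_page_stubs` is a hypothetical helper, not an existing tool.

```python
def double_page_stubs(first_page, last_page):
    """Return (page_label, stub_text) pairs for each two-page spread.

    The first spread links back to Main Page and the last one links
    forward to Main Page, as described in the procedure.
    """
    labels = [f"{p}-{p + 1}" for p in range(first_page, last_page, 2)]
    stubs = []
    for i, current in enumerate(labels):
        previous = labels[i - 1] if i > 0 else "Main Page"
        following = labels[i + 1] if i < len(labels) - 1 else "Main Page"
        stub = "{{Double Page|previous=%s|current=%s|next=%s}}" % (
            previous, current, following)
        stubs.append((current, stub))
    return stubs

# Print the stubs for a short range of pages.
for label, stub in double_page_stubs(362, 367):
    print(label, stub)
```

The function only produces text; pasting or uploading it into the wiki is still a separate step.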

batch OCR full resolution pages

Create a PDF with FineReader

wiki upload text

  • Click on the "Template:<xxx>-<xxx>" link and paste in the text copied from the PDF
  • Spell-check and copy edit the text

Notes

  • Scanning at 300 ppi in 256 gray shades (8-bit grayscale)
  • I'm not uploading the raw PDF files, since they're about 5 times as large as the jpgs
  • Scanning pages 362-1169 (808 pages, counting both endpoints)
  • Each 2-page scan takes roughly 5 minutes to scan, convert, upload and add text
    • about 2020 minutes total required
    • about 33.7 hours required
    • approximately 1 hour/day available
    • about 40 days total
  • 14 pages already scanned (but not proofed)
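
The time estimate above can be reproduced with a few lines of arithmetic. This sketch assumes two pages per scan and counts both the first and last page of the range, which gives 808 pages (404 scans); the function name and the dictionary layout are illustrative only.

```python
import math

def scan_estimate(first_page=362, last_page=1169,
                  minutes_per_scan=5, hours_per_day=1):
    """Estimate the total effort for scanning a page range two pages at a time."""
    pages = last_page - first_page + 1      # inclusive page count
    scans = math.ceil(pages / 2)            # two facing pages per scan
    minutes = scans * minutes_per_scan
    hours = minutes / 60
    days = math.ceil(hours / hours_per_day)
    return {"pages": pages, "scans": scans,
            "minutes": minutes, "hours": hours, "days": days}

estimate = scan_estimate()
print(estimate)
```

With these inputs the result is about 2020 minutes, or a little under 34 hours, so the "about 40 days" figure at one hour per day leaves some slack for proofing.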

Instructions for Editors