Digitization Process

Digital Imaging
Written by Bob Lyner, Digital Preservation, LLC


Bob Lyner and Tom Wobbe examine one of the Whipple maps.

Digital Preservation, L.L.C., Chesterfield, Missouri, founded in 1999, was selected to digitize the approximately 3500 Whipple Fire Insurance Maps. Digital Preservation, specializing in special collections and brittle paper digitization, had prior experience with fire insurance maps for the Kansas City Public Library and had developed a book cradle to handle the oversized volumes.

The digitization setup used a Linhof 4” x 5” Technika camera, a 135 mm Planar lens, and a Phase One PowerPhase FX scan back. The 300 Megapixel scan back allowed each page to be captured at actual size and resolved fine detail, down to the paper fibers.

Each page was captured as a 400 dpi RGB TIFF image file and written to an Apple G4, copied from there to an external hard drive, and finally transferred to an Apple G5 computer for postproduction. Postproduction included generating a derivative TIFF file with sharpening, contrast, and color balance adjustments to match the original page. The two files now representing each page, the raw unmodified TIFF and the adjusted derivative TIFF, were then uploaded directly to the Libraries' server. Each page, captured at actual size, ranged from 250 to 385 MB, and with two file versions of each page the Library had to dedicate 1.3 TB of server space!
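
As a rough check on these file sizes, the uncompressed size of an RGB image follows directly from its pixel dimensions. The sketch below assumes 8 bits per channel and a hypothetical 22 x 25 inch page; neither figure comes from the project records, and the function name is illustrative only.

```python
def uncompressed_image_bytes(width_in, height_in, dpi=400, channels=3, bits_per_channel=8):
    """Estimate the uncompressed pixel-data size of a scan, in bytes.

    Assumes 8 bits per channel unless told otherwise (an assumption,
    not a documented project setting).
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * channels * bits_per_channel // 8


# Hypothetical page size in inches, chosen only for illustration.
size_bytes = uncompressed_image_bytes(22, 25)
print(f"{size_bytes / 1_000_000:.0f} MB")  # prints "264 MB"
```

Under these assumptions a page of that size lands inside the 250 to 385 MB range quoted above; larger pages or double-page spreads would sit toward the upper end.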

The exposure time for each page or double-page spread varied from 2:20 to 3:45. On-site digitization and postproduction of the combined Whipple collections' digital image files took two months to complete.




Creating Metadata
Written by Tim Lepczyk, Washington University Metadata Librarian

Metadata played a central role in transforming these historic bound volumes into digital objects, and the work involved moving between standards to meet the needs of displaying the information across various platforms and applications. All of the records started out as MARC records, which were used in the online catalogs at Washington University and the Missouri History Museum. The records were exported from those systems and transformed into MARC XML using a program called MarcEdit. Once information has been encoded in XML (Extensible Markup Language), it can be managed and transformed using XSL (Extensible Stylesheet Language).

For the purposes of Digital Library Services, we decided to encode and display the volumes using the TEI (Text Encoding Initiative) guidelines. To accomplish this, a metadata crosswalk from MARC to TEI had to be created. Once the various elements were mapped in a spreadsheet, I created an XSL script to transform the MARC records into TEI.
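
To illustrate the idea of a crosswalk, here is a minimal sketch in Python rather than XSL (a swapped-in technique, not the script used on the project). The three crosswalk rows, the sample record, and the function name are all hypothetical; the actual mapping in the project spreadsheet is not reproduced here, and the output is only a fragment of a TEI header, not a complete valid one.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical crosswalk rows: (MARC tag, subfield code, TEI section, TEI element).
CROSSWALK = [
    ("245", "a", "titleStmt", "title"),
    ("260", "b", "publicationStmt", "publisher"),
    ("260", "c", "publicationStmt", "date"),
]

# Made-up sample record with placeholder values.
SAMPLE_MARCXML = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="245" ind1="0" ind2="0">
    <subfield code="a">Sample map volume title</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="b">Sample publisher</subfield>
    <subfield code="c">Sample date</subfield>
  </datafield>
</record>"""


def marc_to_tei_header(marcxml):
    """Build a partial teiHeader by copying mapped MARC subfields."""
    record = ET.fromstring(marcxml)
    header = ET.Element(f"{{{TEI_NS}}}teiHeader")
    file_desc = ET.SubElement(header, f"{{{TEI_NS}}}fileDesc")
    sections = {}
    for tag, code, section, element in CROSSWALK:
        xpath = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
        for subfield in record.findall(xpath, MARC_NS):
            # Create each TEI section only once, on first use.
            if section not in sections:
                sections[section] = ET.SubElement(file_desc, f"{{{TEI_NS}}}{section}")
            target = ET.SubElement(sections[section], f"{{{TEI_NS}}}{element}")
            target.text = (subfield.text or "").strip()
    return header


header = marc_to_tei_header(SAMPLE_MARCXML)
print(ET.tostring(header, encoding="unicode"))
```

Keeping the mapping in a table separate from the transformation logic mirrors the spreadsheet-first approach described above: the same rows can be reviewed by a cataloger and then applied mechanically.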

The TEI records worked for our requirements in Digital Library Services, but they would not prove useful for our partners at the Missouri History Museum or the Earth and Planetary Sciences Library. Or would they?

As an end product the TEI records would not be sufficient; however, as a staging ground for further data transformations they would suit our purposes perfectly. Using the TEI records, I mapped the metadata again, this time going from TEI to Dublin Core for the Missouri History Museum to host the records on the Missouri Digital Heritage site.

Also, the same TEI records were mapped to the Federal Geographic Data Committee’s Content Standard for Digital Geospatial Metadata (CSDGM), since the other component of this project was georeferencing a subset of the images using GIS. Once the images had been georeferenced by Scott Horn, the corresponding shapefiles were combined with the metadata records, and the resulting information was hosted using ArcGIS Server.

While all digital projects involve managing metadata, this one was unique in the number of standards and formats which were used to both describe and display the information.




GIS Process in the Whipple Project
Written by Scott Horn, Washington University GIS Analyst


Screen shot of a georeferenced Whipple map.

In an effort to make the scanned maps more interactive and searchable, the Whipple Project makes use of Geographic Information System (GIS) technology. The GIS was used to enable two processes: georeferencing and spatial searching. The primary software used throughout the process was ESRI’s ArcGIS Desktop. In addition, FME Workbench and Adobe Photoshop were used to manipulate the images. The database work was done in Microsoft Excel, with provisions put in place to make the data transferable to more robust database management software if the project is scaled up in the future.

Georeferencing
Georeferencing is the process of assigning an image a location in physical space. Essentially, points are selected on the scanned maps and each point is given a location, either interactively by selecting a corresponding point in a previously determined virtual space, or by entering coordinates from a known coordinate system.
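
The point-matching idea can be sketched with a simple affine transformation fitted to exactly three control points. This is an illustration only: the project does not state which transformation was applied in ArcGIS, and the function names and coordinates below are hypothetical.

```python
def solve_affine(pixel_pts, map_pts):
    """Fit X = a*x + b*y + c and Y = d*x + e*y + f from three control points.

    pixel_pts and map_pts are lists of three (x, y) tuples. Uses Cramer's
    rule, so the three pixel points must not lie on one straight line.
    """
    (x1, y1), (x2, y2), (x3, y3) = pixel_pts
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if det == 0:
        raise ValueError("control points are collinear")

    def solve(v1, v2, v3):
        # Solve for one output coordinate (all X values or all Y values).
        a = (v1 * (y2 - y3) - y1 * (v2 - v3) + (v2 * y3 - v3 * y2)) / det
        b = (x1 * (v2 - v3) - v1 * (x2 - x3) + (x2 * v3 - x3 * v2)) / det
        c = (x1 * (y2 * v3 - y3 * v2) - y1 * (x2 * v3 - x3 * v2)
             + v1 * (x2 * y3 - x3 * y2)) / det
        return a, b, c

    xs = solve(*(p[0] for p in map_pts))
    ys = solve(*(p[1] for p in map_pts))
    return xs, ys


def apply_affine(transform, pixel_pt):
    """Map one pixel coordinate to map coordinates."""
    (a, b, c), (d, e, f) = transform
    x, y = pixel_pt
    return a * x + b * y + c, d * x + e * y + f


# Hypothetical control points: pixel positions on the scan and the
# matching positions in a projected coordinate system.
pixels = [(0, 0), (100, 0), (0, 100)]
targets = [(1000, 2000), (1050, 2000), (1000, 1950)]
transform = solve_affine(pixels, targets)
print(apply_affine(transform, (50, 50)))  # (1025.0, 1975.0)
```

With more than three control points the system is overdetermined, and GIS software typically fits the transformation by least squares and reports the residual error at each point.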

There were several steps taken to georeference the images. The first step was to compress the images using FME Workbench batch processing, converting them from TIFF to JPEG format. This step reduced the file size by about 95%, and the images still retained excellent resolution for use on the Whipple project Web site and in the GIS.

In the next step, Photoshop was used to crop all non-map elements and borders from the maps. In the final step, the images were loaded into ArcGIS Desktop and manually georeferenced by matching points common to the Whipple maps and previously georeferenced images. Despite careful matching, there is some error inherent in the georeferencing process, especially when working with historical, hand-drawn maps. This error results in some edges not lining up exactly and in overlap in some areas.

Once georeferenced, the images can be overlaid with other geographic information such as current roads or blocks in any common GIS program. The data contained in the maps can also be digitized into point, line, and polygon vector files for even more manipulation and analysis in a GIS.

Finding Aid
The map volumes digitized for this project reflect the geography of St. Louis as it was over 100 years ago. Since that time, the areas covered have undergone significant changes and would be unrecognizable to most viewers. It is also very difficult to know what spatial area each volume covers without searching the volumes individually. To mitigate this problem, a visual/spatial finding aid was created.

The finding aid works in two ways. First, it shows what the city blocks looked like at the time the maps were printed. Second, it allows users to click on a specific city block and retrieve the page numbers and volumes that display that block. The finding aid can also contain a link to the image if it is served online.

Using an interactive GIS for the finding aid instead of static image maps creates several other options as well. City blocks can be displayed by any attribute they contain, meaning users can filter the blocks shown by specifying a volume or year. The maps can also be overlaid with modern geography to compare the city of Whipple’s era with the city of today. The GIS finding aid also improves scalability: as other historical maps of the area are preserved in the future, they can be integrated easily. All of these options will allow users to easily browse the Whipple map volumes.

The finding aid was created using Microsoft Excel and ArcGIS Desktop and Server technology. A shapefile reflecting the geography of historical St. Louis was created using ArcGIS Desktop. This shapefile contains a corresponding attribute table of block numbers for each city block. This city block table can be joined to another table containing information about the images using a database join in Microsoft Excel. Once the tables are joined, the file can be made available online through an interactive map application hosted on the ArcGIS Server. Users can then explore the data contained in the tables and search the maps using a variety of criteria as described above.
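
The table join behind the finding aid can be sketched as a one-to-many lookup keyed on block number. The field names (block_no, volume, page) and the sample rows are hypothetical and stand in for the project's actual attribute and image tables.

```python
from collections import defaultdict

# Hypothetical attribute table from the city-block shapefile.
blocks = [{"block_no": 101}, {"block_no": 102}, {"block_no": 103}]

# Hypothetical image table: one row per map page showing a block.
images = [
    {"block_no": 101, "volume": 1, "page": 12},
    {"block_no": 101, "volume": 3, "page": 40},
    {"block_no": 102, "volume": 1, "page": 13},
]


def join_blocks_to_images(blocks, images):
    """Return {block_no: [image rows]} for every block, even with no pages."""
    by_block = defaultdict(list)
    for row in images:
        by_block[row["block_no"]].append(row)
    return {b["block_no"]: by_block.get(b["block_no"], []) for b in blocks}


finding_aid = join_blocks_to_images(blocks, images)
for row in finding_aid[101]:
    print(f"Block 101: volume {row['volume']}, page {row['page']}")
```

Clicking a block in the interactive map corresponds to looking up its block number in the joined table; a block with no matching pages simply returns an empty list.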

