WU Digital Project Metadata
Why have a minimum set of metadata elements?
- to urge people considering a digital project to think about metadata while planning
- to improve description of items and enhance reliable retrieval (searching)
- to enable compatibility with other collections (cross-database searching)
- to make maintenance and possible migration of data easier
The scope of your project, its intended use, and its audience will influence your choice of metadata standard, fields (in addition to the required fields), thesaurus for subjects, authority file for names, etc.
WU Metadata was developed by the Digital Library Team over a period of six months and was published online in February 2007.
Note about the examples provided below: the format for the display and application of these elements is not prescribed, but we strongly recommend following an established standard consistently. Which format you use may matter less than always using the same format for data in the same field.
A name or code for each resource which is unique within your project; "unknown" is not an option. Tips: Follow a consistent naming scheme and document that scheme; filenames should reflect the identifiers.
Example: gre_RBch2p75 112-3 etc. For more discussion of this concept, search the web for "digital naming scheme" or "persistent identifier".
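The identifier tips above can be checked mechanically. The following is a minimal sketch in Python, not a prescribed tool: the function name, the sample identifiers, and the rule that a filename "reflects" its identifier when the filename stem starts with it are all assumptions made for illustration.

```python
from collections import Counter
from pathlib import PurePath


def check_identifiers(records):
    """Return a list of problems found in (identifier, filename) pairs."""
    problems = []
    # Count how often each identifier occurs so duplicates can be reported.
    counts = Counter(identifier for identifier, _ in records)
    for identifier, filename in records:
        if not identifier.strip() or identifier.strip().lower() == "unknown":
            problems.append(f"{filename}: identifier is missing or 'unknown'")
        elif counts[identifier] > 1:
            problems.append(f"{filename}: identifier {identifier!r} is not unique")
        # Assumption: a filename reflects its identifier when the filename
        # stem (the name without its extension) starts with the identifier.
        elif not PurePath(filename).stem.startswith(identifier):
            problems.append(f"{filename}: filename does not reflect {identifier!r}")
    return problems


# Hypothetical records, invented for this example.
sample = [
    ("gre_0001", "gre_0001.tif"),
    ("gre_0002", "gre_0002_front.tif"),
    ("gre_0002", "gre_0002_back.tif"),
    ("unknown", "scan17.tif"),
    ("gre_0003", "img003.tif"),
]
for problem in check_identifiers(sample):
    print(problem)
```

A project with a different naming scheme would replace the filename rule with its own documented convention.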
The name of the work that has been digitized. A main title must be specified. If the original work is untitled, a title should be derived. (For example, a manuscript fragment may be given a derived title, after its first line: [Ashes of Roses].) The main title may include additional information (usually bracketed), if such information is useful in distinguishing the work from others.
Example: Faerie Queen.
The creators or contributors of the work that has been digitized. The work may be attributed to "anonymous" or "unknown" when appropriate. Use of an authority list for names, such as Library of Congress Name Authority Headings or the index of a standard reference book in your subject, is encouraged.
The physical form of the object being digitized, sometimes also called the "carrier" of intellectual content. Not to be confused with terms naming the type of intellectual content (e.g. text, sound, notated music, moving image). May include information on dimensions or extent.
Examples: book, manuscript, CD, photograph, painting, map, film/video.
The storage and/or transmission data type of the resource being described. We recommend recording it as the file extension (e.g. .jpg, .mdb, .xml, .wav). The Internet Assigned Numbers Authority list of MIME media types may be a useful reference.
Examples: .jpg files, Access databases, XML documents, .wav audio files.
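As an illustration of recording the file extension alongside a MIME media type, here is a small Python sketch using the standard library's `mimetypes` module. The function name and the dictionary keys are hypothetical, and the media type returned for a given extension depends on the type tables available on the local system, so the lookup should be treated as a convenience rather than an authoritative registry.

```python
import mimetypes
from pathlib import PurePath


def describe_format(filename):
    """Return the lowercase file extension and a guessed MIME media type."""
    extension = PurePath(filename).suffix.lower()
    # guess_type returns (type, encoding); type is None when the extension
    # is not in the tables known to this Python installation.
    media_type, _encoding = mimetypes.guess_type(filename)
    return {"extension": extension, "media_type": media_type}


# Hypothetical filenames, invented for this example.
for name in ("portrait.jpg", "interview.wav", "catalog.xml"):
    print(name, describe_format(name))
```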
Every work must have at least one date to represent when the intellectual content was created, published, or distributed; "unknown" is an option for extreme circumstances.
A novel published in 2002 about 14th century Italy -- Date is 2002.
A painting created in 1898 depicting an event in 1545 -- Date is 1898.
Multiple dates are optional. Most metadata schemas provide instructions for associating multiple dates with a work, such as dates for publication, distribution, and/or creation. See the recommended format for DATE below.
Every work must have a date that records when its digital representation was created. The date the item was ingested into a content management system (for example, the date a work was deposited into DSpace, DLXS, or LUNA) may also satisfy the requirement.
When a work is born-digital, the digital date records the day the file was created or the day the file was last updated/modified, whichever best represents the work in its current form. In some cases, the DATE and DIGITAL DATE for born-digital works may be the same date.
A physical photograph taken in 1957 that was scanned in 2006, producing a preservation TIFF file -- Digital Date is 2006.
A born-digital MS Word file that was first created in 2004, but last updated in 2007 -- Digital Date is 2007.
The recommended format for recording DATE and DIGITAL DATE is the internationally accepted ISO 8601 date format: year, “YYYY” (e.g. 1997); year and month, “YYYY-MM” (e.g. 1997-07); complete date, “YYYY-MM-DD” (e.g. 1997-07-16).
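The three recommended date forms can be validated before data entry. The sketch below is one possible approach in Python; the function name is hypothetical, and it deliberately rejects everything except the three forms, so a project that allows "unknown" for DATE would need to handle that value separately.

```python
import re
from datetime import date

# Matches YYYY, YYYY-MM, or YYYY-MM-DD and nothing else.
_DATE_PATTERN = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


def is_recommended_date(value):
    """Return True if value is a valid YYYY, YYYY-MM, or YYYY-MM-DD date."""
    match = _DATE_PATTERN.fullmatch(value)
    if match is None:
        return False
    year, month, day = match.groups()
    try:
        # Missing month or day defaults to 1 so the calendar check still runs.
        date(int(year), int(month or 1), int(day or 1))
    except ValueError:
        return False
    return True


for candidate in ("1997", "1997-07", "1997-07-16", "07/16/1997", "1997-02-30"):
    print(candidate, is_recommended_date(candidate))
```

Checking against the calendar, rather than the pattern alone, catches entries such as 1997-02-30 that look well formed but are not real dates.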
Describes the owner of the intellectual work, who may be different from the owner of the physical item. "Unknown" is an option.
A less formal, more flexible space for a project to declare whatever information is pertinent regarding copyright, public domain, access, restrictions on the work, etc. A description explaining whether the digital material is available to the public, available to WU only, or restricted is required.
“Copyright is owned by Washington University and is available for use by the public. Preferred citation is Washington University Film and Media Archive, Henry Hampton Collection.”
“The image is password protected and can only be used by students enrolled in history class.”
“The image can be used in classrooms and other educational institutions free of charge under the fair use doctrine. All other uses are governed by copyright laws and have certain restrictions. Permission for use is required from the copyright owner. Please contact the film archive for more information.”
The above list is the minimal set of metadata elements for digital projects at Washington University. Most projects will consider other elements, especially these three:
Subject terms that eliminate the ambiguity that arises from synonyms, variant spellings, etc. Controlled vocabularies already exist for many subject areas, or you could create your own specialized list of terms. This is especially useful if more than one person will be inputting metadata. You may also want to include form/genre tags and others.
Digital Responsibility Statement
The person(s) responsible for the digital production of a text, edition, recording, or series, where the specialized element for author/creator(s) etc. does not suffice.
Description / Notes
Additional description and/or notes regarding the object.
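Returning to the required elements, a project could check each record for completeness before ingest. The Python sketch below is only an illustration: the field names are hypothetical labels chosen for this example (this page does not prescribe machine-readable names), and the only "unknown" rule it enforces is the one stated explicitly for identifiers.

```python
# Hypothetical field names for the required elements described above.
REQUIRED_FIELDS = (
    "identifier",
    "title",
    "creator",
    "physical_format",
    "file_format",
    "date",
    "digital_date",
    "owner",
    "rights_statement",
)

# Fields for which the value "unknown" is explicitly not an option.
NO_UNKNOWN = {"identifier"}


def find_problems(record):
    """Return a list of problems with the required fields of one record."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = str(record.get(field) or "").strip()
        if not value:
            problems.append(f"{field}: missing")
        elif field in NO_UNKNOWN and value.lower() == "unknown":
            problems.append(f"{field}: 'unknown' is not allowed")
    return problems


# Hypothetical record, invented for this example.
record = {
    "identifier": "unknown",
    "creator": "anonymous",
    "physical_format": "photograph",
    "file_format": ".tif",
    "date": "1957",
    "digital_date": "2006",
    "owner": "unknown",
    "rights_statement": "Available for use by the public.",
}
for problem in find_problems(record):
    print(problem)
```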
Every project is different! Please contact Digital Library Services with any questions.
WU Metadata was developed by the 2006 Metadata subcommittee members: Erin Davis, Amanda Gailey, Ruth Lewis, David Rowntree, Mark Scharff, Cassandra Stokes.
2008 members included: Nadia Ghasedi, Tim Lepczyk, Mark Scharff, Shannon Showers, Perry Trolard.