- Number of slides: 53
Introduction to Metadata: Dublin Core Best Practice for the Digital Commonwealth • By Leigh A. Grinstead • Digital Commonwealth Massachusetts Collections Online Danvers, MA April 26, 2011 • Copyright © 2011 LYRASIS • Headquarters: 1438 West Peachtree Street NW, Suite 200 Atlanta, GA 30309
Introductions • Who am I? • What institution do I work for? • What do I hope to accomplish during this session today? • Any others?
Training Services • Digital project planning and management • Digitization of cultural heritage collections (images, text, and audio) • Metadata for digital collections • Digital collections management systems • Digital preservation • Preservation classes and educational materials • Free publications about protecting and conserving library archival collections • Online preservation resources • Consulting • Disaster preparedness and response • Assistance with grant writing for NEH Preservation Assistance Grants • Environmental Monitoring Equipment Loan Service To learn more about Digital and Preservation, please email robin. [email protected]. org.
Assessment • Collection Development • Organizational Review • Digitization • Disaster Preparedness and Recovery • Electronic Resources Management • Facilities Planning • Focus Groups • Website Usability • Meeting Facilitation • Preservation • Resource Sharing • Staff Development • Strategic Planning • Technology Planning • Workflow Analysis • Grant Writing To get started with LYRASIS Consulting, please email cal. [email protected]. org.
Get access to a wide array of resources with member discounts. • eContent, Databases & Digital Media • Reference & Discovery • ILL & Resource Sharing • Collection Development & Management • Cataloging • Library Supplies & Services For a list of special discounts and promotions, visit www.lyrasis.org/newfeaturedoffers
Agenda – What is Metadata? – Metadata Challenges & Opportunities – What is Good Metadata? – Metadata Schema – Metadata Standards/Best Practices – Interoperability—OAI and harvesting – Dublin Core Elements and Digital Commonwealth Best Practices
Metadata Challenges • Many collections are non-textual • Many collections are not described at item level • How do you find individual digital objects without describing them? • Variety of formats/standards/practices to provide access to library, museum, archive collections
Metadata Opportunities • Increase intellectual access to special collections with limited physical access • Describe collections at a more granular level • Preserve original materials through reduced handling
Audience Needs • Who is your current audience? • Who is your audience in a digital environment? • Do your metadata practices meet defined audience needs? • Will your metadata make sense in a shared environment?
What is Metadata? • Definition – “data about data” – Information about the content, context and structure of information resources. – Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use or manage an information resource. You already have metadata • Library catalog records • Museum registration records • Archival finding aids
Good Metadata Should… • Be appropriate to the materials in the collection • Be appropriate to the users of the collection – intended, current and likely users • Support interoperability • Use controlled vocabularies to reflect the what, where, when and who of the content. From NISO, A Framework of Guidance for Building Good Digital Collections
Metadata Schema • MARC 21 – MARC is the acronym for MAchine-Readable Cataloging. It defines a data format that emerged from a Library of Congress-led initiative that began more than thirty years ago. It provides the mechanism by which computers exchange, use, and interpret bibliographic information, and its data elements make up the foundation of most library catalogs used today. MARC became USMARC in the 1980s and MARC 21 in the late 1990s.
What Is Dublin Core? • Developed by Online Computer Library Center and National Center for Supercomputing Applications (OCLC/NCSA) in 1995 as a response to rapid growth of resources on the Internet. – NCSA developed the Mosaic web browser and was looking for ways to improve searching for Mosaic users. • Involved librarians, computer scientists, publishers, online content developers.
Dublin Core cont. • National/International Standards – ANSI/NISO Z39.85 (2001) The standard defines fifteen metadata elements for resource description in a cross-disciplinary information environment. – ISO 15836:2003 (2003) is applicable to the Dublin Core metadata element set which deals with cross-domain information resource description. For Dublin Core applications, a resource will typically be an electronic document. – ISO 15836:2003 is for the element set only, which is generally used in the context of a specific project or application. Local or community based requirements and policies may impose additional restrictions, rules, and interpretations. It is not the purpose of ISO 15836:2003 to define the detailed criteria by which the element set will be used with specific projects and applications.
Metadata Schema cont. • Dublin Core – VRA Core Categories (VRA) VRA Core is related to Dublin Core and was adapted to provide appropriate metadata about images of original artworks and cultural objects. VRA Core provides additional elements such as medium, physical dimensions, etc. – Categories for the Description of Works of Art (CDWA) – Darwin Core was developed to describe natural science collections.
Metadata Schema cont. • Encoded Archival Description (EAD) – EAD stands for Encoded Archival Description, and is a non-proprietary de facto standard for the encoding of finding aids for use in a networked (online) environment. – Finding aids are inventories, indexes, or guides that are created by archival and manuscript repositories to provide information about specific collections. – Although finding aids may vary somewhat in style, their common purpose is to provide detailed description of the content and intellectual organization of collections of archival materials. – EAD allows the standardization of collection information in finding aids within and across repositories.
Additional Schema • METS • MODS • PREMIS
Workflow • Think of creating an outline of workflow • Can you have more than one schema in place at your organization? • How can MARC and Dublin Core work together to simplify the process? Photo in the Framingham State University collection
Metadata Standards/Best Practices
Metadata Interoperability • Allowing different computer systems to share information across a network • Sharing and searching for resources across networks & metadata formats through shared protocols and metadata crosswalks Interoperability is an essential aspect of metadata creation. Without it, your data won’t be interchangeable with data from other museums or libraries.
Crosswalks • Processes and procedures that translate one metadata format into another • Success depends on the similarity of formats and consistency of content standards used [Figure: MARC Record to Dublin Core]
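A crosswalk can be sketched as a lookup table from source fields to target elements. The minimal Python sketch below assumes a MARC record that has already been parsed into a dictionary keyed by tag and subfield. The record layout, the function name, and the sample values are hypothetical, and the field pairings are illustrative choices rather than an official mapping. A production crosswalk should follow a published one, such as the Library of Congress MARC to Dublin Core crosswalk.

```python
# Hypothetical, simplified crosswalk: (MARC tag, subfield) -> Dublin Core element.
# These pairings are illustrative assumptions, not an official mapping.
MARC_TO_DC = {
    ("245", "a"): "title",
    ("100", "a"): "creator",
    ("650", "a"): "subject",
    ("520", "a"): "description",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
}


def crosswalk(marc_record):
    """Translate a parsed MARC record into a dict of Dublin Core values.

    marc_record maps (tag, subfield) to a list of strings, because MARC
    fields can repeat. Fields without a mapping are dropped, which is a
    common source of information loss in a crosswalk.
    """
    dc_record = {}
    for key, values in marc_record.items():
        element = MARC_TO_DC.get(key)
        if element is None:
            continue  # no Dublin Core equivalent in this sketch
        dc_record.setdefault(element, []).extend(values)
    return dc_record


# Made-up sample record for illustration only.
sample = {
    ("245", "a"): ["Town meeting records"],
    ("650", "a"): ["Local government", "Massachusetts"],
    ("300", "a"): ["2 volumes"],  # physical description: not mapped here
}
print(crosswalk(sample))
```

The dropped 300 field shows why success depends on how similar the two formats are: anything without a counterpart in the target format is lost unless the crosswalk defines a fallback.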
Interoperability Protocols • Z39.50 – Broadcasts user query to remote databases. – Z39.50 is widely used in library environments and is often incorporated into integrated library systems and personal bibliographic reference software. Interlibrary catalogue searches for interlibrary loan are often implemented with Z39.50 queries.
Interoperability Protocols cont. • OAI-PMH – Open Archives Initiative Protocol for Metadata Harvesting http://www.oaforum.org/tutorial/english/intro.htm – “Harvests” metadata from registered metadata “providers.” – “Services” allow users to query harvested metadata.
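A harvest request in OAI-PMH is an ordinary HTTP request with a `verb` argument. The sketch below only builds the request URL and does not contact any server. The base URL, the set name, and the function name are hypothetical placeholders. `ListRecords`, `metadataPrefix`, `set`, `from`, and the `oai_dc` prefix come from the OAI-PMH specification.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint; substitute a real OAI-PMH base URL.
BASE_URL = "https://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL.

    oai_dc (unqualified Dublin Core) is the metadata format every
    OAI-PMH repository is required to support.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # limit the harvest to one collection
    if from_date:
        params["from"] = from_date  # only records changed since this date
    return base_url + "?" + urlencode(params)


print(list_records_url(BASE_URL))
print(list_records_url(BASE_URL, set_spec="photographs", from_date="2011-01-01"))
```

A harvester would fetch such a URL, parse the XML response, and repeat the request with the returned resumption token until the repository reports no more records.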
International in scope • The involvement of representatives from almost every continent in establishing Dublin Core specifications has ensured that the standard will address the multicultural and multilingual nature of networked resources.
Flexible – All Elements are Optional – All Repeatable
Types of Dublin Core • Simple Dublin Core – 15 elements • Qualified Dublin Core – 15 elements with: • Refinements • Schemes
Digital Commonwealth Dublin Core Practice
Dublin Core Recommendations • Dublin Core Metadata Best Practices • http://www.lyrasis.org/Products-and-Services/Digital-and-Preservation-Services/Digital-Toolbox/Best-Practices-and-Publications.aspx • Sets guidelines for use of DC Elements • Offers input guidelines for consistent data entry • Intended for a shared environment • Created by representatives from seven western states
Dublin Core Elements • Title • Creator • Subject • Description • Publisher • Contributor • Date • Type • Format • Identifier • Source • Language • Relation • Coverage • Rights
DC Element: Title • Definition: The name given to the resource. – Typically, a Title will be a name by which the resource is formally known. Definitions are from the Dublin Core Web Site http://dublincore.org/documents/dcmi-terms/
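Because every element is optional and repeatable, a simple Dublin Core record is just a flat list of element and value pairs. The sketch below serializes one with Python's standard library. The `record` wrapper element, the function name, and the sample values are hypothetical. The namespace URI is the one DCMI publishes for the fifteen-element set.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)


def build_record(values):
    """Serialize a dict of element name -> list of values as simple Dublin Core.

    Every element is optional (missing keys are simply absent) and
    repeatable (each value in a list becomes its own XML element).
    """
    root = ET.Element("record")  # hypothetical wrapper element
    for name, items in values.items():
        for item in items:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = item
    return root


# Made-up sample values for illustration only.
record = build_record({
    "title": ["Main Street, looking north"],
    "subject": ["Streets", "Commercial buildings"],  # repeated element
    "type": ["Image"],
})
print(ET.tostring(record, encoding="unicode"))
```

Leaving out Creator here is valid simple Dublin Core. Whether it is acceptable in practice depends on the local best-practice guidelines, which may require elements that the standard itself leaves optional.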
DC Element: Creator • An entity primarily responsible for making the content of the resource. – Examples of a Creator include a person, an organization, or a service.
DC Element: Subject • A topic of the content of the resource. – Typically, a Subject will be expressed as keywords, key phrases or classification codes that describe a topic of the resource. – Recommended best practice is to select a value from a controlled vocabulary or formal classification scheme – Library of Congress Subject Headings (LCSH); Getty Thesaurus of Geographic Names (TGN); Art and Architecture Thesaurus (AAT); or Thesaurus for Graphic Materials I
Controlled Vocabularies • AAT – Art and Architecture Thesaurus • LCSH – Library of Congress Subject Headings • MeSH – Medical Subject Headings • TGM – Thesaurus for Graphic Materials • TGN – Getty Thesaurus of Geographic Names • ULAN – Getty Union List of Artist Names Use what you use now…
DC Element: Description • An account of the content of the resource. – Description may include but is not limited to: • an abstract, • table of contents, • reference to a graphical representation of content, • or a free-text account of the content.
DC Element: Publisher • An entity responsible for making the resource available. – Examples of a Publisher include • a person, • an organization, • or a service.
DC Element: Contributor • An entity responsible for making contributions to the content of the resource. – Examples of a Contributor include • a person (photographer, translator, etc.) • an organization • or a service.
DC Element: Date • A date of an event in the lifecycle of the resource. – Typically, Date will be associated with the creation or availability of the resource. – Recommended best practice for encoding the date value is defined in a profile of ISO 8601 [W3CDTF] and follows the YYYY-MM-DD format.
DC Element: Type • The nature or genre of the content of the resource. – Type includes terms describing general categories, functions, genres, or aggregation levels for content. – Recommended best practice is to select a value from a controlled vocabulary • For example, the Dublin Core Type Vocabulary. • To describe the physical or digital manifestation of the resource, use the FORMAT element.
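Date values are easy to check at data-entry time. The sketch below accepts the date-only forms of W3CDTF (YYYY, YYYY-MM, and YYYY-MM-DD) and rejects everything else, including impossible calendar dates. The function name is hypothetical, and the time-of-day forms that W3CDTF also allows are deliberately left out.

```python
import re
from datetime import date

# W3CDTF date-only forms: YYYY, YYYY-MM, or YYYY-MM-DD.
W3CDTF_DATE = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


def is_w3cdtf_date(value):
    """Return True if value is a valid date-only W3CDTF string."""
    match = W3CDTF_DATE.fullmatch(value)
    if not match:
        return False
    year, month, day = match.groups()
    try:
        # A missing month or day defaults to 1 so partial dates can be checked.
        date(int(year), int(month or 1), int(day or 1))
    except ValueError:
        return False  # for example month 13 or February 30
    return True


for candidate in ["2011-04-26", "2011-04", "2011", "04/26/2011", "2011-02-30"]:
    print(candidate, is_w3cdtf_date(candidate))
```

Accepting partial dates matters for cultural heritage material, where often only the year or the month of creation is known.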
DC Element: Format • Definition: The physical or digital manifestation of the resource. – Typically, Format may include the media-type or dimensions of the resource. – Format may be used to determine the software, hardware or other equipment needed to display or operate the resource. – Examples of dimensions include size and duration. – Recommended best practice is to select a value from a controlled vocabulary (for example, the list of Internet Media Types [MIME] defining computer media formats).
Format Examples (Element Value: Definition) • image/jpeg: visual file in JPEG format • text/html: text file in HTML format • text/sgml: text file in SGML-encoded format • application/sgml: interactive application based upon SGML encoding • video/mpeg: video file in MPEG format • audio/mp3: sound file in MP3 format • 3,000,000 bytes: file size for a 3 megabyte file • 1 minute: playtime for a digital audio file
DC Element: Identifier • An unambiguous reference to the resource within a given context. – Recommended best practice is to identify the resource by means of a string or number conforming to a formal identification system. – Formal identification systems include: • Uniform Resource Identifier (URI), including the Uniform Resource Locator (URL) • Digital Object Identifier (DOI) • International Standard Book Number (ISBN)
DC Element: Source • A reference to a resource from which the present resource is derived. – The present resource may be derived from the Source resource in whole or in part. – Recommended best practice is to reference the resource by means of a string or number conforming to a formal identification system.
DC Element: Language • A language of the intellectual content of the resource. – Recommended best practice is to use values from a controlled vocabulary standard. Photo in the Framingham State University collection
Controlled Vocabularies for Language – ISO 639-2: three-letter language codes • English = eng • Yiddish = yid • Algonquian languages = alg • Hmong; Mong = hmn • Croatian = hrv
DC Element: Relation • A reference to a related resource. – Recommended best practice is to reference the resource by means of a string or number conforming to a formal identification system. – A prescribed list of qualifiers is used with this element.
DC Element: Coverage • The extent or scope of the content of the resource. – Coverage includes spatial location (a place name or geographic coordinates), temporal period (a period label, date, or date range) or jurisdiction (such as a named administrative entity). – Recommended best practice is to select a value from a controlled vocabulary (for example, the Thesaurus of Geographic Names [TGN]) and, where appropriate, to use named places or time periods in preference to numeric identifiers such as sets of coordinates or date ranges. • Using the Thesaurus of Geographic Names: World, North and Central America, United States, Massachusetts, Middlesex county, Lincoln • World, North and Central America, United States, Massachusetts, Middlesex county
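Media-type values for the Format element can often be derived from the file name rather than typed by hand. The sketch below uses Python's standard `mimetypes` module. The function name and file names are hypothetical, and the result depends on the extension table available on the system, so treat it as a starting point to review rather than an authoritative answer.

```python
import mimetypes


def media_type_for(filename):
    """Guess the Internet Media Type for a file name, or None if unknown."""
    media_type, _encoding = mimetypes.guess_type(filename)
    return media_type


# Made-up file names for illustration only.
for name in ["photo.jpg", "finding-aid.html", "scan.unknownext"]:
    print(name, "->", media_type_for(name))
```

An unrecognized extension yields `None`, which is a useful signal that the Format value needs to be assigned manually.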
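A controlled vocabulary like this can be enforced with a simple lookup at data-entry time. The sketch below holds only the five codes listed on the slide above. The dictionary and function names are hypothetical, and a real project would load the full ISO 639-2 code list instead.

```python
# ISO 639-2 codes taken from the slide above; not the full code list.
ISO_639_2 = {
    "English": "eng",
    "Yiddish": "yid",
    "Algonquian languages": "alg",
    "Hmong; Mong": "hmn",
    "Croatian": "hrv",
}


def language_code(name):
    """Return the three-letter code for a language name, or raise ValueError."""
    try:
        return ISO_639_2[name]
    except KeyError:
        raise ValueError(f"No ISO 639-2 code on file for {name!r}") from None


print(language_code("Yiddish"))
```

Raising an error for unknown names, rather than storing free text, is what keeps the Language element consistent across contributing institutions.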
DC Element: Rights Management • Information about rights held in and over the resource. – Typically, a Rights element will contain a rights management statement for the resource, or reference a service providing such information. – Rights information often encompasses Intellectual Property Rights (IPR), Copyright, and various Property Rights. – If the Rights element is absent, no assumptions can be made about the status of these and other rights with respect to the resource.
Technical Metadata you may consider collecting for the future (not required by Digital Commonwealth) • Information about the creation of the digital version of the original object – Capture equipment used – Software used (version number) – Name of technician – Date of capture/conversion – Resolution – Bit-depth – Size/length of original – File format – File storage location Photo in the collection of the Lincoln Town Archives, Lincoln, MA
Find Us on Facebook • Be a fan and keep in touch!
Contact Information Leigh A. Grinstead Digital Initiatives Consultant 404-520-8615 leigh. [email protected]. org