Preservation Metadata and the OAIS Information Model
A Metadata Framework to Support the Preservation of Digital Objects
A review of the report by the OCLC/RLG Working Group on Preservation Metadata, June 2002
http://www.rlg.org/longterm/pm_framework.pdf
(Report: http://www.oclc.org/research/pmwg/)
Presented by Dan Swaney, 2/26/2004

The OCLC/RLG Working Group
- March 2000: the Working Group was formed by
  - OCLC: Online Computer Library Center, Inc.
  - RLG: Research Libraries Group, Inc.
- Started with a white paper entitled “Preservation Metadata for Digital Objects: A Review of the State of the Art”
  - Introduced the concepts that were followed by the development of the actual framework discussed later.

What is OAIS?
- Open Archival Information System
  - May 1999 (original model)
    - Supported the space community
  - June 2001 (revised model)
    - Extended to support libraries and cultural heritage institutions, government agencies, and the private sector
  - Information Model embedded in OAIS
    - Direct relevance to preservation metadata

OAIS Information Model: The Bottom -- From Data to Information
- Information Object
  - Data Object: a Digital Object or a Physical Object
  - Representation Information: describes the Data Object's bits (101001 = sound file, paragraph of text, an image)
- Knowledge Base (external to the archival system)
  - Example: programmers must have the knowledge base to understand Java source

OAIS Information Model: Moving from the Bottom to the Top
- Information Object
  - Data Object: a Digital Object or a Physical Object
  - Representation Information: describes the Data Object's bits (101001 = sound file, paragraph of text, an image)
- Knowledge Base (external to the archival system)

OAIS Information Model: The Top -- From Object to Package
- Information Package types
  - Submission (SIP)
  - Archival (AIP)
  - Dissemination (DIP)
- An Information Package is built on the Information Object and contains
  - Content Information
  - Preservation Description Information
  - Packaging Information
  - Descriptive Information

Three Types of Information Packages
- Submission Information Package (SIP): the Information Producer submitting an Information Object to the Archive
- Archival Information Package (AIP): the package held by the Archive
- Dissemination Information Package (DIP): the Archive responding to a query request
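
The flow between the three package types can be sketched in a few lines. This is a minimal illustration only: the function names `ingest` and `disseminate`, the dictionary keys, and the sample producer are assumptions made for the example, not part of the report or of any OAIS implementation.

```python
def ingest(sip: dict) -> dict:
    """Turn a submitted package into an archival one by attaching PDI."""
    return {
        "type": "AIP",
        "content": sip["content"],
        # Hypothetical provenance note recorded at ingest time.
        "pdi": {"provenance": [f"ingested from {sip['producer']}"]},
    }


def disseminate(aip: dict) -> dict:
    """Answer a query request with a package derived from the AIP."""
    return {"type": "DIP", "content": aip["content"]}


# Illustrative submission; the producer name and content bits are made up.
sip = {"type": "SIP", "producer": "example producer", "content": b"101001"}
aip = ingest(sip)
dip = disseminate(aip)
```

The point of the sketch is that the archive never hands out the SIP itself: content enters as a SIP, is stored as an AIP with preservation metadata added, and leaves as a DIP.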

Inside the Information Package
- Content Information (CI)
  - ‘Content’ Data Object
  - Representation Info
- Preservation Description Information (PDI): info to manage preservation of the Content Info
  - Reference Info
  - Provenance Info
  - Context Info
  - Fixity Info
- Packaging Information
  - Header block of info that binds together an Archival Information Package
  - Binds together the digital object and its associated metadata
- Descriptive Information
  - Metadata for resource discovery
  - Assists finding aids
  - An abstract?
  - Derived from CI and PDI
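
One way to picture the four parts of a package is as a nested record. The sketch below is an assumption-laden illustration: the class and field names are chosen for this example and the sample values are invented; the report defines the concepts, not this data layout.

```python
from dataclasses import dataclass, field


@dataclass
class ContentInformation:
    data_object: bytes          # the 'Content' Data Object (raw bits)
    representation_info: dict   # what is needed to interpret those bits


@dataclass
class PreservationDescriptionInformation:
    reference: dict = field(default_factory=dict)
    provenance: list = field(default_factory=list)
    context: dict = field(default_factory=dict)
    fixity: dict = field(default_factory=dict)


@dataclass
class InformationPackage:
    content: ContentInformation
    pdi: PreservationDescriptionInformation
    packaging: dict     # binds the digital object to its metadata
    descriptive: dict   # discovery metadata derived from CI and PDI


# Illustrative package; every value here is a made-up example.
package = InformationPackage(
    content=ContentInformation(b"101001", {"format": "plain text"}),
    pdi=PreservationDescriptionInformation(reference={"local_id": "example-1"}),
    packaging={"header": "example header block"},
    descriptive={"title": "Example object"},
)
```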

Implementing Two of the Components of the OAIS Model
- First: Content Information (CI)
  - ‘Content’ Data Object (CDO)
    - Raw data bits
  - Representation Info (2 components)
    - Structure Info: technical description/specification
      - Example: format, data structures, encoding
      - Makes the CDO understandable by machines/systems
    - Semantic Info: explains the data
      - Example: interpret as English, or as temperatures delimited by tabs
      - Makes the CDO understandable by humans
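
The split between Structure Info and Semantic Info can be shown with the slide's own tab-delimited temperature example. This is a rough sketch: the dictionary keys, the sample readings, and the choice of degrees Celsius are assumptions made for illustration.

```python
# The 'Content' Data Object: raw bits with no meaning on their own.
raw = b"12.5\t13.1\t11.8\n"

# Structure Info: enough for a machine to parse the bits (assumed keys).
structure_info = {"encoding": "ascii", "delimiter": "\t"}

# Semantic Info: enough for a human to understand the values (assumed unit).
semantic_info = {"meaning": "temperature", "unit": "degrees Celsius"}


def interpret(data: bytes, structure: dict, semantics: dict) -> list:
    """Decode the bits using Structure Info, then label them using Semantic Info."""
    text = data.decode(structure["encoding"])
    values = [float(v) for v in text.strip().split(structure["delimiter"])]
    return [f"{v} {semantics['unit']}" for v in values]


readings = interpret(raw, structure_info, semantic_info)
```

Without `structure_info` the bytes cannot be split into numbers; without `semantic_info` the numbers cannot be read as temperatures.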

Content Information (CI) Attributes
- Content Information (CI) Package (e.g. a web page and all its required files)
  - ‘Content’ Data Object
  - Representation Information
    - ‘Content’ Data Object Description
    - Environment Description (e.g. the web page requires JavaScript)
- Details for rendering/viewing in human-readable form
- Defines attributes:
  1. Abstract of Steps
     - Steps to restore a ZIP file back to files/folders
     - Steps to restore into a DBMS
  2. Structural Type
  3. Technical Infrastructure
  4. File Description
  5. Installation Requirements
  6. Size
  7. Access Inhibitors
  8. Access Facilitators
  9. Significant Properties (whether to enable special features)
  10. Functionality
  11. Description of Rendered Content
  12. Quirks (lost features)
  13. Documentation

Content Information (CI) Attributes
- Content Information (CI) Package
  - ‘Content’ Data Object
  - Representation Information
    - ‘Content’ Data Object Description
    - Environment Description
      - Hardware Environment
      - Software Environment
        - Rendering Programs
        - Operating System
- Rendering is a two-step process:
  1. Transform
  2. Display/Access
- Defines attributes:
  1. Transform Process
     + Transformer Engine
     + Params
     + Input Format
     + Output Format
     + Location
     + Documentation
  2. Display/Access App
     + Input Format
     + Output Format
     + Location
     + Documentation
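
The two-step rendering process can be sketched as a transform followed by a display step. The example is an illustration under stated assumptions: zlib decompression stands in for the Transformer Engine, UTF-8 decoding stands in for the Display/Access App, and the dictionary keys and values are invented for the example.

```python
import zlib

# Attribute records loosely mirroring the slide's field names; values are illustrative.
transform_process = {
    "transformer_engine": "zlib",
    "parameters": {},
    "input_format": "zlib-compressed bytes",
    "output_format": "UTF-8 text bytes",
}
display_app = {"input_format": "UTF-8 text bytes", "output_format": "str"}


def transform(stored: bytes) -> bytes:
    """Step 1: convert the stored form into something a viewer accepts."""
    return zlib.decompress(stored)


def display(data: bytes) -> str:
    """Step 2: present the transformed data in human-readable form."""
    return data.decode("utf-8")


def render(stored: bytes) -> str:
    """Chain the two steps, as the slide describes."""
    return display(transform(stored))


stored_object = zlib.compress("Hello, archive".encode("utf-8"))
```

The output format of the transform has to match the input format of the display application, which is why both records carry explicit format fields.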

Content Information (CI) Attributes
- Diagram path: CI Package > Representation Information > Environment Description > Software Environment > Operating System
- Defines attributes:
  + OS Name
  + OS Version
  + Location
  + Documentation
- Lacks/needs:
  - Recommended environment, or
  - Minimum environment
  - It's easier to define the environment in terms of recommended or minimum.

Content Information (CI) Attributes
- Diagram path: CI Package > Representation Information > Environment Description > Hardware Environment > Computational Resources, Storage, Peripherals
- Defines attributes:
  1. Computational Resources
     + Microprocessor Required (e.g. Pentium 4, 1 GHz)
     + Memory Required
     + Documentation
     + Location (URL)
  2. Storage
     + Storage Information (e.g. requires 10 GB of disk space)
     + Documentation
     + Location (URL)
  3. Peripherals
     + Peripheral Requirements (e.g. sound card, monitor resolution)
     + Documentation
     + Location (URL)

Content Information (CI) Attributes
- Diagram path: CI Package > Representation Information > Environment Description > Hardware Environment
- Defines attributes:
  4. Hardware Environment as a Whole
     + Location (e.g. the machine is in a ‘technology museum’ or available through an emulation program like VMware)
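
A hardware environment description can be held as a small record and compared against what a machine actually offers. This is only a sketch: the class, its field names, the `meets_minimum` helper, and the memory figures are assumptions for illustration; the processor and 10 GB storage values echo the slide examples.

```python
from dataclasses import dataclass


@dataclass
class HardwareEnvironment:
    microprocessor: str
    memory_mb: int
    storage_gb: int
    peripherals: list
    location: str   # e.g. a URL, a museum, or an emulator


def meets_minimum(available: HardwareEnvironment, minimum: HardwareEnvironment) -> bool:
    """Check memory, storage, and peripherals against a minimum environment."""
    return (
        available.memory_mb >= minimum.memory_mb
        and available.storage_gb >= minimum.storage_gb
        and set(minimum.peripherals) <= set(available.peripherals)
    )


# Memory sizes and locations below are made-up example values.
minimum_env = HardwareEnvironment("Pentium 4 1 GHz", 256, 10, ["sound card"], "emulator")
emulated_env = HardwareEnvironment("emulated x86", 512, 20, ["sound card", "monitor"], "emulator")
too_small_env = HardwareEnvironment("emulated x86", 128, 20, ["sound card"], "emulator")
```

Expressing the description as a minimum makes the check a simple comparison, which is the kind of recommended-or-minimum definition an environment description could use.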

Implementing Two of the Components of the OAIS Model
- Second: Preservation Description Information (PDI)
  - Focuses on the information needed to track the history of the ‘Content’ Data Object
    - How it was added/scanned into digital form
    - Who did it
    - Who took care of it at some point in time
    - Like a library index card in the back of a book tracking who checked it out

PDI’s Four Categories
- Preservation Description Information (PDI)
  - Reference Info
  - Context Info
  - Provenance Info
  - Fixity Info
- Reference Info describes mechanisms for assigning an ID to represent the Data Object both:
  - Locally (within the archive), and
  - Globally (referenced by an external system)
- Defines attributes:
  1. Archival System ID
     + Value
     + Construction Method
     + Responsible Agency
  2. Global ID (ISBN, URL)
     + Value
     + Construction Method
     + Responsible Agency
  3. Resource Description
     + Existing Metadata (e.g. MARC bibliographic record)
     + Existing Records (e.g. bibliographic record in WorldCat)

PDI: 3 Types of Reference Info
- Diagram path: PDI > Reference Information > Archival System Identification, Global Identification, Resource Description
- Defines attributes:
  1. Archival System ID
     + Value
     + Construction Method
     + Responsible Agency
  2. Global ID (ISBN, URL)
     + Value
     + Construction Method
     + Responsible Agency
  3. Resource Description
     + Existing Metadata (e.g. MARC bibliographic record)
     + Existing Records (e.g. bibliographic record in WorldCat)
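
The three kinds of Reference Information map naturally onto a record with two identifiers and a description. The sketch below is illustrative: the class names, the `make_local_id` helper, the zero-padded numbering scheme, and all sample values are assumptions, not identifiers prescribed by the report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    value: str
    construction_method: str
    responsible_agency: str


@dataclass
class ReferenceInformation:
    archival_system_id: Identifier   # local, within the archive
    global_id: Identifier            # referenced by external systems
    resource_description: dict       # existing metadata and records


def make_local_id(prefix: str, sequence: int) -> str:
    """Hypothetical construction method: prefix plus zero-padded sequence number."""
    return f"{prefix}-{sequence:06d}"


# All values are invented examples.
reference = ReferenceInformation(
    archival_system_id=Identifier(
        make_local_id("ARC", 42), "prefix plus sequence number", "Example Archive"
    ),
    global_id=Identifier(
        "urn:example:object-42", "assigned by external registry", "Example Registry"
    ),
    resource_description={"existing_metadata": "MARC bibliographic record"},
)
```

Recording the construction method and responsible agency alongside each value lets a later reader tell who minted an identifier and how to interpret it.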

PDI: Types of Context Information
- Diagram path: PDI > Context Information > Reason for Creation, Relationships (Manifestation, Intellectual Content)
- Defines attributes:
  1. Reason for Creation (e.g. a TIFF file created to save a rare book)
  2. Relationships (e.g. part of a collection, chapters in a book)
     + Manifestation (change history; recording the outcome of a migration)
       + Relationship Type (e.g. translated to HTML)
       + Identification (ID/link to a description of the object)
     + Intellectual Content (e.g. relates a chapter to a book)
       + Relationship Type (e.g. web page, collection)
       + Identification (ID/link to a description of the ‘related’ object)

PDI: Types of Provenance Information
- There are 5 event types defined as attributes:
  1. Origin (Event)
     - Describes the process by which the object was created.
  2. Pre-Ingest (Event)
     - Chain of custody or audit trail.
     - Tracks the history of the content before it was digitized or added to the archive.
  3. Ingest (Event)
     - Tracks how the object was added to the archive
  4. Archival Retention
     - Tracks migration history
  5. Rights Management (Event)
     - Access permissions
     - Legal deposit responsibilities (if sensitive)
- Each event (*.Event) records:
  + Designation
    - Change in custody
    - Migration
  + Procedure
  + Date
  + Responsible Agency
  + Outcome
  + Note
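
The event attributes listed above can be modelled as one record type shared by all five event types, with an audit trail produced by sorting on date. This is a sketch under assumptions: the class, the field names, the lowercase event-type strings, and the sample events are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date

# The five event types named on the slide, as assumed lowercase labels.
EVENT_TYPES = {"origin", "pre-ingest", "ingest", "archival retention", "rights management"}


@dataclass
class Event:
    event_type: str
    designation: str
    procedure: str
    event_date: date
    responsible_agency: str
    outcome: str
    note: str = ""

    def __post_init__(self):
        if self.event_type not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {self.event_type}")


def audit_trail(events: list) -> list:
    """Return one line per event, oldest first."""
    ordered = sorted(events, key=lambda e: e.event_date)
    return [f"{e.event_date.isoformat()} {e.event_type}: {e.designation}" for e in ordered]


# Made-up history for one object, deliberately listed out of order.
history = [
    Event("ingest", "added to archive", "manual upload", date(2003, 5, 1), "Example Archive", "success"),
    Event("origin", "scanned from print", "flatbed scan", date(2002, 1, 15), "Example Library", "success"),
]
trail = audit_trail(history)
```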

PDI: Types of Fixity Information
- Goal: nothing should be altered without the archive knowing when, how, or why.
- Diagram path: PDI > Fixity Information
- Defined attributes:
  1. Object Authentication
     - Digital signature
     - Watermark
     - Checksum
     + Auth Type (e.g. signed using a 160-bit one-way SHA-1 hash)
     + Auth Procedure (pointer to software capable of generating a new SHA-1 hash for comparison)
     + Auth Date (last time this procedure was run)
     + Auth Result (latest result of the authentication procedure)
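
A checksum-based fixity check amounts to recomputing a hash and comparing it with the stored one. The sketch below uses SHA-1 from Python's standard `hashlib` module, which produces a 160-bit digest. The record keys (`auth_type`, `auth_value`, `auth_date`, `auth_result`) and the `check_fixity` helper are assumptions that loosely follow the attribute names on the slide; storing the expected digest under `auth_value` is a choice made for this example.

```python
import hashlib
from datetime import date


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of the data as 40 hex characters (160 bits)."""
    return hashlib.sha1(data).hexdigest()


def check_fixity(data: bytes, fixity: dict, today: date) -> dict:
    """Recompute the hash, compare it with the stored value, and record the result."""
    matches = sha1_hex(data) == fixity["auth_value"]
    return {
        **fixity,
        "auth_date": today.isoformat(),
        "auth_result": "pass" if matches else "fail",
    }


original = b"abc"
fixity_record = {"auth_type": "SHA-1 checksum", "auth_value": sha1_hex(original)}

unchanged = check_fixity(original, fixity_record, date(2004, 2, 26))
altered = check_fixity(b"abd", fixity_record, date(2004, 2, 26))
```

A failed check shows that the bits changed but not who changed them or why; that history belongs to the provenance events.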

Review of the PDI
- Preservation Description Information (PDI): info to manage preservation of the Content Info
  - Reference Info
    - Identifiers both internal and external to the archive (e.g. ISBN, URN)
  - Provenance Info
    - Documents the history of the CI (like a library checkout card that shows who checked out the book)
  - Context Info
    - Relates the CI to why it was created and to other objects
  - Fixity Info
    - Data integrity (checksum, hash, signature)
    - History of changes
    - Keeps content from being altered without knowing when or why

Inside the Information Package
- Content Information (CI)
  - ‘Content’ Data Object
  - Representation Info
- Preservation Description Information (PDI): info to manage preservation of the Content Info
  - Reference Info
  - Provenance Info
  - Context Info
  - Fixity Info
- Packaging Information
  - Header block of info that binds together an Archival Information Package
  - Binds together the digital object and its associated metadata
- Descriptive Information
  - Metadata for resource discovery
  - Assists finding aids
  - An abstract?
  - Derived from CI and PDI

Conclusion
- The Working Group extended the OAIS Information Model to define a framework of metadata elements that implement the concept.
- It focused on only 2 areas critical to preserving a Data Object: Content Information (CI) and Preservation Description Information (PDI).

What’s Next to Do?
- Develop ‘best practices’ toward populating a database archive.
  - Assess degree of technical richness
  - Develop automated algorithms
  - Determine scope of sharing
- Later, move from ‘best practices’ to a formalized standard of processes.