- Number of slides: 55
ISO “Reference Model for an Open Archival Information System (OAIS)”
Tutorial Presentation
IEEE/GSFC Mass Storage Symposium
Lou Reich, CSC
Don Sawyer, NASA/NSSDC
March 15, 1999
10041267 M-1
Outline of Talk
- History
- Reference Model overview
- Digital Archive Directions (DADs) workshop
- Reference Model status
NASA Role
- National Space Science Data Center
  - NASA’s first digital archive
  - Experienced many technology changes since 1966
- Consultative Committee for Space Data Systems
  - International group of space agencies
  - Developed a variety of standards that are independent of science discipline
  - Became the working body for ISO TC 20/SC 13 about 1990
    - TC 20: Aircraft and Space Vehicles
    - SC 13: Space Data and Information Transfer Systems
Initial Archive Standards Proposal
- ISO suggested that SC 13 should develop archive standards
  - Address data used in conjunction with space missions
  - Address intermediate and indefinite long-term storage of digital data
Response to Consultative Committee for Space Data Systems (CCSDS) and ISO TC 20/SC 13
- No framework widely recognized for developing specific digital archive standards
- Begin by developing a ‘Reference Model’ to establish common terms and concepts
- Ensure broad participation, including traditional archives (not restricted to space communities; all participation is welcome!)
- Focus on data in electronic forms, but recognize that other forms exist in most archives
- Follow up with additional archive standards efforts as appropriate
Getting Started
- First open US workshop held October 1995
  - Variety of government, academic, and industry participation, including National Archives
  - Active US working group was formed
- US workgroup activities are fully open
  - New participants always welcome
  - Plans, minutes, and drafts available from the Web
- Broad international workshops also held
  - Britain and France
- Issue resolution at CCSDS international workshops
Results
- Reference Model targeted to several categories of reader
  - Archive designers
  - Archive users
  - Archive managers, to clarify digital preservation issues and assist in securing appropriate resources
  - Standards developers
- Adopting terminology that crosses various disciplines
  - Traditional archivists
  - Scientific data centers
  - Digital libraries
- Getting favorable comments wherever exposed
Reference Model for an Open Archival Information System
Open Archival Information System (OAIS)
- Open
  - Reference Model standard(s) are developed using a public process and are freely available
- Information
  - Any type of knowledge that can be exchanged
  - Independent of the forms (i.e., physical or digital) used to represent the information
  - Data are the representation forms of information
- Archival Information System
  - Hardware, software, and people who are responsible for the acquisition, preservation, and dissemination of the information
  - Additional OAIS responsibilities are identified later and are more fully defined in the Reference Model document
Document Organization
- Introduction
  - Purpose and Scope, Applicability, Rationale, Road Map for Future Work, Document Structure, and Definitions of Terms
- OAIS Concepts
  - High-level view of OAIS functionality and information models
  - OAIS external environment
  - Minimum responsibilities to become an “OAIS”
- Detailed Models
  - Functional model descriptions and information model perspectives
- Migration Perspectives
  - Media migration, compression, and format conversions
- Archive Interoperability
  - Criteria to distinguish types of cooperation among archives
- Annexes
  - Scenarios of existing archives, compatibility with other standards
Purpose, Scope, and Applicability
- Framework for understanding and applying concepts needed for long-term digital information preservation
  - Long-term is long enough to be concerned about changing technologies
  - Starting point for a model addressing non-digital information
- Provides a set of minimal responsibilities to distinguish an OAIS from other uses of ‘archive’
- Framework for comparing architectures and operations of existing and future archives
- Basis for development of additional related standards
- Addresses a full range of archival functions
- Applicable to all long-term archives and to those organizations and individuals dealing with information that may need long-term preservation
- Does NOT specify any implementation
Model View of an OAIS’s Environment
- Producer is the role played by those persons, or client systems, who provide the information to be preserved
- Management is the role played by those who set overall OAIS policy as one component in a broader policy domain
- Consumer is the role played by those persons, or client systems, who interact with OAIS services to find and acquire preserved information of interest
[Diagram: the OAIS (archive) surrounded by Producer, Consumer, and Management]
OAIS Information Definition
- Information is defined as any type of knowledge that can be exchanged, and this information is always expressed (i.e., represented) by some type of data
- In general, it can be said that “Data interpreted using its Representation Information yields Information”
- In order for this Information Object to be successfully preserved, it is critical for an archive to clearly identify and understand the Data Object and its associated Representation Information
[Diagram: Data Object, interpreted using its Representation Information, yields Information Object]
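The statement “Data interpreted using its Representation Information yields Information” can be illustrated with a minimal Python sketch. The class names, the use of a `struct` layout string as Representation Information, and the field names `sensor_id` and `temperature_kelvin` are all assumptions made for illustration; the Reference Model does not prescribe any of them.

```python
import struct
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class DataObject:
    """A digital Data Object: nothing more than a sequence of bits."""
    bits: bytes


@dataclass(frozen=True)
class RepresentationInformation:
    """Hypothetical Representation Information: a binary layout plus the
    meaning attached to each decoded value."""
    struct_format: str            # e.g. "<hf": little-endian int16, then float32
    field_names: Tuple[str, ...]  # higher-level meaning of each value


def interpret(data: DataObject, rep: RepresentationInformation) -> Dict[str, object]:
    """Interpret the bits using the Representation Information."""
    values = struct.unpack(rep.struct_format, data.bits)
    return dict(zip(rep.field_names, values))


rep = RepresentationInformation("<hf", ("sensor_id", "temperature_kelvin"))
data = DataObject(struct.pack("<hf", 7, 2.5))
information = interpret(data, rep)
print(information)
```

Without `rep`, the six bytes in `data` carry no recoverable meaning, which is why the slide stresses that an archive must identify both the Data Object and its Representation Information.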
Information Package Definition
- An Information Package is a conceptual container of two types of information called Content Information and Preservation Description Information (PDI)
[Diagram: Information Package containing Content Information and Preservation Description Information]
Information Package Variants
- Submission Information Package
  - Negotiated between Producer and OAIS
  - Sent to OAIS by a Producer
- Archival Information Package
  - Information Package used for preservation
  - Includes complete set of Preservation Description Information for the Content Information
- Dissemination Information Package
  - Includes part or all of one or more Archival Information Packages
  - Sent to a Consumer by the OAIS
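One possible way to express the three package variants as data types is sketched below in Python. The class layout, the `to_aip` and `to_dip` helpers, and the rule that an AIP must carry all four PDI categories named elsewhere in this deck (Provenance, Context, Reference, Fixity) are illustrative assumptions, not part of the Reference Model, which specifies no implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# PDI categories described in this deck; used here as hypothetical dict keys.
PDI_CATEGORIES = ("provenance", "context", "reference", "fixity")


@dataclass
class InformationPackage:
    """Conceptual container: Content Information plus PDI."""
    content_information: bytes
    pdi: Dict[str, str] = field(default_factory=dict)


@dataclass
class SubmissionInformationPackage(InformationPackage):
    """Sent to the OAIS by a Producer; its PDI may still be incomplete."""


@dataclass
class ArchivalInformationPackage(InformationPackage):
    """Used for preservation; must carry a complete set of PDI."""

    def __post_init__(self) -> None:
        missing = [c for c in PDI_CATEGORIES if c not in self.pdi]
        if missing:
            raise ValueError(f"AIP lacks PDI categories: {missing}")


@dataclass
class DisseminationInformationPackage:
    """Sent to a Consumer; built from one or more AIPs."""
    contents: List[ArchivalInformationPackage]


def to_aip(sip: SubmissionInformationPackage,
           added_pdi: Dict[str, str]) -> ArchivalInformationPackage:
    """Complete the SIP's PDI inside the archive and produce an AIP."""
    return ArchivalInformationPackage(sip.content_information,
                                      {**sip.pdi, **added_pdi})


def to_dip(aips: List[ArchivalInformationPackage]) -> DisseminationInformationPackage:
    """Bundle one or more AIPs for delivery to a Consumer."""
    return DisseminationInformationPackage(list(aips))
```

In this sketch the completeness check lives in the AIP constructor, so an incomplete package cannot enter preservation by accident.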
External Data Flow Diagram
[Diagram: the Producer sends Submission Information Packages to the OAIS, which holds Archival Information Packages; the Consumer sends queries and orders to the OAIS and receives query responses and Dissemination Information Packages. Legend: Entity; Information Package Data Object; Data Flow]
OAIS Responsibilities
- Negotiates and accepts Information Packages from information producers
- Obtains sufficient control to ensure long-term preservation
- Determines which communities (designated) need to be able to understand the preserved information
- Ensures the information to be preserved is independently understandable to the Designated Communities
- Follows documented policies and procedures which ensure the information is preserved against all reasonable contingencies
- Makes the preserved information available to the Designated Communities in forms understandable to those communities
Detailed Models Overview
Overview of Detailed Models
- It was decided to do both a functional and an information model of the OAIS
- Both models were tasked to:
  - Use the models to better communicate OAIS concepts
  - Use a well-established, formal modeling technique
  - Stay as implementation independent as possible
  - Avoid detailed designs
Detailed Models: Information Model
General Principles
- Define classes of “information objects” that illustrate the information necessary to enable long-term storage and access to archives
- The class definitions should be implementation independent
- Use a variant of Object Modeling Technique (OMT) as a notation (being updated to UML)
OMT Notation Overview
[Diagram: legend for the OMT notation]
- Class: Class Name
- Multiplicity of associations: exactly one; many (zero or more); optional (zero or one); one or more (1+)
- Aggregation: Assembly Class made up of Part-1 Class and Part-2 Class
- Specialization: Parent Class with Child-1 Class and Child-2 Class
- Association: Class-1 related to Class-2 by Association Name
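Information Objects
[Diagram: an Information Object consists of a Data Object interpreted using one or more (1+) Representation Information objects; Representation Information is itself interpreted using further Representation Information; a Data Object is either a Physical Object or a Digital Object; a Digital Object consists of one or more (1+) Bit Sequences]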
Representation Information
- The Representation Information accompanying a physical object like a moon rock may give additional meaning, as a result of some analysis, to the physically observable attributes of the rock
- The Representation Information accompanying a digital object, or sequence of bits, is used to provide additional meaning. It typically maps the bits into commonly recognized data types such as character, integer, and real, and into groups of these data types. It associates these with higher-level meanings, which can have complex interrelationships that are also described
Recursive Nature of Representation Information
- Preexisting standards that define primitive data types
- Mapping rules that map those primitive data types into the more complex data type concepts used by the Data Object
- Other semantic information that aids in the understanding of the data, such as a Data Dictionary
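Because Representation Information may itself need further Representation Information, an archive ends up with a network of dependencies. The following Python sketch walks such a network and lists what must be held so that a given object can be interpreted. The network contents, the notion of a `known` set (items assumed to be already understood by the intended community), and the function name `resolve` are hypothetical.

```python
from typing import Dict, List, Set


def resolve(name: str,
            network: Dict[str, List[str]],
            known: Set[str]) -> List[str]:
    """Return every Representation Information item needed to interpret
    `name`, dependencies first, stopping at items listed in `known`."""
    needed: List[str] = []
    seen: Set[str] = set()

    def visit(item: str) -> None:
        if item in seen or item in known:
            return
        seen.add(item)
        for dependency in network.get(item, []):
            visit(dependency)   # recurse into the item's own Representation Information
        needed.append(item)

    visit(name)
    return needed


# Hypothetical network: a format description depends on a primitive data
# type standard, a character code, and a data dictionary.
network = {
    "mission data format": ["IEEE 754", "ASCII", "data dictionary"],
    "data dictionary": ["ASCII"],
}
required = resolve("mission data format", network, known={"ASCII"})
print(required)
```

The recursion stops only where the `known` set says it may, which mirrors the idea that the chain has to end at something the intended readers already understand.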
Sample Representation Net
Types of Information Used in OAIS
Content Information
- The information which is the primary object of preservation
- An instance of Content Information is the information that an archive is tasked to preserve. Deciding what is the Content Information may not be obvious and may need to be negotiated with the Producer
- The Data Object in the Content Information may be either a Digital Object or a Physical Object (e.g., a physical sample, microfilm)
Preservation Description Information
- Provenance Information
  - Describes the source of the Content Information, who has had custody of it, and what its history is
- Context Information
  - Describes how the Content Information relates to other information outside the Information Package
- Reference Information
  - Provides one or more identifiers, or systems of identifiers, by which the Content Information may be uniquely identified
- Fixity Information
  - Protects the Content Information from undocumented alteration
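As a rough illustration of the four PDI categories, and of how Fixity Information can expose undocumented alteration, consider the Python sketch below. Storing each category as a plain string and using a SHA-256 digest as the fixity value are assumptions chosen for brevity; the deck itself mentions CRCs, checksums, and Reed-Solomon coding as examples and does not mandate any algorithm.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PreservationDescriptionInformation:
    provenance: str  # source, custody, and history of the Content Information
    context: str     # relation to information outside the package
    reference: str   # identifier for the Content Information
    fixity: str      # here: hex SHA-256 digest of the content (assumption)


def compute_fixity(content: bytes) -> str:
    """Compute the fixity value for a piece of Content Information."""
    return hashlib.sha256(content).hexdigest()


def is_unaltered(content: bytes, pdi: PreservationDescriptionInformation) -> bool:
    """True if the content still matches the recorded fixity value."""
    return compute_fixity(content) == pdi.fixity


content = b"calibrated magnetometer samples"
pdi = PreservationDescriptionInformation(
    provenance="Processed by the (hypothetical) Mission X data center",
    context="Companion to the Mission X plasma data set",
    reference="example-id-0001",
    fixity=compute_fixity(content),
)
print(is_unaltered(content, pdi))
print(is_unaltered(content + b"!", pdi))
```

A check like `is_unaltered` would typically be run after any copy or migration step, so that a changed bit is caught before the old copy is discarded.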
Example of Preservation Description Information
- Content Information type: Space Science Data
  - Reference: Object identifier; Journal reference; Mission, instrument, and title attribute set
  - Provenance: Instrument description; Processing history; Sensor description; Instrument; Instrument mode; Decommutation map; Software Interface Specifications
  - Context: Calibration history; Related data sets; Mission; Funding history
  - Fixity: CRC; Checksum; Reed-Solomon coding
- Content Information type: Bibliographic Information
  - Reference: ISBN; Title; Author
  - Provenance: Printing history; Copyright; Position in series; Manuscripts; References
  - Context: Related references; Dewey Decimal System; Publishing data; Publisher
  - Fixity: Author digital signature; Cover
- Content Information type: Software Package
  - Reference: Name; Author; Version number; Serial number
  - Provenance: Revision history; License holder; Registration; Copyright
  - Context: Help file; User guide; Related software; Language
  - Fixity: Certificate; Checksum; Encryption; CRC
Descriptive Information
- Contains the data that serves as the input to documents or applications called Access Aids
- Access Aids can be used by a Consumer to locate, analyze, retrieve, or order information from the OAIS
Packaging Information
- Information which, either actually or logically, binds and relates the components of the package into an identifiable entity on specific media
- Examples of Packaging Information include tape marks, directory structures, and filenames
OAIS Archival Information Package
[Diagram: the parts of an Archival Information Package (AIP) and their relationships]
- Descriptor, derived from the AIP: e.g., information supporting customer searches for the AIP
- Packaging Information, which delimits the AIP: e.g., how to find Content Information and PDI on some medium
- Content Information, e.g.:
  - Hardcopy document
  - Document as an electronic file together with its format description
  - Scientific data set consisting of images and text in three electronic files together with format descriptions
- Preservation Description Information (PDI), which further describes the Content Information: e.g., how the Content Information came into being, who has held it, how it relates to other information, and how its integrity is assured
AIP Types
- Based on the difference in Content Object complexity
- Archival Information Units (AIUs) contain a single Data Object as the Content Object
- Archival Information Collections (AICs) contain multiple AIPs in their Content Objects
  - Each member of an AIC is an AIP containing Content Information and PDI
  - The AIC contains unique PDI on the collection process
Package Descriptors and Access Aids
- Package Descriptors are needed by an OAIS to provide visibility and access to the OAIS holdings
- Package Descriptors contain one or more Associated Descriptions, which describe the AIP Content Information from the point of view of a single Access Aid
- Some examples of Access Aids include:
  - Finding Aids: assist the Consumer in locating information of interest
  - Ordering Aids: allow the Consumer to discover the cost of and order AIUs of interest
  - Retrieval Aids: enable authorized users to retrieve the AIU described by the Unit Descriptor from Archival Storage
Information Model Summary
- Presented a model of information objects as containing data objects and representation objects
- Classified information required for long-term archiving into four classes: Content Information, PDI, Packaging Information, and Descriptive Information
- Described how these classes would be aggregated and related in an AIP to fully describe an instance of Content Information
- Presented information needed for access, in addition to that needed for long-term preservation
- Put the access-oriented structures in the context of the other data needed to operate an OAIS
Detailed Models: Functional View
General Principles
- Highlight the major functional areas important to digital archiving
- Use functional decomposition to clarify the range of functionality that might be encountered
  - Don't decompose beyond two levels, to avoid becoming too implementation dependent
  - Provide a useful set of terms and concepts
  - Do not imply that all archives need to implement all the sub-functions
- Identify some common services which are likely to be needed, and are assumed to be available, as underlying support
Common Services
- Modern, distributed computing applications assume a number of supporting services
- Examples of Common Services include:
  - Inter-process communication
  - Name services
  - Temporary storage allocation
  - Exception handling
  - Security
  - File and directory services
OAIS Functional Entities
[Diagram: the functional entities Ingest, Archival Storage, Data Management, Access, and Administration, placed between the PRODUCER, the CONSUMER, and MANAGEMENT. Labeled flows: SIP, AIP, DIP, Descriptive Info., Requests, other information]
- SIP = Submission Information Package
- AIP = Archival Information Package
- DIP = Dissemination Information Package
Functional Entities in an OAIS
- Ingest: provides the services and functions to accept Submission Information Packages (SIPs) from Producers and prepare the contents for storage and management within the archive
- Archival Storage: provides the services and functions for the storage, maintenance, and retrieval of Archival Information Packages
- Data Management: provides the services and functions for populating, maintaining, and accessing both descriptive information, which identifies and documents archive holdings, and internal archive administrative data
- Administration: manages the overall operation of the archive system
- Access: supports Consumers in determining the existence, description, location, and availability of information stored in the OAIS, and allows Consumers to request and receive information products
Ingest Functions
- Schedule Submission Delivery: negotiates a data submission schedule with the Producer
- Receive Submission: provides the appropriate storage capability or devices to receive a SIP from the Producer. The Receive Submission function may represent a legal transfer of custody for the Content Information in the SIP, and may require that special access controls be placed on the contents
- Generate Archival Information Package: transforms one or more SIPs into one or more AIPs that conform to the internal data model of the archive
- Generate Descriptive Information: extracts Descriptive Information from the AIPs to populate the data management system
- Coordinate Updates: responsible for transferring the AIPs to Archival Storage and the Descriptive Information to Data Management
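The last three Ingest functions can be read as a small pipeline: build an AIP from a SIP, extract Descriptive Information from it, then hand both to their respective entities. The Python sketch below follows that reading. The `Archive` class, the dictionary-backed stores, the identifier scheme, and the fields of `SIP` and `AIP` are hypothetical simplifications; schedule negotiation and custody transfer are omitted.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SIP:
    producer: str
    title: str
    content: bytes


@dataclass
class AIP:
    aip_id: str
    title: str
    content: bytes
    provenance: str


class Archive:
    def __init__(self) -> None:
        self.archival_storage: Dict[str, AIP] = {}             # stands in for Archival Storage
        self.data_management: Dict[str, Dict[str, str]] = {}   # stands in for Data Management

    def generate_aip(self, sip: SIP) -> AIP:
        """Transform a SIP into an AIP that fits the archive's internal model."""
        aip_id = f"aip-{len(self.archival_storage) + 1:04d}"
        return AIP(aip_id, sip.title, sip.content,
                   provenance=f"submitted by {sip.producer}")

    def generate_descriptive_information(self, aip: AIP) -> Dict[str, str]:
        """Extract Descriptive Information from the AIP."""
        return {"title": aip.title, "size_bytes": str(len(aip.content))}

    def coordinate_updates(self, aip: AIP, descriptive: Dict[str, str]) -> None:
        """Send the AIP to Archival Storage and the description to Data Management."""
        self.archival_storage[aip.aip_id] = aip
        self.data_management[aip.aip_id] = descriptive

    def ingest(self, sip: SIP) -> str:
        aip = self.generate_aip(sip)
        descriptive = self.generate_descriptive_information(aip)
        self.coordinate_updates(aip, descriptive)
        return aip.aip_id


archive = Archive()
first_id = archive.ingest(SIP("Mission X team", "Magnetometer data", b"\x00\x01\x02"))
print(first_id, archive.data_management[first_id])
```

Keeping `coordinate_updates` as the only method that writes to the two stores makes it the single place where storage and catalog can be kept consistent.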
Ingest Data Flow Diagram
Analysis of Archive Issues Using OAIS RM: Migration
Digital Migration
- Digital Migration is defined to be the transfer of digital information, while intending to preserve it, within the OAIS
  - Focus on preservation of the full information content
  - New information implementation replaces the old
  - OAIS has full control and responsibility over all aspects of the transfer
- Three major motivators are seen to drive Digital Migrations of Archival Information Packages within an OAIS:
  - Media decay
  - Increased cost effectiveness
  - New Consumer service requirements
Digital Migration Approaches
- Four primary types of digital migration in response to motivators, ordered by increasing risk of information loss:
  - Refreshment
    - Media replacement with no bit changes
  - Replication
    - No change to Packaging Information or Content Information bits
  - Repackaging
    - Some bit changes in Packaging Information
  - Transformation
    - Reversible: bit changes in Content Information are reversible by an algorithm
    - Non-reversible: bit changes in Content Information are not reversible by an algorithm
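The four migration types differ in which bits are allowed to change, so a migration can be classified from three yes/no questions. The Python sketch below encodes one reading of the slide. The flag `other_bits_changed` (bits outside both Packaging Information and Content Information) is an assumption introduced to separate Refreshment from Replication, and ranking non-reversible above reversible Transformation is likewise an assumption.

```python
from enum import IntEnum


class Migration(IntEnum):
    """Ordered by increasing risk of information loss."""
    REFRESHMENT = 1
    REPLICATION = 2
    REPACKAGING = 3
    TRANSFORMATION_REVERSIBLE = 4
    TRANSFORMATION_NON_REVERSIBLE = 5


def classify(other_bits_changed: bool,
             packaging_bits_changed: bool,
             content_bits_changed: bool,
             reversible: bool = False) -> Migration:
    """Pick the migration type from which groups of bits changed."""
    if content_bits_changed:
        # Any change to Content Information bits is a Transformation.
        return (Migration.TRANSFORMATION_REVERSIBLE if reversible
                else Migration.TRANSFORMATION_NON_REVERSIBLE)
    if packaging_bits_changed:
        return Migration.REPACKAGING
    if other_bits_changed:
        return Migration.REPLICATION
    return Migration.REFRESHMENT


# Copying a tape to an identical tape, bit for bit.
print(classify(False, False, False).name)
# Losslessly compressing the content, assuming the compression can be undone.
print(classify(False, False, True, reversible=True).name)
```

Because `Migration` is an `IntEnum`, two planned migrations can be compared directly, for example to prefer the lower-risk option when both would satisfy the motivator.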
Analysis of Archive Issues Using OAIS RM: Archive Associations
Archive Interoperability Motivators
- Users of multiple OAIS archives have reasons to wish for some interoperability or cooperation among the OAISs
- Consumers
  - Common finding aids to aid in locating information over several OAIS archives
  - Common Package Descriptor schema for access
  - Common DIP schema for dissemination, or a single global access site
- Producers
  - Common SIP schema for submission to different archives
  - A single depository for all their products
- Managers
  - Cost reduction through sharing of expensive hardware
  - Increasing the uniformity and quality of user interactions with the OAIS
Categories of Archive Interactions
- Independent: no knowledge by one OAIS of standards implemented at another
- Cooperating: potentially common submission standards and common dissemination standards, but no common access. One archive may make subscription requests for key data at the cooperating archive
- Federated: access to all federated OAISs is provided through a common set of access aids that provide visibility into all participating OAISs. Global dissemination and ingest are options
- Shared resources: an OAIS in which Management has entered into agreements with other OAISs to share resources to reduce cost. This requires various standards internal to the archive (such as ingest-storage and access-storage interface standards), but does not alter the community’s view of the archive
Federated Archives
[Diagram: OAIS 1 and OAIS 2, each with Ingest, Access, and Administration entities and its own Local Consumer, joined by a Common Catalog through which a Global Consumer gains access. Labeled flows: Dissemination Information Package (Optional)]
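In the federated case, the common set of access aids can be pictured as a shared catalog that each participating OAIS feeds and that a global Consumer searches. A minimal Python sketch of that idea follows. The `CommonCatalog` class, its `register` and `find` methods, keyword matching on titles, and the sample holdings are all hypothetical; the Reference Model does not define a catalog interface.

```python
from typing import List, Tuple


class CommonCatalog:
    """Shared finding aid giving visibility into all participating OAISs."""

    def __init__(self) -> None:
        self._entries: List[Tuple[str, str]] = []  # (OAIS name, holding title)

    def register(self, oais_name: str, titles: List[str]) -> None:
        """Called by each federated OAIS to publish descriptions of its holdings."""
        for title in titles:
            self._entries.append((oais_name, title))

    def find(self, keyword: str) -> List[Tuple[str, str]]:
        """Return (OAIS name, title) pairs whose title contains the keyword."""
        needle = keyword.lower()
        return [(oais, title) for oais, title in self._entries
                if needle in title.lower()]


catalog = CommonCatalog()
catalog.register("OAIS 1", ["Solar wind plasma data", "Lunar sample images"])
catalog.register("OAIS 2", ["Solar flare catalog"])

hits = catalog.find("solar")
print(hits)
```

The result tells the global Consumer which archive holds each item; retrieving it would still go through that archive's own Access entity, or through a global dissemination service where one is offered.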
Levels of Autonomy in Associated Archives
- No interactions and therefore no association
- Associations that maintain your autonomy. You have to do certain things to participate, but you can leave the association without notice or impact to you
- Associations that bind you by contract. To change the nature of this association you will have to renegotiate the contract. The amount of autonomy retained depends on how difficult it is to negotiate the changes
Reference Model Summary
- Reference Model is to be applicable to all digital archives, and their Producers and Consumers
- Identifies a minimum set of responsibilities for an archive to claim it is an OAIS
- Establishes common terms and concepts for comparing implementations, but does not specify an implementation
- Provides detailed models of both archival functions and archival information
- Discusses OAIS information migration and interoperability among OAISs
Reference Model Acceptance
- Reference Model has been getting good reviews
  - Society of American Archivists 1997 annual meeting
  - NAGARA 1998 annual meeting
  - Various international conferences
- Major context for the June 1998 Digital Archive Directions (DADs) workshop hosted by NARA
  - Attended by representatives of international and US national archives, science data centers, digital libraries, and academic institutions
  - Provided strong support for the model and offered improvements
  - http://ssdoo.gsfc.nasa.gov/nost/isoas/dads/
DADs Workshop Recommendations
- Working groups identified several ‘best practices’ and standards desired, including:
  - Recommended data ingest methodology
  - Best practices for digitizing analogue data
  - Best practices for media selection, testing, and usage
  - Best practices for error control through the archive
  - Archival Submission Information Package standard
  - Consumer Archive Interface standard
  - Unique Archival Information Package Identifier standard
  - AIP Content Layered Model and Standard APIs report
- Plenary recommended a coordination function be established
  - Promote the Reference Model
  - Promote coordination of work on best practices and standards
  - Promote development of an archive accreditation method
Reference Model Status
- Ultimate success of the OAIS Reference Model effort depends on obtaining adequate review and comment
- Reference Model Red Book/ISO Draft International Standard (DIS) expected June 1999
  - Current version is White Book 4.0, available under the Reference Materials heading at:
  - http://ssdoo.gsfc.nasa.gov/nost/isoas/overview.html
- Comments are actively solicited
  - Most productive if received by April 15, 1999
  - Send to donald.sawyer@gsfc.nasa.gov