Managing the Metadata Lifecycle: The Future of DDI at GESIS and ICPSR
Peter Granda, ICPSR
Meinhard Moschner, GESIS
Mary Vardigan, ICPSR
Joachim Wackerow, GESIS
Wolfgang Zenk-Möltgen, GESIS

Research Data Life Cycle
[Diagram labels: Archiving, Concept, Collection, Processing, Distribution, Discovery, Repurposing, Analysis]

Current Uses of DDI
• DDI 2 is used for many different purposes by many different archival institutions, e.g., metadata records for data catalogs, export to Web-based information systems such as Nesstar, long-term preservation, and PDF codebooks
• GESIS and ICPSR are developing procedures and systems to extend the use of DDI in their institutions

DDI 3 Expands in Scope
• To date, use of DDI has been mainly limited to the Distribution and Archiving stages of the data life cycle
• DDI 3 enables use of new elements and structures to extend markup to other stages of the life cycle, both earlier and later
• Emphasis is on projects and tasks already in process at each institution

DDI 3 Use at GESIS
• Structured Comments – Processing
• Translation of EVS Questionnaire – Collection
• Supporting Enhanced Publications – Analysis
• Continuity Guides: Trends by Concepts – Concept, Discovery, Repurposing

Extracting structured information in current workflow
• Example: building derived variables with SPSS
• SPSS setups contain commands and comments
• Necessary steps for using SPSS setups as an information source for DDI
  – Improving comments for automated extraction
    • formalize layout
    • add keywords from a list
  – Extraction of structured comments and related commands by a custom tool
  – Transformation of this information into DDI 3 fragments

Extracting structured information in current workflow

    ***v* Variables/DerivedVariables
    * DESCRIPTION
    *   This section is on derived variables;
    ***v* DerivedVariables/w101_new
    * NAME
    *   w101_new
    * DESCRIPTION
    *   w101_new is a derived variable from w101;
    *   It has the original value from w101
    *   when w102 is equal 1
    *   otherwise it has the value 5;
    * USED VARIABLES
    *   w101, w102
    * SOURCE
    **.
    compute w101_new = 5.
    if ( w102 = 1 ) w101_new = w101.
    **
    * VERSION
    *   2009-04-18
    * AUTHOR
    *   Achim Wackerow
    * EMAIL
    *   joachim. [email protected] org
    ***.

[Diagram: SPSS → Extractor → Result: Report (HTML) and DDI 3 fragments (GenerationInstruction: Description, Command)]
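The extraction and transformation steps can be sketched roughly as follows. This is a minimal illustration, not the custom tool shown on the slide: the parsing rules (a `***v*` header opens a block, `**.` hands over to SPSS commands, `**` resumes the comment, `***.` closes the block) are inferred from the example above, and the keyword list, the function names `extract_blocks` and `to_fragment`, and the namespace-free XML output are assumptions. The output is a simplified fragment and is not schema-valid DDI 3.

```python
import re
import xml.etree.ElementTree as ET

# Keyword list taken from the slide example; a real list may be longer (assumption).
KEYWORDS = {"NAME", "DESCRIPTION", "USED VARIABLES", "SOURCE",
            "VERSION", "AUTHOR", "EMAIL"}
HEADER = re.compile(r"^\*\*\*v\*\s+(\S+)/(\S+)\s*$")

# Abridged copy of the structured comment from the slide.
SAMPLE = """\
***v* Variables/DerivedVariables
* DESCRIPTION
*   This section is on derived variables;
***v* DerivedVariables/w101_new
* NAME
*   w101_new
* DESCRIPTION
*   w101_new is a derived variable from w101;
* USED VARIABLES
*   w101, w102
* SOURCE
**.
compute w101_new = 5.
if ( w102 = 1 ) w101_new = w101.
**
* VERSION
*   2009-04-18
***.
"""


def extract_blocks(setup_text):
    """Collect structured comment blocks and the SPSS commands they wrap."""
    blocks, current, section, in_source = [], None, None, False
    for raw in setup_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = HEADER.match(line)
        if header:  # a new block starts
            current = {"parent": header.group(1), "name": header.group(2),
                       "sections": {}}
            blocks.append(current)
            section, in_source = None, False
        elif current is None:
            continue  # text outside any structured block is ignored
        elif line == "***.":  # end of the block
            current, section, in_source = None, None, False
        elif line == "**.":  # comment closes, SPSS commands follow
            in_source = True
        elif line == "**":  # comment resumes after the commands
            in_source = False
        elif in_source:
            current["sections"].setdefault("SOURCE", []).append(line)
        else:
            text = line.lstrip("*").strip()
            if text in KEYWORDS:
                section = text
                current["sections"].setdefault(section, [])
            elif section is not None:
                current["sections"][section].append(text)
    return blocks


def to_fragment(block):
    """Build a simplified, namespace-free GenerationInstruction element."""
    sections = block["sections"]
    root = ET.Element("GenerationInstruction")
    ET.SubElement(root, "Description").text = " ".join(
        sections.get("DESCRIPTION", []))
    ET.SubElement(root, "Command").text = "\n".join(sections.get("SOURCE", []))
    return ET.tostring(root, encoding="unicode")
```

Calling `to_fragment(extract_blocks(SAMPLE)[1])` yields one fragment for `w101_new` that carries the description text and the two SPSS commands.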

Translation of EVS Questionnaire
[Screenshot: DSDM]
http://zacat.gesis.org

Supporting Enhanced Publications
Publications with References to Data: ddi. de.
DDI 3.1 URN contains: Agency, Object, Version
[Diagram: resolving a URN from a Publication with References (URNs)]
• DDI Alliance: find agency "gesis", return resolver address
• http://resolve.gesis.org: find object, return URL
• http://www.gesis.org/docxyz: request document, return document
• Result: URL of Documentation and/or Data
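The two-step lookup in the diagram (agency to resolver address, then object to URL) can be sketched as below. This is only an illustration under stated assumptions: the URN syntax `urn:ddi:<agency>:<object>:<version>` is a simplified stand-in and not the DDI 3.1 URN grammar, the agency key `gesis`, the object `docxyz`, and the version `1.0` are hypothetical, and both lookup tables are in-memory dictionaries rather than network services. Only the two URLs come from the slide.

```python
from typing import NamedTuple


class DdiUrn(NamedTuple):
    agency: str
    object_id: str
    version: str


def parse_urn(urn: str) -> DdiUrn:
    """Split a simplified URN of the form urn:ddi:<agency>:<object>:<version>."""
    parts = urn.split(":")
    if len(parts) != 5 or parts[0].lower() != "urn" or parts[1].lower() != "ddi":
        raise ValueError(f"not a simplified DDI URN: {urn!r}")
    return DdiUrn(*parts[2:])


# Step 1 (hypothetical registry): agency -> address of that agency's resolver.
AGENCY_REGISTRY = {"gesis": "http://resolve.gesis.org"}

# Step 2 (hypothetical resolver table): (object, version) -> document URL.
OBJECT_TABLES = {
    "http://resolve.gesis.org": {
        ("docxyz", "1.0"): "http://www.gesis.org/docxyz",
    },
}


def resolve(urn: str) -> str:
    """Return the URL of the documentation or data that a URN points to."""
    parsed = parse_urn(urn)
    resolver = AGENCY_REGISTRY[parsed.agency]          # find agency
    table = OBJECT_TABLES[resolver]                     # contact resolver
    return table[(parsed.object_id, parsed.version)]    # find object


print(resolve("urn:ddi:gesis:docxyz:1.0"))
```

Keeping the agency registry separate from the per-agency object table mirrors the diagram: the publication only needs the URN, and each agency stays responsible for where its own objects live.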

Supporting Enhanced Publications
[Screenshots: DSDM; DDI 3 EPE Simple Export Wizard 1.2.0]

Grouping Trends
• Continuity guides in different contexts
  – Synoptical question / variable lists
  – Documentation of changes in question wording / answer scales
• Systematic organization by conceptual categories
  – CodebookExplorer tool (relational DB)
  – Publication as HTML links on variable level in ZACAT
• Taking advantage of DDI 3 in the future
  – Defining the standard and comparison
  – Qualifying relations (e.g. question text modified, scale modified, …)

Continuity guides
[Screenshot labels: Literal question text over time; Conceptual categories; Deviations in answer categories]

Trends by concepts
[Screenshot labels: Trend variables by study; Conceptual categories; Country 1; Country 2]

DDI 3
[Diagram: comparing study units against an "Ex-post Standard"]
• RESOURCE: "Ex-post Standard"
  – Universe, Concept
  – Data Collection: question "Do you …?"
  – Logical Product: CODS 1, CATS 1; category "often" (Cat 1) has value 1
  – Comparison map: Equivalency, Relationship, Description
• GROUP: STUDY UNIT 1 … n
  – DataCollection: question "Have you …?"; question text <>modified<>
  – LogicalProduct: Label <>identical<>; Values <>different<>; <>generation instruction<>; <>scale reversed<>; category "often" (Cat 1) has value 4
• GROUP: STUDY UNIT 8-14
  – DataCollection …, LogicalProduct …
• GROUP: STUDY UNIT 15-x
  – DataCollection …, LogicalProduct …
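As a rough illustration of how qualified relations in a comparison map could drive harmonization, the sketch below records one comparison per study-unit variable and recodes values when the scale is marked as reversed. The class `Comparison`, its field names, the variable names, and the 4-point scale are assumptions made for the example (the diagram only shows "often" coded 1 in the standard and 4 in a study unit); none of this is DDI 3 syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    """One entry of a comparison map: study-unit variable vs. the standard."""
    standard_variable: str
    study_unit: str
    study_variable: str
    question_text: str  # "identical" or "modified"
    scale: str          # "identical" or "reversed"


def harmonize(value: int, comparison: Comparison,
              scale_min: int = 1, scale_max: int = 4) -> int:
    """Recode a study-unit value onto the standard's scale.

    A reversed scale is mirrored (1<->4, 2<->3 on a 4-point scale);
    this plays the role of the generation instruction in the diagram.
    """
    if comparison.scale == "reversed":
        return scale_min + scale_max - value
    return value


# Hypothetical comparison map with two study units.
comparison_map = [
    Comparison("std_freq", "study_unit_1", "v12", "identical", "identical"),
    Comparison("std_freq", "study_unit_2", "q7", "modified", "reversed"),
]

# "often" is coded 4 in study_unit_2 but 1 in the standard.
reversed_entry = comparison_map[1]
print(harmonize(4, reversed_entry))
```

Storing the qualifiers as data keeps documentation ("question text modified") and transformation ("scale reversed") in one place, so the same map can feed both a continuity guide and a recoding step.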

DDI 3 Use at ICPSR
• Information collected from data producers in precollection phase – Concept
• Metadata output from CAI applications – Data Collection
• Processor's dashboard – Data Processing
• Metadata mining: new faceted search tool to facilitate discovery through more precise searching – Data Discovery
• Relational database for comparison and harmonization across studies – Repurposing

SMDS Metadata Modules

OAIS
[Diagram: SIP, AIP, and DIP around an archive with structured metadata (DDI) as backbone; Data / Documents outside of DDI]

SIP (life cycle stages: Concept, Collection, Processing; sources: Custom Tools, CAI Tools, MQDS, SPSS etc.)
- A combination of this information forms a traditional SIP.
- Information from each life cycle stage sent to the archive can be understood as dynamic SIP.
- Self-archiving: dynamic forms can be offered by web (e.g. forms-based).

Archive (structured metadata as backbone)
- The structured metadata combined with data forms the core of the archive.
- The core is headed to efficient processing and reuse of metadata.

AIP
- An easy roundtrip should be possible between the core structure and the AIP.
- The purpose of the AIP is comparable to PDF/A where all fonts are included.
- An AIP must be specially built, because the metadata can include just references to other reused metadata.
- An AIP should include everything of one study, metadata and data inline (e.g. in DDI).
- DDI can also be the main structure of the AIP.
- An AIP would exist beside the core structure in the archive.

DIP Distribution Packages (life cycle stages: Distribution, Discovery, Analysis, Repurposing)
- Web information system, search engines, statistical packages, online analysis

DDI-based archive as collection of reusable components
• Metadata in DDI is structured in small items which can be identified and maintained by one or more institutions
• These parts can be
  – the basis for comparison and metadata mining (discovery of new relationships)
  – a candidate for reuse in other studies or new studies (like standard questions or variables)
[Diagram: Study 1 (study-specific information, items for reuse) → Repository of reusable components → New study]
Repository of reusable components
- Standard concepts
- Standard questions
- Standard variables
- Harmonized information
- Controlled vocabularies

Issues for Discussion
• Advantages and disadvantages of seeking to capture additional metadata throughout the data life cycle
• How much information to make available to funding agencies, data producers, and secondary users?
• Rules for structured documentation and delivery of items to archives for preservation
• An overall DDI tool to capture and curate all metadata and data – the Holy Grail?