
Number of slides: 24
Managing the Metadata Lifecycle: The Future of DDI at GESIS and ICPSR • Peter Granda, ICPSR • Meinhard Moschner, GESIS • Mary Vardigan, ICPSR • Joachim Wackerow, GESIS • Wolfgang Zenk-Möltgen, GESIS
Research Data Life Cycle: Concept, Collection, Processing, Archiving, Distribution, Discovery, Analysis, Repurposing
Current Uses of DDI • DDI 2 is used for many different purposes by many different archival institutions, e.g., metadata records for data catalogs, export to Web-based information systems such as Nesstar, long-term preservation, and PDF codebooks • GESIS and ICPSR are developing procedures and systems to extend the use of DDI in their institutions
DDI 3 Expands in Scope • To date, use has mainly been limited to the Distribution and Archiving stages of the data life cycle • DDI 3 enables use of new elements and structures to extend markup to other stages of the life cycle, both earlier and later • Emphasis is on projects and tasks already in process at each institution
DDI 3 Use at GESIS • Structured Comments – Processing • Translation of EVS Questionnaire – Collection • Supporting Enhanced Publications – Analysis • Continuity Guides: Trends by Concepts – Concept, Discovery, Repurposing
Extracting structured information in current workflow • Example: building derived variables in SPSS • SPSS setups contain commands and comments • Necessary steps for using SPSS setups as an information source for DDI – Improving comments for automated extraction • formalize layout • add keywords from a list – Extraction of structured comments and related commands by a custom tool – Transformation of this information into DDI 3 fragments
Extracting structured information in current workflow
***v* Variables/DerivedVariables
* DESCRIPTION
* This section is on derived variables;
***v* DerivedVariables/w101_new
* NAME
* w101_new
* DESCRIPTION
* w101_new is a derived variable from w101;
* It has the original value from w101
* when w102 is equal 1
* otherwise it has the value 5;
* USED VARIABLES
* w101, w102
* SOURCE
**.
compute w101_new = 5.
if ( w102 = 1 ) w101_new = w101.
**
* VERSION
* 2009-04-18
* AUTHOR
* Achim Wackerow
* EMAIL
* joachim.wackerow@gesis.org
***.
[Diagram labels: SPSS, Extractor, Report (HTML), DDI 3 fragments (GenerationInstruction: Description, Command), Result]
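The extraction step named on these slides could be sketched roughly as follows. This is a minimal illustration in Python, not the custom tool used at GESIS: the keyword list, the line layout of the sample, the function names `extract_blocks` and `to_fragment`, and the simplified, non-namespaced XML output are all assumptions made for the example. The output is not schema-valid DDI 3.

```python
import re
import xml.etree.ElementTree as ET

# Keyword list is an assumption, taken from the example comment on the slide.
KEYWORDS = {"NAME", "DESCRIPTION", "USED VARIABLES", "SOURCE",
            "VERSION", "AUTHOR", "EMAIL"}
HEADER = re.compile(r"^\*\*\*v\*\s+(\S+)$")

# Sample SPSS setup; the line breaks are an assumed layout.
SAMPLE = """
***v* Variables/DerivedVariables
* DESCRIPTION
* This section is on derived variables;
***v* DerivedVariables/w101_new
* NAME
* w101_new
* DESCRIPTION
* w101_new is a derived variable from w101;
* USED VARIABLES
* w101, w102
* SOURCE
**.
compute w101_new = 5.
if ( w102 = 1 ) w101_new = w101.
**
* VERSION
* 2009-04-18
***.
"""


def extract_blocks(setup_text):
    """Return one dict per '***v*' header found in an SPSS setup."""
    blocks, current, section = [], None, None
    for raw in setup_text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:                      # a new documented item starts
            current = {"id": header.group(1)}
            blocks.append(current)
            section = None
        elif current is None or not line:
            continue
        elif line == "***.":            # end marker of the item
            current, section = None, None
        elif line in ("**.", "**"):     # delimiters around the SPSS commands
            continue
        elif line.startswith("*"):      # comment line: keyword or content
            text = line.lstrip("*").strip()
            if text in KEYWORDS:
                section = text
                current.setdefault(section, [])
            elif section is not None:
                current[section].append(text)
        elif section == "SOURCE":       # plain SPSS command inside SOURCE
            current[section].append(line)
    return blocks


def to_fragment(block):
    """Build a simplified XML fragment (hypothetical, not schema-valid DDI 3)."""
    root = ET.Element("GenerationInstruction", {"id": block["id"]})
    ET.SubElement(root, "Description").text = " ".join(block.get("DESCRIPTION", []))
    ET.SubElement(root, "Command").text = "\n".join(block.get("SOURCE", []))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    for item in extract_blocks(SAMPLE):
        print(to_fragment(item))
```

Keeping the parser keyed on a fixed keyword list mirrors the slide's point that comments must first be formalized before they can be extracted automatically.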
Translation of EVS Questionnaire • DSDM • http://zacat.gesis.org
Supporting Enhanced Publications • Publications with References to Data: ddi.de. • DDI 3.1 URN contains: Agency, Object, Version • Diagram: Publication with References (URNs) – (1) DDI Alliance: find agency gesis, return resolver address – (2) http://resolve.gesis.org: find object, return URL – (3) http://www.gesis.org/docxyz: request document, return document • Result: URL of Documentation and/or Data
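The two-step lookup on this slide (agency to resolver, then object to URL) might be sketched as below. This is only an illustration under stated assumptions: the URN form `urn:ddi:<agency>:<object>:<version>` is a simplification and not the exact DDI 3.1 syntax, the in-memory dictionaries stand in for the real registry and resolver services, and the object identifier `docxyz` and version `1.0.0` are hypothetical.

```python
from typing import NamedTuple


class DdiUrn(NamedTuple):
    agency: str
    object_id: str
    version: str


def parse_urn(urn):
    """Split a URN of the assumed form urn:ddi:<agency>:<object>:<version>."""
    parts = urn.split(":")
    if len(parts) != 5 or parts[0].lower() != "urn" or parts[1].lower() != "ddi":
        raise ValueError(f"not a DDI URN in the assumed form: {urn!r}")
    return DdiUrn(*parts[2:])


# Step 1 (stand-in for the agency registry): agency -> resolver address.
AGENCY_RESOLVERS = {"gesis": "http://resolve.gesis.org"}

# Step 2 (stand-in for the agency's resolver): (object, version) -> URL.
# The entry below is hypothetical example data.
OBJECT_URLS = {
    "http://resolve.gesis.org": {
        ("docxyz", "1.0.0"): "http://www.gesis.org/docxyz",
    },
}


def resolve(urn):
    """Return the URL of the documentation and/or data referenced by a URN."""
    ref = parse_urn(urn)
    resolver = AGENCY_RESOLVERS[ref.agency]                    # find agency
    return OBJECT_URLS[resolver][(ref.object_id, ref.version)]  # find object


if __name__ == "__main__":
    print(resolve("urn:ddi:gesis:docxyz:1.0.0"))
```

Separating the agency lookup from the object lookup means each agency can maintain its own resolver without the central registry knowing about individual objects.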
Supporting Enhanced Publications • DSDM • DDI 3 • EPE Simple Export Wizard 1.2.0
Grouping Trends • Continuity guides in different contexts – Synoptic question / variable lists – Documentation of changes in question wording / answer scales • Systematic organization by conceptual categories – CodebookExplorer tool (relational DB) – Publication as HTML links on variable level in ZACAT • Taking advantage of DDI 3 in the future – Defining the standard and comparison – Qualifying relations (e.g. q-text modified, scale modified, …)
Continuity guides Literal question text over time Conceptual categories Deviations in answer categories
Trends by concepts Trend variables by study Conceptual categories Country 1 Country 2
DDI 3 RESOURCE "Ex-post Standard" • Universe • Concept • Comparison map – Equivalency – Relationship – Description • Data Collection
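One way to picture a comparison map with qualified relations is the small data structure below. It is a sketch only: the class and field names, the relationship labels other than "q-text modified" and "scale modified", and all study and variable identifiers are hypothetical, and nothing here reproduces the actual DDI 3 Comparison module.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComparisonItem:
    standard: str        # item of the ex-post standard (hypothetical id)
    variable: str        # study variable mapped to it (hypothetical id)
    relationship: str    # qualifier of the relation
    description: str = ""


# Hypothetical example data.
ITEMS = [
    ComparisonItem("std/trust", "study1990/v12", "identical"),
    ComparisonItem("std/trust", "study1999/v20", "q-text modified"),
    ComparisonItem("std/trust", "study2008/v31", "scale modified"),
    ComparisonItem("std/religion", "study1990/v40", "identical"),
]


def continuity_guide(items):
    """Group mapped variables by standard item, keeping the qualifier."""
    guide = {}
    for item in items:
        guide.setdefault(item.standard, []).append(
            (item.variable, item.relationship))
    return guide


def strictly_comparable(items, standard):
    """Variables mapped to a standard item without any modification."""
    return [i.variable for i in items
            if i.standard == standard and i.relationship == "identical"]


if __name__ == "__main__":
    for standard, variables in continuity_guide(ITEMS).items():
        print(standard, variables)
```

Storing the qualifier on each mapping lets a trend analysis choose between all related variables and only those that are strictly equivalent.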
DDI 3 Use at ICPSR • Information collected from data producers in pre-collection phase – Concept • Metadata output from CAI applications – Data Collection • Processor's dashboard – Data Processing • Metadata mining: New faceted search tool to facilitate discovery through more precise searching – Data Discovery • Relational database for comparison and harmonization across studies – Repurposing
SMDS Metadata Modules
OAIS: SIP, AIP, DIP and DDI • Life cycle stages: Concept, Collection, Processing, Distribution, Discovery, Analysis, Repurposing • Sources of information: CAI Tools, MQDS, SPSS etc., Custom Tools, Self-archiving (e.g. Forms-based; dynamic forms can be offered by web) • Data / Documents outside of DDI • SIP – Information from each life cycle stage sent to the archive can be understood as a dynamic SIP. – A combination of this information forms a traditional SIP. • Core – The structured metadata combined with data forms the core of the archive. – It would be organised in a way where metadata can be reused and information can be extracted in a different way. – The core structured metadata, as backbone for the archive, is headed to efficient processing and reuse of metadata. • AIP – An AIP should include everything of one study; DDI can also be the main structure of the AIP. Data can be inline in DDI. – An AIP must be specially built, because the metadata can include just references to other reused metadata. – The purpose of the AIP is comparable to PDF/A where all fonts are included. – An easy roundtrip should be possible between the core structure and the AIP. – An AIP would exist beside the core structure in the archive. • DIP – Distribution Packages: Web information system, Search engines, Statistical packages, Online Analysis
DDI-based archive as collection of reusable components • Metadata in DDI is structured in small items which can be identified and maintained by one or more institutions • These parts can be – the basis for comparison and metadata mining (discovery of new relationships) – a candidate for reuse in other studies or new studies (like standard questions or variables) • Diagram: Study 1 (study-specific information, items for reuse), New study, Repository of reusable components (standard concepts, standard questions, standard variables, harmonized information, controlled vocabularies)
Issues for Discussion • Advantages and disadvantages of seeking to capture additional metadata throughout the data life cycle • How much information to make available to funding agencies, data producers, and secondary users? • Rules for structured documentation and delivery of items to archives for preservation • An overall DDI tool to capture and curate all metadata and data – the Holy Grail?