  • Number of slides: 38

eCrystals Federation: Open Repositories for Open Science
Dr Liz Lyon, UKOLN, University of Bath, UK
Dr Simon Coles, University of Southampton, UK
Dr Manjula Patel, UKOLN, University of Bath, UK
CNI Taskforce Meeting, Washington DC, December 2007
This work is licensed under a Creative Commons Licence Attribution-ShareAlike 3.0 http://creativecommons.org/licenses/by-sa/3.0/

Overview
1. Chemistry and Open Science: context and practice
2. Lessons learnt from eBank Phase 3
3. Data curation and preservation issues
4. Setting up the Federation: Challenges ahead?

Chemistry and Open Science: context and practice

Social networks for chemists…
New postgraduate cohorts: millennials / Google generation: new behaviours

>8000 views
Community content for chemists: rich media video + paper = Pubcast

At the coalface: tagging & sharing workflows
Astronomy, Bioinformatics, Chemistry, Social Science pilots. Universities of Manchester & Southampton

“Small science”: sharing in the lab

Open Wetware: laboratory wikis

Transforming practice?
2006
Open Notebook Science (ONS)
26 September: 1st use of term, blogged by Jean-Claude Bradley, Drexel University

2007
27 March: ONS at Amer Chem Society Symposium
7 August: ONS Poster in Second Life on Nature island
24 September: ONS Case Studies in Second Life
4 October: >43,000 hits in Google for term ONS

10 & 15 October: Policy lists, DabbleDB membership database created (US)
11 October: ONS experiment starts in Cambridge, UK
7 November: Cameron Neylon (Univ Southampton / STFC, UK) posts “Sourceforge for Science” concept

10 November: Open Data for common molecules: Wikichemicals? Peter Murray-Rust’s blog at Univ. Cambridge, UK
27 November: Research Network proposal submitted to UK research council
Yesterday: about 2,400,000 Google hits for Open Notebook Science
New ideas are surfacing very fast with instant development, testing and take-up…

eBank Project – building the eCrystals Data Repository
Institutional Repository exemplar http://ecrystals.chem.soton.ac.uk

Metadata Publication
• Using simple Dublin Core
• Crystal structure
• Title (Systematic IUPAC Name)
• Authors
• Affiliation
• Creation Date
• Additional chemical information through Qualified Dublin Core
• Empirical formula
• International Chemical Identifier (InChI)
• Compound Class & Keywords
• Specifies which ‘datasets’ are present in an entry
• DOI http://dx.doi.org/10.1594/ecrystals.chem.soton.ac.uk/145
• Rights & Citation http://ecrystals.chem.soton.ac.uk/rights.html
• Application Profile http://www.ukoln.ac.uk/projects/ebank-uk/schemas/
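As a rough illustration of the simple Dublin Core part of this slide, the sketch below serializes a handful of fields as a generic `oai_dc` record using Python's standard library. It is not the eBank application profile: the function name `build_dc_record` is hypothetical, and every field value except the DOI and rights URLs quoted on the slide is a placeholder.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for simple Dublin Core as exposed over OAI-PMH.
DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"


def build_dc_record(fields):
    """Serialize (element name, value) pairs as a simple Dublin Core record."""
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("oai_dc", OAI_DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values for illustration; only the two URLs come from the slide.
example_fields = [
    ("title", "Example systematic IUPAC name"),
    ("creator", "A. Crystallographer"),
    ("date", "2007-12-01"),
    ("type", "Crystal structure"),
    ("identifier", "http://dx.doi.org/10.1594/ecrystals.chem.soton.ac.uk/145"),
    ("rights", "http://ecrystals.chem.soton.ac.uk/rights.html"),
]

record_xml = build_dc_record(example_fields)
print(record_xml)
```

The chemical fields (empirical formula, InChI, compound class) would need the qualified terms defined in the application profile, which this sketch does not attempt to reproduce.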

[Diagram labels: wikis, blogs, Harvest, Publish]

Lessons learnt from eBank Phase 3

Study Aims and Approach
• Scoping the eCrystals Federation of crystallography data repositories
• Questionnaire and interview-based
• Joint Consultation Workshop (eBank, R4L, SPECTRa) & Report
• Engage whole data lifecycle community: crystallographers, central facilities, publishers, data centres, and chemical information specialists
• Mixed project team: Chemists, Digital Library researchers & Computer Scientists

Lessons: Policy and practice
• Must be considered at level of the Institution and the practising Laboratory
• Mixed lab practice – central service facility versus single “staff crystallographer” in department
• “Repository Lite” for smaller lab operations?
• Established data ‘publication’ practice + domain subject repository: Cambridge Crystallographic Data Centre (CCDC)
• Institutional policy buy-in is essential
• Demonstrate benefits and added value to senior managers
• Implications for information services structure

Interoperability & Standards
• Instrument manufacturers’ proprietary formats
• Technical software platform
• Metadata schema: Application profiles
• Standards and identifiers – International Chemical Identifier (InChI), DOI, CIF, CML, de facto software X-ray diffractometers
• Semantic interoperability

Subject Repositories, Publishing and IPR
• Established subject repository at CCDC (40 years old!): repository interactions?
• The “embargo problem”: prior dissemination affecting publication of journal article
• Cultural issues related to chemists: “it’s my data” (journal article will always be sacred)
• Mechanisms for sharing with collaborators and referees prior to publication?

Advocacy
• The most important issue?!?

Data curation and preservation issues

Digital Curation Centre http://www.dcc.ac.uk/
• Community Development work
• Led by UKOLN
• eBank/eCrystals partner

eBank-UK Phase 3 Curation & Preservation Study http://www.ukoln.ac.uk/projects/ebankuk/curation/
Examined four main areas
1. Audit and certification (TRAC, DRAMBORA, NESTOR, ISO International repository audit and certification BOF Group)
2. The Open Archival Information System (OAIS) and Representation Information (RI)
3. eBank-UK application profile and preservation metadata
4. ePrints.org repository platform

Observations & Recommendations 1
• Self-assessment using DRAMBORA toolkit
• Engage DCC audit & certification team
• Formulation of long-term objectives and policy
  – Deposit agreements
  – Services
• Aim for community-supported sustainability plan
• Implement regular audits: annual
• Produce evidence of compliance
  – Documentation
  – Transparency
  – Adequacy
  – Measurability
• Federation context

Observations & Recommendations 2
• Maintenance and open access of critical file formats and software
  – Work-up software e.g. XPREP
  – Export raw data from instrumentation as imgCIF
• Consider Representation Information (RI) in context of whole crystallography landscape (CCDC, IUCr etc.)
• Develop a preservation and curation strategy and formal policies to indicate levels of service
  – Deposit, ingest, validation, dissemination
• Consider services to be developed over the DCC Registry/Repository of Representation Information (RRoRI)

Observations & Recommendations 3
• Develop preservation strategy & plan for the specific content
• Capture preservation metadata, including versioning and provenance information
• PREMIS Data Dictionary
  – Semantic Units (e.g. file format, significant properties, provenance, fixity info)
  – Extend eBank metadata application profile (AP)?
• Obtain consensus on AP
• Seek to automate metadata generation, extraction, maintenance
• ePrints.org support for information packages
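The fixity information mentioned above can be generated automatically at deposit time. The following is a minimal sketch, assuming a SHA-256 checksum and a plain dictionary whose key names are illustrative only; they are not PREMIS semantic unit names, and the helper functions are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def fixity_record(name, payload, algorithm="sha256"):
    """Compute a checksum for a deposited file and wrap it in a small record."""
    digest = hashlib.new(algorithm, payload).hexdigest()
    return {
        "file_name": name,
        "size_bytes": len(payload),
        "digest_algorithm": algorithm,
        "digest": digest,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


def verify_fixity(record, payload):
    """Recompute the checksum and compare it with the stored value."""
    current = hashlib.new(record["digest_algorithm"], payload).hexdigest()
    return current == record["digest"]


# Placeholder bytes standing in for a deposited data file.
sample = b"data_example\n_cell_length_a 10.0\n"
record = fixity_record("example.cif", sample)

print(verify_fixity(record, sample))          # unchanged content
print(verify_fixity(record, sample + b"x"))   # altered content
```

Re-running `verify_fixity` during a periodic audit is one way to produce evidence that stored files have not changed since ingest.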

Setting up the Federation: Challenges ahead?

eCrystals Federation Data Deposit Model
[Diagram labels: Funder, Scientist, Create, Deposit, Data centres / aggregator services, Advisory, IR, Federation, Curate, Policy, Preserve, Advocacy, Standards, Training, Collaborate, Share, Harvest, Link, Discover, Re-use, Publishers, User]

Repository deployment & support
• Roll-out in 2 phases
  – Universities Sydney, Glasgow, Newcastle with eprints.org platform
  – Universities Cambridge, STFC, ReciprocalNet, ARCHER with other platforms
• Information Environment Service Registry (IESR) listing Federation Collections

Laboratory Workflow & Provenance
• Achieving end-to-end workflows: avoiding fragmentation of data, results and interpretations
• Account for differing laboratory practice
[Diagram labels: RAW DATA, DERIVED DATA, RESULTS DATA, Public domain material]

Repository interoperability & linking services
• Establish core Federation application profile and mappings
• Bi-directional links with derived articles in “publisher repositories”, IUCr, Royal Society of Chemistry (RSC), Chemistry Central
• Test linking options: StORe middleware and CLADDIER (JISC-funded projects)
• OAI-ORE Pathways Project developments

Interoperability testbed
• Experimental data sets + metadata as compound objects
• Dublin Core and METS not sufficient
• OAI-ORE (base: Atom Publishing Protocol) testbed
• Enable 3rd party services e.g. data / text mining
• eChemistry project

Enabling data discovery
• Royal Society of Chemistry Project Prospect tagging & semantic linking

Preservation & Sustainability
• DRAMBORA Assessment: use DRAMBORA Interactive
• Enhance Application Profile with PREMIS preservation metadata
• Populate RRoRI with crystallography representation information
• Examine repository platform conformance to OAIS Ref Model
• Survey partner institutional preservation policies

Embedding into current publishing practice
• Chemists still want to publish scholarly articles
• Blogs and repositories are a new form of rapid communication, but there are prior publication concerns
• Timing of release of data into public domain and formal publication will be crucial
  – Repository must provide control over timing of public visibility
  – EPrints 3 version of eCrystals has ‘embargo tokens’
• Validation and quality in an ‘Open’ world
  – Quality indicators?
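The slide does not describe how the EPrints 3 ‘embargo tokens’ work, so the sketch below is not that mechanism. It only illustrates, under an assumed date-based model with hypothetical names (`Entry`, `embargo_until`, `is_publicly_visible`), what "control over timing of public visibility" could look like in a repository.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Entry:
    """A repository entry with an optional embargo end date (hypothetical model)."""
    identifier: str
    embargo_until: Optional[date] = None  # None means no embargo applies


def is_publicly_visible(entry, today):
    """An entry is public if it has no embargo or the embargo date has been reached."""
    return entry.embargo_until is None or today >= entry.embargo_until


open_entry = Entry("entry-1")
held_entry = Entry("entry-2", embargo_until=date(2008, 6, 1))

print(is_publicly_visible(open_entry, date(2008, 1, 1)))
print(is_publicly_visible(held_entry, date(2008, 1, 1)))
print(is_publicly_visible(held_entry, date(2008, 6, 1)))
```

A real deployment would also need a way to share embargoed entries with collaborators and referees before release, which this date check alone does not cover.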

Advocacy
• Chemists still wary of ‘Open Access’
• eCrystals Roadshow Workshops engaging both crystallographers and their service ‘users’ in the workplace
• Open forum at International Union of Crystallography world congress (Aug 2008)
• Publishers Workshop to demonstrate co-existence of open data models & traditional

Questions?
Slides will be available at: http://wiki.ecrystals.chem.soton.ac.uk/index.php
This work is licensed under a Creative Commons Licence Attribution-ShareAlike 3.0 http://creativecommons.org/licenses/by-sa/3.0/