59435f3a9c356c01b62a4d4665de4935.ppt
- Number of slides: 40
Digital Object Identifiers for Science Data
Norman Paskin, International DOI Foundation
Digital Object Identifier = DOI
• A name (not a location) for an entity on digital networks
• A system for persistent and actionable identification and interoperable exchange of managed information on digital networks
  – Standards-based components (detail in a moment)
  – Now to become an International Standard (in ISO TC 46)
• Developed as a cross-industry, cross-sector, not-for-profit effort managed by an open-membership collaborative development body
  – International DOI Foundation (IDF)
• In widespread use now:
  – Over 15 million assigned, over 1000 naming authorities (users)
  – Key feature of scientific primary publishing as part of the CrossRef system
  – Adopted for government documents (EC, OECD, UK, etc.)
• In use, is a mechanism "behind the scenes"
  – e.g. looks like a URL in a web context
• Offers an interoperable common system for identification of science data; two projects considered as examples:
  – TIB project (citation of primary data sets)
  – Names for Life (biological taxonomy)
Identifiers
• The word "identifier" can mean several different things, e.g.:
  – Labels: output of numbering schemes, e.g. "ISBN 3-540-40465-1"
  – Specifications for using labels: e.g. on the internet URL, URN, URI (URI = Uniform Resource Identifier)
  – Implemented systems: labels, following a specification, in a system, e.g. the DOI system. A packaged system offering label + tools + implementation mechanisms
• Requirements: reliability, automated global access, and interoperability
  – Interoperability = the possibility of use in services outside the direct control of the issuing assigner
• Persistence implies interoperability (with the future)
• Interoperability implies extensibility (future uses are not known)
• Hence DOI is a generic framework applicable to any digital object
  – A digital object can be a representation of any entity
DOI is the combination of these four components
• Numbering scheme
• Internet resolution
• Data model
• Policies
DOI syntax can include any existing identifier "label", formal or informal, of any entity
• An identifier "container", e.g.
  – 10.1234/NP5678
  – 10.5678/ISBN-0-7645-4889-4
  – 10.2224/2004-10-ISO-DOI
• NISO Z39.84, DOI Syntax
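The "container" idea above can be sketched in a few lines of Python: a DOI is split at the first slash into a prefix and a suffix, and the suffix may carry any existing label. The function name `split_doi` is an assumption made for illustration, and the checks are deliberately loose; this is not an implementation of the full NISO Z39.84 rules.

```python
def split_doi(doi: str) -> tuple[str, str]:
    """Split a DOI string into (prefix, suffix) at the first '/'."""
    doi = doi.strip()
    # Accept an optional "doi:" label in front of the identifier.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    prefix, sep, suffix = doi.partition("/")
    # Loose sanity check only: prefix begins with "10." and a suffix exists.
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    return prefix, suffix


# The suffix is a container: here it carries an existing ISBN label.
print(split_doi("10.5678/ISBN-0-7645-4889-4"))
```

Splitting only at the first slash matters because the suffix itself may contain further slashes.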
Internet resolution allows a DOI to link to any & multiple pieces of current data
• Resolve from DOI to data
  – initially to location (URL): persistence
• May be to multiple data:
  – Multiple locations
  – Metadata
  – Services
  – Extensible, user-defined
• Uses the Handle System
  – Implementing the URI/URN concept
  – Running on TCP/IP (common co-inventor)
  – IETF RFCs 3650, 3651, 3652
  – See Release 1.0, September 2003, "Online Registries: The DNS and Beyond..." [doi:10.1340/309registries]
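A toy illustration of multiple resolution, assuming a simple in-memory record: one DOI holds several typed values, and the values can be updated while the DOI itself stays fixed. The class `DoiRecord`, its methods, the value type names and the example.org addresses are all hypothetical; this is not the Handle System protocol or its API.

```python
from dataclasses import dataclass, field


@dataclass
class DoiRecord:
    """Toy stand-in for a resolution record: one DOI, many typed values."""
    doi: str
    values: dict[str, list[str]] = field(default_factory=dict)

    def add(self, value_type: str, data: str) -> None:
        self.values.setdefault(value_type, []).append(data)

    def resolve(self, value_type: str = "URL") -> list[str]:
        # Return every current value of the requested type (may be empty).
        return list(self.values.get(value_type, []))


record = DoiRecord("10.1234/NP5678")
record.add("URL", "http://example.org/copy-a")  # hypothetical locations
record.add("URL", "http://example.org/copy-b")
record.add("METADATA", "http://example.org/np5678.xml")

# The DOI stays fixed; only the stored value changes when the data moves.
record.values["URL"][0] = "http://example.org/new-home"
print(record.resolve("URL"))
```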
<indecs> Data Dictionary + DOI AP framework
• DOI Data Model = Metadata tools:
  – a data dictionary to define +
  – a grouping mechanism to relate
• Necessary for interoperability
  – "Enabling information that originates in one context to be used in another in ways that are as highly automated as possible"
• Able to use existing metadata
  – Mapped using a standard dictionary
  – Can describe any entity at any level of granularity
  – indecsDD, which incorporates ISO MPEG 21 RDD
• IDF is the MPEG 21 RDD registration authority
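A rough sketch of what "mapped using a standard dictionary" could look like in practice: two records that use different field names are re-expressed in one shared vocabulary and then compared. The shared terms, the local scheme's field names and the `station` value are hypothetical placeholders, not actual indecs Data Dictionary entries; `creator`, `title` and `date` are Dublin Core element names.

```python
# Hypothetical shared terms standing in for data dictionary entries.
DC_TO_SHARED = {"creator": "contributor", "title": "name", "date": "date_of_creation"}
LOCAL_TO_SHARED = {"measured_by": "contributor", "dataset_label": "name"}


def to_shared(record: dict[str, str], mapping: dict[str, str]) -> dict[str, str]:
    """Re-express a record's fields in the shared vocabulary; drop unmapped fields."""
    return {mapping[key]: value for key, value in record.items() if key in mapping}


dc_record = {"creator": "Weather", "title": "Weather in Hannover for 2003"}
local_record = {
    "measured_by": "Weather",
    "dataset_label": "Weather in Hannover for 2003",
    "station": "H1",  # made-up field with no shared equivalent
}

# Once mapped, records from different schemes can be compared directly.
print(to_shared(dc_record, DC_TO_SHARED) == to_shared(local_record, LOCAL_TO_SHARED))
```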
DOI policies allow any model for practical implementations
• Implementation through IDF
  – Governance and agreed scope, policy, "rules of the road"
  – Technical infrastructure: resolution mechanism, proxy servers, mirrors, back-up, central dictionary
  – Social infrastructure: persistence commitments, fall-back procedures, cost-recovery (self-sustaining), shared use of system
  – Not a standard but a Registration Authority/maintenance agency
• IDF delegates through Registration Agencies
  – Each can develop own applications
  – Use in "own brand" ways appropriate for their community
DOI to become ISO TC 46/SC 9 standard
http://www.collectionscanada.ca/iso/tc46sc9/index.htm
Information and Documentation - Identification and Description
Home of "identification numbering": identifiers for semantically meaningful entities:
  – ISO 2108 International Standard Book Numbering (ISBN)
  – ISO 3297 International Standard Serial Number (ISSN)
  – ISO 3901 International Standard Recording Code (ISRC)
  – ISO 10444 International Standard Technical Report Number (ISRN)
  – ISO 10957 International Standard Music Number (ISMN)
  – ISO 15706 International Standard Audiovisual Number (ISAN)
  – ISO 15707 International Standard Musical Work Code (ISWC)
  – ISO Project 20925 Version identifier for Audiovisual Works (V-ISAN)
  – ISO Project 21047 International Standard Text Code (ISTC)
DOI combination of components
• Identify: DOI syntax can include any existing identifier, formal or informal, of any entity, e.g.
  – 10.2341/0-7645-4889-1
  – 10.5678/978-0-7645-4889-4
  – 10.1000/ISBN0764548891
  – 10.1234/Norman_presentation
  – 10.2224/2004-10-28-ISO-DOI
• Describe: DOI metadata can be of any type, standard or proprietary, e.g.
  – OnixForBooks
  – OnixForSerials
  – IEEE/LOM
  – MARC
  – Dublin Core
  – Proprietary scheme
  (to interoperate with anyone else in the DOI network, map to the <indecs> Data Dictionary (iDD))
• Resolve: The Handle resolution technology allows you to access any kind of service associated with your DOI
  – A package of services is an Application Profile
  – Services can include metadata services
DOI and scientific data
• DOI is already the core technology for maintaining cross-reference
  – persistent links between a citation and internet access to the article
• CrossRef system used by 350+ publishers representing the bulk of STM articles (as pre-publication link builder), www.crossref.org
  – Over 12 million DOIs now registered with CrossRef
  – Over 850,000 assigned to books and conference proceedings
  – 9,000 DOIs per day added to CrossRef
• Several projects suggested to IDF using DOIs for data (not connected with CrossRef)
  – physico-chemical property data; biological microscopy images
  – See Paskin, ICSTI 2002 paper
• Some projects have developed their own identifiers, very useful for their own area
  – E.g. Life Science Identifier (I3C/IBM): simple URN mechanism, non-generic, non-global
  – These can be incorporated into a DOI if needed to make them globally interoperable and extensible
• Two projects in particular have developed DOI applications:
(1) TIB: Citation of Primary Data
• Problem: re-use of existing data sets
  – Attribution of data source: make data publications citable in a standard way (cf. articles, Citation Index)
  – Archiving of data in context so as to be discoverable and interoperable (usable by others)
• Background
  – CODATA National Committee WG, grant-aided by DFG (Sept 2001 to May 2002): report "Concept of Citing Scientific Primary Data"
  – Continuation as project for pilot implementation funded by DFG, Oct 2003 to Oct 2005, at TIB (German National Library of Science & Technology)
• Solution:
  – Development of DOI registration agency for data
    - DOIs for data sets, with associated metadata
    - Core management metadata applicable to all datasets
    - Structured metadata extensible to specific science disciplines
(1) Citation of Primary Data: illustration of solution
• During her research for the World Data Center Climate (WDCC), Dr. Weather gains primary data about the weather in Hannover in the year 2003.
  – Primary data is tested, evaluated, stored and administered at the WDCC
  – Primary data is registered and allocated a DOI at the TIB
  – With quality control of metadata, no change once allocated, etc.
• Dr. Weather can now cite this with a resolvable DOI, e.g. DOI: 10.1594/WDCC/W_Han_2003_MMB_2
  – 10.1594 (prefix) = TIB as the registration agency
  – WDCC = research institute
  – W_Han_2003_MMB_2 = internal name of the data
• DOI is resolvable directly, or via http as http://dx.doi.org/10.1594/WDCC/W_Han_2003_MMB_2
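The three-part structure of this example DOI, and its HTTP form, can be sketched as follows. The names `DataDoi`, `parse_data_doi` and `proxy_url` are assumptions made for illustration; the prefix/institute/internal-name layout is the convention shown on this slide, not a general rule for all DOIs.

```python
from typing import NamedTuple


class DataDoi(NamedTuple):
    prefix: str         # registration agency prefix, e.g. 10.1594 for TIB
    institute: str      # research institute, e.g. WDCC
    internal_name: str  # the institute's own name for the data set


def parse_data_doi(doi: str) -> DataDoi:
    """Split a DOI of the form prefix/institute/internal_name."""
    # Raises ValueError if the DOI has fewer than three parts.
    prefix, institute, internal_name = doi.split("/", 2)
    return DataDoi(prefix, institute, internal_name)


def proxy_url(doi: str) -> str:
    # HTTP form given on the slide: the DOI appended to the proxy address.
    return "http://dx.doi.org/" + doi


doi = "10.1594/WDCC/W_Han_2003_MMB_2"
print(parse_data_doi(doi))
print(proxy_url(doi))
```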
(1) Citation of Primary Data: illustration of solution
Usage scenario 1:
• Dr. Storm is reading publications from Dr. Weather in a journal and would like to analyse her data under different aspects. He can resolve the DOI to obtain the data set for use.
• In his publication "Comparison of the weather from Hannover and Miami", Dr. Storm cites Dr. Weather's data using its DOI, referring to the uniqueness and own identity of the original data.
Citation example: Weather, 2003: "Weather in Hannover for 2003" doi:10.1594/WDCC/W_Han_2003_MMB_2
Usage scenario 2:
• Mr. Nice is writing a paper about the sales figures of ice cream in Hannover in 2003, but he has no information about the weather.
• He searches via the TIB central registration agency metadata search. The result is doi:10.1594/WDCC/W_Han_2003_MMB_2
• He resolves the DOI to find the data. The metadata refers him to the WDCC as publisher and data archive. In his paper he cites the data using the DOI.
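Usage scenario 2 can be mimicked with a minimal in-memory registry, assuming one metadata record built from the values on these slides. The field names, `search` and `cite` are hypothetical and do not describe the actual TIB metadata search; the citation string follows the pattern of the citation example above.

```python
REGISTRY = [
    {
        "doi": "10.1594/WDCC/W_Han_2003_MMB_2",
        "author": "Weather",
        "year": 2003,
        "title": "Weather in Hannover for 2003",
        "publisher": "WDCC",
    },
]


def search(registry, keyword):
    """Return records whose title contains the keyword (case-insensitive)."""
    return [r for r in registry if keyword.lower() in r["title"].lower()]


def cite(record):
    # Follows the citation pattern shown in the citation example.
    return f'{record["author"]}, {record["year"]}: "{record["title"]}" doi:{record["doi"]}'


hits = search(REGISTRY, "hannover")
print(cite(hits[0]))
```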
(2) Names for Life: Biological taxonomy
• Problem: "Future-proofing biological nomenclature"
  – See Garrity and Lyons, OMICS, 2003
• For a given nomenclature in a biological taxonomy, change occurs
  – e.g. new species recognised; species reassigned as the founding species of new genera; synonyms; species split into subspecies which later became separate species
  – resulting in changes of names, genera, families, classes, relationships over time
  – How does a researcher keep track?
• Solution: DOI proposed as tool
  – a data model of nomenclature and taxonomy enabling disambiguation of synonyms and competing taxonomies
  – a metadata resolution service enabling dissemination of archived and updated information objects through persistent links
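A minimal sketch of the kind of data model this slide calls for, assuming each name record can point at the earlier name it supersedes. The two events are read off the nomenclature figure in this deck on the assumption that its 1984 step represents a transfer of communis from Alteromonas to Marinomonas; `NameEvent`, `current_name` and `synonyms` are hypothetical names, not part of the Names for Life project.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NameEvent:
    year: int
    name: str
    replaces: Optional[str] = None  # earlier name this one supersedes, if any


# Illustrative events only; see the assumption stated in the lead-in.
EVENTS = [
    NameEvent(1972, "Alteromonas communis"),
    NameEvent(1984, "Marinomonas communis", replaces="Alteromonas communis"),
]


def current_name(name: str, events: list[NameEvent]) -> str:
    """Follow 'replaces' links forward until no newer name exists."""
    successor = {e.replaces: e.name for e in events if e.replaces}
    seen = {name}
    while name in successor:
        name = successor[name]
        if name in seen:  # guard against circular records
            break
        seen.add(name)
    return name


def synonyms(name: str, events: list[NameEvent]) -> set[str]:
    """All recorded names that lead to the same current name."""
    target = current_name(name, events)
    return {e.name for e in events if current_name(e.name, events) == target}


print(current_name("Alteromonas communis", EVENTS))
print(sorted(synonyms("Marinomonas communis", EVENTS)))
```

A persistent identifier attached to each record, rather than to the name string, is what would let links survive such renaming.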
(2) Names for Life: illustration of problem
[Figure "nomenclature", built up over successive slides. Names are listed by the year in which they first appear in the build; (T) marks the type species. Spellings are as shown on the slides.]
• 1972: Alteromonas: macleodii(T), communis, vaga
• 1973: Alteromonas: haloplanktis
• 1976: Alteromonas: rubra
• 1977: Alteromonas: citrea
• 1978: Alteromonas: esperjiana, undina
• 1979: Alteromonas: aurantia
• 1981: Alteromonas: putrifaciens, hanedai
• 1982: Alteromonas: luteoviolaceae
• 1984: Oceanosprillum: linum(T), japonicum, minutium, biejerinckii, maris, williamsae, hiroshimense, multiglobiferum, pelagicum, pusillum, commune, jannaschii, kreigii, vagum
  – Marinomonas: communis(T), vaga
• 1986: Shewanella: putrifaciens(T), benthica, hanedai
• 1987: Alteromonas: denitrificans
• 1988: Alteromonas: colwelliana
• 1990: Oceanosprillum: biejerinckii, pelagicum, maris, hiroshimense (listed a second time)
  – Alteromonas: tetradonis
  – Shewanella: colwelliana
• 1992: Alteromonas: atlantica, carageenovora
  – Shewanella: algae
• 1995: Alteromonas: distincta, fulginea
  – Pseudoalteromonas: haloplanktis(T), haloplanktis, tetradonis, atlantica, aurantia, carrageenovora, citrea, esperjiana, luteoviolacea, nigrifaciens, pisicida, rubra, undina
• 1997: Alteromonas: elyakoviii
  – Pseudoalteromonas: antartica
• 2000: Marinomonas: mediterannea
  – Shewanella: fridgidimarina, geldimarina, woodyii, amazonensis, baltica, oneidensis, pealeana, violacea
  – Pseudoalteromonas: bacteriolytica, prydzensis, tunicata, distincta, elyakovii, peptidolytica
• 2001: Shewanella: japonica
  – Pseudoalteromonas: tetrodonis
• 2002: Shewanella: denitrificans, livingstonensis, alleyanna
• 2004: Marinomonas: primoryensis
  – Shewanella: mariniintestina, saire, schlegeliana, gaetbuli
  – Pseudoalteromonas: 11 others
(2) Names for Life: illustration of solution
[Diagram. Labels: links from the web; links to the web; name DOI; nomos DOI; taxon DOI; exemplar DOI; combined name DOI; journal article; strain record; gene annotation; any online information]
(2) Names for Life: illustration of solution
By reasoning over information objects, construct services that can be offered through multiple resolution.
• name: Look up this name and all its synonyms in PubMed
• taxon: Compare this name to the current state (contents) of the taxon
• exemplar: Determine whether this exemplar is part of a taxon in another nomos
[Diagram. Labels: name; nomos; combined name; dissemination]
Summary: DOI
• A system for persistent and actionable identification and interoperable exchange of managed information on digital networks
  – Standards-based components
  – Now to become an International Standard (in ISO TC 46)
• Developed as a cross-industry, cross-sector, not-for-profit effort managed by an open-membership collaborative development body
  – International DOI Foundation (IDF)
• In widespread use now:
  – Over 15 million assigned, over 1000 naming authorities (users)
  – Key feature of scientific primary publishing as part of the CrossRef system
  – Adopted for government documents (EC, OECD, UK, etc.)
• In use, is a mechanism "behind the scenes"
  – e.g. looks like a URL in a web context
• Offers an interoperable common system for identification of science data; two projects considered as examples:
  – TIB project (citation of primary data sets)
  – Names for Life (biological taxonomy)
n.paskin@doi.org
www.doi.org