- Number of slides: 23
Training Course on Data Management for Information Professionals and In-Depth Digitization Practicum, 26-30 September 2011, Oostende, Belgium. Citation Linking and other Access Models. Lisa Raymond
Data Citation Data citation is an evolving practice. It is still rarely done in publications, but researchers are starting to see the benefit of getting credit for their data. In addition, as more data are made available online and reused, they need to be cited.
Purpose • To provide fair credit for data creators or authors, data stewards, and other critical people in the data production and curation process.
Purpose • To ensure scientific transparency and reasonable accountability for authors and stewards.
Purpose • To aid in tracking the impact of a data set through references in the scientific literature.
Purpose • To help data authors verify how their data are being used. This is very important to scientists. An obstacle to data deposit in open access repositories has been concern over misuse of data. Proper citation and attribution will help researchers verify other work being done with their data.
Core Required Elements
• Author(s)--the people or organizations responsible for the intellectual work to develop the data set. The data creators.
• Release Date--when the particular version of the data set was first made available for use (and potential citation) by others.
• Title--the formal title of the data set.
• Version--the precise version of the data used. Careful version tracking is critical to accurate citation.
• Archive and/or Distributor--the organization distributing or caring for the data, ideally over the long term.
• Locator/Identifier--this could be a URL, but ideally it should be a persistent service that resolves to the current location of the data in question. The Digital Object Identifier is currently the most broadly adopted service for persistently identifying and locating whole data collections (as opposed to individual files or granules), although other identifier/locator services, such as ARKs and Handles, could be used.
• Time, date accessed--because data can be dynamic and changeable in ways that are not always reflected in release dates and versions, it is important to indicate when online data were accessed.
Example
• Additional fields can be added as necessary to credit other people and institutions, etc. Additionally, it is important to provide a scheme for users to indicate the precise subset of data that were used. This could be the temporal and spatial range of the data, the types of files used, a specific query id, or other ways of describing how the data were subsetted. An example citation:
• Zwally, H. J., R. Schutz, C. Bentley, J. Bufton, T. Herring, J. Minster, J. Spinhirne, and R. Thomas. 2003. GLAS/ICESat L1A Global Altimetry Data V018, 15 October to 18 November 2003. Data set accessed 2011-07-21 at doi:10.3334/NSIDC/gla01.
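The core elements can be assembled into a citation string mechanically. The following is a minimal sketch, not a standard: the class name `DataCitation`, its field names, and the exact ordering and punctuation are assumptions chosen for illustration, using the placeholder "FOO" data set rather than a real one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    """Hypothetical container for the core required elements of a data citation."""
    authors: str                   # e.g. "Doe, J. and R. Roe"
    release_date: str              # e.g. "2001" or "2001, updated 2005"
    title: str
    archive: str                   # archive and/or distributor
    locator: str                   # ideally a persistent identifier such as a DOI
    accessed: str                  # date the online data were accessed
    version: Optional[str] = None  # precise version used, if one exists

    def format(self) -> str:
        """Join the elements in one plausible order, separated by periods."""
        parts = [self.authors, self.release_date, self.title]
        if self.version:
            parts.append(f"Version {self.version}")
        parts.extend([self.archive, self.locator, f"Accessed {self.accessed}"])
        # Strip trailing periods so that joining never produces "..".
        return ". ".join(p.rstrip(".") for p in parts) + "."


citation = DataCitation(
    authors="Doe, J. and R. Roe",
    release_date="2001",
    title="The FOO Data Set",
    archive="The FOO Data Center",
    locator="doi:10.xxxx/notfoo.547983",  # placeholder DOI, not resolvable
    accessed="1 May 2011",
    version="2.3",
)
print(citation.format())
```

Keeping version and access date as separate fields makes it harder to omit them, which matters for data sets that change after release.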
Identifiers vs. Locators • URL is an identifier • Digital Object Identifier (DOI) is a locator. Consider a human example. A name such as "Lisa Raymond" (Associate Library Director) is an identifier. An address such as "260 Woods Hole Road, Woods Hole, MA, USA" is a locator. The locator might work as an identifier, because you might find Lisa in her office, but she may also have retired, and there may be a new Associate Director who plays the same role but is not the same person. Similarly, you may be able to locate Lisa based on her name and title, but what happens if she is telecommuting this week and is in Oostende, not Massachusetts? It is similar with digital objects. One might be able to identify a data set by its URL, for example, but there is no guarantee that what is at that URL today is the same as what was there yesterday.
DOI, Handle, URI/URL DOI - The DOI System provides a framework for persistent identification, managing intellectual content, managing metadata, linking customers with content suppliers, facilitating electronic commerce, and enabling automated management of media. DOI names can be used for any form of management of any data, whether commercial or non-commercial. The DOI System is an ISO International Standard. The system is managed by the International DOI Foundation, an open membership consortium including both commercial and non-commercial partners. Over 50 million DOI names have been assigned by DOI System Registration Agencies in the US, Australasia, and Europe. http://www.doi.org/ A DOI … is a handle system managed by the International DOI Foundation through registry agents such as CrossRef and DataCite. At this time Woods Hole is paying 6 cents per dataset for DOIs; we pay $1.00 each for current technical reports, articles, etc. We decided to assign DOIs in addition to handles because scientists are familiar with the term, the cost is minimal, and it is a widely used international standard.
DOI, Handle, URI/URL Handle - The Handle System is a technology specification for assigning, managing, and resolving persistent identifiers for digital objects and other resources on the Internet. The protocols specified enable a distributed computer system to store identifiers (names, or handles) of digital resources and resolve those handles into the information necessary to locate, access, and otherwise make use of the resources. That information can be changed as needed to reflect the current state and/or location of the identified resource without changing the handle. http://en.wikipedia.org/wiki/Handle_System A handle is a technology that allows the item to be moved, while the handle stays the same and still resolves to the correct place. DSpace uses the Handle System, as in OceanDocs and Published Ocean Data. DSpace generates the handle and there is no cost.
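The idea that a handle stays fixed while the location behind it changes can be shown with a toy model. This is only an in-memory sketch of the concept, not the Handle System protocol or any DSpace API; the class `ToyHandleRegistry` and the example.org URLs are hypothetical.

```python
class ToyHandleRegistry:
    """Illustrative in-memory mapping from persistent handles to current URLs."""

    def __init__(self):
        self._locations = {}

    def register(self, handle: str, url: str) -> None:
        """Record the first known location for a new handle."""
        if handle in self._locations:
            raise ValueError(f"handle already registered: {handle}")
        self._locations[handle] = url

    def move(self, handle: str, new_url: str) -> None:
        """Update the location; the handle itself never changes."""
        if handle not in self._locations:
            raise KeyError(handle)
        self._locations[handle] = new_url

    def resolve(self, handle: str) -> str:
        """Return the current location for a handle."""
        return self._locations[handle]


registry = ToyHandleRegistry()
registry.register("1912/4199", "http://old-server.example.org/items/4199")

# The repository migrates to a new server; citations that use the handle
# do not need to change because only the stored location is updated.
registry.move("1912/4199", "http://new-server.example.org/items/4199")
print(registry.resolve("1912/4199"))
```

A citation that recorded only the old server URL would now be broken, while one that recorded the handle still leads to the item.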
DOI, Handle, URI/URL URL - In computing, a Uniform Resource Locator or Universal Resource Locator (URL) is a character string that specifies where a known resource is available on the Internet and the mechanism for retrieving it. A URL is technically a type of Uniform Resource Identifier (URI), but in many technical documents and verbal discussions URL is often used as a synonym for URI. URI - In computing, a Uniform Resource Identifier (URI) is a string of characters used to identify a name or a resource on the Internet. Such identification enables interaction with representations of the resource over a network (typically the World Wide Web) using specific protocols. Schemes specifying a concrete syntax and associated protocols define each URI. One can classify URIs as locators (URLs), or as names (URNs), or as both. A Uniform Resource Name (URN) functions like a person's name, while a Uniform Resource Locator (URL) resembles that person's street address. In other words: the URN defines an item's identity, while the URL provides a method for finding it. (Source: Wikipedia.) A URI or URL specifies where a known resource is available on the Internet. Think of it as a naming convention, not a locator.
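The difference between a locator-style URI and a name-style URI shows up when the strings are split into their parts. The sketch below uses `urllib.parse.urlparse` from the Python standard library; the URN shown is an arbitrary illustration and is not taken from the slides.

```python
from urllib.parse import urlparse

# A URL names a retrieval mechanism (scheme), a host, and a path on that host.
url = urlparse("ftp://example.org/resource.txt")
print(url.scheme, url.netloc, url.path)

# A URN carries a scheme and a name, but no host to contact.
urn = urlparse("urn:isbn:0451450523")  # illustrative URN only
print(urn.scheme, repr(urn.netloc), urn.path)
```

The empty host in the second case is the point: a name says what the item is, and something else (a resolver) has to say where it currently lives.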
DOI, Handle, URI/URL Examples
• DOI - 10.1575/1912/4199. The combination of a unique prefix element (assigned to a particular DOI registrant) and a unique suffix element (provided by that registrant) is unique, and so allows the decentralized allocation of DOI numbers. The 4199 is the item number in our WH system.
• Handle - http://hdl.handle.net/1912/4199. The handle lacks the DOI prefix (10.1575); it has the handle resolver address, then 1912 for WHOAS and 4199 for the item number.
• URI - ftp://example.org/resource.txt
• URL - http://en.wikipedia.org/wiki/Uniform_Resource_Identifier
• A URI / URL, as you can see, names the item and points to it on the Internet.
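The prefix/suffix structure of the WHOAS example can be made concrete with a few lines of string handling. This is a minimal sketch under stated assumptions: the function names are hypothetical, `https://doi.org/` is assumed as the DOI resolver base, `http://hdl.handle.net/` is the handle resolver shown above, and no URL-encoding of unusual characters is attempted.

```python
from typing import Tuple


def split_doi(doi: str) -> Tuple[str, str]:
    """Split a DOI name into (prefix, suffix) at the first slash."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    prefix, sep, suffix = doi.partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI name: {doi!r}")
    return prefix, suffix


def doi_url(doi: str) -> str:
    """Build a resolver URL for a DOI (resolver base is an assumption)."""
    prefix, suffix = split_doi(doi)
    return f"https://doi.org/{prefix}/{suffix}"


def handle_url(handle: str) -> str:
    """Build a resolver URL for a handle such as '1912/4199'."""
    return f"http://hdl.handle.net/{handle}"


prefix, suffix = split_doi("doi:10.1575/1912/4199")
print(prefix)              # registrant prefix
print(suffix)              # in this example the suffix is the item's handle
print(doi_url("10.1575/1912/4199"))
print(handle_url(suffix))
```

In this example the DOI suffix and the handle are the same string, so either identifier can be derived from the other; that is a property of how these particular DOIs were assigned, not of DOIs in general.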
We must rely on location information combined with other information such as author, title, and version to uniquely identify data. The key to making registered locators, such as DOIs or Handles, work to identify and locate data sets is through careful tracking and documentation of versions.
Location alone is not enough • When creating citations, it is important to remember that location alone is not enough. DOIs are now being assigned to datasets that are still being added to, such as daily temperature data, and to datasets that are updated because of corrections. Note that it is generally agreed that a major version change to a dataset should get a new DOI, but a minor change may not. Author, version, and date accessed are therefore very important in the citation.
More Examples
• Doe, J. and R. Roe. 2001. The FOO Data Set. The FOO Data Center. doi:10.xxxx/notfoo.547983. Accessed 1 May 2011.
• Doe, J. and R. Roe. 2001, updated 2005. The FOO Occasionally Updated Data Set. The FOO Data Center. doi:10.xxxx/notfoo.547983. Accessed 1 May 2011.
• Doe, J. and R. Roe. 2001, updated daily. The FOO Time Series Data Set. The FOO Data Center. doi:10.xxxx/notfoo.547983. Accessed 1 May 2011.
• Doe, J. and R. Roe. 2001. The FOO Data Set. Version 2.3. The FOO Data Center. doi:10.xxxx/notfoo.547983. Accessed 1 May 2011.
Data linked to articles Deposit Data … Where? Some Existing Models
Data linked to published articles • There are several projects that are working in this area. SCOR/IODE and the MBLWHOI Library have been collaborating on data publications and looking at two use cases – data from a data center and data associated with published articles.
Citation for WHOAS Data Seigel, David A., 2006. VERTIGO project Niskin bottle sample data from KM0414 and RR_K2 cruises. Bottle KM0414.csv. doi:10.1575/1912/4199. Accessed 3 August 2011.
Links to Linked Data Sources
• http://www.pangaea.de/about/
• http://thedata.org/
• http://datadryad.org/
• http://www.mblwhoilibrary.org/services/whoas-repositoryservices
• http://publishedoceandata.net/
Acknowledgments Mark Parsons (NSIDC) and the ESIP Federation
Resources
• Interagency Data Stewardship/Citations/provider guidelines. http://wiki.esipfed.org/index.php/Interagency_Data_Stewardship/Citations/provider_guidelines
• Data Citation presentation. GeoData 2011, Broomfield, CO, 3 March 2011. http://tw.rpi.edu/media/latest/Parsons.Data.Citation.pdf
• A Proposed Standard for the Scholarly Citation of Quantitative Data. http://www.dlib.org/dlib/march07/altman/03altman.html