d10d0ee3402c5f9a2a71aaf373d95548.ppt
- Number of slides: 14
Use Cases for Identifiers Beyond Data Citation* … IN 041 D-08, December 15, 2016, AGU. Peter Fox (RPI) pfox@cs.rpi.edu, @taswegian, #twcrpi, Tetherless World Constellation, http://tw.rpi.edu, Earth and Environmental Science, Computer Science, Cognitive Science, IT and Web Science, and Mark Parsons (RPI and RDA) parsom3@rpi.edu, Institute for Data Exploration and Applications, http://idea.rpi.edu, @chutneyboy * originating from 2014 ESIP Winter dinner conversation
Motivation • It started with a tweet in 2010 from Cape Town at the CODATA meeting and continued over many cappuccinos in Melbourne at IUGG 2011 • Led to Parsons & Fox, 2012, Is Data Publication the Right Metaphor? Data Science Journal, … aka orcid.org/0000-0002-7723-0950 & orcid.org/0000-0002-1009-7163, xsd:2012, http://dx.rpi.edu/10833/4199-5811-3221-0002-CC/?dc:title, jns:10.2481, doi:10.2481/dsj.WDS-042 • We were exploring metaphors and concluded that the community at-large might be expecting too much from the publishing metaphor • We had “concerns” • Then data citation got people’s attention, cf. literature
Metaphors • Lakoff and Johnson (1980): “Metaphor is for most people a device of the poetic imagination and the rhetorical flourish—a matter of extraordinary rather than ordinary language. Moreover, metaphor is typically viewed as characteristic of language alone, a matter of words rather than thought or action. For this reason, most people think they can get along perfectly well without metaphor. We have found, on the contrary, that metaphor is pervasive in everyday life, not just in language but in thought and action. Our ordinary conceptual system, in terms of which we both think and act, is fundamentally metaphorical in nature.” (p. 3, our emphasis)
Data Citation • “Sound, reproducible scholarship rests upon a foundation of robust, accessible data. For this to be so in practice as well as theory, data must be accorded due importance in the practice of scholarship and in the enduring scholarly record. In other words, data should be considered legitimate, citable products of research. Data citation, like the citation of other evidence and sources, is good research practice.” (http://www.force11.org/datacitation)
Concerns • Identification v. Location – URI v. URL = what is it versus where is it (now)? • Attribution and credit (discredit) • Ownership and governance • Provenance and traceability • Impact and return on investment = fame and fortune ;-)
Complete Traceability for National Climate Assessment • Transparency ------ Reproducibility (Easier ... Harder) • Traceable Sources – References – Image sources – Data sources • Traceable Data – Link to datasets – Complete metadata • Traceable Processes – Description of methods – Access to process info & review • Traceable Tools – Access to computer code – Description of systems and platforms
Dataset metadata from an image in a figure
Identifier Resolution • doi:10.5067/MEASURES/GSSTF/DATA308 is a common, persistent, citable reference to that dataset. • We build GCIS-specific identifiers from those: http://data.globalchange.gov/doi/10.5067/MEASURES/GSSTF/DATA308 • Then we can resolve it (with content negotiation) on our site, and link it with identifiers for our other resources, including asserting equivalence and linking with the data center responsible for stewardship and distribution of the actual data. • We can also refer and link to other repositories of information about those resources.
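The identifier construction and content negotiation described on this slide can be sketched roughly as follows. This is an illustration only, not the GCIS implementation: the helper names (normalize_doi, gcis_uri, negotiation_request) are hypothetical, and the media type "text/turtle" is an assumed example of what a client might ask for; the slide does not say which formats the site serves. The request is built but never sent.

```python
from urllib.request import Request

# Base URL taken from the slide; everything else below is illustrative.
GCIS_BASE = "http://data.globalchange.gov"


def normalize_doi(raw: str) -> str:
    """Strip surrounding whitespace and a leading 'doi:' prefix."""
    doi = raw.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    return doi


def gcis_uri(raw_doi: str) -> str:
    """Build a GCIS-style identifier of the form <base>/doi/<doi>."""
    return f"{GCIS_BASE}/doi/{normalize_doi(raw_doi)}"


def negotiation_request(uri: str, media_type: str) -> Request:
    """Build (but do not send) a GET request asking for one representation.

    Content negotiation means the same identifier can yield different
    representations depending on the Accept header.
    """
    return Request(uri, headers={"Accept": media_type})


uri = gcis_uri("doi:10.5067/MEASURES/GSSTF/DATA308")
# "text/turtle" is an assumed media type, chosen only for illustration.
req = negotiation_request(uri, "text/turtle")
print(req.full_url)
print(req.get_header("Accept"))
```

Keeping the identifier (what it is) separate from the requested representation (how it is delivered now) mirrors the URI versus URL distinction raised under Concerns.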
Citation as-a group of use cases • Back to the granularity choice… – Citation – Micro-citation – Nano-citation – I mean, really??? Yotta-citation? • Software citation use cases: https://github.com/researchsoftwareinstitute/software-data-citation-ws/issues/ • All about identifiers…
Roles: CRediT http://docs.casrai.org/CRediT • Facilitate authorship/contributor-ship disclosure processes and policies • Identify good practices for tracking contributions to the components of scholarly published output • Minimize authorship disputes • Enable appropriate recognition for the different contributions in multi-authored works – across all aspects of the research being reported (including data curation, statistical analysis, etc.) • Support identification of peer reviewers and experts • Support grant making by enabling funders to more easily identify those responsible for specific research products, developments or breakthroughs
CRediT • Improve automated tracking of funding outcomes and impact • Support new forms of social and research networking • Further developments in data management and nano-publication • Inform the “science of science”, e.g. studies of productivity over a career trajectory • Enable new metrics of credit and attribution
About Types (~ Roles, ~ Artifacts, ~ Activities) • Types of identifiers – PIT from the Research Data Alliance - https://www.rd-alliance.org/group/pid-information-types-wg/outcomes/pid-information-types • Permanency -> PIDs = get your tattoo • Roles seem to be important • Conjecture: it might be about reducing uncertainty in the what and the who -> entropy and mutual information
Entropy and Rates • R = H(x) - H(x|y) (Shannon, 1948) • R = rate of transmission; H(x|y) measures the average ambiguity of the received signal • Mutual information lowers entropy • p(x_i) is the probability mass function of outcome x_i • orcid.org/0000-0002-7723-0950 & orcid.org/0000-0002-1009-7163, xsd:2012, http://dx.rpi.edu/10833/4199-5811-3221-0002-CC/?dc:title, jns:10.2481, doi:10.2481/dsj.WDS-042
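As a rough numerical illustration of R = H(x) - H(x|y), the sketch below computes entropy and the rate from a small joint probability table. The framing is an assumption made for the example, not something stated on the slide: x stands for "which resource is meant" and y for "the identifier a reader sees". The function names and the two toy distributions are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy in bits: H = -sum p(x_i) * log2 p(x_i)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def rate(joint):
    """R = H(x) - H(x|y) for a joint table with joint[i][j] = p(x_i, y_j).

    Uses the identity H(x|y) = H(x, y) - H(y).
    """
    px = [sum(row) for row in joint]        # marginal over y
    py = [sum(col) for col in zip(*joint)]  # marginal over x
    h_joint = entropy([p for row in joint for p in row])
    h_x_given_y = h_joint - entropy(py)
    return entropy(px) - h_x_given_y


# Toy case 1: each identifier maps to exactly one resource,
# so seeing y removes all uncertainty about x.
unambiguous = [[0.5, 0.0],
               [0.0, 0.5]]

# Toy case 2: the identifier says nothing about which resource is meant.
uninformative = [[0.25, 0.25],
                 [0.25, 0.25]]

print(rate(unambiguous))
print(rate(uninformative))
```

In the first table the identifier carries one full bit about the resource; in the second it carries none, which is the sense in which a good identifier could be said to lower entropy in the what and the who.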
Recap • Is anyone “crawling” our identifiers? • Who is scraping our landing pages? • Are we querying for additional content? • “It” continues over many adult beverages in <any place> • “We” want more metaphors from the community at-large • “We” still have “concerns” • We = orcid.org/0000-0002-7723-0950 & orcid.org/0000-0002-1009-7163 aka @chutneyboy & @taswegian