a48e715949ede6e5122a7230f8b6764f.ppt
- Number of slides: 51
myExperiment: Towards Research Objects
David De Roure
Building Linked Web Communities in Biomedicine to Accelerate Research
• What is it?
• How it's being used
• How we built it
• Towards the e-Laboratory
The social process of Science 2.0
[Diagram labels: scientists; Undergraduate Students; Graduate Students; Digital Libraries; Virtual Learning Environment; experimentation; Reprints; Peer-Reviewed Journal & Conference Papers; Technical Reports; Preprints & Metadata; Repositories; Local Web; Certified Experimental Results & Analyses; Data, Metadata, Provenance, Workflows, Ontologies]
Sharing pieces of process
http://www.mygrid.org.uk/tools/taverna/
http://www.microsoft.com/mscorp/tc/trident.mspx
http://usefulchem.wikispaces.com/page/code/EXPLAN001
E. Science laboris
• Workflows are the new rock and roll
• Machinery for coordinating the execution of (scientific) services and linking together (scientific) resources
• The era of Service Oriented Applications
• Repetitive and mundane boring stuff made easier
Triana, Trident, Kepler, Ptolemy II, Taverna, BPEL, BioExtract
Reuse, Recycling, Repurposing
• Paul writes workflows for identifying biological pathways implicated in resistance to Trypanosomiasis in cattle.
• Paul meets Jo. Jo is investigating Whipworm in mouse.
• Jo reuses one of Paul's workflows without change.
• Jo identifies the biological pathways involved in sex dependence in the mouse model, believed to be involved in the ability of mice to expel the parasite.
• Previously, a manual two-year study by Jo had failed to do this.
• “Facebook for Scientists”... but different to Facebook!
• A repository of research methods
• A community social network
• A Virtual Research Environment
• Open source (BSD) Ruby on Rails application with HTML, REST and SPARQL interfaces
• Project started March 2007
• Closed beta since July 2007
• Open beta November 2007
myExperiment currently has 1712 registered users, 141 groups, 584 Taverna workflows plus 81 others, and 51 packs.
Go to www.myexperiment.org to access publicly available content or create an account.
myExperiment Features and Distinctives
• User Profiles
• Groups
• Friends
• Sharing
• Tags
• Workflows
• Developer interface
• Credits and Attributions
• Fine control over privacy
• Packs
• Federation
• Enactment
Control over sharing
The most important aspect of myExperiment
Designed by scientists
A Pack
[Diagram: contents of a pack. Items: QTL; Workflow 16; Logs; Results; Metadata; Slides; Common pathways; Workflow 13; Paper; Results]
For Developers
• All the myExperiment services are accessible through simple RESTful programming interfaces
• Use your existing environment and augment it with myExperiment functionality
• Build entirely new interfaces and functionality mashups
• The Ruby on Rails codebase is open source (BSD), so you can run your own myExperiment, perhaps for your own lab or to develop new functionality
• Go to wiki.myexperiment.org for information about our Developer Community
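To make the RESTful idea above concrete, here is a minimal Python sketch of how a client might address a resource and read an XML reply. The URL layout (`/workflow.xml?id=...`), the helper names and the sample response are assumptions for illustration only; the developer wiki is the place to check the real resource names and fields.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Assumption: the base URL comes from the slides; the ".xml" resource layout is hypothetical.
BASE_URL = "http://www.myexperiment.org"


def resource_url(resource, **params):
    """Build a REST-style URL such as <base>/workflow.xml?id=16."""
    query = urlencode(sorted(params.items()))
    url = f"{BASE_URL}/{resource}.xml"
    return f"{url}?{query}" if query else url


def parse_title(xml_text):
    """Return the text of the top-level <title> element, or "" if absent."""
    root = ET.fromstring(xml_text)
    title = root.find("title")
    return (title.text or "") if title is not None else ""


# Hypothetical response body, used so the sketch runs without network access.
SAMPLE_RESPONSE = "<workflow><title>Example workflow</title></workflow>"

if __name__ == "__main__":
    print(resource_url("workflow", id=16))
    print(parse_title(SAMPLE_RESPONSE))
```

A real client would fetch `resource_url(...)` over HTTP and pass the body to a parser like `parse_title`; the network call is left out here on purpose.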
Adam Belloum
SigWin-detector is a grid-enabled workflow application that takes a sequence of numbers and a series of window sizes as input and detects all significant windows for each window size using a moving median false discovery rate (mmFDR) procedure.
[Figure labels: Human transcriptome map; WS-VLAM composer; discovered RIDGE; DNA curvature of the Escherichia coli chromosome]
More details: http://staff.science.uva.nl/~inda/SigWin-detector.html
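As a rough illustration of the input shape described above (a sequence of numbers plus a series of window sizes), the following Python sketch computes the moving median for each window size. It covers only the sliding-window median; the false discovery rate step of mmFDR is not implemented, and the function names are hypothetical rather than taken from SigWin-detector.

```python
from statistics import median


def moving_median(values, window):
    """Median of every contiguous window of the given size."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [median(values[i:i + window])
            for i in range(len(values) - window + 1)]


def medians_by_window(values, window_sizes):
    """Map each window size to its list of moving medians."""
    return {w: moving_median(values, w) for w in window_sizes}


if __name__ == "__main__":
    sequence = [1, 5, 2, 8, 3]
    for size, medians in medians_by_window(sequence, [1, 3, 5]).items():
        print(size, medians)
```

A significance test would then be applied to these per-window medians; that part depends on details not given in the slides.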
Carol Lushbough
Google Gadgets
Bringing myExperiment to the iGoogle user
Taverna Plugin
Bringing myExperiment to the Taverna user
Scientists do share!
• Of the 661 workflows, 531 are publicly visible and 502 are publicly downloadable.
• 3% of the workflows with restricted access are entirely private to the contributor; for the remainder, contributors elected to share with individual users and groups.
• 69 workflows (over 10%) have been shared with the owner granting edit permissions to specific users and groups.
• In addition, there are 52 instances where users have noted that a workflow is based on another workflow on the site.
• The most viewed workflow has 1566 views.
• There are 50 packs, ranging from tutorial examples to bundles of materials relating to specific experiments.
Consumers > Curators > Producers
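The proportions behind these counts can be checked with a few lines of Python. The counts are taken from the slide; the helper name is illustrative.

```python
TOTAL_WORKFLOWS = 661  # total number of workflows quoted on the slide


def share(count, total=TOTAL_WORKFLOWS):
    """Percentage of all workflows, rounded to one decimal place."""
    return round(100 * count / total, 1)


if __name__ == "__main__":
    print("publicly visible:", share(531))
    print("publicly downloadable:", share(502))
    print("shared with edit permissions:", share(69))
```

The last figure comes out just above 10%, consistent with the "over 10%" stated for the 69 workflows shared with edit permissions.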
Analysis
Two distinct myExperiment communities:
• Supermarket shoppers: workflow consumers prefer larger workflows ready to be downloaded and enacted
• Tool builders: workflow authors prefer smaller, modularized workflows which can be assembled and customized
Considerations in Collaborative Curation:
• Quality and sufficiency of good documentation
• Content decay surveillance
• Consumers > curators > producers
• Contributor, expert and community curation
• Incentives for curation
24/5/2007 | myExperiment | Slide 28
For Developers
[Architecture diagram labels: android; API config; ORE; FOAF; SIOC; Search API; Managed REST API; EPrints; DSpace; Fedora; S3; tags; ratings; reviews; profiles; workflows; credits; groups; friendships; packs; files; Search Engine; SPARQL endpoint; HTML; iGoogle; facebook; XML; RDF Store; MySQL; Enactor API; Enactor]
SPARQL endpoint
SIOC: Semantically-Interlinked Online Communities
Query: all accepted Friendships, including accepted-at time

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX myexp: <http://rdf.myexperiment.org/ontology#>
PREFIX sioc: <http://rdfs.org/sioc/ns#>
select ?friend1 ?friend2 ?acceptedat
where {
  ?z rdf:type <http://rdf.myexperiment.org/ontology#Friendship> .
  ?z myexp:has-requester ?x .
  ?x sioc:name ?friend1 .
  ?z myexp:has-accepter ?y .
  ?y sioc:name ?friend2 .
  ?z myexp:accepted-at ?acceptedat
}
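A query like the one above is normally sent to an endpoint as a URL-encoded `query` parameter, as the SPARQL Protocol allows for GET requests. The Python sketch below only builds such a request URL; the endpoint address is an assumption (it is not given on the slide), and no request is actually sent.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Assumption: illustrative endpoint address, not confirmed by the slides.
ENDPOINT = "http://rdf.myexperiment.org/sparql"

# The friendship query from the slide, with each triple pattern written out in full.
QUERY = """\
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX myexp: <http://rdf.myexperiment.org/ontology#>
PREFIX sioc: <http://rdfs.org/sioc/ns#>
select ?friend1 ?friend2 ?acceptedat
where {
  ?z rdf:type myexp:Friendship .
  ?z myexp:has-requester ?x .
  ?x sioc:name ?friend1 .
  ?z myexp:has-accepter ?y .
  ?y sioc:name ?friend2 .
  ?z myexp:accepted-at ?acceptedat
}
"""


def sparql_get_url(endpoint, query):
    """Return a GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    url = sparql_get_url(ENDPOINT, QUERY)
    # Decode the parameter again to confirm that the encoding round-trips.
    print(parse_qs(urlparse(url).query)["query"][0] == QUERY)
```

Here `myexp:Friendship` is simply the prefixed form of the full class URI used on the slide.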
http://rdf.myexperiment.org/Aggregation/Pack/56
Exporting packs
Scientific Discourse Relationships Ontology Specification
Open Provenance Model
Communications of the ACM 51, 4 (Apr. 2008), 52-58
Phase 2
• Repository integration (institutional: EPrints, Fedora)
• Controlled vocabularies
• Relationships between items (in and between packs)
• Recommendations
• Improved search ranking and faceted browsing
• Indexing of packs
• New contribution types (Meandre, Kepler, e-books)
• Further blog / wiki integration
• BioCatalogue integration
Reuse and Symbiosis: Content Capture and Curation
[Diagram: four curation sources, each with seed, refine and validate links to Workflows and Services: Self (by Service Providers); Experts; Social (by User Community); Automated]
Six Principles of Software Design to Empower Scientists
1. Fit in, Don't Force Change
2. Jam today and more jam tomorrow
3. Just in Time and Just Enough
4. Act Local, think Global
5. Enable Users to Add Value
6. Design for Network Effects

1. Keep your Friends Close
2. Embed
3. Keep Sight of the Bigger Picture
4. Favours will be in your Favour
5. Know your users
6. Expect and Anticipate Change

De Roure, D. and Goble, C. "Software Design for Empowering Scientists," IEEE Software, vol. 26, no. 1, pp. 88-95, January/February 2009
e-Laboratory Lifecycle
Local projects using Taverna and/or myExperiment: SysMO, Ondex, NEMA, Obesity eLab, Shared Genomics, CombeChem, LifeGuide, IBBRE
What is an e-Laboratory?
• A laboratory is a facility that provides controlled conditions in which scientific research, experiments and measurements may be performed, offering a work space for researchers.
• An e-Laboratory is a set of integrated components that, used together, form a distributed and collaborative space for e-Science, enabling the planning and execution of in silico experiments: processes that combine data with computational activities to yield experimental results.
e-Labs
• An e-Lab consists of:
1. a community
2. work objects
3. generic resources for building and transforming work objects
• Sharing infrastructure and content across projects
e-Labs + Research Objects
• An e-Lab is built from a collection of services, consuming and producing Research Objects
[Diagram labels: Workbench / RO-driven UI; RO Bus; Service; RO-aware services; Visualisation, Notification, Annotation, etc.]
e-Laboratory Evolution

1st Generation
Current practice of early adopters of e-Labs tools such as Taverna.
• Characterised by researchers using tools within their particular problem area, with some re-use of tools, data and methods within the discipline.
• Traditional publishing is supplemented by publication of some digital artefacts like workflows and links to data.
• Provenance is recorded but not shared and re-used.
• Science is accelerated and practice is beginning to shift to emphasise in silico work.

2nd Generation
Designing and delivering now, e.g. Obesity e-Lab. Based on experience with Taverna and myExperiment and on our research results arising from these activities.
• Key characteristic is re-use of the increasing pool of tools, data and methods across areas/disciplines.
• Contain some freestanding, recombinant, reproducible research objects.
• Provenance analytics plays a role.
• New scientific practices are established and opportunities arise for completely new scientific investigations.

3rd Generation
The vision: the e-Labs we'll be delivering in 5 years, illustrated by open science.
• Characterised by global reuse of tools, data and methods across any discipline, and surfacing the right levels of complexity for the researcher.
• Key characteristic is radical sharing.
• Research is significantly data driven, plundering the backlog of data, results and methods.
• Increasing automation and decision-support for the researcher: the e-Laboratory becomes assistive.
• Provenance assists design.
• Curation is autonomic and social.
Assembling e-Laboratories
• An e-Lab is a set of components and resources
• An open system, not a software monolith
• Utility of components transcends their immediate application
• We envisage an ecosystem of cooperating e-Laboratories
• What are the e-Lab components and services?
• What are the Research Objects?
Example Core Services: Workflow Monitoring; Event Logging; Social Metadata; Annotation Service; Search, ranking; User Registration; Distributed Data Query; Job Execution; Naming and Identity; Anonymisation; Text Mining; Research Object Management; Probity; Coreference Resolution
Paul Fisher
[Diagram: a pack with typed relationships between its items. Items: Logs; QTL; Workflow 16; Results; Metadata; Slides; Common pathways; Workflow 13; Paper; Results. Relationship labels: produces; Included in; Published in; Feeds into]
David Shotton
Anatomy of a Research Object
SWAN-SIOC; Experiments; myExperiment
Tim Clark
Characteristics of a Research Object
1. Composite. Contain typed interrelationships and dependencies between resources but are in turn labelled and identifiable as an individual resource.
2. Distributed. Structured collections of references to locally managed and externally located resources. Implications for reliability, consistency, mixed stewardship, versioning and identity resolution.
3. Annotated. Carry metadata concerning provenance profile, lifecycle profile, sharing profile (permissions, licensing, downloads, views), curation profile (tags, comments, ratings) and usage profile.
4. Repeatable. Capture information about the lifecycle of the investigation, facilitating experiments to be repeatable (without change), reusable (with reconfiguration), replayable and/or repurposable (as new components or templates).
5. Interoperable. Publishable and exchangeable units that facilitate interoperability; OAI-ORE standards increase interoperability and facilitate the consumption of Research Objects between applications.
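The first three characteristics (composite, distributed, annotated) suggest a simple data structure: an identifiable aggregate that holds references to resources, typed relationships between them, and metadata. The Python sketch below is one possible reading under those assumptions; all class names, field names and example.org URIs are hypothetical and do not reflect the myExperiment data model or OAI-ORE serialisation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relation:
    """A typed link between two aggregated resources."""
    subject: str
    predicate: str
    target: str


@dataclass
class ResearchObject:
    uri: str                                          # identifiable as one resource
    resources: dict = field(default_factory=dict)     # resource URI -> kind (references only)
    relations: list = field(default_factory=list)     # typed interrelationships
    annotations: dict = field(default_factory=dict)   # tags, ratings, licence, etc.

    def add_resource(self, uri, kind):
        self.resources[uri] = kind

    def relate(self, subject, predicate, target):
        # Both ends must already be aggregated by this object.
        for uri in (subject, target):
            if uri not in self.resources:
                raise KeyError(f"unknown resource: {uri}")
        self.relations.append(Relation(subject, predicate, target))

    def targets(self, subject, predicate):
        return [r.target for r in self.relations
                if r.subject == subject and r.predicate == predicate]


def build_example():
    """Build a small pack-like object with made-up URIs."""
    ro = ResearchObject(uri="http://example.org/pack/1")
    ro.add_resource("http://example.org/workflow/16", "workflow")
    ro.add_resource("http://example.org/results/1", "results")
    ro.add_resource("http://example.org/paper/1", "paper")
    ro.relate("http://example.org/workflow/16", "produces",
              "http://example.org/results/1")
    ro.relate("http://example.org/results/1", "published in",
              "http://example.org/paper/1")
    ro.annotations["tags"] = ["QTL"]
    return ro


if __name__ == "__main__":
    example = build_example()
    print(example.targets("http://example.org/workflow/16", "produces"))
```

Storing only URIs keeps the object a collection of references, which is what makes versioning and identity resolution an issue in the distributed case.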
Thoughts
• myExperiment provides social infrastructure: it facilitates sharing and enables scientists to “collaborate in order to compete”
• myExperiment has a growing community and growing content
• New content types: Meandre, Kepler, R, MATLAB, ..., spreadsheets? SPARQL queries?
• We are targeting how we believe research will be conducted in the future, through the assembly of e-Laboratories which share Research Objects
• The SPARQL endpoint is an effective alternative to the API: it provides any service you want!
• Workflows for Semantic Web scripting?
Contact
David De Roure: dder@ecs.soton.ac.uk
Carole Goble: carole.goble@manchester.ac.uk
Slide Credits: Simon Coles, Paul Fisher, Adam Belloum, Sean Bechhofer, David Shotton