9f8b92c74b9abec5d5b80d2d7d090a0a.ppt
- Number of slides: 20
Cyberinfrastructure - Revolutionizing Discovery, Learning & Innovation
Deborah L. Crawford, Deputy Assistant Director of NSF for Computer & Information Science & Engineering
January 27, 2004
“[Science is] a series of peaceful interludes punctuated by intellectually violent revolutions ... [in which] ... one conceptual world view is replaced by another.” --Thomas Kuhn, The Structure of Scientific Revolutions
Daniel E. Atkins, Chair, University of Michigan
Kelvin K. Droegemeier, University of Oklahoma
Stuart I. Feldman, IBM
Hector Garcia-Molina, Stanford University
Michael L. Klein, University of Pennsylvania
David G. Messerschmitt, University of California at Berkeley
Paul Messina, California Institute of Technology
Jeremiah P. Ostriker, Princeton University
Margaret H. Wright, New York University
http://www.communitytechnology.org/nsf_ci_report/
Setting the Stage
In summary then, the opportunity is here to create cyberinfrastructure that enables more ubiquitous, comprehensive knowledge environments that become functionally complete ... in terms of people, data, information, tools, and instruments and that include unprecedented capacity for computation, storage, and communication. They can serve individuals, teams and organizations in ways that revolutionize what they do, how they do it, and who can participate. - The Atkins Report
National Virtual Observatory (NVO)
• The goal: To put the “universe on the grid.”
  – unite the astronomical databases of many observatories
  – take advantage of the latest computer technologies and data storage and analysis techniques
• NVO will make possible significant interactions between large datasets and large-scale theoretical simulations of astrophysical systems.
• NVO will maximize the potential for new scientific insights from the data by making them available in an accessible, unified form to researchers, amateur astronomers, and students.
• Will change astronomy as we know it.
http://www.us-vo.org/
New Modes of Interaction with Resources. Source: W. Feiereisen
The Information Tsunami -- An Example
• Terabyte [1,000,000,000,000 bytes OR 10^12 bytes]
  • 1 Terabyte: An automated tape robot OR all the X-ray films in a large technological hospital OR 50,000 trees made into paper and printed OR daily rate of EOS data (1998)
  • 2 Terabytes: An academic research library OR a cabinet full of Exabyte tapes
  • 10 Terabytes: The printed collection of the US Library of Congress
  • 50 Terabytes: The contents of a large Mass Storage System
  • 400 Terabytes: National Climatic Data Center (NOAA) database
• Petabyte [1,000,000,000,000,000 bytes OR 10^15 bytes]
  • 1 Petabyte: 3 years of EOS data (2001) OR 1 sec of CMS data collection
  • 2 Petabytes: All US academic research libraries
  • 8 Petabytes: All information available on the Web
  • 20 Petabytes: Production of hard-disk drives in 1995
  • 200 Petabytes: All printed material OR production of digital magnetic tape in 1995
• Exabyte [1,000,000,000,000,000,000 bytes OR 10^18 bytes]
  • 2 Exabytes: Total volume of information generated worldwide annually
  • 5 Exabytes: All words ever spoken by human beings
• Zettabyte [1,000,000,000,000,000,000,000 bytes OR 10^21 bytes]
• Yottabyte [1,000,000,000,000,000,000,000,000 bytes OR 10^24 bytes]
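The unit ladder above is a sequence of factors of 1000. As a minimal sketch (the names `UNITS` and `describe` are illustrative and not part of the presentation), a byte count can be expressed in the largest decimal unit that fits:

```python
# Decimal (SI) byte units as listed on the slide, largest first.
# Each unit is 1000 times the one below it.
UNITS = [
    ("Yottabyte", 10**24),
    ("Zettabyte", 10**21),
    ("Exabyte", 10**18),
    ("Petabyte", 10**15),
    ("Terabyte", 10**12),
]


def describe(num_bytes):
    """Return num_bytes expressed in the largest unit that fits."""
    for name, size in UNITS:
        if num_bytes >= size:
            value = num_bytes / size
            suffix = "" if value == 1 else "s"
            return f"{value:g} {name}{suffix}"
    # Anything below one Terabyte is reported in plain bytes.
    return f"{num_bytes} bytes"


if __name__ == "__main__":
    print(describe(400 * 10**12))  # the NOAA database example
    print(describe(2 * 10**18))    # annual worldwide information example
```

The same constants make ratios between slide entries easy to check: for instance, `(8 * 10**15) // (10 * 10**12)` compares the slide's figure for the Web with its figure for the printed Library of Congress collection.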
The Future: Multi-Petabyte Databases Generated by New Science Programs
Science Program/Application                            2002   2003   2004
Large Hadron Collider                                   100    500   2500
National Virtual Observatory                             35     55   1000
Laser Interferometer Gravitational Wave Observatory      20    100    600
Neuroscience Imaging                                     <1     50    200
Network for Earthquake Engineering Simulation            <1      5     50
National Ecological Observatory Network                  <1      5     50
Source: PACI
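To compare how quickly each program's data volume grows, the figures from the slide can be turned into growth factors. This is only a sketch: the slide does not state units, the "<1" entries carry no exact value (stored here as `None`), and the names `VOLUMES`, `growth_factor`, and `summarize` are hypothetical.

```python
# Figures transcribed from the slide, in column order (2002, 2003, 2004).
# Units are not stated on the slide; "<1" is stored as None because
# no exact figure is given.
VOLUMES = {
    "Large Hadron Collider": (100, 500, 2500),
    "National Virtual Observatory": (35, 55, 1000),
    "Laser Interferometer Gravitational Wave Observatory": (20, 100, 600),
    "Neuroscience Imaging": (None, 50, 200),
    "Network for Earthquake Engineering Simulation": (None, 5, 50),
    "National Ecological Observatory Network": (None, 5, 50),
}


def growth_factor(first, last):
    """Ratio last/first, or None when either value is unknown."""
    if first is None or last is None:
        return None
    return last / first


def summarize(volumes):
    """Map each program to its first-to-last-column growth factor."""
    return {name: growth_factor(v[0], v[-1]) for name, v in volumes.items()}


if __name__ == "__main__":
    for name, factor in summarize(VOLUMES).items():
        label = "n/a (first value is <1)" if factor is None else f"{factor:.1f}x"
        print(f"{name}: {label}")
```

For rows whose first column is "<1", the growth between the second and third columns can still be computed, for example `growth_factor(50, 200)` for Neuroscience Imaging.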
NSF Investments in IT Infrastructure: FY 2004 Baseline
• Network for Earthquake Engineering Simulation
• Protein Databank
• International Integrated Microdata Access System
• Partnerships for Advanced Computational Infrastructure
• Circumarctic Environmental Observatory Network
• National Science Digital Library
• Pacific Rim Application and Grid Middleware Assembly
• Geosciences Network
• International Virtual Data Grid Laboratory
... and others too numerous to mention (~$400M in FY'04)
Baseline Conclusions
• Information Technology Infrastructure
  – critical enabler in a wide variety of projects
  – enables collaboration, computation, communication, data collection, curation, sharing
  – ad hoc development
• Absence of a systems approach. Little attention to
  – interoperability
  – accessibility
  – usability
  – resilience
  – sustainability
Desired Characteristics
• Science- and engineering-driven
• Enabling discovery, learning and innovation
• Promising economies of scale and scope
• Supporting data-, instrumentation-, computation-intensive applications
• Covering high-end to desktop
• Heterogeneous
• Interoperable: enabled by a collection of reusable, common building blocks that are well-supported and well-maintained
Overarching Principles
• Enable new science and engineering opportunities
  – Demonstrate transformative power of CI across S&E enterprise
  – Empower range of CI users, current and emerging
  – System-wide evaluation and CI-enabling research inform progress
• Develop intellectual capital
  – Catalyze community development and support
  – Enable training and professional development
  – Broaden participation in the CI enterprise
• Enable integration and interoperability
  – Promote collaboration, coordination and communication across fields
  – Develop shared vision, integrating architectures, common investments
  – Share promising technologies, practices and lessons learned
CI Planning - A Systems Approach
Integrative CI “system of systems”
• S&E Frontiers, CI-enabling Research: ADVANCING THE FRONTIER
  – Reveal new S&E opportunities
  – Technology/human capital roadmaps
  – Gaps and barrier analyses
  – Gateways/Portals/Digital Knowledge Environments
• Core Activities: SYSTEM-WIDE ACTIVITIES
  – Education, training
  – (Inter)national networks
  – Capacity computing
  – Science of CI
• Common Tools: INTEGRATING ARCHITECTURES
  – Compute-centric
  – Information-intensive
  – Instrumentation-enabling
CI Commons - Integrating Architectures
Goals
• Commercial-grade software: stable, well-documented, well-supported and well-maintained
• User surveys and focus groups inform priority-setting
• Development of technology roadmaps
Unanswered questions
• What role do industry and other partners play in development and support of products?
• In what timeframe will software and services be available?
• How will customer satisfaction be assessed, and by whom?
• What role do standards play?
Community Development Processes
• Customizes common building blocks for domain applications
• End-user communities willing and able to modify code
• Adds features, repairs defects, improves code
• Leads to higher quality code, enhances diversity
• Natural way to set priorities
Requires
• Education, training in community development methodologies
• Effective Commons governance plan
• Strong, sustained interaction between Commons developers and community code users/enhancers
Challenging Context
• Cyberinfrastructure Ecology
  – Technological change more rapid than organizational change
  – Disruptive technology promises unforeseen opportunity
• Seamless Integration of New and Old
  – Balancing upgrades of existing and creation of new resources
  – Legacy instruments, models, data, methodologies
• Broadening Participation
• Community-Building
Extensible Terascale Facility (TERAGRID) - A CI Pathfinder
• Pathfinder Role
  – integrated with extant CI capabilities
  – value-added
    • supporting a new class of S&E applications
• Deploy a balanced, distributed system
  – not a “distributed computer” but rather
  – a distributed “system” using Grid technologies
    • computing and data management
    • visualization and scientific application analysis
    • remote instrumentation access
• Define an open and extensible infrastructure
  – an “enabling cyberinfrastructure” demonstration
  – extensible beyond original sites with additional funding
    • NCSA, SDSC, ANL, Caltech and PSC
    • ORNL, TACC, Indiana University, Purdue University and Atlanta hub
TERAGRID as a Pathfinder
• Application Gateways to New Frontiers
  – On-demand computing
  – Remote visual steering
  – Data-intensive computing
• Systems Approach
  – Common TERAGRID Software Stack
  – User training & services
  – TERAGRID Operations Center
• Resource Providers
  – Data resources, compute engines, visualization facilities, research instruments
Focus on Policy and Social Dynamics
• Policy issues must be considered up front
• Social engineering will be at least as important as software engineering
• Well-defined interfaces will be critical for successful software development
• Application communities will need to participate from the beginning
Source: Fran Berman, SDSC