  • Number of slides: 36

The Initiative in Innovative Computing at Harvard
Alyssa A. Goodman, IIC Director & Prof. of Astronomy

Agenda
  • What is IIC? (“Filling the Gap”)
  • Where did it come from? (A Story)
  • What have we done so far? (Startup Mode)
  • What are we about to do? (Projects, Hiring Plans)
  • What do we hope to do? (Long-term Goals)

Filling the “Gap” between Science and Computer Science
Scientific disciplines
  • Increasingly, core problems in science require computational solution
  • Typically hire/“home grow” computationalists, but often lack the expertise or funding to go beyond the immediate pressing need
Computer Science departments
  • Focused on finding elegant solutions to basic computer science challenges
  • Often see specific, “applied” problems as outside their interests

Where did IIC come from?
Short Version: Response to Harvard’s “expansion” in Science, and into Allston. See IIC Whitepaper (2004) & Task Force on Science & Technology report (2005) for more.
Long Version…

IIC & NCSA/IACAT’s Shared Focus
“We must make the cyberinfrastructure as easy to use as Mosaic made the Internet to use…. Raw computing power is still very important, but many scientific and engineering problems require integration of simulation, data sources (sensor arrays, telescopes, etc.), databases, and analysis and visualization, all in a distributed environment. Cyberenvironments will allow the scientist or engineer to fashion her own course, knowing that all of the capabilities in the cyberinfrastructure are reliably behind her … The tools and services in the cyberenvironments will include scientific and engineering applications and web services, graphical user interfaces and portals for easy interaction with the applications, and workflow and collaboration software to support complex, collaborative projects. A major component of this effort will be an integrated data analysis and visualization capability, which is needed by an increasing number of scientific and engineering communities.”
--Thom Dunning (in Will, Ingenuity, Skill & Planning, 2006)

Computational challenges are common across scientific disciplines
How to:
  • Acquire, transmit, organize, and query new kinds of data?
  • Apply distributed computing resources to solve complex problems?
  • Derive meaningful insight from large datasets?
  • Share, integrate and analyze knowledge across geographically dispersed researchers?
  • Visually represent scientific results so as to maximize understanding?
Opportunity to collaborate and apply insights from one field to another

Workflow and WORKFLOW
Examples: Astronomy; Public Health
  • “Collect”: Telescope (Astronomy); Microscope, Stethoscope, Survey (Public Health)
  • COLLECT: “National Virtual Observatory”/COMPLETE (Astronomy); CDC Wonder (Public Health)
  • “Analyze”: Study the density structure of a star-forming glob of gas (Astronomy); Find a link between one factory’s chlorine runoff & disease (Public Health)
  • ANALYZE: Study the density structure of all star-forming gas in… (Astronomy); Study the toxic effects of chlorine runoff in the U.S. (Public Health)
  • “Collaborate”: Work with your student
  • COLLABORATE: Work with 20 people in 5 countries, in real-time
  • “Respond”: Write a paper for a Journal.
  • RESPOND: Write a paper, the quantitative results of which are shared globally, digitally.

Real World Workflow
e.g. Emergency Medicine in the Age of High-Speed Networks, Fast Processors, Mass Storage, and Miniature Devices
IIC/Harvard contact: Matt Welsh, DEAS

“Computational Science” Missing at Most Universities
Continuum: from “Pure” Discipline Science (e.g. Galileo) to “Pure” Computer Science (e.g. Turing)

Filling the “computational science” gap: IIC
  • Problem-driven approach …focusing effort on solving problems that will have greatest impact & educational value
  • Collaborative projects …combining disciplinary knowledge with computer science expertise
  • Interdisciplinary effort …to ensure that best practices are shared across fields and that new tools and methodologies will be broadly applicable
  • Links with industry …to draw on and learn from experience in applied computation
  • Institutional funding …to ensure effort is directed towards key needs and not driven solely by narrow priorities of funding agencies

Where are the optimal “IIC” problems?
[Figure: Domain Science Payoff (Low to High) plotted against Computer Science Payoff (Low to High), with regions labeled Science Departments, CS Departments, and “Never Mind”.]
What is the right shape for that boundary?

IIC Research Branches (and Projects Draw upon >1)
  • Visualization (V): Physically meaningful combination of diverse data types.
  • Distributed Computing (DC): e-Science aspects of large collaborations. Sharing of data and computational resources and tools in real-time.
  • Databases/Provenance (DB/P): Management, and rapid retrieval, of data. “Research reproducibility” …where did the data come from? How?
  • Analysis & Simulations (AS): Development of efficient algorithms. Cross-disciplinary comparative tools (e.g. statistical).
  • Instrumentation (I): Improved data acquisition. Novel hardware approaches (e.g. GPUs, sensors).
Plus… Educational Programs that bring IIC Science to Harvard students, and to the public at large.

Education is central to IIC’s mission
At Harvard:
  • Undergraduate & graduate courses focused on “data-intensive science”
  • New graduate certificate program, within existing Ph.D. programs
  • Research opportunities at undergraduate and postdoctoral levels
Beyond Harvard:
  • New museum, highlighting the kind of science done at the IIC

IIC’s First Activities (2005-)
  • Image & Meaning Collaboration
  • IIC Seminar Series at Harvard
  • Astronomical Medicine (IIC/CfA/HMS/MGH/BWH-SPL)
  • 1st Call for Ideas (deadline was 3/15/06)

“Image and Meaning”
“I-M” = Working group of scientists, computer scientists, graphic artists, writers, publishers, designers organized and led by Felice Frankel of MIT (soon to be at IIC!)
Goal: To increase both scientists’ understanding of their own data, and the public’s understanding of scientists’ findings, through graphical display.
Activities:
  • Large conferences at MIT in 2001 and Getty Center in 2005.
  • Smaller “IM 2.x” local workshops throughout 2006-7, including @ IIC.
  • Upcoming IM/SIGGRAPH, in conjunction with SIGGRAPH 2007.
  • Online community to be hosted by IIC, beginning later this year. (Social Network model.)

Seminar Sampler (Fall 2005-Spring 2006)
  • Jim Reese: How to Build Google in Your Spare Time
  • Ian Foster: Service-Oriented Science
  • Volker Springel/Nick Holliman: Numerical Cosmology & 3D Viz
  • Tim Kaxiras: Multi-Scale Modeling
  • Anne Trefethen: UK e-Science
  • Carl Kesselman: Emergence of Cyberinfrastructure
  • Panel on CS & Visual Depiction (Frankel, Rheigans, Durand, Pfister)
  • Jim Hendler: Science & the Semantic Web
  • Mark Green: Building a Grid-enabled Gateway for Science & Engineering
  • Roy Williams: Virtual Observatory as a Model for Information Sharing
  • Andy van Dam/Anne Spalter: Digital Visual Literacy
  • Pete Eltgroth: Profiles in Supercomputing
  • Luc Moreau: Provenance
  • Curtis Wong: Interactive Media
  • Eric Klopfer: Games, Simulation & Learning
  • Jim Myers (yesterday!): Cyberenvironments
  • Phil Campbell: Future of e-Publishing
And more to come… Grid, Agile Methods, Array-based Databases, Bio & Neuro informatics, Clinical Applications in Autism Research, Astronomical

Responses to 1st IIC Call for Ideas
  • Atomistic Modeling of Biomolecular Function (V, DC, DB/P, AS)
  • Multiscale Hemodynamics (V, DC, DB/P, AS)
  • Gene Pattern + The Virtual Data Center (V, DC, DB/P, AS)
  • Medical Treatment Outcomes Online
  • Enhanced Viz/Analysis Tools for Archaeo/Geo/Seismology
  • Spatial Ontology Mapping (Community-based)
  • Knowledge Ecology of Science (Peer-to-Peer Collaboration Networks)
  • Framework for Multimodal Studies in Genetics, Biology & the Mind
  • Connectional Analysis of Synaptic Circuitry in the Mammalian Nervous System
  • LHC/LSST/MWA Consortium for Data-Intensive Science
  • A Portal for the National Virtual Observatory
  • Time-Series Research Collaborative

Wiring Diagram for a Complete Brain Circuit (Connectional Analysis of Synaptic Circuitry in the Mammalian Nervous System) (V, DC, DB/P, AS, I)
3D images from electron-microscope images of serial sections (slices)
  • Large-volume studies: up to 500 µm cubes
  • High resolution: 5 nm x-y; 50 nm in z (10^5 x 10^5 x 10^4 = 10^14 voxels)
  • Large datasets: 10-100 TB
Potentially intractable computationally w/o a hierarchical approach
  • Start with the large, dominant pathways: the biggest wires and the biggest excitatory connections.
  • Use this as scaffolding to then solve other pathways: inhibition, lateral connections, feedback.
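
The voxel and dataset figures on this slide can be checked with a short back-of-envelope calculation. The sketch below is illustrative only: it assumes the cube side is 500 µm (the reading that matches the slide's 10^14-voxel figure) and, as a further assumption not stated in the talk, 1 byte per voxel. The function names are hypothetical.

```python
# Back-of-envelope check of the imaging volume (assumptions: 500 um cube
# side, 5 nm x-y resolution, 50 nm z resolution, 1 byte stored per voxel).

NM_PER_UM = 1_000  # nanometres per micrometre


def voxel_count(side_um: int, xy_res_nm: int, z_res_nm: int) -> int:
    """Number of voxels in a cube of the given side at the given resolution."""
    side_nm = side_um * NM_PER_UM
    nx = side_nm // xy_res_nm   # samples along x
    ny = side_nm // xy_res_nm   # samples along y (same resolution as x)
    nz = side_nm // z_res_nm    # serial sections along z
    return nx * ny * nz


def dataset_size_tb(voxels: int, bytes_per_voxel: int = 1) -> float:
    """Raw dataset size in terabytes (1 TB = 10**12 bytes)."""
    return voxels * bytes_per_voxel / 1e12


voxels = voxel_count(side_um=500, xy_res_nm=5, z_res_nm=50)
print(f"voxels: {voxels:.0e}")
print(f"raw size at 1 byte/voxel: {dataset_size_tb(voxels):.0f} TB")
```

Under these assumptions the cube holds 10^5 x 10^5 x 10^4 = 10^14 voxels, or about 100 TB of raw data, which sits at the top of the 10-100 TB range quoted on the slide.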

Modeling Blood Flow (Multiscale Hemodynamics) (V, DC, DB/P, AS)
Develop parallelization, visualization tools, to scale up to real applications
Ultimate goal is MULTISCALE HEMODYNAMICS
Movie: Multiscale approach for translocation of DNA through a nanopore
  • Molecular Dynamics for DNA
  • Lattice Boltzmann Equation for the solvent
S. Melchionna, S. Succi, M. Fyta, L. Stein, E. Kaxiras, M. Seltzer

Modeling Blood Flow (Multiscale Hemodynamics) (V, DC, DB/P, AS)
Develop parallelization, visualization tools, to scale up to real applications
Ultimate goal is MULTISCALE HEMODYNAMICS

Virtual Observatory Portal (V, DC)

Virtual Observatory Portal?

Virtual Observatory Portal?

Virtual Observatory Portal (V, DC)
Default values are shown in green
  • Data on: One object; One Region; A list of objects; A list of regions
  • I want (click all that apply): Spectra; Images; Catalogs
  • I want to: Use VO tools to browse data; Download data to local computer
  • Would you like help writing a script to do your query? Yes or No
Continue

A Computational Framework for Multimodal Studies in GENETICS, BIOLOGY, AND THE MIND (V, DC, DB/P, AS, I)
[Figure: a central Computational Framework linked to Lab 1 through Lab 5, with panels labeled Family history pedigree software toolkit; Cortical Thickness, AD vs. Controls; Core Imaging Methodologies; Topology differences in cocaine addiction; Histological Correlates of AD.]

A Computational Framework for Multimodal Studies in GENETICS, BIOLOGY, AND THE MIND (V, DC, DB/P, AS, I)
“An Entire Disease or Condition of the Brain”
Computational Framework

“Astronomical Medicine”
  • Brigham & Women’s Hospital, Surgical Planning Lab: Michael Halle (IIC)
  • Massachusetts General Hospital, Martinos Center: David Kennedy (IIC)
  • Harvard-Smithsonian Center for Astrophysics: Alyssa Goodman (IIC), Pepi Fabbiano, Martin Elvis, Jonathan McDowell
  • IIC: Douglas Alan (Systems Engineer), Michelle Borkin (Harvard Senior)
Demo Movie

Building the Best (Startup) Program
[Figure: the five research branches, Visualization (V), Distributed Computing (DC), Databases/Provenance (DB/P), Analysis & Simulations (AS), and Instrumentation (I), shown against Project 1.]

Building the Best (Startup) Program
[Figure: the five research branches, Visualization (V), Distributed Computing (DC), Databases/Provenance (DB/P), Analysis & Simulations (AS), and Instrumentation (I), shown against Project 1, Project 2, and Project 3.]

Agenda
  • What is IIC? (“Filling the Gap”)
  • Where did it come from? (A Story)
  • What have we done so far? (Startup Mode)
  • What are we about to do? (Projects, Hiring Plans)
  • What do we hope to do? (Long-term Goals)

IIC will evolve over three phases
  • Timing: Phase I, 2005-08; Phase II, 2008-10; Phase III, 2011+
  • IIC staffing level (combo of new faculty, senior scientists, admin staff): Total ~25 to ~100
  • Number of projects: ~3-5 to ~15
  • Educational mission (new courses offered, outreach programs): New courses to museum
  • Other key milestones: Evaluation schedule (internal, external committees)

Challenges In “Phase I”
  • Result of “Allston” Science & Technology Task Force
  • IIC intended to be a “University” (not a single school) initiative
  • FAS (Faculty of Arts & Sciences) Constraints: Faculty Appointments; Non-Faculty Appointments; Startup Space
  • “Chicken-and-Egg” Problem with Recruiting
  • Good, but not certain, Funding Prospects
  • Role of DEAS Computer Science

“Challenges”
[Figure: Domain Science Payoff (Low to High) plotted against Computer Science Payoff (Low to High), with regions labeled Science Departments, CS Departments, and “Never Mind”.]
  • Will departments hire “computationalists” with regular slots? How big is this overlap?
  • How do we give Senior non-faculty similar stature to faculty? (e.g. P.I. rights, job security)
  • Will CS/DEAS use slots for these people? How big is that overlap?

Sample Long Term Goal
“3D Data Desk” (Demo, using data from http://www.electoralvote.com/2004/info/president.csv)
Perseus file

IIC: Mission
The Institute for Innovative Computing (IIC) will make Harvard a world leader in the innovative and creative use of computational resources to address forefront scientific problems. We will focus on developing capabilities that are applicable to multiple disciplines, by undertaking specific, well-defined projects, thereby developing tools and approaches that can be generalized and shared. We will foster the flow of ideas and inventions along the continuum from basic science to scientific computation to computational science to computer science. We will train a next generation of creative and computationally capable scientists, build linkages to industry, and communicate with the public at large.
