
0e26b7d02eeecf7bbfca9669dc5ee229.ppt
- Number of slides: 11
Cloud Computing BOF
OGF 22 Birds of a Feather Session
Hyatt Regency Cambridge, February 27, 2008
Geoffrey Fox, Indiana University, gcf@indiana.edu
© 2007 Open Grid Forum
Cloud Agenda
• Geoffrey Fox (Indiana U.): Remarks on Cloud Computing
• Martin Swany (Internet2): Clouds and Dynamic Networking
• Steven Newhouse (Microsoft): Personal View on Clouds
• Kate Keahey (Argonne, Chicago): First Steps in the Clouds
• Next Steps
What are Clouds?
• Clouds are “Virtual Clusters” (“Virtual Grids”) of possibly “Virtual Machines”
  • They may cross administrative domains or may “just be a single cluster”; the user cannot and does not want to know
• Clouds support access to (lease of) computer instances
  • Instances accept data and job descriptions (code) and return results that are data and status flags
• Each Cloud is a “Narrow” (perhaps internally proprietary) Grid
• When does the Cloud concept work?
  • Parameter searches, LHC-style data analysis ...
  • Common case (most likely success case for clouds) versus corner case?
• Clouds can be built from Grids
• Grids can be built from Clouds
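The lease model on this slide (lease an instance, hand it data plus a job description, get back data and a status flag) can be sketched roughly as follows. This is a minimal illustration only: the names `Cloud`, `Instance`, `JobResult`, `lease`, and `release` are hypothetical and do not correspond to any real cloud API.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class JobResult:
    """What an instance returns: result data plus a status flag."""
    data: Any
    status: str  # "ok" or "failed" (hypothetical flag values)


class Instance:
    """A leased compute instance (hypothetical; stands in for a virtual machine)."""

    def run(self, job: Callable[[Any], Any], data: Any) -> JobResult:
        # The "job description" is modeled here as a plain Python callable.
        try:
            return JobResult(data=job(data), status="ok")
        except Exception as exc:
            return JobResult(data=str(exc), status="failed")


class Cloud:
    """Hands out instances; the user never sees where they actually run."""

    def __init__(self, capacity: int) -> None:
        self._free = capacity

    def lease(self) -> Instance:
        if self._free == 0:
            raise RuntimeError("no instances available")
        self._free -= 1
        return Instance()

    def release(self, instance: Instance) -> None:
        self._free += 1


cloud = Cloud(capacity=2)
instance = cloud.lease()
result = instance.run(sum, [1, 2, 3])   # data in, data + status out
cloud.release(instance)
```

Whether the instance is a virtual machine on one cluster or spans administrative domains is hidden behind `lease`, which is the point the slide makes about the user not wanting to know.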
Cloud References
• http://en.wikipedia.org/wiki/Cloud_computing
  • Includes references to Amazon, Apple, Dell, Enomalism, Globus, Google, IBM, KnowledgeTreeLive, Nature, New York Times, Zimdesk
  • Others like Microsoft Windows Live Skydrive important
• http://en.wikipedia.org/wiki/Amazon_Elastic_Compute_Cloud
• http://uc.princeton.edu/main/index.php?option=com_content&task=view&id=2589&Itemid=1 Policy Issues
• http://www.cra.org/ccc/home.article.bigdata.html
  • Hadoop (MapReduce) and “Data Intensive Computing”
  • See Data intensive computing minitrack at HICSS-42 January 2009
• http://ianfoster.typepad.com/blog/2008/01/theres-grid-in.html
• OGF Thought Leadership blog
• OGF 22 talks by Charlie Catlett and Irving Wladawsky-Berger
Big-Data Computing Study Group
• CCC Role Versus OGF?
• Hadoop and MapReduce are “just” workflow?
Google MapReduce: Simplified Data Processing on Clusters/Clouds
• http://labs.google.com/papers/mapreduce.html
• This is a dataflow model between services, where services can run useful document-oriented data-parallel applications, including reductions
• The decomposition of services onto cluster engines (clouds) is automated
• The large I/O requirements of datasets change efficiency analysis in favor of dataflow
• Services (count words in example) can obviously be extended to general parallel applications
• There are many alternative languages for expressing dataflow and/or parallel operations and/or workflow
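The word-count example mentioned on this slide can be sketched in a few lines. This is a single-process illustration of the map, group-by-key, and reduce steps, not Google's or Hadoop's implementation; a real framework would distribute each step across cluster nodes automatically. All function names are chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_words(document: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as a framework would do between map and reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce step: sum all the counts emitted for one word."""
    return word, sum(counts)


def word_count(documents: List[str]) -> Dict[str, int]:
    """Run map over every document, group by word, then reduce each group."""
    pairs = (pair for doc in documents for pair in map_words(doc))
    return dict(reduce_counts(w, c) for w, c in shuffle(pairs).items())


counts = word_count(["clouds and grids", "grids and clouds and data"])
```

Because each document is mapped independently and each word is reduced independently, both steps are data parallel, which is what lets the decomposition onto cluster engines be automated.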
Technical Questions about Clouds I
• What is the performance overhead?
  • On individual CPU
  • On system including data and program transfer
• What is the cost gain?
  • From size efficiency; “green” location (rumor that Google has purchased the Niagara Falls including Canada!)
• Is Cloud Security adequate: can clouds be trusted?
• Can one do parallel computing on clouds?
  • Looking at “capacity” not “capability”, i.e. lots of modest sized jobs
  • Marine corps will use Petaflop machines – they just need ssh and a.out
Technical Questions about Clouds II
• How is data-compute affinity tackled in clouds?
  • Co-locate data and compute clouds?
  • Lots of optical fiber, i.e. “just” move the data?
• What happens in clouds when demand for resources exceeds capacity – is there a multi-day job input queue?
• Are there novel cloud scheduling issues?
• Do we want to link clouds (or ensembles as atomic clouds); if so, how and with what protocols?
• Is there an intranet cloud, e.g. “cloud in a box” software to manage personal (cores on my future 128 core laptop), department or enterprise cloud?
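The "co-locate or just move the data" question comes down to comparing transfer time with compute time. A back-of-envelope sketch, assuming an ideal link with no protocol overhead: the function names, the 50% threshold, and the example sizes below are illustrative assumptions, not figures from the talk.

```python
def transfer_seconds(dataset_bytes: float, link_bits_per_second: float) -> float:
    """Ideal time to move a dataset over a link, ignoring latency and overhead."""
    return dataset_bytes * 8 / link_bits_per_second


def should_move_data(
    dataset_bytes: float,
    link_bits_per_second: float,
    compute_seconds: float,
    max_transfer_fraction: float = 0.5,  # assumed tolerance, purely illustrative
) -> bool:
    """Moving data is acceptable if the transfer stays small relative to the compute time."""
    transfer = transfer_seconds(dataset_bytes, link_bits_per_second)
    return transfer <= max_transfer_fraction * compute_seconds


# Hypothetical case: 1 TB dataset, 10 Gb/s optical link, one hour of computation.
one_terabyte = 1e12
ten_gigabit = 1e10
seconds_to_move = transfer_seconds(one_terabyte, ten_gigabit)
move_it = should_move_data(one_terabyte, ten_gigabit, compute_seconds=3600)
```

When the transfer dominates, co-locating data and compute clouds is the better answer; when fiber is plentiful relative to the computation, moving the data is tolerable.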
Standards for Compute and Storage Clouds
• We no longer need interoperability of services and messages (SOAP) but rather interoperability of clouds
  • Maybe each cloud is so big that interoperability between clouds is not so critical
• Interoperability certainly for application-specific data and perhaps also for job specifications
  • WFS, GML for Geo-data; IVOA standards; DST LHC experiment formats
  • JSDL, BES etc.
• Each Cloud will be proprietary but they might want raw infrastructure standards so they can easily swap in and out different vendors’ disk drives
• Clouds very loosely coupled; services loosely coupled
MSI Challenge Problem
• There are > 330 MSI’s – Minority Serving Institutions
• 2 examples
  • ECSU is a small state university in North Carolina
    • HBCU with 4000 students
    • Working on PolarGrid (Sensors in Arctic/Antarctic linked to “TeraGrid”)
  • Navajo Tech in Crownpoint, NM is a community college with technology leadership for Navajo Nation
    • “Internet to the Hogan and Dine Grid” links Navajo communities by wireless
    • Wish to integrate TeraGrid science into Navajo Nation education curriculum
• Current Grid technology too complicated if you are not an R1 institution
  • Hard to deploy campus grids broadly into MSI’s
• Clouds provide virtual campus resources?
Next Steps at OGF
• Clouds are just starting and build on/are related to Grids
• Clear need for best practice in use and technology
• Likely to be need for new standards and novel use of existing/projected standards
• New Cloud Community Group?
  • Chairs, participants?
• Workshop?
• OGF 23 activity?
• Identify key players not currently involved with OGF?