- Number of slides: 26
S06: Open-Source Stack for Cloud Computing
Milind Bhandarkar (Yahoo!), Richard Gass (Intel), Michael Kozuch (Intel), Michael Ryan (Intel)
Agenda
Sessions:
(A) Introduction 8:30-9:00
(B) Hadoop 9:00-10:00
Break 10:00-10:30
Hadoop/Pig 10:30-12:00
Lunch 12:00-1:30
(C) Pig 1:30-2:00
(D) Tashi 2:00-3:00
Break 3:00-3:30
Zoni
(E) PRS 3:30-4:45
Wrapup 4:45-5:00
I. Speaker intros
II. Motivation
III. Open Cirrus
IV. Software stack
V. Getting involved
Session A: Introduction
Michael Kozuch (Intro)
• Michael Kozuch is a Principal Engineer with Intel Labs Pittsburgh and manager of the ILP Systems Research and Engineering group
  – Manages the Intel Open Cirrus cluster and is the PI for the Tashi research project
• Michael is a 12-year veteran of Intel and contributed to the development of Intel’s VT and TXT technologies
• He has published 25+ scientific papers and 20+ patents
Milind Bhandarkar (Hadoop)
• Lead of the Yahoo! Grid Solutions Team since June 2005
• Contributor to Hadoop since January 2006
• Trained 1000+ Hadoop users at Yahoo! and elsewhere
• 20+ years of experience in parallel programming
Michael Ryan (Tashi)
• Michael is a research engineer with Intel Labs Pittsburgh
• Lead developer for Tashi
• Serves as sysadmin for the Intel Open Cirrus site
• Coordinates the Global Monitoring service for Open Cirrus
Richard Gass (Zoni)
• Richard is currently a research engineer with Intel Labs Pittsburgh
  – Lead developer for Zoni
  – Serves as sysadmin for the Intel Open Cirrus site
• Richard has published 9+ scientific papers and is also an (imminent) Ph.D. candidate with University Pierre and Marie Curie, LIP6, in Paris
Motivation
Why Open and Cloud makes sense
• Cloud Computing is a new, critical technology
  – Efficiency: Admin costs aggregated
  – Scalability: From 1 to 1000 servers in 10 sec. flat
  – Empowerment: Anyone can buy a cluster
• Open Communities enable rapid innovation
  – Exchange of ideas: Knowledge grows
  – Constructive Darwinism: Best tools survive/evolve
  – Empowerment: Anyone can build a LAMP stack
Rapidly developing and deploying innovative computing technologies
Research Interest: Big Data
• Interesting applications are data hungry
• The data grows over time
• The data is immobile
  – 100 TB @ 1 Gbps ~= 10 days
• Compute comes to the data (Data-Rich Computing theme proposal. J. Campbell, et al., 2007)
• Big Data clusters are the new libraries
The value of a cluster is its data
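The "100 TB @ 1 Gbps ~= 10 days" figure can be checked with a short calculation. The sketch below assumes decimal units, a fully utilized link, and no protocol overhead; none of these assumptions are stated on the slide.

```python
# Back-of-the-envelope check of "100 TB @ 1 Gbps ~= 10 days".
# Assumptions (not from the slides): decimal units, a fully utilized
# link, and no protocol overhead.

def transfer_days(size_bytes: float, link_bits_per_sec: float) -> float:
    """Return the ideal transfer time in days for a payload over a link."""
    seconds = size_bytes * 8 / link_bits_per_sec
    return seconds / 86_400  # seconds per day


TB = 10**12    # bytes in a decimal terabyte
GBPS = 10**9   # bits per second in one gigabit per second

days = transfer_days(100 * TB, 1 * GBPS)
print(f"{days:.1f} days")  # prints "9.3 days"
```

Under these ideal conditions the transfer takes about 9.3 days, so the slide's rounded figure of 10 days is, if anything, optimistic once real-world overhead is included.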
Open Cirrus
Open Cirrus* Cloud Computing Testbed
Collaboration between industry and academia, sharing
• hardware infrastructure
• software infrastructure
• research
• applications and data sets
Site labels on the map: UIUC*, KIT*, ISPRAS*, ETRI*, IDA*, MIMOS*
Sponsored by HP, Intel, and Yahoo! (with additional support from NSF)
9 sites currently, target of around 20 in the next two years
Open Cirrus*
• Objectives
  – Foster systems research around cloud computing
  – Vendor-neutral open-source stacks and APIs for the cloud
  – Expose research community to enterprise-level requirements
  – Provide realistic traces of cloud workloads
• How are we unique
  – Support for systems research and applications research
  – Federation of heterogeneous datacenters
  – Collection of interesting data sets
Independently-managed sites… providing a cooperative research testbed
User Access to Open Cirrus*
• User access is organized around Research Projects
  – Led by a Principal Investigator (PI)
• Project PIs apply to each site separately
  – Identifying additional team members
• Contact information for applications to each site is available on the Open Cirrus Web site (http://opencirrus.org)
• Each Open Cirrus site decides which users and projects get access to its site.
Open Cirrus* Research Projects
Example research areas of interest
• Datacenter federation
• Datacenter management
• Web services
• Data-intensive systems
Projects typically not of interest
• Traditional HPC app development
• Production apps looking for “free” cycles
• Closed-source system development
Software Stack
Open Cirrus* Software Components
• Global Services: Single Sign-On, Global Monitoring, User Directories
• Site Services
  – Application Services (Hadoop)
  – Virtual Machine Allocation (AWS* compatible, e.g., Tashi or Eucalyptus)
  – Cluster Storage (HDFS)
  – Data Location
  – Resource Telemetry
  – Billing/Accounting
  – Physical Machine Allocation (Zoni)
• Compute Node Services
Physical Machine Allocation: Zoni
• Zoni dynamically divides compute nodes into isolated subdomains
  – Provides each project with a mini-datacenter
  – Isolation of experiments
Example domains shown in the diagram:
• Open service research
• Tashi development
• Production storage service
• Proprietary service research
• Apps running in a VM mgmt infrastructure (e.g., Tashi, Eucalyptus)
• Open workload monitoring and trace collection
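The idea of dividing a pool of physical nodes into isolated per-project domains can be illustrated with a toy model. This is not Zoni's API: `DomainAllocator`, its methods, and the node and domain names are all hypothetical, and real isolation (network, boot, power) is outside the scope of the sketch.

```python
# Toy model of partitioning physical nodes into isolated project domains.
# This is NOT Zoni's API; DomainAllocator and its methods are hypothetical.

class DomainAllocator:
    def __init__(self, nodes):
        self.free = set(nodes)   # nodes not assigned to any domain
        self.domains = {}        # domain name -> set of node names

    def allocate(self, domain, count):
        """Move `count` free nodes into `domain`; fail if too few are free."""
        if count > len(self.free):
            raise ValueError(f"only {len(self.free)} nodes free")
        picked = set(sorted(self.free)[:count])  # deterministic choice
        self.free -= picked
        self.domains.setdefault(domain, set()).update(picked)
        return picked

    def release(self, domain):
        """Return all of a domain's nodes to the free pool."""
        self.free |= self.domains.pop(domain, set())

    def is_isolated(self):
        """True if no node is in two domains, or in a domain and the free pool."""
        seen = set(self.free)
        for nodes in self.domains.values():
            if seen & nodes:
                return False
            seen |= nodes
        return True


alloc = DomainAllocator(f"node{i:02d}" for i in range(8))
alloc.allocate("tashi-dev", 3)
alloc.allocate("storage", 2)
print(sorted(alloc.domains["tashi-dev"]))  # ['node00', 'node01', 'node02']
print(alloc.is_isolated())                 # True
```

The invariant checked by `is_isolated` is the point of the sketch: every node belongs to at most one domain at a time, so each project sees its own mini-datacenter.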
Cluster Storage: HDFS
• Storage system aggregating standard devices
  – High-performance, parallel access
  – High data reliability through replication
• Exposing location information enables intelligent placement of computation
Diagram labels: Storage Service; Node; Node
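How exposed location information enables placement of computation can be sketched with a toy scheduler. The block-to-node map, node names, slot counts, and the `place_tasks` helper are made up for illustration; the sketch does not use HDFS's actual interfaces.

```python
# Toy illustration of locality-aware task placement.
# The block-to-node map below is made up; real HDFS exposes replica
# locations through its own interfaces, which this sketch does not use.

def place_tasks(block_locations, free_slots):
    """Assign each block's task to a node holding a replica when possible.

    block_locations: dict of block id -> list of nodes holding a replica
    free_slots: dict of node -> number of free task slots (not modified)
    Returns dict of block id -> (node, is_local).
    """
    slots = dict(free_slots)  # work on a copy
    placement = {}
    for block, replicas in block_locations.items():
        local = [n for n in replicas if slots.get(n, 0) > 0]
        if local:
            node, is_local = local[0], True
        else:
            # Fall back to any node with a free slot; data crosses the network.
            candidates = [n for n, s in slots.items() if s > 0]
            if not candidates:
                raise RuntimeError("no free slots")
            node, is_local = candidates[0], False
        slots[node] -= 1
        placement[block] = (node, is_local)
    return placement


blocks = {
    "blk_1": ["nodeA", "nodeB", "nodeC"],  # three replicas, for illustration
    "blk_2": ["nodeA", "nodeB", "nodeD"],
    "blk_3": ["nodeA", "nodeC", "nodeD"],
}
slots = {"nodeA": 1, "nodeB": 1, "nodeC": 0, "nodeD": 0, "nodeE": 1}
placement = place_tasks(blocks, slots)
for block, (node, is_local) in placement.items():
    print(block, node, "local" if is_local else "remote")
```

In this run the first two tasks land on nodes that already hold a replica, while the third has to read its block remotely because every replica holder is busy. Replication therefore helps placement as well as reliability: more replicas mean more candidate nodes for a local task.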
Virtual Machine Allocation: Tashi
• An open-source Apache Software Foundation incubator project
  – Infrastructure for cloud computing on Big Data
  – http://incubator.apache.org/projects/tashi
  – Support for AWS* interface
  – OS, FS, and VMM agnostic
• Research focus:
  – Location-aware co-scheduling of compute, storage, and power
  – Seamless physical/virtual migration
Application Service: Hadoop
• An open-source Apache Software Foundation project sponsored by Yahoo!
  – http://hadoop.apache.org
• Provides a scalable, parallel programming model (MapReduce) and the associated runtime
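The MapReduce programming model named above can be illustrated with a single-process word count. This sketch runs in plain Python and does not use Hadoop's API; the function names and the sample input are made up. A real runtime would spread the map, shuffle, and reduce phases across many nodes.

```python
# Single-process sketch of the MapReduce model (word count).
# Plain Python only; none of these names come from Hadoop.
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to each record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group all values by key, as the runtime does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reducer to each key and its list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    """Emit (word, 1) for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def count_reducer(word, counts):
    """Sum the per-occurrence counts for one word."""
    return sum(counts)


lines = ["open cirrus open source", "open stack"]
counts = reduce_phase(shuffle(map_phase(lines, word_mapper)), count_reducer)
print(counts)  # {'open': 3, 'cirrus': 1, 'source': 1, 'stack': 1}
```

The user supplies only `word_mapper` and `count_reducer`; everything else is generic plumbing. That separation is what lets a runtime parallelize the job without changes to user code.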
Getting Involved
Summary
• Open Communities can shape the development of Cloud Computing
• Open Cirrus* is a multi-partner testbed for research in Cloud Computing
• The Open Cirrus software stack provides a good starting point for open-source cloud computing software development
Getting Involved
http://opencirrus.org
• Contact Open Cirrus* with research proposals
• Contribute to the Open Cirrus software stack
  – Zoni, Tashi, Hadoop
  – Apache Software Foundation*
The Rest of the Day
Ground Rules
• Questions?
  – Please ask, we’d love an interactive day
  – But, if the answer is not of general interest, we may defer until the break
• Need to step out?
  – That’s OK, but please take your belongings
  – Including the lunch
• Please be considerate
  – And keep conversations focused on the topic


