
The past, present, and future of Green Computing
Kirk W. Cameron, SCAPE Laboratory, Virginia Tech

Enough About Me
• Associate Professor, Virginia Tech
• Co-founder, Green500
• Co-founder, MiserWare
• Founding member, SPECpower
• Consultant for EPA Energy Star for Servers
• IEEE Computer "Green IT" columnist
• Over $4M in federally funded "Green" research
• System G supercomputer

What is SCAPE?
• Scalable Performance Laboratory
  – Founded 2001 by Cameron
• Vision
  – Improve efficiency of high-end systems
• Approach
  – Exploit/create technologies for high-end systems
  – Conduct quality research to solve important problems
  – When appropriate, commercialize technologies
  – Educate and train the next generation of HPC computer scientists

The Big Picture (Today)
• Past: Challenges
  – Need to measure and correlate power data
  – Save energy while maintaining performance
• Present
  – Software/hardware infrastructure for power measurement
  – Intelligent power management (CPU MISER, Memory MISER)
  – Integration with other toolkits (PAPI, Prophesy)
• Future: Research + Commercialization
  – Management Infrastructure for Energy Reduction
  – MiserWare, Inc.
  – Holistic power management

1882 - 2001

Prehistory: 1882 - 2001
• Embedded systems
• General-purpose microarchitecture
  – Circa 1999, power becomes a disruptive technology
  – Moore's Law + clock-frequency arms race
  – Simulators emerge (e.g., Princeton's Wattch)
  – Related work continues today (CMPs, SMT, etc.)

2002

Server Power 2002
• IBM Austin
  – Energy-aware commercial servers [Keller et al.]
• LANL
  – Green Destiny [Feng et al.]
• Observations
  – IBM targets commercial apps
  – Feng et al. achieve power savings in exchange for performance loss

HPC Power 2002
• My observations
  – Power will become disruptive to HPC
  – Laptops outselling PCs
  – Commercial power-aware techniques not appropriate for HPC
• At roughly $800,000 per year per megawatt, annual energy costs scale with power draw:
  – TM CM-5: 0.005 MW – $4,000/yr
  – Residential A/C: 0.015 MW – $12,000/yr
  – Intel ASCI Red: 0.850 MW – $680,000/yr
  – High-speed train: 10 MW – $8 million/yr
  – Earth Simulator: 12 MW – $9.6 million/yr
  – Conventional power plant: 300 MW (generated)

HPPAC Emerges 2002
• SCAPE Project
  – High-performance, power-aware computing
  – Two initial goals
    • Measurement tools
    • Power/energy savings
  – Big goals… no funding (risk all startup funds)

2003 - 2004

Cluster Power 2003 - 2004
• IBM Austin
  – "On evaluating request-distribution schemes for saving energy in server clusters," ISPASS '03 [Lefurgy et al.]
  – "Improving Server Performance on Transaction Processing Workloads by Enhanced Data Placement," SBAC-PAD '04 [Rubio et al.]
• Rutgers
  – "Energy conservation techniques for disk array-based servers," ICS '04 [Bianchini et al.]
• SCAPE
  – "High-performance, power-aware computing," SC '04
  – Power measurement + power/energy savings

PowerPack Measurement, 2003 - 2004
Scalable, synchronized, and accurate hardware power/energy profiling with software power/energy control.
[Framework diagram: multi-meters tap AC power at the wall outlet (via Baytech power strips and a Baytech management unit) and DC power at the power-supply rails of each node in the high-performance power-aware cluster. Per-node meter (MM) threads and a multi-meter control thread collect and log power/energy profiling data into a data repository for analysis; the PowerPack libraries (profile/control) drive applications and microbenchmarks and provide DVS control through a DVS control thread.]
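
To make the measurement path above concrete, here is a minimal sketch of a per-node meter-sampling thread in the spirit of the "MM thread" in the diagram. It is not the actual PowerPack code: read_meter_watts() is a hypothetical stand-in for whatever call reads one sample from the multimeter on this node's DC rail, and the synchronized, cluster-wide logging is omitted.

    /* Minimal sketch of a PowerPack-style meter-sampling thread (hypothetical API). */
    #include <pthread.h>
    #include <stdio.h>
    #include <sys/time.h>
    #include <unistd.h>

    /* Stub standing in for the real multimeter read; returns a fake reading in watts. */
    static double read_meter_watts(void) { return 250.0; }

    static volatile int sampling = 1;   /* cleared to stop the sampling loop */

    static void *mm_thread(void *arg)
    {
        FILE *log = (FILE *)arg;
        while (sampling) {
            struct timeval tv;
            gettimeofday(&tv, NULL);    /* timestamp so samples can be correlated later */
            double watts = read_meter_watts();
            fprintf(log, "%ld.%06ld %.2f\n", (long)tv.tv_sec, (long)tv.tv_usec, watts);
            usleep(10000);              /* roughly 100 samples per second */
        }
        return NULL;
    }

    int main(void)
    {
        FILE *log = fopen("node_power.log", "w");
        if (!log) return 1;

        pthread_t tid;
        pthread_create(&tid, NULL, mm_thread, log);

        sleep(5);                       /* ...run the profiled application here... */

        sampling = 0;                   /* play the "control thread" role: stop sampling */
        pthread_join(tid, NULL);
        fclose(log);
        return 0;
    }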

After frying multiple components…

PowerPack Framework (DC Power Profiling)

    if (node .eq. root) then
        call pmeter_init(xmhost, xmport)
        call pmeter_log(pmlog, NEW_LOG)
    endif

    if (node .eq. root) then
        call pmeter_start_session(pm_label)
    endif

    if (node .eq. root) then
        call pmeter_pause()
        call pmeter_log(pmlog, CLOSE_LOG)
        call pmeter_finalize()
    endif

Multi-meters + 32-node Beowulf

Power Profiles – Single Node
• The CPU is typically the largest consumer of power (under load).

Power Profiles – Single Node
Power consumption for various workloads: memory-bound, CPU-bound, network-bound, disk-bound.

NAS PB FT – Performance Profiling
Phases: compute, reduce (comm), compute, all-to-all (comm). About 50% of the time is spent in communication.

Power profiles reflect performance profiles.

One FFT Iteration

2005 - Present

Intuition confirmed, 2005 - Present

HPPAC Tool Progress, 2005 - Present
• PowerPack
  – Modularized PowerPack and SysteMISER
  – Extended analytics for applicability
  – Extended to support thermals
• SysteMISER
  – Improved analytics to weigh tradeoffs at runtime
  – Automated cluster-wide DVS scheduling
  – Support for automated power-aware memory

Predicting CPU Power, 2005 - Present

Predicting Memory Power, 2005 - Present

Correlating Thermals (BT), 2005 - Present

Correlating Thermals (MG), 2005 - Present

Tempest Results (FT), 2005 - Present

SysteMISER, 2005 - Present
• Our software approach to reduce energy
  – Management Infrastructure for Energy Reduction
• Power/performance
  – measurement
  – prediction
  – control
[Image: the Heat Miser.]

Power-Aware DVS Scheduling Strategies, 2005 - Present

CPUSPEED daemon:
    [example]$ start_cpuspeed
    [example]$ mpirun -np 16 ft.B.16

Internal scheduling:
    MPI_Init();
    setspeed(600);
    setspeed(1400);
    MPI_Finalize();

External scheduling:
    [example]$ psetcpuspeed 600
    [example]$ mpirun -np 16 ft.B.16

NEMO & PowerPack framework for saving energy.
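
As a concrete illustration of the "internal scheduling" strategy above, the sketch below drops the CPU frequency around a communication phase and restores it afterward. It is only a sketch: set_speed_mhz() is a hypothetical stand-in for the slide's setspeed(), and the 600/1400 MHz values simply mirror the slide.

    /* Sketch of internal DVS scheduling around an MPI communication phase. */
    #include <mpi.h>
    #include <stdio.h>

    /* Hypothetical stand-in for setspeed(); a real version would change the CPU P-state. */
    static void set_speed_mhz(int mhz)
    {
        printf("requesting CPU frequency: %d MHz\n", mhz);
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        double local = 1.0, sum = 0.0;

        set_speed_mhz(1400);            /* compute phase: run at top frequency */
        for (int i = 0; i < 1000000; i++)
            local += 1e-6;

        set_speed_mhz(600);             /* communication phase: the CPU mostly waits on
                                           the network, so a lower frequency saves energy */
        MPI_Allreduce(&local, &sum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

        set_speed_mhz(1400);            /* back to compute: restore top frequency */

        MPI_Finalize();
        return 0;
    }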

CPU MISER Scheduling (FT), 2005 - Present
[Bar chart: normalized energy and delay with CPU MISER for FT.C.8, comparing the "auto" setting, fixed frequencies from 600 to 1400 MHz, and CPU MISER.]
36% energy savings, less than 1% performance loss. See the SC 2004 and SC 2005 publications.
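
For reference, the "normalized" metrics in the chart are presumably the usual ratios against a run at the default top frequency; under that assumption, the headline numbers translate to:

    normalized energy = E(CPU MISER) / E(baseline) ≈ 0.64   (36% energy savings)
    normalized delay  = T(CPU MISER) / T(baseline) ≤ 1.01   (< 1% performance loss)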

Where Else Can We Save Energy? 2005 - Present
• Processor
  – DVS: where everyone starts
• NIC
  – Very small portion of system power
• Disk
  – A good choice (our future work)
• Power supply
  – A very good choice (for an EE or ME)
• Memory
  – Only 20-30% of system power, but…

The Power of Memory, 2005 - Present

Memory Management Policies, 2005 - Present
Policies compared: Default, Static, Dynamic, Memory MISER.
Memory MISER = page allocation shaping + allocation prediction + dynamic control.

Memory MISER: Evaluation of Prediction and Control, 2005 - Present
Prediction/control looks good, but are we guaranteeing performance?

Memory MISER: Evaluation of Prediction and Control, 2005 - Present
Stable, accurate prediction using a PID controller. But what about big (capacity) spikes?
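
The slide credits the stable prediction to a PID controller. The sketch below is a hypothetical, much-simplified version of that idea, not the actual Memory MISER code: a PID loop tracks demanded pages and chooses how many memory devices to keep online, rounding up and keeping a spare device as a guard margin so capacity spikes can be absorbed.

    /* Hypothetical PID-style predictor for power-aware memory control. */
    #include <stdio.h>

    #define PAGES_PER_DEVICE 262144   /* assumed: 1 GB device, 4 KB pages */
    #define NUM_DEVICES      8

    typedef struct {
        double kp, ki, kd;            /* controller gains (values below are assumed) */
        double integral;
        double prev_error;
    } pidctl_t;

    /* One controller step: error = demanded pages - currently provisioned pages. */
    static double pid_step(pidctl_t *c, double error)
    {
        c->integral += error;
        double derivative = error - c->prev_error;
        c->prev_error = error;
        return c->kp * error + c->ki * c->integral + c->kd * derivative;
    }

    int main(void)
    {
        pidctl_t c = { .kp = 0.5, .ki = 0.05, .kd = 0.1 };
        double provisioned = 2.0 * PAGES_PER_DEVICE;   /* start with two devices online */

        /* Fake demand trace (pages) with a capacity spike in the middle. */
        double demand[] = { 300000, 320000, 350000, 900000, 910000, 400000, 380000 };

        for (int t = 0; t < (int)(sizeof demand / sizeof demand[0]); t++) {
            provisioned += pid_step(&c, demand[t] - provisioned);
            if (provisioned < 0) provisioned = 0;

            /* Round up to whole devices and keep one spare as a guard margin. */
            int devices = (int)(provisioned / PAGES_PER_DEVICE) + 2;
            if (devices > NUM_DEVICES) devices = NUM_DEVICES;
            if (devices < 1) devices = 1;

            printf("t=%d demand=%.0f pages -> keep %d device(s) online\n",
                   t, demand[t], devices);
        }
        return 0;
    }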

Memory MISER: Evaluation of Prediction and Control, 2005 - Present
Memory MISER guarantees performance even under "worst" conditions.

Memory MISER Evaluation: Energy Reduction, 2005 - Present
30% total system energy savings, less than 1% performance loss.

Present - 2012

System G Supercomputer @ VT

System G Stats
• 325 Mac Pro compute nodes, each with two 4-core 2.8 GHz Intel Xeon processors.
• Each node has 8 GB of RAM; each core has 6 MB of cache.
• Mellanox 40 Gb/s end-to-end InfiniBand adapters and switches.
• LINPACK result: 22.8 TFLOPS (trillion floating-point operations per second).
• Over 10,000 power and thermal sensors.
• Variable power modes: DVFS control (2.4 and 2.8 GHz), fan-speed control, concurrency throttling, etc. (check /sys/devices/system/cpu/cpuX/cpufreq/scaling_available_frequencies).
• Intelligent power distribution units: Raritan Dominion PX (remotely control the servers and network devices; also monitor current, voltage, power, and temperature through Raritan's KVM switches and secure console servers).
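
As a small illustration of the DVFS interface mentioned above, the sketch below simply reads the standard Linux cpufreq sysfs files for core 0. It assumes nothing beyond the stock cpufreq interface; actually changing the frequency requires root privileges and (on kernels of that era) the userspace governor.

    /* Sketch: query the Linux cpufreq sysfs interface for CPU 0. */
    #include <stdio.h>

    static void print_file(const char *label, const char *path)
    {
        char buf[256];
        FILE *f = fopen(path, "r");
        if (!f) {
            printf("%s: not available (%s)\n", label, path);
            return;
        }
        if (fgets(buf, sizeof buf, f))
            printf("%s: %s", label, buf);   /* sysfs lines already end with '\n' */
        fclose(f);
    }

    int main(void)
    {
        print_file("available frequencies",
                   "/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies");
        print_file("current governor",
                   "/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor");
        print_file("current frequency (kHz)",
                   "/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq");
        /* To change the frequency (as root, with the userspace governor), values are in kHz:
         *   echo 2400000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed */
        return 0;
    }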

Deployment Details
* 13 racks total, 24 nodes per rack, 8 nodes per layer.
* 5 PDUs per rack (Raritan PDU model DPCS12-20). Each PDU in System G has a unique IP address, and users can use IPMI to access and retrieve information from the PDUs and to control them, e.g., remotely shutting down and restarting machines, recording system AC power, etc.
* Two types of switch:
  1) Ethernet switch: 1 Gb/s Ethernet; 36 nodes share one Ethernet switch.
  2) InfiniBand switch: 40 Gb/s InfiniBand; 24 nodes (one rack) share one IB switch.

Data Collection System and LabVIEW
Sample diagram and corresponding front panel from LabVIEW.

A Power Profile for the HPCC Benchmark Suite

Published Papers and Useful Links
Papers:
1. Rong Ge, Xizhou Feng, Shuaiwen Song, Hung-Ching Chang, Dong Li, Kirk W. Cameron, "PowerPack: Energy Profiling and Analysis of High-Performance Systems and Applications," IEEE Transactions on Parallel and Distributed Systems, Apr. 2009.
2. Shuaiwen Song, Rong Ge, Xizhou Feng, Kirk W. Cameron, "Energy Profiling and Analysis of the HPC Challenge Benchmarks," The International Journal of High Performance Computing Applications, Vol. 23, No. 3, pp. 265-276, 2009.
NI system set details:
http://sine.ni.com/nips/cds/view/p/lang/en/nid/202545
http://sine.ni.com/nips/cds/view/p/lang/en/nid/202571

The Future… Present - 2012
• PowerPack
  – Streaming sensor data from any source
• PAPI integration
  – Correlated to various systems and applications
• Prophesy integration
  – Analytics to provide a unified interface
• SysteMISER
  – Study effects of power-aware disks and NICs
  – Study effects of emergent architectures (CMT, SMT, etc.)
  – Co-schedule power modes for energy savings

Outreach
• See http://green500.org
• See http://thegreengrid.org
• See http://www.spec.org/specpower/
• See http://hppac.cs.vt.edu

Acknowledgements
• My SCAPE Team
  – Dr. Xizhou Feng (Ph.D. 2006)
  – Dr. Rong Ge (Ph.D. 2008)
  – Dr. Matt Tolentino (Ph.D. 2009)
  – Mr. Dong Li (Ph.D. student, expected 2010)
  – Mr. Shuaiwen Song (Ph.D. student, expected 2010)
  – Mr. Chun-Yi Su, Mr. Hung-Ching Chang
• Funding Sources
  – National Science Foundation (CISE: CCF, CNS)
  – Department of Energy (SC)
  – Intel

Thank you very much.
http://scape.cs.vt.edu
cameron@cs.vt.edu
Thanks to our sponsors: NSF (CAREER, CCF, CNS), DOE (SC), Intel.