
  • Number of slides: 28

Slide 1: Managing Energy and Server Resources in Hosting Centers
Jeff Chase, Darrell Anderson, Ron Doyle, Prachi Thakar, Amin Vahdat
Duke University

Slide 2: Back to the Future
• Return to server-centered computing: applications run as services accessed through the Internet.
  – Web-based services, ASPs, “netsourcing”
• Internet services are hosted on server clusters.
  – Incrementally scalable, etc.
• Server clusters may be managed by a third party.
  – Shared data center or hosting center
  – Hosting utility offers economies of scale:
    • Network access
    • Power and cooling
    • Administration and security
    • Surge capacity

Slide 3: Managing Energy and Server Resources
• Key idea: a hosting center OS maintains the balance of requests and responses, energy inputs, and thermal outputs.
  1. Adaptively provision server resources to match request load.
  2. Provision server resources for energy efficiency.
  3. Degrade service on power/cooling failures.
• Energy: US in 2003: 22 TWh ($1B to $2B+)
• Power/cooling “browndown”
• Dynamic thermal management [Brooks]
[Figure labels: requests, responses, energy, waste heat]

Slide 4: Contributions
• Architecture/prototype for adaptive provisioning of server resources in Internet server clusters (Muse)
  – Software feedback
  – Reconfigurable request redirection
  – Addresses a key challenge for hosting automation
• Foundation for energy management in hosting centers
  – 25% to 75% energy savings
  – Degrade rationally (“gracefully”) under constraint (e.g., browndown)
• Simple “economic” resource allocation
  – Continuous utility functions: customers “pay” for performance.
  – Balance service quality and resource usage.

Slide 5: Static Provisioning
• Dedicate fixed resources per customer
• Typical of “co-lo” or dedicated hosting
• Reprovision manually as needed
• Overprovision for surges
  – High variable cost of capacity
How to automate resource provisioning for managed hosting?

Slide 6: Load Is Dynamic
ibm.com external site
  • February 2001
  • Daily fluctuations (3x)
  • Workday cycle
  • Weekends off
World Cup soccer site
  • May-June 1998
  • Seasonal fluctuations
  • Event surges (11x)
  • ita.ee.lbl.gov
[Figure: load plots for the two sites; axis labels: M T W Th F S S and Week 6, 7, 8]

Slide 7: Adaptive Provisioning
- Efficient resource usage
- Load multiplexing
- Surge protection
- Online capacity planning
- Dynamic resource recruitment
- Balance service quality with cost
- Service Level Agreements (SLAs)

Slide 8: Utilization Targets
$\mu_i$ = allocated server resource for service i
$\rho_i$ = utilization of $\mu_i$ at i’s current load $\lambda_i$
$\rho_{target}$ = configurable target level for $\rho_i$
Leave headroom for load spikes.
$\rho_i > \rho_{target}$: service i is underprovisioned
$\rho_i < \rho_{target}$: service i is overprovisioned
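
The utilization-target rule above can be sketched in a few lines. This is a minimal illustration, not the Muse implementation: the function name `classify` and the default target of 0.5 are assumptions chosen for the example.

```python
def classify(rho_i, rho_target=0.5):
    """Compare a service's utilization rho_i with the configured target.

    rho_i and rho_target are fractions in [0, 1]. The name `classify`
    and the default target are hypothetical, chosen for illustration.
    """
    if rho_i > rho_target:
        return "underprovisioned"   # too little headroom for load spikes
    if rho_i < rho_target:
        return "overprovisioned"    # some of the allotment could be reclaimed
    return "on target"


for rho in (0.8, 0.5, 0.2):
    print(rho, classify(rho))
```

A real controller would probably add a tolerance band around the target so that small fluctuations do not trigger reallocation.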

Slide 9: Muse Architecture
Executive controls mapping of service traffic to server resources by means of:
  • reconfigurable switches
  • scheduler controls (shares)
[Figure labels: Executive; configuration commands; control; performance measures; offered request load; reconfigurable switches; server pool (stateless, interchangeable); storage tier]

Slide 10: Server Power Draw
866 MHz P-III SuperMicro 370-DER (FreeBSD)
Brand Electronics 21-1850 digital power meter
  boot            136 W
  CPU max         120 W
  CPU idle         93 W
  disk spin      6-10 W
  off/hibernate   2-3 W
Idling consumes 60% to 70% of peak power demand.
[Figure axes: watts, work]

Slide 11: Energy vs. Service Quality
Active set = {A, B, C, D}: $\rho_i < \rho_{target}$
  • Low latency
Active set = {A, B}: $\rho_i = \rho_{target}$
  • Meets quality goals
  • Saves energy

Slide 12: Energy-Conscious Provisioning
• Light load: concentrate traffic on a minimal set of servers.
  – Step down surplus servers to a low-power state.
    • APM and ACPI
  – Activate surplus servers on demand.
    • Wake-On-LAN
• Browndown: can provision for a specified energy target.
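
One way to picture “a minimal set of servers” is to size the active set so that each active server runs at the target utilization. The sketch below assumes identical servers with a known capacity in requests/s; the function `active_set_size`, its parameters, and the rule of keeping at least one server awake are all assumptions for illustration, not details from the slides.

```python
import math


def active_set_size(load, per_server_capacity, rho_target, pool_size):
    """Smallest number of servers that serves `load` at or below rho_target.

    Assumes homogeneous servers (hypothetical simplification). Servers
    outside the active set would be stepped down to a low-power state.
    """
    if load <= 0:
        return 1  # assumption: keep one server awake to accept new requests
    needed = math.ceil(load / (per_server_capacity * rho_target))
    return max(1, min(pool_size, needed))  # cannot exceed the physical pool


# 100 requests/s on servers rated at 100 requests/s, target utilization 0.5
print(active_set_size(100, 100, 0.5, pool_size=5))
```

Under a browndown, the same calculation could be run with `pool_size` reduced to the number of servers the energy budget can power.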

Slide 13: Resource Economy
• Input: the “value” of performance for each customer i.
  – Common unit of value: “money”.
  – Derives from the economic value of the service.
  – Enables SLAs to represent flexible quality vs. cost tradeoffs.
• Per-customer utility function $U_i$ = bid – penalty.
  – Bid for traffic volume (throughput $\lambda_i$).
  – Bid for better service quality, or subtract penalty for poor quality.
• Allocate resources to maximize expected global utility (“revenue” or reward).
  – Predict performance effects.
  – “Sell” to the highest bidder.
  – Never sell resources below cost.
Maximize $\sum_i bid_i(\lambda_i(t, \mu_i))$
Subject to $\sum_i \mu_i \le \mu_{max}$

Slide 14: Maximizing Revenue
• Consider any customer i with allotment $\mu_i$ at fixed time t.
  – The marginal utility ($price_i$) for a resource unit allotted to or reclaimed from i is the gradient of $U_i$ at $\mu_i$.
Adjust allotments until price equilibrium is reached.
The algorithm assumes that $U_i$ is “concave”: the price gradients are non-negative and monotonically non-increasing.
[Figure axes: expected utility $U_i(t, \mu_i)$ vs. resource allotment $\mu_i$; $price_i$ marked on the curve]
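
The idea of adjusting allotments until prices equalize can be illustrated with a greedy allocator that hands out one resource unit at a time to whichever customer currently offers the highest marginal utility. This is a sketch under stated assumptions, not the Muse algorithm: the function `allocate`, the discrete resource units, and the square-root utility curves are all hypothetical. For concave utilities, this greedy rule maximizes total utility over discrete units.

```python
import heapq
import math


def allocate(utilities, total_units, unit_cost=0.0):
    """Greedy allocation by marginal utility ("price").

    utilities: dict mapping customer name -> concave function U(mu),
               where mu is an integer number of resource units.
    Stops when the best available price does not exceed unit_cost.
    """
    allot = {name: 0 for name in utilities}

    def price(name):
        mu = allot[name]
        return utilities[name](mu + 1) - utilities[name](mu)

    # Max-heap on price, implemented by negating the key.
    heap = [(-price(name), name) for name in utilities]
    heapq.heapify(heap)
    remaining = total_units
    while remaining > 0 and heap:
        neg_price, name = heapq.heappop(heap)
        if -neg_price <= unit_cost:
            break  # the highest bidder no longer pays more than cost
        allot[name] += 1
        remaining -= 1
        heapq.heappush(heap, (-price(name), name))
    return allot


# Hypothetical concave utilities: customer "a" values throughput more than "b".
utils = {"a": lambda mu: 10 * math.sqrt(mu), "b": lambda mu: 4 * math.sqrt(mu)}
print(allocate(utils, total_units=10))
```

Because each customer's price falls as its allotment grows, the loop ends with all customers' prices roughly equal, which is the discrete analogue of the price equilibrium described above.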

Slide 15: Feedback and Stability
• Allocation planning is incremental.
  – Adjust the solution from the previous interval to react to new observations.
• Allow system to stabilize before next re-evaluation.
  – Set adjustment interval and magnitude to avoid oscillation.
  – Control theory applies. [Abdelzaher, Shin et al, 2001]
• Filter the load observations to distinguish transient and persistent load changes.
  – Internet service workloads are extremely bursty.
  – Filter must “balance stability and agility” [Kim and Noble 2001].

Slide 16: “Flop-Flip” Filter
• EWMA-based filter alone is not sufficient.
  – Average $A_t$ for each interval t: $A_t = \alpha A_{t-1} + (1 - \alpha) O_t$
  – The gain $\alpha$ may be variable or flip-flop.
• Load estimate: $E_t = E_{t-1}$ if $|E_{t-1} - A_t| < tolerance$, else $E_t = A_t$
• Stable
• Responsive
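
The filter described above can be sketched as follows. This is a minimal illustration assuming a fixed gain `alpha` and an absolute tolerance; the function name `flop_flip` and the default parameter values are assumptions, not values from the slides.

```python
def flop_flip(observations, alpha=0.5, tolerance=10.0):
    """Return the load estimate E_t for each observation O_t.

    A_t is an exponentially weighted moving average of the observations.
    The estimate is held constant until it drifts at least `tolerance`
    away from A_t, then it jumps to A_t.
    """
    estimates = []
    avg = None   # A_t
    est = None   # E_t
    for obs in observations:
        avg = obs if avg is None else alpha * avg + (1 - alpha) * obs
        if est is None or abs(est - avg) >= tolerance:
            est = avg  # persistent change: switch to the moving average
        estimates.append(est)
    return estimates


# Small jitter around 100 leaves the estimate unchanged; a step to 200 moves it.
print(flop_flip([100, 102, 98, 101, 200]))
```

Holding the estimate steady gives stability against bursts, while the jump to the moving average keeps the filter responsive to persistent load changes.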

Slide 17: IBM Trace Run (Before)
[Figure: power draw (watts), latency (ms*50), and throughput (requests/s); annotation: 1 ms]

Slide 18: IBM Trace Run (After)
[Figure; annotation: 1 ms]

Slide 19: Evaluating Energy Savings
Trace replay shows adaptive provisioning in action.
Server energy savings in this experiment were 29%.
  – 5-node cluster, 3x load swings, $\rho_{target}$ = 0.5
  – Expect roughly comparable savings in cooling costs.
    • Ventilation costs are fixed; chiller costs are proportional to thermal loading.
For a given “shape” of load curve, achievable energy savings increase with cluster size.
    • E.g., higher request volumes,
    • or lower $\rho_{target}$ for better service quality.
  – Larger clusters give finer granularity to closely match load.

Slide 20: Expected Resource Savings

Slide 21: Conclusions
• Dynamic request redirection enables fine-grained, continuous control over mapping of workload to physical server resources in hosting centers.
• Continuous monitoring and control allows a hosting center OS to provision resources adaptively.
• Adaptive resource provisioning is central to energy and thermal management in data centers.
  – Adapt to energy “browndown” by degrading service quality.
  – Adapt to load swings for 25% to 75% energy savings.
• Economic policy framework guides provisioning choices based on SLAs and cost/benefit tradeoffs.

Slide 22: Future Work
• multiple resources (e.g., memory and storage)
• multi-tier services and multiple server pools
• reservations and latency QoS penalties
• rational server allocation and request distribution
• integration with thermal system in data center
• flexibility and power of utility functions
• server networks and overlays
• performability and availability SLAs
• application feedback

Slide 23: Muse Prototype and Testbed
[Figure labels: SURGE or trace load generators; client cluster; faithful trace replay + synthetic Web loads; Extreme GigE switch; Executive; LinkSys 100 Mb/s switch; redirectors (PowerEdge 1550); FreeBSD-based redirectors; server pool; server CPU-bound; resource containers; APM and Wake-on-LAN; power meter]

Slide 24: Throughput and Latency
saturated: $\rho_i > \rho_{target}$
  $\lambda_i$ increases linearly with $\mu_i$
overprovisioned: $\rho_i < \rho_{target}$
  may reclaim: $\mu_i(\rho_{target} - \rho_i)$
Average per-request service demand: $\mu_i \rho_i / \lambda_i$
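
The two quantities above are simple to compute. The sketch below is illustrative only: the function names `reclaimable` and `service_demand` and the example numbers are assumptions, and the clamp to zero for saturated services is an added convenience rather than something stated on the slide.

```python
def reclaimable(mu_i, rho_i, rho_target):
    """Resource that can be reclaimed from an overprovisioned service:
    mu_i * (rho_target - rho_i), clamped to zero when the service is
    at or above its utilization target."""
    return max(0.0, mu_i * (rho_target - rho_i))


def service_demand(mu_i, rho_i, lambda_i):
    """Average per-request service demand: mu_i * rho_i / lambda_i."""
    return mu_i * rho_i / lambda_i


# Hypothetical service: 100 resource units, 30% utilized, 60 requests/s.
print(reclaimable(100, 0.3, 0.5))
print(service_demand(100, 0.3, 60))
```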

Slide 25: An OS for a Hosting Center
• Hosting centers are made up of heterogeneous components linked by a network fabric.
  – Components are specialized.
  – Each component has its own OS.
• The role of a hosting center OS is to:
  – Manage shared resources (e.g., servers, energy)
  – Configure and monitor component interactions
  – Direct flow of request/response traffic

Slide 26: Allocation Under Constraint (0)

Slide 27: Allocation Under Constraint (1)

Slide 28: Outline
• Adaptive server provisioning
• Energy-conscious provisioning
• Economic resource allocation
• Stable load estimation
• Experimental results