Slide 1 Managing Energy and Server Resources in Hosting Centers Jeff Chase, Darrell Anderson, Ron Doyle, Prachi Thakar, Amin Vahdat Duke University

Slide 2 Back to the Future
• Return to server-centered computing: applications run as services accessed through the Internet.
  – Web-based services, ASPs, “netsourcing”
• Internet services are hosted on server clusters.
  – Incrementally scalable, etc.
• Server clusters may be managed by a third party.
  – Shared data center or hosting center
  – Hosting utility offers economies of scale:
    • Network access
    • Power and cooling
    • Administration and security
    • Surge capacity

Slide 3 Managing Energy and Server Resources
• Key idea: a hosting center OS maintains the balance of requests and responses, energy inputs, and thermal outputs.
  1. Adaptively provision server resources to match request load.
  2. Provision server resources for energy efficiency.
  3. Degrade service on power/cooling failures.
     – Power/cooling “browndown”
     – Dynamic thermal management [Brooks]
[Figure labels: requests; responses; energy (US in 2003: 22 TWh, $1 B - $2 B+); waste heat]

Slide 4 Contributions
• Architecture/prototype for adaptive provisioning of server resources in Internet server clusters (Muse)
  – Software feedback
  – Reconfigurable request redirection
  – Addresses a key challenge for hosting automation
• Foundation for energy management in hosting centers
  – 25% - 75% energy savings
  – Degrade rationally (“gracefully”) under constraint (e.g., browndown)
• Simple “economic” resource allocation
  – Continuous utility functions: customers “pay” for performance.
  – Balance service quality and resource usage.

Slide 5 Static Provisioning
• Dedicate fixed resources per customer
• Typical of “co-lo” or dedicated hosting
• Reprovision manually as needed
• Overprovision for surges
  – High variable cost of capacity
How to automate resource provisioning for managed hosting?

Slide 6 Load Is Dynamic
[Figure: request load over time; axis labels: M T W Th F S S, weeks 6-8]
ibm.com external site
• February 2001
• Daily fluctuations (3x)
• Workday cycle
• Weekends off
World Cup soccer site
• May-June 1998
• Seasonal fluctuations
• Event surges (11x)
• ita.ee.lbl.gov

Slide 7 Adaptive Provisioning
- Efficient resource usage
- Load multiplexing
- Surge protection
- Online capacity planning
- Dynamic resource recruitment
- Balance service quality with cost
- Service Level Agreements (SLAs)

Slide 8 Utilization Targets
μ_i = allocated server resource for service i
ρ_i = utilization of μ_i at i’s current load
ρ_target = configurable target level for ρ_i
Leave headroom for load spikes.
ρ_i > ρ_target: service i is underprovisioned
ρ_i < ρ_target: service i is overprovisioned
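The comparison of a service's utilization against its target can be sketched in a few lines of Python. This is only an illustration of the rule on the slide, not code from the Muse prototype; the function name and the optional `tolerance` band are assumptions added here.

```python
def provisioning_state(rho: float, rho_target: float, tolerance: float = 0.0) -> str:
    """Classify a service by comparing its utilization (rho) to the target.

    `tolerance` is a hypothetical dead band around the target; with the
    default of 0.0 the function reproduces the slide's two inequalities.
    """
    if rho > rho_target + tolerance:
        return "underprovisioned"   # too little headroom for load spikes
    if rho < rho_target - tolerance:
        return "overprovisioned"    # allotment could be reduced
    return "on target"


# Example: with a target of 0.5, a service at 80% utilization needs more resources.
print(provisioning_state(0.8, 0.5))
```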

Slide 9 Muse Architecture
[Figure labels: Executive; configuration commands; control; performance measures; offered request load; reconfigurable switches; server pool (stateless, interchangeable); storage tier]
Executive controls mapping of service traffic to server resources by means of:
• reconfigurable switches
• scheduler controls (shares)

Slide 10 Server Power Draw
866 MHz P-III SuperMicro 370-DER (FreeBSD)
Brand Electronics 21-1850 digital power meter
• boot: 136 W
• CPU max: 120 W
• CPU idle: 93 W
• disk spin: 6-10 W
• off/hibernate: 2-3 W
Idling consumes 60% to 70% of peak power demand.
[Figure axes: watts vs. work]
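Because an idle server draws most of its peak power, packing the same work onto fewer servers saves energy. The sketch below illustrates that arithmetic with the slide's measurements (93 W idle, 120 W at CPU max, about 3 W when off or hibernating). The linear interpolation between idle and maximum draw, and all function names, are assumptions made for illustration; the slide does not state a power model.

```python
# Measured values from the slide, in watts.
IDLE_W = 93.0
CPU_MAX_W = 120.0
HIBERNATE_W = 3.0  # upper end of the off/hibernate range


def server_power(utilization: float) -> float:
    """Estimated draw of one active server (assumed linear in CPU utilization)."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be in [0, 1]")
    return IDLE_W + utilization * (CPU_MAX_W - IDLE_W)


def cluster_power(active_utilizations, n_hibernating: int = 0) -> float:
    """Total draw of the active servers plus any hibernating ones."""
    active = sum(server_power(u) for u in active_utilizations)
    return active + n_hibernating * HIBERNATE_W


# The same total work (one server's worth of CPU) placed two ways:
spread = cluster_power([0.2] * 5)                    # five servers at 20%
packed = cluster_power([0.5] * 2, n_hibernating=3)   # two at 50%, three asleep
savings = 1.0 - packed / spread
print(f"spread={spread:.0f} W packed={packed:.0f} W savings={savings:.0%}")
```

Under these illustrative numbers the packed placement draws about 222 W against about 492 W when the load is spread, which is why idle power dominates the comparison.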

Slide 11 Energy vs. Service Quality
[Figure: two panels showing servers A, B, C, D]
Active set = {A, B, C, D}: ρ_i < ρ_target
• Low latency
Active set = {A, B}: ρ_i = ρ_target
• Meets quality goals
• Saves energy

Slide 12 Energy-Conscious Provisioning
• Light load: concentrate traffic on a minimal set of servers.
  – Step down surplus servers to a low-power state.
    • APM and ACPI
  – Activate surplus servers on demand.
    • Wake-On-LAN
• Browndown: can provision for a specified energy target.
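One way to read "a minimal set of servers" is the smallest number of servers that keeps utilization at or below the target. The following Python sketch computes that active set under simplifying assumptions that are not in the slides: identical servers, a single per-server capacity figure in requests per second, and hypothetical function names. The actual step-down and wake-up mechanisms (APM/ACPI, Wake-On-LAN) are outside its scope.

```python
import math


def servers_needed(load: float, per_server_capacity: float, rho_target: float) -> int:
    """Smallest server count that keeps utilization at or below rho_target."""
    if load < 0 or per_server_capacity <= 0 or not 0 < rho_target <= 1:
        raise ValueError("invalid arguments")
    return math.ceil(load / (per_server_capacity * rho_target))


def plan_active_set(servers, load, per_server_capacity, rho_target, min_active=1):
    """Split the pool into (active, low_power) lists for the given load."""
    n = max(min_active, servers_needed(load, per_server_capacity, rho_target))
    n = min(n, len(servers))  # cannot activate more servers than exist
    return servers[:n], servers[n:]


# Example: four servers of 100 requests/s each, target utilization 0.5.
active, low_power = plan_active_set(["A", "B", "C", "D"], load=90,
                                    per_server_capacity=100, rho_target=0.5)
print(active, low_power)
```

Capping the count at the pool size means an overload simply activates every server; a fuller treatment would degrade service explicitly in that case.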

Slide 13 Resource Economy
• Input: the “value” of performance for each customer i.
  – Common unit of value: “money”.
  – Derives from the economic value of the service.
  – Enables SLAs to represent flexible quality vs. cost tradeoffs.
• Per-customer utility function U_i = bid - penalty.
  – Bid for traffic volume (throughput λ_i).
  – Bid for better service quality, or subtract penalty for poor quality.
• Allocate resources to maximize expected global utility (“revenue” or reward).
  – Predict performance effects.
  – “Sell” to the highest bidder.
  – Never sell resources below cost.
Maximize Σ_i bid_i(λ_i(t, μ_i))
Subject to Σ_i μ_i ≤ μ_max

Slide 14 Maximizing Revenue
• Consider any customer i with allotment μ_i at fixed time t.
  – The marginal utility (price_i) for a resource unit allotted to or reclaimed from i is the gradient of U_i at μ_i.
Adjust allotments until price equilibrium is reached.
The algorithm assumes that U_i is “concave”: the price gradients are non-negative and monotonically non-increasing.
[Figure: expected utility U_i(t, μ_i) vs. resource allotment μ_i, with price_i as the gradient]
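The idea of selling each resource unit to the highest marginal bidder can be illustrated with a small greedy allocator. This is a simplified sketch, not the algorithm used in Muse: it assumes discrete resource units, utility functions that are concave in the allotment, and a flat per-unit cost. The function name `allocate`, the `unit_cost` parameter, and the two example utility functions are all hypothetical.

```python
import heapq
import math
from typing import Callable, Dict


def allocate(utilities: Dict[str, Callable[[int], float]],
             total_units: int, unit_cost: float = 0.0) -> Dict[str, int]:
    """Greedily hand out resource units to the highest marginal bidder.

    Assumes each utility is concave in the allotment, so the marginal
    utility ("price") of a customer's next unit never increases.
    """
    allot = {name: 0 for name in utilities}

    def price(name: str) -> float:
        u = utilities[name]
        return u(allot[name] + 1) - u(allot[name])

    # Max-heap (via negated keys) on the price of each customer's next unit.
    heap = [(-price(name), name) for name in utilities]
    heapq.heapify(heap)
    for _ in range(total_units):
        if not heap:
            break
        neg_price, name = heapq.heappop(heap)
        if -neg_price <= unit_cost:
            break  # the best remaining bid no longer exceeds cost
        allot[name] += 1
        heapq.heappush(heap, (-price(name), name))
    return allot


# Two illustrative customers: "a" pays 10 per unit up to 3 units,
# "b" has smoothly diminishing returns.
utils = {
    "a": lambda m: 10 * min(m, 3),
    "b": lambda m: 4 * math.sqrt(m),
}
print(allocate(utils, total_units=6, unit_cost=1.0))
```

When the loop stops, every customer's next-unit price is at or below the price of the last unit sold, which is the discrete analogue of the price equilibrium described on the slide.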

Slide 15 Feedback and Stability
• Allocation planning is incremental.
  – Adjust the solution from the previous interval to react to new observations.
• Allow system to stabilize before next re-evaluation.
  – Set adjustment interval and magnitude to avoid oscillation.
  – Control theory applies. [Abdelzaher, Shin et al, 2001]
• Filter the load observations to distinguish transient and persistent load changes.
  – Internet service workloads are extremely bursty.
  – Filter must “balance stability and agility” [Kim and Noble 2001].

Slide 16 “Flop-Flip” Filter
• EWMA-based filter alone is not sufficient.
  – Average A_t for each interval t: A_t = α A_{t-1} + (1 - α) O_t
  – The gain α may be variable or flip-flop.
• Load estimate E_t = E_{t-1} if |E_{t-1} - A_t| < tolerance, else E_t = A_t
• Stable
• Responsive
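The filter described on this slide can be sketched directly from its two rules: maintain an exponentially weighted moving average of the observations, and hold the previous load estimate until it drifts outside a tolerance of that average. The Python below is a minimal reading of those rules; the function name, the default gain and tolerance, and the choice to seed the average with the first observation are assumptions.

```python
def flop_flip(observations, alpha=0.5, tolerance=10.0):
    """Return the load estimate after each observation.

    average  : EWMA of the observations, alpha*previous + (1-alpha)*observation
    estimate : held at its previous value while it stays within `tolerance`
               of the average, otherwise switched to the current average
    """
    estimates = []
    average = None
    estimate = None
    for obs in observations:
        # Seeding with the first observation is an assumption of this sketch.
        average = obs if average is None else alpha * average + (1 - alpha) * obs
        if estimate is None or abs(estimate - average) >= tolerance:
            estimate = average  # switch the estimate to the moving average
        estimates.append(estimate)
    return estimates


# Small jitter around 100 is ignored; a persistent jump to 200 is tracked.
trace = [100, 102, 98, 101, 200, 200, 200]
print(flop_flip(trace))
```

In this trace the estimate stays at 100 through the jitter (stable) and then follows the average upward once the load shifts (responsive).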

Slide 17 IBM Trace Run (Before)
[Figure: power draw (watts), latency (ms*50), and throughput (requests/s) over the trace run; annotation: 1 ms]

Slide 18 IBM Trace Run (After)
[Figure annotation: 1 ms]

Slide 19 Evaluating Energy Savings
Trace replay shows adaptive provisioning in action.
Server energy savings in this experiment were 29%.
  – 5-node cluster, 3x load swings, ρ_target = 0.5
  – Expect roughly comparable savings in cooling costs.
    • Ventilation costs are fixed; chiller costs are proportional to thermal loading.
For a given “shape” of load curve, achievable energy savings increase with cluster size.
    • E.g., higher request volumes,
    • or lower ρ_target for better service quality.
  – Larger clusters give finer granularity to closely match load.

Slide 20 Expected Resource Savings

Slide 21 Conclusions
• Dynamic request redirection enables fine-grained, continuous control over mapping of workload to physical server resources in hosting centers.
• Continuous monitoring and control allows a hosting center OS to provision resources adaptively.
• Adaptive resource provisioning is central to energy and thermal management in data centers.
  – Adapt to energy “browndown” by degrading service quality.
  – Adapt to load swings for 25% - 75% energy savings.
• Economic policy framework guides provisioning choices based on SLAs and cost/benefit tradeoffs.

Slide 22 Future Work
• multiple resources (e.g., memory and storage)
• multi-tier services and multiple server pools
• reservations and latency QoS penalties
• rational server allocation and request distribution
• integration with thermal system in data center
• flexibility and power of utility functions
• server networks and overlays
• performability and availability SLAs
• application feedback

Slide 23 Muse Prototype and Testbed
[Figure labels: SURGE or trace load generators; client cluster; Extreme GigE switch; redirectors (PowerEdge 1550); LinkSys 100 Mb/s switch; server pool; power meter; Executive]
• faithful trace replay + synthetic Web loads
• server CPU-bound
• FreeBSD-based redirectors
• resource containers
• APM and Wake-on-LAN

Slide 24 Throughput and Latency
saturated: ρ_i > ρ_target
  λ_i increases linearly with μ_i
overprovisioned: ρ_i < ρ_target
  may reclaim: μ_i(ρ_target - ρ_i)
Average per-request service demand: μ_i ρ_i / λ_i
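The two quantities on this slide, the resource that may be reclaimed from an overprovisioned service and the average per-request service demand, are simple enough to check numerically. The sketch below writes them as Python functions with allotment `mu`, utilization `rho`, and throughput `lam`; the function names, and the clamping of the reclaimable amount at zero for saturated services, are assumptions added for illustration.

```python
def reclaimable(mu: float, rho: float, rho_target: float) -> float:
    """Resource that may be reclaimed: mu * (rho_target - rho), never negative."""
    return max(0.0, mu * (rho_target - rho))


def service_demand(mu: float, rho: float, lam: float) -> float:
    """Average resource consumed per request: mu * rho / lam."""
    if lam <= 0:
        raise ValueError("throughput must be positive")
    return mu * rho / lam


# A service allotted 100 units, running at 25% utilization with a 0.5 target,
# can give back 25 units and still sit at or below the target.
print(reclaimable(100, 0.25, 0.5))
# At 50% utilization and 200 requests/s, each request uses 0.25 units.
print(service_demand(100, 0.5, 200))
```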

Slide 25 An OS for a Hosting Center
• Hosting centers are made up of heterogeneous components linked by a network fabric.
  – Components are specialized.
  – Each component has its own OS.
• The role of a hosting center OS is to:
  – Manage shared resources (e.g., servers, energy)
  – Configure and monitor component interactions
  – Direct flow of request/response traffic

Slide 26 Allocation Under Constraint (0)

Slide 27 Allocation Under Constraint (1)

Slide 28 Outline
• Adaptive server provisioning
• Energy-conscious provisioning
• Economic resource allocation
• Stable load estimation
• Experimental results