

  • Number of slides: 52

An Engineering Approach to Performance
• Define Performance Metrics
• Measure Performance
• Analyze Results
• Develop Cost x Performance Alternatives:
  – prototype
  – modeling
• Assess Alternatives
• Implement Best Alternative

Business in the Internet Age (Business Week, June 22, 1998; numbers in $US billion)

Caution Signs Along the Road (Gross & Sager, Business Week, June 22, 1998, p. 166.) There will be jolts and delays along the way for electronic commerce: congestion is the most obvious challenge.

Electronic Commerce: online sales are soaring. “… IT and electronic commerce can be expected to drive economic growth for many years to come.” The Emerging Digital Economy, US Dept. of Commerce, 1998.

What are people saying about Web performance…
• “Tripod’s Web site is our business. If it’s not fast and reliable, there goes our business.” Don Zereski, Tripod’s vice-president of Technology (Internet World)
• “Computer shuts down Amazon.com book sales. The site went down at about 10 a.m. and stayed out of service until 10 p.m.” The Seattle Times, 01/08/98

What are people saying about Web performance…
• “Sites have been concentrating on the right content. Now, more of them -- especially e-commerce sites -- realize that performance is crucial in attracting and retaining online customers.” Gene Shklar, Keynote, The New York Times, 8/8/98

What are people saying about Web performance…
• “Capacity is King.” Mike Krupit, Vice President of Technology, CDnow, 06/01/98
• “Being able to manage hit storms on commerce sites requires more than just buying more plumbing.” Harry Fenik, vice president of technology, Zona Research, LANTimes, 6/22/98

Introduction

Objectives
Performance Analysis = Analysis + Computer System
Performance Analyst = Mathematician + Computer Systems Person

You Will Learn
• Specifying performance requirements
• Evaluating design alternatives
• Comparing two or more systems
• Determining the optimal value of a parameter (system tuning)
• Finding the performance bottleneck (bottleneck identification)

You Will Learn (cont’d)
• Characterizing the load on the system (workload characterization)
• Determining the number and size of components (capacity planning)
• Predicting the performance at future loads (forecasting)

Performance Analysis Objectives Involve:
• Procurement
• Improvement
• Capacity Planning
• Design

Performance Improvement Procedure (flowchart):
START → Understand System → Analyze Operations → Formulate Improvement Hypothesis → Analyze Cost Effectiveness → Test Specific Hypothesis → Implement Modification → Test Effectiveness of Modification → STOP if satisfactory; if the test is unsatisfactory or the hypothesis invalid, return to an earlier step; if no cost-effective hypothesis remains, STOP.

Basic Terms
• System: Any collection of hardware, software, and firmware
• Metrics: The criteria used to evaluate the performance of the system components
• Workloads: The requests made by the users of the system

Examples of Performance Indexes
• External Indexes: Turnaround Time, Response Time, Throughput, Capacity, Availability, Reliability

Examples of Performance Indexes
• Internal Indexes: CPU Utilization, Overlap of Activities, Multiprogramming Stretch Factor, Multiprogramming Level, Paging Rate, Reaction Time

Example I
What performance metrics should be used to compare the performance of the following systems?
1. Two disk drives?
2. Two transaction-processing systems?
3. Two packet-retransmission algorithms?

Example II
Which type of monitor (software or hardware) would be more suitable for measuring each of the following quantities?
1. Number of instructions executed by a processor?
2. Degree of multiprogramming on a timesharing system?
3. Response time of packets on a network?

Example III
The number of packets lost on two links was measured for four file sizes, as shown below:

File Size   Link A   Link B
1000        5        10
1200        7        3
1300        3        0
50          0        1

Which link is better?
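One way to see why this exercise is not trivial: the per-file-size and aggregate comparisons disagree. A minimal sketch using the loss counts from the table:

```python
# Loss counts from the Example III table. Per-size and total comparisons
# disagree, which is why "which link is better?" has no single obvious answer.
file_sizes = [1000, 1200, 1300, 50]
loss_a = [5, 7, 3, 0]   # Link A
loss_b = [10, 3, 0, 1]  # Link B

wins_a = sum(a < b for a, b in zip(loss_a, loss_b))  # sizes where A loses fewer
wins_b = sum(b < a for a, b in zip(loss_a, loss_b))  # sizes where B loses fewer
total_a, total_b = sum(loss_a), sum(loss_b)

print(wins_a, wins_b)    # 2 2  -> a tie, size by size
print(total_a, total_b)  # 15 14 -> B loses slightly fewer packets in total
```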

Example IV
In order to compare the performance of two cache replacement algorithms:
1. What type of simulation model should be used?
2. How long should the simulation be run?
3. What can be done to get the same accuracy with a shorter run?
4. How can one decide if the random-number generator in the simulation is a good generator?

Example V
The average response time of a database system is three seconds. During a one-minute observation interval, the idle time on the system was ten seconds. Using a queueing model for the system, determine the following:

Example V (cont’d)
1. System utilization
2. Average service time per query
3. Number of queries completed during the observation interval
4. Average number of jobs in the system
5. Probability of the number of jobs in the system being greater than 1
6. 90-percentile response time
7. 90-percentile waiting time
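A hedged worked solution, assuming the queue behaves as M/M/1 (the slide only says “a queueing model”; M/M/1 is the standard textbook treatment of this exercise):

```python
import math

# Example V worked as an M/M/1 queue (an assumption: the slide does not
# name the queueing discipline).
T = 60.0     # observation interval (s)
idle = 10.0  # measured idle time (s)
R = 3.0      # average response time (s)

rho = (T - idle) / T                 # 1. utilization = busy time / total time
S = R * (1 - rho)                    # 2. M/M/1: R = S / (1 - rho)
completed = (T - idle) / S           # 3. completions = busy time / service time
N = rho / (1 - rho)                  # 4. mean number of jobs in the system
p_gt_1 = rho ** 2                    # 5. P(n > 1) = rho^2 for M/M/1
r90 = R * math.log(10)               # 6. 90-percentile response time = R ln 10
w90 = (R - S) * math.log(10)         # 7. 90-percentile waiting time = W ln 10

print(round(rho, 3), round(S, 2), round(completed))   # ~0.833, 0.5, 100
print(round(N, 2), round(p_gt_1, 3))                  # ~5.0, 0.694
print(round(r90, 2), round(w90, 2))                   # ~6.91, 5.76
```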

Sample Exercise 4.1
The measured throughput in queries/sec for two database systems on two different workloads is as follows:

System   Workload 1   Workload 2
A        30           10
B        10           30

Compare the performance of the two systems and show that:
a. System A is better
b. System B is better
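The two contradictory conclusions come from the classic “ratio game”: average each system’s throughput after normalizing to a chosen base system, and the choice of base decides the winner. A sketch with the throughputs from the table:

```python
# Throughputs from the slide (queries/sec on workloads 1 and 2).
thr = {"A": [30, 10], "B": [10, 30]}

def avg_normalized(base):
    # mean, over both workloads, of throughput relative to the base system
    return {s: sum(x / b for x, b in zip(thr[s], thr[base])) / 2 for s in thr}

with_base_a = avg_normalized("A")  # A: 1.0, B: (10/30 + 30/10)/2 -> "B is better"
with_base_b = avg_normalized("B")  # B: 1.0, A: (30/10 + 10/30)/2 -> "A is better"
print(with_base_a)
print(with_base_b)
```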

The Bank Problem
The board of directors requires answers to several questions that are posed to the IS facility manager.
Environment:
• 3 fully automated branch offices, all located in the same city
• ATMs are available at 10 locations throughout the city
• 2 mainframes (one used for on-line processing; the other for batch)

The Bank Problem (cont’d)
• 24 tellers in the 3 branch offices
• Each teller serves an average of 20 customers/hr during peak times; otherwise, 12 customers/hr
• Each customer generates 2 on-line transactions on average
• Thus, during peak times, the IS facility receives an average of 960 (24 x 20 x 2) teller-originated transactions/hr

The Bank Problem (cont’d)
• ATMs are active 24 hrs/day. During peak volume, each ATM serves an average of 15 customers/hr. Each customer generates an average of 1.2 transactions
• Thus, during the peak period for ATMs, the IS facility receives an average of 180 (10 x 15 x 1.2) ATM transactions/hr; otherwise the ATM transaction rate is about 50% of this
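The arrival-rate arithmetic on the last two slides can be checked mechanically; all figures below are the ones stated above:

```python
# Mechanical check of the peak-hour transaction volumes stated on the slides.
tellers, customers_per_teller, tx_per_customer = 24, 20, 2
atms, customers_per_atm, tx_per_atm_customer = 10, 15, 1.2

teller_tph = tellers * customers_per_teller * tx_per_customer  # 960
atm_peak_tph = atms * customers_per_atm * tx_per_atm_customer  # 180
atm_offpeak_tph = 0.5 * atm_peak_tph                           # about 90

print(teller_tph, atm_peak_tph, atm_offpeak_tph)
```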

The Bank Problem (cont’d)
• Measured average response time
  – Tellers: 1.23 seconds during peak hrs; 3 seconds are acceptable
  – ATM: 1.02 seconds during peak hrs; 4 seconds are acceptable

The Bank Problem (cont’d)
The board of directors requires answers to several questions:
• Will the current central IS facility allow for the expected growth of the bank while maintaining the average response times at the tellers and at the ATMs within the stated acceptable limits?
• If not, when will it be necessary to upgrade the current IS environment?
• Of several possible upgrades (e.g. adding more disks, adding memory, replacing the central processor boards), which represents the best cost-performance trade-off?
• Should the data processing facility of the bank remain centralized, or should the bank consider a distributed processing alternative?

[Chart: Response time vs. load for the car rental C/S system example. Response time (sec, 0.0 to 6.0) plotted against load (tps) at Current, Current + 5%, + 10%, and + 15%, for four transaction types: Local Reservation, Road Assistance, Car Pickup, and Phone Reservation.]

Capacity Planning Concept
• Capacity planning is the determination of the predicted time in the future when system saturation is going to occur, and of the most cost-effective way of delaying system saturation as long as possible

Questions
• Why is saturation occurring?
• In which parts of the system (CPU, disks, memory, queues) will a transaction or job be spending most of its execution time at the saturation points?
• Which are the most cost-effective alternatives for avoiding (or, at least, delaying) saturation?

[Chart: Capacity planning situation. Response time (sec, 0.0 to 7.0) vs. arrival rate (tph, 1000 to 4000) for two configurations: Original and Fast Disk.]
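The shape of such a curve can be sketched with a single-queue model. The two service times below are hypothetical stand-ins for the “Original” and “Fast Disk” configurations, chosen only to illustrate how a faster disk moves the saturation knee to a higher arrival rate:

```python
# Single-queue (M/M/1) sketch of a saturation curve.
def response_time(tph, service_s):
    lam = tph / 3600.0              # arrivals per second
    rho = lam * service_s           # utilization
    if rho >= 1.0:
        return float("inf")         # saturated: the queue grows without bound
    return service_s / (1.0 - rho)  # M/M/1 response time

for tph in (1000, 2000, 3000, 4000):
    print(tph,
          round(response_time(tph, 0.9), 2),   # "Original" (S = 0.9 s, assumed)
          round(response_time(tph, 0.7), 2))   # "Fast Disk" (S = 0.7 s, assumed)
```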

Capacity Planning I/O Variables
Inputs: workload evolution, desired service levels, system parameters
Outputs: saturation points, cost-effective alternatives

Common Mistakes in Capacity Planning
1. No Goals
2. Biased Goals
3. Unsystematic Approach
4. Analysis Without Understanding the Problem
5. Incorrect Performance Metrics
6. Unrepresentative Workload
7. Wrong Evaluation Technique

Common Mistakes in Capacity Planning (cont’d)
8. Overlook Important Parameters
9. Ignore Significant Factors
10. Inappropriate Experimental Design
11. Inappropriate Level of Detail
12. No Analysis
13. Erroneous Analysis
14. No Sensitivity Analysis

Common Mistakes in Capacity Planning (cont’d)
15. Ignoring Errors in Input
16. Improper Treatment of Outliers
17. Assuming No Change in the Future
18. Ignoring Variability
19. Too Complex Analysis
20. Improper Presentation of Results
21. Ignoring Social Aspects
22. Omitting Assumptions and Limitations

Capacity Planning Steps
1. Instrument the system
2. Monitor system usage
3. Characterize workload
4. Predict performance under different alternatives
5. Select the lowest cost, highest performance alternative

CPU Utilization: Example

CPU Utilization
Month   Workload A   Workload B   Workload C   Total CPU Utilization
1       20           15           10           45
2       21           16           12           49
3       25           18           15           58
4       30           19           16           65
5       32           21           17           70
6       33           22           18           73

Future Predicted CPU Utilization

CPU Utilization
Month   Workload A   Workload B   Workload C   Total CPU Utilization
7       37.1         23.6         20.3         81.0
8       40.0         25.0         21.9         86.9
9       43.0         26.5         23.5         93.0
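The predicted values are consistent with a simple least-squares straight-line fit to the six measured months (an assumption; the slides do not state the forecasting method). For Workload A:

```python
# Least-squares straight-line fit to the measured Workload A utilizations
# (months 1-6 from the previous slide), extrapolated to months 7-9.
months = [1, 2, 3, 4, 5, 6]
util_a = [20, 21, 25, 30, 32, 33]

n = len(months)
mean_x = sum(months) / n
mean_y = sum(util_a) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(months, util_a))
         / sum((x - mean_x) ** 2 for x in months))
intercept = mean_y - slope * mean_x

forecast = {m: round(intercept + slope * m, 1) for m in (7, 8, 9)}
print(forecast)  # within about 0.1 of the slide's 37.1, 40.0, 43.0
```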

Systematic Approach to Performance Evaluation
1. State Goals and Define the System
2. List Services and Outcomes
3. Select Metrics
4. List Parameters
5. Select Factors to Study
6. Select Evaluation Technique
7. Select Workload
8. Design Experiments
9. Analyze and Interpret Data
10. Present Results
11. Repeat

Case Study: Remote Pipes vs RPC
• System Definition: a client and a server connected by a network

Case Study (cont’d)
• Services: small data transfer or large data transfer
• Metrics:
  – No errors and failures. Correct operation only
  – Rate, Time, Resource per service
  – Resource = Client, Server, Network

Case Study (cont’d)
This leads to:
1. Elapsed time per call
2. Maximum call rate per unit of time, or equivalently, the time required to complete a block of n successive calls
3. Local CPU time per call
4. Remote CPU time per call
5. Number of bytes sent on the link per call

Case Study (cont’d)
• Parameters:
  – System parameters:
    1. Speed of the local CPU
    2. Speed of the remote CPU
    3. Speed of the network
    4. Operating system overhead for interfacing with the channels
    5. Operating system overhead for interfacing with the networks
    6. Reliability of the network, affecting the number of retransmissions required

Case Study (cont’d)
• Parameters (cont’d):
  – Workload parameters:
    1. Time between successive calls
    2. Number and size of the call parameters
    3. Number and size of the results
    4. Type of channel
    5. Other loads on the local and remote CPUs
    6. Other loads on the network

Case Study (cont’d)
• Factors:
  1. Type of channel: remote pipes and remote procedure calls
  2. Size of the network: short distance and long distance
  3. Size of the call parameters: small and large
  4. Number n of consecutive calls: 1, 2, 4, 8, 16, 32, ··· , 512, and 1024

Case Study (cont’d)
Note:
– Fixed: type of CPUs and operating systems
– Ignore retransmissions due to network errors
– Measure under no other load on the hosts and the network

Case Study (cont’d)
• Evaluation Technique:
  – Prototypes implemented
  – Measurements
  – Analytical modeling used for validation
• Workload:
  – Synthetic program generating the specified types of channel requests
  – Null channel requests to measure the resources used in monitoring and logging

Case Study (cont’d)
• Experimental Design: a full factorial experimental design with 2³ x 11 = 88 experiments will be used
• Data Analysis:
  – Analysis of Variance (ANOVA) for the first three factors
  – Regression for the number n of successive calls
• Data Presentation: the final results will be plotted as a function of the block size n
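The 88-run design can be enumerated directly; the factor levels below follow the Factors slide (two levels each for channel type, network distance, and parameter size, crossed with 11 block sizes):

```python
import itertools

# Enumerate the full factorial design: three two-level factors crossed with
# the 11 block sizes gives 2**3 * 11 = 88 experiments.
channel = ["remote pipe", "RPC"]
distance = ["short", "long"]
param_size = ["small", "large"]
block_sizes = [2 ** k for k in range(11)]  # 1, 2, 4, ..., 1024

design = list(itertools.product(channel, distance, param_size, block_sizes))
print(len(design))  # 88
```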

Exercises
1. From the published literature, select an article or report that presents the results of a performance evaluation study. Make a list of the good and bad points of the study. What would you do differently if you were asked to repeat the study?

Exercises (cont’d)
2. Choose a system for a performance study. Briefly describe the system and list:
a. Services
b. Performance metrics
c. System parameters
d. Workload parameters
e. Factors and their ranges
f. Evaluation technique
g. Workload
Justify your choices. Suggestion: Each student should select a different system, such as a network, database, or processor, and then present the solution to the class.

Selecting an Evaluation Technique

Criterion                 Analytical Modeling   Simulation           Measurement
1. Stage                  Any                   Any                  Post-prototype
2. Time required          Small                 Medium               Varies
3. Tools                  Analysts              Computer languages   Instrumentation
4. Accuracy*              Low                   Moderate             Varies
5. Trade-off evaluation   Easy                  Moderate             Difficult
6. Cost                   Small                 Medium               High
7. Saleability            Low                   Medium               High

* In all cases, results may be misleading or wrong