Скачать презентацию Statistics for Managers Using Microsoft Excel 4 th Скачать презентацию Statistics for Managers Using Microsoft Excel 4 th

da74e5479289aa6366ad7a44f3802a07.ppt

  • Количество слайдов: 38

Statistics for Managers Using Microsoft® Excel 4 th Edition Chapter 13 Introduction to Multiple Statistics for Managers Using Microsoft® Excel 4 th Edition Chapter 13 Introduction to Multiple Regression Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. Chap 13 -1

Chapter Goals After completing this chapter, you should be able to: § § apply Chapter Goals After completing this chapter, you should be able to: § § apply multiple regression analysis to business decision-making situations analyze and interpret the computer output for a multiple regression model perform residual analysis for the multiple regression model test the significance of the independent variables in a multiple regression model Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 2

Chapter Goals (continued) After completing this chapter, you should be able to: § § Chapter Goals (continued) After completing this chapter, you should be able to: § § § use a coefficient of partial determination to test portions of the multiple regression model incorporate qualitative variables into the regression model by using dummy variables use interaction terms in regression models Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 3

The Multiple Regression Model The coefficients of the multiple regression model are estimated using The Multiple Regression Model The coefficients of the multiple regression model are estimated using sample data Estimated multiple regression model: Estimated (or predicted) value of Y Estimated intercept Estimated slope coefficients In this chapter we will always use Excel to obtain the regression slope coefficients and other regression summary measures. Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 4

Multiple Regression Model Two variable model Y e p lo r fo ia ar Multiple Regression Model Two variable model Y e p lo r fo ia ar le b X 1 v S X 2 le X 2 Slop ariab e for v X 1 Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 5

Example: 2 Independent Variables § A distributor of frozen desert pies wants to evaluate Example: 2 Independent Variables § A distributor of frozen desert pies wants to evaluate factors thought to influence demand § § § Dependent variable: Pie sales (units per week) Independent variables: Price (in $) Advertising ($100’s) Data is collected for 15 weeks Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 6

Pie Sales Model Week Pie Sales Price ($) Advertising ($100 s) 1 350 5. Pie Sales Model Week Pie Sales Price ($) Advertising ($100 s) 1 350 5. 50 3. 3 2 460 7. 50 3. 3 3 350 8. 00 3. 0 4 430 8. 00 4. 5 5 350 6. 80 3. 0 6 380 7. 50 4. 0 7 430 4. 50 3. 0 8 470 6. 40 3. 7 9 450 7. 00 3. 5 10 490 5. 00 4. 0 11 340 7. 20 3. 5 12 300 7. 90 3. 2 13 440 5. 90 4. 0 14 450 5. 00 3. 5 15 300 7. 00 2. 7 Multiple regression model: Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. Sales = b 0 + b 1 (Price) + b 2 (Advertising) 7

Estimating a Multiple Linear Regression Equation § § Excel will be used to generate Estimating a Multiple Linear Regression Equation § § Excel will be used to generate the coefficients and measures of goodness of fit for multiple regression Excel: § § Tools / Data Analysis. . . / Regression PHStat: § PHStat / Regression / Multiple Regression… Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 8

Multiple Regression Output Regression Statistics Multiple R 0. 72213 R Square 0. 52148 Adjusted Multiple Regression Output Regression Statistics Multiple R 0. 72213 R Square 0. 52148 Adjusted R Square 0. 44172 Standard Error 47. 46341 Observations ANOVA 15 df Regression SS MS F Significance F 2 29460. 027 14730. 013 Residual 12 27033. 306 2252. 776 Total 14 56493. 333 Coefficients Standard Error Intercept 306. 52619 114. 25389 2. 68285 0. 01993 57. 58835 555. 46404 Price -24. 97509 10. 83213 -2. 30565 0. 03979 -48. 57626 -1. 37392 74. 13096 25. 96732 2. 85478 0. 01449 17. 55303 130. 70888 Advertising Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 6. 53861 t Stat 0. 01201 P-value Lower 95% Upper 95% 9

The Multiple Regression Equation where Sales is in number of pies per week Price The Multiple Regression Equation where Sales is in number of pies per week Price is in $ Advertising is in $100’s. b 1 = -24. 975: sales will decrease, on average, by 24. 9751 pies per week for each $1 increase in selling price, net of the effects of changes due to advertising Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. b 2 = 74. 131: sales will increase, on average, by 74. 1310 pies per week for each $100 increase in advertising, net of the effects of changes due to price 10

Using The Model to Make Predictions Predict sales for a week in which the Using The Model to Make Predictions Predict sales for a week in which the selling price is $5. 50 and advertising is $350: Predicted sales is 428. 62 pies Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. Note that Advertising is in $100’s, so $350 means that X 2 = 3. 5 11

Predictions in PHStat § PHStat | regression | multiple regression … Check the “confidence Predictions in PHStat § PHStat | regression | multiple regression … Check the “confidence and prediction interval estimates” box Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 12

Predictions in PHStat (continued) Input values < Predicted Y value < Confidence interval for Predictions in PHStat (continued) Input values < Predicted Y value < Confidence interval for the mean Y value, given these X’s < Prediction interval for an individual Y value, given these X’s Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 13

Coefficient of Multiple Determination § Reports the proportion of total variation in Y explained Coefficient of Multiple Determination § Reports the proportion of total variation in Y explained by all X variables taken together Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 14

Multiple Coefficient of Determination (continued) Regression Statistics Multiple R 0. 72213 R Square 0. Multiple Coefficient of Determination (continued) Regression Statistics Multiple R 0. 72213 R Square 0. 52148 Adjusted R Square 0. 44172 Standard Error Observations ANOVA 15 df Regression 52. 1% of the variation in pie sales is explained by the variation in price and advertising 47. 46341 SS MS F Significance F 2 29460. 027 14730. 013 Residual 12 27033. 306 2252. 776 Total 14 56493. 333 Coefficients Standard Error Intercept 306. 52619 114. 25389 2. 68285 0. 01993 57. 58835 555. 46404 Price -24. 97509 10. 83213 -2. 30565 0. 03979 -48. 57626 -1. 37392 74. 13096 25. 96732 2. 85478 0. 01449 17. 55303 130. 70888 Advertising Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 6. 53861 t Stat 0. 01201 P-value Lower 95% Upper 95% 15

Adjusted r 2 § § r 2 never decreases when a new X variable Adjusted r 2 § § r 2 never decreases when a new X variable is added to the model § This can be a disadvantage when comparing models What is the net effect of adding a new variable? § We lose a degree of freedom when a new X variable is added § Did the new X variable add enough explanatory power to offset the loss of one degree of freedom? Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 16

Adjusted r 2 (continued) § Shows the proportion of variation in Y explained by Adjusted r 2 (continued) § Shows the proportion of variation in Y explained by all X variables adjusted for the number of X variables used (where n = sample size, k = number of independent variables) § § § Penalize excessive use of unimportant independent variables Smaller than r 2 Useful in comparing among models Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 17

Multiple Coefficient of Determination (continued) Regression Statistics Multiple R 0. 72213 R Square 0. Multiple Coefficient of Determination (continued) Regression Statistics Multiple R 0. 72213 R Square 0. 52148 Adjusted R Square 0. 44172 Standard Error 47. 46341 Observations ANOVA 15 df Regression 44. 2% of the variation in pie sales is explained by the variation in price and advertising, taking into account the sample size and number of independent variables SS MS F Significance F 2 29460. 027 14730. 013 Residual 12 27033. 306 2252. 776 Total 14 56493. 333 Coefficients Standard Error Intercept 306. 52619 114. 25389 2. 68285 0. 01993 57. 58835 555. 46404 Price -24. 97509 10. 83213 -2. 30565 0. 03979 -48. 57626 -1. 37392 74. 13096 25. 96732 2. 85478 0. 01449 17. 55303 130. 70888 Advertising Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 6. 53861 t Stat 0. 01201 P-value Lower 95% Upper 95% 18

Residual Plots Used in Multiple Regression § These residual plots are used in multiple Residual Plots Used in Multiple Regression § These residual plots are used in multiple regression: < § Residuals vs. Yi § Residuals vs. X 1 i § Residuals vs. X 2 i etc. Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 19

Residual Plots Pie Example Use Tools | Data Analysis | Regression Statistics for Managers Residual Plots Pie Example Use Tools | Data Analysis | Regression Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 20

Residual Plots Pie Example Use Tools | Data Analysis | Regression Statistics for Managers Residual Plots Pie Example Use Tools | Data Analysis | Regression Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 21

Is the Model Significant? § § F-Test for Overall Significance of the Model Shows Is the Model Significant? § § F-Test for Overall Significance of the Model Shows if there is a linear relationship between all of the X variables considered together and Y § Use F test statistic § Hypotheses: H 0: β 1 = β 2 = … = βk = 0 (no linear relationship) H 1: at least one βi ≠ 0 (at least one independent variable affects Y) Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 22

F-Test for Overall Significance (continued) Regression Statistics Multiple R 0. 72213 R Square 0. F-Test for Overall Significance (continued) Regression Statistics Multiple R 0. 72213 R Square 0. 52148 Adjusted R Square 0. 44172 Standard Error 47. 46341 Observations ANOVA 15 df Regression With 2 and 12 degrees of freedom SS MS P-value for the F-Test F Significance F 2 29460. 027 14730. 013 Residual 12 27033. 306 2252. 776 Total 14 56493. 333 Coefficients Standard Error Intercept 306. 52619 114. 25389 2. 68285 0. 01993 57. 58835 555. 46404 Price -24. 97509 10. 83213 -2. 30565 0. 03979 -48. 57626 -1. 37392 74. 13096 25. 96732 2. 85478 0. 01449 17. 55303 130. 70888 Advertising Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 6. 53861 t Stat 0. 01201 P-value Lower 95% Upper 95% 23

F-Test for Overall Significance (continued) Test Statistic: H 0: β 1 = β 2 F-Test for Overall Significance (continued) Test Statistic: H 0: β 1 = β 2 = 0 H 1: β 1 and β 2 not both zero =. 05 df 1= 2 df 2 = 12 Decision: p-value=. 01201 Reject H 0 at = 0. 05 Conclusion: =. 05 0 Do not reject H 0 Reject H 0 F There is evidence that at least one independent variable affects Y Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 24

Are Individual Variables Significant? § § § Use t-tests of individual variable slopes Shows Are Individual Variables Significant? § § § Use t-tests of individual variable slopes Shows if there is a linear relationship between the variable Xi and Y Hypotheses: § § H 0: βi = 0 (no linear relationship) H 1: βi ≠ 0 (linear relationship does exist between Xi and Y) Test Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. (df = n – k – 1) 25

Are Individual Variables Significant? (continued) Regression Statistics Multiple R 0. 72213 R Square 0. Are Individual Variables Significant? (continued) Regression Statistics Multiple R 0. 72213 R Square 0. 52148 Adjusted R Square 0. 44172 Standard Error 47. 46341 Observations ANOVA 15 df Regression t-value for Price is t = -2. 306, with p-value. 0398 t-value for Advertising is t = 2. 855, with p-value. 0145 SS MS F Significance F 2 29460. 027 14730. 013 Residual 12 27033. 306 2252. 776 Total 14 56493. 333 Coefficients Standard Error Intercept 306. 52619 114. 25389 2. 68285 0. 01993 57. 58835 555. 46404 Price -24. 97509 10. 83213 -2. 30565 0. 03979 -48. 57626 -1. 37392 74. 13096 25. 96732 2. 85478 0. 01449 17. 55303 130. 70888 Advertising Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 6. 53861 t Stat 0. 01201 P-value Lower 95% Upper 95% 26

Inferences about the Slope: t Test Example H 0: β 1 = 0 H Inferences about the Slope: t Test Example H 0: β 1 = 0 H 1: β 1 0 From Excel output: Advertising H 0: β 2 = 0 H 1: β 2 0 d. f. = 15 -2 -1 = 12 =. 05 Standard Error -24. 97509 Price Coefficients t Stat P-value 10. 83213 -2. 30565 0. 03979 74. 13096 25. 96732 2. 85478 0. 01449 The test statistic for each variable falls in the rejection region (p-values <. 05) Decision: Reject H 0 for each variable Conclusion: There is evidence that both Price and Advertising affect pie sales at =. 05 Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 27

Confidence Interval Estimate for the Slope (continued) Confidence interval for the population slope βi Confidence Interval Estimate for the Slope (continued) Confidence interval for the population slope βi where t has (n – k – 1) d. f. (15 – 2 – 1) = 12 d. f. Coefficients Standard Error … Intercept 306. 52619 114. 25389 … 57. 58835 555. 46404 Price -24. 97509 10. 83213 … -48. 57626 -1. 37392 74. 13096 25. 96732 … 17. 55303 130. 70888 Advertising Lower 95% Upper 95% Example: Excel output also reports these interval endpoints: Weekly sales are estimated to be reduced by between 1. 37 to 48. 58 pies for each increase of $1 in the selling price Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 28

Coefficient of Partial Determination for k variable model § Measures the proportion of variation Coefficient of Partial Determination for k variable model § Measures the proportion of variation in the dependent variable that is explained by Xj while controlling for (holding constant) the other explanatory variables Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 29

Coefficient of Partial Determination in Excel § Coefficients of Partial Determination can be found Coefficient of Partial Determination in Excel § Coefficients of Partial Determination can be found using Excel: § PHStat | regression | multiple regression … § Check the “coefficient of partial determination” box Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 30

Using Dummy Variables § A dummy variable is a categorical explanatory variable with two Using Dummy Variables § A dummy variable is a categorical explanatory variable with two levels: § § § yes or no, on or off, male or female coded as 0 or 1 Regression intercepts are different if the variable is significant Assumes equal slopes for other variables If more than two levels, the number of dummy variables needed is (number of levels - 1) Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 31

Dummy-Variable Model Example (with 2 Levels) Let: Y = pie sales X 1 = Dummy-Variable Model Example (with 2 Levels) Let: Y = pie sales X 1 = price X 2 = holiday (X 2 = 1 if a holiday occurred during the week) (X 2 = 0 if there was no holiday that week) Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 32

Dummy-Variable Model Example (with 2 Levels) (continued) Holiday No Holiday Different intercept Y (sales) Dummy-Variable Model Example (with 2 Levels) (continued) Holiday No Holiday Different intercept Y (sales) b 0 + b 2 b 0 Holi da No H o y (X liday 2 = 1) (X 2 = 0) Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. Same slope If H 0: β 2 = 0 is rejected, then “Holiday” has a significant effect on pie sales X 1 (Price) 33

Interpreting the Dummy Variable Coefficient (with 2 Levels) Example: Sales: number of pies sold Interpreting the Dummy Variable Coefficient (with 2 Levels) Example: Sales: number of pies sold per week Price: pie price in $ 1 If a holiday occurred during the week Holiday: 0 If no holiday occurred b 2 = 15: on average, sales were 15 pies greater in weeks with a holiday than in weeks without a holiday, given the same price Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 34

Dummy-Variable Models (more than 2 Levels) § § The number of dummy variables is Dummy-Variable Models (more than 2 Levels) § § The number of dummy variables is one less than the number of levels Example: Y = house price ; X 1 = square feet If style of the house is also thought to matter: Style = ranch, split level, condo Three levels, so two dummy variables are needed Advanced concept not included in homework or on the exam. Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 35

Interaction Between Explanatory Variables § Hypothesizes interaction between pairs of X variables § Response Interaction Between Explanatory Variables § Hypothesizes interaction between pairs of X variables § Response to one X variable may vary at different levels of another X variable § Contains two-way cross product terms § Hypotheses: § § H 0: 3 = 0 (no interaction between X 1 and X 2) H 1: 3 0 (X 1 interacts with X 2) Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 36

Interaction Regression Model Worksheet Item, i Yi X 1 i X 2 i X Interaction Regression Model Worksheet Item, i Yi X 1 i X 2 i X 1 i*X 2 i 1 1 1 3 3 2 4 8 5 40 3 1 3 2 6 4 3 5 6 30 … … … Multiply X 1 by X 2 to get X 1*X 2. Run regression with Y, X 1, X 2 , X 1 X 2 Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 37

Chapter Summary § § § § Developed the multiple regression model Tested the significance Chapter Summary § § § § Developed the multiple regression model Tested the significance of the multiple regression model Discussed adjusted r 2 Used residual plots to check model Tested individual regression coefficients Tested portions of the regression model Used dummy variables Evaluated interaction effects Statistics for Managers Using Microsoft Excel, 4 e © 2004 Prentice-Hall, Inc. 38