- Number of slides: 16
Regression Example Using Pop Quiz Data
Second Pop Quiz
• At my former school (Irvine), I gave a “pop quiz” to my econometrics students.
• The quiz consisted of 10 questions.
  - The first five questions were trivia-type questions.
  - The second five questions tested TV knowledge.
• The last question asked students to report GPAs.
First Five Questions
• Who is the Secretary of Defense?
• Who is the Speaker of the House?
• What is the capital of Brazil?
Second Five Questions
• On “The Simpsons,” who owns the Kwik-E-Mart?
• On “Malcolm in the Middle,” what is the name of Malcolm’s older brother?
• Who recently (not so recently any more) left “The West Wing”?
• On “ER,” who is the doctor from Croatia?
• On “Everybody Loves Raymond,” what does Raymond do for a living?
My Favorite Answers
• Who is the Speaker of the House?
  - George Bush
• Who is the Croatian from ER?
  - Toni Kukoc
• What is your GPA?
  - You don’t even want to know.
My Favorite Answers (Cont’d)
• What is the capital of Brazil?
  - Irvine
• Who is Malcolm’s older brother?
  - Justin
• Who recently left “The West Wing”?
  - Michael J. Fox
Regression Example
• Compute the number correct for each set of 5.
• Match the number correct with the midterm score.
• Only include those quizzes with some answers.
• See if the number correct is correlated with midterm performance.
  - Expected signs: first five (+), second five (-)?
      Source |       SS       df       MS              Number of obs =      57
-------------+------------------------------           F(  2,    54) =    2.88
       Model |  25.7073896     2  12.8536948           Prob > F      =  0.0650
    Residual |  241.345242    54  4.46935633           R-squared     =  0.0963
-------------+------------------------------           Adj R-squared =  0.0628
       Total |  267.052632    56  4.76879699           Root MSE      =  2.1141

------------------------------------------------------------------------------
     midterm |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     right_1 |   .8168767   .3513228     2.33   0.024     .1125169    1.521236
     right_2 |   .1041826   .2916515     0.36   0.722    -.4805435    .6889088
       _cons |   20.11405   .6115866    32.89   0.000     18.88789     21.3402
------------------------------------------------------------------------------
• The number correct on the first 5 questions (right_1) is a significant predictor of the midterm score.
• Every additional question answered correctly is associated with a 0.82-point increase in the midterm score.
• This coefficient is statistically significant at the 5 percent level, but not at the 1 percent level.
(Same regression output as above.)
• The number correct on the second 5 (TV questions, right_2) is not a significant predictor of the midterm score.
• It seems to have no predictive power, and its t-statistic is very low (0.36, p = 0.722).
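The significance statements for the two coefficients amount to comparing each reported p-value with a chosen significance level. A minimal sketch of that comparison follows; the p-values are copied from the P>|t| column of the output, while the function name `significant_at` is just an illustrative choice, not anything from Stata.

```python
# p-values copied from the P>|t| column of the regression output
p_values = {"right_1": 0.024, "right_2": 0.722}


def significant_at(p_value, alpha):
    """Return True when the null of a zero coefficient is rejected at level alpha."""
    return p_value < alpha


for name, p in p_values.items():
    for alpha in (0.05, 0.01):
        verdict = "reject" if significant_at(p, alpha) else "fail to reject"
        print(f"{name}: {verdict} H0 at the {alpha:.0%} level (p = {p})")
```

Only right_1 at the 5 percent level leads to a rejection, which matches the reading of the table above.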
(Same regression output as above.)
• Less than 10 percent of the variation in midterm scores is explained by variation in the number of questions answered correctly (R-squared = 0.0963).
• Prediction: What is the expected midterm score for someone getting 0 questions correct?
  - Ans: Just the intercept: 20.11
• Prediction: What is the expected score for someone getting 2 correct on each section?
  - Ans: 20.11 + .8168(2) + .1042(2) = 21.95
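The two predictions can be reproduced by plugging values into the fitted equation. The sketch below uses the full-precision coefficients from the output; the helper name `predict_midterm` is hypothetical.

```python
# Coefficients copied from the Coef. column of the regression output
INTERCEPT = 20.11405
B_RIGHT_1 = 0.8168767   # first five (trivia) questions
B_RIGHT_2 = 0.1041826   # second five (TV) questions


def predict_midterm(right_1, right_2):
    """Fitted midterm score for a given number correct on each section."""
    return INTERCEPT + B_RIGHT_1 * right_1 + B_RIGHT_2 * right_2


print(round(predict_midterm(0, 0), 2))  # zero correct: just the intercept
print(round(predict_midterm(2, 2), 2))  # two correct on each section
```

With full-precision coefficients the second prediction comes out near 21.96; the 21.95 on the slide reflects coefficients rounded before multiplying.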
      Source |       SS       df       MS              Number of obs =      57
-------------+------------------------------           F(  2,    54) =    2.88
       Model |  25.7073896     2  12.8536948           Prob > F      =  0.0650
    Residual |  241.345242    54  4.46935633           R-squared     =  0.0963
-------------+------------------------------           Adj R-squared =  0.0628
       Total |  267.052632    56  4.76879699           Root MSE      =  2.1141

------------------------------------------------------------------------------
     midterm |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     right_1 |   .8168767   .3513228     2.33   0.024     .1125169    1.521236
     right_2 |   .1041826   .2916515     XXXX   0.722    -.4805435    .6889088
       _cons |   20.11405   .6115866    32.89   0.000     18.88789     21.3402
------------------------------------------------------------------------------
• Fill in the missing t-statistic.
• The missing value is the t-statistic under the null hypothesis that the coefficient on right_2 equals 0.
• So, the t-statistic is (.1042 - 0)/.2917 = .357
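The same calculation, written as a small function so it can be checked against both slope rows. This is only a sketch of the ratio (estimate minus null value, divided by standard error); the function name `t_statistic` is an illustrative choice.

```python
def t_statistic(coef, std_err, null_value=0.0):
    """t-statistic for H0: coefficient == null_value."""
    return (coef - null_value) / std_err


# Values copied from the Coef. and Std. Err. columns
t_right_1 = t_statistic(0.8168767, 0.3513228)
t_right_2 = t_statistic(0.1041826, 0.2916515)

print(f"right_1: t = {t_right_1:.3f}")
print(f"right_2: t = {t_right_2:.3f}")
```

Rounded to two decimals these agree with the t column of the output (2.33 and 0.36).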
      Source |       SS       df       MS              Number of obs =      57
-------------+------------------------------           F(  2,    54) =    2.88
       Model |  25.7073896     2  12.8536948           Prob > F      =  0.0650
    Residual |  241.345242    54  4.46935633           R-squared     =  0.0963
-------------+------------------------------           Adj R-squared =  0.0628
       Total |       xxxxx    56  4.76879699           Root MSE      =  2.1141

------------------------------------------------------------------------------
     midterm |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     right_1 |   .8168767   .3513228     2.33   0.024     .1125169    1.521236
     right_2 |   .1041826   .2916515     0.36   0.722    -.4805435    .6889088
       _cons |   20.11405   .6115866    32.89   0.000     18.88789     21.3402
------------------------------------------------------------------------------
• Fill in the missing TSS.
• Since TSS = ESS + RSS,
• TSS = 25.707 + 241.345 = 267.052
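As a quick arithmetic check, the decomposition can be computed from the unrounded sums of squares. The sketch also divides by the total degrees of freedom (56) to recover the MS entry on the Total row; the variable names are illustrative.

```python
# Sums of squares copied from the SS column
ess = 25.7073896    # Model (explained) sum of squares
rss = 241.345242    # Residual sum of squares
n_obs = 57

tss = ess + rss                  # total sum of squares
total_ms = tss / (n_obs - 1)     # MS entry on the Total row (df = 56)

print(f"TSS      = {tss:.4f}")
print(f"Total MS = {total_ms:.4f}")
```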
      Source |       SS       df       MS              Number of obs =      57
-------------+------------------------------           F(  2,    54) =    2.88
       Model |  25.7073896     2  12.8536948           Prob > F      =  0.0650
    Residual |  241.345242    54  4.46935633           R-squared     =  xxxxxx
-------------+------------------------------           Adj R-squared =  0.0628
       Total |  267.052632    56  4.76879699           Root MSE      =  2.1141

------------------------------------------------------------------------------
     midterm |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     right_1 |   .8168767   .3513228     2.33   0.024     .1125169    1.521236
     right_2 |   .1041826   .2916515     0.36   0.722    -.4805435    .6889088
       _cons |   20.11405   .6115866    32.89   0.000     18.88789     21.3402
------------------------------------------------------------------------------
• Fill in the missing R-squared value.
• R-squared is defined as the Model Sum of Squares (ESS) divided by TSS.
• So, R-squared is 25.707/267.052 = .0963
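A short sketch of the R-squared calculation. It also computes the adjusted R-squared using the standard degrees-of-freedom correction; that second step is an addition not discussed on the slide, included only because the value appears in the output and can be checked.

```python
# Sums of squares and sample size copied from the regression output
ess = 25.7073896
tss = 267.052632
n_obs = 57
n_params = 3        # two slopes plus the intercept

r_squared = ess / tss
# Standard adjustment: penalize for the number of estimated parameters
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_params)

print(f"R-squared     = {r_squared:.4f}")
print(f"Adj R-squared = {adj_r_squared:.4f}")
```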
      Source |       SS       df       MS              Number of obs =      57
-------------+------------------------------           F(  2,    54) =    2.88
       Model |  25.7073896     2  12.8536948           Prob > F      =  0.0650
    Residual |  241.345242    54  4.46935633           R-squared     =  0.0963
-------------+------------------------------           Adj R-squared =  0.0628
       Total |  267.052632    56  4.76879699           Root MSE      =  2.1141

------------------------------------------------------------------------------
     midterm |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     right_1 |   .8168767   .3513228     2.33   0.024           xxxxxxxx
     right_2 |   .1041826   .2916515     0.36   0.722    -.4805435    .6889088
       _cons |   20.11405   .6115866    32.89   0.000     18.88789     21.3402
------------------------------------------------------------------------------
• Fill in the missing confidence interval.
• The 5 percent (two-sided) critical value from the t distribution with 57 - 3 = 54 degrees of freedom is approximately 2.00.
• So, the lower end of the interval is .8169 - 2.00(.3513) = .1143
• And the upper end of the interval is .8169 + 2.00(.3513) = 1.5195
• These differ slightly from the reported interval (.1125169 to 1.521236) because Stata uses the exact critical value, about 2.005, rather than 2.00.
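The interval arithmetic can be sketched as below. The critical value 2.00 is the approximation used on the slide; rather than assuming a t-table lookup, the sketch backs out the critical value implied by the interval reported for right_1 in the full regression output (.1125169 to 1.521236). The function name `confidence_interval` is illustrative.

```python
def confidence_interval(coef, std_err, critical_value):
    """Two-sided interval: coef plus or minus critical_value * std_err."""
    margin = critical_value * std_err
    return coef - margin, coef + margin


coef, std_err = 0.8168767, 0.3513228     # right_1 row of the output

# Approximate interval using the rounded critical value from the slide
lower, upper = confidence_interval(coef, std_err, 2.00)
print(f"approx. 95% CI: ({lower:.4f}, {upper:.4f})")

# Critical value implied by the interval Stata reports for right_1
reported_lower, reported_upper = 0.1125169, 1.521236
implied_critical = (reported_upper - reported_lower) / (2 * std_err)
print(f"implied critical value: {implied_critical:.4f}")
```

The implied critical value is a little above 2.00, which accounts for the small gap between the hand calculation and the reported interval.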
      Source |       SS       df       MS              Number of obs =      57
-------------+------------------------------           F(  2,    54) =    2.88
       Model |  25.7073896     2  12.8536948           Prob > F      =  0.0650
    Residual |  241.345242    54  4.46935633           R-squared     =  0.0963
-------------+------------------------------           Adj R-squared =  0.0628
       Total |  267.052632    56  4.76879699           Root MSE      =  2.1141

------------------------------------------------------------------------------
     midterm |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     right_1 |   .8168767   .3513228     2.33   0.024     .1125169    1.521236
     right_2 |   .1041826   .2916515     0.36   0.722    -.4805435    .6889088
       _cons |   20.11405   .6115866    32.89   0.000     18.88789     21.3402
------------------------------------------------------------------------------
• Given the results in the table, how could you estimate the variance parameter, σ²?
• Sounds like an interesting test question...
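One standard way to approach this question, offered as a sketch rather than as the intended answer: the usual unbiased estimator of the error variance divides the residual sum of squares by its degrees of freedom (observations minus estimated parameters). The numbers are copied from the output; the variable names are illustrative.

```python
import math

# Values copied from the regression output
rss = 241.345242    # Residual sum of squares
n_obs = 57
n_params = 3        # two slopes plus the intercept (assumed count, matches df = 54)

# Usual estimator of the error variance: RSS / (n - k)
sigma2_hat = rss / (n_obs - n_params)
root_mse = math.sqrt(sigma2_hat)

print(f"estimated variance = {sigma2_hat:.4f}")
print(f"root MSE           = {root_mse:.4f}")
```

Under this reading the estimate is simply the Residual MS entry in the table, and its square root is the reported Root MSE.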