
  • Number of slides: 42

Lesson 9: Two Sample Tests of Hypothesis
Ka-fu Wong © 2004
ECON 1003: Analysis of Economic Data

Outline
  • The formula of the general test statistic
  • Hypothesis tests for two population means
  • Independent versus dependent populations
  • Large sample tests for the difference in means of two independent populations
  • Hypothesis testing involving paired observations
  • Large sample tests for the difference in proportions of two independent populations

Overview
To test the effect of an herbal treatment on memory improvement, you randomly select two samples, one to receive the treatment and one to receive a placebo. Results of a memory test taken one month later are given.
  • Sample 1: experimental group (treatment)
  • Sample 2: control group (placebo)
The resulting difference in sample means is 77 - 73 = 4. Is this difference significant, or is it due to chance (sampling error)?

Two Sample Tests
[Figure: two panels, "Test for equal variances" and "Test for equal means", each showing Population 1 and Population 2 under H0 and under H1.]

The formula of general test statistic
  • Suppose we are interested in testing whether the population parameter ($\theta$) is equal to k.
      • H0: $\theta = k$
      • H1: $\theta \neq k$
  • First, we need to get a sample estimate (q) of the population parameter ($\theta$).
  • Second, we know that in most cases the test statistic will have the following form:
      • $t = (q - k) / \sigma_q$
      • $\sigma_q$ is the standard deviation of q under the null. The form of $\sigma_q$ depends on what q is.
  • Sample size and the null at hand determine the distribution of the statistic.
      • If $\theta$ is the population mean and the sample size is larger than 30, t is approximately normal.
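The general form above can be sketched in a few lines of Python. This is only an illustration: the function name `general_test_statistic` and the sample figures (a sample mean of 52 from 64 observations with standard deviation 8, tested against k = 50) are hypothetical and do not come from the lesson.

```python
from math import sqrt
from statistics import NormalDist


def general_test_statistic(q, k, sigma_q):
    """Return (q - k) / sigma_q, the general form of the test statistic."""
    return (q - k) / sigma_q


# Hypothetical data (not from the lesson): testing H0: mean = 50.
n, sample_mean, sample_sd, k = 64, 52.0, 8.0, 50.0

# For a sample mean, the standard deviation of q is estimated by s / sqrt(n).
sigma_q = sample_sd / sqrt(n)
t = general_test_statistic(sample_mean, k, sigma_q)

# With n > 30 the statistic is treated as approximately standard normal.
p_two_sided = 2 * (1 - NormalDist().cdf(abs(t)))
print(f"t = {t:.2f}, two-sided p-value = {p_two_sided:.4f}")
```

Only the estimate q and its standard deviation change from one test to the next; the ratio itself stays the same, which is why the later slides differ mainly in their denominators.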

Comparing two populations
  • We wish to know whether the distribution of the differences in sample means has a mean of 0.
  • If both samples contain at least 30 observations, the test statistic is approximately standard normal, so we use the z distribution.

Hypothesis Tests for Two Population Means
[Table: null and alternate hypotheses for the two-tailed test, the upper one-tailed test, and the lower one-tailed test, each written in two formats (Format 1 and Format 2, one of them marked "Preferred").]

Two Independent Populations: Examples
1. An economist wishes to determine whether there is a difference in mean family income for households in two socioeconomic groups.
      • Do HKU students come from families with higher income than CUHK students?
2. An admissions officer of a small liberal arts college wants to compare the mean SAT scores of applicants educated in rural high schools & in urban high schools.
      • Do students from rural high schools have lower A-level exam scores than students from urban high schools?

Two Dependent Populations: Examples
1. An analyst for Educational Testing Service wants to compare the mean GMAT scores of students before & after taking a GMAT review course.
      • Get HKU graduates to take the A-Level English and Chinese exams again. Do they get higher A-Level English and Chinese exam scores than at the time they entered HKU?
2. Nike wants to see if there is a difference in durability of 2 sole materials. One type is placed on one shoe, the other type on the other shoe of the same pair.

Thinking Challenge
Are they independent or dependent?
1. Miles per gallon ratings of cars before & after mounting radial tires: dependent
2. The life expectancies of light bulbs made in two different factories: independent
3. Difference in hardness between 2 metals, where one contains an alloy and one doesn't: independent
4. Tread life of two different motorcycle tires, one on the front and the other on the back: dependent

Comparing two populations
  • No assumptions about the shape of the populations are required.
  • The samples are from independent populations.
      • Values in one sample have no influence on the values in the other sample(s).
      • Variance formula for independent random variables A and B: V(A - B) = V(A) + V(B)
  • The formula for computing the value of z is:
    $$z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
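The variance rule for independent random variables is what determines the denominator of the large-sample statistic. A short derivation follows, assuming independent samples of sizes $n_1$ and $n_2$ drawn from populations with variances $\sigma_1^2$ and $\sigma_2^2$ (these symbols are introduced here for illustration):

```latex
\begin{align*}
V(\bar{X}_1 - \bar{X}_2)
  &= V(\bar{X}_1) + V(\bar{X}_2)
   = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}, \\
\widehat{\mathrm{SE}}(\bar{X}_1 - \bar{X}_2)
  &= \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
  \quad \text{(replacing each unknown } \sigma_i^2 \text{ by the sample variance } s_i^2\text{)}.
\end{align*}
```

Dividing the observed difference in sample means by this estimated standard error gives a statistic that is approximately standard normal under the null when both samples are large.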

Example 1
Two cities, Bradford and Kane, are separated only by the Conewango River. There is competition between the two cities. The local paper recently reported that the mean household income in Bradford is $38,000 with a standard deviation of $6,000 for a sample of 40 households. The same article reported that the mean income in Kane is $35,000 with a standard deviation of $7,000 for a sample of 35 households. At the .01 significance level, can we conclude that the mean income in Bradford is higher?

Example 1 continued
  • Step 1: State the null and alternate hypotheses. H0: µB ≤ µK ; H1: µB > µK
  • Step 2: State the level of significance. The .01 significance level is stated in the problem.
  • Step 3: Find the appropriate test statistic. Because both samples have more than 30 observations, we can use z as the test statistic.

Example 1 continued
  • Step 4: State the decision rule. The null hypothesis is rejected if z is greater than 2.33.
[Figure: probability density of the z statistic, N(0, 1), for H0: µB ≤ µK ; H1: µB > µK. The rejection region (probability 0.01) lies in the upper tail, above 2.33.]

Example 1 continued
  • Step 5: Compute the value of z and make a decision.
    $$z = \frac{38{,}000 - 35{,}000}{\sqrt{\dfrac{6{,}000^2}{40} + \dfrac{7{,}000^2}{35}}} = 1.98$$
[Figure: the N(0, 1) density for H0: µB ≤ µK ; H1: µB > µK, with the computed value 1.98 marked below the rejection region (probability 0.01).]
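Steps 4 and 5 of Example 1 can be checked with a short sketch. It assumes the usual large-sample statistic, the difference in sample means divided by the square root of s1²/n1 + s2²/n2; the helper name `two_sample_z` is hypothetical, and the critical value is taken from Python's standard-library normal distribution rather than from a printed table.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Large-sample z statistic for the difference in two independent means."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error


# Example 1: Bradford (sample 1) versus Kane (sample 2).
z = two_sample_z(38_000, 6_000, 40, 35_000, 7_000, 35)

# Upper one-tailed test at the .01 significance level.
critical = NormalDist().inv_cdf(1 - 0.01)
reject_h0 = z > critical

print(f"z = {z:.2f}, critical value = {critical:.2f}, reject H0: {reject_h0}")
```

The computed z falls short of the critical value, so the null hypothesis is not rejected at the .01 level.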

Example 1 continued
  • The decision is to not reject the null hypothesis, because 1.98 is less than the critical value of 2.33. We cannot conclude that the mean household income in Bradford is larger.

Example 1 continued
  • The p-value is:
      • P(z > 1.98) = .5000 - .4761 = .0239
[Figure: N(0, 1) density for H0: µB ≤ µK ; H1: µB > µK. The p-value (0.0239) is the area above 1.98, which is larger than the rejection region probability of 0.01.]
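The table lookup for the p-value can be reproduced with the standard normal CDF. A minimal sketch, using the rounded statistic z = 1.98 from the example:

```python
from statistics import NormalDist

z = 1.98      # rounded test statistic from Example 1
alpha = 0.01  # significance level stated in the problem

# Upper-tail probability P(Z > z) for a standard normal Z.
p_value = 1 - NormalDist().cdf(z)

print(f"p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

Comparing the p-value with the significance level gives the same decision as comparing z with the critical value: the p-value exceeds .01, so H0 is not rejected.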

Small Sample Tests of Means
  • The t distribution is used for the test statistic if one or more of the samples have fewer than 30 observations.
  • The required assumptions are:
    1. Both populations must follow the normal distribution.
    2. The populations must have equal standard deviations.
    3. The samples are from independent populations.

Small sample test of means continued
  • Finding the value of the test statistic requires two steps.
Step 1: Pool the sample variances. (Why is the divisor not n1 + n2?)
    $$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
Step 2: Determine the value of t from the following formula.
    $$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Small sample test of means continued
Why not n1 + n2?
  • (n1 - 1) is the degrees of freedom of $s_1^2$. One df is lost because the sample mean must be fixed before the sample variance is computed. Dividing by the df instead of n1 ensures that $s_1^2$ is an unbiased estimate of the population variance.
  • (n1 + n2 - 2) is the degrees of freedom of $s_p^2$. Two dfs are lost because two sample means must be fixed before the pooled variance is computed. Dividing by the df instead of (n1 + n2) ensures that $s_p^2$ is an unbiased estimate of the population variance.
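The unbiasedness claim can be verified exactly on a tiny case. The sketch below uses a hypothetical population {1, 2, 3, 4} (not from the lesson), enumerates every possible sample of size 3 drawn with replacement, and averages the two candidate variance estimates using exact fractions.

```python
from fractions import Fraction
from itertools import product


def sum_sq_dev(sample):
    """Sum of squared deviations of the sample values from the sample mean."""
    sample_mean = Fraction(sum(sample), len(sample))
    return sum((Fraction(x) - sample_mean) ** 2 for x in sample)


population = [1, 2, 3, 4]  # hypothetical population, each value equally likely
n = 3                      # sample size

pop_mean = Fraction(sum(population), len(population))
pop_var = sum((x - pop_mean) ** 2 for x in population) / len(population)

# All 4**3 equally likely samples of size n, drawn with replacement.
samples = list(product(population, repeat=n))

avg_div_df = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)
avg_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)

print("population variance:       ", pop_var)
print("mean estimate, divisor n-1:", avg_div_df)
print("mean estimate, divisor n:  ", avg_div_n)
```

Averaged over all samples, the estimate with divisor n - 1 equals the population variance, while the estimate with divisor n is too small by the factor (n - 1)/n.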

Example 2
  • A recent EPA study compared the highway fuel economy of domestic and imported passenger cars. A sample of 15 domestic cars revealed a mean of 33.7 mpg with a standard deviation of 2.4 mpg. A sample of 12 imported cars revealed a mean of 35.7 mpg with a standard deviation of 3.9 mpg.
  • At the .05 significance level, can the EPA conclude that the mpg is higher on the imported cars?

Example 2 continued
  • Step 1: State the null and alternate hypotheses. H0: µD ≥ µI ; H1: µD < µI
  • Step 2: State the level of significance. The .05 significance level is stated in the problem.
  • Step 3: Find the appropriate test statistic. Both samples have fewer than 30 observations, so we use the t distribution.

Example 2 continued
Step 4: The decision rule is to reject H0 if t < -1.708. There are 25 degrees of freedom.
[Figure: probability density of the t statistic, t(df = 25), with the rejection region (probability 0.05) in the lower tail, below -1.708.]

Example 2 continued
Step 5: We compute the pooled variance:
    $$s_p^2 = \frac{(15 - 1)(2.4)^2 + (12 - 1)(3.9)^2}{15 + 12 - 2} = 9.918$$

Example 2 continued
We compute the value of t as follows.
    $$t = \frac{33.7 - 35.7}{\sqrt{9.918\left(\dfrac{1}{15} + \dfrac{1}{12}\right)}} = -1.640$$

Example 2 continued
[Figure: t(df = 25) density with the computed value -1.640 marked outside the rejection region (probability 0.05).]
H0 is not rejected. There is insufficient sample evidence to claim a higher mpg on the imported cars.
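The two-step calculation in Example 2 can be sketched as follows. The helper names `pooled_variance` and `pooled_t` are hypothetical, the formulas are the standard equal-variance ones assumed here, and the critical value -1.708 is taken from the slide rather than computed.

```python
from math import sqrt


def pooled_variance(sd1, n1, sd2, n2):
    """Pooled variance: df-weighted average of the two sample variances."""
    return ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic assuming equal population variances."""
    sp2 = pooled_variance(sd1, n1, sd2, n2)
    return (mean1 - mean2) / sqrt(sp2 * (1 / n1 + 1 / n2))


# Example 2: domestic cars (sample 1) versus imported cars (sample 2).
sp2 = pooled_variance(2.4, 15, 3.9, 12)
t = pooled_t(33.7, 2.4, 15, 35.7, 3.9, 12)

critical = -1.708  # lower-tail critical value for df = 25, as given on the slide
reject_h0 = t < critical

print(f"pooled variance = {sp2:.3f}, t = {t:.3f}, reject H0: {reject_h0}")
```

Because the computed t is not below the critical value, H0 is not rejected.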

Hypothesis Testing Involving Paired Observations
  • Independent samples are samples that are not related in any way.
  • Dependent samples are samples that are paired or related in some fashion. For example:
      • If you wished to buy a car, you would look at the same car at two (or more) different dealerships and compare the prices.
      • If you wished to measure the effectiveness of a new diet, you would weigh the dieters at the start and at the finish of the program.

Hypothesis Testing Involving Paired Observations
  • Use the following test when the samples are dependent:
    $$t = \frac{\bar{d}}{s_d / \sqrt{n}}$$
  • where $\bar{d}$ is the mean of the differences
  • $s_d$ is the standard deviation of the differences
  • n is the number of pairs (differences)

Example 3
  • An independent testing agency is comparing the daily rental cost for renting a compact car from Hertz and Avis. A random sample of eight cities revealed the following information. At the .05 significance level, can the testing agency conclude that there is a difference in the rental charge?

Example 3 continued
City          Hertz ($)   Avis ($)
Atlanta       42          40
Chicago       56          52
Cleveland     45          43
Denver        48          48
Honolulu      37          32
Kansas City   45          48
Miami         41          39
Seattle       46          50

Example 3 continued
  • Step 1: State the null and alternate hypotheses. H0: µd = 0 ; H1: µd ≠ 0
  • Step 2: State the level of significance. The .05 significance level is stated in the problem.
  • Step 3: Find the appropriate test statistic. We can use t as the test statistic.

Example 3 continued
  • Step 4: State the decision rule. H0 is rejected if t < -2.365 or t > 2.365. We use the t distribution with 7 degrees of freedom.
[Figure: probability density of the t statistic, t(df = 7), for H0: µd = 0 ; H1: µd ≠ 0, with Rejection Region I (probability 0.025) in the lower tail and Rejection Region II (probability 0.025) in the upper tail.]

Example 3 continued
City          Hertz ($)   Avis ($)   d     d²
Atlanta       42          40          2     4
Chicago       56          52          4    16
Cleveland     45          43          2     4
Denver        48          48          0     0
Honolulu      37          32          5    25
Kansas City   45          48         -3     9
Miami         41          39          2     4
Seattle       46          50         -4    16

Example 3 continued
    $$\bar{d} = \frac{\sum d}{n} = \frac{8}{8} = 1.00$$
    $$s_d = \sqrt{\frac{\sum d^2 - (\sum d)^2 / n}{n - 1}} = \sqrt{\frac{78 - 8^2 / 8}{8 - 1}} = 3.1623$$
    $$t = \frac{\bar{d}}{s_d / \sqrt{n}} = \frac{1.00}{3.1623 / \sqrt{8}} = 0.894$$

Example 3 continued
  • Step 5: Because 0.894 lies between the critical values -2.365 and 2.365, do not reject the null hypothesis. There is insufficient evidence of a difference in the mean amount charged by Hertz and Avis.
[Figure: t(df = 7) density for H0: µd = 0 ; H1: µd ≠ 0, with the computed value 0.894 marked between Rejection Region I (probability 0.025) and Rejection Region II (probability 0.025).]
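The paired test in Example 3 can be reproduced from the city data. This is a sketch assuming the usual paired statistic, the mean difference divided by its standard error s_d / sqrt(n); the critical value 2.365 is the one quoted in Step 4.

```python
from math import sqrt
from statistics import mean, stdev

# Daily rental cost ($) in the eight sampled cities, in the order listed.
hertz = [42, 56, 45, 48, 37, 45, 41, 46]
avis = [40, 52, 43, 48, 32, 48, 39, 50]

# Work with the within-city differences, not the two samples separately.
d = [h - a for h, a in zip(hertz, avis)]
n = len(d)

d_bar = mean(d)   # mean of the differences
s_d = stdev(d)    # sample standard deviation of the differences (divisor n - 1)
t = d_bar / (s_d / sqrt(n))

critical = 2.365  # two-tailed critical value for df = 7, as given in Step 4
reject_h0 = abs(t) > critical

print(f"d_bar = {d_bar}, s_d = {s_d:.4f}, t = {t:.3f}, reject H0: {reject_h0}")
```

Pairing by city removes the city-to-city variation in price levels, so only the variation in the differences enters the standard error.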

Two Sample Tests of Proportions
  • We investigate whether two independent samples came from populations with an equal proportion of successes.
  • The two samples are pooled using the following formula:
    $$p_c = \frac{X_1 + X_2}{n_1 + n_2}$$
    where X1 and X2 refer to the number of successes in the respective samples of n1 and n2.

Two Sample Tests of Proportions continued
  • The value of the test statistic is computed from the following formula, where p1 = X1/n1 and p2 = X2/n2 are the sample proportions:
    $$z = \frac{p_1 - p_2}{\sqrt{\dfrac{p_c(1 - p_c)}{n_1} + \dfrac{p_c(1 - p_c)}{n_2}}}$$
Note: The form of the standard deviation reflects the assumption of independence of the two samples.

Example 4
  • Are unmarried workers more likely to be absent from work than married workers? A sample of 250 married workers showed 22 missed more than five days last year, while a sample of 300 unmarried workers showed 35 missed more than five days. Use a .05 significance level.

Example 4 continued
  • The null and the alternate hypotheses are:
    H0: $\pi_U \le \pi_M$
    H1: $\pi_U > \pi_M$
  • The null hypothesis is rejected if the computed value of z is greater than 1.65.

Example 4 continued
  • The pooled proportion is
    $$p_c = \frac{35 + 22}{300 + 250} = 0.1036$$
  • The value of the test statistic is
    $$z = \frac{\dfrac{35}{300} - \dfrac{22}{250}}{\sqrt{\dfrac{0.1036(1 - 0.1036)}{300} + \dfrac{0.1036(1 - 0.1036)}{250}}} = 1.10$$

Example 4 continued
  • The null hypothesis is not rejected. We cannot conclude that a higher proportion of unmarried workers than married workers miss more than five days a year.
  • The p-value is: P(z > 1.10) = .5000 - .3643 = .1357
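Example 4 can be checked end to end with a short sketch. It assumes the pooled two-proportion z statistic, with the pooled proportion used in both variance terms; the helper name `two_proportion_z` is hypothetical. The p-value is computed from the unrounded z, so it can differ in the third decimal from a table lookup at z = 1.10.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Return (pooled proportion, z) for H0: the two proportions are equal."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) / n1 + pooled * (1 - pooled) / n2)
    return pooled, (p1 - p2) / standard_error


# Sample 1: unmarried workers (35 of 300); sample 2: married workers (22 of 250).
pooled, z = two_proportion_z(35, 300, 22, 250)

critical = 1.65                    # upper-tail critical value used in the example
p_value = 1 - NormalDist().cdf(z)  # upper-tail p-value
reject_h0 = z > critical

print(f"pooled = {pooled:.4f}, z = {z:.2f}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

The statistic is well below 1.65 and the p-value is well above .05, so both routes lead to not rejecting H0.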
