- Number of slides: 26
S 519: Evaluation of Information Systems Social Statistics Inferential Statistics Chapter 10: t test
T test for dependent
- A repeated-measures study (a.k.a. a dependent study) is one in which a single sample of individuals is measured more than once on the same dependent variable.
- Main benefit: the two sets of data come from the same subjects.
Example
- Three professors at the University of Alabama studied the effects of resource and regular classrooms on the reading achievement of learning-disabled children. A group of children was tested before and after receiving 1 year of daily instruction.
- Which statistical test should we use?
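T test for dependent
\[
t = \frac{\sum D}{\sqrt{\dfrac{n \sum D^2 - \left(\sum D\right)^2}{n - 1}}}
\]
- $\sum D$: the sum of all the differences between the paired scores
- $\sum D^2$: the sum of the squared differences between the paired scores
- $n$: the number of pairs of observations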
T test for dependent
Pre-test  Post-test    D    D²
    3         7        4    16
    5         8        3     9
    4         6        2     4
    6         7        1     1
    5         8        3     9
    5         9        4    16
    4         6        2     4
    5         6        1     1
    3         7        4    16
    6         8        2     4
    7         8        1     1
    8         7       -1     1
    7         9        2     4
    6        10        4    16
    7         9        2     4
    8         9        1     1
    8         8        0     0
    9         8       -1     1
    9         4       -5    25
    8         4       -4    16
    7         5       -2     4
    7         6       -1     1
    6         9        3     9
    7         8        1     1
    8        12        4    16
T test for dependent
- Step 1: A statement of the null and research hypotheses
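The hypotheses themselves did not survive in the slide text. Given that the later steps treat the research hypothesis as directional and conclude that post-test scores are higher, one plausible way to write them (an assumption, not the slide's own notation) is:

```latex
\[
H_0 : \mu_{\text{posttest}} = \mu_{\text{pretest}}
\qquad
H_1 : \bar{X}_{\text{posttest}} > \bar{X}_{\text{pretest}}
\]
```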
T test for dependent
- Step 2: Setting the level of risk (the level of significance, or Type I error) associated with the null hypothesis
- 0.05
T test for dependent
- Step 3: Selection of the appropriate test statistic
- Following Figure 10.1
- T test for dependent = t test for paired samples = t test for correlated samples
T test for dependent
- Step 4: Computation of the test statistic value
- t = 2.45
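The value t = 2.45 can be checked with a short script. This is a minimal sketch that assumes the usual computational formula t = ΣD / sqrt((nΣD² - (ΣD)²) / (n - 1)), with D taken as post-test minus pre-test. The score lists are transcribed from the pre-test/post-test table, so treat the transcription as an assumption; it does reproduce the means 6.32 and 7.52 reported in the ToolPak output. The function name `dependent_t` is illustrative.

```python
import math

# Scores for the 25 children, transcribed from the pre-test/post-test table
# (assumed transcription; the means match the reported 6.32 and 7.52).
pretest = [3, 5, 4, 6, 5, 5, 4, 5, 3, 6, 7, 8, 7,
           6, 7, 8, 8, 9, 9, 8, 7, 7, 6, 7, 8]
posttest = [7, 8, 6, 7, 8, 9, 6, 6, 7, 8, 8, 7, 9,
            10, 9, 9, 8, 8, 4, 4, 5, 6, 9, 8, 12]


def dependent_t(first, second):
    """Return (sum_d, sum_d2, n, t) for paired scores, with D = second - first."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    sum_d = sum(diffs)                    # sum of the differences
    sum_d2 = sum(d * d for d in diffs)    # sum of the squared differences
    t = sum_d / math.sqrt((n * sum_d2 - sum_d ** 2) / (n - 1))
    return sum_d, sum_d2, n, t


sum_d, sum_d2, n, t = dependent_t(pretest, posttest)
print(f"sum D = {sum_d}, sum D^2 = {sum_d2}, n = {n}, t = {t:.2f}")
# prints: sum D = 30, sum D^2 = 180, n = 25, t = 2.45
```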
T test for dependent
- Step 5: Determination of the value needed for rejection of the null hypothesis
- Table B2
- df = n - 1 = 25 - 1 = 24
- One-tailed, because the research hypothesis is directional
T test for dependent
- Step 6: A comparison of the t value and the critical value
- 2.45 > 1.711
- Reject the null hypothesis
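The critical value normally comes from a printed table. As a rough cross-check, the sketch below recovers a one-tailed critical value by numerically integrating the Student's t density and bisecting on the tail area. It uses only the standard library; the helper names (`t_pdf`, `upper_tail`, `critical_t`) are illustrative, and the approach assumes a small df (very large df would overflow `math.gamma`) and an alpha below 0.5.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail(t, df, steps=2000):
    """P(T >= t), via Simpson's rule on [0, |t|] and the symmetry of the density."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # integral of the density from 0 to |t|
    return 0.5 - area if t >= 0 else 0.5 + area


def critical_t(alpha, df):
    """One-tailed critical value: the t for which P(T >= t) equals alpha."""
    lo, hi = 0.0, 100.0
    for _ in range(60):  # bisection on the tail area
        mid = (lo + hi) / 2
        if upper_tail(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


df = 25 - 1
t_obtained = 2.45
t_crit = critical_t(0.05, df)
print(f"critical t (one-tailed, alpha = .05, df = {df}) = {t_crit:.3f}")
print("reject H0" if t_obtained > t_crit else "fail to reject H0")
```

The computed critical value should land very close to the tabled 1.711, so the decision matches the slide: the obtained t exceeds it and the null hypothesis is rejected.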
T test for dependent
- Step 7 and 8: Time for a decision
- There is a significant difference between the pre-test and the post-test: the post-test scores are higher than the pre-test scores.
Excel: TTEST function
- TTEST(array1, array2, tails, type)
- array1 = the cell address for the first set of data
- array2 = the cell address for the second set of data
- tails: 1 = one-tailed, 2 = two-tailed
- type: 1 = a paired t test; 2 = a two-sample test (independent, with equal variances); 3 = a two-sample test with unequal variances
Excel TTEST()
- It does not compute the t value.
- It returns the probability associated with the t value: how likely a t value at least this extreme would be if only chance were operating.
- For the pre-test/post-test data that probability is about 1% (one-tailed p ≈ 0.011), so the difference between the two tests is more plausibly due to something other than chance.
Excel ToolPak
- t-Test: Paired Two Sample for Means option

t-Test: Paired Two Sample for Means
                               pretest         posttest
Mean                           6.32            7.52
Variance                       2.976666667     3.343333333
Observations                   25              25
Pearson Correlation            0.050718341
Hypothesized Mean Difference   0
df                             24
t Stat                         -2.449489743
P(T<=t) one-tail               0.010991498
t Critical one-tail            1.710882067
P(T<=t) two-tail               0.021982997
t Critical two-tail            2.063898547
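The descriptive rows of this output can be reproduced with Python's standard library. The sketch below is an illustration, not how Excel computes them internally; the score lists are again an assumed transcription of the pre-test/post-test table. It also shows why t Stat is negative here while Step 4 reports 2.45: the output takes the differences as pretest minus posttest.

```python
import math
import statistics

# Assumed transcription of the 25 paired scores from the slide table.
pretest = [3, 5, 4, 6, 5, 5, 4, 5, 3, 6, 7, 8, 7,
           6, 7, 8, 8, 9, 9, 8, 7, 7, 6, 7, 8]
posttest = [7, 8, 6, 7, 8, 9, 6, 6, 7, 8, 8, 7, 9,
            10, 9, 9, 8, 8, 4, 4, 5, 6, 9, 8, 12]

n = len(pretest)
mean_pre = statistics.mean(pretest)
mean_post = statistics.mean(posttest)
var_pre = statistics.variance(pretest)    # sample variance (divides by n - 1)
var_post = statistics.variance(posttest)

# Pearson correlation between the paired scores, computed from the sample covariance.
cov = sum((x - mean_pre) * (y - mean_post)
          for x, y in zip(pretest, posttest)) / (n - 1)
r = cov / math.sqrt(var_pre * var_post)

# t Stat with differences taken as pretest - posttest, hence the negative sign.
diffs = [x - y for x, y in zip(pretest, posttest)]
t_stat = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))

print(f"Mean                {mean_pre:.2f}   {mean_post:.2f}")
print(f"Variance            {var_pre:.6f}   {var_post:.6f}")
print(f"Pearson Correlation {r:.6f}")
print(f"df                  {n - 1}")
print(f"t Stat              {t_stat:.6f}")
```

The low correlation (about 0.05) is worth noticing: pairing helps most when the two sets of scores are strongly correlated.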
Advantages of the Repeated-Samples Design
- A repeated-measures design reduces the variance by removing the individual differences between samples, because the same subjects are measured in both conditions.
Problems With the Repeated-Samples Design
- Carryover effect (specifically associated with repeated-measures designs): a subject's score on the second measurement is altered by a lingering aftereffect of the first measurement.
Types of t test
Example I
- A researcher is interested in a new technique to improve SAT verbal scores.
- It is known that SAT verbal scores have μ = 500 and σ = 100.
- She randomly selects n = 30 students from this population and has them undergo her training technique. Students are given analogy questions and are shocked each time they get an answer wrong. The sample then writes the SAT and gets M = 560.
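The slide leaves the choice of test as an exercise. One reasonable reading, stated here as an assumption, is that a one-sample z test applies because the population standard deviation is known. A minimal sketch of the test statistic (the function name is illustrative):

```python
import math


def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """z statistic for a sample mean when the population sigma is known."""
    standard_error = pop_sd / math.sqrt(n)
    return (sample_mean - pop_mean) / standard_error


z = one_sample_z(sample_mean=560, pop_mean=500, pop_sd=100, n=30)
print(f"z = {z:.2f}")  # prints: z = 3.29
```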
Example II
- A social psychologist is interested in whether people feel more or less hopeful following a devastating flood in a small rural community.
- He randomly selects n = 10 people and asks them to report how hopeful they feel using a 7-point scale from extremely hopeful (1) to neutral (4) to extremely unhopeful (7).
- The researcher is interested in whether the responses are consistently above or below the midpoint (4) on the scale, but has no hypothesis about which direction they are likely to go. His sample reports M = 4.7, s = 1.89.
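Assuming this calls for a one-sample t test against the scale midpoint (the population standard deviation is unknown, so s is used), the statistic could be sketched as follows. Because no direction is predicted, it would be compared against a two-tailed critical value. The function name is illustrative.

```python
import math


def one_sample_t(sample_mean, test_value, sample_sd, n):
    """Return (t, df) for a one-sample t test against a fixed test value."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - test_value) / standard_error, n - 1


t, df = one_sample_t(sample_mean=4.7, test_value=4, sample_sd=1.89, n=10)
print(f"t({df}) = {t:.2f}")  # prints: t(9) = 1.17
```

The same helper would cover any case that compares one sample mean with a fixed reference value when only the sample standard deviation is available.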
Example III
- To test the hypothesis that people give out more candy to kids in cute costumes than scary ones, I hire 20 kids to work for me. Ten are randomly assigned to wear cute bunny costumes, and the other ten wear Darth Vader costumes.
- I drop the kids off in random parts of the city, and count the total pieces of candy each has after 1 hour of trick-or-treat. Cute bunnies: M = 120, s = 10. Darth Vaders: M = 112, s = 12.
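Since two separate groups of kids are compared, an independent-samples t test is a natural candidate. The sketch below assumes equal population variances (the pooled-variance form); that assumption and the function name are not from the slide.

```python
import math


def independent_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for an independent-samples t test with pooled variance."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, n1 + n2 - 2


# Cute bunnies vs. Darth Vaders
t, df = independent_t(120, 10, 10, 112, 12, 10)
print(f"t({df}) = {t:.2f}")  # prints: t(18) = 1.62
```

The hypothesis is directional (bunnies get more candy), so the result would be judged against a one-tailed critical value with df = 18.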
Example IV
- We are testing the effects of moderate amounts of alcohol on driving performance. We hypothesize that even a small amount of beer will degrade driving performance (an increase in obstacles hit).
- To test our hypothesis, we have n = 5 subjects drive Big Wheels™ around a course covered with cardboard cutouts of children and furry animals, and we record the number of cutouts they hit. Then they drink one beer and do the course again; again we record the number of cutouts hit.
- What is a potential confound with this experiment?
Example V
- We want to determine if IU SLIS faculty publish more than the national average of 4 papers per year (per person). We take a random sample of n = 12 IU SLIS profs and survey the number of papers each has published, obtaining M = 6.3, s = 1.13.
Example VI
- I want to know which breed of dog is responsible for the holes in my yard. I buy 10 German Shepherds and 10 Beagles, and randomly assign each dog to its own yard. At the end of the day, the Beagles have dug M = 11.3 holes, s = 2.1, and the Shepherds have dug M = 5.4 holes, s = 1.9. Test my hypothesis that Beagles dig more holes than German Shepherds.
Example VII
- We want to know if noise affects surgery performance. We randomly select a sample of 9 surgeons and have them perform a hand-eye coordination task (not while performing surgery, of course). The surgeons first perform the task in a quiet condition, and then we have them perform the same task under a noisy condition. Test the hypothesis that noise will cause poorer performance on the task.
Example VIII
- ETS reports that GRE quantitative scores for people who have not taken a training course are μ = 555, σ = 139. We take a sample of 10 people from this population and give them a new preparation course. Test the hypothesis that their test scores differ from the population.