aa6e86c51eeeeb8ab3a273e8451bff99.ppt

- Количество слайдов: 32

Homework #1 is due now Today: T-test and Outliers xkcd. com

• Stats practice in next lab • Also need to start putting together your group for inquiry 2. . . 3 -5 people/group • Inquiry 1 written and oral reports are due in lab Th 9/24 or M 9/28 • Homework #2 is posted. . . Stream Open House • Homework #3 will ask you to complete some online safety training • Online evaluation • More TA office hours

How significant of a difference is this? Set 1= 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 7 Mean = 3. 67 ± 1. 6 range = 2. 07 to 5. 27 And Set 2= 8, 6, 7, 8, 9, 5, 6, 7, 9, 8, 9, 5 Mean = 7. 25 ± 1. 48 range = 5. 77 to 8. 73 No overlap, might be different

The ‘Students’ T-test is a method to assign a numerical value of statistical difference.

The ‘Students’ T-test is a method to assign a numerical value of statistical difference.

The ‘Students’ T-test is a method to assign a numerical value of statistical difference. (Difference between means) (variance) (sample size)

The ‘Students’ T-test is a method to assign a numerical value of statistical difference. T is then used to look up the P-value from a table. Also need ‘degrees of freedom’ = (n 1+n 2)-1.

Partial table for determining P from T P-value Df 0. 05 0. 02 0. 01 1 12. 71 31. 82 63. 66 2 4. 303 6. 965 9. 925 3 3. 182 4. 541 5. 841 4 2. 776 3. 747 4. 604 5 2. 571 3. 365 4. 032 6 2. 447 3. 143 3. 707 T

How significant of a difference is this? Using a speadsheet to get a P value = 3. 44 x 10 -6. Set 1= 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 7 Mean = 3. 67 ± 1. 6 And Set 2= 8, 6, 7, 8, 9, 5, 6, 7, 9, 8, 9, 5 Mean = 7. 25 ± 1. 48

How significant of a difference is this? P value = 3. 44 x 10 -6. So the chance that these 2 sets of data are not significantly different is 3. 44 x 10 -6 Set 1= 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 7 Mean = 3. 67 ± 1. 6 And Set 2= 8, 6, 7, 8, 9, 5, 6, 7, 9, 8, 9, 5 Mean = 7. 25 ± 1. 48

How significant of a difference is this? P value = 3. 44 x 10 -6. So the chance that these 2 sets of data are significantly different is 1 - 3. 44 x 10 -6 or 0. 999996559 We can be 99. 9996559% certain that the difference is statistically significant. Set 1= 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 7 Mean = 3. 67 ± 1. 6 Set 2= 8, 6, 7, 8, 9, 5, 6, 7, 9, 8, 9, 5 Mean = 7. 25 ± 1. 48

Overlap, different means, but might not be a statistically significant difference Set 1= 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 7 Mean = 3. 67 ± 1. 6 range = 2. 07 to 5. 27 Set 2= 8, 6, 7, 8, 9, 5, 6, 7, 9, 8, 4, 5 Mean = 6. 83 ± 1. 64 range = 5. 19 to 8. 47 P-value = 4. 41 x 10 -5

What is, or is not, a statistically significant difference? 20% random difference : 80% confidence 10% random difference : 90% confidence 5% random difference : 95% confidence 1% random difference : 99% confidence 0. 1% random difference : 99. 9% confidence

Generally a P-value of 0. 05 or less is considered a statistically significant difference. 20% random difference : 80% confidence 10% random difference : 90% confidence 5% random difference : 95% confidence 1% random difference : 99% confidence 0. 1% random difference : 99. 9% confidence

Standard deviation is NOT a valid method for determining statistical signifigance. T-test is one valid and accurate method for determining if 2 means have a statistically significant difference, or if the difference is merely by chance.

Outliers… 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 7, 121, 130 Median = 4 Mean = 18

Outliers: When is data invalid?

Outliers: When is data invalid? Not simply when you want it to be.

Outliers: When is data invalid? Not simply when you want it to be. Dixon’s Q test can determine if a value is statistically an outlier.

Dixon’s Q test can determine if a value is statistically an outlier.

Dixon’s Q test can determine if a value is statistically an outlier. Example: results from a blood test… 789, 700, 772, 766, 777

Dixon’s Q test can determine if a value is statistically an outlier. Example: results from a blood test… 789, 700, 772, 766, 777

Dixon’s Q test can determine if a value is statistically an outlier. Example: results from a blood test… 789, 700, 772, 766, 777 Q=|(700 – 766)| ÷ |(789 – 700)|

Dixon’s Q test can determine if a value is statistically an outlier. Example: results from a blood test… 789, 700, 772, 766, 777 Q =|(700 – 766)| ÷ |(789 – 700)| = 0. 742

Dixon’s Q test can determine if a value is statistically an outlier. Example: results from a blood test… 789, 700, 772, 766, 777 Q =|(700 – 766)| ÷ |(789 – 700)| = 0. 742 So?

You need the critical values for Q table: Sample # Q critical value 3 0. 970 4 0. 831 5 0. 717 6 0. 621 7 0. 568 10 0. 466 12 0. 426 15 0. 384 20 0. 342 25 0. 317 If Q calc > Q crit rejected From: E. P. King, J. Am. Statist. Assoc. 48: 531 (1958)

You need the critical values for Q table: Sample # Q critical value 3 0. 970 4 0. 831 5 0. 717 6 0. 621 7 0. 568 10 0. 466 12 0. 426 15 0. 384 20 0. 342 25 If Q calc > Q crit than the outlier can be rejected 0. 317 Q calc = 0. 742 Q crit = 0. 717 = rejection From: E. P. King, J. Am. Statist. Assoc. 48: 531 (1958)

What can outliers tell us?

If you made a mistake, you should have already accounted for that.

Outliers can lead to important and fascinating discoveries. Transposons “jumping genes” were discovered because they did not fit known modes of inheritance.

Homework #1 is due now and homework #2 is posted Next: R 2, Samples, and Populations xkcd. com