
Stastical Tests.pptx
- Number of slides: 64
Statistical Tests
Hypothesis test • Decide whether the hypothesis is likely to be correct, using the data collected in an experiment. • NULL HYPOTHESIS (H0): There is no real difference between the sets of data; any small difference between them is due to chance alone. • SIGNIFICANCE: In biology the significance level is conventionally 5%. This is the probability of rejecting H0 when H0 is actually correct, so rejecting H0 at this level carries a 5% risk of being wrong.
Data is collected and often organized into formats that are easy to interpret. Example: Plant height due to the application of fertilizers. Height is given in centimeters (cm): 10 14 11 12 15 15 12 13 14 13 12 8 12 9 10 13 11 12 8 10 9 16 7 11 9
[Histogram: number of plants (y-axis) against height in cm (x-axis)]
• Median: The middle number in a set of data. • Mode: The number within the set of data that appears most frequently. • Mean: The average. a. Denoted by $\bar{x}$ b. Calculated by the following formula: $\bar{x} = \frac{\sum x}{n}$
• Variance: Determined by averaging the squared differences of all the values from the mean (dividing by n − 1 for a sample). Symbolized by $\sigma^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
• Standard deviation: A measure of dispersion that describes how far individual entries typically lie from the mean. • Calculated by finding the square root of the variance: $\sigma = \sqrt{\sigma^2}$ • Defines the shape (spread) of the normal distribution curve.
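As a rough illustration, the definitions above can be checked against the 25 plant heights from the fertilizer example. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative and are not part of the original slides.

```python
import statistics

# Plant heights (cm) from the fertilizer example
heights = [10, 14, 11, 12, 15, 15, 12, 13, 14, 13, 12, 8, 12,
           9, 10, 13, 11, 12, 8, 10, 9, 16, 7, 11, 9]

mean = statistics.mean(heights)          # sum of the values divided by n
median = statistics.median(heights)      # middle value of the sorted data
mode = statistics.mode(heights)          # most frequent value
variance = statistics.variance(heights)  # sample variance, divides by n - 1
std_dev = statistics.stdev(heights)      # square root of the sample variance

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"variance={variance:.2f} sd={std_dev:.2f}")
```

Note that `statistics.variance` and `statistics.stdev` use the n − 1 denominator shown in the variance formula; `pvariance` and `pstdev` would divide by n instead.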
• The red area represents the first standard deviation. • About 68% of the data falls within this area. • Calculated by $\bar{x} \pm \sigma$ • The green area represents the second standard deviation. • About 95% of the data falls within the green PLUS the red area. • Calculated by $\bar{x} \pm 2\sigma$ • The blue area represents the third standard deviation. • About 99.7% of the data falls within the blue PLUS the green PLUS the red area. • Calculated by $\bar{x} \pm 3\sigma$
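The percentages for one, two and three standard deviations can be reproduced from the normal distribution itself. The sketch below assumes a perfectly normal distribution and uses `statistics.NormalDist` from the Python standard library; the helper name `coverage` is made up for this illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def coverage(k: float) -> float:
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {coverage(k):.1%}")
```

Real samples, especially small ones, will only follow these percentages approximately.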
9/14/2010 Photo courtesy of Judy Davidson, DNP, RN
[Figure: normal distribution curve showing standard deviation (σ), with the 95% and 99% regions marked]
2-tailed Test • The critical value is the number that separates the “blue zone” from the middle (±1.96 in this example). • In a t-test, the t score needs to fall in the “blue zone” to be statistically significant. • If α = .05, then 2.5% of the area is in each tail.
1-tailed Test • The critical value is either + or −, but not both. • In this case, you would have statistical significance (p < .05) if t ≥ 1.645.
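The two critical values quoted on these slides (±1.96 and 1.645) are those of the standard normal distribution, which the t distribution approaches for large samples. A minimal sketch of where they come from, assuming α = 0.05 and using `statistics.NormalDist`:

```python
from statistics import NormalDist

alpha = 0.05
z = NormalDist()  # standard normal: mean 0, standard deviation 1

two_tailed = z.inv_cdf(1 - alpha / 2)  # 2.5% of the area in each tail
one_tailed = z.inv_cdf(1 - alpha)      # all 5% of the area in one tail

print(f"two-tailed critical value: +/-{two_tailed:.2f}")
print(f"one-tailed critical value: {one_tailed:.3f}")
```

For small samples the critical value of t is larger than these figures and depends on the degrees of freedom, so it is read from a t table instead.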
William Gosset (aka ‘Student’) (1876-1937) Worked in quality control at the Guinness brewery and could not publish under his own name. Former student of Karl Pearson.
The t-test What can this test tell you? Whether there is a statistically significant difference between two means, when: • The sample size is less than 25. • The data is normally distributed.
t-test
$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
$SD = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
$\bar{x}_1$ = mean of first sample
$\bar{x}_2$ = mean of second sample
$s_1$ = standard deviation of first sample
$s_2$ = standard deviation of second sample
$n_1$ = number of measurements in first sample
$n_2$ = number of measurements in second sample
Worked example Does the pH of soil affect seed germination of a specific plant species? • Group 1: eight pots with soil at pH 5.5 • Group 2: eight pots with soil at pH 7.0 • 50 seeds were planted in each pot and the number that germinated in each pot was recorded.
What is the null hypothesis (H0)? H0 = there is no statistically significant difference between the germination success of seeds in two soils of different pH. HA = there is a significant difference between the germination of seeds in two soils of different pH. If the value for t exceeds the critical value (P = 0.05), then you can reject the null hypothesis.
Construct the following table…
Pot    Group 1 (pH 5.5)   (x − x̄)²   Group 2 (pH 7.0)   (x − x̄)²
1      38                 1.27        39                 20.25
2      41                 3.52        45                 2.25
3      43                 15.02       41                 6.25
4      39                 0.02        46                 6.25
5      37                 4.52        48                 20.25
6      38                 1.27        39                 20.25
7      41                 3.52        46                 6.25
8      36                 9.77        44                 0.25
Mean   39.1                           43.5
Sum                       38.88                          82.0
Calculate standard deviation for both groups
Group 1: $SD = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{38.88}{8 - 1}} = 2.36$
Group 2: $SD = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{82.0}{8 - 1}} = 3.42$
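Using your means and SDs, calculate the value for t
$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{39.1 - 43.5}{\sqrt{\frac{2.36^2}{8} + \frac{3.42^2}{8}}} = \frac{-4.4}{\sqrt{0.696 + 1.462}} \approx -2.99$
The sign only shows which mean is larger; the magnitude, t = 2.99, is what is compared with the critical value.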
Compare our calculated value of t with the relevant critical value in the stats table of critical values. Our value of t = 2.99. Degrees of freedom = n1 + n2 − 2 = 14
D.F.   Critical value (P = 0.05)
14     2.15
15     2.13
16     2.12
17     2.11
18     2.10
Our value for t exceeds the critical value, so we can reject the null hypothesis. We can conclude that there is a significant difference between the two means, so pH does affect the germination rate for this plant.
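The whole worked example can be re-run from the raw pot counts. The following is a minimal sketch of the t formula used on the slides, not a general-purpose routine; the function name `t_statistic` is hypothetical. Because it works from unrounded means and variances, it gives a magnitude of about 2.98 rather than the 2.99 obtained by hand from rounded values.

```python
import math
import statistics

group_1 = [38, 41, 43, 39, 37, 38, 41, 36]  # seeds germinated per pot, pH 5.5
group_2 = [39, 45, 41, 46, 48, 39, 46, 44]  # seeds germinated per pot, pH 7.0

def t_statistic(a, b):
    """t = (mean_a - mean_b) / sqrt(s_a^2 / n_a + s_b^2 / n_b)."""
    standard_error = math.sqrt(statistics.variance(a) / len(a)
                               + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error

t = t_statistic(group_1, group_2)
degrees_of_freedom = len(group_1) + len(group_2) - 2
critical_value = 2.15  # P = 0.05 at 14 d.f., taken from the table of critical values

print(f"t = {t:.2f}, d.f. = {degrees_of_freedom}")
print("reject H0" if abs(t) > critical_value else "cannot reject H0")
```

The comparison uses the absolute value of t because the test is two-tailed: a difference in either direction counts.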
Chi-Squared Tests in Ecology
The Chi-Squared Test (χ²) • The chi-squared test is used to study differences between data sets. • It is only used for frequencies (counts), never for measurements. • It is used to compare an experimental result with an expected theoretical outcome. • It is not a valid test for small sample sizes (n < 20). • It tests the validity of the null hypothesis: no difference between groups of data. • In ecology, chi-squared tests are used to study habitat preference.
Mangrove (Avicennia marina var. resinifera)
Mangroves are doing well in the USA these days…
Mangrove forests vs shrimp farming
Pneumatophores • Some plants have aerial roots – pneumatophores • These can come in handy in waterlogged soil where oxygen levels are very low…
Soil porosity is related to the average size of soil particles… • Larger particles, like sand, have larger spaces for air • Clay soils have very tiny particles, much smaller spaces for air
Experimental Design • 1 m x 1 m quadrats were placed around many mangroves in numerous locations. • Pneumatophores were counted. • A chi-squared test was used to compare observed results for pneumatophore density. • Null hypothesis: no difference in density between substrates (soils)
The Flat Periwinkle (Littorina littoralis)
Periwinkles feed on a number of seaweed species
Food preference is a form of animal behavior • Using quadrats, the number of periwinkles associated with each seaweed species was recorded.
State your null hypothesis for this investigation (H0) • H0: There is no difference between the numbers of periwinkles associated with different seaweed species. • What is the alternative hypothesis (HA)? • HA: There is a real difference between the numbers of periwinkles associated with different seaweed species.
Use the chi-squared test to determine if the observed differences are significant or if they can be attributed to chance alone. Enter the observed values and calculate the chi-squared value.
Here’s how you do it… • The expected value (E) would be the mean number of periwinkles associated with the four seaweed species.
Now Complete the Chart… • Calculate the degrees of freedom: 4 − 1 = 3
Check your chi-square table for 3 degrees of freedom. • 57.4 >> 7.82 (P = 0.05) and 11.34 (P = 0.01) • Is H0 accepted or rejected? • H0 is rejected: there is a significant difference in feeding preferences of periwinkles.
Woodlice (Pillbugs) Experiment
Category      Dry    Humid
O             15     35
E             25     25
O − E         −10    +10
(O − E)²      100    100
(O − E)²/E    4      4
Chi-squared value is 8 for 1 DF
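The woodlice table can be reproduced in a few lines. This is a minimal sketch, assuming (as in the table) that the null hypothesis predicts an even split between the dry and humid conditions; the function name `chi_squared` is chosen for this illustration.

```python
def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = {"dry": 15, "humid": 35}

# Under H0 the woodlice split evenly between the two conditions
expected_each = sum(observed.values()) / len(observed)
expected = [expected_each] * len(observed)

chi2 = chi_squared(observed.values(), expected)
degrees_of_freedom = len(observed) - 1

print(f"chi-squared = {chi2:g} with {degrees_of_freedom} d.f.")
```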
Chi-Square Homework Worksheet • A marketing analyst wishes to see whether consumers have any preference among five flavors of a new fruit candy. • A random sample of 100 people provided the following data.
Flavor       Preference
Cherry       32
Strawberry   28
Orange       16
Lime         14
Grape        10
What is the null hypothesis? • How do you calculate your expected value? • How many degrees of freedom do you have? • Show your chi-square calculations below, and state whether the null hypothesis is accepted or rejected.
A science teacher living in the desert noticed that native mesquite trees differed in the number of parasitic mistletoe plants they carried. Mesquite trees in residential neighborhoods tended to have fewer parasitic plants than trees found in undisturbed areas. • RESEARCH QUESTION: Is there really more mistletoe in the undisturbed areas than in his neighborhood? • What is the hypothesis?
Your value for chi-squared will be checked against this table to see whether it is statistically significant.
                     Probability, p
Degrees of Freedom   0.99    0.95    0.05    0.01    0.001
1                    0.000   0.004   3.84    6.64    10.83
2                    0.020   0.103   5.99    9.21    13.82
3                    0.115   0.352   7.82    11.35   16.27
4                    0.297   0.711   9.49    13.28   18.47
5                    0.554   1.145   11.07   15.09   20.52
6                    0.872   1.635   12.59   16.81   22.46
7                    1.239   2.167   14.07   18.48   24.32
8                    1.646   2.733   15.51   20.09   26.13
9                    2.088   3.325   16.92   21.67   27.88
10                   2.558   3.940   18.31   23.21   29.59
11                   3.05    4.58    19.68   24.73   31.26
12                   3.57    5.23    21.03   26.22   32.91
13                   4.11    5.89    22.36   27.69   34.53
14                   4.66    6.57    23.69   29.14   36.12
15                   5.23    7.26    25.00   30.58   37.70
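Checking a result against the table amounts to a lookup followed by a comparison. The sketch below copies the p = 0.05 column of the table into a dictionary; the names `CRITICAL_P05` and `is_significant` are hypothetical, and only the 5% level is covered.

```python
# Critical values from the p = 0.05 column, keyed by degrees of freedom
CRITICAL_P05 = {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49, 5: 11.07,
                6: 12.59, 7: 14.07, 8: 15.51, 9: 16.92, 10: 18.31,
                11: 19.68, 12: 21.03, 13: 22.36, 14: 23.69, 15: 25.00}

def is_significant(chi2: float, degrees_of_freedom: int) -> bool:
    """True when chi2 exceeds the p = 0.05 critical value, so H0 is rejected."""
    return chi2 > CRITICAL_P05[degrees_of_freedom]

# Periwinkle example: chi-squared of 57.4 with 3 degrees of freedom
print(is_significant(57.4, 3))
# Woodlice example: chi-squared of 8 with 1 degree of freedom
print(is_significant(8, 1))
```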
Example 1: GENETICS Comparing the observed frequency of different types of maize grains with the expected ratio calculated using a Punnett square.
The photo shows four different phenotypes for maize grain, as follows: Purple & Smooth (A), Purple & Shrunken (B), Yellow & Smooth (C) and Yellow & Shrunken (D)
The Punnett square below shows the expected ratio of phenotypes from crosses of four genotypes of maize.
Gametes   PS     Ps     pS     ps
PS        PPSS   PPSs   PpSS   PpSs
Ps        PPSs   PPss   PpSs   Ppss
pS        PpSS   PpSs   ppSS   ppSs
ps        PpSs   Ppss   ppSs   ppss
A : B : C : D = 9 : 3 : 3 : 1
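The 9 : 3 : 3 : 1 ratio can be derived by enumerating the sixteen cells of the square. This sketch assumes a dihybrid cross in which both parents produce the four gametes PS, Ps, pS and ps, with purple (P) and smooth (S) dominant; the function name `phenotype` is made up for the illustration.

```python
from collections import Counter
from itertools import product

# Gametes produced by each parent (assumed the same for both parents)
gametes = ["PS", "Ps", "pS", "ps"]

def phenotype(gamete_a: str, gamete_b: str) -> str:
    """Phenotype of the offspring, with P (purple) and S (smooth) dominant."""
    colour = "Purple" if "P" in (gamete_a[0], gamete_b[0]) else "Yellow"
    texture = "Smooth" if "S" in (gamete_a[1], gamete_b[1]) else "Shrunken"
    return f"{colour} & {texture}"

# One phenotype per cell of the 4 x 4 Punnett square
counts = Counter(phenotype(a, b) for a, b in product(gametes, repeat=2))
for name, count in counts.most_common():
    print(f"{name}: {count}")
```

Multiplying each share (9/16, 3/16, 3/16, 1/16) by the total number of grains counted gives the expected frequencies for the chi-squared test.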
What is the null hypothesis (H0)? H0 = there is no statistically significant difference between the observed frequency of maize grains and the expected frequency (the 9:3:3:1 ratio). HA = there is a significant difference between the observed frequency of maize grains and the expected frequency. If the value for χ² exceeds the critical value (P = 0.05), then you can reject the null hypothesis.
Calculating χ²
$\chi^2 = \sum \frac{(O - E)^2}{E}$
O = the observed results
E = the expected (or predicted) results
Phenotype   O     E (9:3:3:1)   O − E   (O − E)²   (O − E)²/E
A           271   244           27      729        2.99
B           73    81            −8      64         0.79
C           63    81            −18     324        4.00
D           26    27            −1      1          0.04
Total       433   433                              χ² = 7.82
Compare your calculated value of χ² with the critical value in your stats table. Our value of χ² = 7.82. Degrees of freedom = no. of categories − 1 = 3
D.F.   Critical value (P = 0.05)
1      3.84
2      5.99
3      7.82
4      9.49
5      11.07
Our value for χ² is equal to the critical value rather than above it, so the result is borderline and we cannot confidently reject the null hypothesis at P = 0.05. The observed ratios are neither a clearly good nor a clearly poor fit to the expected 9:3:3:1 ratio.
Example 2: ECOLOGY • One section of a river was trawled and four species of fish counted and frequencies recorded. • The expected frequency is equal numbers of the four fish species to be present in the sample.
What is the null hypothesis (H0)? H0 = there is no statistically significant difference between the observed frequency of fish species and the expected frequency. HA = there is a significant difference between the observed frequency of fish and the expected frequency. If the value for χ² exceeds the critical value (P = 0.05), then you can reject the null hypothesis.
Calculating χ²
$\chi^2 = \sum \frac{(O - E)^2}{E}$
O = the observed results
E = the expected (or predicted) results
Species   O    E    O − E   (O − E)²   (O − E)²/E
Rudd      15   10   5       25         2.5
Roach     15   10   5       25         2.5
Dace      4    10   −6      36         3.6
Bream     6    10   −4      16         1.6
Total     40   40                      χ² = 10.2
Compare your calculated value of χ² with the critical value in your table of critical values. Our value of χ² = 10.2. Degrees of freedom = no. of categories − 1 = 3
D.F.   Critical value (P = 0.05)
1      3.84
2      5.99
3      7.82
4      9.49
5      11.07
Our value for χ² exceeds the critical value, so we can reject the null hypothesis. There is a significant difference between our expected and observed frequencies of fish species.
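The fish example can be checked row by row in the same way. This is a minimal sketch that assumes, as the example states, equal expected numbers of the four species; the variable names are illustrative.

```python
species = ["Rudd", "Roach", "Dace", "Bream"]
observed = [15, 15, 4, 6]

# H0: equal numbers of each species, so E = total / number of species
expected = sum(observed) / len(observed)

# One (O - E)^2 / E term per species
contributions = [(o - expected) ** 2 / expected for o in observed]
chi2 = sum(contributions)
degrees_of_freedom = len(species) - 1
critical_value = 7.82  # P = 0.05 at 3 d.f., from the table of critical values

for name, o, c in zip(species, observed, contributions):
    print(f"{name:<6} O={o:>2} E={expected:g} (O-E)^2/E={c:.1f}")
print(f"chi-squared = {chi2:.1f}, d.f. = {degrees_of_freedom}")
print("reject H0" if chi2 > critical_value else "cannot reject H0")
```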
Mesquite and Mistletoe
Here is his data. Calculate the chi-squared value for the null hypothesis.
Tree #   “Wild” trees   Residential trees
1        26             6
2        11             8
3        16             1
4        9              0
5        3              0
6        14             0
7        5              3
8        8              3
9        14             3
10       15             1