Sociology 5811 Lecture 9 CI Hypothesis Tests

Number of slides: 50

Announcements • Problem Set #3 Due next week • Problem set posted on course website • We are a bit ahead of reading assignments in Knoke book • Try to keep up; read ahead if necessary

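Review: Confidence Intervals • General formula for a confidence interval: $\bar{Y} \pm Z_{\alpha/2}\,\sigma_{\bar{Y}}$ • Where: $\bar{Y}$ (Y-bar) is the sample mean; $\sigma_{\bar{Y}}$ (sigma sub-Y-bar) is the standard error of the mean; $Z_{\alpha/2}$ is the critical Z-value for a given level of confidence – If you want 90% confidence, then α = .10 and α/2 = .05 in each tail; look up the Z that leaves 45% of the area between the mean and Z (Z = 1.645) – See Knoke, Figure 3.5 on page 87 for info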
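
The large-N formula above can be sketched in a few lines of Python. This is only an illustration: the function name `z_confidence_interval` and the numbers (mean 62, standard deviation 6, N = 100) are assumptions made up for the example, and the standard error is estimated as s/√N.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(mean, sd, n, confidence=0.95):
    """Large-sample CI: mean +/- Z(alpha/2) * standard error."""
    alpha = 1 - confidence
    # Critical Z leaves alpha/2 in each tail of the standard normal curve.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    se = sd / sqrt(n)  # estimated standard error of the mean
    return mean - z * se, mean + z * se

# Hypothetical sample: mean 62, s = 6, N = 100, 90% confidence.
low, high = z_confidence_interval(mean=62, sd=6, n=100, confidence=0.90)
print(round(low, 2), round(high, 2))
```

Raising the confidence level widens the interval, because the critical Z grows as α shrinks.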

Small N Confidence Intervals • Issue: What if N is not large? • The sampling distribution may not be normal • Z-distribution probabilities don’t apply… • Standard CI formula doesn’t work • Solution: Use the “T-Distribution” • A different curve that accurately approximates the shape of the sampling distribution for small N • Result: We can look up values in a “t-table” to determine probabilities associated with a # of standard deviations from the mean.

Confidence Intervals for Small N • Small N C.I. formula: $\bar{Y} \pm t_{\alpha/2}\,\sigma_{\bar{Y}}$ • Yields accurate results, even if N is not large • Again, the standard error can be estimated from the sample standard deviation: $\hat{\sigma}_{\bar{Y}} = s / \sqrt{N}$

T-Distributions • The T-distribution is a “family” of distributions • In a T-distribution table, you’ll find many T-distributions to choose from – Basically, the shape of the sampling distribution varies with the size of your sample • You need a specific t-distribution depending on sample size • One t-distribution for each “degree of freedom” – Also called “df” or “D.o.F.” • Which T-distribution should you use? • For confidence intervals: Use the T-distribution for df = N - 1 • Ex: If N = 15, then look at the T-distribution for df = 14.

Looking Up T-Tables • Choose the desired probability for α/2 • Choose the correct df (N-1) • Find the t-value in the correct row and column • Interpretation is just like a Z-score: 2.145 = number of standard errors for the C.I.! [Figure: t-table with the selected row and column highlighted]

Answering Questions… • Knowledge of the standard error allows us to begin answering questions about populations • Example: National educational standard requires all schools to maintain a test score average of 60 • You observe that a sample (N=16, s=6) has a mean of 62 • Question: Are you confident that the school population is above the national standard? • We know Y-bar for the sample, but what about μ for the whole school? • Are we confident that μ > 60?

Question: Is μ > 60? • Strategy 1: Construct a confidence interval around Y-bar • And see if the bounds fall above 60 • Visually: confident that μ > 60 [Figure: confidence interval around Y-bar on a number line from 58 to 66] • Visually: μ might be 60 or less [Figure: confidence interval around Y-bar on a number line from 58 to 66]

Question: Is μ > 60? • Strategy 1: Construct a confidence interval around Y-bar – Let’s choose a desired confidence level of .95 – N of 16 is “small”… we must use the t-distribution, not the Z-distribution – Look up the t-value for 15 degrees of freedom (N-1).

Looking Up T-Tables • Choose the desired probability for α/2 • Choose the correct df: (N-1) = 15 • Find the t-value in the correct row and column • Result: t = 2.131 [Figure: t-table with the selected row and column highlighted]

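Question: Is μ > 60? • Strategy 1: Construct a confidence interval around Y-bar • The standard error is 6/√16 = 1.5, so the CI is 62 ± (2.131)(1.5), or 58.80 to 65.20! We aren’t confident that μ > 60 [Figure: confidence interval around Y-bar on a number line from 58 to 66]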
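
As a check on the arithmetic, here is a minimal sketch of the small-N interval for the school example (N = 16, s = 6, Y-bar = 62). The critical value 2.131 is the t-table entry for df = 15 quoted on the slides; the function name `t_confidence_interval` is an illustrative assumption.

```python
from math import sqrt

def t_confidence_interval(mean, sd, n, t_crit):
    """Small-N CI: mean +/- t(alpha/2) * estimated standard error."""
    se = sd / sqrt(n)          # 6 / sqrt(16) = 1.5 for the school example
    margin = t_crit * se
    return mean - margin, mean + margin

# School example; t_crit = 2.131 is read from a t-table (df = 15, 95% level).
low, high = t_confidence_interval(mean=62, sd=6, n=16, t_crit=2.131)
print(round(low, 2), round(high, 2))
print("60 inside interval:", low <= 60 <= high)
```

Because 60 lies inside the interval, the sample does not let us claim with 95% confidence that the school mean exceeds 60.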

Question: Is μ > 60? • Note #1: Results would change if we used a different confidence level • 95% and 50% CIs yield different conclusions [Figure: 95% and 50% confidence intervals around Y-bar on a number line from 58 to 66] • Idea: Wouldn’t it be nice to know exactly which CI would describe the distance from Y-bar to μ? • i.e., to calculate the exact probability of Y-bar falling a certain distance from μ?

Question: Is μ > 60? • Note #2: We typically draw CIs around Y-bar – But we can also get the same result by focusing on our comparison point (Y = 60) • Example: If 60 is outside of the CI around Y-bar [Figure: CI around Y-bar on a number line from 58 to 66] • Then Y-bar is outside of the CI around 60 [Figure: CI around 60 on a number line from 58 to 66]

Question: Is μ > 60? • The critical issue is: How large is the distance between Y-bar and 60? – Is it “far” compared to the width of the sampling distribution? • Ex: Y-bar is more than 2 standard errors from 60 • In which case, the school probably exceeds the standard – Or is it relatively close? • Ex: Y-bar is only .5 standard errors from 60 • In which case we aren’t confident… – Note: If we know the sampling distribution is normal (or t-distributed), we can convert SE’s to a probability

Question: Is μ > 60? • Strategy 2: Determine the probability of observing a Y-bar as high as 62, if μ is really 60 or less • Procedure: – 1. Use Y=60 as a reference point – 2. Determine how far Y-bar is from 60, measured in standard errors • Which we can convert to a probability – 3. Issue: Is it likely to observe a Y-bar as high as 62? • If this is common to observe, even when μ = 60 (or less), then we can’t be confident that μ > 60! • But if that is a rare event, we can be confident that μ > 60!

Question: Is μ > 60? • Strategy 2: Look at the sampling distribution • Confident that μ is not 60 or less: μ is unlikely to really be 60… because Y-bar usually falls near the center of the sampling distribution! [Figure: sampling distribution of Y-bar on a number line from 58 to 66] • Visually: μ might easily be <60: in this case, it is common to get Y-bars of 62 or even higher [Figure: sampling distribution of Y-bar on a number line from 58 to 66]

Question: Is μ > 60? • Issue: How do we tell where Y-bar falls within the sampling distribution? • Strategy: Compute a Z-score • Recall: Z-scores help locate the position of a case within a distribution • It can tell us how far a Y-bar falls from the center of the sampling distribution • In units of “standard errors”! • Probability can be determined from a Z-table • Note: for small N, we call it a t-score and look it up in a t-table.

Question: Is μ > 60? • Note: We use a slightly modified Z formula • “Old” formula calculates the # of standard deviations a case falls from the sample mean: $Z_i = (Y_i - \bar{Y}) / s$ • From Y-sub-i to Y-bar • New formula tells the number of standard errors a mean estimate falls from the population mean: $Z = (\bar{Y} - \mu) / \sigma_{\bar{Y}}$ • Distance from Y-bar to μ in the sampling distribution • In this case we compare to a hypothetical μ = 60.

Question: Is μ > 60? • Let’s calculate how far Y-bar falls from μ – Since N is small, we call it a “t-score” or “t-value” • $t = (\bar{Y} - \mu) / (s / \sqrt{N}) = (62 - 60) / (6 / \sqrt{16}) = 2 / 1.5 = 1.33$ • Y-bar is 1.33 standard errors above 60!

Question: Is μ > 60? • Question: What is the probability of t > 1.33 • i.e., of Y-bar falling 1.33 or more standard errors above μ? [Figure: sampling distribution on a number line from 58 to 66; the shaded upper-tail area reflects the probability] • Result: p = about .10 • Note: Knoke t-table doesn’t contain this range… have to look it up elsewhere or use SPSS to calculate the probability.

Question: Is μ > 60? • Result: p = about .10 • In other words, if μ = 60, we will observe a Y-bar of 62 or greater about 10% of the time • Conclusion: It is plausible that μ is 60 or lower • We are not 95% confident that μ > 60 • Conclusion matches the result from the confidence interval • We have just tested a claim using inferential statistics!
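
When a t-table does not cover the needed range and no statistics package is at hand, the tail probability can be approximated numerically. The sketch below is one way to do it with only the Python standard library: it integrates the Student's t density with Simpson's rule. The helper names `t_pdf` and `t_upper_tail` are assumptions introduced for this illustration, not functions from any library.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t  # half the area lies above the center

# School example: Y-bar = 62, hypothesized mean 60, s = 6, N = 16.
t_score = (62 - 60) / (6 / math.sqrt(16))
p = t_upper_tail(t_score, df=15)
print(round(t_score, 2), round(p, 2))
```

The result is a one-tailed probability of roughly .10, which is well above .05, so the claim that the school exceeds 60 is not supported at the 95% level.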

Hypothesis Testing • Hypothesis Testing: • A formal language and method for examining claims using inferential statistics – Designed for use with probabilistic empirical assessments • Because of the probabilistic nature of inferential statistics, we cannot draw conclusions with absolute certainty – We cannot “prove” our claims are “true” – However improbable, we will occasionally draw an unrepresentative sample, even if it is random

Hypothesis Testing • The logic of hypothesis testing: • We cannot “prove” anything • Instead, we will cast doubt on other claims, thus indirectly supporting our own • Strategy: • 1. We first state an “opposing” claim • The opposite of what we want to claim • 2. If we can cast sufficient doubt on it, we are forced (grudgingly) to accept our own claim.

Hypothesis Testing • Example: Suppose we wish to argue that our school is above the national standard • First we state the opposite: • “Our school is not above the national standard” • Next we state our alternative: • “Our school is above the national standard” • If our statistical analysis shows that the first claim is highly improbable, we can “reject” it, in favor of the second claim • …“accepting” the claim that our school is doing well.

Hypothesis Testing: Jargon • Hypotheses: Claims we wish to test • Typically, these are stated in a manner specific enough to test directly with statistical tools – We typically do not test hypotheses such as “Marx was right” / “Marx was wrong” – Rather: The mean years of education for Americans is/is not above 18 years.

Hypothesis Testing: Jargon • The hypothesis we hope to find support for is referred to as the alternate hypothesis • The hypothesis counter to our argument is referred to as the null hypothesis • Null and alternative hypotheses are denoted as: • H0: School does not exceed the national standard • H-zero indicates the null hypothesis • H1: School does exceed the national standard • H-1 indicates the alternate hypothesis • Sometimes called: “Ha”

Hypothesis Testing: More Jargon • If evidence suggests that the null hypothesis is highly improbable, we “reject” it • Instead, we “accept” the alternative hypothesis • So, typically we: • Reject H0, accept H1 – Or: • Fail to reject H0, do not find support for H1 • That was what happened in our example earlier today…

Hypothesis Testing • In order to conduct a test to evaluate hypotheses, we need two things: • 1. A statistical test which tells us how probable our observed result would be if H0 were true • Here, we used a z-score/t-score to determine the probability of observing our sample mean if H0 were true • 2. A pre-determined level of probability below which we feel safe in rejecting H0 (α) • In the example, we wanted to be 95% confident… α = .05 • But the probability was .10, so we couldn’t conclude that the school exceeded the national standard!

Hypothesis Test for the Mean • Example: Laundry Detergent • Suppose we work at the Tide factory • We know the “cleaning power” of Tide detergent, exactly: It is 73 on a continuous scale. • “Cleaning Power” of Tide = 73 • You conduct a study of a competitor. You buy 50 bottles of generic detergent and observe a mean cleaning power of 65 • H0: Tide is no better than the competitor (μ >= 73) • H1: Tide is better than the competitor (μ < 73)

Hypothesis Test: Example • It looks like Tide is better: • Cleaning power is 73, versus 65 for a sample of the competition • Question: Can we reject the null hypothesis and accept the alternate hypothesis? • Answer: No! It is possible that we just drew an atypical sample of generic detergent. The true population mean for generics may be higher.

Hypothesis Test: Example • We need to use our statistical knowledge to determine: • What is the probability of drawing a sample (N=50) with mean of 65 from a population of mean 73 (the mean for Tide) • If that is a probable event, we can’t draw very strong conclusions… • But, if the event is very improbable, it is hard to believe that the population of generics is as high as that of Tide… • We have grounds for rejecting the null hypothesis.

Hypothesis Test: Example • How would we determine the probability of observing a sample mean of 65 if the population mean of generic detergent is really 73? • Answer: We apply the Central Limit Theorem to determine the shape of the sampling distribution • And then calculate a Z-value or T-value based on it • Suppose we choose an alpha (α) of .05 • If we observe a t-value with a probability of only .0023, then we can reject the null hypothesis. • If we observe a t-value with a probability of .361, we cannot reject the null hypothesis

Hypothesis Test: Steps • 1. State the research hypothesis (“alternate hypothesis”), H1 • 2. State the null hypothesis, H0 • 3. Choose an α-level (alpha-level) – Typically .05, sometimes .10 or .01 • 4. Look up the value of the test statistic corresponding to the α-level (called the “critical value”) • Example: find the “critical” t-value associated with α = .05

Hypothesis Test: Steps • 5. Use statistics to calculate a relevant test statistic. – T-value or Z-value – Soon we will learn additional ones • 6. Compare the test statistic to the “critical value” – If the test statistic is greater (in absolute value, for a two-tailed test), we reject H0 – If it is smaller, we cannot reject H0
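
Step 6 amounts to a single comparison. The sketch below assumes a two-tailed rule (compare the absolute value of the observed statistic to the critical value); the function name `reject_null` is an illustrative assumption, and the numbers are the ones used in the lecture's two examples.

```python
def reject_null(t_observed, t_critical):
    """Two-tailed decision rule: reject H0 when |t_observed| > t_critical."""
    return abs(t_observed) > t_critical

# School example: observed t = 1.33, critical t = 2.131 (df = 15, alpha = .05).
print(reject_null(1.33, 2.131))

# Bohrnstedt & Knoke example: observed t = -24, critical t = 3.3 (alpha = .001).
print(reject_null(-24, 3.3))
```

For a one-tailed test the comparison would use the signed statistic and a one-tailed critical value instead.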

Hypothesis Test: Steps • Alternate steps: • 3. Choose an alpha-level • 4. Get software to conduct the relevant statistical test. – Software will compute the test statistic and provide a probability… the probability of observing a test statistic at least that large if H0 is true. – If this is lower than alpha, reject H0

Hypothesis Test: Errors • Due to the probabilistic nature of such tests, there will be periodic errors. • Sometimes the null hypothesis will be true, but we will reject it – Our alpha-level determines the probability of this • Sometimes we do not reject the null hypothesis, even though it is false

Hypothesis Test: Errors • When we falsely reject H0, it is called a Type I error • When we falsely fail to reject H0, it is called a Type II error • In general, we are most concerned about Type I errors… we try to be conservative.
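
The link between the alpha-level and Type I errors can be illustrated with a small simulation. The sketch below assumes a normally distributed population in which H0 is exactly true (mean 60, standard deviation 6), draws many samples of N = 16, and counts how often a two-tailed test with critical t = 2.131 rejects H0 anyway. The function name, seed, and number of trials are arbitrary choices for the illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

def type_i_error_rate(trials=4000, n=16, mu=60, sigma=6, t_crit=2.131, seed=1):
    """Share of samples that (wrongly) reject a true H0: mean = mu."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        se = stdev(sample) / sqrt(n)      # estimated standard error
        t = (mean(sample) - mu) / se      # t-score against the true mean
        if abs(t) > t_crit:               # falls in either tail: Type I error
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(rate)
```

The rejection rate should land close to .05, the alpha-level implied by the critical value: even when H0 is true, about one sample in twenty leads to a false rejection.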

Hypothesis Tests About a Mean • What sorts of hypothesis tests can one do? • 1. Test the hypothesis that a population mean is NOT equal to a certain value – Null hypothesis is that the mean is equal to that value. • 2. Population mean is higher than a value – Null hypothesis: mean is equal to or less than that value • 3. Population mean is lower than a value – Null hypothesis: mean is equal to or greater than that value • Question: What are examples of each?

Hypothesis Tests About Means • Example: Bohrnstedt & Knoke, section 3.93, pp. 108-110. N = 1015, Y-bar = 2.91, s = 1.45 • H0: Population mean = 4 • H1: Population mean not = 4 • Strategy: • 1. Choose alpha (let’s use .001) • 2. Determine the standard error • 3. Use the S.E. to determine the range in which the sample mean (Y-bar) is likely to fall 99.9% of the time, IF the population mean is 4. • 4. If the observed mean is outside the range, reject H0

Example: Is μ = 4? • Let’s determine how far Y-bar is from the hypothetical μ = 4 • In units of standard errors: $t = (\bar{Y} - \mu) / (s / \sqrt{N}) = (2.91 - 4) / (1.45 / \sqrt{1015}) \approx -24$ • Y-bar is 24 standard errors below 4.0!
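
The same calculation can be reproduced directly from the figures given for the Bohrnstedt & Knoke example (N = 1015, Y-bar = 2.91, s = 1.45, hypothesized mean 4). The variable names below are illustrative assumptions.

```python
from math import sqrt

n, y_bar, s, mu_0 = 1015, 2.91, 1.45, 4.0

standard_error = s / sqrt(n)                 # estimated S.E. of the mean
t_observed = (y_bar - mu_0) / standard_error  # distance from 4 in S.E. units

print(round(standard_error, 4), round(t_observed, 1))
```

The standard error is tiny because N is large, which is why a gap of about one scale point translates into roughly 24 standard errors.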

Hypothesis Tests About a Mean • A Z-table (if N is large) or a T-table will tell us probabilities of Y-bar falling Z (or T) standard errors from μ • In this example, the desired α = .001 • Which corresponds to t = 3.3 (taken from t-table; 3.29 to two decimals) – That is: .001 (i.e., .1%) of samples (of size 1015) fall beyond 3.29 standard errors of the population mean – 99.9% fall within 3.29 S.E.’s.

Hypothesis Tests About a Mean • There are two ways to finish the “test” • 1. Compare “critical t” to “observed t” – Critical t is 3.3, observed t = -24 • We reject H0: t of +/-24 is HUGE, very improbable • It is highly unlikely that μ = 4 • 2. Actually calculate the probability of observing a t-value as extreme as 24, and compare it to the pre-determined α • If the observed probability is below α, reject H0 – In this case, the probability of |t| ≥ 24 is .0000000… • Very improbable. Reject H0!

Two-Tail Tests • Visually: Most Y-bars should fall near μ • 99.9% CI: –3.3 < t < 3.3, or 3.85 to 4.15 • Mean of 2.91 (t = -24) is far into the rejection area (beyond the edge of the graph) [Figure: sampling distribution of the mean centered on 4, with rejection areas shaded below 3.85 (Z = -3.3) and above 4.15 (Z = +3.3)]
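
The 3.85 to 4.15 range can be recomputed from the example's figures. The sketch below uses the standard normal distribution for the critical value, which is a reasonable assumption here only because N = 1015 is large; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, s, mu_0, alpha = 1015, 1.45, 4.0, 0.001
y_bar = 2.91

se = s / sqrt(n)
# Two-tailed test: alpha/2 in each tail of the sampling distribution.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

lower, upper = mu_0 - z_crit * se, mu_0 + z_crit * se
print(round(z_crit, 2), round(lower, 2), round(upper, 2))
print("reject H0:", not (lower <= y_bar <= upper))
```

Any sample mean outside this range falls in one of the two tails and leads to rejecting H0 at the .001 level.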

Hypothesis Tests About a Mean • Note: This test was set up as a “two-tailed test” • Meaning that we reject H0 if the observed Y-bar falls in either tail of the sampling distribution • Ex: Very high Y-bar or very low Y-bar means reject H0 – Not all tests are done that way… Sometimes you only reject H0 if Y-bar falls in one particular tail.

Hypothesis Testing • Definition: Two-tailed test: A hypothesis test in which the α-area of interest falls in both tails of a Z or T distribution. • Example: H0: μ = 4; H1: μ ≠ 4 • Definition: One-tailed test: A hypothesis test in which the α-area of interest falls in just one tail of a Z or T distribution. • Example: H0: μ > or = 4; H1: μ < 4 • This is called a “directional” hypothesis test.

Hypothesis Tests About Means • A one-tailed test: H1: μ < 4 • Entire α-area is on the left, as opposed to half (α/2) on each side. Also, the critical t-value changes. [Figure: sampling distribution centered on 4 with the α-area in the left tail]

Hypothesis Tests About Means • T-value changes because the alpha area (e.g., 5%) is all concentrated in one side of the distribution, rather than split half and half. • One tail: α = .05 in a single tail • Two tails: α/2 = .025 in each tail [Figure: one-tailed and two-tailed rejection areas compared]
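
To see how much the critical value moves, the sketch below compares one-tailed and two-tailed cutoffs at the same alpha. It uses the standard normal distribution as a stand-in for the t-distribution (an assumption that suits large N; small-N critical t-values are larger but shift in the same direction). The function name `critical_z` is illustrative.

```python
from statistics import NormalDist

def critical_z(alpha, tails):
    """Critical Z when the alpha area sits in one tail or is split over two."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # Each rejection tail holds alpha / tails of the total area.
    return NormalDist().inv_cdf(1 - alpha / tails)

one_tail = critical_z(0.05, tails=1)
two_tail = critical_z(0.05, tails=2)
print(round(one_tail, 3), round(two_tail, 3))
```

At α = .05 the one-tailed cutoff is about 1.645 versus about 1.960 for two tails, so a smaller observed statistic is enough to reject H0 when the test is directional.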

Hypothesis Tests About Means • Use one-tailed tests when you have a directional hypothesis – e.g., μ > 5 • Otherwise, use 2-tailed tests • Note: In many instances, you are more likely to reject the null hypothesis when utilizing a one-tailed test – Concentrating the alpha area in one tail reduces the critical T-value needed to reject H0

Tests for Differences in Means • A more useful and interesting application of these same ideas… • Hypothesis tests about the means of two different groups – Up until now, we’ve focused on a single mean for a homogeneous group – It is more interesting to begin to compare groups – Are they the same? Different? • We’ll do that next class!