
- Number of slides: 59
Sampling It is rare to have access to all the information we would like to know about a given situation. Usually we need to examine a portion of the total system and then extend our knowledge of that portion to the total system. Such a portion is said to be a sample and the total system is referred to as the population.
Definitions A population is the complete set that we seek to obtain information about. A sample is some part of the population (a subset) that is actually available as a source of information.
Mathematically, we can describe both samples and populations using measures such as the mean, median, mode and standard deviation. When these measures are used to describe the properties of a sample, they are referred to as statistics. When they are used to describe the properties of a population, they are referred to as parameters.
In real life, calculating parameters is often prohibitive because populations tend to be large. As a result, most population parameters are unknown. Rather than investigate a whole population, we can take a sample and calculate the desired statistic from the sample. http://www.youtube.com/watch?v=LfgPmKTdUsE Perdisco Sampling
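The distinction between a parameter and a statistic can be sketched in a few lines of Python. This is only an illustration: the population of 10,000 ages is made-up data, and the sample size of 100 is an arbitrary assumption.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 customer ages (made-up data for illustration).
population = [random.randint(18, 80) for _ in range(10_000)]

# Parameters describe the whole population.
population_mean = statistics.mean(population)
population_sd = statistics.pstdev(population)  # population standard deviation

# Statistics describe a sample drawn from that population.
sample = random.sample(population, 100)  # simple random sample, no replacement
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

print(f"parameter: mean={population_mean:.2f}, sd={population_sd:.2f}")
print(f"statistic: mean={sample_mean:.2f}, sd={sample_sd:.2f}")
```

In practice the first half of this calculation is usually impossible, which is why the sample statistic is used as an estimate of the unknown parameter.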
Why Sample? It is not always practical or even desirable to analyse the entire data (population) relating to a particular problem. This may be because it is: • Physically impossible to collect • Too expensive to collect • Dynamic and may change over time • Too time consuming
What is sampling? Sampling is used as a tool to collect information from some units of a population and compile the information into a useful form. A sample should be unbiased; it should be representative of the population. If there is bias, the results will be of no value.
Simple rules help to eliminate bias: • Do not only use people who volunteer to be in the sample • Do not choose a sample using a method that omits segments of the population • Do not use people in the sample because they are ‘handy’ • Ensure that the person selecting the sample does not have a vested interest in the results
Sampling In sampling, only part of the total population is approached for information on the topic under study. These data are then 'expanded' or 'weighted' to make inferences about the whole population. We define the sample as the set of observations taken from the population for the purpose of obtaining information about the population.
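One simple way to 'expand' sample data, assuming a simple random sample in which every unit had the same chance of selection, is to give each sampled unit a weight of N/n (population size divided by sample size). The function name `expand_total` and the household figures below are hypothetical, chosen only to illustrate the arithmetic.

```python
def expand_total(sample_values, population_size):
    """Estimate a population total from a simple random sample."""
    n = len(sample_values)
    if n == 0:
        raise ValueError("sample must not be empty")
    weight = population_size / n  # each sampled unit "stands for" this many units
    return weight * sum(sample_values)


# Hypothetical figures: 5 households sampled out of 1,000, cars owned by each.
cars_in_sample = [1, 2, 0, 1, 1]
estimated_total = expand_total(cars_in_sample, population_size=1000)
print(estimated_total)  # 1000.0 (weight of 200 times a sample total of 5)
```

Real surveys often use more elaborate weights (for example, different weights per stratum), so this should be read as the simplest case only.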
Advantages of Sampling vs Census • Reduces cost - both $ and staffing requirements • Less time needed to collect & process data & produce results. • Enables characteristics to be tested which could not otherwise be assessed. An example is life span of light bulbs. To test all light bulbs is not possible as the test needs to destroy the product so only a sample of bulbs can be tested. • Surveys lead to less respondent burden, as fewer people are needed to provide the required data. • Results can be made available quickly
Disadvantages of Sampling vs Census • Data on sub-populations (such as a particular ethnic group) may be too unreliable to be useful. • Data for small geographical areas also may be too unreliable to be useful. • (Because of the above reasons) detailed cross-tabulations may not be practical. • Estimates are subject to sampling error which arises as the estimates are calculated from a part (sample) of the population. • May have difficulty communicating the precision (accuracy) of the estimates to users.
Steps in sampling 1. Plan the survey – Identify the question – Decide who to include 2. Collect data – Interview – Written response – variables
Steps in sampling cont’d 3. Organise the data – Arrange into accessible categories 4. Display the information – Select visual presentation 5. Analyse the data – Measures of central tendency and dispersion 6. Draw conclusions – Communicate the findings
For revision you might like to look at the following two videos… http://www.statisticslectures.com/topics/samplingmethods/ http://www.youtube.com/watch?v=rASK8PpqakM
Imagine you want to survey customers at McDonald's, so you decide to select every ninth person who walks in the door. The problem with this sampling technique is that the ninth person may be in a hurry and may not want to speak to you, so you end up stopping only people who look like they are not in a hurry.
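Selecting every ninth person is usually called systematic sampling. A minimal sketch of the selection rule, assuming we have an ordered list of arrivals, follows; the `systematic_sample` helper and the customer labels are hypothetical. Note that the code only picks who should be approached. It cannot fix the practical problem described above, where the selected person is replaced by someone more convenient.

```python
import random


def systematic_sample(units, k, start=None):
    """Select every k-th unit, beginning at a random position in the first k."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if start is None:
        start = random.randrange(k)  # random start avoids always picking the same slots
    return units[start::k]


customers = [f"customer_{i}" for i in range(1, 46)]  # hypothetical: 45 arrivals
chosen = systematic_sample(customers, k=9, start=8)  # the 9th, 18th, 27th, ...
print(chosen)
```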
How it might work in practice: OK, I have interviewed 5 teenagers, so I stop there and go looking for 5 adults, etc.
College A has 200 trade students and 800 Accounting students, so a sample of 100 students must contain 20 trade and 80 Accounting students. Find the strata levels for each college in the same way. If you don't have time to sample every college, you can take a random sample of a few colleges. However, it would be better if that sample of colleges was also stratified rather than purely random: e.g. if 2 of 10 colleges are country and 8 are city, and we want to sample 5 colleges, make sure one is a country college.
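The College A figures can be checked with a short proportional-allocation sketch. The function name `proportional_allocation` is hypothetical, and simple rounding is an assumption: with awkward proportions the rounded counts may not add up exactly to the requested sample size and would need adjusting by hand.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Split a sample across strata in proportion to each stratum's size."""
    total = sum(strata_sizes.values())
    return {
        name: round(sample_size * size / total)
        for name, size in strata_sizes.items()
    }


college_a = {"trade": 200, "accounting": 800}
allocation = proportional_allocation(college_a, sample_size=100)
print(allocation)  # {'trade': 20, 'accounting': 80}

# The same rule applied to the colleges themselves: 2 country, 8 city, sample 5.
colleges = {"country": 2, "city": 8}
print(proportional_allocation(colleges, sample_size=5))  # {'country': 1, 'city': 4}
```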
Multistage sampling is a complex form of cluster sampling. Using all the sample elements in all the selected clusters may be prohibitively expensive or unnecessary. The technique is used frequently when a complete list of all members of the population does not exist or is inappropriate. For example, household surveys conducted by the Australian Bureau of Statistics begin by dividing metropolitan regions into 'collection districts' and selecting some of these collection districts (first stage). The selected collection districts are then divided into blocks, and blocks are chosen from within each selected collection district (second stage). Next, dwellings are listed within each selected block, and some of these dwellings are selected (third stage). This method means that it is not necessary to create a list of every dwelling in the region, only of the dwellings in the selected blocks. In remote areas, an additional stage of clustering is used in order to reduce travel requirements.
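The three stages described above can be sketched as nested random selections. This is an illustration only, not the ABS procedure: the region layout (6 districts, 4 blocks each, 10 dwellings each), the function name `multistage_sample` and the stage sizes are all assumptions.

```python
import random


def multistage_sample(region, n_districts, n_blocks, n_dwellings, rng):
    """Select districts, then blocks within them, then dwellings within those blocks."""
    selected = []
    for district in rng.sample(sorted(region), n_districts):      # first stage
        blocks = region[district]
        for block in rng.sample(sorted(blocks), n_blocks):        # second stage
            dwellings = blocks[block]
            selected.extend(rng.sample(dwellings, n_dwellings))   # third stage
    return selected


# Hypothetical region: 6 districts x 4 blocks x 10 dwellings.
# In a real survey the dwelling lists would be compiled only for selected blocks;
# here they all exist up front purely to keep the sketch short.
region = {
    f"district_{d}": {
        f"d{d}_block_{b}": [f"d{d}_b{b}_dwelling_{h}" for h in range(10)]
        for b in range(4)
    }
    for d in range(6)
}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
sample = multistage_sample(region, n_districts=2, n_blocks=2, n_dwellings=3, rng=rng)
print(len(sample))  # 12 dwellings: 2 districts x 2 blocks x 3 dwellings
```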
Clerical errors - missing data, not recording sample responses correctly
Sampling Errors A sample free of bias will still contain naturally occurring random variation. No two samples will be identical, so how much variation can be expected? This random discrepancy between a measurement from a sample (sample statistic) and the population quantity being estimated (population parameter) is called sampling error.
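A small simulation can give a feel for how much variation to expect. The sketch below draws 200 unbiased random samples from the same made-up population and records how far each sample mean lands from the population mean; the population (5,000 values with mean about 50 and standard deviation about 10) and the sample size of 25 are assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population of 5,000 measurements.
population = [rng.gauss(50, 10) for _ in range(5000)]
population_mean = statistics.mean(population)

# Draw many samples of the same size and record each sampling error.
errors = []
for _ in range(200):
    sample = rng.sample(population, 25)
    errors.append(statistics.mean(sample) - population_mean)

print(f"smallest error: {min(errors):+.2f}")
print(f"largest error:  {max(errors):+.2f}")
print(f"typical size (standard deviation of errors): {statistics.stdev(errors):.2f}")
```

None of these samples is biased, yet almost every one misses the population mean by some amount. Repeating the experiment with a larger sample size should shrink the typical error.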
Non-sampling errors Non-sampling errors arise from the research mechanisms used in collecting and analysing data: – Questionnaires – Interviews – Responses – Analysis
The following slides are taken from an article which is on SharePoint. The file is called Questionnaire Design. http://www.fao.org/docrep/W3241E/w3241e05.htm
What is a better way to ask the question? …… Rate the dairy drink on the following qualities Overall Satisfaction - If price were not the issue would you buy this product? Yes / No Sweetness: Not Sweet enough / Just right / Too Sweet Texture: Too thick / About right / Too thin Colour: Attractive / Acceptable / Not good How much would you be prepared to pay for this product: Less than $1 / between $1 and $2 / $3
Again a series of questions would be better Overall Satisfaction - Did you like the product? Yes / No Sweetness: Not Sweet enough / Just right / Too Sweet Texture: Too thick / About right / Too thin Colour: Attractive / Acceptable / Not good
What is wrong with the following questions?
Yep – you guessed it…. – way too confusing
Selection of end of Chapter questions
Can't survey all suburbs because it is too expensive, so select 3 to 4 suburbs that are representative of the whole (both higher and lower socio-economic suburbs). If that is still too expensive, select certain regions within those suburbs and break them down into street batches.
Secondary data is data collected by someone else, e.g. the Australian Bureau of Statistics (ABS)
Could you really draw valid conclusions about Australia’s views on an issue from a sample of 1000 people telephoned at random? That is, is the sample size big enough, and does it cover rural and city areas, eastern and western states, age brackets, and males and females?
Past exam questions For a major project, a statistics student aims to investigate the opinions of TAFE students about the facilities provided at the college library. He intends to collect data using a questionnaire. Describe 4 features of a good quality questionnaire. Suggested responses http://www.fao.org/docrep/W3241E/w3241e05.htm
• The student suspects that opinions will depend upon the course in which the student is enrolled. What method should be used for sampling? Explain how this could be applied. • Stratified sampling technique: for example, if ½ of the students were enrolled in Accounting, ¼ in IT and ¼ in Financial Planning, then the sample should be made up of students in the same proportions.
• Give one example of quantitative data that is discrete and one example of qualitative data that may be collected by the responses to this questionnaire on library facilities. • Quantitative discrete data: approximately how many books do you borrow per semester? • Qualitative data: how would you rank the service you received from library staff? (1 = very poor, 2 = poor, 3 = satisfactory, 4 = good, 5 = excellent)
Examine the following questions from a questionnaire. Identify any issues or potential problems with each question and suggest an alternative: “How do you feel about building an ice-rink in downtown Macksville where the railroad property has been sitting unused for a number of years?” “You wouldn’t say that you are in favour of gun control, would you?” “In these uncertain economic times with the stock market down and corporate scandal on the rise, would you support more regulation of big business?”
“How do you feel about building an ice-rink in downtown Macksville where the railroad property has been sitting unused for a number of years?” • Question is too long/complicated. Requires specific local knowledge • Suggested alternative: Should an ice-rink be built in Macksville? 1 = strongly agree, 2 = agree, 3 = disagree, 4 = strongly disagree
“You wouldn’t say that you are in favour of gun control, would you?” • Leading question/accusatory tone: suggests there is something wrong with the respondent if they are in favour of gun control • Suggested alternative: Are you in favour of gun control?
“In these uncertain economic times with the stock market down and corporate scandal on the rise, would you support more regulation of big business?” • Too long/vague/complicated. The words “corporate scandal on the rise” also lead the respondent towards a biased answer • Suggested alternative: Do you think we need more regulation of big business?