7f2f927eb5829cb62127c30c2d7712da.ppt
- Number of slides: 31
Measuring Coverage: Post Enumeration Surveys Owen Abbott Office for National Statistics, UK
Agenda • Introduction • Why have a PES? • Essential features of a PES – Survey Design – Fieldwork • Analysing the data – Matching – Estimation • Results from 2001 UK Census • Discussion
Why do we need a PES? • Census won’t count every household or person • Undercount causes bias in estimates • In the UK in 2001, we estimated that 3 million persons (6%) did not fill in the form • Increasing problem from 1981 to 1991 to 2001 • The undercount is not evenly spread – Inner Cities – Deprived areas – Young persons
Why do we need a PES? • Census counts alone not good enough • UK Users demand robust census population estimates – Central Government resource allocation – Yearly demographic population estimates – Government Policy • So we need to measure how many households and persons the census misses, and work out: – where they are missed from – their characteristics
Basic Methodology • PES - Census Coverage Survey (CCS) in UK – In the UK approx 1% of the population • Match the PES to the Census • Use the people the PES counted but the census missed to estimate how many were missed – where they are and their characteristics • Add to the Census counts (either at aggregate level or impute (UK))
2001 UK ‘One Number Census’ framework
Post Enumeration Survey Key features: A - Design – Sample survey – Sample size dependent on accuracy (and geographic level) requirements B - Fieldwork – Conducted after the census has finished – Independent re-enumeration – Area based – Door to door interview – Focused on measuring coverage
Post Enumeration Survey - Design • Multi-stage stratified sample • Select a sample of (small) geographical areas that can be re-enumerated – UK uses Postcodes (about 20 hhs) – US uses blocks (about ??100 hhs) • Sample stratified by: – Geography – Area type – Demography
2001 UK PES Design Geographical Strata: • Local Authorities (mean pop 120k) grouped into contiguous groups called Estimation Areas (EAs), each having 500k pop Area Type and Demographic strata: • Within every EA a sample of 1991 Enumeration Districts (EDs) was selected, stratified using a hard-to-count index and the 1991 age-sex structure – (1991 EDs have about 200 households)
2001 UK PES Design • Hard to count index was a national stratification using a combination of variables associated with undercount, e.g.: – Unemployed – Multi-occupied – Private rented – Language difficulty • 3 level index, split into 40%, 40%, 20% nationally • Within each selected ED a sample of 3, 4 or 5 postcodes was selected
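The two-stage selection described on the design slides (EDs within strata, then postcodes within each selected ED) can be sketched roughly as follows. This is only an illustration: the function name, the input layout and the use of simple random sampling at both stages are assumptions, not the documented 2001 procedure.

```python
import random


def sample_postcodes(eds_by_stratum, eds_per_stratum, postcodes_per_ed, seed=0):
    """Two-stage sample: EDs within each stratum, then postcodes within each ED.

    eds_by_stratum maps stratum label -> {ED id -> list of postcodes}.
    Simple random sampling at both stages is an assumption of this sketch.
    """
    rng = random.Random(seed)  # seeded so the selection is reproducible
    sample = {}
    for stratum, eds in eds_by_stratum.items():
        # Stage 1: pick EDs within the stratum.
        chosen_eds = rng.sample(sorted(eds), min(eds_per_stratum, len(eds)))
        for ed in chosen_eds:
            postcodes = eds[ed]
            # Stage 2: pick postcodes within the selected ED.
            sample[ed] = rng.sample(postcodes, min(postcodes_per_ed, len(postcodes)))
    return sample


# Hypothetical frame: two strata, three EDs.
frame = {
    "easy": {"ED1": ["A", "B", "C", "D"], "ED2": ["E", "F", "G"]},
    "hard": {"ED3": ["H", "I", "J", "K", "L"]},
}
selected = sample_postcodes(frame, eds_per_stratum=1, postcodes_per_ed=3)
```

Each selected postcode would then be re-enumerated in full by the survey fieldforce.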
Post Enumeration Survey - Field • Aim: enumerate all the people and households in the sampled areas • Carry out the survey after the Census – Census fieldwork finished • Independence critical (see later) – Interview based – Independent re-enumeration – Separate fieldforce and management – No address list (UK have address list for Census) – Difficult if doing quality at same time, as not independent
Post Enumeration Survey - Field • In UK, focused on measuring coverage – Previously measured quality as well – Found that separate surveys more effective – Can focus on getting maximal response in sampled areas • UK 2001 PES used very short interview – key household and demographic questions only: • Accommodation type • Tenure • Name • Gender • Date of Birth (or Age) • Student • Ethnicity • Activity last week
Post Enumeration Survey - Field • Other initiatives to maximise response: – Pairwork and teamwork – Refusal avoidance training – Calling strategy – Up to 10 attempts to interview – Last attempt deliver form to return in post
Post Enumeration Survey • Interviewer Duties: – Establish the postcode boundaries – Conduct independent listing of all residential and non-residential addresses – Seek out obscure accommodation – Deliver advance notification cards – Identify/probe for all households at an address – Make contact with householders – Conduct doorstep interviews – Persuade potential refusals – Report progress
Post Enumeration Survey • Map
Post Enumeration Survey • Property Listing
Analysing the data - Matching • Match Census returns to CCS returns • Require very high quality – Minimise false negative matches (missed matches, see later) • In 2001, we used hierarchical nature of data to help match – Match within sampled areas (geographical blocking) – First match household – Then match persons within households
Analysing the data - Matching • Used a five stage strategy, designed to minimise false negative matches: – Exact matching – High probability matching – Clerical assisted probability matching – Clerical matching – Final expert review of non-matches • Developed our own in-house system • Allowed access to scanned form images (this was crucial)
[Figure: example record (PO15 5RR, 29, ERIC SMITH, 13, MALE, SINGLE)]
Analysing the data - Matching • Output: – Match between Census and CCS – Census only – CCS only
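As a rough illustration of the first (exact matching) stage with geographical blocking, the sketch below compares person records only within the same postcode and produces the three outputs listed above. The record fields (`id`, `postcode`, `surname`, `forename`, `dob`, `sex`) and the function names are hypothetical; the in-house ONS system is not described in enough detail here to reproduce it, and the later probabilistic and clerical stages are not shown.

```python
from collections import defaultdict


def normalise(record):
    """Build a comparison key from hypothetical fields, ignoring case and spacing."""
    return (
        record["surname"].strip().upper(),
        record["forename"].strip().upper(),
        record["dob"],
        record["sex"].strip().upper(),
    )


def exact_match(census, ccs):
    """Exact matching within postcode blocks.

    Returns (matched, census_only, ccs_only), where matched is a list of
    (census_id, ccs_id) pairs and the other two are lists of record ids.
    """
    # Geographical blocking: index CCS records by postcode, then by key.
    ccs_by_block = defaultdict(dict)
    for rec in ccs:
        ccs_by_block[rec["postcode"]].setdefault(normalise(rec), []).append(rec["id"])

    matched, census_only = [], []
    for rec in census:
        candidates = ccs_by_block[rec["postcode"]].get(normalise(rec), [])
        if candidates:
            # Consume the CCS record so it cannot be matched twice.
            matched.append((rec["id"], candidates.pop(0)))
        else:
            census_only.append(rec["id"])

    # Whatever is left in the index was seen by the CCS only.
    ccs_only = [i for block in ccs_by_block.values() for ids in block.values() for i in ids]
    return matched, census_only, ccs_only


# Hypothetical records for illustration only.
census_records = [
    {"id": "c1", "postcode": "AB1 2CD", "surname": "Smith", "forename": "Eric", "dob": "1972-01-13", "sex": "male"},
    {"id": "c2", "postcode": "AB1 2CD", "surname": "Jones", "forename": "Ann", "dob": "1980-05-02", "sex": "female"},
]
ccs_records = [
    {"id": "s1", "postcode": "AB1 2CD", "surname": "SMITH", "forename": "ERIC ", "dob": "1972-01-13", "sex": "MALE"},
    {"id": "s2", "postcode": "AB1 2CD", "surname": "Brown", "forename": "Lee", "dob": "1990-09-30", "sex": "male"},
]
matched, census_only, ccs_only = exact_match(census_records, ccs_records)
```

Records left unmatched at this stage would pass to the later stages rather than being treated as final non-matches, since a missed match inflates the estimate.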
Analysing the data – Estimation • Dual System Estimation (DSE) – Capture-recapture as used for wildlife • Simple example: How many fish in a lake? – Catch as many as possible on day 1 • Count them (N1) • Mark with a red dot • Return them to the lake – Catch as many as possible on day 2 • Count them (N2) • Count how many have red dots (N12) – Number of fish in lake = (N1 * N2) / N12
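The fish example can be written as a small function. The function name and the example counts are made up for illustration; the formula is the one on the slide.

```python
def fish_in_lake(n1, n2, n12):
    """Capture-recapture estimate: (N1 * N2) / N12.

    n1  -- fish caught and marked on day 1
    n2  -- fish caught on day 2
    n12 -- fish caught on day 2 that carry a red dot
    """
    if n12 <= 0:
        raise ValueError("need at least one recaptured fish to form the estimate")
    return n1 * n2 / n12


# Hypothetical counts: 100 marked on day 1, 80 caught on day 2, 20 of them marked.
estimate = fish_in_lake(100, 80, 20)  # 400.0
```

With these counts a quarter of the day 2 catch is marked, so the 100 marked fish are taken to be a quarter of the lake.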
Analysing the data - Estimation • Use matched Census+CCS data • DSE estimates adjustment for those missed in both Census and CCS

                         Counted by CCS
                         Yes     No      Total
Counted by Census  Yes   n11     n10     n1+
                   No    n01     n00     n0+
                   Total n+1     n+0     n++

DSE count (for a postcode): n++ = (n1+ x n+1) / n11
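A minimal sketch of the DSE count computed from the three matching outputs, taking the first subscript as Census and the second as CCS (an assumption about the table orientation; the formula is symmetric, so the total is unaffected). The function name and the example counts are hypothetical.

```python
def dse_count(matched, census_only, ccs_only):
    """Dual System Estimate for one postcode.

    matched     -- n11, counted by both Census and CCS
    census_only -- n10, counted by Census only
    ccs_only    -- n01, counted by CCS only
    Returns (n_total, n_missed_by_both), i.e. (n++, n00).
    """
    if matched <= 0:
        raise ValueError("DSE is undefined when there are no matched records")
    n1_plus = matched + census_only   # Census count, n1+
    n_plus1 = matched + ccs_only      # CCS count, n+1
    n_total = n1_plus * n_plus1 / matched
    n_missed_by_both = n_total - (matched + census_only + ccs_only)
    return n_total, n_missed_by_both


# Hypothetical postcode: 18 matched, 2 Census only, 1 CCS only.
total, missed_by_both = dse_count(18, 2, 1)
```

Here the Census counted 20 people and the CCS 19, so the estimate is 20 x 19 / 18, a little over 21, with a small fraction of a person estimated as missed by both sources.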
Analysing the data - Estimation • DSE assumptions – Independence – Homogeneity of capture probabilities – Perfect matching – Closure – No list inflation • Violation of these assumptions leads to bias (in both directions) • Lots of literature on DSE
Analysing the data – Estimation • DSE can only be used within the sample • Need additional step to get to population totals • In 2001, we used DSE at postcode level • Then used a ratio estimator to predict for non-sampled postcodes (again lots of literature)
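One simple form of ratio estimator is sketched below. It assumes the census count is the auxiliary variable, known for every postcode, and that a single ratio is applied within a stratum; the 2001 estimator was a modified version whose details are not given on these slides. Names and numbers are illustrative only.

```python
def ratio_estimate(sample_dse, sample_census, census_total):
    """Scale the census total for a stratum by the DSE/census ratio seen in the sample.

    sample_dse    -- DSE counts for the sampled postcodes
    sample_census -- census counts for the same sampled postcodes
    census_total  -- census count over all postcodes (sampled or not) in the stratum
    """
    census_in_sample = sum(sample_census)
    if census_in_sample <= 0:
        raise ValueError("sampled postcodes must contain some census-counted people")
    ratio = sum(sample_dse) / census_in_sample
    return ratio * census_total


# Hypothetical stratum: three sampled postcodes, 7000 people counted by the census overall.
stratum_estimate = ratio_estimate([21.0, 42.0, 10.5], [20, 40, 10], 7000)
```

In this example the sampled postcodes suggest the true population is about 5% above the census count, so the stratum estimate is about 7350.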
Analysis – Getting to small areas • Ratio estimator produced estimates for 500k population blocks • Needed estimates for Local Authorities (about 120k population) • Sample size not sufficient to do directly • So used small area estimation techniques – these borrow strength across areas – We used a fixed effect to model LA differences • LA population estimates from the model then constrained to EA totals
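The final constraining step can be illustrated with proportional scaling, so that the LA estimates within an EA sum to the EA total. Proportional scaling is an assumption made for this sketch; the slides do not say how the constraint was applied, and the LA names and figures are invented.

```python
def constrain_to_total(la_estimates, ea_total):
    """Scale model-based LA estimates so they add up to the EA total.

    la_estimates -- dict of LA name -> model estimate
    ea_total     -- ratio-estimator total for the Estimation Area
    """
    model_total = sum(la_estimates.values())
    if model_total <= 0:
        raise ValueError("LA estimates must sum to a positive number")
    factor = ea_total / model_total
    return {la: estimate * factor for la, estimate in la_estimates.items()}


# Hypothetical EA made up of three LAs whose model estimates sum to 500,000.
constrained = constrain_to_total(
    {"LA A": 100_000, "LA B": 150_000, "LA C": 250_000},
    ea_total=510_000,
)
```

Each LA keeps its share of the EA population under the model, while the EA total from the ratio estimator is preserved.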
Quick summary of 2001 UK method • In 2001, One Number Census methodology was developed – Large CCS (320,000 households) – Matching – Capture-recapture – Modified ratio estimator – Small area estimation to get LA totals – Imputation • Estimated 1.5 million households missed • 3 million persons missed (most from the missing households but some from counted households)
Results • England and Wales population about 50m individuals in 20m households • Estimated 1.5 million households missed • 3 million persons missed (most from the missing households but some from counted households)
Underenumeration in 2001
Response Rates in 2001
Summary • Fundamental that the census is good – This does not make a bad census good, it makes a good census better! • US, Australia, NZ, Canada, UK all measure coverage (and most use a PES) – All aim at measuring coverage for assessing census quality, most do not fully adjust the outputs – Coverage for most is around 96-98% – Increasing problems of overcoverage • The design and fieldwork of the PES are important to get right
More info • Brown, J. J., Diamond, I. D., Chambers, R. L., Buckner, L. J., and Teague, A. D. (1999), “A methodological strategy for a one-number census in the UK,” Journal of the Royal Statistical Society A, 162, 247-267. • www.statistics.gov.uk/census2001/onc.asp • owen.abbott@ons.gov.uk