  • Number of slides: 21

Producing household estimates from administrative data: Methodology and analysis towards ONS Research Outputs 2016

Definitions
A household is defined as: one person living alone, or a group of people (not necessarily related) living at the same address who share cooking facilities and share a living room or sitting room or dining area.

Aims
Produce household statistics as part of Research Outputs 2016. Three types of statistics over the next few years:
• Number of households
• Household size
• Household composition
The priority for 2016 is household numbers, which will be:
• Derived from the same base as the population estimates (SPD)
• Released in a similar output package to the population estimates time series
• Produced at various levels of geography
• Updated in line with newer SPDs

What data can we use?
• AddressBase
• Population Coverage Survey
• Tax and Benefits data

Comparing with other ONS outputs
• There are no mid-year estimates for households, as there are for population.
• Quality can be evaluated for 2011 by comparing with Census estimates, down to OA level.
• DAU produces national estimates for 1996 onwards:
  • Families and people in families
  • Households and people in households
These are produced from the LFS, with a sample size of 41,000 households containing around 100,000 individuals. Internally, estimates can be produced at Local Authority level.
Abbreviations: OA = Output Area; DAU = Demographics Analysis Unit; LFS = Labour Force Survey; SPD = Statistical Population Database.

Can AddressBase help?
There are 1128 classifications of address on AddressBase, an Ordnance Survey product, including care home, house boat and caravan. Classifications can have up to four levels of detail (many do not use all four) and have dates attached, which allows further validation.
Primary classifications:
C   Commercial
L   Land
M   Military
O   Other (Ordnance Survey Only)
P   Parent Shell
R   Residential
U   Unclassified
X   Dual Use
Z   Object of Interest
Residential classifications:
RB  Ancillary Building
RC  Car Park Space
RD  Dwelling
RG  Garage
RH  House In Multiple Occupation
RI  Residential Institution

Address Matching
• Address matching methodology is developing at ONS, with an estimated 5% increase in match rate.
• A reliable unique identifier for addresses is needed, hence the transition from OSAPR to UPRN.
Abbreviations: OSAPR = Ordnance Survey Address-Point Reference Number; UPRN = Unique Property Reference Number.

Changes in address identification
[Chart: OSAPRs (2013) and UPRNs (2014) by region, in millions. Regions: Yorkshire and The Humber, West Midlands, Wales, South West, South East, North West, North East, London, East of England, East Midlands.]
Currently ONS has attached OSAPRs to records up to 2013, with a switch to UPRNs in 2014. We would expect an increase of around 1% due to housing stock growth. 77% of LAs show an increase of more than 1%.

Challenges
Our three biggest challenges for producing household numbers:
1. Definition: household and address do not have a one-to-one relationship.
2. Half weights on SPD: these arise when sources disagree.
3. Correct address allocation, which is affected by:
• data lags
• high churn
• people not deregistering
• poor AddressBase matching/allocation

Dealing with half sizes
Our objective is to count each person in a household, so unmatched records need to be resolved.
Two methods:
1. Redistribute according to household size distributions
2. Source preference: HESA → PR → CIS (based on the most likely address identifier, depending on demographic group and the dates on the data)
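As a rough illustration of the second method, the sketch below picks an address identifier by walking the sources in the order HESA, PR, CIS. The record layout and the function name `preferred_address` are assumptions made for this example. Applying one fixed order to every record is also a simplification, since the slide says the preferred source depends on demographic group and the dates on the data.

```python
from typing import Optional

# Preference order as listed on the slide. Using the same order for every
# record is an assumption made to keep the example short.
SOURCE_PREFERENCE = ("HESA", "PR", "CIS")


def preferred_address(addresses: dict) -> Optional[str]:
    """Return the identifier from the highest-preference source that has one.

    `addresses` maps a source name to an address identifier (or None when
    that source holds no usable address for the person). Hypothetical layout.
    """
    for source in SOURCE_PREFERENCE:
        identifier = addresses.get(source)
        if identifier:
            return identifier
    return None


# A person with conflicting PR and CIS addresses and no HESA record:
# the PR address is chosen because PR precedes CIS in the order.
chosen = preferred_address({"HESA": None, "PR": "12345", "CIS": "12346"})
```

A person who would otherwise be split as a half weight across two addresses is then counted once, at the chosen address.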

Dealing with half sizes
Comparing with Census Outputs
Overcounting large household sizes, whilst undercounting 1 and 2 person households.
[Chart: household counts by household size (1, 2, 3, 4, 5 plus, total households), Census (QS406EW) versus redistributed half sizes.]
[Chart: % differences for redistributed half sizes, by household size (axis from -25 to 25).]
It is anticipated that better address matching and the use of UPRNs rather than OSAPRs will resolve some of these differences.

Dual System Estimation
ONS often uses DSE to weight up for non-response. To trial the use of DSE to weight up for undercount, I used a 4% sample by postcode taken from the Census as a proxy for a survey. To allow for differences between samples, 400 samples were taken. In the future, an annual survey similar to a coverage survey could contribute.
[Chart: SPD % difference by area (axis from -14 to 0). Areas: England and Wales, Yorkshire and The Humber, West Midlands, Wales, South West, South East, North West, North East, London, East of England, East Midlands.]

Dual System Estimation
For each LA, then aggregate:
$$\text{DSE estimate} = \frac{\text{Census addresses} \times \text{SPD addresses}}{\text{Matched addresses}}$$
Then to scale up to England and Wales:
[Diagram labels: Entire population match; Sample population]
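The per-LA calculation can be sketched in a few lines. This is the standard two-list (capture-recapture) form of the estimator; the function names, the LA labels and the counts are illustrative assumptions, not figures from the presentation.

```python
def dse_estimate(census_addresses: int, spd_addresses: int, matched_addresses: int) -> float:
    """Dual system estimate: (Census addresses * SPD addresses) / matched addresses."""
    if matched_addresses <= 0:
        raise ValueError("at least one matched address is needed")
    return census_addresses * spd_addresses / matched_addresses


def aggregate_estimates(counts_by_la: dict) -> float:
    """Sum the per-LA estimates. Each value is (census, spd, matched)."""
    return sum(dse_estimate(*counts) for counts in counts_by_la.values())


# Made-up counts for two hypothetical Local Authorities.
example_counts = {
    "LA_A": (100, 90, 90),
    "LA_B": (200, 180, 180),
}
total = aggregate_estimates(example_counts)
```

When every SPD address is matched, the estimate for an LA equals its Census count. As the match rate falls, the estimate rises above the Census count.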

Dual System Estimation
Take 400 random samples of 4% of all postcodes to generate 400 samples of admin data and produce 400 estimations. There are 1,305,301 unique postcodes on the Census and 1,326,885 on the SPD. There are no missing values on either. We can examine each sample to see just how representative it is.
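The repeated sampling step might look like the sketch below. It draws each sample without replacement from the list of unique postcodes. The function name, the fixed seed and the synthetic postcodes are assumptions for illustration, and the presentation does not say how its samples were actually drawn.

```python
import random


def draw_postcode_samples(postcodes, n_samples=400, fraction=0.04, seed=0):
    """Draw n_samples independent random samples, each covering `fraction` of postcodes."""
    rng = random.Random(seed)           # fixed seed so the draws are reproducible
    unique = sorted(set(postcodes))     # deduplicate and fix the order
    k = max(1, round(len(unique) * fraction))
    return [rng.sample(unique, k) for _ in range(n_samples)]


# Synthetic postcodes stand in for the real Census postcode list.
synthetic_postcodes = [f"PC{i:04d}" for i in range(1000)]
samples = draw_postcode_samples(synthetic_postcodes)
```

Each of the 400 samples can then be used as the survey proxy for one estimation, and the spread of the 400 results shows how sensitive the estimate is to the sample drawn.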

Dual System Estimation
Impact of DSE on household counts
[Chart: DSE % difference and SPD % difference by area (axis from -14 to 0). Areas: England and Wales, Yorkshire and The Humber, West Midlands, Wales, South West, South East, North West, North East, London, East of England, East Midlands.]
• 85% of Local Authorities are within 0.5% of the Census estimate
• 90% of Local Authorities are within 1% of the Census estimate
• 95% of Local Authorities are within 1.5% of the Census estimate
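Summary figures of this kind can be computed as the share of areas whose estimate falls within a given percentage of the Census count. The sketch below uses made-up numbers for four hypothetical areas, and `share_within` is a name chosen for this example.

```python
def share_within(estimates: dict, census: dict, tolerance_pct: float) -> float:
    """Percentage of areas whose estimate is within tolerance_pct of the Census count."""
    within = 0
    for area, census_count in census.items():
        diff_pct = abs(estimates[area] - census_count) / census_count * 100
        if diff_pct <= tolerance_pct:
            within += 1
    return 100 * within / len(census)


# Made-up counts: the estimates differ from Census by 0.2%, 0.8%, 1.2% and 3.0%.
census_counts = {"A": 1000, "B": 1000, "C": 1000, "D": 1000}
dse_counts = {"A": 1002, "B": 1008, "C": 1012, "D": 1030}
shares = [share_within(dse_counts, census_counts, t) for t in (0.5, 1.0, 1.5)]
```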

Dual System Estimation
Influencing factors
• SPD match rate has a large influence on estimations
• Postcode density: may be higher in London?
• Communal establishments: need to be sure that they are not included in SPD numbers
It is hoped that a large number of unmatched addresses will be resolved in the near future, reducing the undercount of households seen in the SPD.

Allocating address at SPD record level
Using many data sources to find our ‘best’ address.
Benefits:
• Enables aggregation at different levels and cross-tabulation with other variables.
• Can weight certain data sources for different demographic groups, e.g. students.
Note: a non-valid UPRN may occur when the address given cannot be matched to one on the reference data, or is not in England and Wales.

Allocating address at record level
True match on SPD:
PR   Joe Bloggs 17/4/1974 UPRN: 12345
CIS  Joe Bloggs 17/4/1974 UPRN: 12346
Patient register moves (PDS):
Joe Bloggs 17/4/1974 move 1 - 1/1/2011: UPRN: 12345
Joe Bloggs 17/4/1974 move 2 - 2/2/2011: UPRN: 22345
Joe Bloggs 17/4/1974 move 3 - 3/3/2013: UPRN: 12346
Can use activity data to locate the newest address.
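Using the Joe Bloggs moves from the slide, locating the newest address amounts to taking the move with the latest date. The tuple layout and the function name `latest_uprn` are assumptions for this sketch, and the dates are read as day/month/year.

```python
from datetime import date

# Moves for the example record, as (move date, UPRN).
moves = [
    (date(2011, 1, 1), "12345"),   # move 1
    (date(2011, 2, 2), "22345"),   # move 2
    (date(2013, 3, 3), "12346"),   # move 3
]


def latest_uprn(moves):
    """Return the UPRN of the most recent move, or None if there are no moves."""
    if not moves:
        return None
    return max(moves, key=lambda move: move[0])[1]


newest = latest_uprn(moves)
```

Here the most recent move points to UPRN 12346, which agrees with the CIS record rather than the PR record.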

Using all addresses from data sources
If we consider any address found on any data source to be a live address, we could use a count as an estimate of all live addresses.
Pros: Includes all live addresses. Further refinement of addresses by address type can be applied, e.g. removal of communal establishments.
Cons: Loss of coherence with the population base (usual residency).
[Chart: Published* versus From Datasources for 2011, 2013 and 2014 (axis from 22,800,000 to 24,400,000). * 2011: Census Outputs; 2013 and 2014: Demographics Analysis Unit.]
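Counting every address that appears on any source is a set union. The sketch below also shows the refinement step of dropping addresses of an excluded type, such as communal establishments. The source names used as keys, the identifiers and the function name are illustrative assumptions.

```python
def count_live_addresses(sources: dict, excluded=frozenset()) -> int:
    """Count distinct address identifiers seen on any source, minus excluded ones."""
    live = set()
    for identifiers in sources.values():
        live.update(identifiers)
    return len(live - set(excluded))


# Made-up identifiers: "5" stands for a communal establishment.
example_sources = {
    "PR": {"1", "2", "3"},
    "CIS": {"2", "3", "4"},
    "HESA": {"5"},
}
all_live = count_live_addresses(example_sources)
without_communal = count_live_addresses(example_sources, excluded={"5"})
```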

Household Composition
[Chart: households in thousands by composition category, Census versus SPD. Categories: ONE PERSON - OVER 65; ONE PERSON - 65 AND UNDER; FAMILY - ALL OVER 65; FAMILY - MARRIED - NO CHILDREN; FAMILY - MARRIED - DEPENDENT CHILDREN; FAMILY - MARRIED - NON-DEPENDENT; FAMILY - COHABIT - NO CHILDREN; FAMILY - COHABIT - DEPENDENT CHILDREN; FAMILY - COHABIT - NON-DEPENDENT; LONE PARENT - DEPENDENT CHILDREN; LONE PARENT - NON-DEPENDENT CHILDREN; OTHER - DEPENDENT; OTHER - ALL STUDENTS; OTHER - ALL OVER 65; OTHER - OTHER.]
Initial investigative method using age, sex and hashed name similarity. Other data sources containing family relationships are being investigated, e.g. tax and benefits data.

Plans for the future
• This year: numbers of households by LA, England and Wales, 2011, for Research Outputs, Autumn 2016
• Future releases:
  • Household sizes
  • Household composition
  • Case studies of Local Authorities of interest
We have initiated an ONS household working group to join up different sectors of work, e.g. address register, commercial data (aerial photography), house price index, existing surveys.
Investigating production of an enhanced address register.