
  • Number of slides: 23

Some Initiatives on Combining Data to Support Small Area Statistics and Analytical Requirements at ONS-UK
Denise Silva and Philip Clarke
IAOS 2008, Shanghai: Reshaping Official Statistics

Outline
• Small Area Estimation (SAE) problem
• SAE history at ONS
• Successful ONS experiences on Small Area Estimation
• Small Area Estimation Project (SAEP) and the production of income estimates
• Small area estimation for the 2001 Census
• Small area estimates for unemployment
• Lessons learned
• Developments in progress
• Preparing for future challenges

Small Area Estimation Problem
• The challenge for NSIs: to produce reliable information under operational constraints
• Pressure to reduce sample sizes and respondent burden
• Data requested for small areas or domains, but limited resources for data collection
• In sample surveys: sample sizes are not large enough to provide reliable estimates for small areas
• Solution: borrow information from other related datasets
  – from similar areas
  – from previous occasions

Small Area Estimation Methods
• Methods for producing small area statistics from combined data sources
  – Rely on statistical models that relate survey data to auxiliary information
  – Auxiliary information: administrative data or census data
• Modelling procedure fitted using sample data
  – Dependent variable is the survey variable of interest
  – Independent variables are auxiliary data (covariates)
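The modelling idea above can be sketched in a few lines: fit a model on the areas that have survey data, then use the auxiliary covariate to predict for every area, including those without a sample. This is only a minimal illustration with a single covariate and ordinary least squares; the area labels, the toy numbers and the function name `fit_line` are assumptions made for the example and do not come from the slides.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical auxiliary covariate, known for every area (e.g. from admin data).
all_covariate = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0}

# Hypothetical survey means, available only for the sampled areas A, B and C.
survey_mean = {"A": 5.0, "B": 8.0, "C": 11.0}

# Fit the model using sampled areas only.
sampled = sorted(survey_mean)
intercept, slope = fit_line(
    [all_covariate[a] for a in sampled],
    [survey_mean[a] for a in sampled],
)

# Predict for all areas, including the unsampled area D.
model_estimates = {a: intercept + slope * x for a, x in all_covariate.items()}
print(model_estimates)
```

Area D has no survey data at all, yet it receives an estimate because the covariate is available for it. The models described in the slides that follow are richer (random effects, non-linear links), but they rest on the same principle.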

In the beginning…
• OPCS commissioned UoS to examine the potential for small area estimation
• OPCS + IoE projects: experimental studies on mental health estimation for DoH
• ONS = OPCS + CSO
(Timeline graphic, 1993 to 1999)

Small Area Estimation Projects Timeline
• Eurostat project: ONS, Statistics Finland + 2 universities
• ONS: Small Area Estimation Project (SAEP), research
• Earlier: first OPCS + ONS initial experiments; UoS report
(Timeline graphic, 1993 to 1999)

Small Area Estimation Projects Timeline
• 2001 Census tests: small area population estimates
• Earlier: SAEP; first OPCS + ONS initial experiments; UoS report
(Timeline graphic, 1993 to 1999)

Small Area Estimation Projects Timeline
• Small area estimates of unemployment: research
• 2001-2004: EURAREA research project (NSIs + universities, 7 European countries)
• 2001 Census implementation
(Timeline graphic, 2000 to 2008)

Small Area Estimation Projects Timeline
• Implementation stage: estimates of income and unemployment + current developments
• Earlier: 2001 Census + research on small area estimates of income and unemployment
• Model-based estimates for small areas are accepted as part of ONS's established statistical outputs
(Timeline graphic, 2000 to 2008)

Framework for small area estimation at ONS
• Successful experiences in social statistics:
  – estimates for income, unemployment and population (2001 Census)
• Auxiliary information from the Census and from administrative databases, available at aggregated levels
• Incompatible and changing boundary systems
  – variety of geographic unit types
  – boundaries do not always align

The SAEP methodology and income estimation
• Estimates published for 1998/99, 2001/02 and 2004/05 at local area level (7100 small areas)
• Survey data: mean household income from the Family Resources Survey
• Auxiliary data: Census variables, social benefit claimants, council tax banding, house sale price index and income tax data
• Linear regression for log income with an area random effect
• Model with unit-level response and area-level covariates
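A model of this kind, with a unit-level response, area-level covariates and an area random effect, is commonly written as below. The notation is assumed here for illustration and is not taken from the slides: i indexes households, j indexes areas, y_ij is household income, x_j is the vector of area-level covariates, u_j is the area random effect and e_ij is the household-level error.

```latex
\log y_{ij} = \mathbf{x}_j^{\top}\boldsymbol{\beta} + u_j + e_{ij},
\qquad u_j \sim N(0, \sigma_u^2),
\qquad e_{ij} \sim N(0, \sigma_e^2)
```

Because the covariates vary only between areas, the fixed part of the model gives one predicted value of log income per area. The random effect u_j captures between-area variation that the covariates do not explain.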

Small area estimation of unemployment level and rates for Local Authorities in GB
• ILO unemployment measured by the Labour Force Survey (LFS): the direct estimates
• LFS has an unclustered survey design, giving a sample in each Local Authority (400 areas)
• Small sample sizes at Local Authority (LA) level
• Use job seekers allowance claimant count as auxiliary data

Estimation of unemployment at Local Authority level
• Survey data: number of LFS respondents in each age/sex group in each LA who are unemployed
  – age groups: 16 to 24; 25 to 49; 50 and over
• Auxiliary data: job seekers allowance claimant counts + geographical region + ONS area classification
• Area-level model by age/sex group in each LA
• Binomial mixed model with a logistic link function
• Model relates the probability of an individual in an age/sex group being unemployed within each LA to the auxiliary data
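One common way to write a binomial mixed model with a logistic link is sketched below. The symbols are assumptions for illustration, not the slides' own notation: g indexes age/sex groups, j indexes Local Authorities, n_gj is the number of LFS respondents in group g of LA j, y_gj is the number of them who are unemployed, x_gj collects the auxiliary data and u_j is an LA-level random effect.

```latex
y_{gj} \mid p_{gj} \sim \mathrm{Binomial}(n_{gj},\, p_{gj}),
\qquad
\mathrm{logit}(p_{gj}) = \log\frac{p_{gj}}{1 - p_{gj}}
  = \mathbf{x}_{gj}^{\top}\boldsymbol{\beta} + u_j,
\qquad u_j \sim N(0, \sigma_u^2)
```

The logit link keeps the modelled unemployment probability between 0 and 1 whatever the values of the covariates.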

Small area estimation for the 2001 Census
• Key elements/stages of the One Number Census:
  – the Census itself
  – the Census Coverage Survey (CCS)
  – the matching of Census and CCS to estimate undercount
  – the process to obtain model-based population estimates for Local Authorities
  – the production of a database with individual and household level records

Small area estimation for the 2001 Census
• Survey data: Census Coverage Survey (CCS) count
• Auxiliary data: unadjusted census count
• Simple linear regression model through the origin with different coefficients (slopes) across Local Authorities
• Area-level model
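A regression through the origin with a separate slope per Local Authority can be sketched as follows. The slides do not say how the slopes were estimated, so ordinary least squares is assumed here purely for illustration; the LA names, the counts and the function name `slope_through_origin` are hypothetical.

```python
def slope_through_origin(pairs):
    """Least squares slope b for the no-intercept model ccs = b * census."""
    numerator = sum(census * ccs for census, ccs in pairs)
    denominator = sum(census * census for census, _ in pairs)
    return numerator / denominator


# Hypothetical (unadjusted census count, CCS count) pairs for sampled areas.
sample_by_la = {
    "LA1": [(100, 110), (200, 220)],
    "LA2": [(150, 150), (50, 50)],
}

# Hypothetical unadjusted census totals for the whole of each LA.
census_total_by_la = {"LA1": 1000, "LA2": 800}

# One slope per LA, then scale the LA census total by that slope.
slopes = {la: slope_through_origin(pairs) for la, pairs in sample_by_la.items()}
adjusted_total = {la: slopes[la] * census_total_by_la[la] for la in slopes}
print(slopes)
print(adjusted_total)
```

A slope above 1 means the coverage survey found more people than the census counted in the sampled areas, so the LA total is scaled up accordingly.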

Lessons learned
• Good small area estimates depend on:
  – adequacy of the modelling procedures
  – covariates with good predictive power
  – model validation prior to publication
• Big effort to transfer to the production area

Lessons learned
• Challenges:
  – ability to master the complexities of statistical theory
  – availability of relevant auxiliary data
  – acceptance of model-based estimates as official statistics outputs

When we find the answers… they change the questions
Labour market area
• Unemployment estimation at Parliamentary Constituency Area (PCA) level
  – Local Authority and PCA are non-nested geographies
  – Boundaries change over the years
  – Issue is to ensure consistency with LA estimates at comparable areas
• Consistent estimation of all three labour market states: employed, unemployed and not economically active

When we find the answers…
• Income estimation
  – Income distribution in each area
  – Threshold value defined as 60% of national median income (also considered a poverty line)
  – SAEP methods can be used to estimate the proportion of households below the poverty level
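The target quantity can be made concrete with a small example: compute the threshold as 60% of the national median, then the share of households in an area whose income falls below it. This is a direct calculation on made-up incomes, shown only to define the quantity; it is not the model-based SAEP estimator, and the function names are hypothetical.

```python
from statistics import median


def poverty_threshold(national_incomes, fraction=0.6):
    """Threshold defined as a fraction (here 60%) of national median income."""
    return fraction * median(national_incomes)


def share_below(incomes, threshold):
    """Proportion of households with income strictly below the threshold."""
    return sum(1 for income in incomes if income < threshold) / len(incomes)


# Made-up household incomes, for illustration only.
national_incomes = [100, 200, 300, 400, 500]
area_incomes = [100, 150, 200, 400]

threshold = poverty_threshold(national_incomes)
area_share = share_below(area_incomes, threshold)
print(threshold, area_share)
```

In practice the survey sample in a small area is too thin for this direct calculation to be reliable, which is why a model-based estimate of the proportion is needed.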

When we find the answers…
• Other projects (in earlier stages)
  – Small area estimation for business surveys
  – Statistical models to improve international migration small area statistics
• Research projects (with academic consultants)
  – Small area estimation of income distribution
  – Estimation of change over time
• ONS committed to quality whilst responding to users' requirements

Looking forward
• The more successful we are in obtaining small area estimates… the more complex the estimation system becomes
• Need to ensure:
  – comparability over time
  – consistency over areas/domains
  – coherence over variables
• More general issues
  – Analysis of change over time
  – Estimates and precision measures for broader areas

Preparing for future challenges
• For producing statistics from combined data sources
  – Methodology Directorate has established a strategy for developing data matching and data sharing
• Investigate models to improve comparability over time and consistency over geographies
• Need to experiment with resampling methods for calculating the precision of the estimates
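As a rough illustration of what a resampling method for precision looks like, the sketch below computes a naive bootstrap standard error for a sample mean. It assumes a simple random sample of independent values, so it does not reflect a complex survey design or the model-based setting of small area estimation; the data and the function name `bootstrap_se` are invented for the example.

```python
import random
from statistics import mean, stdev


def bootstrap_se(values, statistic=mean, replicates=1000, seed=0):
    """Standard deviation of a statistic over bootstrap resamples."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(values)
    estimates = [
        statistic([rng.choice(values) for _ in range(n)])  # resample with replacement
        for _ in range(replicates)
    ]
    return stdev(estimates)


# Made-up sample values, for illustration only.
sample = [12.0, 15.0, 9.0, 20.0, 14.0, 11.0, 17.0, 13.0]
se = bootstrap_se(sample)
print(round(se, 3))
```

For model-based small area estimates the resampling would have to mimic the model, the sampling design, or both, which is one reason the slides list this as something still to be experimented with.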

THE END
For references: see paper
"Data! data!" he cried impatiently. "I can't make bricks without clay."
So said Sherlock Holmes to Dr. Watson in "The Adventure of the Copper Beeches" (by Sir Arthur Conan Doyle)