0fb295516557bf103aa53e03b468297c.ppt
- Number of slides: 15
Could that be true? Methodological issues when deriving educational attainment from different administrative data sources and surveys
Presentation for the IAOS Conference on Reshaping Official Statistics, Shanghai, October 14-16, 2008
Bart F. M. Bakker, Manager, Section Socio-Economic State, Statistics Netherlands

The problem
• Increasing use of administrative data for official statistics, because of:
  • lower costs
  • smaller response burden
  • coverage of all elements of the population, which allows small-domain statistics
• Surveys serve only as a supplement
• The problem:
  • unknown or poor quality of part of the administrative data
  • unknown or poor quality of statistical outcomes if administrative sources are combined

General idea
• Administrative data are collected with one or more traditional survey techniques, so they have the same errors as traditional surveys
• The size of the errors depends on the audits the register keeper carries out
• Variables that are important to the register keeper are assumed to be of better quality

An example: educational attainment
The goal of the project
• Determining the educational attainment of as many persons as possible
• that can be used to derive a background variable for all kinds of research
• and, if the validity is reasonable, can be used to estimate educational attainment in small areas and small subgroups
• No single register is available

Sources
• CRIHO: students in higher education, from 1986
• ERR: students who took an exam in general secondary education, from 1999
• Education Number Registers: students in general secondary education, from 2004
• CWI: job-seekers registered as such at the employment exchange, from 1990
• WSF: students with student grants, from 1999
• LFS: 1% samples of the population aged >15, from 1996

Table 1. The registers and their quality
Sources (columns): CRIHO | ERR | Education Registers | WSF | CWI

Measurement
• Validity (register variable): good | reasonable
• Measurement error (register variable): nil | nil | many
• Processing error (register variable): nil | few | few
• Processing error (statistical variable): nil | nil | many

Representation
• Coverage error (register target population): nil | a few schools are missing | from second year alright, improvements still possible | nil
• Coverage error (statistical target population):
  • CRIHO: only public higher education in the Netherlands, from 1986
  • ERR: only (large part of) public secondary general education, from 1999
  • Education Registers: only (large part of) secondary education, from 2003
  • WSF: only higher education in the Netherlands, from 1995
  • CWI: only a large part of jobseekers, from 1990
• Linking error (statistical target population): nil | nil | few
• Correction error (statistical target population): nil | nil | nil

Micro-integration: harmonisation
• Determine the classification of educational attainment
• Harmonise the copied information on the training programmes
• Derive the classification
• Derive whether certificates were attained
• Derive the date on which the certificates were attained

Micro-integration: correction for measurement errors
Is the educational attainment valid at the reference date?
1. Border beyond which the probability that someone will attain a higher level is <5%
2. Probability <5% that someone has attained a higher level since the latest certificate was attained
Both determined empirically with the use of life tables

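The first rule can be illustrated with a minimal sketch. It assumes the life table has already been reduced to a mapping from age to the probability of still attaining a higher level. The function name `age_border` and the example figures are hypothetical and not taken from the project; the sketch covers only rule 1.

```python
from typing import Dict, Optional

THRESHOLD = 0.05  # the 5% limit named on the slide


def age_border(p_higher_by_age: Dict[int, float],
               threshold: float = THRESHOLD) -> Optional[int]:
    """Return the youngest age from which the probability of still
    attaining a higher level stays below the threshold, or None if
    even the oldest age in the table is at or above the threshold."""
    border = None
    # Walk from the oldest age downwards and stop at the first age
    # whose probability is no longer below the threshold.
    for age in sorted(p_higher_by_age, reverse=True):
        if p_higher_by_age[age] >= threshold:
            break
        border = age
    return border


# Hypothetical life-table output for one educational level (invented figures).
example = {25: 0.30, 30: 0.12, 35: 0.06, 40: 0.04, 45: 0.02}
border = age_border(example)
```

With these invented figures, a score at this level would be treated as valid at the reference date for persons at or above the returned border age.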
Micro-integration: correction for measurement errors
• For one person on one reference date, more than one valid score on educational attainment can be available
• Choose the source with the best quality:
  1. CRIHO, Education Number Register, ERR
  2. LFS
  3. WSF
• CWI only for weighting

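The source ranking on this slide can be sketched as a simple priority lookup. The ranking itself comes from the slide; the data layout (a list of `(source, level)` pairs) and the function name `best_score` are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Priority taken from the slide: lower number = better quality.
# CWI is deliberately absent because it is used only for weighting.
SOURCE_PRIORITY = {
    "CRIHO": 1,
    "Education Number Register": 1,
    "ERR": 1,
    "LFS": 2,
    "WSF": 3,
}


def best_score(scores: List[Tuple[str, int]]) -> Optional[Tuple[str, int]]:
    """Pick the (source, level) pair from the best-ranked source.

    `scores` holds the scores that are valid on the reference date for
    one person. Scores from unranked sources (such as CWI) are ignored.
    Ties within the same rank keep the first pair listed.
    """
    usable = [s for s in scores if s[0] in SOURCE_PRIORITY]
    if not usable:
        return None
    return min(usable, key=lambda s: SOURCE_PRIORITY[s[0]])


# Hypothetical scores for one person; levels are invented ordinal codes.
chosen = best_score([("WSF", 5), ("LFS", 4), ("CRIHO", 6)])
```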
Derive educational attainment
Derive the highest educational level attained from:
• all training programmes followed before the reference date
• the certificates attained before the reference date
• validity on the reference date
• choose the source with the best quality
• downgrade training programmes that were followed but not completed with a certificate
• impute with the use of age <15 years

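A rough sketch of the derivation step follows. The slide does not say how far an uncertified programme is downgraded or which level is imputed for persons under 15, so the one-level downgrade and the imputed `MIN_LEVEL` below are assumptions, as are the integer level codes and all names. Validity checks and source selection are left out.

```python
from datetime import date
from typing import List, NamedTuple, Optional


class Programme(NamedTuple):
    level: int        # hypothetical ordinal code, higher = more education
    end: date         # date the programme ended
    certified: bool   # True if a certificate was attained


MIN_LEVEL = 1  # assumed lowest level, also used as the imputed value under 15


def highest_level(programmes: List[Programme],
                  reference: date,
                  age: int) -> Optional[int]:
    """Highest level attained before the reference date, or None if unknown."""
    if age < 15:
        return MIN_LEVEL  # assumed imputation rule for persons under 15
    levels = []
    for p in programmes:
        if p.end >= reference:
            continue  # only programmes ended before the reference date count
        # Assumption: a programme without a certificate counts one level lower.
        levels.append(p.level if p.certified else max(MIN_LEVEL, p.level - 1))
    return max(levels) if levels else None


# Invented history for one person.
history = [
    Programme(3, date(2000, 6, 30), True),
    Programme(5, date(2004, 6, 30), False),
    Programme(6, date(2009, 1, 1), True),
]
level = highest_level(history, date(2008, 1, 1), age=30)
```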
Results: coverage
[Figure: coverage (0% to 100%) by age (0 to 105). Series: register 15+; LFS 15+; PR 0-14 + register 15+ + LFS 15+]

Weighting the data
Coverage shows selectivity:
• underrepresentation of vocational education at secondary level
• overrepresentation of youngsters
Weighting to the population results in two vectors:
• the valid scores on educational attainment on the reference date
• a weight

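The slide does not name the weighting method. As one possible illustration, the sketch below uses simple post-stratification, which is an assumption: each person with a valid score gets the weight population count divided by covered count for their stratum. The strata and counts are invented.

```python
from typing import Dict


def poststratification_weights(population: Dict[str, int],
                               covered: Dict[str, int]) -> Dict[str, float]:
    """Weight per stratum = population count / count with a valid score.

    Strata without any covered persons are skipped, because no weight
    can be assigned when there is nobody to carry it.
    """
    return {
        stratum: population[stratum] / n
        for stratum, n in covered.items()
        if n > 0
    }


# Invented counts: youngsters are overrepresented among the covered persons.
population = {"15-24": 100, "25-64": 400}
covered = {"15-24": 80, "25-64": 100}
weights = poststratification_weights(population, covered)

# The weighted covered counts reproduce the population totals per stratum.
weighted_totals = {s: weights[s] * covered[s] for s in weights}
```

Under this scheme the overrepresented group receives the smaller weight, which is how the selectivity in coverage would be compensated.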
Conclusions
• Administrative data have the same errors as traditional surveys
• And some more…
• Combining data from registers and surveys is promising
• But complicated
• Always do research on the quality of the administrative data