Office for National Statistics
Imputing the 2011 UK Census in an automated production environment
Presented by: Leone Wardman
Sept 2011

Why Automate?
• Security
• Record keeping / Audit trail
• Version control
• Volumetrics
• 24/7 operation

The planned system
Figure 1: 2011 UK Census Automated data processing system
• Bespoke system
• 4.5 years to develop
• Oracle database
• Java platform
• Windows and Unix servers

The final system
Figure 2: Interaction between the Census system and manual processes

The Edit and Imputation process
- Modularised
- Automated
- CANCEIS
Figure 3: Automated Imputation Process for 2011 Census

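The modular approach above can be sketched in a few lines of Python. This is a simplified illustration only, not the CANCEIS algorithm or the production system: the function names (`distance`, `impute_module`, `run_pipeline`), the variable names and the mismatch-count distance are all assumptions made for the example. Variables are imputed one module at a time from the nearest complete donor, and values settled in earlier modules are treated as fixed matching variables in later ones.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[int]]


def distance(rec: Record, donor: Record, variables: List[str]) -> int:
    """Count the variables on which rec and donor disagree (missing values in rec are skipped)."""
    return sum(1 for v in variables if rec[v] is not None and rec[v] != donor[v])


def impute_module(records: List[Record], module_vars: List[str], fixed_vars: List[str]) -> None:
    """Fill missing values in module_vars from the nearest complete donor (hypothetical sketch)."""
    match_vars = fixed_vars + module_vars
    # Donors are records with no missing values on any matching variable.
    donors = [r for r in records if all(r[v] is not None for v in match_vars)]
    for rec in records:
        missing = [v for v in module_vars if rec[v] is None]
        if not missing:
            continue
        if not donors:
            raise ValueError("no complete donor available for this module")
        donor = min(donors, key=lambda d: distance(rec, d, match_vars))
        for v in missing:
            rec[v] = donor[v]  # only missing values are filled; observed values stay as they are


def run_pipeline(records: List[Record], modules: List[List[str]]) -> List[Record]:
    """Run the modules in order; variables from earlier modules become fixed for later ones."""
    fixed: List[str] = []
    for module_vars in modules:
        impute_module(records, module_vars, fixed)
        fixed = fixed + module_vars
    return records


# Toy data: the second person is missing "sex" (module 1) and "job" (module 2).
people = [
    {"age": 30, "sex": 1, "job": 5},
    {"age": 30, "sex": None, "job": None},
    {"age": 8, "sex": 2, "job": 0},
]
run_pipeline(people, [["age", "sex"], ["job"]])
```

In this toy run the second record takes both missing values from the first record, its closest donor on age. Because earlier modules are never revisited, any inconsistency discovered in a later module has to be resolved within that later module.
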
Did it work? Yes!
• 99.9% of person imputation was carried out by the automated method
• 0.001% of persons had non-statistical imputation
• 99.08% of household imputation was carried out by the automated method
• 0% of households had non-statistical imputation

Were there any problems? Of course!
• Edit Rules – changes to observed values
• Soft Edit Conditions – increasing rare characteristics
• Missingness in addresses – affecting the pass rates, changes to observed values

Issue 1: Edit Rule Implementation
Figure 4: Edit Rules in a modular imputation approach

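One way an edit rule can force a change to an observed value in a modular design is sketched below. The rule (a person under 16 cannot be married), the variable names and the `resolve` helper are hypothetical; the slides do not list the actual 2011 edit rules or reproduce the content of Figure 4. The sketch assumes that age was imputed in an earlier module and is therefore fixed, so the only way to satisfy the rule in the later module is to overwrite the observed marital status.

```python
from typing import Dict, List, Set


def violates_edit(record: Dict[str, object]) -> bool:
    """Hypothetical hard edit: a person under 16 cannot be recorded as married."""
    return record["age"] < 16 and record["marital_status"] == "married"


def resolve(record: Dict[str, object], fixed_vars: Set[str], donor: Dict[str, object]) -> List[str]:
    """Restore consistency by changing only variables that are not fixed.

    Returns the names of the variables that were overwritten with donor values.
    """
    changed: List[str] = []
    for var in ("age", "marital_status"):
        if not violates_edit(record):
            break
        if var not in fixed_vars:
            record[var] = donor[var]
            changed.append(var)
    return changed


donor = {"age": 35, "marital_status": "single"}

# Age was imputed as 12 in an earlier module and is fixed; marital status was observed.
person = {"age": 12, "marital_status": "married"}
changed_when_fixed = resolve(person, fixed_vars={"age"}, donor=donor)

# Same record, but the earlier module's value is allowed to change.
person_free = {"age": 12, "marital_status": "married"}
changed_when_free = resolve(person_free, fixed_vars=set(), donor=donor)
```

With age fixed, the observed marital status is the value that gets overwritten. With nothing fixed, the previously imputed age is changed instead and the observed value survives.
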
Issue 2: Soft Edit Implementation
Table 1: Records with at least one Soft Edit condition present

Ideas for the future
• Should we impute addresses in a separate module?
• Could we allow values in earlier modules to change instead of fixing the values?
• Would using reordering prevent the edit rule problems from occurring?