Development of UK Virtual Microdata Laboratory
Felix Ritchie
Shanghai, March 2010

Plan of presentation
• Starting principles
• What we did, and the impact
• New things we had to develop
  – security model, researcher management, SDC
• What we’ve learnt
  – what matters, what doesn’t, what we’d do differently
• Future directions

Starting principles
• Designed by researchers for research
  – maximum access, limited by law
• Expandable
• Secure at reasonable cost
• Manageable at reasonable cost
• Distribute access, not data

Distributed access
• Why is this good?
  – Data always under ONS control
  – Live monitoring
  – Simpler, but safer, disclosure control
• How does this work in practice?
  – VML accessible from all ONS computers
  – Access points in govt. offices in Glasgow and Belfast
  – Plan to roll out to more govt. offices in 2010
  – VML-duplicate set up on academic network
• VML set to become exception rather than default data store

What we did
• Central data repository and processors
• Access via secured thin clients
• Work space partitioned by dataset, not usage
  – researchers get access to dataset, not variables
• No access to internet or rest of network
• Same system for internal and external users
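
The dataset-level partitioning in the list above can be sketched roughly as follows. This is an illustration only: the class name `Workspace`, its methods, and the project and dataset names are hypothetical and are not taken from the VML. The point it shows is that permission is attached to a whole dataset, so the variable requested plays no part in the decision.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Workspace:
    """Hypothetical model of a project work space partitioned by dataset."""

    project: str
    datasets: set[str] = field(default_factory=set)  # datasets granted so far

    def grant(self, dataset: str) -> None:
        # Access is granted to a whole dataset, never to single variables.
        self.datasets.add(dataset)

    def can_read(self, dataset: str, variable: str) -> bool:
        # The variable name is deliberately ignored in the decision.
        return dataset in self.datasets


# All names below are made up for the example.
ws = Workspace(project="productivity-study")
ws.grant("enterprise_survey")
print(ws.can_read("enterprise_survey", "turnover"))
print(ws.can_read("census_sample", "age"))
```

A per-variable scheme would need a permission entry for every variable in every project; granting by dataset keeps the rule set small, which fits the stated aim of being manageable at reasonable cost.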

What we did - outcomes
• 30%-50% growth every year
• Massive increase in microeconomic analysis
  – From almost no firm-level studies to European leaders
• Keystone of ONS Administrative Data Project
• Total cost ~£350,000 per year
  – strategy 17%, fixed ops 65%, variable ops 18%
  – income ~£50,000
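
As a rough worked example, the cost shares on this slide can be turned into approximate amounts. The totals and percentages come from the slide; the variable and function names are made up for illustration, and the sketch assumes the income figure is also annual, which the slide does not state.

```python
from __future__ import annotations

# Approximate figures from the slide.
TOTAL_COST = 350_000  # pounds per year
SHARES = {"strategy": 17, "fixed ops": 65, "variable ops": 18}  # percent
INCOME = 50_000  # pounds; assumed here to be per year


def breakdown(total: int, shares: dict[str, int]) -> dict[str, int]:
    """Split a total into components using integer percentage shares."""
    return {name: total * pct // 100 for name, pct in shares.items()}


costs = breakdown(TOTAL_COST, SHARES)
net_cost = TOTAL_COST - INCOME  # cost not covered by income

for name, amount in costs.items():
    print(f"{name}: ~£{amount:,}")
print(f"net of income: ~£{net_cost:,}")
```

On these figures, fixed operations account for roughly £227,500 of the total, and income covers about one seventh of the annual cost.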

New things developed (1) The VML Security Model
  valid statistical purpose → safe projects
+ trusted researchers → safe people
+ anonymisation of data → safe data
+ technical controls around data → safe setting
+ disclosure control of results → safe outputs
= safe use

New things developed (2) Output statistical disclosure control
• ‘Standard’ SDC not appropriate
  – traditional rules not appropriate for research environments
  – SDC on data or methods pointless
• Principles-based output SDC
  – SDC at the point of release
  – trained researchers
  – trained staff
  – agreement on principles and purpose
  – safe vs unsafe outputs, based on functional form
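
One way to picture "safe vs unsafe outputs, based on functional form" is a default review route chosen from the type of output rather than from its values. The sketch below is hypothetical throughout: the names `Review`, `ASSUMED_SAFE_FORMS` and `default_review` are invented, and the slides do not say which functional forms fall in which class, so the single "safe" entry is an assumption made purely for illustration.

```python
from enum import Enum


class Review(Enum):
    RELEASE_BY_DEFAULT = "safe: release unless a reason not to is found"
    CHECK_BEFORE_RELEASE = "unsafe: hold until checked by trained staff"


# Illustrative assumption only: the slides do not list which functional
# forms count as safe or unsafe.
ASSUMED_SAFE_FORMS = {"regression coefficients"}


def default_review(functional_form: str) -> Review:
    """Pick the default review route from the output's functional form."""
    if functional_form in ASSUMED_SAFE_FORMS:
        return Review.RELEASE_BY_DEFAULT
    # Anything not known to be safe is treated as unsafe by default.
    return Review.CHECK_BEFORE_RELEASE


print(default_review("regression coefficients").name)
print(default_review("frequency table").name)
```

The default is only a starting point. Under a principles-based approach the final decision rests with trained staff at the point of release, which a lookup like this cannot replace.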

New things developed (3) Active researcher management
• Need to develop shared objectives with researchers
  – Principles-based SDC needs buy-in from researchers
  – Reduced management costs
• Compulsory training
  – SDC
  – VML objectives and constraints
  – legal and procedural background

What we’ve learnt (1) Things that matter
• attitude to researchers
• model of SDC
• broad scale of operations
  – including future plans
• scale of coherent networks (for remote access)
  – e.g. ONS internal network, Government Secure Intranet, University Intranet, VPN?

What we’ve learnt (2) Things that don’t matter
• Location of servers and users
• Type of data
• IT
• Metadata
• Specific legal/procedural framework?

What we’ve learnt (3) Things we would do differently
• Prepare ONS for expansion
  – senior buy-in
  – IT planning
• better data management
• better user management
• better metadata

Future directions
• Expansion across the government network
• Supporting academic equivalent
  – VML facing massive internal increase in use
• Developing international standards
• Better communication
  – wikis, FAQs, common metadata system
  – metadata
• Not being considered
  – remote job systems
  – synthetic data

Questions?
Felix Ritchie, felix.ritchie@ons.gsi.gov.uk
Microdata Analysis and User Support, maus@ons.gsi.gov.uk

Old stuff – if necessary

The data model (1)
• ‘Spectrum’ of access points balancing
  – value of data
  – ease of use
  – disclosure risk
• for a given level of confidentiality, maximise data use and convenience
• no ‘one-size-fits-all’ solution
  – no absolute prohibitions
  – trade-off is made explicit
  – users determine appropriate level of access

Use of confidential data: the access spectrum
[Diagram; its layout did not survive extraction. Surviving labels are listed below.]
• Row labels: Type of access; SDC of inputs; Restrictions on users; SDC of outputs; Examples
• Types of access: VML ONS sites; VML Govt sites; Secure data service; Special licences; Licensed data archive; Internet
• Scale values: None; Little; Many; Complete
• Other labels: RDCs; Anonymisation; Census data; Original data; Data for ONS linking; Enterprise data; Identified data for ONS linking; Identifiable data for analysis; ONS contractor; Govt. users only; Anon. CD-ROM; Web tables