a0e0d1931d2af1403581908bf0b97295.ppt
- Number of slides: 17
Open GSBPM compliant data processing system in Statistics Estonia (VAIS)
2011 MSIS Conference
Maia Ennok, Head of Data Warehouse Service, Data Processing Systems Department, Statistics Estonia
23 May 2011
Strategy of Statistics Estonia 2008–2011
“From data collector to information service provider”
Objective: High-quality information service
Standardise the process of data processing:
Indicator: Introduction of the unified data processing software
- Working out and introduction of the universal data processing information system
3/19/2018 Open GSBPM compliant data processing system in Statistics Estonia (VAIS)
Architecture of the information system
[Diagram labels: Metadata system iMETA; Economic entities; Persons; Administrative registers; Data collection; eSTAT; ADAM; Processing; VAIS; VVIS; Statistical registers; SRS; KUNDE; Statistical analysis; Data Warehouse; Dissemination; PX-Web; eGeostat; Census-HUB; Users]
Data processing system (VAIS)
- VAIS is a collection of tools and technologies aimed at automating data processing (Phase 5 in GSBPM).
- In essence, the task of checking, cleaning, and transforming statistical activity data can be described as taking the raw data from one or more sources and transforming it into the input database structures of the analytical system (the observation registry).
Framework for …
- Integrate data
- Classify & code
- Review, validate and edit
- Impute
- Derive new variables & statistical units
- Calculate weights
- Calculate aggregates
- Finalize data files
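The steps listed above can be read as a chain of functions applied in order. The following is a minimal sketch of that idea, not the VAIS implementation: it covers only three of the steps (validate, impute, aggregate), and the record fields, the validation rule, and the mean-based imputation are all assumptions chosen for illustration.

```python
from statistics import mean

# Hypothetical toy records for one statistical activity; field names are illustrative.
records = [
    {"unit": "A", "region": "north", "turnover": 100.0},
    {"unit": "B", "region": "north", "turnover": None},   # missing value
    {"unit": "C", "region": "south", "turnover": -5.0},   # fails validation
    {"unit": "D", "region": "south", "turnover": 40.0},
]


def validate(rows):
    """Review and validate: flag rows whose turnover is present but negative."""
    for row in rows:
        value = row["turnover"]
        row["valid"] = value is None or value >= 0
    return rows


def impute(rows):
    """Impute: replace missing or invalid turnover with the mean of accepted values."""
    accepted = [r["turnover"] for r in rows if r["valid"] and r["turnover"] is not None]
    fill = mean(accepted)
    for row in rows:
        needs_fill = row["turnover"] is None or not row["valid"]
        if needs_fill:
            row["turnover"] = fill
        row["imputed"] = needs_fill
    return rows


def aggregate(rows):
    """Calculate aggregates: total turnover per region."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["turnover"]
    return totals


def run_pipeline(rows, steps):
    """Apply the processing steps in order, feeding each step's output to the next."""
    result = rows
    for step in steps:
        result = step(result)
    return result


totals = run_pipeline(records, [validate, impute, aggregate])
print(totals)
```

Keeping each step as a separate function with the same input and output shape makes it possible to reorder, skip, or add steps per statistical activity without touching the others.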
Metadata driven template based tool
The template-driven approach provides a universal solution for the three main goals of the VAIS project:
- Create an easy-to-use statistical data processing tool that requires minimal programming skills for creating transformation packages.
- Create a metadata-driven, process-oriented and automated statistical data processing tool.
- Create an extendable data transformation tool.
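A rough illustration of the template-driven idea: a statement is written once as a template, and metadata rows fill in the specifics. This sketch uses Python's standard `string.Template` as a stand-in for the Velocity templates named later in the deck; the table name, rule conditions, and the SQL shape are assumptions, not taken from VAIS.

```python
from string import Template

# Hypothetical template for a validation step. It selects rows that violate a rule.
# In VAIS the templates would be Velocity templates; string.Template is a stand-in.
VALIDATION_TEMPLATE = Template("SELECT * FROM $table WHERE NOT ($condition)")

# Hypothetical metadata rows, as they might be read from a metadata repository.
rules = [
    {"table": "raw_turnover", "condition": "turnover >= 0"},
    {"table": "raw_turnover", "condition": "employees IS NOT NULL"},
]


def render_statements(template, metadata_rows):
    """Render one statement per metadata row by substituting its fields."""
    return [template.substitute(row) for row in metadata_rows]


statements = render_statements(VALIDATION_TEMPLATE, rules)
for statement in statements:
    print(statement)
```

Adding a validation rule then means adding a metadata row rather than writing new code, which is the sense in which the tool needs minimal programming skills.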
[Diagram labels: Design Phase; Common Metadata Repository; Data Sources for Statistical Activity N; Validation Rules for Statistical Activity N; Imputation Method for Statistical Activity N; Aggregation Def for Statistical Activity N; Target Dataset for Statistical Activity N; Common XDTL Packages (LOAD DATA, INTEGRATE DATA, VALIDATE, IMPUTE, AGGREGATE); Data processing package (XDTL) for Statistical Activity N]
Data processing with VAIS
- Automating and speeding up data transformation
- Raw data, transformation metadata and source data audit trails
- Metadata driven template based tool
- Balancing automation and manual intervention
VAIS architecture
Balancing automation and manual intervention
[Diagram labels: RAW data; Automated data processing; OK?; Manual data processing; Data Warehouse; Metadata (validation and transformation rules)]
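One way to picture the "OK?" decision in this diagram: records that pass every rule flow on automatically, while failing records become tasks for manual handling and are checked again after correction. The sketch below assumes this reading of the diagram; the rule names, record fields, and the task structure are hypothetical.

```python
def process(raw_rows, rules):
    """Route each row: rows passing every rule are accepted, the rest become tasks."""
    accepted, tasks = [], []
    for row in raw_rows:
        failed = [name for name, check in rules.items() if not check(row)]
        if failed:
            tasks.append({"row": row, "failed_rules": failed})
        else:
            accepted.append(row)
    return accepted, tasks


# Hypothetical validation rules; in a metadata-driven system these would come from metadata.
rules = {
    "has_id": lambda r: bool(r.get("id")),
    "age_in_range": lambda r: 0 <= r["age"] <= 120,
}

raw = [{"id": "p1", "age": 34}, {"id": "p2", "age": 340}]

# Automated pass: one row is accepted, one becomes a task for an operator.
warehouse, tasks = process(raw, rules)

# Manual intervention (simulated): the operator corrects the flagged value,
# and the corrected row goes through the same automated checks again.
corrected = {**tasks[0]["row"], "age": 34}
accepted_after_fix, still_open = process([corrected], rules)
warehouse.extend(accepted_after_fix)

print(len(warehouse), tasks[0]["failed_rules"], still_open)
```

Sending corrected rows back through the same rules keeps manual edits subject to the same checks as automated data.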
VAIS applications and roles
[Table: matrix of roles (Data Warehouse programmer, Administrator, Chief operator, Operator) against applications (VAIS Designer, VAIS Operator, URMA), with "x" marking each assigned role]
URMA
- User rights management application
- Allows existing users to be used for authorization
- Allows roles to be created and users to be linked with roles
- Allows rights to be set according to the domain of the statistical work
VAIS Designer
- Application for data processing design
- User interfaces for designing each processing procedure
- Procedures are grouped into packages
- Package setup follows the ETL policy
- Packages are designed for each version of a statistical work
VAIS Operator
- Allows the user to intervene manually in data processing.
- Allows the user to solve tasks created by data validation.
- The data processing report gives an overview of the data in process.
- Gives users the information they need to decide how to solve tasks.
Technical platform
VAIS is built on open-source, freely available technological components.
- XDTL (eXtensible Data Transformation Language, an XML-based descriptive language designed for specifying data transformations, see http://xdtl.org) run-time engine (XDTL RT).
- MMX Metadata Repository, part of Metadata Framework (a MOF compliant metadata management environment designed with a wide variety of metadata-driven applications in mind, see http://mmframework.org).
- Apache Foundation's Velocity template engine (http://velocity.apache.org) is used as the template engine, combining excellent template rendering functionality with a very easy-to-use template language.
- The user applications are programmed in Java, based on the Wicket MVC framework (http://wicket.apache.org).
- The Quartz scheduling framework (http://www.quartz-scheduler.org) is used for execution scheduling.
Implementation
- VAIS development: 05.2010 - 10.2011
- Data processing of the Population and Housing Census 2011 (31.12.2011)
- Reuse of administrative data (2012)
- Development of the data collection system for administrative data (ADAM) and of eSTAT, for prefilling questionnaires in eSTAT with administrative data (annual bookkeeping report) (31.08.2011). VAIS is used for converting administrative data into the statistical data format (for the year 2012, i.e. for the reference year 2011 data collection).
- Data processing of other statistical activities (first pilots 2013)
- Data processing of the next register-based Population and Housing Census (pilot 2014)
Questions? Thank you!