- Number of slides: 41
The Digital-Agenda-Data tool (DAD): its architecture, functionalities & developments
CONNECT-F 4 May 2015
http://digital-agenda-data.eu/
The first simple data-model
Attach level of metadata
• Levels: Org. publisher, Collection, Vehicle, Question-module, Variable, Indicators (%, ratio), Breakdown-dimension, Single observation
• Metadata: ESMS periodicity; definition, list of breakdowns, number of observations; concept definition; flag, notes
Dataset & level of metadata
• Levels: Org. publisher, Collection, Vehicle, Question-module, Variable, Indicators (%, ratio), Breakdown-dimension, Single observation
• Metadata: ESMS periodicity; definition, list of breakdowns, number of observations; concept definition; flag, notes
Breakdowns & dimensions
• Ref-time, ref-area – cross-tabulated
• Multiple dimensions … cross-tabulated
• Superposed in one dimension if used one by one as marginal distributions
Origin
From 6.000 observations in 30 tables to interactive, multi-format Web access to 1.000 observations
What is it?
• An IT tool to store, link and visualise "statistical" data and metadata
• Developed in line with the most recent standards for a "semantic" publication of Open Data
• To promote data sharing, linking, reuse and analysis by statisticians as well as non-statisticians, inside and outside EC services
A. Powerful visualisation tool
B. Powerful semantic data repository
… interactive and user friendly
• It is not only for statisticians, it is for everyone
• Enables creation of charts within a few clicks
• Provides data in a user-friendly format
• Enables navigation through charts
• With rich metadata to describe datasets, indicators, sources and single observations
The functionalities for users
• Let's have a quick live tour: http://ec.europa.eu/digital-agenda/en/create-graphs
– Navigation experience, keeping parameters
– Variety of charts and tables to explore cubes and compare different indicators
– Metadata always present: box and mouse-over
– Reuse tools at chart level: URL, images, download, share, embed
– Presentation of the dataset structure: list of indicators, dataset structure
– Reuse tools at dataset level: SPARQL, API, CSV, TSV, RDF file and graphs' URL
How it works: service architecture
Data Structure Definition
More details are available in the Annex of the technical reports (D2), published on the documentation page:
The functionalities for admin.
• Quick tour on CR – content registry: http://digital-agenda-data.eu/data
– Browse multiple datasets & their observations
– Search by keywords or observations
– Download code lists (with login)
– Upload "observations": via staging (SQL on Access db) and via Excel templates
– Maintenance via the Edit of Properties
– Announcement on the EU Open Data Portal
Quick tour of Daviz, the data visualisation wizard
Coming soon:
• data-linking via SPARQL to other repositories
• SDMX-ML interface (with EEA)
• ECAS (?) authentication/rights
• other data models, and new charts
• procurement clauses in DG CONNECT studies (original data to be provided in electronic formats, templates, promote semantic assets such as ELI)
Data-linking via SPARQL (qb vocabulary)
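A minimal sketch of what a query over data published with the RDF Data Cube (qb) vocabulary could look like. Only the `qb:` namespace is standard; the dataset URI and the function name are hypothetical placeholders, not DAD's actual identifiers.

```python
# Standard namespace of the W3C RDF Data Cube vocabulary.
QB_NS = "http://purl.org/linked-data/cube#"


def build_observation_query(dataset_uri: str, limit: int = 10) -> str:
    """Return a SPARQL query listing the properties of observations in one qb:DataSet.

    dataset_uri is supplied by the caller; the one used below is hypothetical.
    """
    if limit < 1:
        raise ValueError("limit must be positive")
    return (
        f"PREFIX qb: <{QB_NS}>\n"
        "SELECT ?obs ?property ?value\n"
        "WHERE {\n"
        "  ?obs a qb:Observation ;\n"
        f"       qb:dataSet <{dataset_uri}> ;\n"
        "       ?property ?value .\n"
        "}\n"
        f"LIMIT {limit}"
    )


query = build_observation_query("http://example.org/dataset/scoreboard", limit=5)
print(query)
```

Because every observation is typed as `qb:Observation` and linked to its dataset with `qb:dataSet`, the same query shape should work against any repository that follows the vocabulary, which is what makes linking across repositories feasible.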
CR - Content Registry
CR - Content Registry
Semantic data repository
• Provides a possibility of browsing and searching data and metadata
• Enables export of metadata
• Transforms data into machine-readable format
• Provides SPARQL endpoint
Only for administrators
• Interface for upload and update of data and metadata
CR - Content Registry
Diagram labels: Automatic mapping; SPARQL Endpoint; SQL Mapping; Templates for observations and metadata; MS Excel upload; MS Access Database upload; Daviz
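As an illustration of how a client might address a SPARQL endpoint, the sketch below builds a GET request URL that carries the query in the `query` parameter, as the SPARQL 1.1 Protocol defines. The endpoint address is a hypothetical placeholder, not the Content Registry's real one, and no request is actually sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint URL, used only for illustration.
ENDPOINT = "http://example.org/sparql"


def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a GET URL that passes the query text in the 'query' parameter."""
    return f"{endpoint}?{urlencode({'query': query})}"


url = sparql_get_url(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 3")
print(url)

# Round-trip check: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded)
```

A real client would then fetch this URL, for example with `urllib.request.urlopen`, typically asking for a result format such as `application/sparql-results+json` in the `Accept` header.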
CR – Spreadsheet upload
To upload data using Excel sheets, you need to use the predefined Excel templates for observations and for metadata.
CR – Database Upload
Diagram labels: File Upload; Conversion; Mapping; Staging Database; RDF Upload
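One way to picture the mapping and conversion steps named on this slide is a function that turns a staged row into RDF triples. This is only a sketch under stated assumptions: the base URI, the property URIs and the column names are hypothetical, and every value is written as a plain literal, which a real mapping would likely refine with code-list URIs and typed values.

```python
QB = "http://purl.org/linked-data/cube#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Hypothetical base URI for observations and properties (an assumption).
BASE = "http://example.org/scoreboard/"

KEY_COLUMNS = ("indicator", "breakdown", "unit", "country", "year")


def row_to_ntriples(row: dict) -> list:
    """Map one staged row (column name -> value) to N-Triples lines."""
    key = "/".join(str(row[c]) for c in KEY_COLUMNS)
    subject = f"<{BASE}observation/{key}>"
    # Type the resource as a Data Cube observation.
    triples = [f"{subject} <{RDF_TYPE}> <{QB}Observation> ."]
    for column, value in row.items():
        # Escape backslashes and quotes so the literal stays valid N-Triples.
        literal = str(value).replace("\\", "\\\\").replace('"', '\\"')
        triples.append(f'{subject} <{BASE}property/{column}> "{literal}" .')
    return triples


staged_row = {"year": 2013, "country": "EU", "indicator": "i_iuse",
              "breakdown": "total", "unit": "pc_ind", "value": 71}
triples = row_to_ntriples(staged_row)
print("\n".join(triples))
```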
CR – Database upload
It's Open Source Software
– new module for EEA Daviz
  • used by the European Environment Agency
  • free software, GPL v2 (or later) license
  • source code
    – https://github.com/eaudeweb/edw.datacube/
    – https://github.com/eaudeweb/scoreboard.visualization
    – https://github.com/eaudeweb/scoreboard.theme
– adaptation of Content Registry
  • used by EEA (EIONET)
  • free software, MPL 1.1
  • source code
    – https://svn.eionet.europa.eu/
PSI + 2011/833/EU + ODP + LOD
• scope includes administrative data + data produced on its behalf (art. 2, 1.(b))
• all such data shall be available for reuse (art. 4); exceptions
• services will "publish" through the EU open data portal, acting as a single point of access …
• … to its structured data so as to facilitate linking and reuse for commercial (and non-commercial) purposes (art. 5)
• datasets shall be made available in machine-readable format where possible and appropriate (art. 8)
• EC research is supporting standards, resources and tools to facilitate the reuse of data and particularly their linking: i.e. the LOD2 project and the ISA/Joinup initiative
Pressure on services --> diversification?
• Proliferation of Scoreboards (ENTR …)
• Reuse of ESTAT dissemination chain (COMP)
• In-house web solutions exploiting existing administrative databases (EMPL, BUDG)
• Commercial platforms such as Tableau or Socrata (SANCO, REGIO?) …
• Semantic formats without SPARQL (SANCO)
• Integrated semantic solutions (CNECT, OP?)
• Others?
But publishing is communication, challenging interoperability and calling for flexible solutions: standards + customisation
• "reuse" is not only the 5 stars, it is also the quality of multilevel metadata
• the advantage of statistical data: SDMX & DataCube consistency
• the advantage of an RDF repository with SPARQL: linking statistical data, and also linking statistics with other data
Possible priorities for cooperation
• Support/urge an RDF expression of SDMX
• Estat offering its data in RDF
• Estat opening a SPARQL endpoint
• Use "IT rationalisation" to promote LOD tech
• Participate in DAD development
• Training for key interservice networks (ODP, stat, IT, eval)
• Dialogue with research and commercial key actors
• …
DATA UPLOAD – DATA MODEL
The current data model is based on 5 dimensions - what does it mean?
- Each single observation is described with up to 5 characteristics: TIME, UNIT, REGION, INDICATOR, BREAKDOWN
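The five-dimension model can be sketched as a small record type. This is an illustrative assumption about how one might represent it in code, not the tool's actual implementation; the class and field names are hypothetical, and the example values come from the upload template.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    # The five dimensions named on the slide.
    time: str
    region: str
    indicator: str
    breakdown: str
    unit: str
    # The measured value plus optional annotations.
    value: Optional[float] = None
    flag: str = ""
    note: str = ""

    def key(self) -> tuple:
        """The five dimensions together identify one observation."""
        return (self.time, self.region, self.indicator, self.breakdown, self.unit)


obs = Observation("2013", "EU", "i_iuse", "total", "pc_ind", value=71.0)
print(obs.key())
```

Treating the five dimensions as a composite key means two rows with the same key describe the same observation, so an upload can detect duplicates or updates by comparing keys.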
The xls template for data (observations) in a flat format
Year  Country  Variable  Breakdown  Unit    Value  Flag  Note
2013  EU       i_iuse    total      pc_ind  71
2013  PT       i_iuse    F 25_54    pc_ind  54
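To show how rows in this flat format might be read programmatically, here is a minimal sketch that assumes the template has been exported to CSV (the real template is an Excel file, so this export step is an assumption). The column names follow the slide; the function name is hypothetical.

```python
import csv
import io

EXPECTED_COLUMNS = ["Year", "Country", "Variable", "Breakdown",
                    "Unit", "Value", "Flag", "Note"]

# Two rows shaped like the slide's example, written as CSV for illustration.
SAMPLE = """Year,Country,Variable,Breakdown,Unit,Value,Flag,Note
2013,EU,i_iuse,total,pc_ind,71,,
2013,PT,i_iuse,F 25_54,pc_ind,54,,
"""


def read_observations(text: str) -> list:
    """Parse flat-format rows and convert Value to float (blank becomes None)."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    rows = []
    for row in reader:
        # Accept a decimal comma (possible only in a quoted CSV cell).
        raw = row["Value"].strip().replace(",", ".")
        row["Value"] = float(raw) if raw else None
        rows.append(row)
    return rows


rows = read_observations(SAMPLE)
for row in rows:
    print(row["Country"], row["Variable"], row["Breakdown"], row["Value"])
```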
DATA PREPARATION – INDICATOR METADATA

OBSERVATION TEMPLATE
Year  Country  Variable  Breakdown  Unit    Value  Flag  Note
2013  EU       i_iuse    total      pc_ind  71,7
2013  PT       i_iuse    EB_F       pc_ind  54

INDICATOR TEMPLATE
notation: i_iuse
prefLabel: Individuals who are regular internet users (at least once a week)
altLabel: Regular internet users
definition: Individuals using the internet at least once a week in the last 3 months.
notes:
member-of: internet-usage
order: 2
source: estat-hh

INDICATOR GROUP TEMPLATE
notation                  prefLabel                 altLabel  order
internet-usage            Internet usage                      40
mobile                    Mobile market                       35
research-and-development  Research and Development            110
telecom                   Telecom sector                      10

SOURCE TEMPLATE
notation: estat-hh
prefLabel: Eurostat - ICT Households survey
definition: Eurostat - Community survey on ICT usage in Households and by Individuals
page: http://epp.eurostat.ec.europa.eu/portal/page/portal/information_society/introduction
notes: Extraction from HH/Indiv comprehensive database (ACCESS) version 29 April 2013

notation: estat-ent
prefLabel: Eurostat - ICT Enterprises survey
definition: Eurostat - Community survey on ICT usage and eCommerce in Enterprises
page: http://epp.eurostat.ec.europa.eu/portal/page/portal/information_society/introduction
notes: Extraction from ENT2 comprehensive database (NACE Rev 2 in ACCESS 133 MB) version 16 Apr 2013, and from ENT (NACE Rev 1.1 in ACCESS 104 MB) version 12 Dec 2011.
indicators
notation: i_iuse
prefLabel: Individuals who are regular internet users (at least once a week)
altLabel: Regular internet users
definition: Individuals using the internet at least once a week in the last 3 months.
notes:
member-of: internet-usage
order: 2
source: estat-hh

Source
notation: estat-hh
prefLabel: Eurostat - ICT Households survey
definition: Eurostat - Community survey on ICT usage in Households and by Individuals
page: http://epp.eurostat.ec.europa.eu/portal/page/portal/information_society/introduction
notes: Extraction from HH/Indiv comprehensive database (ACCESS) version 29 April 2013
How will such information be used?
• To design the charts (values, pref_labels)
• For the mouse-over (values, notes)
• In the metadata space (definitions, source)
• In drop-down filter menus (groups, alt_labels)
• To structure the URL as a query (notations)
• In the downloadable files …
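Two of these uses can be sketched briefly: notations as URL query values, and alt labels as drop-down text. The base URL and the parameter names below are hypothetical and do not describe the tool's real URL scheme; only the notation values and the label come from the slides.

```python
from urllib.parse import urlencode

# Hypothetical chart address; the real tool's URL scheme may differ.
CHART_BASE = "http://example.org/chart"

# Alt label keyed by notation, taken from the indicator example.
ALT_LABELS = {"i_iuse": "Regular internet users"}


def chart_url(indicator: str, breakdown: str, unit: str, country: str) -> str:
    """Encode the selected notations as query parameters of a chart URL."""
    params = {
        "indicator": indicator,
        "breakdown": breakdown,
        "unit-measure": unit,
        "ref-area": country,
    }
    return f"{CHART_BASE}?{urlencode(params)}"


def filter_label(notation: str) -> str:
    """Text for a drop-down filter entry; falls back to the notation itself."""
    return ALT_LABELS.get(notation, notation)


link = chart_url("i_iuse", "total", "pc_ind", "PT")
print(link)
print(filter_label("i_iuse"))
```

Keeping short, stable notations in the URL and human-readable labels in the interface means a shared or embedded link keeps working even if a label is later reworded.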
POWERFUL VISUALIZATION TOOL?
1. FILTERS – data selection
2. CHART AREA – print/download chart
3. METADATA – learn about the data
4. CHART OPTIONS – printing, data export, embedding, social media
5. NAVIGATION AREA


