  • Number of slides: 132

SDMX Basics: Statistical Data & Metadata
  • The Basics of the SDMX Information Model
  • SDMX tools
  • SDMX-ML messages
  • Major changes in SDMX 2.1

THE SDMX COMPONENTS
  • Technical Specifications: the SDMX Information Model; IT architectures for data exchange
  • Guidelines to harmonise content: the Content Oriented Guidelines (COG)
  • Tools: SDMX compliant tools

Describing the data exchange: Who? When? How? What? Who? Where? What?

Model of the statistical table. Example cell: Number; Tourism establishments; Italy; Annual data; 2529

Design a DSD: What do we need to do first?
  • Identify the Concepts
    – A concept is a unit of knowledge created by a unique combination of characteristics (SDMX Information Model)

Identifying Concepts
  • Identifying Concepts - Sources
    – Existing data set tables
      • From website
      • From applications
    – Data Collection Instruments
      • Questionnaires
      • Excel spreadsheets
    – Regulations, Handbooks, User Guides
      • Labour Statistics Convention, 1985 (No. 160), Recommendation, 1985 (No. 170)
      • Council Regulation No 311/76/EEC of 09/02/1976; OJ L 039 of 14/02/1976; Compilation of statistics on foreign workers
    – Database Tables
    – Existing Data Structure Definitions
      • From other organisations

DERIVING THE CONCEPTS FROM A TABLE
Concepts: FREQUENCY, COUNTRY, TOURISM_INDICATOR, TOURISM_ACTIVITY, UNIT, OBS_VALUE, TIME, OBS_STATUS (the flags E and P in the table)

SDMX-IM - Concept Scheme

Identify/Define Code Lists
  • Purpose of a Code List
    – Constrains the value domain of concepts when used in a structure like a data structure definition
    – Defines a shortened, language-independent representation of the values
    – Gives semantic meaning to the values, possibly in multiple languages
  • Agreeing on harmonised code lists is probably the most difficult aspect of defining a data structure definition

SDMX-IM - Code List
Each code is defined uniquely by an ID and a description that can be provided in several languages. A code list is a maintainable SDMX container.
  • Code list ID: CL_AREA (code value: description)
    – BE: Belgium
    – DE: Germany
    – FR: France
    – IT: Italy
    – AT: Austria
    – ES: Spain
    – PT: Portugal
  • Code list ID: CL_TOURISM_ACTIVITY (code value: description)
    – A100: Hotels and similar
    – B010: Tourist Campsites
    – B020: Holiday dwellings
Partial code lists can also be exchanged (v2.1). The content of the partial code list is specified on a Constraint.
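As a rough illustration of the code list idea, the sketch below models a code list in Python as a mapping from code IDs to descriptions per language. The class and method names are hypothetical and not taken from any SDMX tool, the French label is added only to show a second language, and the `partial` method merely imitates what a Constraint would select.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CodeList:
    """Hypothetical, minimal model of an SDMX code list."""

    id: str
    # code ID -> {language code -> description}
    codes: dict[str, dict[str, str]] = field(default_factory=dict)

    def add(self, code_id: str, **descriptions: str) -> None:
        """Register a code with one description per language."""
        self.codes[code_id] = dict(descriptions)

    def is_valid(self, code_id: str) -> bool:
        """A code list constrains the value domain of a concept."""
        return code_id in self.codes

    def describe(self, code_id: str, lang: str = "en") -> str:
        return self.codes[code_id][lang]

    def partial(self, allowed: set[str]) -> CodeList:
        """Return a partial code list restricted to the allowed code IDs."""
        kept = {c: d for c, d in self.codes.items() if c in allowed}
        return CodeList(self.id, kept)


cl_area = CodeList("CL_AREA")
cl_area.add("BE", en="Belgium")
cl_area.add("DE", en="Germany")
cl_area.add("FR", en="France")
cl_area.add("IT", en="Italy", fr="Italie")  # French label is an illustrative assumption
cl_area.add("AT", en="Austria")
cl_area.add("ES", en="Spain")
cl_area.add("PT", en="Portugal")

subset = cl_area.partial({"IT", "FR"})
```

Here `cl_area.is_valid("IT")` is true while an unknown code is rejected, and `subset` keeps only the two selected codes under the same code list ID.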

SDMX-IM - Concept Scheme
Exercise: Deriving a concept scheme from a table

SDMX-IM - Concept Scheme
Correction: Deriving a concept scheme from a table

Data Set Structure
  • Computers need to know the structure of data in terms of:
    – Dimensionality
    – Additional metadata
    – Measures (Observation)
    – Concepts
    – Valid content
      • Code Lists
      • Non coded format (integer, date, text)

Concepts play roles in a Data Structure
  • Comprises
    – Dimensions: concepts that identify the observation value
    – Attributes: concepts that add additional metadata about the observation value (as a value or the context of the value)
    – Measure: concept that is the observation value
  • Any of these may have a Representation that is
    – coded
    – text
    – date/time
    – number
    – etc.
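The roles can be sketched as a small data model. The following is a minimal illustration, not an SDMX library API: the class names are hypothetical, the concept IDs come from the tourism table, and both the role assignments and the code list IDs other than CL_AREA and CL_TOURISM_ACTIVITY are assumptions following usual SDMX practice.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Role(Enum):
    DIMENSION = "dimension"   # identifies the observation value
    ATTRIBUTE = "attribute"   # adds metadata about the observation value
    MEASURE = "measure"       # is the observation value


@dataclass(frozen=True)
class Component:
    concept_id: str
    role: Role
    code_list: Optional[str] = None    # set for a coded representation
    text_format: Optional[str] = None  # set for a non-coded representation


# Assumed role assignment for the tourism example.
TOURISM_DSD = [
    Component("FREQUENCY", Role.DIMENSION, code_list="CL_FREQ"),
    Component("COUNTRY", Role.DIMENSION, code_list="CL_AREA"),
    Component("TOURISM_INDICATOR", Role.DIMENSION, code_list="CL_TOURISM_INDICATOR"),
    Component("TOURISM_ACTIVITY", Role.DIMENSION, code_list="CL_TOURISM_ACTIVITY"),
    Component("TIME", Role.DIMENSION, text_format="date/time"),
    Component("OBS_VALUE", Role.MEASURE, text_format="number"),
    Component("UNIT", Role.ATTRIBUTE, code_list="CL_UNIT"),
    Component("OBS_STATUS", Role.ATTRIBUTE, code_list="CL_OBS_STATUS"),
]


def concepts_with_role(dsd: List[Component], role: Role) -> List[str]:
    """List the concept IDs that play the given role, in declaration order."""
    return [c.concept_id for c in dsd if c.role is role]


dimensions = concepts_with_role(TOURISM_DSD, Role.DIMENSION)
measures = concepts_with_role(TOURISM_DSD, Role.MEASURE)
attributes = concepts_with_role(TOURISM_DSD, Role.ATTRIBUTE)
```

Keeping the role on the component rather than on the concept reflects the idea that the same concept may play different roles in different data structures.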

DERIVING A DATA STRUCTURE FROM A TABLE
  • DIMENSIONS: FREQUENCY, COUNTRY, TOURISM_INDICATOR, TOURISM_ACTIVITY, TIME
  • ATTRIBUTES: UNIT, OBS_STATUS (the flags E and P in the table)
  • MEASURES: OBS_VALUE

DERIVING A DATA STRUCTURE FROM A TABLE

DERIVING A DATA STRUCTURE FROM A TABLE
Exercise: Create the WASTE DSD with the DSW

STATISTICAL DATA & METADATA
  • Statistical Metadata (Identifiers, Descriptors)
    – Structural metadata: Code lists, Concept Schemes, DSD
  • Statistical Data (Figures)
    – Time series data representation
    – Cross-sectional data representation

STATISTICAL DATA & METADATA
Two different ways to represent data
[Figure: statistical data cube with the axes Country (AT, ES, FR, IT), Tourism activity (A100, B010, B020) and Time (2005, 2006, 2007), showing a time series slice and a cross-sectional slice (cross-section for 2006).]

STATISTICAL DATA – Time Series
Number of touristic establishments – Time series
[Figure: four overlapping time-series tables, one per country (GEO: AT – Austria, ES – Spain, FR – France, IT – Italy), each with FREQ: A – Annual, TOUR_INDICATOR: A001 – Establishments, UNIT: NBR – Number. Rows are the time periods 2002A00 to 2005A00; columns are the activities A100 (Hotels and similar), B010 (Tourist Campsites) and B020 (Holiday dwellings).]

STATISTICAL DATA – Cross-sectional
Number of touristic establishments – Cross-sectional
[Figure: four overlapping cross-sectional tables, one per time period (TIME: 2002A00 to 2005A00), each with FREQ: A – Annual, TOUR_INDICATOR: A001 – Establishments, UNIT: NBR – Number. Rows are the countries AT, ES, FR and IT; columns are the activities A100 (Hotels and similar), B010 (Tourist Campsites) and B020 (Holiday dwellings).]

DERIVING A DATA STRUCTURE FROM A CROSS SECTION TABLE
  • DIMENSIONS: FREQUENCY, TIME, TOURISM_ACTIVITY, TOURISM_INDICATOR, COUNTRY (AT, ES, FR, IT)
  • ATTRIBUTES: UNIT, OBS_STATUS (the flags E and P in the table)
  • MEASURES: OBS_VALUE

REMARK ABOUT THE CROSS-SECTIONAL REPRESENTATION
Dimensions, Measures, Attributes

REMARK ABOUT THE CROSS-SECTIONAL REPRESENTATION
Dimensions, Primary Measure, Attributes

DERIVING A DATA STRUCTURE FROM A TABLE

ELEMENTS OF A DATA STRUCTURE DEFINITION
  • A Data Structure Definition comprises
    – Key (Dimensions): concepts that identify the observation
    – Group Key: concepts that identify a partial key
    – Attributes: concepts that add metadata
    – Measure(s): concepts that are the observed phenomenon
  • Attribute Relationship: an attribute is attached to one of
    – Data Set
    – Group Key
    – Dimension(s)
    – Observation
  • Dimensions, Attributes and Measures each take their semantics from a Concept, held in a Concept Scheme
  • Each has a format given by its Representation
    – Non-coded
    – Coded: has a Code List

SDMX Information Model - Data Set

SDMX INFORMATION MODEL - DATA SET
  • Concepts: FREQ, COUNTRY, TOURISM_INDICATOR, TOURISM_ACTIVITY, TIME, UNIT, OBS_VALUE, OBS_STATUS
  • DATA SET
    – GROUP KEY (COUNTRY, TOURISM_ACTIVITY): IT.B020
    – KEY (time series), KEY VALUES: M.IT.A001.B020
    – OBSERVATION, TIME PERIOD: 2005A00, 2004A00, 2003A00, 2002A00
    – OBSERVATION, VALUE: 61810, 68676, 68685, 55586
    – ATTRIBUTE VALUE with its attribute attachment: NBR (UNIT), E (OBS_STATUS)
  • The same data set can be viewed as a time series or as a cross-section

THE STS SAMPLE DATASET
EXERCISE 3: IDENTIFY THE CONCEPT & ROLES

THE STS SAMPLE DATASET
EXERCISE 3: IDENTIFY THE CONCEPT & ROLES
Concepts: TITLE, REF_AREA, FREQ, BASE_YEAR, TIME, STS_INDICATOR, STS_ACTIVITY, STS_ADJUSTMENT, OBS_VALUE, DECIMALS, OBS_STATUS

DSD OF DATAFLOW STSRTD_IND_M
Columns: role; concept; concept ID; code list; example value; remark. The table also has attachment-level columns (Obs, Series, Group) whose entries are not recoverable here.
  • Dimension; frequency; FREQ; CL_FREQ; M; Monthly
  • Dimension; geographic area; REF_AREA; CL_AREA_EE; GR; Greece
  • Dimension; adjustment; ADJUSTMENT; CL_ADJUSTMENT; TREND
  • Dimension; type of index; STS_INDICATOR; CL_STS_INDICATOR; TOVV; Turnover deflated (volume of sales)
  • Dimension; activity; STS_ACTIVITY; CL_STS_ACTIVITY; NS5201; Retail trade
  • Dimension; base year; STS_BASE_YEAR; CL_STS_BASE_YEAR; 2005
  • Dimension; reference period; TIME; no code list; 201101; CCYYMM
  • Measure; turnover index; OBS_VALUE; no code list; 108.6; observation
  • Attribute; status; OBS_STATUS; CL_OBS_STATUS; A; actual data
  • Attribute; time duration; TIME_FORMAT; CL_TIME_FORMAT; P1M; ISO 8601
  • Attribute; set title; TITLE; no code list
  • Attribute; decimals; DECIMALS; CL_DECIMALS; 1; One

STRUCTURE OF THE DATASET FOR TIME SERIES
Attributes and attachment level: group
Attributes can be attached to groups (a group of series).

Group of series:
REF_AREA="GR" ADJUSTMENT="N" STS_INDICATOR="TOTV" STS_ACTIVITY="NS5201" STS_BASE_YEAR="2005" DECIMALS="2" TITLE="Wholesale and retail trade turnover and volumes of sales"
Series:
M;GR;N;TOTV;NS5201;2005;201101;88.81;A
M;GR;N;TOTV;NS5201;2005;201102;84.74;A
M;GR;N;TOTV;NS5201;2005;201103;88.87;A
M;GR;N;TOTV;NS5201;2005;201104;93.01;A

Group of series:
REF_AREA="GR" ADJUSTMENT="N" STS_INDICATOR="TOTV" STS_ACTIVITY="N15220" STS_BASE_YEAR="2005" DECIMALS="1" TITLE="Retail sale of food"
Series:
M;GR;N;TOTV;N15220;2005;201101;60.8;A
M;GR;N;TOTV;N15220;2005;201102;78.2;A
M;GR;N;TOTV;N15220;2005;201103;89.9;A
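Group-level attachment can be sketched as a lookup on a partial key. In the minimal Python illustration below, the group key is assumed to be the series key without FREQ, which matches the dimensions listed for each group above; the function names and the quarterly series are hypothetical and serve only to show that several series can share one group.

```python
# Dimensions of the full series key, in key order.
SERIES_DIMENSIONS = [
    "FREQ", "REF_AREA", "ADJUSTMENT",
    "STS_INDICATOR", "STS_ACTIVITY", "STS_BASE_YEAR",
]
# Assumption: the group key is the series key without FREQ.
GROUP_DIMENSIONS = [d for d in SERIES_DIMENSIONS if d != "FREQ"]

# Attributes attached at group level, stored once per group key.
GROUP_ATTRIBUTES = {
    ("GR", "N", "TOTV", "NS5201", "2005"): {
        "DECIMALS": "2",
        "TITLE": "Wholesale and retail trade turnover and volumes of sales",
    },
    ("GR", "N", "TOTV", "N15220", "2005"): {
        "DECIMALS": "1",
        "TITLE": "Retail sale of food",
    },
}


def group_key(series_key: dict) -> tuple:
    """Project a full series key onto the group (partial) key."""
    return tuple(series_key[d] for d in GROUP_DIMENSIONS)


def group_attributes(series_key: dict) -> dict:
    """Look up the attributes shared by every series in the group."""
    return GROUP_ATTRIBUTES[group_key(series_key)]


monthly = {
    "FREQ": "M", "REF_AREA": "GR", "ADJUSTMENT": "N",
    "STS_INDICATOR": "TOTV", "STS_ACTIVITY": "N15220", "STS_BASE_YEAR": "2005",
}
# Hypothetical sibling series that differs only in frequency.
quarterly = dict(monthly, FREQ="Q")

title_monthly = group_attributes(monthly)["TITLE"]
title_quarterly = group_attributes(quarterly)["TITLE"]
```

Both series resolve to the same title because the attribute is stored once for the group rather than repeated on each series.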

STRUCTURE OF THE DATASET FOR TIME SERIES
Attributes and attachment level: series
Attributes can be attached to series.

Definition of Series 1:
FREQ="M" REF_AREA="GR" ADJUSTMENT="N" STS_INDICATOR="TOTV" STS_ACTIVITY="NS0006" STS_BASE_YEAR="2005" TIME_FORMAT="P1M"
Series 1:
M;GR;N;TOTV;NS0006;2005;201101;88.8;A
M;GR;N;TOTV;NS0006;2005;201102;84.7;A
M;GR;N;TOTV;NS0006;2005;201103;88.8;A

Definition of Series 2:
FREQ="M" REF_AREA="GR" ADJUSTMENT="N" STS_INDICATOR="TOTV" STS_ACTIVITY="N14500" STS_BASE_YEAR="2005" TIME_FORMAT="P1M"
Series 2:
M;GR;N;TOTV;N14500;2005;201101;60.8;A
M;GR;N;TOTV;N14500;2005;201102;78.2;A
M;GR;N;TOTV;N14500;2005;201103;89.9;A

STRUCTURE OF THE DATASET FOR TIME SERIES
Attributes and attachment level: observation
Attributes can be attached to observations.

Definition of Series 1:
FREQ="M" REF_AREA="GR" ADJUSTMENT="N" STS_INDICATOR="TOTV" STS_ACTIVITY="NS0006" STS_BASE_YEAR="2005" TIME_FORMAT="P1M"
Definition of Observation 1: TIME_PERIOD="201101" OBS_VALUE="88.81" OBS_STATUS="A"
Definition of Observation 2: TIME_PERIOD="201102" OBS_VALUE="84.75" OBS_STATUS="E"
Definition of Observation 3: TIME_PERIOD="201103" OBS_VALUE="89.87" OBS_STATUS="A"

CSV:
Observation 1: M;GR;N;TOTV;NS0006;2005;201101;88.81;A
Observation 2: M;GR;N;TOTV;NS0006;2005;201102;84.75;E
Observation 3: M;GR;N;TOTV;NS0006;2005;201103;89.87;A
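The relationship between the flat CSV records and the series/observation structure can be sketched as follows. This is a minimal Python illustration, assuming the column order FREQ, REF_AREA, ADJUSTMENT, STS_INDICATOR, STS_ACTIVITY, STS_BASE_YEAR, TIME_PERIOD, OBS_VALUE, OBS_STATUS implied by the records; the function name is hypothetical.

```python
from collections import defaultdict

# Assumed column order of the semicolon-separated records.
COLUMNS = [
    "FREQ", "REF_AREA", "ADJUSTMENT", "STS_INDICATOR",
    "STS_ACTIVITY", "STS_BASE_YEAR",
    "TIME_PERIOD", "OBS_VALUE", "OBS_STATUS",
]
SERIES_KEY = COLUMNS[:6]  # the dimensions that identify a series

RECORDS = """\
M;GR;N;TOTV;NS0006;2005;201101;88.81;A
M;GR;N;TOTV;NS0006;2005;201102;84.75;E
M;GR;N;TOTV;NS0006;2005;201103;89.87;A
"""


def parse_records(text: str) -> dict:
    """Group flat records into series, each holding a list of observations."""
    series = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        row = dict(zip(COLUMNS, (f.strip() for f in line.split(";"))))
        key = tuple(row[c] for c in SERIES_KEY)
        series[key].append({
            "TIME_PERIOD": row["TIME_PERIOD"],
            "OBS_VALUE": float(row["OBS_VALUE"]),
            # OBS_STATUS is attached at observation level, so it can vary per row.
            "OBS_STATUS": row["OBS_STATUS"],
        })
    return dict(series)


dataset = parse_records(RECORDS)
observations = dataset[("M", "GR", "N", "TOTV", "NS0006", "2005")]
```

The three records collapse into one series key, and the status flag stays on each observation, which is why the second observation can be flagged E while its neighbours are flagged A.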

EXAMPLE 2: DEMOGRAPHY SAMPLE DATASET
CROSS SECTIONAL CASE
  • Concepts: FREQ, TITLE, MARITAL_STATUS, SEX, AGE, REF_AREA, TIME, OBS_VALUE, UNIT, OBS_STATUS
  • Dimensions can be attached to the dataset level, to the group level or to the observation level
  • 3 Measures: TOTAL, MALE, FEMALE

DSD FOR DATAFLOW: DEMOGRAPHY_RQ

STRUCTURE OF THE DATASET FOR CROSS SECTIONAL
Attributes and attachment level
  • Dataset
    – Dimensions attached to dataset: REF_AREA="HU" FREQ="A"
    – Attributes attached to dataset: UNIT="PERS" TITLE="Population on 1 January by age, sex and legal marital status"
  • Group
    – Dimension attached to group: TIME="2010"
  • Observation: one per cross-sectional measure, where OBS_VALUE is the primary measure, MARITAL_STATUS is a dimension attached to observation and OBS_STATUS is an attribute attached to observation
    – FEMALE: OBS_VALUE="36" MARITAL_STATUS="REP" OBS_STATUS="P"
    – MALE: OBS_VALUE="94" MARITAL_STATUS="REP" OBS_STATUS="P"
    – TOTAL: OBS_VALUE="130" MARITAL_STATUS="REP" OBS_STATUS="P"
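To make the link between cross-sectional measures and a single primary measure concrete, the sketch below unpivots the Hungarian example in Python. It assumes that SEX is the concept behind the three cross-sectional measures (TOTAL, MALE, FEMALE); the function name and the dictionary layout are hypothetical.

```python
# One cross-sectional observation with several measures, using the values above.
observation = {
    "dimensions": {
        "REF_AREA": "HU", "FREQ": "A",   # attached to the dataset
        "TIME": "2010",                  # attached to the group
        "MARITAL_STATUS": "REP",         # attached to the observation
    },
    "measures": {"FEMALE": 36, "MALE": 94, "TOTAL": 130},
    "attributes": {"UNIT": "PERS", "OBS_STATUS": "P"},
}


def unpivot(obs: dict, measure_dimension: str = "SEX") -> list:
    """Turn each cross-sectional measure into a row with one primary measure.

    The measure name becomes the value of the measure dimension
    (assumed here to be SEX) and its value becomes OBS_VALUE.
    """
    rows = []
    for measure, value in obs["measures"].items():
        row = dict(obs["dimensions"])
        row[measure_dimension] = measure
        row["OBS_VALUE"] = value
        row.update(obs["attributes"])
        rows.append(row)
    return rows


rows = unpivot(observation)
by_sex = {row["SEX"]: row["OBS_VALUE"] for row in rows}
```

A simple consistency check on the result is that the FEMALE and MALE values add up to TOTAL.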

COMPLIANCE & IMPLEMENTATION
Generally the following four steps need to be done:
  1. Preparation: the statisticians from the organisations involved in the data exchange describe the data and the different dataflows, datasets and provision agreements.
  2. Compliance: create all the necessary objects according to the SDMX Technical Specifications.
  3. Implementation: put the objects into practice. Standard software is installed and configured to use the DSDs. The exchange process is set up and tested.
  4. Production: use the objects in the production process.
SDMX implementation is achieved when the data and metadata exchanges within the domain are carried out according to SDMX-compliant specifications.

COMPLIANCE: CREATE ALL THE NECESSARY OBJECTS
  • Define the DSD
    1. Code lists
    2. List of concepts (Concept scheme)
    3. Roles of concepts (Dimension, Attribute, Measure)
  • Provide the related Dataflows (e.g. STSRTD_TURN_M, DEMOGRAPHY_RQ)

THE STEPS TO BUILD A DATA STRUCTURE DEFINITION
The slide arranges the following tasks into five numbered steps:
  • Identification of the descriptor concepts for the data
  • Choose the type of data representation (Time Series and Cross-sectional)
  • Choice of Cross Domain code lists or definition of specific code lists for coded concepts
  • Define Dimensions for Time Series and Cross-sectional data representation
  • Definition of the text format for non coded concepts
  • Definition of the concept role (Dimension, Attribute or Measure)
  • Define Attributes with the attachment levels for Time Series and Cross-sectional data representation
  • Define Time Series primary measure and/or Cross-sectional measures with their measure concepts
  • Create the defined artefacts in an SDMX Data Structure Definition tool (e.g. DSW)

Describing the data exchange: Who? When? How? What? Who? Where? What?

IT ARCHITECTURES FOR DATA EXCHANGE

SDMX REGISTRY

SDMX REGISTRY DEMONSTRATION

EXERCISE 1: SDMX Registry Query and Retrieval of Structural Metadata and Provisioning Information

CREATION OF THE DSD
THE SDMX OBJECTS RELATED TO THE DATA STRUCTURE
  • Organisation Schemes
  • Code lists
  • Concept Schemes
  • DSDs
  • Dataflows
  • Category Schemes

CREATION OF THE DSD: DATA STRUCTURE WIZARD
DSW: "standalone" desktop application (replaced the KeyFamily AccessDB tool); offline version of Eurostat's SDMX Registry
  • Interaction with any SDMX v2.0 compliant Registry
    – Query SDMX v2.0 Registry
    – Submit data structures to SDMX v2.0 Registry
  • Data Authoring (building SDMX-ML sample datasets)
  • Export metadata for use with the GENEDI tool
  • Import/Export GESMES/TS structure files
  • Import/Export SDMX-ML structures (validate structure messages)
  • Maintenance of SDMX v2.0 data and metadata structures (create, modify, delete, query)
  • Reporting of structures
  • Advanced search features
  • Import/Export SDMX-ML messages

Example - DSD import / creation using the DSW

LIVE DEMONSTRATION - DSD IMPORT / CREATION USING THE DATA STRUCTURE WIZARD

EXERCISE 2: CREATION OF THE DSD: FISH_CATCH_A
DATA STRUCTURE DEFINITION
  • ID: FISH_CATCH_A
  • Name: Catches for all fishing areas
  • Version: 1.0
  • AgencyID: ESTAT
  • Valid From: (empty)
  • Valid To: (empty)

EXERCISE: CREATION OF THE DSD: FISH_CATCH_A
DIMENSIONS
Columns: position in key; concept ID; concept name; concept scheme (ID, version, agency); code list (ID, version, agency); dimension type. The text format column is empty.
  • 1; FREQ; Frequency; CS_FISHERIES 1.0 ESTAT; CL_FREQ 1.1 ESTAT; Frequency
  • 2; REPORTING_AREA; Country ISO 3 codes (extended); CS_FISHERIES 1.0 ESTAT; CL_REPORTING_AREA 1.0 ESTAT
  • 3; PRODUCTION_AREA; Production Area (from major area to sub-unit); CS_FISHSTAT 1.0 FAO; CL_PRODUCTION_AREA 1.0 FAO
  • 4; SPECIES; ASFIS Species Alpha 3 Code; CS_FISHSTAT 1.0 FAO; CL_SPECIES 1.0 FAO
  • (no position); TIME_PERIOD; Reference year; CS_FISHERIES 1.0 ESTAT; no code list; TIME

EXERCISE: CREATION OF THE DSD: FISH_CATCH_A
MEASURES
  • Type: Primary
  • Concept: OBS_VALUE (Value of the measure)
  • Concept scheme: CS_FISHERIES 1.0 ESTAT
  • Representation: no code list, no text format
  • Measure code dimension: N/A
ATTRIBUTES
  • Attachment level: Observation
  • Concept: UNIT (unit)
  • Concept scheme: CS_FISHERIES 1.0 ESTAT
  • Representation: code list CL_UNIT 1.1 ESTAT, no text format
  • Attribute type: (empty)
  • Assignment status: C

Describing the data exchange: Who? When? How? What? Who? Where? What?

SDMX and Reference Metadata

REFERENCE METADATA - PURPOSE
In the SDMX model, objects can have explanatory texts explaining the main features of data. These texts
  • are linked to the object by a simple "reference" to the object
  • can be stored and exchanged without being embedded in the data message

REFERENCE METADATA - REPORT
The content provides a structured (sometimes hierarchical) presentation of specific metadata items.

REFERENCE METADATA - STRUCTURE

SDMX INFORMATION MODEL - DSD VS MSD
  • DSD (Data)
    – List of dimensions / group keys: contains Dimensions
    – Series: describes what is being measured
    – Data attributes
  • MSD (Metadata)
    – Target Identifier: contains Identifier components
    – Target Object: defines how a metadata set identifies what object is being described by the reference metadata
    – Metadata attributes

REFERENCE METADATA - STRUCTURE

EXAMPLE OF REFERENCE METADATA

EXAMPLE OF REFERENCE METADATA
  • The information applies to all the datasets for the population category
  • Reference metadata contains descriptive information
  • The metadata structure definition defines the structure of the report

Practical Example

Example for reference metadata

Example for reference metadata

Example for reference metadata

Defining a Metadata Structure Definition
  • The Tasks
    1. Analysis of the entire set of metadata in order to identify and document the "Concepts" for which metadata are to be reported or disseminated.
    2. Determine the structure of the "Metadata Report" in terms of the concepts used, the hierarchy of the concepts when used in the report, and their "representation" (e.g. is a code list used, is the format free text?).
    3. Specify the "object type" to which the metadata are to be attached, and how this object type is identified: knowledge of the SDMX Information Model is useful here (as the metadata can only be attached to object types that can be identified in terms of the object types that exist in the information model).

METADATA STRUCTURE DEFINITION
  • A reference metadata set has a set of structural metadata which describes how it is organised. This metadata identifies
    – what reference metadata concepts are being reported,
    – how these concepts relate to each other (typically as hierarchies),
    – how they may be represented (as free text, as coded values, etc.),
    – what their role in usage is (mandatory or conditional),
    – with which formal SDMX object types they are associated
  • An MSD comprises two fundamental parts:
    – The Object Type(s) to which metadata can be attached
    – The Concepts for which metadata have to be reported
      • these concepts are grouped under one (or more) Report Structure(s)

METADATA REPORT STRUCTURE – CONTACT INFORMATION
  • In this case, there is no individual name, just the organisation and the organisation unit. Also, there is no telephone number or fax number, but there is a web contact address.
  • From this information the following report structure, and underlying concepts, can be derived.

METADATA REPORT STRUCTURE – CONTACT INFORMATION
The actual definition of the concept is in the Concept Scheme. There are two levels of hierarchy in the report.
  • Attribute: Contact; Concept: CONTACT
    – Sub Attribute: Contact organisation; Concept: CONTACT_ORG; Format: Text
    – Sub Attribute: Contact organisation unit; Concept: CONTACT_ORG_UNIT; Format: Text
    – Sub Attribute: Contact mail address; Concept: CONTACT_MAIL_ADDRESS; Format: Text
The usage of the concept, its place in the hierarchy, representation, and attachment are defined in the "Metadata Attribute" part of the MSD (called Attribute in the table).

METADATA REPORT STRUCTURE – CONTENT METADATA
  • Attribute: Statistical Presentation; Concept: STAT_PRES
    – Sub Attribute: Statistical unit; Concept: STAT_UNIT; Format: Text
    – Sub Attribute: Statistical Population; Concept: STAT_POP; Format: Text
    – Sub Attribute: Reference Area; Concept: REF_AREA; Format: Text
    – Sub Attribute: Time Coverage; Concept: TIME_COV; Format: Text

METADATA REPORT STRUCTURE – CONCEPT SCHEME
The following concepts are derived from the previous tables: CONTACT_ORG_UNIT, CONTACT_MAIL_ADDRESS, STAT_PRES, STAT_UNIT, STAT_POP, REF_AREA, TIME_COV.
The concepts in the concept scheme can be defined in a hierarchy where there is a semantic link between the parent and child concepts; the child concept(s) having a more fine-grained semantic meaning of (a part of) the parent.

METADATA REPORT STRUCTURE – THE ATTACHMENT OBJECT TYPE
  • The Metadata Set which is reported (i.e. the actual metadata content) is intended to be metadata about "something".
  • The "something" is the object type and in an MSD it is necessary to declare the object type and to define how it is identified in terms of its constituent components.
  • For instance, a Code would be identified by a combination of the Code List identifier and the Code identifier.
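The idea that a target object is identified by a combination of components can be sketched with a composite key. The Python below is a minimal illustration with hypothetical names; the attached comment text is a placeholder rather than real metadata, and CL_OTHER is an invented code list ID used only for contrast.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeReference:
    """Identifies a Code by its code list identifier plus its code identifier."""

    code_list_id: str
    code_id: str


# Metadata attached to target objects, keyed by their composite identity.
# The comment text is a placeholder for illustration only.
metadata_by_target = {
    CodeReference("CL_AREA", "IT"): {"COMMENT": "placeholder text about this code"},
}

same_target = CodeReference("CL_AREA", "IT")
other_list = CodeReference("CL_OTHER", "IT")  # hypothetical code list with the same code ID

found = metadata_by_target.get(same_target)
missing = metadata_by_target.get(other_list)
```

Neither component identifies the object on its own: the same code ID in another code list is a different target, so the second lookup finds nothing.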

METADATA REPORT STRUCTURE – THE ATTACHMENT OBJECT TYPE
The attachment object type must be definable using the identifiable object types in the SDMX Information Model. The XML schema demands this and lists the following object types:
Agency, ConceptScheme, Concept, Codelist, Code, KeyFamily, Component, KeyDescriptor, MeasureDescriptor, AttributeDescriptor, GroupKeyDescriptor, Dimension, Measure, Attribute, CategoryScheme, ReportingTaxonomy, Category, OrganisationScheme, DataProvider, MetadataStructure, FullTargetIdentifier, PartialTargetIdentifier, MetadataAttribute, DataFlow, ProvisionAgreement, MetadataFlow, ContentConstraint, AttachmentConstraint, DataSet, XSDataSet, MetadataSet, HierarchicalCodelist, Hierarchy, StructureSet, StructureMap, ComponentMap, CodelistMap, CodeMap, CategorySchemeMap, CategoryMap, OrganisationSchemeMap, OrganisationRoleMap, ConceptSchemeMap, ConceptMap, ProcessStep

METADATA REPORT STRUCTURE – THE ATTACHMENT OBJECT TYPE
Data Category
The object type is the Data Category (called "Category" in the SDMX Information Model). If the intent of the MSD is to define where the metadata are to be attached in the Eurostat dissemination environment then this is all that is required.

METADATA REPORT STRUCTURE – THE ATTACHMENT OBJECT TYPE
  • If Eurostat wishes to publish this and make it available to other organisations (e.g. as a downloadable file) then it would be necessary to also identify the Data Provider (which in this case is Eurostat).
  • Object types Category and Data Provider could be associated with a coding scheme. There would certainly be a list for all of the data categories (this would be a "Category Scheme"), but for the Data Provider this could be declared as non-enumerated (i.e. text).

METADATA REPORT STRUCTURE – BRINGING IT TOGETHER
Concept Scheme
  • Is a "Container" for concepts. In SDMX this is the level at which the concepts are maintained.
  • Has a maintenance agency, identity and versioning information.
  • Concepts in a concept scheme can be hierarchic.
  • Often concepts are used in reporting hierarchies and these hierarchies are built in the Report Structure of the Metadata Structure Definition.

METADATA REPORT STRUCTURE – BRINGING IT TOGETHER
Schematic of the structure of a concept scheme

METADATA REPORT STRUCTURE – BRINGING IT TOGETHER
Report Structure - General Structure defined within a Metadata Structure Definition

METADATA REPORT STRUCTURE – BRINGING IT TOGETHER
Report Structure - Contact Report
  • Metadata Structure Definition: ESTAT_MSD
  • Report structure: CATEGORY_CONTACT_REPORT
  • Metadata attributes: CONTACT, CONTACT_ORG_UNIT, CONTACT_MAIL_ADDRESS
  • Concept scheme: ESTAT_METADATA_CS (Contact organisation name, Contact organisation unit, Contact mail address)

METADATA REPORT STRUCTURE – BRINGING IT TOGETHER
Report Structure - Quality Report
  • Metadata Structure Definition: ESTAT_MSD
  • Report structure: CATEGORY_CONTENT_REPORT
  • Metadata attributes: STAT_PRES, STAT_UNIT, STAT_POP, REF_AREA, TIME_COV
  • Concept scheme: ESTAT_METADATA_CS (Statistical Presentation, Statistical unit, Statistical Population, Reference Area, Time Coverage)

METADATA REPORT STRUCTURE – BRINGING IT TOGETHER
Defining the Attachment Object Type: Schematic
  • Full Target Identifier: defines all of the possible object types that are within the scope of the MSD
  • Partial Target Identifier: references a sub set of the Identifier Components of the Full Target Identifier

METADATA REPORT STRUCTURE – BRINGING IT TOGETHER
Defining the Attachment Object Types: Data Provider, Data Category

METADATA REPORT STRUCTURE – BRINGING IT TOGETHER
Defining the Attachment Object Types
  • ESTAT_MSD comprises the Category and the DataProvider object types
    – Identifier component CATEGORY: object type Category, from ESTAT_CATEGORY_SCHEME
    – Identifier component AGENCY: object type Data Provider
  • A second target identifier references just the Identifier Component linked to the Data Provider

METADATA REPORT STRUCTURE – BRINGING IT TOGETHER
Defining the Attachment Object Type
Note that this metadata is attached at a fairly high level, the level of the subject domain category, for the data provider. If there are metadata at a lower level of granularity, for instance at the level of the "table", then this can also be specified in an MSD. In order to attach metadata to each of the tables, each of these tables can be defined as a "Dataflow" and the metadata attached to the provision of the data by a data provider for this dataflow.

METADATA REPORT STRUCTURE – BRINGING IT TOGETHER
Link of the Report Structures to the relevant Target Identifiers
In ESTAT_MSD, the report structure CATEGORY_CONTENT_REPORT is linked to the target identifier components ESTAT_CATEGORY_SCHEME and AGENCY.

METADATA REPORT STRUCTURE – BRINGING IT TOGETHER
Link of the Report Structures to the relevant Target Identifiers
The XML that makes this link is the target attribute in the Report Structure.

METADATA SET: STRUCTURE n References to : – a Metadata Structure Definition (MSD) – METADATA SET: STRUCTURE n References to : – a Metadata Structure Definition (MSD) – a Report Structure – a Target Identifier n Defines: – The actual values of the target objects n Comprises: – The Reported Attributes and their corresponding Values – These Attributes may be: • • coded text date/time number etc. 88

METADATA SET – GENERAL SCHEMATIC 89

METADATA SET – GENERAL SCHEMATIC 90
- Metadata is reported in a Metadata Set.
- Each metadata report is reported in a separate Metadata Set.
- The Contact Metadata Set: there can be many such sets in an SDMX message.

METADATA SET – GENERAL SCHEMATIC 91
- Metadata Structure Definition: ESTAT_MSD
- Target: Category = Key_Indicators.Structural_Indicators; Data Provider = EUROSTAT
- Report: CATEGORY_CONTACT_REPORT
  - CONTACT_ORG: Eurostat, Statistical Office of the European Communities
  - CONTACT_ORG_UNIT: Unit C2 National accounts: production
  - CONTACT_MAIL_ADDRESS: http://epp.eurostat.ec.europa.eu/portal/page/portal/help/user_support
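The schematic can be read as a small data structure: references to the MSD and report structure, the target object values, and the reported attributes. The sketch below is one possible in-memory representation, assuming a hypothetical `MetadataSet` class; the class and field names are not part of SDMX, and the pairing of values to the `CONTACT_*` attributes follows the most plausible reading of the slide.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataSet:
    """Hypothetical in-memory view of one metadata set (not an SDMX API)."""
    msd_ref: str                      # which Metadata Structure Definition applies
    report_ref: str                   # which Report Structure within that MSD
    target: dict                      # target object type -> actual object
    attributes: dict = field(default_factory=dict)  # reported attribute -> value


contact_set = MetadataSet(
    msd_ref="ESTAT_MSD",
    report_ref="CATEGORY_CONTACT_REPORT",
    target={
        "Category": "Key_Indicators.Structural_Indicators",
        "Data Provider": "EUROSTAT",
    },
    attributes={
        "CONTACT_ORG": "Eurostat, Statistical Office of the European Communities",
        "CONTACT_ORG_UNIT": "Unit C2 National accounts: production",
        "CONTACT_MAIL_ADDRESS": "http://epp.eurostat.ec.europa.eu/portal/page/portal/help/user_support",
    },
)

# Print the set in the same order as the schematic: references, target, attributes.
print(f"{contact_set.msd_ref} / {contact_set.report_ref}")
for object_type, value in contact_set.target.items():
    print(f"  target {object_type} = {value}")
for concept, value in contact_set.attributes.items():
    print(f"  {concept}: {value}")
```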

METADATA SET – METADATA FILE 92

METADATA SET – ESMS EXAMPLE 93

METADATA SET – ESMS EXAMPLE 94

Metadata Reporting 95

METADATA REPORTING IN SDMX-ML 96
- The Reference Metadata mechanism supports reporting and dissemination through specified types of messages.
- Structure Message
  - Provides the Metadata Structure Definition
- Generic Metadata Message
  - Provides a single format for any metadata structure definition
  - All reference metadata expressible in SDMX-ML format can be marked up according to this format, in agreement with the contents of the Structure
  - Performs only a minimum of validation
  - Supports the creation of generic software tools and services for processing reference metadata
- Metadata Report message
  - For each MSD, an XML schema (specific to that MSD) can be created
  - Performs validation on sets of reported data
  - Less verbose than the Generic Metadata message
  - Easier to use because the XML mark-up relates directly to the reported concepts
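To make the difference between the two reporting formats concrete, the sketch below writes the same reported attributes twice: once in a generic shape, where the concept is carried as data in an attribute, and once in an MSD-specific shape, where the concept becomes the element name. The element names are simplified and namespace-free, the attribute values are shortened placeholders, and the function names are assumptions for the example; this is not schema-valid SDMX-ML.

```python
import xml.etree.ElementTree as ET

REPORT_ID = "CATEGORY_CONTACT_REPORT"
# Shortened placeholder values, for illustration only.
ATTRIBUTES = {
    "CONTACT_ORG": "Eurostat",
    "CONTACT_ORG_UNIT": "Unit C2",
}


def to_generic(attributes):
    """Generic shape: one fixed element name, the concept id is an attribute."""
    root = ET.Element("AttributeValueSet")
    for concept_id, value in attributes.items():
        reported = ET.SubElement(root, "ReportedAttribute", {"conceptID": concept_id})
        ET.SubElement(reported, "Value").text = value
    return root


def to_report_specific(report_id, attributes):
    """MSD-specific shape: the element names are the reported concepts themselves."""
    root = ET.Element(report_id)
    for concept_id, value in attributes.items():
        ET.SubElement(root, concept_id).text = value
    return root


generic_xml = ET.tostring(to_generic(ATTRIBUTES), encoding="unicode")
specific_xml = ET.tostring(to_report_specific(REPORT_ID, ATTRIBUTES), encoding="unicode")
print(generic_xml)
print(specific_xml)
```

The generic writer never needs to know the MSD, which is why generic tools can process any report; the specific writer produces shorter output whose element names a schema generated from the MSD could validate.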

METADATAFLOW DEFINITION 97
- Very similar to a Dataflow definition: describes, categorises, and constrains metadata sets.
- Metadata sets are reported or disseminated according to a metadataflow definition.
- Identifies a Metadata Structure Definition.
- May be associated with one or more subject matter domains (this facilitates the search for data according to an organised category scheme).
- Constraints, in terms of reporting periodicity or the subset of possible keys that are allowed in a metadata set, may be attached to the metadataflow definition.
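A minimal sketch of these properties, assuming a hypothetical `MetadataflowDefinition` class: it identifies an MSD, carries its subject matter categories, and optionally restricts which target keys a metadata set may use. The flow id `CONTACT_METADATA_FLOW` and the key layout (category, data provider) are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataflowDefinition:
    """Hypothetical model of a metadataflow definition (not an SDMX API)."""
    id: str
    msd_ref: str                          # the MSD this flow identifies
    categories: tuple = ()                # subject matter domains
    allowed_keys: frozenset = frozenset() # empty means "no key constraint attached"

    def accepts(self, key):
        """Return True if a metadata set with this target key may be reported."""
        return not self.allowed_keys or key in self.allowed_keys


flow = MetadataflowDefinition(
    id="CONTACT_METADATA_FLOW",  # hypothetical id
    msd_ref="ESTAT_MSD",
    categories=("Key_Indicators.Structural_Indicators",),
    allowed_keys=frozenset({("Key_Indicators.Structural_Indicators", "EUROSTAT")}),
)

print(flow.accepts(("Key_Indicators.Structural_Indicators", "EUROSTAT")))
print(flow.accepts(("Key_Indicators.Structural_Indicators", "OTHER_PROVIDER")))
```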

DESCRIBING THE DATA EXCHANGE When? How? Where? 98

SDMX INFORMATION MODEL - OBJECTS 99
Diagram labels: Structure Definition; Data or Metadata Set; Categorisation; Category; Data or Metadata Flow; Data Provider; Category Scheme; Provision Agreement; Constraint.

SDMX INFORMATION MODEL - OBJECTS 100
Diagram labels: Structure Definition; Data or Metadata Set; Categorisation; Category; Data or Metadata Flow; Data Provider; Category Scheme; Provision Agreement; Constraint.

SDMX-IM - CATEGORISATION 101
Diagram labels: Category “Tourism”.

SDMX-IM - CATEGORISATION - EXAMPLE 102
Diagram labels: Statistical Tables; Sub categories.

SDMX-IM - CATEGORISATION - EXAMPLE 103
Diagram labels: Category; Sub categories; Dataflows.

SDMX INFORMATION MODEL – DATA & METADATA FLOW 104
Diagram labels: Structure Definition; Category Scheme; Data & Metadata set; DATA & METADATA FLOWS; Category; Data Provider; Provision Agreement; Constraint.

SDMX IM – DATA PROVIDERS & PROVISION AGREEMENT 105
Diagram labels: Data provider; Production and dissemination of Statistical data; Production and dissemination of Reference Metadata; Data flows and Metadata flows; Data consumer.

SDMX IM – DATA PROVIDERS & PROVISION AGREEMENT 106
What and When? Who?

SDMX IM - CONSTRAINTS 108
Diagram labels: DATA & METADATA FLOWS; Provision Agreement; Constraint.

SDMX IM - SUMMARY 109
Diagram labels: Data Structure Definition; Category Scheme; Category; Categorisation; Data Flow; Data Provider Scheme; Data Provider; Provision Agreement; references; Registered Data Source; Register; Keys; Constraint.

SDMX IM - SUMMARY 110

SDMX-ML Messages 111

SYNTAXES FOR SDMX MESSAGES 112
- Based on a common Information Model
  - SDMX-EDI (GESMES/TS)
    - EDIFACT syntax
    - Time-series oriented
    - One format for Data Sets
  - SDMX-ML
    - XML syntax
    - Four different formats for Data Sets
    - Easier validation (XML based)

SDMX DATA COMMON HEADERS 113

SDMX DATA MESSAGES 114
Equivalent representations for reporting Datasets
- Version 2.0: 4 data messages, each with a distinct format: GenericData, CompactData, CrossSectionalData and UtilityData (phased out).
- Version 2.1: there are now 4 data messages, based on two general formats:
  - GenericData and GenericTimeSeriesData
  - StructureSpecificData and StructureSpecificTimeSeriesData

EXAMPLE OF GENERIC SDMX-ML MESSAGE 115

EXAMPLE OF COMPACT SDMX-ML MESSAGE 116

EXAMPLE OF CROSS-SECTIONAL SDMX-ML MESSAGE 117

EXERCISE 3: CREATE SAMPLE DATA MESSAGES CONFORMING TO THE DSD FISH_CATCH_A USING THE DATA STRUCTURE WIZARD (DSW) 118

CONVERSIONS SDMX V2.0 119
- Equivalent formats, based on the same IM: Compact SDMX-ML, Generic SDMX-ML, Cross-sectional SDMX-ML
- Exceptions: if a Cross-Sectional DSD does NOT contain a time dimension
- Can be expanded to other formats (e.g. CSV, GESMES)

SDMX CONVERTER 120
- Read the input message
  - Parsing
  - Populate the data model of the tool (based on the SDMX v2.0 information model)
- Write the converted message
  - Uses the data model to write the output message in the required target format
- Information retrieved from the Registry
  - The data flow ID is used to retrieve the data flow definition from the Registry
  - The DSD ID, version and agencyID are retrieved from the data flow definition and are used to acquire the DSD
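The read, populate, write sequence can be sketched in a few lines. This is not the SDMX Converter's code: it is a minimal illustration that parses a CSV file into a format-neutral model (series key to observations) and then writes that model out as a simplified, namespace-free XML loosely resembling a compact time-series layout. The function names, the sample rows and the year values are assumptions; only the concept names and the value 2529 for Italy echo the earlier tourism example.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

DIMENSIONS = ["FREQUENCY", "COUNTRY"]
# Made-up sample input; the years are hypothetical.
SAMPLE_CSV = "FREQUENCY,COUNTRY,TIME,OBS_VALUE\nA,IT,2009,2500\nA,IT,2010,2529\n"


def parse_csv(text, dimensions):
    """Step 1: read the input and populate a format-neutral data model."""
    series = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        key = tuple(row[d] for d in dimensions)
        series[key].append((row["TIME"], row["OBS_VALUE"]))
    return dict(series)


def write_compact(series, dimensions):
    """Step 2: write the model in the target format (simplified XML here)."""
    dataset = ET.Element("DataSet")
    for key, observations in series.items():
        # Dimension values become attributes of the series element.
        series_el = ET.SubElement(dataset, "Series", dict(zip(dimensions, key)))
        for time, value in observations:
            ET.SubElement(series_el, "Obs", {"TIME": time, "OBS_VALUE": value})
    return ET.tostring(dataset, encoding="unicode")


compact_xml = write_compact(parse_csv(SAMPLE_CSV, DIMENSIONS), DIMENSIONS)
print(compact_xml)
```

Because the writer only sees the intermediate model, adding another output format means adding another writer function, without touching the parser. In the real tool the list of dimensions would come from the DSD rather than from a constant.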

SDMX CONVERTER MAIN FUNCTIONALITY 121
- Main use: conversion (CSV, Compact SDMX-ML)
- Possible conversions: CSV; Compact SDMX-ML; Generic SDMX-ML; Utility SDMX-ML; Cross-sectional SDMX-ML *; SDMX-EDI (GESMES/TS)

SDMX training session on basic principles; Major Changes in version 2.1; Fabien JACQUET; 2011
Screenshot callouts:
- Select the input file
- Select the output file
- Select the input and output formats
- Select the DSD on the local drive
- Identify a DSD to download from the SDMX Registry
- Identify a dataflow linked to the DSD to download from the SDMX Registry
- Select / manage headers
- CSV parameters for CSV input formats
- Select mapping / transcoding tables
- GESMES representation for GESMES output formats
- XML parameters for SDMX output formats
- Load / save the current settings

EXERCISE 4: CONVERT BETWEEN VARIOUS MESSAGE FORMATS USING THE SDMX CONVERTER 123

Major changes in SDMX v2.1 124

OVERVIEW OF THE CHANGES 125
- Structural Metadata
  - Data Structure Definition (DSD)
  - Metadata Structure Definition (MSD)
  - Constraint
  - Code List
  - Organisation Scheme
  - Categorising Structures
  - Process
  - Provision Agreement
  - Transformations and Expressions
- Data Set
  - Message Changes
  - Structured Data Mechanism Revised
- Metadata Set
  - Message Changes
  - Alignment of Formats
  - Structured Metadata Mechanism Revised

DATA STRUCTURE DEFINITION (DSD) 126
- Support for non-time-series data structures
- Measure Dimension
- Version 2.0 DSD (diagram): Concepts; Measures (Code lists); Attributes (Code lists); Dimensions and Measure dimension (Code lists)
- Version 2.1 DSD (diagram): Concept role; Primary Measure; Attributes (Code lists); Dimensions (Code lists); Measure Dimension (Concept Scheme, explicit element)

CONSTRAINT 127
- Version 2.0
  - Constraint is only available for use in a Registry context
  - Constraint is embedded in the object it constrains
- Version 2.1
  - Constraint is independently maintained
  - The same Constraint can be “used” to constrain multiple objects
- Diagram labels: Registry; Provision agreement; Dataflow; Code list; DSD.
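The 2.1 arrangement, one independently maintained constraint attached to several objects, can be sketched as follows. The `Constraint` class, its fields and the ids `IT_ANNUAL_ONLY`, `TOURISM_FLOW` and `TOURISM_IT_PA` are hypothetical names for the example, not SDMX artefacts.

```python
from dataclasses import dataclass, field


@dataclass
class Constraint:
    """Hypothetical stand-alone constraint that lists what it is attached to."""
    id: str
    allowed: dict                                     # dimension id -> permitted codes
    attached_to: list = field(default_factory=list)   # references to constrained objects

    def permits(self, key):
        """A key passes if every constrained dimension holds a permitted code."""
        return all(key.get(dim) in codes for dim, codes in self.allowed.items())


# One constraint, maintained once, reused for two different objects.
italy_annual = Constraint("IT_ANNUAL_ONLY", {"COUNTRY": {"IT"}, "FREQUENCY": {"A"}})
italy_annual.attached_to += ["Dataflow:TOURISM_FLOW", "ProvisionAgreement:TOURISM_IT_PA"]

print(italy_annual.permits({"COUNTRY": "IT", "FREQUENCY": "A"}))
print(italy_annual.permits({"COUNTRY": "FR", "FREQUENCY": "A"}))
```

Under the 2.0 arrangement the equivalent rules would live inside each constrained object, so the same restriction would have to be written once per object.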

CODE LIST 128
Version 2.1
Diagram labels: Common Code list; Constraint 1; Constraint 2; Partial; DSD.

CATEGORISATION 129
- Version 2.0
  - In 2.0, a Category can only reference a Dataflow or Metadataflow.
  - Two-way referencing between Category and Data/Metadata flow
- Version 2.1
  - In 2.1, a Category can categorise any type of Identifiable Artefact.
  - Categorisation is independently maintainable and references the Category and the object that is categorised.
- Diagram labels: Category Scheme; Category; Data/Metadata flow; Code list; Provision agreement; DSD; Reference.
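In other words, the 2.1 Categorisation behaves like a separate link record that points at both the Category and the categorised object. A minimal sketch, assuming hypothetical class names and artefact ids (only the category “Tourism” comes from the earlier slides):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtefactRef:
    """Reference to any identifiable artefact, by kind and id."""
    kind: str
    id: str


@dataclass(frozen=True)
class Categorisation:
    """Independent link object: neither the category nor the target stores it."""
    id: str
    category: str
    target: ArtefactRef


# Hypothetical artefact ids, for illustration only.
CATEGORISATIONS = [
    Categorisation("CAT_1", "Tourism", ArtefactRef("Dataflow", "TOURISM_FLOW")),
    Categorisation("CAT_2", "Tourism", ArtefactRef("Codelist", "CL_TOURISM_ACTIVITY")),
    Categorisation("CAT_3", "Tourism", ArtefactRef("DataStructure", "TOURISM_DSD")),
]


def artefacts_in(category, categorisations):
    """List every artefact categorised under the given category."""
    return [c.target for c in categorisations if c.category == category]


for ref in artefacts_in("Tourism", CATEGORISATIONS):
    print(ref.kind, ref.id)
```

Because the link is its own object, a new artefact can be categorised without editing either the category scheme or the artefact itself.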

DATA MESSAGE CHANGES 130
Data set type | Format | 2.0 | 2.1
Cross sectional | Generic | Not supported | GenericData
Cross sectional | Structure-specific | CrossSectional | StructureSpecificData
Time series | Generic | Generic | GenericTimeSeriesData
Time series | Structure-specific | Compact & Utility | StructureSpecificTimeSeriesData
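The same correspondence can be written as a lookup table, which is roughly what a migration script would need. The mapping below restates the slide; the dictionary and function names are assumptions for the example.

```python
# 2.0 data message format -> closest 2.1 data message format, as listed on the slide.
V20_TO_V21 = {
    "Generic": "GenericTimeSeriesData",
    "Compact": "StructureSpecificTimeSeriesData",
    "Utility": "StructureSpecificTimeSeriesData",
    "CrossSectional": "StructureSpecificData",
}

# 2.1 formats with no 2.0 counterpart (generic cross-sectional data was not supported).
NEW_IN_V21 = {"GenericData"}


def successor(v20_format):
    """Return the 2.1 format that replaces a given 2.0 format."""
    try:
        return V20_TO_V21[v20_format]
    except KeyError:
        raise ValueError(f"unknown SDMX 2.0 data format: {v20_format!r}") from None


for name in V20_TO_V21:
    print(f"{name} -> {successor(name)}")
```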

METADATA STRUCTURE CHANGES 131
- Full Target Identifier and Partial Target Identifier are replaced by the single Metadata Target.
- Identifier Component is renamed Target Reference.

METADATA MESSAGE CHANGES 132
Table labels: 2.0; 2.1; Generic; Structure-specific; Metadata set; GenericMetadataReport; StructureSpecificMetadata.
In 2.1, the two formats are quite similar.

TO KNOW MORE ABOUT SDMX 133
- Eurostat SDMX info Space: https://webgate.ec.europa.eu/fpfis/mwikis/sdmx
- World Bank