1ef3d6f5bae18022f45abcef93da0570.ppt
- Number of slides: 11
SHARE IDs
Stephanie Stuck, MEA
Frankfurt, December 6th
Mannheim Research Institute for the Economics of Aging
www.mea.uni-mannheim.de
Data versions and ID-variables

Household ID (who gets the id: all households)
Ø internal version: sampid
Ø public version: sampid2 (scrambled version of sampid)

Person ids
Ø cvid: should be used to merge modules within waves. Who gets the id: all household members (in CV); that means non-eligible persons get a cvid too, e.g. children and other people living in the household.
Ø respid: should be used to merge waves. Who gets the id: all eligible household members (in CV); that means all household members who should be interviewed, e.g. respondents and partners, even if partners are younger than 50 years.

Download site
Ø internal version: internal website http://cdata8.uvt.nl/share/ version 2.7/ (country-specific versions)
Ø public version: public website www.shareproject.org & internal website (all countries)

Available for
Ø internal version: respective country team, CentERdata, MEA
Ø public version: working groups, external users
sampid rules (old)
Ø Digits 1-2: country code, for example 11 for Austria, 23 for Belgium (French speaking)
Ø Digits 3-5: wave indicator; indicates the wave in which the household participated for the first time (042 for wave 1 main survey and 062 for wave 2 main survey)
Ø Digits 6-11: household ID
Ø Digits 12-13: longitudinal household split indicator; 00 by default. If the household splits and a respondent moves out, the indicator is set based on that respondent's respid, e.g. if the ‘moving out respondent’ has respid 01, it is changed to 01.

Examples
1104200010000: Austria, starting in wave 1 (longitudinal sample)
110420001: Austria, starting in wave 1, split-off household in wave 2
2306214010300: Belgium (French speaking), starting in wave 2 (refresher)

→ One needs to combine sampid with the respondent ID (respid) to identify and merge cases on the respondent level
→ Merging problems, especially for split households / ‘moving’ respondents across waves
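The digit positions above can be sketched as a small parser. This is only an illustration of the layout described on the slide, not an official SHARE tool; the names `OldSampid` and `parse_sampid` are hypothetical, and the sketch assumes the sampid is stored as a 13-character string of digits.

```python
from typing import NamedTuple


class OldSampid(NamedTuple):
    country_code: str     # digits 1-2, e.g. "11" for Austria
    wave_indicator: str   # digits 3-5, e.g. "042" for wave 1 main survey
    household_id: str     # digits 6-11
    split_indicator: str  # digits 12-13, "00" by default


def parse_sampid(sampid: str) -> OldSampid:
    """Split a 13-digit old-style sampid into its four parts."""
    if len(sampid) != 13 or not (sampid.isascii() and sampid.isdigit()):
        raise ValueError(f"expected 13 digits, got {sampid!r}")
    return OldSampid(sampid[0:2], sampid[2:5], sampid[5:11], sampid[11:13])


print(parse_sampid("1104200010000"))
print(parse_sampid("2306214010300"))
```

Keeping the id as a string rather than an integer avoids losing leading zeros in the household ID part.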
Therefore...
Ø We will change the system and
  Ø have unique person ids that can be used to merge modules and waves
  Ø person id will not change across waves, even if a household splits
  Ø have string country codes instead of numeric ones
Ø We will divide sampid into different parts:
  Ø household id (fixed part and split indicator if needed)
  Ø new wave indicator variable ‘wi’, which indicates when a household first entered the sample
Old and new country codes (first two digits of household ids and person ids)

Country        (old) numeric code   internal country code   public country code
Austria        11                   AT                      AT
Germany        12                   DE                      DE
Sweden         13                   SE                      SE
Netherlands    14                   NL                      NL
Spain          15                   ES                      ES
Italy          16                   IT                      IT
France         17                   FR                      FR
Denmark        18                   DK                      DK
Greece         19                   GR                      GR
Old and new country codes (first two digits of household ids and person ids), continued

Country               (old) numeric code   internal country code   public country code
Switzerland German    20                   Cg                      CH
Switzerland French    21                   Cf                      CH
Switzerland Italian   22                   Ci                      CH
Belgium French        23                   Bf                      BE
Belgium Flemish       24                   Bn                      BE
Israel Hebrew         25                   Ih                      IL
Israel Arabic         26                   Ia                      IL
Israel Russian        27                   Ir                      IL
Czechia               28                   CZ                      CZ
Poland                29                   PL                      PL
Ireland               30                   IE                      IE
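For recoding, the two country-code tables can be held in a single lookup. The sketch below transcribes the codes from the slides; the names `COUNTRY_CODES` and `recode_country` are hypothetical, and the public codes for the language-specific samples (CH, BE, IL) are assumed to apply to every language version of the respective country.

```python
# old numeric code -> (internal country code, public country code)
COUNTRY_CODES = {
    11: ("AT", "AT"), 12: ("DE", "DE"), 13: ("SE", "SE"),
    14: ("NL", "NL"), 15: ("ES", "ES"), 16: ("IT", "IT"),
    17: ("FR", "FR"), 18: ("DK", "DK"), 19: ("GR", "GR"),
    20: ("Cg", "CH"), 21: ("Cf", "CH"), 22: ("Ci", "CH"),
    23: ("Bf", "BE"), 24: ("Bn", "BE"),
    25: ("Ih", "IL"), 26: ("Ia", "IL"), 27: ("Ir", "IL"),
    28: ("CZ", "CZ"), 29: ("PL", "PL"), 30: ("IE", "IE"),
}


def recode_country(numeric_code: int, public: bool = False) -> str:
    """Return the internal (default) or public string code for an old numeric code."""
    internal, public_code = COUNTRY_CODES[numeric_code]
    return public_code if public else internal


print(recode_country(23))               # internal code for Belgium French
print(recode_country(23, public=True))  # public code for Belgium French
```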
New household identifier: hhidcom (internal) & hhid (public)
Ø Digits 1-2: country code in letters, e.g. AT for Austria, Bf for Belgium French speaking (internal)
Ø Digits 3-8: fixed household ID. This part will not change across waves if a household splits off.
Ø Digit 9: one character added to the fixed household id to identify whether it is an ‘additional’ household that resulted from a split
  Ø A for all ‘original’ households (all in wave 1, refresher in wave 2)
  Ø B is used only if a household has split. A is then still used for the ‘first’ part of the household and B for the ‘splitting part’ (the one that is interviewed second, normally the one that moved out).
  Ø C is used for the very rare case of a further split-off household, for example when the original household in wave 1 consisted of 3 eligible sisters and split into 3 parts.

Examples for new household id
AT100100A: Austria, ‘original’ household
AT100100B: Austria, split-off household
Bf140103A: Belgium French speaking household (internal)
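As a rough illustration of the layout above, a household id can be split into its three parts with a regular expression. The names `HouseholdId`, `parse_hhidcom` and `same_original_household` are hypothetical, and the sketch assumes that only the letters A to C occur in position 9, as described on the slide.

```python
import re
from typing import NamedTuple

# two letters, six digits, one split letter (assumed to be A, B or C)
HHIDCOM_PATTERN = re.compile(r"([A-Za-z]{2})([0-9]{6})([A-C])")


class HouseholdId(NamedTuple):
    country_code: str  # characters 1-2
    fixed_id: str      # characters 3-8, stable across waves
    split_letter: str  # character 9: A = original, B/C = split-off


def parse_hhidcom(hhidcom: str) -> HouseholdId:
    match = HHIDCOM_PATTERN.fullmatch(hhidcom)
    if match is None:
        raise ValueError(f"not a valid household id: {hhidcom!r}")
    return HouseholdId(*match.groups())


def same_original_household(a: str, b: str) -> bool:
    """True if both ids share country code and fixed household id."""
    return parse_hhidcom(a)[:2] == parse_hhidcom(b)[:2]


print(parse_hhidcom("AT100100B"))
print(same_original_household("AT100100A", "AT100100B"))
```

Comparing only the first two parts is what lets a split-off household be traced back to the household it came from.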
New person identifier: pidcom
Ø Digits 1-2: country code (CC) in letters, e.g. AT for Austria, Bf for Belgium French speaking
Ø Digits 3-8: fixed household ID. This part will not change across waves.
Ø Digits 9-10: respondent id, e.g. if respid is 1 it will be 01

Respondent identifier
old (sampid & respid)   new (pidcom)
1110010000  1           AT10010001
1110010000  2           AT10010002
2314010300  1           Bf14010301
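The conversion in the table above can be sketched as follows. This is an assumption-laden illustration, not the actual SHARE conversion routine: the name `build_pidcom` is hypothetical, the old sampid is assumed to be the 10-digit form shown in the table (country code, household ID, split indicator, without the wave indicator), and the lookup contains only the two country codes used in the examples.

```python
# Partial lookup: only the two country codes that appear in the examples
INTERNAL_COUNTRY_CODE = {"11": "AT", "23": "Bf"}


def build_pidcom(sampid: str, respid: int) -> str:
    """Build a pidcom from a 10-digit old sampid and a numeric respid."""
    if len(sampid) != 10 or not (sampid.isascii() and sampid.isdigit()):
        raise ValueError(f"expected 10 digits, got {sampid!r}")
    if not 0 <= respid <= 99:
        raise ValueError(f"respid must fit in two digits, got {respid}")
    country = INTERNAL_COUNTRY_CODE[sampid[0:2]]  # digits 1-2 -> letters
    household_id = sampid[2:8]                    # fixed household ID
    return f"{country}{household_id}{respid:02d}"  # respid zero-padded to 2 digits


print(build_pidcom("1110010000", 1))
print(build_pidcom("2314010300", 1))
```

The split indicator is deliberately dropped, which is why the person id stays the same when a household splits.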
Old and new ids

                                internal version                 public version (scrambled)
                                old               new            old                      new
Household ID & wave indicator   sampid            hhidcom & wi   sampid2                  hhid
Person id                       sampid & respid   pidcom         sampid & respid & wi     mergeid
In addition:
Ø A dataset will be generated that shows to which households a respondent belonged during her or his ‘SHARE history’, e.g.:
  Columns: pidcom, hhidcom_w1, hhidcom_w2, hhidcom_w3
  AT10010001: AT100100A
  AT10010002: AT100100A, AT100100B
  Bf14010301: Bf140103A
Ø A compatibility file will be made for internal use to merge the old sampid/respid files with the new ids
Ø We will have an additional person id (uuid) to ensure uniqueness, but it will be used in the background only, for technical reasons
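One way such a history dataset might be queried is sketched below. The column names follow the slide, but everything else is an assumption for illustration: the helper names are hypothetical, and the placement of each household id in a particular wave column is a guess, since the slide's table layout does not survive in this transcript.

```python
from typing import Dict, List, Optional

WAVE_COLUMNS = ("hhidcom_w1", "hhidcom_w2", "hhidcom_w3")

# Illustrative rows only; the wave in which each id appears is assumed.
history: Dict[str, Dict[str, Optional[str]]] = {
    "AT10010001": {"hhidcom_w1": "AT100100A", "hhidcom_w2": None, "hhidcom_w3": None},
    "AT10010002": {"hhidcom_w1": "AT100100A", "hhidcom_w2": "AT100100B", "hhidcom_w3": None},
    "Bf14010301": {"hhidcom_w1": None, "hhidcom_w2": "Bf140103A", "hhidcom_w3": None},
}


def households_in_order(row: Dict[str, Optional[str]]) -> List[str]:
    """Household ids a respondent belonged to, in wave order, skipping missing waves."""
    return [row[col] for col in WAVE_COLUMNS if row.get(col) is not None]


def changed_household(row: Dict[str, Optional[str]]) -> bool:
    """True if the respondent appears in more than one household."""
    return len(set(households_in_order(row))) > 1


for pidcom, row in history.items():
    print(pidcom, households_in_order(row), changed_household(row))
```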
Right now
Ø we still have to use the old system for data cleaning
Ø but we will soon have the pidcom to merge across waves
Ø mergeid will already be included in release 0
Ø as soon as the new system is available and checked, we will inform you how to proceed, probably at the next SHARE data cleaning meeting (February 6, Antwerp)