
- Number of slides: 48
Introduction to Stata 2016

And how will we do this?
- I introduce and demonstrate a topic and a set of commands
- You try the same commands on your computer
- I (sometimes) give you small assignments to complete
- You report your experience and we discuss any problems that occurred

What do we find?
- A result window (Stata speaks to us)
- A review window (shows executed commands)
- A variable window (shows the variables in the data set)
- A command interface (where we tell Stata what to do)

Using the Do-files
- A do-file in Stata is a text file with commands that can be run directly from Stata
- The do-file stores your commands
- By using do-files you always have good documentation of your work, such as codings etc.
- It also makes it easy for you to repeat or modify your analyses
- By using do-files you never have to make any changes to your original data file

Data for exercise
- Download exercise 1 and 2 to your computer: https://gul.gu.se/courseId/73396/content.do?id=32404802

Using the Do-files
- A do-file should initially look something like this:
  . clear
  . set more off
  . use c:/exercise2.dta

Using the Do-files
- . clear: clears the data in memory. Otherwise no new data can be opened.
- . set more off: tells Stata to execute all commands without pausing, regardless of screen size
- Dots before commands are standard in most books on Stata (I use them as well so you get used to them). The dot is Stata's prompt and is not typed.
- Moreover, Stata is case-sensitive: upper- and lower-case letters are treated as different.
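
To make the do-file idea concrete, here is a minimal sketch of a complete do-file. It assumes Stata's built-in example data set auto.dta (loaded with sysuse) as a stand-in for the course file, so it runs on any installation; the file name example.do is hypothetical.

```stata
* example.do: a minimal do-file skeleton (hypothetical file name)
* Lines starting with * are comments; // comments also work inside do-files.

clear                 // drop any data currently in memory
set more off          // do not pause when output fills the screen

* Assumption: the built-in auto data replace c:/exercise2.dta here
sysuse auto, clear

describe              // quick overview of the variables just loaded
```

Running the whole file (for instance with . do example.do) repeats every step from scratch, which is what makes the analysis reproducible.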

Using the help command
- Typing . help command in the command interface (ci) solves many problems.
- Try any of the following:
  . help desc
  . help lab
  . help list

Renaming the data
- Changing variable names with the rename command in the ci:
  . rename old_varname new_varname
- Labeling variables:
  . label variable varname ["label"]
  e.g. . label variable sex "Gender"

Examining your data II
- Some helpful commands to examine your data more carefully:
  . tabulate/tab1
  . tabstat
  . summarize
  . list
  . browse
  . order
  . sort/gsort
  . inspect
  . describe
  . codebook

Examining your data II
- Let us now go through each command and see what it can do for us, using "exercise1.dta".
- . tabulate (tab): tabulates our variables. The command requires a variable list.
  . tab vars
  . tab v39
- If you want to tab several variables:
  . tab1 v39 v35x v40

Examining your data II
- . summarize (sum): summarizes our variables. If no varlist is given, all variables are summarized.
  . sum vars, d    (d = detail; gives more information such as median values etc.)
- For example:
  . sum v39, d
  . sum v39 v35x v40

Examining your data II
- . list: lists the values of the variables, for all variables or for selected ones
  . list          shows the values of all variables for each observation
  . list vars     shows the values of the specified variables for each observation
- For example:
  . list v39 v35x v40
- Type set more off for long outputs

Examining your data II
- . order: orders the variables. The command hence requires a variable list.
  . order vars
- For example:
  . order v39 v35x v40

Examining your data II
- . sort: arranges the observations of the current data in ascending order based on the values of the variables in varlist.
  . sort vars
- For example:
  . sort v39 v35x v40

Examining your data II
- Even better is . gsort [-|+], which arranges the observations in ascending or descending order, such as:
  . gsort -var1
  . gsort +var1 var2 var3 etc.

Examining your data II
- . inspect: displays a simple summary of the data's attributes. It is a bit more detailed than sum or tab and is useful for numerical vars.
  . inspect vars
- For example:
  . inspect v39 v35x v40

Examining your data II
- . describe: describes the data in memory or in a file
  . describe vars
- For example:
  . describe v39 v35x v40

Examining your data II
- . codebook: describes the data contents; the output is often useful for printing. It also gives information on variable characteristics, such as whether a variable is numeric or string.
  . codebook vars
- For example:
  . codebook v39 v35x v40
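
The examination commands above can be tried in one go. The following is only a sketch: it assumes the built-in auto.dta as a stand-in for the exercise data, so the variable names (make, price, rep78, foreign) belong to that data set and not to the course files.

```stata
* Assumption: built-in auto data stand in for exercise1.dta
sysuse auto, clear

describe                   // names, storage types and labels of all variables
codebook rep78             // type, range, missing values for one variable
inspect price              // small histogram plus counts of negative/zero/positive
summarize price, detail    // mean, sd and percentiles, including the median

tab1 rep78 foreign         // one-way table for each listed variable

sort price                 // ascending order
list make price in 1/5     // the five cheapest cars

gsort -price               // descending order
list make price in 1/5     // the five most expensive cars
```

Combining gsort with list ... in 1/5 is a quick way to look at the top or bottom of any ranking.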

Stata options
- Stata's general grammar is very straightforward, and most commands can be executed with different options.
  . command varlist, [options]

Stata options
- To see which options are available, type:
  . help command

Stata options
- Let's try some of the commands we learnt with their options (we have already tried one)
  . sum varlist, [d]    (d is our option here)

Stata options
- Useful options for tabulate: sort, nolabel, missing
  . tab varlist, [options]

Stata options
  . tab varlist, [options]
- For example, type:
  . tab v39, sort
  . tab v40, nolabel
  . tab1 v39 v40, missing
  (underlined letters mark the abbreviation accepted in the ci)

Stata options
- Let's continue with the options
- Stata allows for a wide range of options and pre- or post-command qualifiers that can be used with the main commands.
  . [by varname:] command varlist [in] [if], [options]

Stata options
- Let's continue with the options and introduce the if-statements
- An if-statement means that the command is executed only for those observations that fulfill the condition you specify.
  . command varlist [if], [options]

Stata options
- If-statements should be specified before the comma and can be combined with other options, such as:
  . command varlist if var1==x, [options]
- Try the following statement:
  . tab u39 if v35x==1, m    (m shows variable values with missing included)
- Here we are simply tabulating the values of satisfaction with life for all men

Stata options
- What's the level of life satisfaction among young and old people?
  . tab u39 if v42x <1973
  or
  . tab u39 if v42x >1970
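
One caveat with "greater than" conditions like the one above: Stata treats numeric missing values as larger than any number, so a condition such as x > 3 is also true for observations where x is missing. A small sketch, assuming the built-in auto.dta (whose variable rep78 has some missing values) as a stand-in for the exercise data:

```stata
* Assumption: built-in auto data; rep78 contains a few missing values
sysuse auto, clear

count if rep78 > 3                      // also counts observations with rep78 missing
count if rep78 > 3 & !missing(rep78)    // counts only non-missing values above 3
```

The second count is smaller than the first because the missing observations are excluded. Adding & !missing(varname) is a cheap safeguard whenever a > or >= condition is used.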

Stata options – common Stata operators
- Some noteworthy operators:
  |     or
  ==    equal to (as comparison)
  !=    not equal to (as comparison)
  ^     exponent (e.g. 2^2 = 4)

Stata options – common Stata operators
- Let's try the OR operator
  . tab varlist if varX==Z | varX==Y
- For example:
  . tab w39 if v42x>1970 & v35x==1 & v40==1 | v40==4
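
When & and | are mixed in one condition, Stata evaluates & before |, so a & b | c is read as (a & b) | c. If the OR is meant to apply only to the last variable, parentheses are needed. The sketch below checks this on the built-in auto.dta, which is an assumption standing in for the exercise data; the local macro name n_unpar is made up for the illustration.

```stata
* Assumption: built-in auto data stand in for the exercise data
sysuse auto, clear

* Without parentheses, & binds tighter than |
count if foreign==1 & rep78==4 | rep78==5
local n_unpar = r(N)

* The same condition with the implicit grouping written out
count if (foreign==1 & rep78==4) | rep78==5
assert r(N) == `n_unpar'    // passes: both conditions select the same observations

* What is usually intended: foreign cars with rep78 equal to 4 or 5
count if foreign==1 & (rep78==4 | rep78==5)

* Equivalent and shorter, using inlist()
count if foreign==1 & inlist(rep78, 4, 5)
```

Writing the parentheses explicitly costs nothing and removes any doubt about which observations are selected.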

Stata options – common Stata operators
- Now we have introduced the if-statement.
  . [by varname:] command varlist [in] [if], [options]
- Let's look at the in-statement.
  . command varlist [in], [options]

Stata options – common Stata operators
- Specifics for the in qualifier:
  f    the first observation in the data set
  l    the last observation in the data set
- Such as:
  . command varlist in f/l

Stata options – common Stata operators
- Examples of in-statements:
  command          meaning
  list in 1/10     list the first ten observations
  list in f/10     list the first ten observations
  list in 5/15     list observations 5 to 15
  list in 5/l      list from observation 5 to the end
  list in -5/l     list the last five observations
  list in 10       list only observation 10

Stata options – common Stata operators
- If you want to keep or drop variables, for example:
  . drop var3-var5    (or . keep var3-var5)
- Or labels:
  . label drop labelname
- Or observations:
  . drop in 45/65

Stata options – common Stata operators
- Let's say you (for some reason) want to find the level of education among the first 20 respondents that are very satisfied with their lives
  . gsort +v39
  . list v40 v39 in 1/20
- Or the last 20 respondents
  . list v40 v39 in -20/l

Stata options
- And finally, let's check out the by option.
  . [by varname:] command varlist [in] [if], [options]
- With the by command you can get the values of variable z for every value of variable x, such as:
  . by x: tab z
- However, the by command only works for sorted data…

Stata options
- Solutions (there are always several ways to do things in Stata):
  1. Sort the variable and then use the by command (the long way)
     . sort x
     . by x: tab z
  2. Sort directly in the by prefix, such as:
     . by x, sort: tab z
  3. Or even better, use the bysort command
     . bysort x: tab z

Stata options
- Let's try the bysort command
- For example:
  . bysort v40: tab v39
  . bysort v40: sum v39
- Etc.
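
The three routes to by-group output give the same result. The sketch below runs them on the built-in auto.dta, an assumption standing in for the exercise data, with foreign playing the role of the grouping variable x.

```stata
* Assumption: built-in auto data stand in for the exercise data
sysuse auto, clear

* 1. The long way: sort first, then use by
sort foreign
by foreign: summarize price

* 2. Sort inside the by prefix
by foreign, sort: summarize price

* 3. The shortcut
bysort foreign: summarize price

* Any command that allows by works the same way, e.g. a table per group
bysort foreign: tabulate rep78
```

Running by on unsorted data stops with a "not sorted" error, which is why bysort is usually the most convenient form.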

In conclusion: what have we learnt?
- Working with do-files
- More on data examination
- Creating simple univariate tables
- Sorting your observations
- Re-ordering your variables
- Stata logical operators
- Using command qualifiers: if, in and by statements

Exploring data
- Before we move over to data management (which is the next subject), let's practice what we've learnt so far… with a new (and more interesting) data set.
- So…
  - Clear Stata
  - Create a new do-file
  - Type in the necessary set commands
  - Load the data file "exercise2.dta"
  (Do all this in the do-file)

  . clear
  . set more off
  . use exercise2.dta
- This data is based on QoG, with country-years as units of analysis.

Examining your data II
- Take a look at the data and explore the variables. Use the commands below combined with the if/in/by options:
  . tabulate/tab1
  . summarize
  . list
  . browse
  . order
  . sort/gsort
  . inspect
  . describe
  . codebook

Question 1
- Which are the top ten country-year observations in terms of GDP per capita?

Solution(s)
  . clear
  . set more off
  . use "C:/exercise2.dta", clear
  . gsort -mad_gdppc
  . list cname_year in 1/10
- Or
  . tab cname_year in 1/10

Question 2
- What is the mean value of GDP per capita among countries with religious fractionalization below the 25th percentile, and among those above the 75th percentile?

Solution:
  . sum al_religion, d
  25th percentile of al_religion = .232
  75th percentile of al_religion = .641
  . sum mad_gdppc if al_religion<=.232
  . sum mad_gdppc if al_religion>=.641
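
Instead of copying the percentile values by hand, they can be picked up from the results that summarize, detail leaves behind in r(p25) and r(p75). This is only a sketch of that alternative, not the solution given in the course: it assumes the built-in auto.dta, with mpg standing in for al_religion and price for mad_gdppc, and the local macro names p25 and p75 are made up for the illustration.

```stata
* Assumption: built-in auto data; mpg plays the role of al_religion,
* price the role of mad_gdppc
sysuse auto, clear

* Store the 25th and 75th percentiles without printing the full table
quietly summarize mpg, detail
local p25 = r(p25)
local p75 = r(p75)
display "25th percentile = `p25', 75th percentile = `p75'"

* Mean of the outcome at or below the 25th percentile
summarize price if mpg <= `p25'

* Mean of the outcome at or above the 75th percentile;
* !missing() keeps missing values from being treated as "large"
summarize price if mpg >= `p75' & !missing(mpg)
```

With the exercise data the same pattern would use al_religion and mad_gdppc, and avoids typing thresholds that do not match the percentiles shown in the output.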