Poli 5D Social Science Data Analytics
More on Stata
Shane Xinyang Xuan
ShaneXuan.com
February 1, 2017
Contact Information
Shane Xinyang Xuan
xxuan@ucsd.edu

The teaching staff is a team!
  Professor Roberts   M 1600-1800 (SSB 299)
  Jason Bigenho       Th 1000-1200 (Econ 116)
  Shane Xuan          M 1100-1150 (SSB 332), Th 1200-1250 (SSB 332)

Supplemental Materials
  UCLA STATA starter kit    http://www.ats.ucla.edu/stat/stata/sk/
  Princeton data analysis   http://dss.princeton.edu/training/
Road map
Some quick notes before we start today’s section:
– Make sure that you pass around the attendance sheet
– Open a .do file
– Import your data (“h1_fams_data.xlsx”)
– I will be using my slides, and you will need to type the code in your .do file
Announcement
I have changed my office hours to
• Monday 11-11:50 am
• Thursday 12-12:50 pm
in order to accommodate as many students as possible.
Data management
• You should have the data imported before the section starts:
  – cd "/Users/Shane/Dropbox/Poli5D/psets/"
  – import excel "h1_fams_data.xlsx", sheet("Families") firstrow clear
• We want to generate a new variable (age_dad2)
    generate age_dad2 = age_dad + 1
• We want to replace a value in variable race_mom
    replace race_mom = "Black" if race_mom == "Blck"
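A few quick checks can confirm that the generate and replace steps did what we wanted. This is only a sketch: it assumes age_dad is numeric and race_mom is a string variable, and the `//` comments work in a .do file but not when typed in the Command window.

```stata
* Sanity checks after the generate/replace steps (run from a .do file)
describe age_dad age_dad2 race_mom    // confirm storage types
summarize age_dad age_dad2            // mean of age_dad2 should be 1 higher
assert age_dad2 == age_dad + 1        // stops the .do file if the rule fails
tab race_mom, missing                 // "Blck" should no longer be listed
count if race_mom == "Blck"           // should be 0 after the replace
```

If age_dad is missing for a family, age_dad2 is missing as well, so the assert still passes for those rows.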
Label your variables
• Create a mapping (mom_older_names)
    label define mom_older_names 1 "Yes" 0 "No"
• Associate the mapping with a variable
    label values mom_older mom_older_names
• Assign a variable label
    label variable mom_older "Whether mom is older"
• Tabulate your results
    tab mom_older
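To see what the labels changed (and what they did not), it can help to inspect the mapping and the underlying codes. A minimal sketch, assuming mom_older already exists as a 0/1 numeric variable:

```stata
* Inspect the value label and the variable it is attached to
label list mom_older_names       // show the value-to-text mapping
describe mom_older               // variable label and attached value label
tab mom_older, nolabel           // underlying 0/1 codes instead of Yes/No
tab mom_older, missing           // include missing values in the table
```

Value labels change only how the codes are displayed; the stored values remain 0 and 1.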
Deal with missingness
• Generate a missingness indicator
    generate dadmiss = missing(age_dad)
• Tabulate your results
    tab dadmiss
• Look up the observations with missing values
    list if dadmiss == 1
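The same questions can be asked without creating a new variable. The sketch below assumes age_mom and idnum exist in the data (both are used on the visualization slides), and the cutoff of 50 is purely illustrative.

```stata
* Other ways to inspect missingness (run from a .do file)
count if missing(age_dad)                 // number of missing ages
misstable summarize age_dad age_mom       // missingness report for several variables
list idnum age_dad if dadmiss == 1        // show only the relevant columns

* Stata treats numeric missing values as larger than any number,
* so exclude them explicitly in comparisons:
count if age_dad > 50 & !missing(age_dad)
```

Without the `!missing(age_dad)` condition, the last count would also include families whose age_dad is missing.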
Create some “bins”
Scenario: We want to recode interval variables into ordinal variables.
• Recode the variable
    recode age_dad (15/25=1) (26/35=2) (36/55=3), gen(age_dad3)
• Create a mapping
    label define agenames 1 "young" 2 "middle" 3 "older"
• Apply the mapping
    label values age_dad3 agenames
• Tabulate results, with row percentages
    tab age_dad3 welfare, row
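One thing to watch with recode: values that match none of the rules (for example an age above 55) are copied into the new variable unchanged rather than set to missing. A hedged sketch of how to check for this, plus one possible alternative with open-ended bins; age_dad4 is a hypothetical name introduced only for illustration.

```stata
* Check whether any values fell outside the three bins
summarize age_dad                                          // observed range
count if !missing(age_dad3) & !inlist(age_dad3, 1, 2, 3)   // values left unrecoded

* Illustrative alternative: open-ended first and last bins
recode age_dad (min/25=1) (26/35=2) (36/max=3), gen(age_dad4)
label values age_dad4 agenames
tab age_dad4, missing
```

Here `min` and `max` stand for the smallest and largest nonmissing values, so missing ages stay missing. Both versions assume ages are whole numbers; a value such as 25.5 would match none of the rules.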
Visualization in Stata
• Histogram
  – histogram age_mom
  – histogram age_mom, frequency
  – histogram age_mom, percent
• Scatterplot
  – twoway (scatter age_mom age_dad, mlabel(idnum) mlabsize(tiny) msize(tiny))
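The commands above accept further options for bin width, titles, and overlays. The following is a sketch only: the bin width, the titles, and the output file name are illustrative choices, and it assumes age_mom and age_dad are ages in years.

```stata
* Histogram with 5-year bins and axis titles, then save it as a PNG
histogram age_mom, percent width(5) title("Age of mother") xtitle("Age (years)")
graph export "age_mom_hist.png", replace

* Scatterplot with a fitted line overlaid (/// continues the line in a .do file)
twoway (scatter age_mom age_dad) (lfit age_mom age_dad), ///
    xtitle("Age of father") ytitle("Age of mother") legend(off)
```

graph export always writes the most recently drawn graph, so it should come right after the plot you want to keep.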
Visualization in Stata (2)
• Boxplot
  – graph box age_mom
  – graph box age_mom, scheme(s1manual)
• Barplot
  – Code race_mom into a numeric variable
      encode race_mom, generate(race_mom2)
  – Install -catplot-
      ssc inst catplot
  – Plot
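The slide stops at the plotting step, so the following is one possible way to finish it rather than the command used in section. It assumes catplot was installed from SSC as shown and that race_mom2 was created by the encode step.

```stata
* Bar chart of a categorical variable with the user-written catplot command
catplot race_mom2                  // bars show the number of families per category
catplot race_mom2, percent         // same plot, in percentages

* Cross-check the bar heights against a frequency table
tab race_mom2
```

Because encode attaches the original strings as value labels, the bars are labeled with the race_mom categories rather than with numeric codes.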