Yale Center for Analytical Sciences (YCAS) A Macro Automates your Table 1 Geliang Gan Biostatistician YCAS geliang.gan@yale.edu
Contents Ø Why did I develop the macro Ø How did I develop the macro Ø What can the macro do for you Ø Limitation of the macro
Developing Table 1 macro Ø 2 macros were developed, one for categorical variable summary and one for continuous variable summary Ø 2 macros were combined into 1 overall table 1 macro % Table1Macro (dsn = car1, calist = type origin gender carmake, coplist = msrp invoice, cononplist = mpg_city mpg_highway, group = drivetrain);
Sample Result Table1 - Variable Summary Drivetrain Front Rear N = 336 P Value (N = 226) (N = 110) MSRP Mean (SD) 24782.56 (10697.04) 46093.63 (27306.18) <0.001 *** Invoice Mean (SD) 22834.07 (9718.06) 42101.36 (24925.46) <0.001 *** MPG (City) Median (Range) 21.0 (12.0 – 60.0) 18.0 (12.0 – 29.0) <0.001 *** MPG (Highway) Median (Range) 29.0 (18.0 – 66.0) 25.0 (18.0 – 36.0) <0.001 *** Type Hybrid 003 (01.33%) 000 (00.00%) <0.001 *** SUV 022 (09.73%) 000 (00.00%) Sedan 179 (79.20%) 055 (50.00%) Sports 008 (03.54%) 036 (32.73%) Truck 000 (00.00%) 012 (10.91%) Wagon 014 (06.19%) 007 (06.36%) Origin Asia 099 (43.81%) 025 (22.73%) <0.001 *** Europe 037 (16.37%) 050 (45.45%) USA 090 (39.82%) 035 (31.82%) gender 0.023 * * 0 098 (49.25%) 069 (62.73%) 1 101 (50.75%) 041 (37.27%) carmake 0 034 (15.04%) 023 (20.91%) 0.17 1 096 (42.48%) 051 (46.36%) 2 096 (42.48%) 036 (32.73%) Note: * P value less than 0.05 ** P value less than 0.01 *** P value less than 0.001
Macro parameters dsn = calist = coplist = cononplist = outputorder = continuousorder = 1 outdsn = eratio = 0.2 percenttype = copmain = mean table1summary column copsupplement = cononpmain = cononpsupplement caexcludelist = copexcludelist = std median = range cononpexcludelist = caincludemissing = copincludemissing = cononpincludemissi missingtop = yes no no ng = no notest = no cadec = 2 copdec = 2 cononpdec = 1 overall = no paddingchar = 1 reportinsas = no variableshading = labelvariable = no suppresswarning = yes yes creatertf = yes compacttable = yes tabletitle = savefilename = orderbyformat = variable summary pvaluetop = no orientation = missingpercent = missinglabel = group = yes Missing showgroup = yes Nalabel = NA
Foolproof features • dsn must be provided and data isn’t empty • Variable name in any list must be in the dsn • Name of statistic for continuous variable is the list supported by proc means • Stats are only calculated once for duplicate variable names in the list • If group variable are all missing, data will be summarized without stratification • Parameter eratio ranges between 0 and 1 • Variable in continuous variable list must be numeric
Example1: overall = yes % Table1Macro (dsn = car1, calist = type origin gender carmake, coplist = msrp invoice, cononplist = mpg_city mpg_highway, group = drivetrain, Overall = yes );
Example1 Result Table1 - Variable Summary Drivetrain Front Rear Total P Value (N = 226) (N = 110) (N = 336) MSRP Mean (SD) 24782.56 (10697.04) 46093.63 (27306.18) 31759.40 (20488.22) <0.001 *** Invoice Mean (SD) 22834.07 (9718.06) 42101.36 (24925.46) 29141.82 (18643.15) <0.001 *** MPG (City) Median (Range) 21.0 (12.0 – 60.0) 18.0 (12.0 – 29.0) 20.0 (12.0 – 60.0) <0.001 *** MPG (Highway) Median (Range) 29.0 (18.0 – 66.0) 25.0 (18.0 – 36.0) 28.0 (18.0 – 66.0) <0.001 *** Type <0.001 *** Hybrid 003 (01.33%) 000 (00.00%) 003 (00.89%) SUV 022 (09.73%) 000 (00.00%) 022 (06.55%) Sedan 179 (79.20%) 055 (50.00%) 234 (69.64%) Sports 008 (03.54%) 036 (32.73%) 044 (13.10%) Truck 000 (00.00%) 012 (10.91%) 012 (03.57%) Wagon 014 (06.19%) 007 (06.36%) 021 (06.25%) Origin Asia 099 (43.81%) 025 (22.73%) 124 (36.90%) <0.001 *** Europe 037 (16.37%) 050 (45.45%) 087 (25.89%) USA 090 (39.82%) 035 (31.82%) 125 (37.20%) gender 0 098 (49.25%) 069 (62.73%) 167 (54.05%) 0.023 * * 1 101 (50.75%) 041 (37.27%) 142 (45.95%) carmake 0 034 (15.04%) 023 (20.91%) 057 (16.96%) 0.17 1 096 (42.48%) 051 (46.36%) 147 (43.75%) 2 096 (42.48%) 036 (32.73%) 132 (39.29%) Note: * P value less than 0.05 ** P value less than 0.01 *** P value less than 0.001
Example2: show missing information % Table1Macro (dsn = car1, calist = type origin gender carmake, coplist = msrp invoice, cononplist = mpg_city mpg_highway, group = drivetrain, Overall = yes, caincludemissing = yes, copincludemissing = yes, cononpincludemissing = yes );
Example2 Result Table1 - Variable Summary Drivetrain Front Rear Total P Value (N = 226) (N = 110) (N = 336) MSRP N (N Missing) 226 (0) 110 (0) 336 (0) Mean (SD) 24782.56 (10697.04) 46093.63 (27306.18) 31759.40 (20488.22) <0.001 *** Invoice N (N Missing) 226 (0) 110 (0) 336 (0) Mean (SD) 22834.07 (9718.06) 42101.36 (24925.46) 29141.82 (18643.15) <0.001 *** MPG (City) N (N Missing) 213 (13) 100 (10) 313 (23) Median (Range) 21.0 (12.0 – 60.0) 18.0 (12.0 – 29.0) 20.0 (12.0 – 60.0) <0.001 *** MPG (Highway) N (N Missing) 221 (5) 107 (3) 328 (8) Median (Range) 29.0 (18.0 – 66.0) 25.0 (18.0 – 36.0) 28.0 (18.0 – 66.0) <0.001 *** Type Missing 000 (00.00%) 000 (00.00%) 000 (00.00%) <0.001 *** Hybrid 003 (01.33%) 000 (00.00%) 003 (00.89%) SUV 022 (09.73%) 000 (00.00%) 022 (06.55%) Sedan 179 (79.20%) 055 (50.00%) 234 (69.64%) Sports 008 (03.54%) 036 (32.73%) 044 (13.10%) Truck 000 (00.00%) 012 (10.91%) 012 (03.57%) Wagon 014 (06.19%) 007 (06.36%) 021 (06.25%) Origin Missing 000 (00.00%) 000 (00.00%) 000 (00.00%) <0.001 *** Asia 099 (43.81%) 025 (22.73%) 124 (36.90%) Europe 037 (16.37%) 050 (45.45%) 087 (25.89%) USA 090 (39.82%) 035 (31.82%) 125 (37.20%) gender Missing 027 (11.95%) 000 (00.00%) 027 (08.04%) 0.023 * * 0 098 (43.36%) 069 (62.73%) 167 (49.70%) 1 101 (44.69%) 041 (37.27%) 142 (42.26%) carmake Missing 000 (00.00%) 000 (00.00%) 000 (00.00%) 0.17
Example3: exclude missing row in percentage calculation % Table1Macro (dsn = car1, calist = type origin gender carmake, coplist = msrp invoice, cononplist = mpg_city mpg_highway, group = drivetrain, Overall = yes, caincludemissing = yes, copincludemissing = yes, cononpincludemissing = yes, missingpercent = no );
Example3 Result Table1 - Variable Summary Drivetrain Front Rear Total P Value (N = 226) (N = 110) (N = 336) MSRP N (N Missing) 226 (0) 110 (0) 336 (0) <0.001 *** Mean (SD) 24782.56 (10697.04) 46093.63 (27306.18) 31759.40 (20488.22) Invoice N (N Missing) 226 (0) 110 (0) 336 (0) Mean (SD) 22834.07 (9718.06) 42101.36 (24925.46) 29141.82 (18643.15) <0.001 *** MPG (City) N (N Missing) 213 (13) 100 (10) 313 (23) Median (Range) 21.0 (12.0 – 60.0) 18.0 (12.0 – 29.0) 20.0 (12.0 – 60.0) <0.001 *** MPG (Highway) N (N Missing) 221 (5) 107 (3) 328 (8) Median (Range) 29.0 (18.0 – 66.0) 25.0 (18.0 – 36.0) 28.0 (18.0 – 66.0) <0.001 *** Type <0.001 *** Missing 000 (00.00%) 000 (00.00%) 000 (00.00%) Hybrid 003 (01.33%) 000 (00.00%) 003 (00.89%) SUV 022 (09.73%) 000 (00.00%) 022 (06.55%) Sedan 179 (79.20%) 055 (50.00%) 234 (69.64%) Sports 008 (03.54%) 036 (32.73%) 044 (13.10%) Truck 000 (00.00%) 012 (10.91%) 012 (03.57%) Wagon 014 (06.19%) 007 (06.36%) 021 (06.25%) Origin Missing 000 (00.00%) 000 (00.00%) 000 (00.00%) <0.001 *** Asia 099 (43.81%) 025 (22.73%) 124 (36.90%) Europe 037 (16.37%) 050 (45.45%) 087 (25.89%) USA 090 (39.82%) 035 (31.82%) 125 (37.20%) gender Missing 027 (11.95%) 000 (00.00%) 027 (08.04%) 0.023 * * 0 098 (49.25%) 069 (62.73%) 167 (54.05%) 1 101 (50.75%) 041 (37.27%) 142 (45.95%) carmake Missing 000 (00.00%) 000 (00.00%) 000 (00.00%) 0.17
Limitations • Program is developed using SAS 9.4.3 and only fully tested in Windows 7 and Word 2013 • Don’t change the background color of your document • If missing data are labeled with special values, such as “999”, and there are true missing, the report may have unexpected errors • Program may halt in calculating some p value for Fisher’s Exact test
A million thanks to my colleague at YCAS for their support on developing this macro geliang.gan@yale.edu
Recommend
More recommend