The SAS/STAT package has many procedures for running specific statistical analyses on structured data sets. I find the following uses of these procedures quite common, and there are usually a number of statistics that need to be output for further compilation. The caveat is to sort the data set in the desired order, so that the cell of interest, normally the study group with a response equal to 1, lands in the upper-left cell (1,1) of the two-way frequency table. This way the output statistics can be interpreted more straightforwardly.
A. How to get output from statistics procedures?
There are different ways to output statistics, e.g. ODS OUTPUT, the OUTPUT statement, or the OUT= option:
a. Use PROC MEANS for general statistics such as n, mean, min, max, and std.
- Use the MAXDEC= option to adjust the number of decimal places displayed.
- output out=filename: N, MIN, MAX, MEAN, STD, SUMWGT
b. Use PROC TTEST for the two-sample independent t-test; use the CLASS variable to identify and differentiate the groups for study cases and control cases.
- ods output
  - ttests=filename: tValue, DF (pick the Satterthwaite method, which assumes unequal variances*);
  - statistics=filename: n, mean, LowerCLMean, UpperCLMean, StdDev, StdErr for '0', '1', Diff (1-2) (in recent SAS releases the confidence limits are reported in the separate ConfLimits table);
  - equality=filename: fValue, ProbF (equality of variances)
c. Use PROC FREQ for the two-sample proportion test.
- TABLES statement, OUT= option
  - out=filename: by default this provides only the TABLES variables, the frequency count, and the percent of total frequency. To include the percent of row frequency and the percent of column frequency, add OUTPCT to the TABLES statement.
- OUTPUT statement, OUT= option
  - output relrisk out=filename;
- ods output
  - ChiSq=filename: Chi-Square, Prob
  - PdiffTest=filename: proportion difference test (Wald is the default method): proportion difference, Z value, one-sided Prob, and two-sided Prob.
  - RelativeRisks=filename: relative risk estimates: case-control (odds ratio), column 1 risk, column 2 risk, 95% lower/upper confidence limits
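Items a and b can be sketched as follows. This is a minimal example under stated assumptions, not code from the original analysis: the data set work.scores, its variables study and score, and every output data set name are made up for illustration. Note that MAXDEC= only affects the printed report; the OUT= data set keeps full precision.

```sas
/* Hypothetical input: a 0/1 group variable STUDY and a numeric variable SCORE */
data work.scores;
  input study score @@;
  datalines;
1 82 1 75 1 91 1 68 1 88
0 70 0 65 0 79 0 72 0 61
;
run;

/* a. PROC MEANS: the OUTPUT statement writes the requested statistics
      to a data set; NWAY keeps one row per STUDY level only */
proc means data=work.scores maxdec=2 nway n mean min max std;
  class study;
  var score;
  output out=work.score_stats n=n mean=mean min=min max=max std=std;
run;

/* b. PROC TTEST: ODS OUTPUT captures each table by its ODS table name.
      ConfLimits is requested as well, since recent releases report the
      confidence limits there rather than in Statistics. */
ods output TTests=work.tt_tests
           Statistics=work.tt_stats
           ConfLimits=work.tt_cl
           Equality=work.tt_equal;
proc ttest data=work.scores;
  class study;
  var score;
run;

/* Keep only the Satterthwaite row (unequal variances) */
data work.tt_satt;
  set work.tt_tests;
  where Method = 'Satterthwaite';
run;
```

The ODS OUTPUT statement applies to the next procedure step only, so it is placed immediately before PROC TTEST.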
Overall, ODS OUTPUT is the most versatile and powerful method for obtaining statistical results from these procedures. Check “ODS Table Names” under each procedure in the SAS/STAT User's Guide. Each procedure assigns a name to every table that it creates. Use ODS OUTPUT table-name=filename; to save that table as a data set under your own name in the WORK library. For the ODS table to be produced, you also need to include the corresponding option(s) in the relevant statement, so that the procedure actually generates those statistics.
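Besides the User's Guide, the table names can also be read from the log with ODS TRACE. The sketch below uses sashelp.heart, a sample data set shipped with SAS, purely as an illustration; the variable choice (sex by status) is an assumption and has nothing to do with the study data discussed here.

```sas
/* List every ODS table the step produces (name, label, path) in the log */
ods trace on;
proc freq data=sashelp.heart;
  /* CHISQ, RELRISK and RISKDIFF(EQUAL) are the options that make
     PROC FREQ create the ChiSq, RelativeRisks and PdiffTest tables */
  tables sex*status / chisq relrisk riskdiff(equal);
run;
ods trace off;
```

If one of these options is dropped, the matching table is not created, and an ODS OUTPUT request for it produces a warning in the log instead of a data set.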
Sample code for the two-sample proportion test (includes all three output methods):
%macro prop(in=, var=, out=, weight=);
  /* Sort descending so that study=1 with &var.=1 lands in cell (1,1) */
  proc sort data=&in.;
    by descending study descending &var.;
  run;

  ods graphics on;
  /* Method 1: ODS OUTPUT */
  ods output ChiSq=&out.chi_&var.
             PdiffTest=&out.pdiff_&var.
             RelativeRisks=&out.rr_&var.;
  proc freq data=&in. order=data;
    /* grpfmt. and rspFmt. are user-defined formats that must already exist */
    format study grpfmt. &var. rspFmt.;
    weight &weight.;
    /* Method 2: OUT= (with OUTPCT) in the TABLES statement */
    tables study*&var. / chisq measures riskdiff(equal) outpct
           out=&out._&var.
           plots=(freqplot(twoway=groupvertical scale=percent));
    /* Method 3: OUTPUT statement */
    output relrisk out=&out._or_&var.;
    title "Proportion Test: Case - Control study of variable &var. for &out.";
  run;
  ods output close;
  ods graphics off;
%mend prop;

%prop(in=g1pair, var=isRetainYr1, out=g1prop, weight=weight1);
*“The Satterthwaite approximation of the standard errors differs from the Pooled method in that it does not assume that the variances of the two samples are equal. This means that if the variances are equal, the Satterthwaite approximation should give us exactly the same answer as the Pooled method.” (Reference: https://wolfweb.unr.edu/~ldyer/classes/396/PSE.pdf)