SAS: Proc Tabulate Nested Row Variable and Missing Option

In previous 3 Proc Tabulate related posts, I have focused on table coloring, percentages, and setting order for the class variable .

In the table coloring post, I have provided example of stacking statistics for multiple row variables one after the other. Sometimes, we the variables has a nested relationship, we can use * to identify the hierarch of the variable, eg Faculty*program so the statistics by program are ordered by Faculty and then program.

proc tabulate data =table (where=(exclude = "NO" )) missing;
class visa fullpart gender program faculty year;
var heads fftes;
table faculty="Faculty" all visa="Visa Status" all fullpart="F/P" all  
gender="Gender" all faculty="Faculty"*program="Program" all,
(year="year" all)*(heads="Heads"*SUM="" heads="% Total"*pctsum fftes="FFTEs"*SUM="" fftes="% Total"*pctsum 
< faculty all visa all fullpart all  gender all faculty*program > =""  );
run; 

The above code basically stacking the sum and percentage of heads and FFTEs by year (column) for the row variables in the order of Faculty, Visa, fullpart (full-time or part-time), gender, and Program (nested under Faculty).

Please pay attention to the “missing” option in the Proc tabulate. If there is any missing value in the class variables, the whole observation is removed from the Proc tabulate result, even for the class variable that doesn’t have any missing values, because Proc tabulate force the total statistics for all the class variables to be the same. It is crucial to include the “missing” option, so “” has a value to be included in the Proc Tabulate results.

Here is a good reference document for Proc Tabulate.

Reference: https://www.iuj.ac.jp/faculty/kucc625/sas/tabulate.html

proc tabulate data= sashelp.cars;
class origin type make drivetrain;
var msrp;
table (origin="" all) * (make="" all), (type="" all )*(DriveTrain="" )*(MSRP=""*MEAN N);
run; 

Output:
The table produced will show a hierarchical summary:

  • Columns: Each combination of type (e.g., Sedan, SUV) and drivetrain (e.g., 4WD), summarizing msrp with mean and count.
  • Rows: Each combination of origin (e.g., Asia, Europe, USA) and make (e.g., Acura, Audi, BMW), along with totals (all).

SAS: Proc Tabulate Table Colouring

/*Original table*/
proc tabulate data =table1 ;
class Agegroup gender acadyear;
var heads;
table agegroup="Age Group" all gender="Gender" all ,
(acadyear="")*(heads="Heads"*SUM="" heads="% Total"*pctsum < agegroup all gender all > =""  );
run;
/* Change body to light green */
proc tabulate data =table1 S=[background=lightgreen] ;
class Agegroup gender acadyear;
var heads;
table agegroup="Age Group" all gender="Gender" all ,
(acadyear="")*(heads="Heads"*SUM="" heads="% Total"*pctsum <agegroup all gender all > =""  );
run;
Add S=[background=lightgreen] in the Proc tabulate statement.
/* Change Class Variable Colour */
proc tabulate data =table1 S=[background=lightgreen] ;
class Agegroup gender acadyear / S=[background=lightgreen];
var heads;
table agegroup="Age Group" all gender="Gender" all ,
(acadyear="")*(heads="Heads"*SUM="" heads="% Total"*pctsum < agegroup all gender all > ="" );
run;
Add / S=[background=lightgreen] in the Class statement.
/* Change var variable column heading colour */
proc tabulate data =table1 S=[background=lightgreen] ;
class Agegroup gender acadyear / S=[background=lightgreen];
var heads / S=[background=lightgreen];
table agegroup="Age Group" all gender="Gender" all ,
(acadyear="")*(heads="Heads"*SUM="" heads="% Total"*pctsum < agegroup all gender all > =""  );
run;
Add / S=[background=lightgreen] in the Var statement.
/* Change Box colour */
proc tabulate data =table1 S=[background=lightgreen] ;
class Agegroup gender acadyear / S=[background=lightgreen];
var heads / S=[background=lightgreen] ;
table agegroup="Age Group" all  gender="Gender" all ,
(acadyear="")*(heads="Heads"*SUM="" heads="% Total"*pctsum < agegroup all gender all > ="" )
/ box =[S=[background=lightgreen]];
run;
Add / / box =[S=[background=lightgreen]] in the Table statement.
/* Change Class categories colour */
proc tabulate data =table1 S=[background=lightgreen] ;
class Agegroup gender acadyear / S=[background=lightgreen];
classlev Agegroup gender acadyear /  S=[background=lightgreen];
var heads / S=[background=lightgreen] ;
table agegroup="Age Group" all  gender="Gender" all ,
(acadyear="")*(heads="Heads"*SUM="" heads="% Total"*pctsum < agegroup all gender all > =""  )
/ box =[S=[background=lightgreen]];
run;quit;
Add class variables and / S=[background=lightgreen] to the Classlev statement.
/* Change Colour of 'All' field */ 
proc tabulate data =table1 S=[background=lightgreen] ;
class Agegroup gender acadyear / S=[background=lightgreen];
classlev Agegroup gender acadyear /  S=[background=lightgreen];
var heads / S=[background=lightgreen] ;
table agegroup="Age Group" all  gender="Gender" all ,
(acadyear="")*(heads="Heads"*SUM="" heads="% Total"*pctsum < agegroup all gender all > =""  )
/ box =[S=[background=lightgreen]];
keyword all / S=[background=green];
run;
Add all / S=[background=green] to the Keyword statement.
/* Rename 'All' to 'Total'. Change 'Total' row colour Across the table. */
/* Method 1 */
proc tabulate data =table1 S=[background=lightgreen] ;
class Agegroup gender acadyear / S=[background=lightgreen];
classlev Agegroup gender acadyear /  S=[background=lightgreen];
var heads / S=[background=lightgreen] ;
table agegroup="Age Group" all="Total"*[s=]  gender="Gender" all="Total"*[s= < parent >]  ,
(acadyear="")*(heads="Heads"*SUM="" heads="% Total"*pctsum < agegroup all gender all > =""  )
/ box =[S=[background=lightgreen]];
keyword all / S=[background=green];
run;
/* Method 2 */
proc tabulate data =table1 S=[background=lightgreen] ;
class Agegroup gender acadyear / S=[background=lightgreen];
classlev Agegroup gender acadyear /  S=[background=lightgreen];
var heads / S=[background=lightgreen] ;
table agegroup="Age Group" all="Total"*[S=[background=green]]  gender="Gender" all="Total"*[S=[background=green]]  ,
(acadyear="")*(heads="Heads"*SUM="" heads="% Total"*pctsum < agegroup all gender all > =""  )
/ box =[S=[background=lightgreen]];
keyword all / S=[background=green];
run;
Add *[S=[background=green]] to the all labels in the Table statement.
/* Add All column Total. Override colour of All column */
proc tabulate data =table1 S=[background=lightgreen] ;
class Agegroup gender acadyear / S=[background=lightgreen];
classlev Agegroup gender acadyear /  S=[background=lightgreen];
var heads / S=[background=lightgreen] ;
table agegroup="Age Group" all="Total"*[S=[background=green]]  gender="Gender" all="Total"*[S=[background=green]]  ,
(acadyear="" all={label="All Years" S=[background=lightgreen]} )*(heads="Heads"*SUM="" heads="% Total"*pctsum < agegroup all gender all > ="" )
/ box =[S=[background=lightgreen]];
keyword all / S=[background=green];
run;
In Table statement, add all={label=”All Years” S=[background=lightgreen]} to override the colour setup in the Keyword statement.

SAS Proc Tabulate: Order Class Variable

The following block of code provides a cross-tabulation by student’s home faculty and responsible faculty who students took course from.  The special treatment in the code is the two class statements used for setting order for the home faculty class variable based on format of the home faculty.  The second class statement where the responsible class variable is defined doesn’t have the “/order” option, therefore the order for responsible variable will be based on internal value.

proc tabulate data=count;
class home /order= formatted ;
class responsible;
var ffte;
table (home="Home Faculty" all), (responsible="Responsible Faculty" all)*(count="")*(sum="");
run;

SAS: Proc Tabulate and Percentage Statistics

The following proc tabulate procedure will generate at a report on sum of undergraduate teaching and graduate teaching and FTE for each Faculty by departments (row header) and by year (column header).

proc tabulate data=data;
class year faculty dept;
var undergrad_load graduate_load fte;
table faculty, (dept all), year*(undergrad_load="UG" graduate_load="GR" fte)*(sum="");
run;

Data table contains grade, program, and incoming year of students. “Total” variable equals to 1 if the student has a grade; “ge75” is 1 if the grade is great than and equal to 75; and “ge80” is 1 if the grade is great than and equal to 80.

The following proc tabulate procedure gives a summary table of total number of counts of students with grade, students with grade great than and equal to 75 and 80.

proc tabulate missing f=6. noseps data=data;
class program progname category year level2;
var total ge75 ge80;
tables category,program*progname*level2,year*(total ge75 ge80) / rts=40 indent=3;
run;

PERCENTAGE STATISTICS

  • reppctn: report percentage (all rows and all columns add up to 100)
  • colpctn: column percentage (every column adds up to 100)
  • rowpctn: row percentage (every row adds up to 100)
  • A*pctsum: construct a custom denomination, both A and B are analysis variables, and B as in the is the denominator for calculating the percentage.

The following tabulate procedure calculates the percentage of greater and equal to 75 and 80. The outomatic SAS naming with ‘1111111’ of the percentage output reflects the number of class variables in the procedure.

proc tabulate missing f=6. noseps data=data out=dataout (drop=_type_ _table_ _page_ rename=(ge75_pctsum_1111111=ge75 ge80_pctsum_1111111=ge80));
class level1 faculty program progname category level2 year;
var total ge75 ge80;
tables category*level1*faculty*program*progname*level2*year, ge75*pctsum='ge75%'*f=8.2 ge80*pctsum='ge80%'*f=8.2 / rts=40 indent=3;
run;

Examples: with multiple class variables in row expression

  • Reppctn/Reppctsum
proc tabulate data = table1 ;
class Agegroup gender acadyear;
var heads ;
table agegroup="Age Group" all gender="Gender" all ,
(acadyear="" all)*(heads="Heads"*SUM="" heads="% Total"*reppctsum="" );
run;
  • Colpctn/Colpctsum — best for % sum for multiple class variables
proc tabulate data = table1 ;
class Agegroup gender acadyear;
var heads ;
table agegroup="Age Group" all gender="Gender" all ,
(acadyear="" all)*(heads="Heads"*SUM="" heads="% Total"*colpctsum="" );
run;
  • Rowpctn/Rowpctsum
proc tabulate data = table1 ;
class Agegroup gender acadyear;
var heads ;
table agegroup="Age Group" all gender="Gender" all ,
(acadyear="" all)*(heads="Heads"*SUM="" heads="% Total"*rowpctsum="" );
run;
  • pctsum – not working for the All column
proc tabulate data = table1 ;
class Agegroup gender acadyear;
var heads ;
table agegroup="Age Group" all gender="Gender" all ,
(acadyear="" all="All Year" )*(heads="Heads"*SUM="" heads="% Total"*pctsum < agegroup all gender all > ="" );
run;

Show % sign in the table

Reference: http://support.sas.com/kb/38/001.htmlhttp://support.sas.com/kb/38/001.html

proc format;                           
   picture fmtpct (round) low-high='009.99%';   
run;
/* use: f=fmtpct. */