Step 1: Organize dataset – reshape it from wide to long, so that the categorical variables, the candidate predictor (explanatory/independent) variables, and the outcome (response/dependent) variable each sit in their own column.
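One common way to do this reshaping is PROC TRANSPOSE. The sketch below is only an illustration: the data set names (wide, long) and the variables (id, group, score_v1-score_v3, visit, score) are made up for the example and are not from the original notes.

```sas
/* Hypothetical wide data: one row per subject, one column per visit */
data wide;
    input id group $ score_v1 score_v2 score_v3;
    datalines;
1 A 10 12 15
2 A 11 13 14
3 B  9 10 12
;
run;

/* PROC TRANSPOSE with a BY statement needs sorted input */
proc sort data=wide;
    by id group;
run;

/* Turn the three score columns into rows:                  */
/* VISIT holds the former column name, SCORE holds its value */
proc transpose data=wide
               out=long(rename=(col1=score))
               name=visit;
    by id group;
    var score_v1-score_v3;
run;
```

After this step, long has one row per subject and visit, which is the layout the BY-group procedures in the later steps expect.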
Step 2: PROC MEANS – Compute descriptive statistics for the numeric variables within each level of the categorical variables.
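%macro mean(data=, byvar=, var=);
    /* BY-group processing requires the data to be sorted by the BY variables */
    proc sort data=&data.;
        by &byvar.;
    run;

    /* Descriptive statistics for the VAR variables within each BY group */
    proc means data=&data.;
        by &byvar.;
        var &var.;
    run;
%mend mean;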
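A possible call, assuming the macro above has already been compiled in the session. SASHELP.CLASS is used here only as convenient example data and is not part of the original notes. Because the macro sorts its input in place, the example first copies the data into WORK, since the SASHELP library is normally read-only.

```sas
/* Work on a copy: the macro sorts the input data set in place */
data work.class;
    set sashelp.class;
run;

/* Descriptive statistics for height and weight, separately for each sex */
%mean(data=work.class, byvar=sex, var=height weight);
```

Several analysis variables can be passed in var= as a space-separated list, as shown.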
Step 3: PROC SGPANEL or PROC SGPLOT – Inspect scatter plots of the outcome variable against each predictor variable, by categorical variable.
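%macro plotby(data=, title=, by=, x=, y=, lbl=);
    proc sgpanel data=&data.;
        title &title.;
        /* One panel per level of the BY variable, on a 2-column by 4-row grid */
        panelby &by. / columns=2 rows=4;
        /* Scatter plot of y against x, each point labeled with the lbl variable */
        scatter x=&x. y=&y. / datalabel=&lbl.;
    run;
%mend plotby;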
- The title macro variable must be passed in double quotes;
- The by macro variable must name a categorical variable;
- Use the COLUMNS= and ROWS= options in the PANELBY statement to define the grid layout of the plot. Choose them based on the number of categories and the preferred size of each panel;
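- Use the UNISCALE= option in the PANELBY statement to control which shared axes use the same scale across panels: UNISCALE= COLUMN | ROW | ALL. The default is ALL (both column and row axes are identical), so specify COLUMN or ROW when only one of them needs to be identical;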
- To label each data point in the plot, use the DATALABEL= option in the SCATTER statement;
- Use the ROWAXIS and COLAXIS statements to override the default axis settings, such as the label and tick values;
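The notes above can be sketched as follows. This is an illustration under assumptions, not the original author's code: SASHELP.CLASS, the axis labels, and the tick values are chosen only for the example, and the macro call assumes plotby has already been compiled.

```sas
/* Macro call: title in double quotes, by= names a categorical variable */
%plotby(data=sashelp.class, title="Weight vs. height by sex",
        by=sex, x=height, y=weight, lbl=name);

/* The same plot written out directly, with the optional settings:  */
/*   - a 2 x 1 grid, since SEX has two categories                   */
/*   - UNISCALE=COLUMN: only the shared column axes are identical   */
/*   - COLAXIS / ROWAXIS: custom labels, tick values, and grid lines */
proc sgpanel data=sashelp.class;
    title "Weight vs. height by sex";
    panelby sex / columns=2 rows=1 uniscale=column;
    scatter x=height y=weight / datalabel=name;
    colaxis label="Height (in)" values=(50 to 75 by 5);
    rowaxis label="Weight (lb)" grid;
run;

/* PROC SGPLOT alternative: one plot, categories distinguished by marker */
proc sgplot data=sashelp.class;
    scatter x=height y=weight / group=sex datalabel=name;
run;

/* Clear the title so it does not carry over to later output */
title;
```

SGPANEL is the better fit when each category needs its own panel. SGPLOT with GROUP= keeps all categories in one plot, which makes direct comparison easier when there are only a few of them.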