SAS: Standardize Variables

A standardized variable, also called a z-score, has a mean of 0 and a standard deviation of 1. When variables measured on different scales are entered into a regression analysis, the variables with larger variances can have more influence on the results than the variables with smaller variances. It is therefore important to standardize variables in the preprocessing step for regression analysis, cluster analysis, and neural networks.

To calculate a standardized variable, subtract the mean from each value of the original variable and then divide the result by the standard deviation.
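Written as a formula, the calculation looks like the sketch below. The symbols are assumptions introduced here for illustration: x is an observed value, \bar{x} is the sample mean, and s is the sample standard deviation. The numbers in the worked line are hypothetical.

```latex
z = \frac{x - \bar{x}}{s}
\qquad \text{e.g. } x = 70,\ \bar{x} = 50,\ s = 10
\ \Rightarrow\ z = \frac{70 - 50}{10} = 2
```

A z-score of 2 means the value lies two standard deviations above the mean, whatever the original units were.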

Get the means and standard deviations before standardizing the variables.

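proc univariate data=data1;
var var1 var2 var3;
/* PROC UNIVARIATE has no OUT= option; the OUTPUT statement saves the statistics */
output out=stat mean=mean1 mean2 mean3 std=std1 std2 std3;
run;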

The variables in the data set are not measured in the same units and cannot be assumed to have equal variances, so we use PROC STDIZE with METHOD=STD to standardize them.

proc stdize data=data1 out=Stand method=std;
var var1 var2 var3;
run;
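
As a cross-check, the same z-score transformation can also be sketched with PROC STANDARD, assuming the same input data set data1 and the same three variables. The output data set name Stand2 is a hypothetical name chosen for this example.

```sas
/* Alternative sketch: rescale each variable to mean 0 and standard deviation 1. */
/* Stand2 is a hypothetical output data set name.                                */
proc standard data=data1 mean=0 std=1 out=Stand2;
   var var1 var2 var3;
run;
```

With default settings, both procedures divide by the sample standard deviation, so Stand and Stand2 should hold matching values.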

Get the means and standard deviations after standardizing the variables. Each mean should equal 0 (up to floating-point rounding) and each standard deviation should equal 1.

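proc univariate data=Stand;
var var1 var2 var3;
/* PROC UNIVARIATE has no OUT= option; the OUTPUT statement saves the statistics */
output out=stat mean=mean1 mean2 mean3 std=std1 std2 std3;
run;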
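
If only the means and standard deviations are needed, a more compact check is possible with PROC MEANS. This is a minimal sketch, assuming the standardized data set is named Stand as above.

```sas
/* Print only the mean and standard deviation of each standardized variable. */
proc means data=Stand mean std;
   var var1 var2 var3;
run;
```

The printed means may show tiny nonzero values such as 1E-16, which come from floating-point rounding rather than from a failed standardization.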