It is easier to use proc sgplot than proc boxplot to compare distributions across classification variables. “Drive Train” and “Type” are both categorical variables.
proc sgplot; /* no data= option, so the most recently created data set is used */
  title "Price distribution by Drive Train and Type";
  vbox invoice / category=type group=drivetrain;
run;
- boxes for each group are shown side by side within each category
- the group variable becomes the legend
- boxes are colored by group, matching the legend
- the inset statement in sgplot only displays text you supply; it does not compute statistics (n/min/max/mean/stddev)
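Since the sgplot inset does not compute statistics, one possible workaround is to produce them in a separate step. The sketch below is only an illustration and not part of the original comparison: it assumes the data is sashelp.cars (which contains Invoice, Type and DriveTrain) and uses proc means to print the same kind of summary as a table next to the plot.

```sas
/* Assumption: the input is sashelp.cars; swap in your own data set as needed */
proc means data=sashelp.cars n min max mean std maxdec=0;
  class drivetrain type;  /* one row of statistics per Drive Train / Type combination */
  var invoice;            /* the analysis variable plotted in the box plot */
run;
```

The statistics appear as a separate table in the output rather than inside the graph.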
proc sort out=cars; /* no data= option, so the most recently created data set is sorted */
  by drivetrain type;
run;
proc boxplot data=cars;
  title "Price distribution by Drive Train and Type";
  plot invoice*type;
  by drivetrain;
  inset min mean max stddev / header="Overall Statistics";
  insetgroup min max / header="Cheap and Expensive by Type";
run;
- the data must be sorted first, by the by-statement variable and then by the categorical (group) variable in the plot statement;
- boxes are drawn in light blue by default; changing the color takes extra code;
- two categorical variables cannot be shown side by side in one plot;
- the by statement produces a separate plot for each by group;
- inset and insetgroup are nice to have, because they show statistics as part of the plot.
- inset: data, min, max, mean, nmax, nmin, nobs, stddev;
- insetgroup: max, mean, min, n, nhigh, nlow, nout, q1, q2, q3, range, stddev;