SAS SQL: Merge dataset with multiple conditions

Assume all the macro variables are defined. There are two lists, A and B: A is the list of year n students, and B is a list of student IDs for year n+1 students in the same program. Step 1 creates a table of all students with study level 1 or 2 who are in A but not in B one year later. This identifies the level 1 and 2 students who were lost to the program a year later. Note that A and B are used as table aliases to keep the statement short. Also, a left join must be followed by an ON clause that specifies the columns used to join the two tables.

proc sql;
create table nov&yr.not as
select A.* from nov&yr. as A
left join nov&nextyr.id as B
on A.sisid = B.id
where B.id is null and A.std_lvl in ('01','02')
;
quit;
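
To see the "in A but not in B" pattern in isolation, here is a minimal sketch on toy data. The table names year_n, year_n1_id and lost, and the student IDs, are made up for illustration and are not part of the original files.

```sas
/* Hypothetical toy data, for illustration only */
data year_n;
  input sisid $ std_lvl $;
  datalines;
S001 01
S002 02
S003 03
S004 01
;
run;

data year_n1_id;
  input id $;
  datalines;
S001
S003
;
run;

proc sql;
  create table lost as
  select A.*
  from year_n as A
  left join year_n1_id as B
    on A.sisid = B.id
  /* unmatched rows from A have a missing B.id after the left join */
  where B.id is null
    and A.std_lvl in ('01','02');
quit;
/* lost contains S002 and S004: both are level 01/02 and absent from year_n1_id */
```

S001 is dropped because it appears in the next-year list, and S003 is dropped both because it appears there and because its study level is 03.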

The second step identifies, from the results of step 1, the students who changed their major to Psychology. Note that the AS keyword is optional when assigning table aliases, so it is omitted here.

proc sql;
create table nov&yr.chg as
select A.*, B.sisall, B.newm1 from nov&yr.not A, nov&nextyr.all B
where A.sisid = B.sisall
and A.major1 ne B.newm1
and B.newm1 eq 'PSYC'
;
quit;
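
The comma-separated table list with a WHERE condition is an implicit inner join. As a sketch, assuming the same tables and macro variables as above, the second step could also be written with an explicit INNER JOIN, which keeps the join key apart from the filters:

```sas
proc sql;
  create table nov&yr.chg as
  select A.*, B.sisall, B.newm1
  from nov&yr.not as A
  inner join nov&nextyr.all as B
    on A.sisid = B.sisall          /* join key */
  where A.major1 ne B.newm1        /* major changed */
    and B.newm1 eq 'PSYC';         /* new major is Psychology */
quit;
```

Both forms should return the same rows; the explicit form is only a matter of readability.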

SAS: Set start and stop for macro variables

Sometimes I need to repeat the same exercise several times. For example, faculty may want the retention rate for multiple years rather than just one. In that case it is easy to use %macro macroname (parameter1, parameter2) together with %do varname = &value1 %to &value2 to set the year range of the %do loop per the client's request.

Example:

%macro step (strtyear, stopyear);
%do yr= &strtyear %to &stopyear;

….continue coding

%end;

%mend step;

%step (11, 14);
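
As a minimal sketch of what the loop body might look like, the version below only writes the table names for each year to the log. The nov&yr. and nov&nextyr.id naming is borrowed from the SQL example above; the %put line is a placeholder for the real processing.

```sas
%macro step(strtyear, stopyear);
  %do yr = &strtyear %to &stopyear;
    /* derive the following year inside the loop */
    %let nextyr = %eval(&yr + 1);
    %put NOTE: processing nov&yr. against nov&nextyr.id;
  %end;
%mend step;

%step(11, 13);
/* The log shows:
   NOTE: processing nov11 against nov12id
   NOTE: processing nov12 against nov13id
   NOTE: processing nov13 against nov14id */
```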

 

SAS: Convert Variable between Numeric and Character Format

A. Use Put or Input
Convert between numeric and character variable.

  • character to numeric (input)
old_char = "2018";
new_num = input(old_char, 8.); *new_num = 2018;
  • character to numeric (input) to character (put)
old_char = "2018";
new_char = put(input(substr(old_char, 1, 4), 8.) - 1, 4.); *new_char = "2017";

/* or, relying on automatic character-to-numeric conversion (writes a note to the log) */
new_char = put(old_char*1 - 1, 4.); *new_char = "2017";
  • numeric to character (put), and with leading zeros
old_num = 2018;
new_char = put(old_num, 4.); *new_char = "2018";
new_char1 = put(old_num, z8.); *new_char1 = "00002018";
  • numeric to character (put) to numeric (input)
old_num = 2018;
new_num = input(substr(put(old_num, 4.), 3, 2), 8.); *new_num = 18;

Use the following functions to check whether a character variable contains any digits or alphabetic characters. Each function returns the position of the first character that meets its condition, or 0 if there is none.

check1 = anydigit(var); *return position of first digit, 0 if not found in the string;
check2 = anyalpha(var); *return position of first alphabetic character, 0 if not found in the string;
check3 = notalpha(var); *return position of first non alphabetic character, 0 if not found in the string;
check4 = notdigit(var); *return position of first non digit, 0 if not found in the string;
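
A small sketch of these functions on sample values (the values 'AB12' and '2018' and the variable name id are made up for illustration). It also shows one thing to watch for: trailing blanks in a padded character variable count as non-digits.

```sas
data _null_;
  var = 'AB12';
  check1 = anydigit(var);   /* 3: first digit is at position 3 */
  check2 = anyalpha(var);   /* 1: first letter is at position 1 */
  check3 = notalpha(var);   /* 3: first non-letter is at position 3 */
  check4 = notdigit(var);   /* 1: first non-digit is at position 1 */
  put check1= check2= check3= check4=;

  length id $8;
  id = '2018';              /* stored as '2018' plus four trailing blanks */
  padded  = notdigit(id);        /* 5: the first trailing blank */
  trimmed = notdigit(trim(id));  /* 0: all characters are digits */
  put padded= trimmed=;
run;
```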

  • Note that there must be no space between % and the macro keyword, e.g. %macro, %let, %mend, %put, %eval.
  • To reference a macro variable, use the &varname. form (the trailing period marks the end of the variable name).

CODE:

%macro setyr;
%let yr = 14;
%let nextyr =%eval(&yr. +1);  /* 15 */
%let next2yr=%eval(&yr. +2);  /* 16 */
%let fisyr = "20%substr(&yr., 1,2)/%substr(&nextyr., 1,2)";   /* "2014/15" */
%let fisnextyr = "20%substr(&nextyr., 1,2)/%substr(&next2yr., 1, 2)";   /* "2015/16" */
%let year = "20&yr."; /* "2014" */
%put &yr;
%put &nextyr;
%put &next2yr;
%put &fisyr;
%put &fisnextyr;
%put &year;
%mend setyr;
%setyr;

OUTPUT:
14
15
16
"2014/15"
"2015/16"
"2014"

B. Use Vvalue function
Vvalue() returns the formatted value that is associated with the variable, as a character string.

data want;
set have;
new_charvar1 = vvalue(formatted_numvar1);  *formatted means a format has been applied to the variable;
new_charvar2 = vvalue(formatted_charvar2); *the result is always character, whatever the type of the source variable;
run;
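
A minimal sketch of Vvalue() on a single formatted numeric variable. The variable names amount and amount_txt and the dollar10.2 format are chosen for illustration only.

```sas
data _null_;
  format amount dollar10.2;
  amount = 1234.5;
  /* vvalue returns the value as displayed with the variable's format */
  amount_txt = vvalue(amount);
  put amount_txt=;   /* character value holding the formatted text $1,234.50 */
run;
```

Unlike put(amount, dollar10.2), the format does not have to be repeated in the call; Vvalue() picks up whatever format is currently attached to the variable.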

SAS: Variables/Fields with same Prefix

I have a SAS file that stores students' course selections (up to 20 fields) and the grade for each course (up to 20 fields) for a given semester. To set up the SAS working file, I use the following data step. The course variables share the prefix “crs” and the grade variables share the prefix “grade”. It is more efficient to list the 20 course variables as “crs1-crs20” than to list each one individually.

%let varlist = crs1-crs20 grade1-grade20 studentid major;

data tem1;
set coursefile_fw16 (keep = &varlist.);
run;
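
An alternative sketch uses the colon (name prefix) list, which selects every variable whose name starts with the given prefix. This assumes the same coursefile_fw16 data set; tem2 is a hypothetical output name.

```sas
data tem2;
  /* crs: keeps all variables beginning with crs, grade: all beginning with grade */
  set coursefile_fw16 (keep = crs: grade: studentid major);
run;
```

The two forms are not always interchangeable: crs1-crs20 keeps exactly those 20 numbered variables, while crs: would also keep any other variable that happens to start with crs.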