SAS: Update values with the last record and keep other attributes of the first record by group

The list contains course grades for students and the year students took the course.  It is possible that students can retake the course to improve their gpa.  We want a list that the last records of the students are showing all the grades that they took in the past and if a course was taken multiple times then the most recenct grade of that course.  We also want the last records of the student to show the initial year that the students were taking these course.

The following marco used a temp variable to hold the grade of the course and retain the value for the same student and if the course is retaken, the temp variable will be assigned with new grade, and if the course grade is empty, then the value retained in the  temp variable will be assigned to the course grade.

%macro lst (in =, out=, crs=);
data &out.;
drop temp ;
set &in.;
by id;
length temp $2 firstcrsyr $4 ;
retain temp firstcrsyr ;
if first.id then do; temp = '' ; firstcrsyr = yr; end;
/* Assign TEMP when courseX is non-missing /
if &crs. ne '' then temp=&crs.;
/ When X is missing, assign the retained value of TEMP into courseX */
else if &crs. eq '' then &crs.=temp;
run;
%mend lst;
%lst (in = list1, out = list2, crs= course1);
%lst (in = list2, out = list2, crs= course2);
%lst (in = list2, out = list2, crs= course3);

SAS: Cumulative Count by Study and Control Group

This example is used to set up pairing id (PAIRID) for study group and control group. Variable INDEX is the matching character. Variable STUDY identify the group membership (0 = control group; 1= study group).  Variable ID is the unique identifier. In the SAMPLE dataset, number of membership are equal for both the control group and study group at each strata.

data pair;
set sample;
by study index;
count =1;
if first.study then pairid = 0;
pairid +count;
drop count;
run;

*Not using retain statement.

SAS: Proc SQL Where … In to subset

SQL where … in can be used to subsetting dataset which meet contain condition and the condition is in a separate dataset.

Table ONE is a large dataset and contains variable ID and other related information. Table TWO contains unique ID that of interest.  Table THREE is the subset of table ONE which match the ID value with table TWO.

proc sql;
create table THREE as
select *
from ONE
where ID in (select ID from TWO);

 

Data: Curl Installation and batch download PDF files from a website

cURL Installation

  1. download carcet.pem from  https://curl.haxx.se/docs/caextract.html
  2. download curl from https://curl.haxx.se/download.html (I use a windows system, so I download the binary file for windows 64 bit)
  3. extract curl to c:\curl and put carcet.pem under c:\curl\bin folder
  4. add environment variable curl with the path of curl.exe

Curl is useful for downloading files from the website.  The basic command is the following:

curl -O url

This command needs be run under the c:\curl\bin folder. The files are downloaded to C:\Users\username\

Purpose: download SAS 9.3 user guide pdf files from https://support.sas.com/documentation/onlinedoc/stat/930/

  1. In window cmd, under c:\curl\bin, run curl -o index https://support.sas.com/documentation/onlinedoc/stat/930/
    • generate the index file which contains the wrapped source html code of the webpage
  2.  open git bash, run following
      1. cd c
      2. cd curl
      3. cd bin
      4. grep -i pdf  index > list
  3. list contains the href=”*.pdf”. Use Excel text to column to get only the name of the pdf files.
  4. open list in notepad++ and at the bottom of the window, it shows “Windows (CR LF)”,  right click and select “Unix (LF)”.  This will solve the error “curl: (3) Illegal characters found in URL”
  5. Start a new bash file in notepad++ with the following code
    1. echo “Start!”
      url=https://support.sas.com/documentation/onlinedoc/stat/930/
      while read query
      do
      curl -O “$url${query}”
      echo $url${query}
      done < list
    2. save as echo
  6. in git bash, navigate to where the echo file is and run following
    1. bash echo

Check SAS license and components installed

proc setinit; run;

Proc setinit provides the components list with the expiration dates.

—Base SAS Software
—SAS/STAT
—SAS/GRAPH
—SAS/ETS
—SAS/FSP
—SAS/OR
—SAS/AF
—SAS/IML
—SAS/QC
—SAS/SHARE
—SAS/LAB
—SAS/ASSIST
—SAS/CONNECT
—SAS/INSIGHT
—SAS/EIS
—SAS/SHARE*NET
—MDDB Server common products
—SAS Integration Technologies
—SAS/Secure Windows
—SAS Enterprise Guide
—SAS Bridge for ESRI
—OR OPT
—OR PRS
—OR IVS
—OR LSO
—SAS/ACCESS Interface to DB2
—SAS/ACCESS Interface to Oracle
—SAS/ACCESS Interface to Sybase
—SAS/ACCESS Interface to PC Files
—SAS/ACCESS Interface to ODBC
—SAS/ACCESS Interface to OLE DB
—SAS/ACCESS Interface to Teradata
—SAS/ACCESS Interface to MySQL
—SAS/IML Studio
—SAS Workspace Server for Local Access
—SAS/ACCESS Interface to Netezza
—SAS/ACCESS Interface to Aster nCluster
—SAS/ACCESS Interface to Greenplum
—SAS/ACCESS Interface to Sybase IQ
—DataFlux Trans DB Driver
—SAS Framework Data Server
—Reserved for Dataflux

proc product_status; run; 

Get the version of the product.

SAS Proc Upload and Download

SAS server processes dataset at a much faster rate than the local desktop.  To send the datasets from the WORK library of the desktop to the WORK library of the server,  and get the processed datasets from WORK library of the server to the WORK library of the desktop, we need to use PROC UPLOAD and PROC DOWNLOAD.

proc update data =localdata out =serverdata; run;
…code…
proc download data=serverdata out =localdata; run;

SAS Array example

In the data set, the student courses have 20 fields and the student grades have 20 fields and the courses and grades are one to one match.  The following codes can be used to get the course name from course field and grade from the grade field.

array crs {20} crs1-crs20;
array course {20} course1-course20 ;
array grade {20} grade1-grade20;
array gr {20} gr1-gr20 ;
do i = 1 to 20;
if crs{i} ne ” then course{i} = substr(crs{i},9,10);
if grade{i} ne ” then gr{i} =scan(grade{i},5,’\’);
end;
drop crs1-crs20 grade1-grade20;

Second example produces the retention results for each student based on whether students come back next year. cumm_crYr variables the accumulative credits at the end of year X of the program; ProgramYr variables are the name of the program the students registed at the end of year X.  If the student didn’t register for the year, there will be no value in cummcrYr or programYr. Therefore, if the student has accumulative credits calculated for the next year then the student is retained one year later and the isRetainYr1 will equal to 1.  The array evaluates whether the value for cumm_Yr2/programYr2 is null, if not, then is RetainYr2 is 1, etc.

data two;
set one;
length isRetainYr1-isRetainYr4 8;
array cumm_crYr{5} cumm_crYr1-cumm_crYr5;
array isRetainYr{4} isRetainYr1-isRetainYr4;
array programYr{5} programYr1-programYr5;
do i = 1 to 4;
j=i+1;
if pgmentyr*1 +i -1 <= 2016 then do;
if cumm_crYr{j} ne . or programYr{j} ne ” then isRetainYr{i} = 1;
else isRetainYr{i} = 0; end;
end;
run;

SAS Regression Basic 1

Step 1: Organize dataset – from wide to long. Categorical variables, possible predictor/explanatory/independent variables and the outcome/response/dependent variable all in columns.

Step 2: Proc Means – Find descriptive statistics for all the numeric variables by categorical variables.

%macro mean (data=, byvar=, var=);
proc sort data = &data.;
by &byvar.;
run;
proc means data = &data.;
by &byvar.;
var &var.;
run;
%mend mean;

Step 3: Proc Sgpanel or Proc sgplot – Check the scatter plot of outcome variable and predictor variable by categorical variables.

%macro plotby (data=, title=, by= , x=, y= , lbl= );
proc sgpanel data =&data.;
title &title.;
panelby &by. / columns =2 rows =4;
scatter x = &x. y=&y. /datalabel=&lbl.;
run;
%mend plotby;
  • title macro variable needs to be in “”;
  • by mcro variable is categorical;
  • Use COLUMNS and ROWS options in the PANELBY statement to define the grid layout of the plot.  It can be determined by the number of category and the preferred size of the plot;
  • Use uniscale in the PANELBY statement if the shared column or row axis needs to be identical. The default is ALL. UNISCALE= COLUMN | ROW | ALL
  • To give label for each datapoint in the plot, use DATALABEL option in the SCATTER statement;
  • Use ROWAXIS and COLAXIS to change the default axis value;

 

SAS Error: Java proxy is not responding

reference: http://support.sas.com/kb/44/853.html

Two check-points for Java environment set up for SAS 9.3.

  • C:\Program Files\SASHome\sassw.config
    • REHOME=C:\Program Files (x86)\Java\jre1.6.0_24\bin\java.exe
  • C:\Program Files\SASHome\x86\SASFoundation\9.3\nls\en\sasv9.cfg
    • Dsas.jre.libjvm=C:\Program Files (x86)\Java\jre7\bin\client\jvm.dll

In the second check point, the Dsas.jre setup is crucial because when updating java to a higher version, the jvm.dll will be deleted and the newer version of java doesn’t have the jvm.dll.  In the sasv9.cfg, the directory for Dsas.jre.libjvm should be point to the older version where the jvm.dll file still exists.

Download previous version from a third party link here:

https://support.sas.com/en/documentation/third-party-software-reference/9-3/support-for-jre.html

Diagnostics:

  1. run notepad as administrator
  2. open C:\Program Files\SASHome\x86\sassw.config
  3. verify the JRE that was used during installation process.
  4. current JRE configuration is JREHOME=C:\Program Files (x86)\Java\jre1.6.0_24\bin\java.exe
  5. Go to windows explorer and find java.exe doesn’t exist in the folder.
  6. There is another version of java under the jdk1.7.0_55 folder. This is a development package.  It doesn’t contain the jvm.dll file for SAS to access a JVM for JNI processing
  7. Download jre6 for windows at the third party link above. Pick the version that the filename contains “jre” and “i586”. I download the “Windows Offlline Installation” version under “Java SE Runtime Environment 6u22” which works.  The filename of the installation file is jre-6u22-windows-i586.exe.
  8. Install java run envirnoment.
  9. Use notepad to exit sasv9.cfg and update the directory for dsas.jre.libjvm to Dsas.jre.libjvm=C:\Program Files (x86)\Java\jre6\bin\client\jvm.dll
  10. The Java error message in sas disappeared.