Delete dataset

Library has to be specified using library= option.  Use “delete libraryname.tablea” doesn’t work. Option “noprint” causes error.

rsubmit;
proc datasets library=libraryname nodetails nolist;
delete tablea;
run;
endrsubmit;

 

Import CSV, DBF, and xlsx datasets with data compression

  1. Import table1.xlsx
PROC IMPORT OUT= WORK.table1
DATAFILE= "X:\GROUPS\Surveys\
results\4. Data\table1.xlsx"
DBMS=EXCEL REPLACE;
RANGE="table1$";
GETNAMES=YES;
MIXED=NO;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;
  1. Import table1.csv. Data file is systematically saved in the year folder

Method 1:

%macro survey (strtyear, stopyear);
%do yr= &strtyear %to &stopyear;
%put &yr.;
data WORK.table_&yr. ;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile "X:\GROUPS\Surveys\&yr.\results\4. Data\table1.csv" delimiter = ',' TRUNCOVER DSD lrecl=32767
firstobs=2 ;
informat field1 $1. ;
informat field2 $4. ;

format field1 $1. ;
format field2 $4. ;

input
field1 $
field2 $;

label
field1 = "Label 2"
field2= "Label 2"

if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
run;
%end;
%mend survey;
%survey (2007, 2011);

Method 2:

PROC IMPORT OUT= WORK.table_2013
DATAFILE= “L:\GROUPS\Surveys\2013\Results\4. Data\table1.csv”
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;

  1. Import dbf dataset
PROC IMPORT OUT= WORK.table_2013 (compress=yes)
DATAFILE= "L:\GROUPS\Surveys\2013\Results\4. Data\table1.dbf"
DBMS=DBF REPLACE;
GETDELETED=NO;
RUN;

Compression datasets

  1. Compressing Text data

(compress = yes)

  1. Compressing Numeric data

(compress=binary)

 

R Online Reference and Resources

reference site:

http:/www.bi-notes.com

Tutorial: From 0 to R with Google Analytics

https://developers.google.com/analytics/solutions/r-google-analytics

Blog

http://www.tmllr.com/?utm_source=analyticsdemystified.com&utm_medium=social&utm_content=tims_r_post&utm_campaign=gilliganisawesome

From level[1] to level[2] with R and Google Analytics: ggplot2

http://www.tmllr.com/from-level1-to-level2-with-r-and-google-analytics-ggplot2/

Crawling Websites

Found a very interesting blog on “Simple Web Crawler with SAS Macro and SAS Data Step” at http://www.sascommunity.org/wiki/Simple_Web_Crawler_with_SAS_Macro_and_SAS_Data_Step with the following code to strip url from the listed websites.

data work.links_to_crawl;
length url $256;
input url $;
datalines;
http://www.yahoo.com
http://www.sas.com
http://www.google.com
;
run;

%macro crawler();
%let html_num = 1;

data work.links_crawled;
length url $256;
run;

%next_crawl:
/* pop the next url off */
%let next_url = ;

data work.links_to_crawl;
set work.links_to_crawl;
if _n_ eq 1 then call symput(“next_url”, url);
else output;
run;

%let next_url = %trim(%left(&next_url));

%if “&next_url” ne “” %then %do;

%put crawling &next_url … ;

/* crawl the url */
filename _nexturl url “&next_url”;

/* put the file we crawled here */
filename htmlfile “file%trim(&html_num).html”;

/* find more urls */
data work._urls(keep=url);
length url $256 ;
file htmlfile;
infile _nexturl length=len;
input text $varying2000. len;

put text;

start = 1;
stop = length(text);

if _n_ = 1 then do;
retain patternID;
pattern = ‘/href=”([^”]+)”/i’;
patternID = prxparse(pattern);
end;

/* Use PRXNEXT to find the first instance of the pattern, */
/* then use DO WHILE to find all further instances. */
/* PRXNEXT changes the start parameter so that searching */
/* begins again after the last match. */
call prxnext(patternID, start, stop, text, position, length);
do while (position ^= 0);
url = substr(text, position+6, length-7);
* put url=;
output;
call prxnext(patternID, start, stop, text, position, length);
end;
run;

/* add the current link to the list of urls we have already crawled */
data work._old_link;
url = “&next_url”;
run;
proc append base=work.links_crawled data=work._old_link;
run;

/* only add urls that we haven’t already crawled or that aren’t queued up to be crawled */
proc sql noprint;
create table work._append as
select url
from work._urls
where url not in (select url from work.links_crawled)
and url not in (select url from work.links_to_crawl);
quit;

/* only add urls that are absolute (http://…) */
data work._append;
set work._append;
absolute_url = substrn(url, 1, 7);
put absolute_url=;
if absolute_url eq “http://” ;
drop absolute_url;
run;

/* add new links */
proc append base=work.links_to_crawl data=work._append force;
run;

/* increment our file number */
%let html_num = %eval(&html_num + 1);

/* loop */
%goto next_crawl;
%end;

%mend crawler;

%crawler();

 

Misc: Open EML files without launch mail system

One of my co-worker was sent about 30 EML files and each has attachment that needs to be saved to the hard drive.  If you click the eml file, the mail system will launch and then you need to type in the user name and password to open each eml file.  I was looking into ways can save time  to access the attachment and make the process more efficient. I came across this software EML File Viewer on the developer’s website http://www.freeviewer.org/eml/ that does exactly the work I was looking for.  The view can search the eml files on your drive and identify and preview the attachment for each in separate window.  There is a free version you can download to try out.

 

Access VBA – conditional change decimal place for report fields

The “Terms to Graduation” data is given 2 decimal places to help identify the small differences from year to year, while the data of the rest of the report are in integers.

VBA coding:

Private Sub Detail_Format(Cancel As Integer, FormatCount As Integer)
If Me![Table Name].Value = “F. Terms to Graduation (Calendar Year)” Then
Me.[2008].DecimalPlaces = “2”
Me.[2009].DecimalPlaces = “2”
Me.[2010].DecimalPlaces = “2”
Me.[2011].DecimalPlaces = “2”
Me.[2012].DecimalPlaces = “2”
Me.[2013].DecimalPlaces = “2”
Me.[2014].DecimalPlaces = “2”
Else
Me.[2008].DecimalPlaces = “0”
Me.[2009].DecimalPlaces = “0”
Me.[2010].DecimalPlaces = “0”
Me.[2011].DecimalPlaces = “0”
Me.[2012].DecimalPlaces = “0”
Me.[2013].DecimalPlaces = “0”
Me.[2014].DecimalPlaces = “0”
End If
End Sub

Report results:

access2-1

Y. I. An