SAS: Identfy and Remove Duplicate Records

Step 1: Check duplicate records by one field or a combination of fields.

proc freq data=temp noprint;
table var1 /out =dupl (keep =var1 count where = (count >1));
run;

proc freq data=temp noprint;
table var1*var2 /out =dupl (keep =var1 var2 count where = (count >1));
run;

Examine the output to see what circumstances can cause duplicate records.

Step 2: Remove duplicate records.

proc sort data=temp NODUPKEY;
by var1 ;
run;

proc sort data=temp NODUPKEY;
by var1 var2 ;
run;

Alternative method to output unique records and duplicated records in two separate datasets.

proc sort data =temp out=temp1;
by var1 var2;
run;
data unique dup;
set temp1;
by var1 var2;
if first.var2 and last.var2 then output unique;
else output dup;
run;