You can easily compare two tables in SAS using different methods. Comparison can be done on sampled data or the entire datasets.
The following methods can be used to compare two tables by identifying matching, non-matching, and common observations from both the datasets.
- PROC SQL
- DATA MERGE
- PROC COMPARE
The following two sample datasets will be used to demonstrate how to compare two tables in SAS using multiple sas code examples.
/* create two sample datasets */
data data_one(label='First Data Set');
input student $ marks major $;
datalines;
2000 87 Math
2042 92 AR/VR
2095 92 History
2187 95 Art
;
run;
data data_two(label='Second Data Set');
input student $ marks major $;
datalines;
2000 87 Math
2042 92 AR/VR
2095 92 History
2187 94 Music
2204 96 Robotics
2500 98 AI/ML
;
run;
/* view datasets */
proc print data=data_one; title 'First Data Set - work.data_one';
proc print data=data_two; title 'Second Data Set - work.data_two';
run;
1. PROC SQL – Compare Two Tables In SAS
PROC SQL procedure can be used to compare and summarise the differences between two datasets using EXCEPT & INTERSECT statements.
CASE 1: Finding rows present in the first dataset but don’t have the exact matching rows in the second dataset.
/* comparing two tables in sas */
proc sql;
title "Rows present in data_one but NOT in data_two";
select * from data_one except select * from data_two;
quit;
CASE 2: Finding rows present in the second dataset but don’t have the exact matching rows in the first dataset.
/* comparing two tables in sas */
proc sql;
title "Rows present in data_two but NOT in data_one";
select * from data_two except select * from data_one;
quit;
CASE 3: Finding common rows from both the datasets
/* finding common rows from both the tables in sas */
proc sql;
title "Common rows in both the tables";
select * from data_one intersect select * from data_two;
quit;
2. DATA MERGE – Compare Two Tables In SAS
DATA MERGE statement can be used to compare and summarise the differences between two datasets using the IN statement. If N= options applied on both the datasets, it flags value 1 for respective datasets.
MERGE technique only works with the sorted datasets. It means before using data merge you must sort both the datasets based on columns to be used for comparison. If you want to compare two tables across all the columns then you can use _ALL_ in proc sort procedure.
/* sort datasets by all the variables */
proc sort data=data_one;
by _all_;
run;
proc sort data=data_two;
by _all_;
run;
In this example we will create a separate dataset for each case. Those comparison summary datasets can be used for further analysis.
Following datasets will be created:
- work.in_dataOne: data present in data_one but missing in data_two
- work.in_dataTwo: data present in data_two but missing in data_one
- work.in_both: common rows present in both the tables
/* compare datasets using data merge technique */
data in_dataOne in_dataTwo in_both;
merge data_one(in=in_a) data_two(in=in_b);
by _all_; /* compare data across the columns */
/* data present in data_one but missing in data_two */
if in_a and not in_b then output in_dataOne;
/* data present in data_two but missing in data_one */
if in_b and not in_a then output in_dataTwo;
/* common rows present in both the tables*/
if in_a and in_b then output in_both;
run;
/* view output datasets */
proc print data=in_dataOne;
title "Rows present in data_one but NOT in data_two";
proc print data=in_dataTwo;
title "Rows present in data_two but NOT in data_one";
proc print data=in_both;
title "Common rows in both the tables";
run;
3. PROC COMPARE – Compare Two Tables In SAS
SAS has a dedicated proc compare procedure to compare and summarise differences between two datasets and prints comparison report in the output. This procedure doesn’t generate separate output datasets unless and until it’s not specified with the optional features of the procedure.
It’s possible to create an output dataset by using OUT=<data-set> option with compare procedure. Anyway, we don’t need to go through this now. Let’s focus on a simple proc compare procedure that gives us comparison report.
/* compare two tables using proc compare */
proc compare base=work.data_one compare=work.data_two;
run;
FAQ – Compare Two Tables In SAS
You can compare two datasets, identify common rows using PROC SQL procedure with INTERSECT statement.
You also achieve the same using DATA MERGE technique with IN=parameter in the data step.
Yes. You can create output datasets that stores the data using OUT=, OUTDIF=, OUTDIFF= options in PROC COMPARE procedure.