5 Ways To Combining and Appending SAS Datasets

There are multiple methods that can be used to combine or append two or more datasets in SAS. In this article you’ll learn 5 simple ways to combine and append SAS datasets as follows.

  • Appending Or Concatenating Two Datasets Vertically – Stacking The Datasets
  • Appending Or Concatenating Stacked Sorted Data Values
  • PROC APPEND To Concatenate Datasets
  • PROC DATASETS To Concatenate Datasets
  • PROC SQL To Concatenate Datasets

We will explore each of these methods to combine and append datasets in SAS using the following sample data sets work.University1 and work.University2

/* create sample datasets */
data University1;
	input student $ grade $;
	datalines;
Jackobe A 
Kjell   B 
Morten  A 
Steven  C 
Peter   A
;
proc print;
title 'Sample Dataset: work.University1';
run;


data University2;
	input student $ grade $;
	datalines;
Anna    B
Jan     A
Nils    A
Kristin C
Michael A
;
proc print;
title 'Sample Dataset: work.University2';
run;
How to combine and append sas datasets

5 Ways To Combine and Append SAS Datasets

 

1. Appending Or Concatenating Two Datasets Vertically – Stacking The Datasets

You can append data vertically using the SET statement in the data step. It reads an observation from one or more SAS data sets. You can also say, it stacks the multiple datasets one after another. 

The best part is you can combine or append many datasets by listening after the SET statement

In this example two sample datasets University1 and University2 will be stacked vertically and created a new dataset named work.CombinedUniversity

/* Appending Or Concatenating Two Datasets Vertically - Stacking the datasets  */
data CombinedUniversity;
	set University1 University2;
run;

/* view dataset */
proc print data=CombinedUniversity;
	title 'CombinedUniversity Dataset';
run;
Combine and append sas datasets

2. Appending Or Concatenating Stacked Sorted Data Values

This method is quite similar to the previous one. The only minor difference here is it stack the rows after sorting the dataset. It means rows from multiple datasets will append together but after sorting it out on a specified BY variable in the data step.

The BY statement controls the operation of a SET, MERGE, MODIFY, or UPDATE statement in the DATA step and sets up special grouping variables. 

Prerequisite: Before you use BY statement to combine or append multiple datasets with SET statement, all of your input datasets must be sorted on BY variable. 

In this example two input datasets University1 and University2 must be sorted on BY variable “student” using the PROC SORT procedure.

/* sorting datasets on BY variable */
proc sort data=University1;
	by student;
run;

proc sort data=University2;
	by student;
run;

/* Appending Or Concatenating Stacked Sorted Data Values */
data CombinedUniversity;
	set University1 University2;
	by student;
run;

/* view dataset */
proc print data=CombinedUniversity;
run;
Appending sorted SAS datasets

If you compare the output of this query with the previous method, you can clearly see the sequence of the rows has changed due to the sorted stacked method used to combine and append datasets.

3. PROC APPEND To Concatenate Datasets

This is a very popular SAS procedure when it comes to appending the two datasets. The only drawback with this method is, it can only appends two datasets at a time. Whereas first methods can combine and append two or more dataset at the same time.

You could append more than two datasets using the proc append method but you have to write this procedure multiple times in order to cover all the datasets which you want to append. 

In the proc append procedure base=dataset is the master dataset and data=dataset is the dataset which is to be appended to the master dataset. 

If the master-dataset doesn’t exist then it creates a new dataset and appends the data. In this example work.CombinedUniversity dataset does not exist in the beginning so when the first proc append procedure runs it creates work.CombinedUniversity dataset and appends work.University1 dataset. 

The second proc append procedure appends the work.University2 dataset to the CombinedUniversity dataset.

/* PROC APPEND To Concatenate Datasets */
proc append base=CombinedUniversity data=University1;
run;

proc append base=CombinedUniversity data=University2;
run;

/* view dataset */
proc print data=CombinedUniversity;
run;
Combine and append sas datasets

Did you know? How to use the PROC APPEND in 10 different ways.

4. PROC DATASETS To Concatenate Datasets

This is a slightly different version of the proc append procedure. With the PROC DATASETS procedure you can use the APPEND statement to append multiple datasets.

Here also for the first time if base=dataset doesn’t exist then it creates a new dataset and appends data from data=dataset. You must write the multiple append statement to append more than one datasets.

/* PROC DATASETS To Concatenate Datasets */
proc datasets lib=work memtype=data;
	append base=CombinedUniversity data=University1;
	append base=CombinedUniversity data=University2;
	run;
quit;

/* view dataset */
proc print data=CombinedUniversity;
	title 'CombinedUniversity Dataset';
run;
Combine and append sas datasets

5. PROC SQL To Concatenate Datasets

You can use the PROC SQL method to combine and append multiple datasets in SAS. There are two simple methods which can be used to combine and append SAS datasets using proc sql procedure.

  1. PROC SQL With UNION ALL option
  2. PROC SQL With OUTER UNION Corr option

1. PROC SQL With UNION ALL option

You can select all the observations from two or more datasets and combine them together using the UNION ALL option.

The UNION option processes unique rows from both the queries. But you can preserve duplicate rows as well using ALL keyword

The optional ALL keyword preserves the duplicate rows, reduces the execution by one step, and thereby improves the query-expression’s performance.

/* PROC SQL To Concatenate Datasets: using union all */
proc sql;
	create table combineduniversity as 
	select * from University1 
	union all 
	select * from University2;
quit;
proc print;
   title 'A UNION ALL B';
run;
PROC SQL To Concatenate Datasets using union all

2. PROC SQL With OUTER UNION Corr Option

The OUTER UNION option concatenates the query results. The CORR which is also known as CORRESPONDING option causes PROC SQL to match the columns in table-expressions by name and not by ordinal position.

/* PROC SQL To Concatenate Datasets: using outer union corr */
proc sql;
	create table CombinedUniversity as 
	select * from University1 
	outer union corR
	select * from University2 ;
quit;
proc print;
title ' A OUTER UNION CORR B';
run;
PROC SQL To Concatenate Datasets using outer union corr

FAQ

Can I combine or append datasets with different structures in SAS?

Yes, you can combine or append datasets with different structures in SAS, but you need to ensure that the variables you are combining or appending are compatible.

What happens if there are conflicts when combining or appending datasets in SAS? 

If there are conflicts when combining or appending datasets in SAS, such as variables with the same name but different types, SAS will give an error. You need to resolve these conflicts before combining or appending.

What does it mean to combine and append datasets in SAS? 

Combining and appending datasets in SAS involves bringing together data from multiple datasets into one. This can be done either by adding new observations (appending) or new variables (combining).