With the PROC TRANSPOSE procedure you can create an output data set by restructuring the values in a SAS data set, transposing selected variables into observations.
Usually datasets are structured but sometimes you may get the data in different formats, depending on how data is being collected at source.
There are other external factors such as convenience, costing, and third party vendors, etc that also affects the structure of input data.
But you may want to arrange the data in a specific format and proc transpose procedure could be the best solution to standardize your data.
A Simple Definition
In a simple word, proc transpose switches the variable into observations or observations into variables in SAS dataset.
Here is the simplest illustration of how TRANSPOSE procedure works in real life:
The TRANSPOSE procedure can often eliminate the need to write a lengthy DATA step to achieve the same result.
Further, the output data set can be used in subsequent DATA or PROC steps for analysis, reporting, or further data manipulation.
PROC TRANSPOSE does not produce printed output. To print the output data set from the PROC TRANSPOSE step, use PROC PRINT, PROC REPORT, or another SAS reporting tool.
Syntax:
/* proc transpose syntax */
PROC TRANSPOSE <DATA=input-data-set> <DELIMITER=delimiter>
<OUT=output-data-set> <PREFIX=prefix> <SUFFIX=suffix>;
BY <variable-1 variable-2 ...>;
COPY variable(s);
ID variable;
IDLABEL variable;
VAR variable(s);
RUN;
Explanation:
- BY – Transpose each BY group
- COPY – Copy variables directly without transposing them
- ID – Specify a variable whose values name the transposed variables
- IDLABEL – Create labels for the transposed variables
- VAR – List the variables to transpose
The following sample dataset will be used to demonstrate multiple use cases with the examples of how to use proc transpose in SAS.
/* Create sample dataset */
data SalesReport;
input Region $9. store $ Manager $ Department $ Sales 8. Workers 8.;
datalines;
Southeast Store-A Hayes Produce 100 4
Southeast Store-B Hayes Meat 80 5
Northwest Store-C Jeffreys Paper 60 2
Northwest Store-D Jeffreys Canned 420 6
Northwest Store-E Duncan Canned 230 4
Northwest Store-F Duncan Meat 73 2
Southwest Store-G Royster Canned 120 3
Southwest Store-H Patel Produce 350 8
Southwest Store-I Patel Meat 80 3
Northeast Store-J Fuller Paper 200 5
;
run;
/* view dataset */
proc print data=SalesReport; run;
A Simple PROC TRANSPOSE Procedure
By default, the PROC TRANSPOSE transposes only the numeric variables. That’s why in the below example only two variables “Sales” and “Workers” will be transposed, because no VAR statement is used and none of the numeric variables appear in another statement.
The OUT= puts the result of the transposition in the data set transpose_SalesReport.
/* Simple PROC TRANSPOSE in SAS */
proc transpose data=SalesReport out=transpose_SalesReport;
run;
/*view dataset */
proc print data=transpose_SalesReport; run;
PROC TRANSPOSE With PREFIX=option
The PREFIX=option specifies a prefix to use in constructing names for transposed variables in the output data set.
Also, you can use NAME=name to give the name for the variable in the output data set that contains the name of the variable that is being transposed to create the current observation.
/* proc transpose with prefix= and name= */
proc transpose data=SalesReport out=transpose_SalesReport name=StoreSales_Report
prefix =Store_;
run;
/*view dataset */
proc print data=transpose_SalesReport; run;
PROC TRANSPOSE With PREFIX and SUFFIX= options
In the previous example we have seen how to use prefix= option. It’s similar to using the suffix=option.
It specifies a suffix to use in constructing names for transposed variables in the output data set.
In the below example we have appended **_Amnt as a suffix in the transposed output variable names.
/* proc transpose with prefix= and suffix= options*/
proc transpose data=SalesReport out=transpose_SalesReport name=StoreSales_Report
prefix=Store_ suffix=_Amnt;
run;
/*view dataset */
proc print data=transpose_SalesReport; run;
PROC TRANSPOSE By ID Variable
The transposed output dataset variables can be set using the ID statement in proc transpose. Values present in the ID variable would become the variable name.
In the following example we are transposing the “Store” and “Workers” (ID-variable) from rows to columns in the transposed output.
/* proc transpose by ID variable */
proc transpose data=SalesReport out=transpose_SalesReport;
id Store;
run;
/*view dataset */
proc print data=transpose_SalesReport; run;
Choosing The Transposed Variable
By default the proc transpose procedure transposes all the numeric variables present in the input dataset. You can limit the variable list by using the VAR statement and listing only variables that you want to transpose.
In the previous example the output generated for both the numeric variables: Sales and Workers. Let’s limit the transposed variables to have only a “sales” variable.
/* proc transpose by choosing transposed variable */
proc transpose data=SalesReport out=transpose_SalesReport;
id Store;
var Sales; /* variable selection */
run;
/*view dataset */
proc print data=transpose_SalesReport; run;
Use PROC TRANSPOSE With BY, ID, and VAR statements
In this example we will mention all three statements together, BY for groups, ID for variables to place along the columns, and VAR for variable selection.
Here are short description:
- by: The variable to place along the rows. e.g. store
- id: The variable to place along the columns. e.g. Department
- var: The variable whose values are placed within the dataset. e.g. Sales
The dataset must be sorted on the BY variables before using it in the proc transpose procedure.
/* Use PROC TRANSPONSE with By, ID, and VAR statements */
proc sort data=salesreport;
by Store;
run;
proc transpose data=SalesReport out=transpose_SalesReport;
by Store;
id Department;
var Sales;
run;
/*view dataset */
proc print data=transpose_SalesReport; run;
Drop _NAME_ Column From Transposed Output
If you’re not sorting the data beforehand then you must use NOTSORTED keyword with BY statement in the transpose procedure.
The _NAME_ column can be dropped from the output using drop=option with the OUT= statement.
/* drop _name_ and add notsorted keyword */
proc transpose data=SalesReport out=transpose_SalesReport(drop=_NAME_);
by Store NOTSORTED;
id Department;
var Sales;
run;
/*view dataset */
proc print data=transpose_SalesReport; run;
PROC TRANSPOSE With DELIMITER=option
When you add two variables in the ID statement, by default it appends the var 1 and var 2 names to create transposed dataset variable names.
For example: If you use ID variables – Store and Department.
The transposed output will have the variable names as values from Store column appended with values from Department column.
For example transposed dataset will have following column names: Store-AProduce, Store-BMeat, Store-CPaper, etc
With the DELIMITER option you can segregate these values and add some delimiters. In the below example “__” delimiter has been added between Store name and Department names.
Since we are not limiting the variables using the VAR statement the output will be generated for all the numeric variables, that is, Sales and Workers.
/* proc transpose with delimiter option */
proc transpose data=SalesReport out=transpose_SalesReport delimiter=__ name=StoresByDept;
id Store Department;
run;
/*view dataset */
proc print data=transpose_SalesReport; run;
FAQ – PROC TRANSPOSE In SAS
PROC TRANSPOSE is a SAS procedure that transposes selected variables into observations and creates an output dataset. It can be used to convert a dataset from a long format to a wide format.
You can use PROC TRANSPOSE to restructure your data by specifying the input dataset, the variable to place along the rows, the variable to place along the columns, and the variable whose values are placed within the dataset
You can use the DROP statement to drop the NAME variable when using PROC TRANSPOSE
Use PROC PRINT, PROC REPORT, or another SAS reporting tool to print the output dataset from PROC TRANSPOSE.