SAS Program Structure (DATA step, PROC step, & Output step)

SAS program structure is easy to understand, but first you should have a basic understanding of the SAS user interface, or at least be familiar with the SAS Studio web application.

A SAS program mainly consists of three steps: the DATA step, the PROC step, and the output step.

Apart from these, a few other elements such as data sets, labels, and variables are also part of the SAS program structure.

We usually begin with raw data that has not yet been processed by SAS. You use a DATA step, made up of a DATA statement and other SAS statements, to read the raw data and store it in a new SAS data set.

You can then use further DATA steps to modify that data set or apply business rules, and PROC steps to analyse it or create reports.

In other words, to perform basic operations such as reading data from a source, analysing it, modifying it according to business rules, and creating reports, you use SAS programming steps.
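The flow from raw data to report can be sketched roughly as follows. This is only an illustration: the data set names `orders` and `orders_flagged`, the variables `OrderId`, `Amount`, and `Size`, the values, and the "large order" rule are all made up for the example.

```sas
/* Step 1: read raw data into a SAS data set (values are made up). */
data orders;
  input OrderId Amount;
  datalines;
101 250
102 80
103 400
;
run;

/* Step 2: apply a hypothetical business rule in a second DATA step. */
data orders_flagged;
  set orders;
  length Size $ 5;
  if Amount >= 200 then Size = 'Large';
  else Size = 'Small';
run;

/* Step 3: report on the prepared data set. */
proc print data=orders_flagged;
run;
```

Each step reads the data set produced by the previous one, so the raw values only have to be read once.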

SAS Programming Structure

  1. DATA step
  2. PROC step
  3. OUTPUT step

The Data Step

The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement names the SAS data set, and the statements that follow build it.

When you submit a DATA step for execution, SAS first compiles it and checks its syntax. If the syntax is correct, the statements are then executed.

DATA step processing happens in two phases.

  1. Compilation phase
  2. Execution (Loop) phase

Compilation Phase: When you submit a DATA step for execution, SAS first checks the syntax. If no errors are found, SAS converts the statements into machine code.

During this phase SAS also creates three items: the input buffer, the program data vector (PDV), and the descriptor information.
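One way to look at the descriptor information of a data set is PROC CONTENTS, which lists data set attributes and each variable's name, type, and length. The sketch below assumes a small made-up data set named `demo`; the names and values are for illustration only.

```sas
/* Build a tiny data set so there is something to inspect. */
data demo;
  input Id Code $;
  datalines;
1 ARE
2 GBR
;
run;

/* Report the descriptor information: data set attributes
   plus the name, type, and length of each variable.      */
proc contents data=demo;
run;
```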

Execution Phase: In this phase SAS executes the statements in a loop. The input buffer, program data vector, and descriptor information are used to build the SAS data set one observation (record) at a time.

If there is another record to read, then the program executes again. SAS builds the second observation, and continues until there are no more records to read. The data set is then closed, and SAS goes on to the next DATA or PROC step.
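To watch this loop at work, you can write the automatic variable `_N_`, which counts DATA step iterations, to the SAS log with a PUT statement. The following is a minimal sketch; the data set name `loop_demo` and the variables `x`, `y`, and `total` are made up for the example.

```sas
data loop_demo;
  input x y;
  total = x + y;
  /* Write the iteration counter and the current values to the log. */
  put 'Iteration: ' _n_= x= y= total=;
  datalines;
1 2
3 4
5 6
;
run;
```

The log should contain one PUT line per input record, which shows that the statements run once for each observation.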

Syntax:

DATA data-set-name;            /* name the data set to create */
INPUT var1 var2;               /* variables to read; each data line is one record */
DATALINES;
value-of-var1 value-of-var2
value-of-var1 value-of-var2
;
RUN;

Let’s understand through a simple example:

data Country;
input IdNumber CountryName $ 2-24 CountryCode $ 25-28   ;
datalines;
1 United Arab Emirates  ARE
2 European Union        EUU
3 United Kingdom        GBR
4 Tunisia               TUN
5 United States         USA
;
run;

The PROC Step

The PROC step consists of a group of SAS statements that begins with a PROC statement. A PROC step analyses or processes the data in a data set by using SAS procedures such as PRINT, FREQ, MEANS, and SORT.

In the PROC step, SAS invokes built-in (pre-written) procedures to analyse data sets and display the results as a report.

Syntax:

PROC procedure-name options;   /* specify sas procedure name*/ 
RUN;
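The same pattern applies to procedures other than PRINT. As a rough sketch, the program below runs PROC SORT, PROC MEANS, and PROC FREQ on a small made-up data set; the names `scores`, `scores_sorted`, `Name`, `Team`, and `Score` are assumptions for illustration only.

```sas
/* Made-up data to run the procedures on. */
data scores;
  input Name $ Team $ Score;
  datalines;
Ann red 82
Bob blue 75
Cal red 91
Dee blue 68
;
run;

/* Sort by Score, highest first, into a new data set. */
proc sort data=scores out=scores_sorted;
  by descending Score;
run;

/* Summary statistics of Score for each team. */
proc means data=scores mean min max;
  class Team;
  var Score;
run;

/* Count of observations per team. */
proc freq data=scores;
  tables Team;
run;
```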

Let’s understand through an example of the DATA step followed by the PROC step.

data Country;             /* DATA step */
input IdNumber CountryName $ 2-24 CountryCode $ 25-28   ;
datalines;
1 United Arab Emirates  ARE
2 European Union        EUU
3 United Kingdom        GBR
4 Tunisia               TUN
5 United States         USA
;
run;

proc print data=Country;  /* PROC step */
run;

The Output Step

The output step displays the result of the data analysis done by the PROC step. The RUN statement marks the end of a DATA or PROC step and tells SAS to execute the statements of that step, which generates the output.

The RUN statement is not required between steps in a SAS program, because a new DATA or PROC statement also ends the previous step. However, it is a best practice to use a RUN statement because it makes the SAS program easier to read and the SAS log easier to understand when debugging.
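The sketch below illustrates that point by leaving out RUN between steps. It assumes the sample data set `sashelp.class` that ships with SAS is available in your session; the data set name `teens` is made up.

```sas
data teens;
  set sashelp.class;
  where Age >= 13;
/* No RUN here: the PROC statement below ends the DATA step. */
proc print data=teens;
/* No RUN here: the next PROC statement ends the PROC PRINT step. */
proc means data=teens;
  var Height;
run;  /* a final RUN so that the last step executes */
```

Compare this with the other examples on this page, where an explicit RUN closes every step and the step boundaries are easier to spot.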

Depending on the procedure and its options, the result of the analysis is either displayed or stored in an output data set. In SAS Studio, displayed output appears in the RESULTS tab.
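As a sketch of storing a result instead of displaying it, PROC MEANS can write its statistics to a data set with the OUTPUT statement. The names `weight_demo`, `team_stats`, and `AvgStartWeight` and the values below are made up for the example.

```sas
/* Made-up input data. */
data weight_demo;
  input Team $ StartWeight;
  datalines;
red 289
yellow 145
red 210
yellow 294
;
run;

/* NOPRINT suppresses the displayed report; OUTPUT stores the
   per-team mean in a new data set. NWAY keeps only the rows
   for each Team and drops the overall summary row.           */
proc means data=weight_demo noprint nway;
  class Team;
  var StartWeight;
  output out=team_stats mean=AvgStartWeight;
run;

/* Display the stored result. */
proc print data=team_stats;
run;
```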

Example:

data weight;
  input IdNumber Name $ 6-20 Team $ 22-27 StartWeight EndWeight;
  datalines;
1023 John Shaw       red    289 165
1049 Lilo Serrano    yellow 145 124
1219 Kane Nance      red    210 192
1246 Jay Sinha       yellow 294 177
1078 Ashley McKnight red    127 118
1221 Jim Bane        yellow 220 .
;
run;
proc print data=weight;  /* Output step */
  title 'Players with start weight 200 and more';
  where  StartWeight >= 200 ; 
run;

That’s it about the SAS program structure. You may check out our other SAS programming tutorials to master data analytics, business intelligence, and cloud technology.

FAQ

What is SAS programming structure?

To perform basic operations such as reading data from a source, analysing it, modifying it according to business rules, and creating reports, you use SAS programming steps and follow a particular structure.

A SAS program mainly consists of three steps: the DATA step, the PROC step, and the output step.

  1. DATA step
  2. PROC step
  3. OUTPUT step