Getting Started with SAS: Beginner Guide

This is a quick starting point for new SAS users. It covers the essentials you need to get up and running, moving from basic concepts to more advanced ones with real-life examples.

What is SAS?

SAS (“Statistical Analysis System”) is a statistical software suite developed by SAS Institute for data management, advanced analytics, business intelligence, graphics, predictive analytics, forecasting, and much more.

Let’s get started.

Base SAS

Base SAS is the core foundation for a variety of data management and analytical software components offered by SAS.

How you use SAS depends on what you want to accomplish. Some people use it extensively for many different purposes; others use it for a few very specific tasks.

What can you do with SAS?

Here is a broad, though not exhaustive, list of what you can do with SAS in the analytics industry.

  • Store data in SAS tables
  • Access data in almost any format
  • Manage and manipulate data
  • Build reports and dashboards
  • Save reports in a wide variety of formats, including HTML, PDF, and RTF
  • Creating Safe Drugs & Clinical Research
  • Business Intelligence
  • Predictive Analytics
  • Forecasting
  • Data mining
  • Machine Learning
  • Deep Learning
  • Build AI/ML models

Getting Started with SAS Essentials

Base SAS software contains the following:

  • a data management facility
  • a programming language
  • data analysis and reporting utilities

Once you use Base SAS regularly, you will find it easier to understand how other SAS products work, because most of them ultimately generate Base SAS code in the backend.

For example, in SAS DI Studio you create jobs to cleanse, transform, join, or alter data. Behind the scenes, DI Studio automatically generates the SAS code for each job.

Before we dive deeper into those areas, let’s look at the two most common and important SAS programming elements.

DATA Step

A DATA step is any portion of a SAS program that begins with a DATA statement and ends with a RUN statement, another DATA statement, or a PROC statement.

DATA step functionality:

  • reads, transforms, and outputs data.
  • runs in a single thread in SAS or in multiple threads in SAS Cloud Analytic Services (CAS). CAS is a cloud-based run-time environment that enables massively parallel execution.
  • has a built-in looping functionality that automatically reads records from external files until the end of file is reached.
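To see the built-in loop at work, here is a minimal sketch. The data values and the variable names Team and Score are made up for illustration; _N_ is the automatic variable SAS uses to count DATA step iterations.

```sas
/* Hypothetical data: the DATA step runs once per data line, with no explicit loop */
data _null_;
   input Team $ Score;
   /* _N_ counts iterations of the implicit loop; PUT writes to the log */
   put _N_= Team= Score=;
   datalines;
red 10
yellow 12
red 7
;
run;
```

The log should show one line per data record, and the step stops by itself when the data lines run out.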

There are four statements that are commonly used in the DATA step:

  • DATA statement names the data set.
  • INPUT statement lists the names of the variables.
  • CARDS statement (an alias of DATALINES, which is used in the example below) indicates that data lines immediately follow.
  • INFILE statement indicates that the data is in an external file and gives the name of the file.

/* Name is read with column input, so it must occupy columns 6-24 of each data line */
DATA WORK.GAME;
INPUT IdNumber 1-4 Name $ 6-24 Team $ StartWeight EndWeight;
Loss=StartWeight-EndWeight; /* derived variable */
DATALINES;
1023 David Shaw          red 189 165
1049 Amelia Serrano      yellow 145 124
1219 Alan Nance          red 210 192
1246 Ravi Sinha          yellow 194 177
1078 Ashley McKnight     red 127 118
;
RUN;
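The example above reads in-stream data lines. For the INFILE statement, a rough sketch might look like the following. The fileref gamefile, the comma-separated layout, and the data set name are assumptions made for illustration; the sketch first writes a small temporary file so that it does not depend on a real path.

```sas
/* Temporary fileref (hypothetical name) so the example does not depend on a real path */
filename gamefile temp;

/* Write a small comma-separated file to read back */
data _null_;
   file gamefile;
   put '1023,David Shaw,red,189,165';
   put '1049,Amelia Serrano,yellow,145,124';
run;

/* INFILE points the DATA step at the file; INPUT then reads it record by record */
data work.game_from_file;
   infile gamefile dlm=',' dsd;
   input IdNumber Name :$19. Team $ StartWeight EndWeight;
   Loss = StartWeight - EndWeight;
run;

/* Release the temporary fileref */
filename gamefile clear;
```

In practice you would normally point INFILE at an existing file, either through a fileref like this one or with a quoted path.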

PROC step

The PROC step consists of a group of SAS statements that call and execute a built-in program known as a procedure. Use PROCs to analyse the data in a SAS data set, produce formatted reports or other results, or manage SAS files.

With PROC steps, you get the output you need by writing minimal code.

The output from a PROC step can provide descriptive statistics, frequency tables, cross-tabulation tables, tabular reports consisting of descriptive statistics, charts, plots, and so on. Output can also be in the form of an updated data set.

proc print data=SASHELP.CARS(obs=10); 
run;
[Output: PROC PRINT listing of the first 10 observations of SASHELP.CARS]
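The kinds of PROC output described above can be sketched with a few other procedures. This is only an illustration: it assumes the SASHELP.CARS sample table is available in your installation, and the output data set name cars_by_price is made up.

```sas
/* Descriptive statistics for two numeric variables */
proc means data=sashelp.cars n mean min max;
   var MSRP Horsepower;
run;

/* Cross-tabulation of two categorical variables */
proc freq data=sashelp.cars;
   tables Origin*Type;
run;

/* A PROC step whose output is a new data set rather than a report */
proc sort data=sashelp.cars out=work.cars_by_price;
   by descending MSRP;
run;
```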

Data Management

SAS organizes data into tables called SAS data sets. This is how a SAS data set looks when you open it; think of data entered into an Excel sheet.

[Figure: the SASHELP.CARS table]

In a SAS data set, information available in each row is referred to as an “observation”. Each column has the same type of data and is referred to as a variable.

In a SAS data set, an observation contains all the data values for an entity; a variable contains the same type of data value for all entities.
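One way to inspect observations and variables yourself is sketched below. It assumes the SASHELP.CARS sample table is available; the variables named in the VAR statement are columns of that table.

```sas
/* List the variables (columns) of the data set with their type and length */
proc contents data=sashelp.cars;
run;

/* Show the first five observations (rows) for a few chosen variables */
proc print data=sashelp.cars(obs=5);
   var Make Model Type MSRP;
run;
```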

To create a SAS data set in Base SAS, you write a program made of SAS statements. Typically, you use the DATA step to create a data set.

The following SAS program creates a SAS data set named GAME:

/* Name is read with column input, so it must occupy columns 6-24 of each data line */
data WORK.GAME;
input IdNumber 1-4 Name $ 6-24 Team $ StartWeight EndWeight;
Loss=StartWeight-EndWeight; /* derived variable */
datalines;
1023 David Shaw          red 189 165
1049 Amelia Serrano      yellow 145 124
1219 Alan Nance          red 210 192
1246 Ravi Sinha          yellow 194 177
1078 Ashley McKnight     red 127 118
;
run;

LOG:

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
68
69 DATA WORK.GAME;
70 INPUT IdNumber 1-4 Name $ 6-24 Team $ StartWeight EndWeight;
71 Loss=StartWeight-EndWeight;
72 DATALINES;
 
NOTE: The data set WORK.GAME has 5 observations and 6 variables.

Once you have accessed external data or created SAS data sets, you can start working with the data. The most common tasks are cleansing, deleting, updating, and transforming it. All of this falls under data management.
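As a rough sketch of these data management tasks, the DATA step below cleanses, updates, and transforms a copy of a table. It assumes the SASHELP.CARS sample table is available; the output name cars_clean and the new variable PriceGap are made up for illustration.

```sas
data work.cars_clean;
   set sashelp.cars;
   /* Cleanse: drop observations with a missing Cylinders value */
   if missing(Cylinders) then delete;
   /* Update: standardize the text in Make to upper case */
   Make = upcase(Make);
   /* Transform: derive a new variable from two existing ones */
   PriceGap = MSRP - Invoice;
run;
```

The original SASHELP.CARS table is left untouched; all changes land in the new WORK data set.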

Programming Language

In the example above, you created a new data set by writing SAS statements. That is already a small SAS program.

The SAS language consists of statements, expressions, functions and CALL routines, options, formats, informats, and more. Most of these have counterparts in other programming languages.

Like any other programming language, SAS also has certain rules to use all these elements while writing programs.
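To make these language elements concrete, here is a small sketch that uses one of each. The data set name, the variable names, and the values are invented for illustration only.

```sas
options nodate;                              /* system option */

data work.elements;
   RawDate  = '15MAR2024';                   /* assignment statement */
   SaleDate = input(RawDate, date9.);        /* INPUT function with the DATE9. informat */
   Amount   = 1234.5;
   Doubled  = Amount * 2;                    /* expression */
   Rounded  = round(Amount);                 /* function */
   call symputx('last_amount', Amount);      /* CALL routine: stores a value in a macro variable */
   format SaleDate date9. Amount dollar10.2; /* formats control how values are displayed */
run;
```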

Data analysis and reporting

SAS is a powerful language for performing data analysis and reporting quickly, even on very large data sets.

SAS has built-in programs known as SAS procedures. They are easy to use, and with minimal code you can achieve meaningful results.

SAS also offers a variety of products in which you can build reports and dashboards by using drag and drop elements.

For example, the following SAS program produces a report that displays the values of the variables in the SAS data set “GAME”.

proc print data=WORK.GAME;
title 'GAME - SAS Data set details';
run;

This procedure, known as the PRINT procedure, displays the variables in a simple, organised form.

[Output: PROC PRINT listing titled “GAME - SAS Data set details”]
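If you want the same kind of report saved as a file, one possible approach is the Output Delivery System (ODS). The sketch below assumes the SASHELP.CARS sample table is available; the file name cars_report.pdf is a placeholder and may need a full path that your SAS session is allowed to write to.

```sas
/* File name is a placeholder; adjust it to a location your session can write to */
ods pdf file="cars_report.pdf";

title 'First 10 cars';
proc print data=sashelp.cars(obs=10);
   var Make Model MSRP;
run;
title;  /* clear the title so it does not carry over */

ods pdf close;
```

HTML and RTF output follow the same open-then-close pattern, with ODS HTML or ODS RTF in place of ODS PDF.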
