Formats are an underrated feature among SAS programmers, especially user defined formats. Many people happily write joins or complex look-up logic rather than create a user defined format.
This is probably because not everyone is aware of the real power of PROC FORMAT. The mapping you define is loaded into memory when the format is used, so it can be accessed quickly through the format.
Formats can also be used for more than displaying data. They can be combined with DATA step functions to do data conversions and even table look-ups.
SAS formats are powerful, flexible, and easy to use. You have probably already used built-in formats such as date9., $8., datetime., dollar9.2, percent8.2, and so on.
A few selected formats include:
→ dollarw.d includes dollar sign and commas
→ percentw.d writes the number as a percent
→ $UPCASEw. writes character data in uppercase
→ DATEw. writes a date value in the form ddmmmyy or ddmmmyyyy
→ w.d where w is the width and d the number of decimal places
→ zw.d writes leading zeros
→ $w. writes standard character data
The application of these formats to the data values on the left could produce the following results:
- 2345.678 ⇾ dollar9.2 ⇾ $2,345.68
- 0.6723 ⇾ percent8.2 ⇾ 67.23%
- -0.6723 ⇾ percent8.2 ⇾ (67.23%)
- 123456789 ⇾ ssn11. ⇾ 123-45-6789
- 2345.678 ⇾ 10.2 ⇾ 2345.68
- 2345.678 ⇾ z10.3 ⇾ 002345.678
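- ' abcde' ⇾ $char8. ⇾ ' abcde'
- ' abcde' ⇾ $8. ⇾ ' abcde' (as formats, $w. and $CHARw. both keep leading blanks; it is the $w. informat that trims them when reading data)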
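To check a few of the results listed above yourself, a minimal sketch like the following writes the formatted values to the SAS log. The variable names (amount, rate, ssn) are only illustrative.

```sas
/* Write a few values to the log using built-in formats */
data _null_;
    amount = 2345.678;
    rate   = 0.6723;
    ssn    = 123456789;

    put amount dollar9.2;   /* dollar sign and commas      */
    put rate   percent8.2;  /* number written as a percent */
    put ssn    ssn11.;      /* social security layout      */
    put amount 10.2;        /* width 10, two decimals      */
    put amount z10.3;       /* padded with leading zeros   */
run;
```

The log lines should correspond to the dollar9.2, percent8.2, ssn11., 10.2 and z10.3 rows in the list above.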
SAS has a wide range of built-in formats that you can use anywhere in SAS code. But what if you want to create your own?
That is possible too. You can define formats exactly the way you want and use them wherever you need them.
User defined formats can be created using the PROC FORMAT procedure. It can be used to:
- convert numeric values into character values
- convert character strings into numbers
- convert character strings into other character strings
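As a rough sketch of these three kinds of conversion, the step below defines one numeric format, one informat, and one character format, then uses them with PUT and INPUT. The names ynfmt, sizeinf and $sexfmt and their mappings are made up for illustration.

```sas
proc format;
    /* numeric value -> character label */
    value ynfmt
        1 = 'Yes'
        0 = 'No';

    /* character string -> number (an informat, used with INPUT) */
    invalue sizeinf
        'S' = 1
        'M' = 2
        'L' = 3;

    /* character string -> other character string */
    value $sexfmt
        'F' = 'Female'
        'M' = 'Male';
run;

data _null_;
    a = put(1, ynfmt.);       /* numeric to character   */
    b = input('M', sizeinf.); /* character to numeric   */
    c = put('F', $sexfmt.);   /* character to character */
    put a= b= c=;
run;
```

Note that the character to number direction is handled by an informat and the INPUT function, while the other two use formats and the PUT function.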
The PROC FORMAT syntax is simple, yet the options available in the procedure give you a great deal of power and flexibility.
Syntax:
PROC FORMAT options;
VALUE format_name specifications;
INVALUE informat_name specifications;
PICTURE format_name specifications;
RUN;
I would break the ways of creating SAS formats down into three parts, which I'd rather call methods.
Each method explained here can be used independently; choose whichever fits your requirements.
- Create User Defined Formats using VALUE statement
- Create User Defined Formats using PICTURE format
- Create User Defined Formats from DATA
There are two types of formats you can create in SAS: numeric formats and character formats.
Numeric formats are defined on numeric values and applied to numeric variables, whereas character formats are defined on character values and applied to character variables. You'll learn about both types further in the article.
Create User Defined Formats using VALUE statement
Simple formats are created with the VALUE statement. After the format name you specify the mapping: each actual data value (or range of values) followed by the label it should be displayed as.
In the following example you'll create one character format ($YEAR.) and one numeric format (GRADE.), then apply both to a demo dataset called work.GRADE.
PROC FORMAT uses the WORK library by default, so the library=WORK option in the following examples only spells out what would apply anyway.
For the demonstration, let's first create a new dataset with some dummy grade data.
/* SAS User Defined Format (PROC Format) */
data work.grade;
/* list input with informats: the data lines below are blank-separated, not column-aligned */
input Name :$8. Gender :$1. Status :$1. Year :$4. Section :$1. Score FinalGrade;
datalines;
Abbott F 2 1987 A 90 97
Branford M 1 1998 A 92 97
Crandell M 2 1993 B 81 71
Dennison M 1 1997 A 80 72
Edgar F 1 1998 B 89 80
Faust M 1 1981 B 78 73
Greeley F 2 1988 A 82 91
Hart F 1 2001 B 84 80
Isley M 2 2015 A 88 86
Jasper M 1 2002 B 91 96
Chris F 2 2009 A 82 91
Harty F 1 1999 B 84 84
Dan M 2 1997 A 88 97
Jasprit M 1 2019 B 91 93
Kim F 1 2021 B 82 98
Hacker M 1 2023 A 81 93
Jan M 2 2025 B 92 98
Bard M 1 2011 A 92 97
Sandy F 2 1998 B 89 91
Karlsen F 1 1997 A 85 82
Bunny M 2 2009 A 89 89
Josh M 1 2022 A 91 93
Frode F 2 2030 B 92 98
Nils M 1 2010 A 82 90
Geir M 2 1980 A 86 98
;
run;
proc print data=work.grade;
run;
1. Character Formats in SAS
Character formats are defined on character values and can be applied to character variables only.
In this example you'll create the character format $YEAR. and apply it to the character variable YEAR in the GRADE dataset. Character values can also be given as ranges.
The actual year values in the YEAR variable will then be displayed as labels like "80's decade", "90's decade", and so on.
/* char format: User defined format on character variable */
proc format library=work;
value $YEAR
'1980'-'1989'="80's decade"
'1990'-'1999'="90's decade"
'2000'-'2009'="2000's decade"
'2010'-'2019'="2010's decade"
'2020'-'2099'="2020's or more";
run;
proc print data=grade;
format year $YEAR.;
run;
2. Numeric Formats in SAS
Numeric formats are defined on numeric values and can be applied to numeric variables only.
In this example you'll create the numeric format GRADE. and apply it to the numeric variable FinalGrade in the GRADE dataset.
Numeric values can also be given as ranges. The actual numeric grades in FinalGrade will then be displayed as labels like "A", "A+", "A++", "Genius", etc.
/* Numeric format: User defined format on numeric variable */
proc format library=work;
value GRADE
70-79="A"
80-89="A+"
90-99="A++"
100="Genius";
run;
proc print data=grade;
format FinalGrade GRADE.;
run;
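The ranges in GRADE. are written for whole numbers, so a value such as 79.5 falls between 70-79 and 80-89 and matches no range. If your grades can contain decimals, one possible variation (a sketch only; the name gradeband and the labels 'Below A' and 'Invalid' are invented here) uses the -< range operator, which excludes the upper end point, plus LOW and OTHER to cover everything else.

```sas
proc format library=work;
    value gradeband
        low -< 70  = 'Below A'   /* everything under 70 (not missing) */
        70  -< 80  = 'A'         /* 70 up to, but not including, 80   */
        80  -< 90  = 'A+'
        90  -< 100 = 'A++'
        100        = 'Genius'
        other      = 'Invalid';  /* anything not matched above        */
run;

data _null_;
    /* try a few values, including a decimal and a missing value */
    do score = 69.5, 79.5, 100, 105, .;
        band = put(score, gradeband.);
        put score= band=;
    end;
run;
```

With this definition 79.5 is labelled A instead of slipping through a gap, and values above 100 or missing values are caught by OTHER.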
3. Extra: Use User Defined Formats in SAS Data step
This is another very powerful use case: applying user defined formats in the DATA step.
In the following example you'll create a new dataset called fmt_Grade with two new character variables (fmtGrade $8, fmtYear $50) and fill them with formatted values using the PUT function.
This way you can see the actual data and the respective formatted values side by side in separate columns.
data fmt_Grade;
set Grade;
/* define two new variables */
length fmtGrade $8 fmtYear $50;
/* assign values with formats */
fmtGrade=put(FinalGrade, Grade.);
fmtYear=put(Year, $YEAR.);
run;
/* view data */
proc print data=fmt_grade;
run;
Character to Numeric Mapping
Here is one more example: character to numeric mapping. Strictly speaking this uses an informat, created with the INVALUE statement, rather than a format. You have a dataset "months_data" with month names in the character variable "month_nm".
Since the set of months is fixed, you can define an informat that maps month names to month numbers.
This is how month names will be mapped to month numbers whenever you use month_fmt once you have created it.
month_nm | month_nr |
---|---|
Jan | 1 |
Feb | 2 |
Mar | 3 |
Apr | 4 |
May | 5 |
Jun | 6 |
Jul | 7 |
Aug | 8 |
Sep | 9 |
Oct | 10 |
Nov | 11 |
Dec | 12 |
The informat "month_fmt" can be defined and created as follows. You may change its name; it's up to you.
/* char to num mapping format for months */
proc format;
invalue month_fmt
"Jan" = 1
"Feb" = 2
"Mar" = 3
"Apr" = 4
"May" = 5
"Jun" = 6
"Jul" = 7
"Aug" = 8
"Sep" = 9
"Oct" = 10
"Nov" = 11
"Dec" = 12
;
run;
Once this informat is available, you can use month_fmt with the INPUT function in any dataset where you have month names as character values and want the corresponding month numbers.
From the given months_data you can create a new dataset "months_data_fmt" with a new numeric column month_nr that holds the month number derived from the month name.
/* given sas data set */
data work.months_data;
input month_nm $;
datalines;
Jan
Apr
May
Jun
Dec
Apr
Jul
;
run;
/* Create a new sas data set with formatted value on column "month_nr" */
data work.months_data_fmt;
set work.months_data;
month_nr = input(month_nm, month_fmt.);
run;
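The informat above only matches names spelled exactly as listed. As a possible refinement (a sketch; the name month_ci is made up), the UPCASE and JUST informat options let the look-up ignore case and leading blanks, and OTHER returns a missing value for anything that is not a month name.

```sas
proc format;
    /* UPCASE: uppercase the raw value before comparing   */
    /* JUST:   left-justify it (drops leading blanks)     */
    invalue month_ci (upcase just)
        'JAN' = 1   'FEB' = 2   'MAR' = 3   'APR' = 4
        'MAY' = 5   'JUN' = 6   'JUL' = 7   'AUG' = 8
        'SEP' = 9   'OCT' = 10  'NOV' = 11  'DEC' = 12
        other = .;  /* unknown names become missing */
run;

data _null_;
    m1 = input('jan', month_ci.);  /* lowercase input still matches 'JAN' */
    m2 = input('Foo', month_ci.);  /* not a month, handled by OTHER       */
    put m1= m2=;
run;
```

Because UPCASE converts the input before the comparison, the quoted values in the INVALUE statement have to be written in uppercase.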
NOTE: So far we have seen the formats created under the WORK library. Check out this article to learn more about how to create formats in the permanent library and store it in the catalog.
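As a rough idea of what a permanent format looks like, the sketch below stores a format in a catalog outside WORK and tells SAS where to find it. The libref myfmts, the path, and the format $ynflag are placeholders, not names from this article.

```sas
/* Hypothetical location; point this at a folder you can write to */
libname myfmts '/path/to/format/library';

/* LIBRARY= stores the format in the catalog myfmts.formats */
proc format library=myfmts;
    value $ynflag
        'Y' = 'Yes'
        'N' = 'No';
run;

/* Tell SAS to look in myfmts (and then WORK) when resolving format names */
options fmtsearch=(myfmts work);
```

In a later SAS session you would typically only need the LIBNAME statement and the FMTSEARCH= option to use the stored format again.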
Create User Defined Formats using PICTURE format
You must be wondering what the heck a PICTURE format is!
Think how telephone numbers are written: a combination of a country code, dashes, spaces between the digits, and so on.
They follow some sort of standard format, don't they?
A picture format is like a TEMPLATE that defines how data should be printed or displayed, or, via the PUT function, stored in a table.
The syntax is very similar to the VALUE and INVALUE statements. The main change is the PICTURE keyword.
The example below shows how a 10 digit mobile number can be formatted in groups of two digits with a space in between.
For example: mobile number 9876543210 can be formatted as: 98 76 54 32 10
/******** proc format PICTURE ************/
proc format;
picture fmtMobile
00000-9999999999='99 99 99 99 99'
other='99 99 99 99 99';
run;
data _null_;
mobile_nr=9876543210;
put mobile_nr=;
/* print value using user defined format fmtMobile. */
put mobile_nr=fmtMobile.;
run;
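The digits used inside a picture matter. The following sketch (format names pic_nine and pic_zero are invented for the comparison) shows the difference between nonzero digit selectors such as 9, which print leading zeros, and 0 digit selectors, which do not.

```sas
proc format;
    /* nonzero digit selectors: leading zeros are printed */
    picture pic_nine other = '99999';

    /* zero digit selectors: leading zeros are suppressed;  */
    /* the final 9 guarantees at least one digit is printed */
    picture pic_zero other = '00009';
run;

data _null_;
    x = 42;
    put x= pic_nine.;
    put x= pic_zero.;
run;
```

This is why the mobile number picture above uses 9s throughout: every position is always printed, even if a number starts with zeros.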
That was a simple example. Let's have a look at a datetime format next.
You have datetime values like '10MAY2025:05:20:15'dt
And you want to display them as: 2025-05-10:05:20:15
This is how you can do it:
proc format;
picture fmtdbdate
other='%Y-%0m-%0d:%0H:%0M:%0S' (datatype=datetime);
run;
data _null_;
now='10MAY2025:05:20:15'dt;
put 'now_as_is ' now=;
/* print date value using sas existing format datetime. */
put 'now_datetime ' now=datetime.;
/* print date value using user defined format fmtdbdate. */
put 'now_fmtdbdate ' now=fmtdbdate.;
run;
Create User Defined Formats from DATA
Instead of writing out all the mapping specifications in PROC FORMAT, you can build a user defined format from a SAS dataset that has its data in a specific structure.
This dataset can be created manually by entering values, or you can derive it from another dataset.
You can choose whichever suits you best. Building a user defined format from another dataset is the popular method, as it lets you quickly build a format with thousands of mapping records.
Imagine yourself writing thousands of mapping values in PROC FORMAT by hand!
It's definitely not convenient, hence most of us prefer to build formats from another dataset.
Format dataset structure:
It should have the following variables: fmtname, start, end, label, and type.
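One way to get a feel for this structure is to go in the opposite direction: define a small format by hand and let PROC FORMAT write it out as a dataset with the CNTLOUT= option. This is only a sketch; the format agegrp_demo and the dataset fmt_dump are made-up names.

```sas
/* A small hand-written format to inspect */
proc format library=work;
    value agegrp_demo
        0  - 17   = 'Minor'
        18 - high = 'Adult';
run;

/* CNTLOUT= writes the format definition to a dataset */
proc format library=work cntlout=work.fmt_dump;
    select agegrp_demo;
run;

/* Look at the variables that a CNTLIN= dataset is expected to provide */
proc print data=work.fmt_dump;
    var fmtname start end label type hlo;
run;
```

The printed rows show how ranges, labels and special keywords such as HIGH (through the HLO column) are represented in a format dataset.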
Method 1: Create Format dataset with raw data
In this method we create the dataset using datalines. It is a manual process, but it's a good starting point for learning more advanced SAS formats.
/*********** proc format using DATA ***************/
data my_fmt;
length fmtname $15;
input fmtname $ start end label $ type $;
datalines;
fmtEnginSizeGrp 1 2 Small N
fmtEnginSizeGrp 2.1 4.5 Medium N
fmtEnginSizeGrp 4.6 8 Large N
fmtEnginSizeGrp 8.1 10 XLarge++ N
;
run;
Now store the created format in the catalog Work.Formats and specify the source for the format. The CNTLIN= option specifies that the data set my_fmt is the source for the format fmtEnginSizeGrp.
/* store created format in the catalog work.formats and specify source of a format*/
proc format library=work cntlin=my_fmt;
run;
/* apply format on enginesize on sashelp.cars data set*/
proc freq data=sashelp.cars;
tables enginesize;
format enginesize fmtEnginSizeGrp.;
run;
Method 2: Create Format data set using another data set
Let’s say you have an existing data set called “Scale” with columns “begin”, “end”, “amount” in percent.
You want to build a format mapped to percentage in amount with its begin and end values.
/* proc format using DATA Example 2*/
data scale;
input begin: $char2. end: $char2. amount: $char3.;
datalines;
0 3 0%
4 6 3%
7 8 6%
9 10 8%
11 16 10%
;
run;
proc print;
title 'SAS Data Set: work.Scale';
run;
First you need to create a format data set from the existing data set "scale". You already know the structure of a format data set.
Let's create the format dataset "CTRL" in the work library.
data ctrl;
length label $ 11;
set scale(rename=(begin=start amount=label)) end=last;
retain fmtname 'PercentageFormat' type 'n';
output;
if last then do;
hlo='O';
label='***ERROR***';
output;
end;
run;
proc print data=ctrl noobs;
title 'The format CTRL Data Set';
run;
Now store the created format “PercentageFormat” in the catalog Work.Formats by specifying the source for the format “cntlin=ctrl”.
proc format library=work cntlin=ctrl;
run;
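Before using a format built from a dataset, it can be worth checking that the ranges were loaded as intended. A minimal sketch, assuming the PercentageFormat format has just been created in WORK as above, uses the FMTLIB option to print its definition.

```sas
/* Print the stored definition of PercentageFormat */
proc format library=work fmtlib;
    select PercentageFormat;
run;
```

The listing shows each start and end value with its label, including the catch-all row that was added through hlo='O'.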
Now it is available to use wherever you want and however you want. In the following example you'll see how PercentageFormat is used with the dataset "points".
The format is applied to the TotalPoints column in PROC REPORT, through the alias Pctage.
/* Create a new data set "points" */
data points;
input EmployeeId $ Q1 Q2 Q3 Q4 ;
/* SUM ignores missing quarters; q1+q2+q3+q4 would be missing if any quarter is missing */
TotalPoints=sum(of Q1-Q4);
datalines;
2355 1 0 0 6
5889 2 . 2 2
3878 3 4 9 1
4409 2 1 1 1
3985 3 6 3 2
0740 4 2 9 7
2398 5 1 . 6
5162 2 4 4 1
4421 3 2 2 2
7385 1 3 2 1
;
run;
proc print data=work.points;
run;
/* Use PercentageFormat on totalpoints column */
proc report data=work.points nowd headskip split='#';
column employeeid totalpoints totalpoints=Pctage;
define employeeid / right;
define totalpoints / 'Total#Points' right;
define pctage / format=PercentageFormat. 'Percentage' left;
title 'The Percentage of Salary for Calculating Bonus';
run;
That’s all about creating user defined format in SAS (The complete proc format guide)
FAQ
PROC FORMAT is used to create user-defined formats in SAS. You can create SAS formats and store them in a catalog in either the WORK library or a permanent library.
Syntax:
PROC FORMAT options;
VALUE format_name specifications;
INVALUE informat_name specifications;
PICTURE format_name specifications;
RUN;
User-defined formats in SAS allow users to customize the way data is displayed in reports and analysis.
These formats enable users to assign descriptive labels to data values, transforming raw data into more readable and meaningful output.
Yes, you can apply the same user-defined format to multiple variables in SAS.
This can be achieved by associating the format with those variables using the FORMAT statement in various SAS procedures or in data steps.
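For instance, assuming the work.grade dataset and the GRADE. format from earlier in this article, a single FORMAT statement can list several variables before the format name:

```sas
/* Apply the same user defined format to two numeric variables */
proc print data=work.grade;
    format Score FinalGrade GRADE.;
run;
```

Both Score and FinalGrade are then displayed with the labels defined in GRADE., while the stored numeric values stay unchanged.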
To create a user-defined format, you can use the PROC FORMAT procedure.
This procedure allows you to define a format name, specify the input values and their corresponding labels, and save the format for future use in data analyses and reporting.