The recommended ways to create new variables in SAS are to use the following methods. Depending on your requirement and use cases you can choose the method best suited to you but all of them are quick and easy to create variables in SAS data set.
How to Create New Variables in SAS Data Set
Here are 5 simple ways you can create new variables quickly and easily in SAS data set as follows:
- Create a new variable in SAS: Using the Length statement in a DATA Step
- Create a new variable in SAS: Using the INPUT statement in a DATA Step
- Create a new variable in SAS: Using an assignment statement
- Create a new variable in SAS: Using proc sql ALTER TABLE statement
- Create a new variable in SAS: Using IN= Data Set option
1. Create a New Variable in SAS: Using the Length Statement in DATA Step
This is the simplest way to define length and datatype of the new variables you want to add and insert some data into it.
The following example demonstrates that a new character variable “company” which is 25 char long, has been added into a data set named “sas_dataset” with the data string “SAS Institute”.
Similarly another numeric variable “year” has been added into the same data set with data value 1996.
data sas_dataset;
length company $25 year 8;
company='SAS Institute';
year=1996;
proc print;
run;
2. Create a new variable in SAS: Using the INPUT statement in a DATA Step
When you’re using a datalines statement with data values you need to mention the INPUT statement to define and create variables in the SAS data set.
You have to list down all the variable names followed by $ sign if it is a character variable. You don’t need to mention anything for numeric variables.
The following example shows the character variable “platform” and numeric variable “id” created using the INPUT statement.
data sas_dataset;
input platform $ id;
datalines;
Facebook 100
Twitter 500
LinkedIn 300
Snapchat 200
;
proc print;
run;
3. Create a new variable in SAS: Using an Assignment Statement
This method is being used especially when you want to use existing variables to calculate something and save the results in the new variables.
For example, here we want to create a new variable named “c” with value on variable “b” by multiplying it by 1000.
Hence we’ve applied this formula: b*1000 and created a new variable to assign the results for each observation.
data sas_dataset;
input a $ b;
datalines;
Facebook 100
Twitter 500
LinkedIn 300
Snapchat 200
;
run;
data new_dataset;
set sas_dataset;
c=b*1000;
proc print;
run;
4. Create a new variable in SAS: Using proc sql ALTER TABLE Statement
The ALTER TABLE statement is used to add new variables, delete existing variables, or modify format of variables.
This method gives you more flexibility while adding new variables. It is also possible to add values on those newly added columns with UPDATE TABLE statement.
The following example demonstrates that two new columns new_char (a character variable) and new_num (a numeric variable) have been added in sas_dataset with blank values.
By default the character variable has a blank value (‘ ‘) and the numeric variable has dot (.) as blank value. Then after we have added data in those newly added variables using the UPDATE TABLE statement.
data sas_dataset;
input a $ b;
datalines;
Facebook 100
Twitter 500
LinkedIn 300
Snapchat 200
;
run;
/* define new variables */
proc sql;
alter table sas_dataset add new_num NUM (8), new_char CHAR (12);
quit;
/* set values to new variables */
proc sql;
update sas_dataset
set new_num=100000,
new_char='Social Media';
quit;
/* print data */
proc print data=sas_dataset; run;
You can also add conditional data here but for simplicity we have just added the same values for all the rows.
Check out more examples about update columns in SAS
PROC SQL – INSERT Data
You can also use proc sql-insert statement to insert data into newly defined variables in SAS data set.
/* create table by defining variables */
proc sql;
create table sas_dataset (new_num NUM (8), new_char CHAR (12));
quit;
/* insert values to new variables */
proc sql;
INSERT INTO sas_dataset(new_num, new_char)
VALUES(100000, 'Social Media')
VALUES(200000, 'Social Media')
VALUES(300000, 'Social Media')
VALUES(400000, 'Social Media');
quit;
/* print data */
proc print data=sas_dataset;
run;
5. Create a New Variable in SAS: Using IN= Data Set Option
The IN= data set option creates a special boolean variable which has value either 1 (true) or 0 (false). When the data set contributes data to the current observation, it sets the value 1 otherwise it sets it to 0.
You can use IN= on SET, MERGE, UPDATE statements in a data step.
The following example demonstrates how you can use the IN= option to create variables in SAS. The work.old and work.new are two input datasets and let’s try to merge these two datasets.
/* create two new datasets */
data old;
id=1;
platform='Facebook';
output;
id=2;
platform='Twitter';
output;
proc print data=old;
title ' data set: old';
run;
data New;
id=1;
platform='Facebook';
output;
id=3;
platform='Meta';
output;
id=4;
platform='Twitter';
output;
proc print data=New;
title ' data set: New';
run;
In the below example you can clearly see how you can use the IN= option followed by a placeholder that holds the boolean value when each observation gets processed.
You can print exact boolean values by creating new variables (here, x_in and y_in) and assigning the boolean values (x and y).
data combined;
merge old(in=x) new(in=y);
by id;
x_in=x;
y_in=y;
proc print;
run;
The above query merges two datasets on by variable (that is id variable) and assigns the boolean values. Here are some indicators:
- x_in=1 : it means id value found in dataset work.old
- y_in=1 : it means id value found in dataset work.New
That’s set about creating new variables in SAS using different methods. The most common method is using Data Step but depending on your scenarios and requirement you can use any of the methods mentioned above.