The DATALINES statement, together with the INPUT statement, lets you create a data set from scratch by entering data directly in the program instead of reading it from an external file.
You can use only one DATALINES statement in a DATA step. Use separate DATA steps to enter multiple sets of data.
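As a minimal sketch of that rule, the two sets of data below each get their own DATA step with a single DATALINES statement. The data set names (set_a, set_b), the variables (name, score), and the values are made up for illustration.

```sas
/* First set of data: one DATA step, one DATALINES statement */
data set_a;
    input name $ score;
    datalines;
Alpha 10
Beta 20
;
run;

/* Second set of data goes in its own DATA step */
data set_b;
    input name $ score;
    datalines;
Gamma 30
Delta 40
;
run;
```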
Basic Syntax (example):
data sas_dataset;
input var1 $ var2;
datalines;
Facebook 100
Instagram 200
Twitter 500
LinkedIn 300
Snapchat 100
;
run;
Explanation:
- data: names the data set you want to create
- input: lists each variable name and its type
- datalines/cards/lines: marks where the actual data values begin
In the example above, there is a dollar sign ($) after var1 in the INPUT statement. It defines var1 as a character variable. A variable with no $ after it is treated as numeric, so var2 here is a numeric variable.
Be careful when defining variable types. Get familiar with the input values first, then decide on the appropriate type for each variable.
The default length of a character variable is 8 characters, so any data value longer than 8 characters is truncated. You can avoid this by specifying the exact column range of the data value you’re expecting.
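The truncation is easy to reproduce. The sketch below reads the same made-up value (Pinterest, 9 characters) twice: once with the default length, and once with a LENGTH statement placed before INPUT. The LENGTH statement is another way to reserve more room, shown here as an alternative to the column range that the examples in this article use. The data set names and the value are assumptions for illustration.

```sas
/* Default length: var1 is 8 characters, so Pinterest is stored as Pinteres */
data truncated;
    input var1 $ var2;
    datalines;
Pinterest 250
;
run;

/* LENGTH before INPUT reserves 12 characters, so the full value is kept */
data not_truncated;
    length var1 $ 12;
    input var1 $ var2;
    datalines;
Pinterest 250
;
run;
```

Running proc print on each data set should make the difference visible.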
DATALINES, CARDS, LINES: all are the same!
The DATALINES statement has two aliases: CARDS and LINES. Don’t be confused if you see CARDS or LINES instead of DATALINES.
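For instance, here is the basic syntax example written with CARDS. It is a sketch that reuses the sas_dataset, var1, and var2 names from above and should create the same data set as the DATALINES version.

```sas
data sas_dataset;
    input var1 $ var2;
    cards;              /* alias of DATALINES */
Facebook 100
Instagram 200
Twitter 500
LinkedIn 300
Snapchat 100
;
run;
```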
Let’s go through different variations of examples to understand how to use the Datalines statement to create a SAS data set.
Example 1: Create a Simple SAS Data set Using Datalines Statement
The following code creates a simple SAS data set with three variables using DATALINES. The first variable, id, is numeric. The second, FirstName, and the third, dept, are character variables.
Since no length is specified for the character variables, each one defaults to a length of 8 characters.
You can replace datalines with cards or lines. It produces exactly the same output.
data employee_details;
input id FirstName $ dept $;
datalines;
1 Erik Accounts
2 Jan Commerce
3 Frode IT
4 Nils Support
5 Kim Software
;
proc print;
run;
Example 2: Create a SAS Dataset Using Datalines Statement with Character Values Longer Than 8 Characters
You need to make a small change to the above program to read character values that are longer than 8 characters.
For demonstration purposes, let’s focus on the employee FullName variable. When you specify a column range for a character variable, shorter data values must be padded with blank spaces so that the next field starts in the right column.
In the example below, the name starts at position 3 and the column range 3-14 covers the longest name in the data. Hence the FullName variable is 12 characters long.
Again, no range is specified for the dept variable, so it gets the default length of 8 characters.
data employee_details;
input id FullName $ 3-14 dept $;
datalines;
1 Erik Hansen Accounts
2 Jan Arne    Commerce
3 Frode JK    IT
4 Nils Oyvind Support
5 Kim Jan     Software
;
proc print;
run;
Example 3: Create a SAS Dataset Using Datalines Statement with date values
To read a date value in SAS, you need to specify an informat that matches the way the date is written. A SAS date is a numeric value: the number of days since January 1, 1960.
In the example below, column ranges are specified for FullName and dept, and an informat for the doj variable. Here id and doj are numeric variables and the rest are character variables.
data employee_details;
input id FullName $ 3-14 dept $ 15-23 doj ddmmyy10.;
datalines;
1 Erik Hansen Accounts 11/06/2020
2 Jan Arne    Commerce 12/06/2009
3 Frode JK    IT       05/11/1990
4 Nils Oyvind Support  17/01/1989
5 Kim Jan     Software 26/08/1993
;
proc print;
run;
Look at the values of the doj variable. They look like random numbers, but they are SAS date values: each one is the number of days between January 1, 1960 and the date that was read.
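To see how these numbers relate to calendar dates, a small DATA _NULL_ step can write a couple of values to the log. This is only a sketch: the variable names day_zero and doj_value are made up, and the INPUT function is used here just to apply the same ddmmyy10. informat to a single string.

```sas
data _null_;
    day_zero  = '01JAN1960'd;                    /* SAS date literal for the reference date */
    doj_value = input('11/06/2020', ddmmyy10.);  /* same informat as in the INPUT statement */
    put day_zero=;                               /* writes day_zero=0 to the log */
    put doj_value=;                              /* raw number of days since 01JAN1960 */
    put doj_value= date9.;                       /* same value displayed with the DATE9. format */
run;
```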
You can apply a format to control how the date is displayed in the data set. Let’s format doj the same way it appears in the data values, that is, in dd/mm/yyyy format.
data employee_details;
input id FullName $ 3-14 dept $ 15-23 doj ddmmyy10.;
format doj ddmmyy10.;
datalines;
1 Erik Hansen Accounts 11/06/2020
2 Jan Arne    Commerce 12/06/2009
3 Frode JK    IT       05/11/1990
4 Nils Oyvind Support  17/01/1989
5 Kim Jan     Software 26/08/1993
;
proc print;
run;