_N_ Automatic Variable In SAS:
The value for _N_ is initially set to 1. Each time the DATA step loops past the DATA statement, the variable _N_ increments by 1. The value of _N_ represents the number of times the DATA step has iterated.
The following sample dataset is used to demonstrate the _N_ automatic variable in SAS.
/* create a dataset */
data result;
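/* :$12. reads up to 12 characters; the default length of 8 would truncate "Chemistry" and "Aeronautics" */
input student $ marks major :$12.;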
datalines;
2000 87 Math
2010 92 English
2020 92 History
2030 94 Music
2040 96 Robotics
2050 98 AI/ML
2060 91 Physics
2070 90 Chemistry
2080 86 Aeronautics
2090 95 Geology
;
run;
/* view dataset */
proc print data=result;
run;
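To watch the iteration counter described at the top of this article, a minimal sketch like the following writes _N_ to the SAS log on every pass through the DATA step. It assumes the result dataset created above. data _null_ is used so that no output dataset is produced.

```sas
/* write the iteration counter to the log; no dataset is created */
data _null_;
set result;
/* PUT with a trailing = prints name=value pairs to the log */
put _N_= student= marks=;
run;
```

With the 10 sample rows, the log should list _N_=1 through _N_=10, one line per observation read.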
Example 1: How To Assign _N_ Values to a New Variable In SAS
The _N_ automatic variable is not written to the output dataset, but you can keep its values by assigning them to a new variable. The following code adds a new variable that holds the value of _N_ for each observation.
/* assign _N_ values to new variable */
data result_New;
set result;
new_var = _N_;
run;
/* view dataset */
proc print data=result_New;
run;
The _N_ variable has many uses in SAS, but it is most often used to select the first N observations or a specific row from a SAS dataset.
Example 2: Select First N Rows with _N_ Automatic Variable
Now that you know how _N_ works and what values it holds, you can use it to select the first N rows from a SAS dataset.
The following code shows how to select the first 5 observations from the dataset.
/* select first N rows with _N_ automatic variable */
data first_5_rows_result ;
set result;
if _N_ <= 5;
run;
/* view dataset */
proc print data=first_5_rows_result; run;
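One caveat: _N_ counts DATA step iterations, not row numbers in the source dataset, so the two only match when every row is read in order. The sketch below illustrates this with a WHERE statement, which filters rows before they reach the DATA step. It assumes the result dataset from above; the names high_scorers and iteration are illustrative only.

```sas
/* _N_ counts iterations, not positions in the source dataset */
data high_scorers;
set result;
/* WHERE filters observations before they are read into the step */
where marks > 90;
/* iteration runs 1, 2, 3, ... over the rows that passed the filter */
iteration = _N_;
run;

/* view dataset */
proc print data=high_scorers; run;
```

With the sample data, seven observations pass the filter. Student 2090 is the 10th row of result but gets iteration = 7, so a condition such as if _N_ <= 5 in this step would keep the first five filtered rows rather than the first five rows of result.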
Example 3: Select N-th Row with _N_ Automatic Variable
In the previous example you learned how to select the first N observations. In this example you'll learn how to select one specific observation using the _N_ automatic variable in SAS.
Suppose you want to select only the 4th observation from the dataset. You can do this by testing the value of _N_.
/* select N-th Observation Using _N_ variable*/
data select_4th_row_result ;
set result;
if _N_ = 4 then output;
run;
/* view dataset */
proc print data=select_4th_row_result; run;
Example 4: Select First and Last Observations Using _N_ Variable
You can select the first and last observations by combining the _N_ automatic variable with the end= option of the SET statement. The end= option names a temporary numeric variable (here, last_obs) that SAS sets to 1 when the last observation is read and to 0 otherwise.
- The first observation is selected with _N_ = 1
- The last observation is selected with the variable named in the end= option
The following code shows how to select first and last observations in SAS.
/* select first and last observations in SAS */
data select_first_last_obs;
set result end=last_obs;
if _N_=1 OR last_obs=1 then output;
run;
/* view dataset */
proc print data=select_first_last_obs; run;
FAQ – How To Use _N_ In SAS
_N_ is an automatic variable that is created by the DATA step in SAS. It counts the number of iterations of the DATA step and starts from 1. _N_ is temporary and is not written to the output dataset.
You can use _N_ in a conditional statement to select the first row or the first N rows in a dataset. For example, the condition _N_ = 1 is true for the first row only, and the condition _N_ <= 10 is true for the first 10 rows.
You can use the _N_ automatic variable to create a new column in a dataset by assigning it to a new variable or by passing it to a function.
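As a rough sketch of the function-based approach, the step below derives two columns from _N_ using the MOD and CEIL functions. It assumes the result dataset from the examples above; the names result_flags, is_odd_row, and batch are illustrative only.

```sas
/* derive new columns by passing _N_ to functions */
data result_flags;
set result;
/* 1 for iterations 1, 3, 5, ... and 0 for 2, 4, 6, ... */
is_odd_row = mod(_N_, 2);
/* groups of five rows: 1 for rows 1-5, 2 for rows 6-10 */
batch = ceil(_N_ / 5);
run;

/* view dataset */
proc print data=result_flags; run;
```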