How to Change Encoding in SAS?

When you’re setting up a new SAS environment or integrating with other source systems, encoding issues are very common. They usually come from a mismatch between the encodings set on the two systems.

In this article we will see how you can check the encoding set in SAS and how you can change the encoding when you want.

There are three places where an encoding is specified, and in each of them you can change it from one encoding to another.

  1. Encoding of the SAS Session
  2. Encoding on SAS dataset
  3. Encoding on SAS library

1. Change Encoding of the SAS Session

First, find out the encoding of your current SAS session. Run the following proc options query and look for the encoding details in the SAS log.

proc options option=encoding; 
run;

An alternative method is to run this code:

%PUT %SYSFUNC(getOption(ENCODING));
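
Two more ways to inspect the session encoding are sketched below. The sketch assumes a SAS 9.4 session: SYSENCODING is an automatic macro variable that holds the session encoding, and the LANGUAGECONTROL option group lists ENCODING together with the related language settings.

```sas
/* Automatic macro variable that holds the session encoding */
%put Session encoding: &SYSENCODING;

/* Write the language-related system options (ENCODING, LOCALE, ...) to the log */
proc options group=languagecontrol;
run;
```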

Now let’s assume the current encoding is LATIN1 and you want to change it to UTF-8. The session encoding acts as the top-level encoding.

When you create a dataset, the session encoding is applied to it by default unless you specify another encoding.

The session encoding changes can be done in the configuration files. Follow these steps to change the encoding in SAS:

Step 1. Open this folder: C:\Program Files\SASHome_foundation\SASFoundation\9.4\nls

You will find the SAS configuration file sasv9.cfg under two different subfolders:

    • C:\Program Files\SASHome_foundation\SASFoundation\9.4\nls\u8\sasv9.cfg
    • C:\Program Files\SASHome_foundation\SASFoundation\9.4\nls\en\sasv9.cfg

Just verify that you have these two subfolders, each with a sasv9.cfg file.

Step 2. Now open the sasv9.cfg file from the location C:\Program Files\SASHome_foundation\SASFoundation\9.4 in any text editor.

Look for the below line in the config file:

-config "C:\Program Files\SASHome_foundation\SASFoundation\9.4\nls\en\sasv9.cfg"

Step 3. Change the -config line so that it points to \u8\sasv9.cfg instead of \en\sasv9.cfg.

Change from this:

-config "C:\Program Files\SASHome_foundation\SASFoundation\9.4\nls\en\sasv9.cfg"

To this:

-config "C:\Program Files\SASHome_foundation\SASFoundation\9.4\nls\u8\sasv9.cfg"

Step 4. Restart SAS, because the session encoding is set only at startup, and then check the encoding by running proc options.

proc options option=encoding; run;
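
If you would rather get an explicit message than read the option value by eye, the check can be done in a DATA step. This is only a minimal sketch: it assumes the target encoding is UTF-8, and the variable name actual is illustrative.

```sas
/* Compare the session encoding with the expected value and report the result */
data _null_;
  length actual $ 32;
  actual = getoption('encoding');   /* for example UTF-8 or LATIN1 */
  if upcase(actual) = 'UTF-8' then
    put 'NOTE: Session encoding is UTF-8.';
  else
    put 'WARNING: Session encoding is ' actual 'and not UTF-8.';
run;
```

If the warning appears after editing the configuration file, a likely cause is that SAS was started from a shortcut or script that points to a different sasv9.cfg.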

2. Change Encoding of the SAS Dataset

The encoding of a SAS dataset can be easily identified with the proc contents procedure. By default, the session encoding is applied when you create a SAS dataset.

If you want a different encoding on a SAS dataset, use the encoding= dataset option to set the encoding you want on that particular dataset.

To check the default encoding, create a dataset and run proc contents on it:

data work.cars;
	set sashelp.cars;
run;

proc contents data=work.cars;
run;

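Alternatively, you can use the following code to print the encoding of a SAS dataset to the log. The last line closes the dataset again so that it does not stay open.

%LET DSID=%SYSFUNC(open(work.cars,i));
%PUT %SYSFUNC(ATTRC(&DSID,ENCODING));
%LET RC=%SYSFUNC(close(&DSID));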
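
The same attribute can also be read inside a DATA step, which makes it easier to handle a dataset that cannot be opened. This is a sketch that assumes work.cars exists; the variable names dsid, enc and rc are illustrative.

```sas
/* Read the ENCODING attribute of work.cars from a DATA step */
data _null_;
  length enc $ 200;
  dsid = open('work.cars', 'i');      /* returns 0 if the dataset cannot be opened */
  if dsid > 0 then do;
    enc = attrc(dsid, 'ENCODING');
    put 'Encoding of work.cars: ' enc;
    rc = close(dsid);                 /* always release the dataset */
  end;
  else put 'Could not open work.cars.';
run;
```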

Let’s change the encoding from UTF-8 to LATIN1 on this work.cars dataset.

data work.cars(encoding='latin1');
	set work.cars;
run;

proc contents data=work.cars; 
run;

3. Change the encoding of the SAS Library

Changing the encoding of a SAS library really means changing the encoding of all the data sets stored in that library.

In theory, you could change the encoding to, say, UTF-8 or LATIN1 on each data set in the library one by one, which amounts to changing the encoding of the whole library.

But there is a shorter way to do the same thing. You can create a separate library, specify the desired encoding in the libname statement, and copy the data sets into that new library. All the copied data sets will have the encoding you specified when defining the library.

In the example below, inlib contains data sets created with different encodings, and you want to change the encoding to UTF-8 for all the data in inlib.

It’s fairly easy: just create a new library outlib, specify outencoding='UTF-8', and copy all the data from inlib to outlib.

libname inlib '/home/u61950255/Source';
libname outlib '/home/u61950255/Target' outencoding='UTF-8';

proc copy noclone in=inlib out=outlib;
run;
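
To confirm the result without opening each data set, you can list the encodings of every member of the target library in one query. This is a sketch that assumes the outlib libref is still assigned and that your SAS release exposes the ENCODING column in DICTIONARY.TABLES; check this against your own installation.

```sas
/* List the encoding of every data set in OUTLIB */
proc sql;
  select memname  label='Data set',
         encoding label='Encoding'
    from dictionary.tables
   where libname = 'OUTLIB'      /* librefs are stored in upper case */
     and memtype = 'DATA';
quit;
```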

Alternative:

In case you don’t want to create a separate library and copy data from one place to another, you can still manage to change the encoding by re-writing the data in the same library.

Let’s assume you have data sets in the outlib with different encoding. Now you want to change the encoding to UTF-8 for all the data available under outlib.

Follow these Steps:

1. Define the outlib library with the option outencoding='UTF-8'

libname outlib '/home/u61950255/Target' outencoding='UTF-8';

2. Re-create all the data sets in the same library using data steps. 

data outlib.class;
set outlib.class;
run;

Example:

Let me demonstrate with an example. I’m creating two data sets with different encodings in the SAS library outlib.

libname outlib '/home/u61950255/Target';

data outlib.class(encoding='utf-8') outlib.class_2(encoding='latin1');
	set sashelp.class;
run;

Now I want to change the encoding to UTF-8 for all the data available under the outlib library. Here is how to do it:

Define the library with the option outencoding='utf-8':

libname outlib '/home/u61950255/Target' outencoding='utf-8';

Recreate all the data sets within the same library.

data outlib.class;
	set outlib.class;
run;

data outlib.class_2;
	set outlib.class_2;
run;

proc contents data=outlib._all_;
run;
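
With more than a handful of data sets, writing one DATA step per member gets tedious. The sketch below generates those steps from the library's member list with CALL EXECUTE. It assumes the same outlib path as above and is an illustration, not a tested utility. Because each data set is rebuilt by a DATA step, indexes and integrity constraints are not carried over, so try it on a copy first.

```sas
/* Assign the library with the target encoding (path taken from the example above) */
libname outlib '/home/u61950255/Target' outencoding='utf-8';

/* Generate and run one "data outlib.X; set outlib.X; run;" step per data set */
data _null_;
  set sashelp.vtable(where=(libname = 'OUTLIB' and memtype = 'DATA'));
  call execute(cats('data outlib.', memname, '; set outlib.', memname, '; run;'));
run;

/* Check the encoding of every member afterwards */
proc contents data=outlib._all_;
run;
```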

FAQ

How do you set encoding in SAS?

You can change the encoding in SAS using the encoding= option in a SAS data step. It is also possible to change the encoding of all the SAS data sets in a library.

data work.cars(encoding='latin1'); 
set sashelp.cars;
run;

proc contents data=work.cars; run;

Can we set the encoding at the SAS library level?

Yes. You can easily set the encoding on a SAS library using the outencoding= option in the library definition.

libname libref 'lib_path' outencoding='UTF-8';



How to change encoding from LATIN1 to UTF-8 in a SAS dataset?

It’s fairly easy to change the encoding of a SAS dataset. Assume you have a work.CARS dataset with encoding LATIN1. It can be changed to UTF-8 using the following SAS code:

data work.cars(encoding='UTF-8');
set work.cars;
run;