SAS Tip of the Month
April 2016

A variable has many attributes; key among them are the name of the variable (NAME), its type (TYPE), and its length (LENGTH). Under SAS version 8 and above, the name of a variable can be up to 32 characters long, its type can be either Character or Numeric, and the length of a character variable can be set between 1 and 32767 characters, while the length of a numeric variable can be set between 3 and 8 bytes (yes, some operating systems allow 2, but not all).
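As a minimal sketch of how these three attributes are set, the step below uses a LENGTH statement to create one character and one numeric variable. The dataset name ATTRS_DEMO and the variables SUBJECT_ID and VISIT_NO are hypothetical, chosen only for illustration:

```sas
data attrs_demo;
  *** Character variable of length 12 and numeric variable of length 4;
  length subject_id $ 12 visit_no 4;
  subject_id = 'ABC-001';
  visit_no = 1;
run;

proc contents data=attrs_demo;
run;
```

The CONTENTS output should list both variables with the type and length requested in the LENGTH statement.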

A variable has other attributes too, the three most common being the informat, format and label. We shall look at the last of these, the label, this month and next.

What exactly is a label and why is it used? In simple terms, the label is a short piece of descriptive text giving a more user-friendly description of the variable. Take a very simple example, the variable WEIGHT in a dataset - sure, the name tells me what the variable is (it is weight), but it does not tell me other things I may need to know, like what unit it is in (pounds or kilograms) or when the weight data point was collected, all of which is very important information. Given that I am allowed 32 characters for a variable name, I could change the variable name to reflect this information, but in most cases this is not possible, and it is indeed impractical. This is where the LABEL statement comes in.

The label itself can be up to 256 characters long, including blanks (40 characters if you are still using SAS version 6.xx) - this gives us plenty of room to write a good description of the variable. The label is commonly set with the LABEL statement inside a datastep, or with a LABEL statement following a MODIFY statement in the DATASETS procedure. The syntax of both is given below:

   data class; *** Inside a datastep;
     set class;
     label weight='Weight (kg) at Start of Study';
   run;
   proc datasets library=work; *** DATASETS procedure;
     modify class;
       label weight='Weight (kg) at Start of Study';
     run;
   quit;

I personally use the datastep method if I am creating or modifying a variable inside that datastep; otherwise I use the DATASETS procedure. That is only my convention, though - there is no set rule for this.
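To illustrate the first half of that convention, here is a small sketch in which a variable is created and labelled in the same datastep. The dataset name CLASS_MONTHS, the variable AGE_MONTHS and the label text are hypothetical and used only for illustration:

```sas
data class_months;
  set sashelp.class;
  *** New variable created in this step, so it is labelled here too;
  age_months = age * 12;
  label age_months='Age (months) at Start of Study';
run;
```

Because the datastep is already rewriting the data to create AGE_MONTHS, adding the LABEL statement costs nothing extra, whereas the DATASETS procedure is the lighter choice when only the label needs to change.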

Before going on to a real-world example, let's first see how to check which labels, if any, are set inside a dataset. The two easiest ways are to run the CONTENTS procedure on the dataset or, if you have the SAS Viewer installed, to look at the attributes window for that dataset.

Now let's look at a real-world example, where it will all become clearer. For this example I will use the dataset SASHELP.CLASS. Let's look first at the structure of the dataset using the CONTENTS procedure (I will make a copy first so that I don't overwrite the original data):

   data class; *** Make a copy of the dataset;
     set sashelp.class;
   run;
   proc contents data=class;
    *** Get structure of the dataset;
   run;

Running this code we get the following output (abridged):

   Alphabetic List of Variables and Attributes

   #   Variable   Type   Len
   3   Age        Num    8
   4   Height     Num    8
   1   Name       Char   8
   2   Sex        Char   1
   5   Weight     Num    8

If a label existed for any variable we would see a column headed "Label", but in this case no labels have been applied to the dataset. So now let's set one for WEIGHT as indicated above (this time I shall display the structure not with the CONTENTS procedure, but with the CONTENTS statement inside the DATASETS procedure):

   proc datasets library=work;
     modify class;
       label weight='Weight (kg) at Start of Study';
     contents data=class;
     run;
   quit;

Running this code we get the following output (abridged):

   Alphabetic List of Variables and Attributes
 
   #   Variable   Type   Len   Label
   3   Age        Num    8
   4   Height     Num    8
   1   Name       Char   8
   2   Sex        Char   1
   5   Weight     Num    8     Weight (kg) at Start of Study
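The same label information can also be retrieved programmatically. The following is a minimal sketch, assuming the copy of the dataset is WORK.CLASS as above, that queries the DICTIONARY.COLUMNS table through the SQL procedure (library and member names are stored in upper case there):

```sas
proc sql;
  *** List the name, type, length and label of each variable in WORK.CLASS;
  select name, type, length, label
    from dictionary.columns
    where libname = 'WORK' and memname = 'CLASS';
quit;
```

This is handy when the labels are needed in a program rather than in printed output, since the query result can be saved to a dataset or to macro variables.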

As you can see, the variable WEIGHT now has a label - the reason why it is useful will become clear shortly. For the purposes of this example I will now also add labels to the variables AGE and HEIGHT using the DATASETS procedure as above:

   proc datasets library=work;
     modify class;
       label height='Height (in) at Start of Study'
             age='Age (years) at Start of Study';
     contents data=class;
     run;
   quit;

Note that I did not redo the label for the variable WEIGHT in the above code, as it had already been set previously, although I could have included it in this step and even replaced it with new text.
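As a sketch of that last point, a LABEL statement for a variable that already has a label simply overwrites the old text, and assigning a blank label removes it. The new label text below is hypothetical:

```sas
proc datasets library=work nolist;
  modify class;
    *** Replace the existing label with new text;
    label weight='Weight (kg) at Baseline';
    *** A blank label removes the label from the variable;
    label age=' ';
  run;
quit;
```

The NOLIST option only suppresses the directory listing in the log; it does not affect the labels.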

Next month we will use these labels and see what we can do with them.

________________________________
Updated April 2, 2016