Return to Homepage

Goto the Tip of the Month Archive

Other interesting pages ...
LinkedIn Profile
SAS Cheat Sheet
Useful SAS Code
Full SAS Example
Basic Statistics
Contact Information

SAS Tip of the Month
August 2013
(for SAS and WPS)

Getting data into SAS from an Excel speadsheet is a task that is often carried out by the SAS Programmer. One reason is that Excel is a good package for entering and editing data and does have some audit capabilites. But how do we get the data on the spreadsheet to a SAS dataset?

Let first look at some data on a sheet called LongRivers in a file called Rivers.xls located in the directory C:\TEMP:

   NAME                               LENGTH_KMS
   Nile River                              6,695
   Amazon River                            6,516
   Yangtze River                           6,380
   Mississippi-Missouri River System       5,969
   Ob-Irtysh Rivers                        5,568
   Yenisey-Angara-Selenga Rivers           5,550
   Huang He (Yellow River)                 5,464
   Congo River                             4,667
   Rio de la Plata-Parana                  4,500
   Mekong River                            4,425

As can be seen the structure for this sheet is two columns, no merged cells, and the column names are in row 1. This is as basic as they come but is the general structure for most sheets that hold data.

We can certainly save the sheet as a CSV file and import that file, but that is an extra step -- where possible it is best to try and import the data directly using PROC IMPORT and the EXCEL option that is available inside the procedure. The basic syntax for the use of the procedure is:

   PROC IMPORT DATAFILE="filename" OUT=dataset_name DBMS=EXCEL REPLACE;
      SHEET="sheet_name$";
      GETNAMES=YES|NO;
   RUN;

where

  • filename -- directory and name for the Excel file being imported
  • dataset_name -- dataset being created from the Excel sheet data
  • sheet_name -- name of sheet being imported and must have the '$' symbol at the end (defaults to Sheet1 if not present)

The DBMS=EXCEL option tells SAS that the data coming in is from an Excel spreadsheet, while the REPLACE option is necessary if you want to overwrite an existing dataset (too many imports of data have failed because the programmer assumed any dataset of the same name would be automatically overwrittern when it is not!). The GETNAMES option is a YES or NO value and gives the choice for the user to get the names of the variables on the first row -- if YES, any blanks are converted to '_', and if NO then the variable names become F1, F2, etc.

There are other statements that can be used including MIXED, RANGE, SCANTEXT, USEDATA and SCANTIME but these are beyond the example used here -- consult the SAS User Documentation for the use of these statements.

With this knowledge, lets import the data from the Excel file:

   PROC IMPORT DATAFILE="C:\TEMP\Rivers.xls" OUT=longrivers DBMS=EXCEL REPLACE;
      SHEET="LongRivers$";
      GETNAMES=YES;
   RUN;

The data we imported is ready for use.

It is possible to do a quick check on the data by running the following:

   title1 "Listing of Ten Longest Rivers in the World";
   PROC PRINT DATA=longrivers;
   RUN;

and we get the following output:

           Listing of Ten Longest Rivers in the World

                                                  LENGTH_
      Obs    NAME                                   KMS

        1    Nile River                             6695
        2    Amazon River                           6516
        3    Yangtze River                          6380
        4    Mississippi-Missouri River System      5969
        5    Ob-Irtysh Rivers                       5568
        6    Yenisey-Angara-Selenga Rivers          5550
        7    Huang He (Yellow River)                5464
        8    Congo River                            4667
        9    Rio de la Plata-Parana                 4500
       10    Mekong River                           4425

I hope this has been helpful. See you in September!

________________________________
Updated August 2, 2013