of the Month
It is eight years since I started this so I am going to answer a question that I am asked often -- "What code do you use most often?" This month is the answer to that question.
As a programmer who deals with a lot of data, I have a macro that produces a contents output for every dataset in a directory, along with a listing of the first 25 observations. This is a program that I have had in my "utility belt" for many years.
The program (in the form of a macro) is given below:
%macro contprt(dir=       /*Directory*/
              ,maxobs=25  /*Maximum number of OBS to print*/
              );

  ** Define the LIBNAME of the data from DIR macro variable;
  libname mydata "&dir";

  ** Get a list of the datasets in the directory in question;
  proc contents data=mydata._all_ out=_mycont (keep=memname memtype) noprint;
  run;

  ** Create a unique list of the datasets, selecting only DATA types;
  proc sort data=_mycont nodupkey;
    by memname;
    where memtype='DATA';
  run;

  ** Create a list of the datasets, one macro variable for each.
     At the last iteration put the number of datasets being
     produced into a macro variable.;
  data _null_;
    set _mycont end=eof;
    call symput(compress('dsname'||put(_n_,8.)),compress(memname));
    if eof then call symput('dsname0',put(_n_,8.));
  run;

  ** Use a macro to produce CONTENTS and PRINT of each dataset,
     restricting the number of observations to that defined in
     the MAXOBS macro variable.;
  %do i=1 %to &dsname0;
    proc contents data=mydata.&&dsname&i;
      title "Contents of Dataset &&dsname&i";
    run;
    proc print data=mydata.&&dsname&i (obs=&maxobs);
      title "First &maxobs of Dataset &&dsname&i";
    run;
  %end;

%mend contprt;
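For anyone who has not called a keyword-parameter macro before, a typical call might look like the sketch below. It assumes the macro above has already been submitted in the session, and the directory shown is purely hypothetical; substitute a real path on your own system.

```sas
/* Hypothetical directory: replace with a real path on your system */

/* Use the default of 25 observations per dataset */
%contprt(dir=/projects/study01/data);

/* Override MAXOBS to print only the first 10 observations */
%contprt(dir=/projects/study01/data, maxobs=10);
```

Because MAXOBS has a default, it only needs to be supplied when 25 observations is too many or too few.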
Now it must be remembered that I have had this macro since the late 1980s (SAS version 6.04), hence the odd way of placing each dataset into a separate macro variable. PROC SQL was still some time off and character variables were restricted to 200 characters in length, but it works and has served me well.
Using what is available today, can it be reworked? You bet. The section that builds the list of datasets in a directory and stores it in macro variables can be rewritten as:
proc sql noprint;
  select distinct memname
    into :dslist separated by ' '
    from sashelp.vmember
    where libname='MYDATA' and memtype='DATA'
    order by memname;
quit;
Using this code we get a list of the datasets in a single macro variable, with each dataset name separated by a space.
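One thing worth guarding against: when the query returns no rows, PROC SQL does not assign the INTO macro variable at all, so DSLIST would keep whatever value it had before (or not exist). A minimal sketch of a safer version, assuming the MYDATA libref has already been assigned, is to initialise the variable first and then write the result to the log as a check:

```sas
/* Start empty so a stale or undefined value is never used */
%let dslist=;

proc sql noprint;
  select distinct memname
    into :dslist separated by ' '
    from sashelp.vmember
    where libname='MYDATA' and memtype='DATA'
    order by memname;
quit;

/* SQLOBS is set automatically by PROC SQL to the number of rows selected */
%put NOTE: &sqlobs dataset(s) found in MYDATA: &dslist;
```

The %PUT line is only there for checking; it can be dropped once you trust the list.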
Because we have a list of the datasets in a single macro variable, the generation of the contents and print would change to the following:
** Use a macro to produce CONTENTS and PRINT of each dataset,
   restricting the number of observations to that defined in
   the MAXOBS macro variable.;
%let i=1;
%do %while(%scan(&dslist,&i) ne );
  proc contents data=mydata.%scan(&dslist,&i);
    title "Contents of Dataset %scan(&dslist,&i)";
  run;
  proc print data=mydata.%scan(&dslist,&i) (obs=&maxobs);
    title "First &maxobs of Dataset %scan(&dslist,&i)";
  run;
  %let i=%eval(&i+1);
%end;
In this code the %SCAN function selects the name of each dataset in turn. The loop checks that the name is non-missing (a missing value means we have reached the end of the list) and then generates the CONTENTS and PRINT output. These two pieces of code are all that is needed in this version. There is yet another, more code-efficient method using CALL EXECUTE within a single DATA step, but that is for another day.
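For the curious, a rough sketch of how the CALL EXECUTE idea might look is given below. This is an illustration under stated assumptions rather than a replacement for the macro: the path and the MAXOBS value are hypothetical, and SASHELP.VMEMBER is read directly, so the datasets are processed in whatever order the view returns them.

```sas
/* Hypothetical path and limit: adjust for your own system */
libname mydata "/projects/study01/data";
%let maxobs=25;

data _null_;
  set sashelp.vmember;
  where libname='MYDATA' and memtype='DATA';

  /* Each CALL EXECUTE queues code that runs after this DATA step ends */
  call execute('proc contents data=mydata.' || strip(memname) || ';');
  call execute('title "Contents of Dataset ' || strip(memname) || '";');
  call execute('run;');

  /* &maxobs sits inside double quotes, so it resolves when the step compiles */
  call execute('proc print data=mydata.' || strip(memname) || " (obs=&maxobs);");
  call execute("title 'First &maxobs of Dataset " || strip(memname) || "';");
  call execute('run;');
run;
```

No macro definition and no list of macro variables are needed here; the DATA step simply writes one CONTENTS and one PRINT step per dataset it reads.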
Yes, SAS has advanced over the years, and yes there are many ways to do the same task, but I am sticking to my old macro -- it is tried and trusted.
I hope this is useful. See you in February.
Updated January 11, 2011