of the Month
How do we create a list of datasets in macro variable, and why would be need to do that?
The second part of the question is easier to answer than the first part so I will answer this first -- we sometimes want to do an action on them, for example print out the first 25 observations of all datasets.
Now the first part of the question -- how do we create a list of datasets in a macro variable?
There are a number of ways to get a list of datasets in a library, the most common way being the use of the PROC CONTENTS, but another way is using what SAS already knows about our data in the SASHELP library, specifically the SASHELP.VMEMBER view -- this view contains a lot of information but the fields we are interested in are LIBNAME (library reference name), MEMNAME (name of dataset) and MEMTYPE (we only interested in datasets so am looking at the value 'DATA').
Lets first create the list of datasets and put that into a macro variable DSLIST using a datastep:
data _null_; length _txt $32000; retain _txt; set sashelp.vmember end=eof; where libname='A2012'; _txt=catx(' ',_txt,memname); if eof then call symput('DSLIST',strip(_txt)); run;
In the example above, the LENGTH statement sets the text field containing list of datasets; the RETAIN statement retains the list of dataset that are built across iterations of the datastep; in the SET statement, the variable EOF will equal 1 if end of the datastep is reached; the WHERE statement selects only datasets in the library A2012; the CATX function contactenates the dataset name found to the list of datasets; and finally when EOF=1, i.e. the last observation is found, the CALL SYMPUT function outputs the list of datasets to a macro variable DSLIST.
Using PROC SQL the code is slightly different, as shown below:
proc sql noprint; select distinct memname into :dslist separated by ' ' from sashelp.vcolumn where libname='A2012'; quit; run;
In the code above, the NOPRINT statement makes sure the SQL does not print out any output; the SELECT statement selects our data; the INTO statement puts the list into a macro variable DSLIST separated by a single space; from the dataset VCOLUMN in the libary SASHELP where the LIBNAME has the value A2012. The SQL code is certainly more readable from a readability point of view.
Now that we have created our list of datasets, the next thing is to do something with them -- enter the macro DO loop processing. I prefer using a DO WHILE loop as I usually do not know how many values are in the macro variable, as shown below:
%macro dotask; %let i=1; *initialize internal counter; %do %while(%scan(&dslist,&i) ne ); proc print data=A2012.%scan(&dslist,&i) (obs=25); title1 "Listing of first 25 observation in dataset %scan(&dslist,&i)"; run; %let %eval(&i+1); *Increment internal counter; %end; %mend dotask; %dotask;
We can acutally know the number of values in a macro variable two ways, one being changing the datastep code slightly, as shown below:
data _null_; length _txt $32000; retain _txt; set sashelp.vmember end=eof; where libname='A2012'; _txt=catx(' ',_txt,memname); if eof then do; call symput('DSLIST',strip(_txt)); call symput('DSNOBS',compress(put(_n_,8.))); *Number of obs passed as a macro variable; end; run;
or a method I introduced in August of 2011, the %SYSFUNC and COUNTW method:
the DOTASK macro can be amended to
%macro dotask; %do i=1 %to &dsnobs; proc print data=A2012.%scan(&dslist,&i) (obs=25); title1 "Listing of first 25 observation in dataset %scan(&dslist,&i)"; run; %end; %mend dotask; %dotask;
Look at the differences and try this out on your own computer.
See you next month.
Updated June 7, 2012