of the Month
"I have a dataset that I want to break apart by region, and I want the dataset created to have the name of that region -- how do I do it?" That was a question someone asked me at PharmaSUG last year. A very interesting question.
Lets look at some data first:
data _AllPets; infile cards; input pet $ 1-3 name $4-12; cards; Cat Pacha Cat Muschi Dog Mia Dog Dixie Dog Princess Cat Archie Dog Bella Cat Princess Cat Boots Dog Gus Cat Milly Dog Bailey Cat Max Dog Roxie Dog Cody Dog Boomer Cat Sooty Cat Shadow Dog Tucker Cat Fraidy ; run;
In this example we have a dataset with a name (variable NAME) and whether they are are Cat or a Dog -- what we want to do is take this dataset and create two datasets, one called CAT and the other DOG. It is easy to see from the data that this could be done in one datastep as shown below:
data cat dog; set _AllPets; if pet='Cat' then output cat; else if pet='Dog' then output dog; run;
But what if this was a very large dataset containing a hundred PET values with a thousand NAME values -- not so easy. So lets look at something a little more general. Lets first create a macro variable with the distinct PET values:
proc sql noprint; select distinct pet into :petval separated by " " from _AllPets; quit; run;
Running this code creates a macro variable with the distinct PET values, separated by a space. Now we are going to create a macro that will use each value of PET in this macro variable PETVAL and create a dataset with the same name:
%macro breakup; *Initialize macro counter; %let i=1; *A DO loop to actually do the breakup -- read each value of PET contained in the macro variable PETVAL until no more; %do %while(%scan(&petval,&i) ne ); *For each value of PET create a dataset with that VALUE, dropping PET from the outgoing dataset; data %scan(&petval,&i); set _AllPets; where pet="%scan(&petval,&i)"; drop pet; run; *Increment the DO LOOP counter; %let i=%eval(&i+1); %end; %mend breakup;
Running this macro will read each value from the macro variable PETVAL, create a dataset with that value, and use a WHERE statement to do the selection of the records from _ALLPETS. Using a PRINT procedure call on each the two datasets created, the following data is present:
List of Cat Names Obs name 1 Pacha 2 Muschi 3 Archie 4 Princess 5 Boots 6 Milly 7 Max 8 Sooty 9 Shadow 10 Fraidy List of Dog Names Obs name 1 Mia 2 Dixie 3 Princess 4 Bella 5 Gus 6 Bailey 7 Roxie 8 Cody 9 Boomer 10 Tucker
Take this code and try it out and see what it does. Remember to have the MPRINT, MLOGIC and SYMBOLGEN options on.
See you in Feburary.
Updated January 8, 2012