of the Month
If you have a large number of variables in a dataset, you can reference ranges of variables without writing out the long list. Let's look at some data and use it as an example:
data _games0;
  attrib year     length=8   label='Year'
         city     length=$50 label='City'
         nations  length=8   label='Number of Nations'
         athletes length=8   label='Number of Athletes';
  infile cards;
  /* city uses column input; columns 9-32 are wide enough for the longest city name */
  input year city $ 9-32 nations athletes;
  cards;
1896    Athens, Greece           14  241
1900    Paris, France            24  997
1904    St. Louis, United States 12  651
1908    London, Great Britain    22 2008
;
run;

title1 "Listing of Olympic Games, 1896-1908";
footnote1 "Source: Wikipedia, Accessed 2012-05-01";
run;
Now let's run PROC CONTENTS and see what we get (the output is abridged to save space):
Listing of Olympic Games, 1896-1908

The CONTENTS Procedure

Data Set Name        WORK._GAMES0                         Observations          4
Member Type          DATA                                 Variables             4
Engine               V9                                   Indexes               0
Created              Tuesday, May 01, 2012 09:04:12 AM    Observation Length    80
Last Modified        Tuesday, May 01, 2012 09:04:12 AM    Deleted Observations  0
Protection                                                Compressed            NO
Data Set Type                                             Sorted                NO
Label
Data Representation  WINDOWS_32
Encoding             wlatin1  Western (Windows)

Variables in Creation Order

#    Variable    Type    Len    Label
1    year        Num       8    Year
2    city        Char     50    City
3    nations     Num       8    Number of Nations
4    athletes    Num       8    Number of Athletes

Source: Wikipedia, Accessed 2012-05-01
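The call that produced this listing is not shown. Since the VARNUM option was used, it would presumably look something like the sketch below; the exact statement is an assumption, and only the dataset name and the VARNUM option come from the article.

```sas
/* Assumed reconstruction of the PROC CONTENTS call.
   VARNUM lists the variables in creation (position) order
   rather than alphabetically by name. */
proc contents data=_games0 varnum;
run;
```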
Note that I used the VARNUM option in the PROC CONTENTS call, as this lists the variables in the order in which they are stored in the dataset. Had we not used the VARNUM option, the variables would have been listed alphabetically by name.
The thing to look at here is the order of the variables: first YEAR, then CITY, followed by NATIONS and ATHLETES.
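Now let's do a simple PROC PRINT, where the variables to print run from YEAR to ATHLETES. The double hyphen in year--athletes means "every variable from YEAR through ATHLETES, in dataset order". (This is the same as doing a PROC PRINT without the VAR statement, but is used here for illustrative reasons.)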
proc print data=_games0 label noobs;
  var year--athletes;
run;
This produces the following output:
Listing of Olympic Games, 1896-1908

                                     Number      Number
                                         of          of
Year    City                        Nations    Athletes
1896    Athens, Greece                   14         241
1900    Paris, France                    24         997
1904    St. Louis, United States         12         651
1908    London, Great Britain            22        2008

Source: Wikipedia, Accessed 2012-05-01
Now let's say we want to print CITY, then YEAR, followed by NATIONS and ATHLETES:
proc print data=_games0 label noobs;
  var city--year nations athletes;
run;
When you run this, an error similar to the following is displayed:
ERROR: Starting variable after ending variable in data set.
Why is SAS/WPS doing this? The simple answer is that when you use a range of variables, the procedure or DATA step resolves the range using the order of the variables within the dataset. In our example, CITY comes after YEAR in the dataset, so the range city--year runs in the reverse of that order and cannot be resolved.
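One way around the error, sketched here against the same _games0 dataset, is to name the out-of-order variables individually and keep ranges only where they follow dataset order. Both steps below are illustrations rather than the only possible fix.

```sas
/* Option 1: list every variable explicitly, in the print order wanted */
proc print data=_games0 label noobs;
  var city year nations athletes;
run;

/* Option 2: name CITY and YEAR singly, and use a range only for the
   part that matches dataset order (NATIONS comes before ATHLETES) */
proc print data=_games0 label noobs;
  var city year nations--athletes;
run;
```

With only four variables the range saves little typing, but the same pattern applies when the tail of the list covers dozens of adjacent variables.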
This only scratches the surface of using variable ranges in your programming. Maybe sometime in the future I will expand on it. See you next month.
Updated May 1, 2012