SAS Tip of the Month
June 2010

While at PharmaSUG this year in Orlando, a person asked me for a way that SAS could generate a random selection of integers from 1 to 10 inclusive. A quick search of the web turned up nothing, so I thought I would pass on this solution this month.

Just as a note, I am not going to discuss the randomness of the numbers that SAS produces, as a great deal has already been written on that subject.

For the solution I am going to show the code first, in the form of a macro that can be used as a general solution:

	%macro randint(dsout=rnum   /*Output dataset name*/
	              ,min=1        /*Starting Integer*/
	              ,max=10       /*Ending Integer*/
	              ,number=100   /*Number of Numbers to Generate*/
	              ,seed=0       /*Random Seed*/
	              ,rndvr=y      /*Variable containing random number*/
	              );
	   data &dsout;
	      do __i=1 to &number;
	         &rndvr=floor(&min+((&max+1)-&min)*ranuni(&seed));
	         output;
	      end;
	      drop __i;
	   run;
	%mend randint;

This may look daunting, but let's walk through the code, starting from the DATA statement, and see what it does.

The DATA statement opens a dataset with the name supplied by the DSOUT parameter -- default is rnum.

The DO statement on the next line is just a counter: the DATA step generates as many random numbers as the NUMBER parameter specifies -- default is 100.

The next line is where most of the work is done and is the hardest to understand. In this macro the RANUNI function is used to generate a number between 0 and 1, but not 0 or 1 -- the seed is a number that SAS needs to start the generation of a number (refer to the documentation on the RANUNI function for details). The number from this function call is multiplied by the width of the range, (max+1)-min, and added to min, which produces a number between min and max+1. The FLOOR function then truncates that number to an integer.
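To make the three stages of that line visible, here is a minimal sketch that keeps each intermediate value in its own variable. The dataset name steps, the variable names u and scaled, and the seed value 2010 are my own choices for illustration and are not part of the macro.

```sas
data steps;
   min=1;
   max=10;
   do __i=1 to 5;
      u=ranuni(2010);               /* uniform value between 0 and 1 */
      scaled=min+((max+1)-min)*u;   /* stretched onto the range min to max+1 */
      y=floor(scaled);              /* integer part: min, min+1, ..., max */
      output;
   end;
   drop __i;
run;

proc print data=steps;
run;
```

Printing the dataset shows, row by row, how each value of u is stretched and then truncated.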

Why the &max+1 and FLOOR? As discussed earlier, the number generated by the RANUNI function is between 0 and 1, but not including 0 and 1. Let's say min is 1 and max is 10; then without the '+1' the largest value would be 9.9999..., not 10, and FLOOR would never return 10. If we used ROUND instead, then '1' would be selected only from values between 1 and 1.5, and '10' only from values between 9.5 and 9.9999..., while every other integer x would be selected from the full range between x-0.5 and x+0.5, so the two end values would turn up about half as often as the rest. To make the proportions as equal as possible, &max+1 is used so the largest value is 10.9999..., and the FLOOR function then gives each integer from 1 to 10 a range of the same width.
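The difference between the two approaches can be checked with a quick simulation. This is only a sketch: the dataset name compare, the variable names y_floor and y_round, the seed 12345 and the sample size are all assumptions chosen for illustration.

```sas
data compare;
   do __i=1 to 100000;
      u=ranuni(12345);                 /* uniform value between 0 and 1 */
      y_floor=floor(1+((10+1)-1)*u);   /* the macro's formula with min=1, max=10 */
      y_round=round(1+(10-1)*u);       /* rounding alternative, without the +1 */
      output;
   end;
   drop __i u;
run;

proc freq data=compare;
   tables y_floor y_round;
run;
```

In the frequency tables, y_floor should show each integer at close to 10 percent. For y_round, the values 1 and 10 should each come in near 5.6 percent (0.5/9), with 2 through 9 near 11.1 percent (1/9) each.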

In these lines, the starting integer (min), ending integer (max), number of numbers to generate (number), random number seed (seed) and output variable name (rndvr) are all supplied by macro variables.

The OUTPUT statement just writes the record, END ends the DO loop, and the DROP statement drops the temporary variable __i that is used as the counter.

An example of a call where we want 1000 numbers between 1 and 100, in variable rnd, written to a dataset called RANDOM would be:

   %randint(dsout=random,min=1,max=100,number=1000,rndvr=rnd);
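A call like this can be followed by a quick check of the result. The sketch below assumes the %randint macro above has already been compiled; the dataset name dice, the variable name roll and the seed value are only illustrative choices. A positive seed should give the same stream of numbers on every run, while the default seed of 0 takes its starting value from the system clock.

```sas
/* Simulate 600 rolls of a six-sided die with a fixed seed */
%randint(dsout=dice,min=1,max=6,number=600,seed=20100611,rndvr=roll);

/* Each face should appear roughly 100 times */
proc freq data=dice;
   tables roll;
run;
```

If any value outside 1 to 6 shows up in the table, or one face is badly under-represented, that would point to a problem with the min and max arguments rather than with RANUNI itself.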

I hope this is useful. See you in July.

________________________________
Updated June 11, 2010