
SAS Tip of the Month
June 2007

To many, the PROC MEANS and PROC SUMMARY SAS procedures are the same. There are, however, two differences.

The first difference is that, by default, the SUMMARY procedure prints no output to the output (listing) file, while the MEANS procedure does. The option that controls this is PRINT/NOPRINT, so it is possible to print the output from the SUMMARY procedure and to suppress the output from the MEANS procedure.
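As a minimal sketch of reversing the two defaults, the steps below assume that the SASHELP.CLASS sample dataset shipped with SAS is available. The output dataset name HT_STATS and the variable name MEAN_HEIGHT are hypothetical names chosen for illustration.

```sas
/* SUMMARY prints nothing by default; PRINT sends the results to the listing */
proc summary data=sashelp.class print;
    var height;
run;

/* MEANS prints by default; NOPRINT suppresses the listing, so the
   results are only written to the output dataset (hypothetical name) */
proc means data=sashelp.class noprint;
    var height;
    output out=ht_stats mean=mean_height;
run;
```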

The second difference is not widely known, but it is a useful one. When the VAR statement is omitted from the MEANS procedure, the analysis is carried out on all numeric variables, as shown in the following example (output below):

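    /* Small example dataset: two numeric and two character variables */
    data vitals;
        infile cards;
        input patid $3. heart_rate
              temperature trtcd $1.;
    cards;
    001 72 35.8 A
    002 80 36.4 B
    003 99 36.6 A
    ;
    run;

    /* No VAR statement: MEANS analyses all numeric variables */
    proc means data=vitals nway;
        class trtcd;
    run;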

                                 The SAS System
                               The MEANS Procedure

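    trtcd  Obs  Variable     N        Mean     Std Dev     Minimum     Maximum
    A        2  heart_rate   2  85.5000000  19.0918831  72.0000000  99.0000000
                temperature  2  36.2000000   0.5656854  35.8000000  36.6000000
    B        1  heart_rate   1  80.0000000           .  80.0000000  80.0000000
                temperature  1  36.4000000           .  36.4000000  36.4000000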

However, what happens when the SUMMARY procedure is used instead on the same data?

    proc summary data=vitals nway print;
        class trtcd;
    run;

          The SAS System
       The SUMMARY Procedure

          trtcd    Obs
          A          2
          B          1

Notice that the only result from the SUMMARY procedure is the number of observations in each treatment group, similar to what the FREQ procedure produces. The same result is obtained if the only variables in the VITALS dataset are PATID and TRTCD, that is, only the character variables.
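For comparison, here is a small sketch of the two alternatives this behaviour sits between. It rebuilds the VITALS dataset from the example data so it can be run on its own. The FREQ step is one way to get the same counts, and adding a VAR statement with _NUMERIC_ should make SUMMARY analyse the numeric variables as MEANS does by default.

```sas
data vitals;
    infile cards;
    input patid $3. heart_rate
          temperature trtcd $1.;
cards;
001 72 35.8 A
002 80 36.4 B
003 99 36.6 A
;
run;

/* Frequency counts per treatment group from the FREQ procedure */
proc freq data=vitals;
    tables trtcd / nocum nopercent;
run;

/* With an explicit VAR statement, SUMMARY reports descriptive
   statistics for heart_rate and temperature by treatment group */
proc summary data=vitals nway print;
    class trtcd;
    var _numeric_;
run;
```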

So what does this finding mean? In most cases, SAS programmers use the FREQ procedure for frequency counts and the MEANS procedure for summary statistics. However, these two sets of statistics, the ones most commonly requested, can be produced by one procedure. A possible macro for calculating both from one procedure is given below:

    %macro summstat(dsin=,
                      /*Input file - REQUIRED*/
                    dsout=,
                      /*Results file - REQUIRED*/
                    classvar=,
                      /*Class Variable(s) - REQUIRED*/
                    vvars=,
                      /*Analysis Variables, only if
                       descriptive statistics
                       requested. Can use one or more
                       numeric variable names or
                       _NUMERIC_ for all numeric
                       variables.  AUTONAME option will
                       set variable names in output
                       dataset.*/
                    outprt=
                      /*Output to listings file.
                       Values: PRINT|NOPRINT*/
                   );
        proc summary data=&dsin nway &outprt;
            class &classvar;
            %if (&vvars ne ) %then %do;
              /* Analysis variables given: descriptive statistics */
              var &vvars;
              output out=&dsout (drop=_type_ _freq_)
                     n= mean= std= median= min= max= / autoname;
            %end;
            %else %do;
                /* No analysis variables: counts only (in _FREQ_) */
                output out=&dsout;
            %end;
        run;
    %mend summstat;
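The macro might be called as in the sketch below. It assumes that the macro above has already been compiled with the keyword parameters DSIN, DSOUT, CLASSVAR, VVARS and OUTPRT (the names referenced in its body), and that the VITALS dataset from the earlier example exists. The output dataset names COUNTS and STATS are hypothetical.

```sas
/* Frequency counts only: no analysis variables supplied,
   so the output dataset holds the group counts in _FREQ_ */
%summstat(dsin=vitals, dsout=counts, classvar=trtcd, outprt=print);

/* Descriptive statistics for all numeric variables; with AUTONAME
   the output variables get names such as heart_rate_Mean */
%summstat(dsin=vitals, dsout=stats, classvar=trtcd,
          vvars=_numeric_, outprt=noprint);
```

Leaving VVARS blank is what switches the macro between the two kinds of output, so one call signature covers both the FREQ-style and the MEANS-style request.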

If anyone is going to PharmaSUG in Denver, I hope to see you there.

Updated June 1, 2007