Return to Homepage

Goto the Tip of the Month Archive

Other interesting pages ...
LinkedIn Profile
SAS Cheat Sheet
Useful SAS Code
Full SAS Example
Basic Statistics
Contact Information

SAS Tip of the Month
November 2011
(for SAS and WPS)

Last month I showed how to capture results of a SAS procedure to a dataset using ODS -- this month I am capturing results from a SAS procedure again but this time doing it a way that was pre-ODS, that is before ODS came into existance.

It was mentioned that many SAS procedures have ways of getting the dataset using an OUT option or similar that many procedures have. Lets see this in action when I capture the Quantiles from PROC UNIVARIATE using the same data:

   data River_US (label='Rivers in US over 1000 miles in Length');
      attrib name length=$40 label='River Name'
             distance length=8 label='Distance (miles)';
      input Name $ & distance @@;
   cards;
   Mississippi  2340
   Columbia  1240
   Colorado  1450
   St. Lawrence  1900
   Missouri  2540
   Ohio  1310
   Rio Grande  1900
   Brazos  1280
   Snake 1040
   Atchafalaya  1420
   Yukon  1980
   Red  1290
   Arkansas  1460
   ;
   run;

   proc univariate data=River_US;
      var distance;
      output out=_Q0
        MEDIAN=MEDIAN P1=P1 P5=P5 P10=P10 P90=P90 P95=P95
        P99=P99 Q1=Q1 Q3=Q3 MIN=MIN MAX=MAX;
   proc print data=_Q0;
      title1 "Listing of Quantiles Output from Univariate Procedure";
   run;

   === OUTPUT ===
                         Listing of Quantiles Output from Univariate Procedure

     Obs     MAX     P99     P95     P90     Q3     MEDIAN     Q1      P10     P5      P1      MIN

      1     2540    2540    2540    2540    2120     1675     1295    1240    1240    1240    1240

This method is very good. In most cases SAS Procedures that calculate statistics will capture all the output in a SAS dataset, however there are some output where this is not available in a SAS dataset, and may not be captured though ODS. An example of this is in the PROC UNIVARIATE output, Extreme Observations section. So how do we capture the information that a SAS procedure puts out on a listing but is not avaiable in a SAS dataset form?

An old trick that used to be used was to output the listing to a text file using PROC PRINTTO and then reading that file back in capturing the information we wanted and saving that as a SAS dataset. The following example shows how to capture the Quantiles and Extreme Observations output from PROC UNIVARIATE using this technique:

   *Capture the output using PROC PRINTTO;
   proc printto file='c:\temp\UnivariateOutput.txt' new;
   proc univariate data=River_US;
      var distance;
   proc printto;
   run;

   *Load in the file produced by PROC PRINTTO and capture required results;
   data _Q1; *Quantiles from PROC UNIVARIATE using PROC PRINTTO;
      retain QuantileFlg 0; *Indication if in Quartiles section;
      attrib VarName length=$8
             Quantile length=$10
             Estimate length=8 format=5.;
      infile 'c:\temp\UnivariateOutput.txt' length=len;
      input _txt $varying256. len;

      keep VarName Quantile Estimate;

      *Capture results;
      if QuantileFlg=2 then do; *In Quantiles Section with results;
         VarName='distance';
         if lengthn(strip(_txt))>0 then do;
            Quantile=substr(strip(_txt),1,10);
            Estimate=input(substr(strip(_txt),11),best.);
            output;
         end;
      end;

      *Flag which section we are in;
      if strip(_txt)=:'Quantiles'
        then QuantileFlg=1; *In Section;
      else if strip(_txt)=:'Quantile' and QuantileFlg=1
        then QuantileFlg=2; *In results;
      else if strip(_txt)=:'0% Min'
        then QuantileFlg=0; *End of Section, no more results;
   run;

   data _E0; *Extremes from PROC UNIVARIATE using PROC PRINTTO;
      retain ExtremeFlg 0; *Indication if in Extremes section;
      attrib LowValue length=8
             LowFreq length=8
             HighValue length=8
             HighFreq length=8;
      infile 'c:\temp\UnivariateOutput.txt' length=len;
      input _txt $varying256. len;

      keep LowValue LowFreq HighValue HighFreq;

      if strip(_txt)=:'Missing Values' and ExtremeFlg=2
        then ExtremeFlg=0; *End of Section, no more results;

      if ExtremeFlg=2 then do; *In Extremes Section with results;
         if lengthn(strip(_txt))>0 then do;
            LowValue=input(scan(strip(_txt),1),best.);
            LowFreq=input(scan(strip(_txt),2),best.);
            HighValue=input(scan(strip(_txt),3),best.);
            HighFreq=input(scan(strip(_txt),4),best.);
            output;
         end;
      end;

      if strip(_txt)=:'Extreme Observations'
        then ExtremeFlg=1; *In Section;
      else if strip(_txt)=:'Value' and ExtremeFlg=1
        then ExtremeFlg=2; *In results;
   run;

   proc print data=_Q1;
      title1 "Listing of Quantiles Output using PROC PRINTTO from Univariate Procedure";
   run;

   proc print data=_E0;
      title1 "Listing of Extremes Output using PROC PRINTTO from Univariate Procedure";
   run;


   === OUTPUT ===

   Listing of Quantiles Output using PROC PRINTTO from Univariate Procedure

                  Obs    VarName     Quantile      Estimate

                    1    distance    100% Max        2540
                    2    distance    99%             2540
                    3    distance    95%             2540
                    4    distance    90%             2540
                    5    distance    75% Q3          2120
                    6    distance    50% Median      1675
                    7    distance    25% Q1          1295
                    8    distance    10%             1240
                    9    distance    5%              1240
                   10    distance    1%              1240
                   11    distance    0% Min          1240


   Listing of Extremes Output using PROC PRINTTO from Univariate Procedure

                            Low      Low     High    High
                    Obs    Value    Freq    Value    Freq

                     1      1240      2      1450      3
                     2      1280      8      1900      4
                     3      1310      6      1900      7
                     4      1450      3      2340      1
                     5      1900      7      2540      5

Note that it is necessary to have a knowledge of the output text file structure when bringing in the results of a SAS procedure -- the output created here uses the default output format.

Capturing the results using the PROC PRINTTO technique is more work than using ODS or OUT options from the SAS procedures that create statistcs, but is still useful to know. Copy the code and try it on your SAS or WPS installation -- change a few things and see what happens.

I hope this is useful. See you in December.

________________________________
Updated November 8, 2011