SAS Tip of the Month
September 2008

Continuing from last month's tip: converting between numeric and character values is a common source of trouble. Let's take a look at the following example:

   277 data TestDates;
   278    infile cards;
   279    input SubjectID $ 1-6 TestDate $ 8-15
   280          @17 FirstTreatmentDate date9.;
   281    put (_all_) (=);
   282 cards;

   SubjectID=000101 TestDate=06152008 FirstTreatmentDate=17198
   SubjectID=000102 TestDate=08082008 FirstTreatmentDate=17332
   NOTE: The data set WORK.TESTDATES has 2 observations and 3 variables.

   285 ;
   286 run;
   287 data TestDatesModified;
   288    attrib TestDateN length=8 format=date9.;
   289    set TestDates;
   290    TestDateN = TestDate; ** Convert TestDate to Numeric;
   291    DaysSinceFirstTreatment = TestDateN - FirstTreatmentDate;
   292    put (_all_) (=);
   293 run;

   NOTE: Character values have been converted to numeric
         values at the places given by: (Line):(Column).
         290:16
   TestDateN=21AUG**** SubjectID=000101 TestDate=06152008 FirstTreatmentDate=17198
   DaysSinceFirstTreatment=6134810
   TestDateN=********* SubjectID=000102 TestDate=08082008 FirstTreatmentDate=17332
   DaysSinceFirstTreatment=8064676
   NOTE: There were 2 observations read from the data set WORK.TESTDATES.
   NOTE: The data set WORK.TESTDATESMODIFIED has 2 observations and 5 variables.

As the example shows, simply assigning the character variable TestDate to the numeric variable TestDateN produced nonsense values: SAS read the text 06152008 as the number 6152008 rather than as a date. The log note above ("Character values have been converted to numeric values") is the clue that an implicit conversion took place. To fix this, use the INPUT function with an informat, as the following example shows:

   294 data TestDates;
   295    infile cards;
   296    input SubjectID $ 1-6 TestDate $ 8-15
   297          @17 FirstTreatmentDate date9.;
   298    put (_all_) (=);
   299 cards;

   SubjectID=000101 TestDate=06152008 FirstTreatmentDate=17198
   SubjectID=000102 TestDate=08082008 FirstTreatmentDate=17332
   NOTE: The data set WORK.TESTDATES has 2 observations and 3 variables.

   302 ;
   303 run;
   304 data TestDatesModified;
   305    attrib TestDateN length=8 format=date9.;
   306    set TestDates;
   307    TestDateN = input(TestDate,mmddyy8.); ** Convert TestDate to Numeric;
   308    DaysSinceFirstTreatment = TestDateN - FirstTreatmentDate;
   309    put (_all_) (=);
   310 run;

   TestDateN=15JUN2008 SubjectID=000101 TestDate=06152008 FirstTreatmentDate=17198
   DaysSinceFirstTreatment=500
   TestDateN=08AUG2008 SubjectID=000102 TestDate=08082008 FirstTreatmentDate=17332
   DaysSinceFirstTreatment=420
   NOTE: There were 2 observations read from the data set WORK.TESTDATES.
   NOTE: The data set WORK.TESTDATESMODIFIED has 2 observations and 5 variables.
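
To see why the first attempt went wrong, it can help to read the same text with two different informats. The sketch below is only an illustration: the variable names AsNumber and AsDate are made up for this example, and the plain 8. informat is assumed to stand in for what the implicit conversion does with an 8-character value.

```sas
data _null_;
   TestDate = '06152008';
   /* Roughly what the implicit conversion does: read the digits as a plain number */
   AsNumber = input(TestDate, 8.);
   /* Read the same text as month, day, year */
   AsDate   = input(TestDate, mmddyy8.);
   /* AsDate is printed twice: as the raw day count, then formatted as a date */
   put AsNumber= AsDate= AsDate= date9.;
run;
```

AsNumber should come out as 6152008, which SAS then treats as a count of days since 01JAN1960, far beyond what date9. can display. AsDate should come out as 17698, which is 15JUN2008.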

To avoid problems with character-to-numeric conversion, always use the INPUT function with an appropriate informat to read the character value into a numeric variable.
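
When the character values may not all be valid dates, one possible refinement is to add the ?? modifier to the INPUT function and check for a missing result. The following is a minimal sketch, not part of the original example: the first two data lines are reconstructed from the date values shown in the log above, the third record (subject 000103) is a hypothetical bad value added for illustration, and the data set name TestDatesChecked is likewise made up.

```sas
/* Test data: the third record holds an invalid date on purpose (hypothetical) */
data TestDates;
   infile cards;
   input SubjectID $ 1-6 TestDate $ 8-15
         @17 FirstTreatmentDate date9.;
cards;
000101 06152008 01FEB2007
000102 08082008 15JUN2007
000103 13452008 15JUN2007
;
run;

data TestDatesChecked;
   attrib TestDateN length=8 format=date9.;
   set TestDates;
   /* ?? suppresses the invalid-data messages and leaves _ERROR_ unset */
   TestDateN = input(TestDate, ?? mmddyy8.);
   if missing(TestDateN) then
      put 'Unreadable TestDate: ' SubjectID= TestDate=;
   else
      DaysSinceFirstTreatment = TestDateN - FirstTreatmentDate;
run;
```

With this approach a bad value gives a missing TestDateN and a single log line of your own choosing, instead of an invalid-data note and a full variable dump for every failing record.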

________________________________
Updated September 1, 2008