SAS Tip of the Month
December 2003

The way a statistic is calculated may be more important than the result it produces. An example came up recently when statistics produced with SAS were being checked in Excel, specifically the first and third quartiles, also known as the 25th and 75th percentiles.

For the data 1, 2, 3, 4, 5, 6, 7 and 8 the following results are calculated for the 25th percentile:

 SAS Method 5 (default) = 2.5
 SAS Method 4 = 2.25
 Excel = 2.75

Why the difference? There is no single standard for calculating a percentile; the right method depends on what the statistician is looking for. SAS has five methods of calculating a percentile (PCTLDEF=1 through 5), the two most common being:

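 Method 5: y = (x_j + x_{j+1})/2 if g = 0, or y = x_{j+1} if g > 0, where n*p = j + g
 Method 4: y = (1-g)*x_j + g*x_{j+1}, where (n+1)*p = j + g and x_{n+1} is taken to be x_n

Here x_1, ..., x_n are the n data values in ascending order, p is the percentile expressed as a fraction (0.25 for the 25th percentile), j is the integer part of the product and g is its fractional part.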

Excel, S-Plus and StarOffice Calc, by comparison, use a different method, specifically:

 y = (1-g)*x_{j+1} + g*x_{j+2}, where (n-1)*p = j + g, and both x_{n+1} and x_{n+2} are taken to be x_n
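
The three formulas can be checked against the example data with a short script. The following is a minimal sketch in Python, not code from SAS or Excel. The function names (sas_method5, sas_method4, excel_percentile) and the helpers are made up for illustration. The percentile is passed as a whole number (25 for the 25th percentile) so that j and g can be found with exact integer arithmetic, and clamping an index below 1 to x_1 is an assumption that the formulas above do not address.

```python
def _split(n_eff, pct):
    """Return (j, g) such that n_eff * pct/100 = j + g, with 0 <= g < 1."""
    j, remainder = divmod(n_eff * pct, 100)
    return j, remainder / 100


def _x(xs, k):
    """Return x_k (1-based) from sorted data.

    Indices above n are taken to be x_n, as in the formulas.
    Indices below 1 are clamped to x_1 (an assumption for illustration).
    """
    return xs[max(1, min(k, len(xs))) - 1]


def sas_method5(data, pct):
    """SAS Method 5: average two values if g = 0, else take x_{j+1}."""
    xs = sorted(data)
    j, g = _split(len(xs), pct)
    if g == 0:
        return (_x(xs, j) + _x(xs, j + 1)) / 2
    return _x(xs, j + 1)


def sas_method4(data, pct):
    """SAS Method 4: linear interpolation based on (n+1)*p."""
    xs = sorted(data)
    j, g = _split(len(xs) + 1, pct)
    return (1 - g) * _x(xs, j) + g * _x(xs, j + 1)


def excel_percentile(data, pct):
    """Excel-style method: linear interpolation based on (n-1)*p."""
    xs = sorted(data)
    j, g = _split(len(xs) - 1, pct)
    return (1 - g) * _x(xs, j + 1) + g * _x(xs, j + 2)


data = [1, 2, 3, 4, 5, 6, 7, 8]
print(sas_method5(data, 25))       # 2.5
print(sas_method4(data, 25))       # 2.25
print(excel_percentile(data, 25))  # 2.75
```

For the 25th percentile of the eight values, Method 5 gives n*p = 2, so j = 2 and g = 0, and the result is the average of the second and third values. Method 4 gives (n+1)*p = 2.25 and the Excel method gives (n-1)*p = 1.75, so both interpolate between the second and third values but with different weights.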

Among the major statistical software packages only Excel, S-Plus and StarOffice Calc use this method. Minitab and SPSS use SAS Method 4 for their calculation.

The moral of this example is that you should know how your software calculates a statistic before blindly reporting the result.
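
As one illustration of checking the method before reporting, some libraries let you choose the calculation explicitly. Python is not one of the packages discussed in this tip and is used here only as an example. Its standard library function statistics.quantiles (Python 3.8 and later) takes a method argument, and for the example data its "exclusive" setting matches the SAS Method 4 quartile while its "inclusive" setting matches the Excel quartile.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8]

# "exclusive" (the default) interpolates on (n+1)*p.
exclusive = quantiles(data, n=4, method="exclusive")

# "inclusive" interpolates on (n-1)*p.
inclusive = quantiles(data, n=4, method="inclusive")

print(exclusive)  # [2.25, 4.5, 6.75]
print(inclusive)  # [2.75, 4.5, 6.25]
```

The same call with a different method argument moves the first quartile from 2.25 to 2.75, so the setting needs to be stated alongside the result.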

________________________________
Updated December 15, 2003