SAS Tip of the Month
September 2010
A Quick Look at WPS version 2.5

I have been wrestling with this tip for the past week or two. I am a SAS programmer by profession and have been using SAS for nearly 25 years, yet I am about to talk about a possible competitor to SAS software. My mind was made up when a close colleague asked whether my page was dedicated to 'SAS' or to 'SAS Institute Inc.', a bit like asking whether a page was dedicated to 'C' or to 'Borland C'. If I did not comment on this, I would be going against the original aim of this page, which was to write about items that may be of interest to the SAS programmer.

Last month World Programming Ltd., a small UK-based company, released version 2.5 of their WPS software. You may be asking why this news deserves comment on this page. Well, WPS is touted as a 'SAS clone' and is gaining acceptance as a SAS alternative, a bit like Open Office or Lotus SmartSuite are to Microsoft Office.

So what is WPS? It is an interpreter of the SAS language: it reads the text in a SAS program file (a plain text file with a 'SAS' extension) and interprets it. This is exactly what the SAS software does.

It must be noted from the outset that WPS is not a copy of SAS and does not use the SAS source code. Instead it reads a large proportion of the SAS syntax and interprets it in a similar way, producing output that looks much as it would if the program had been run in SAS. Think of this as being like the C++ compilers from Borland, Microsoft or Sun: if you stick to basic C++ code you will be able to create an application using any of the three, but when you come to the more complicated code there are slight differences.

WPS has its own IDE for running interactively, known as WPS Workbench, that is available for the Windows environment. WPS has its own native dataset format called WPD, but it can read and write SAS version 7, 8 and 9 datasets in uncompressed format as well as read and write XPORT transport format files. There is also the capability of using the CPORT and CIMPORT procedures to transfer data files. WPS will also read SAS version 6 datasets and compressed datasets. WPS can interface with other file types including:

  • CSV
  • Clipboard (Windows)
  • DDE (Windows)
  • Email
  • FTP
  • HTTP
  • Pipe (UNIX and Windows)
  • Socket
  • URL
  • VSAM
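
As a minimal sketch of the XPORT transport route mentioned above, the step below copies one dataset into a transport file. The paths, librefs and the dataset name ROLLS are hypothetical and chosen only for illustration; the statements are ordinary SAS syntax, and whether every option behaves identically in WPS 2.5 is an assumption.

```sas
/* Hypothetical locations: adjust to your own environment. */
libname src 'C:\projects\demo';
libname xpt xport 'C:\projects\demo\rolls.xpt';

/* Copy the dataset ROLLS (an assumed name) into the transport file. */
proc copy in=src out=xpt memtype=data;
  select rolls;
run;

/* Release the transport file when finished. */
libname xpt clear;
```

Reading the file back is the same step with IN= and OUT= swapped.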

WPS has database support for accessing many popular databases and formats including:

  • DB2
  • Informix
  • MySQL
  • ODBC
  • OLEDB
  • Oracle
  • Teradata
  • dBASE
  • SPSS
  • Excel
  • Access

Output support covers ODS LISTING, ODS HTML and ODS OUTPUT, JPEG and GIF graphs, and the more traditional text files produced by PROC PRINTTO. ODS SHOW and ODS TRACE are also supported.

Now that we know what WPS can read, write and output, what elements of the SAS language does WPS support?

Firstly, the procedures, and these are chiefly the following:

  • Statistics Procedures: ANOVA, CORR, FASTCLUS, FREQ, LOGISTIC, MEANS, REG, SCORE, SURVEYSELECT, SUMMARY, UNIVARIATE
  • Reporting Procedures: CHART, FORMS, GCHART, GPLOT, GREPLAY, PLOT, PRINT, TABULATE
  • Scoring Procedures: RANK, STANDARD
  • Utility Procedures: APPEND, CATALOG, CIMPORT, COMPARE, CONTENTS, COPY, CPORT, DATASETS, DELETE, EXPORT, FORMAT, IMPORT, OPTIONS, OPTLOAD, OPTSAVE, PDS, PDSCOPY, PRINTTO, PWENCODE, SORT, SOURCE, SQL, TRANSPOSE, TRANTAB

Secondly, the language elements that WPS supports. The full lists are too long to reproduce here, so refer to the WPS documentation for the details; in brief summary they are:

  • Data Set Options
  • DATA Step Statements and Functions
  • Formats and Informats
  • Global Statements
  • Library Engines
  • Macro language
  • System Options

WPS supports the majority, and arguably the most popular, language elements.

Interestingly, WPS includes an SDK as standard, which lets users extend the language by creating their own functions, formats, informats and call routines.

Multi-threading support was added in version 2.5, and 32-bit and 64-bit versions of WPS are available.

Now that we have an idea of what WPS supports, what are some of the things that WPS does not support?

Notable omissions are the REPORT procedure, the ODS RTF and PDF destinations, STAT procedures like GLM and LIFETEST, and GRAPH procedures like GSLIDE and GMAP. PROC TABULATE does not support the STD keyword (standard deviation) but does support MEDIAN, N, MIN and MAX, among others. Some formats/informats, functions and other language elements are not supported, but these are rarely used items, for example the REPEMPTY= data set option.

Within each procedure there are options that WPS does not support, for example in PROC FREQ the options BINOMIAL, CMH and FISHER|EXACT, and in PROC TABULATE the options ALPHA=, ORDER=FREQ, PCTLDEF=, QMETHOD= and STYLE=, among others.

It must be noted that WPS does not support SAS catalogs, so formats/informats stored in catalogs need to be converted to SAS datasets and then read in using the CNTLIN= option of the FORMAT procedure.
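
A minimal sketch of that conversion follows, assuming a hypothetical library path, a format catalog called MYLIB.FORMATS and a control dataset called MYLIB.FMTDATA (none of these names come from a real project). The first step is run in SAS and unloads the catalog with CNTLOUT=; the second is run in WPS and rebuilds the formats with CNTLIN=.

```sas
/* Step 1: run in SAS.                                          */
/* Unload the formats and informats stored in the catalog       */
/* MYLIB.FORMATS into an ordinary dataset. All names and the    */
/* path are hypothetical.                                       */
libname mylib 'C:\projects\demo';

proc format library=mylib.formats cntlout=mylib.fmtdata;
run;

/* Step 2: run in WPS.                                          */
/* Rebuild the formats from the control dataset.                */
libname mylib 'C:\projects\demo';

proc format cntlin=mylib.fmtdata;
run;
```

The control dataset is an ordinary dataset, so it can be checked with PROC PRINT or PROC CONTENTS before it is loaded.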

Now for some examples of output. The first example is output for a Chi-Square Test using the FREQ Procedure:

           Chi-Square Test Using the FREQ Procedure
                                    Cumulative    Cumulative
   Dice    Frequency     Percent     Frequency      Percent
   ---------------------------------------------------------
     1           14       11.67            14        11.67
     2           21       17.50            35        29.17
     3           20       16.67            55        45.83
     4           29       24.17            84        70.00
     5           26       21.67           110        91.67
     6           10        8.33           120       100.00

             Chi-Square Test for Equal Proportions
              Chi Square                     12.7
              Prob > Chisq             0.02635829
              Missings                          0
              Effective Sample Size           120
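
Output of this kind comes from a TABLES statement with the CHISQ option. The sketch below assumes a hypothetical dataset ROLLS with one row per throw and the face value held in a variable ROLL; these names are not from the original program. The second step is only an arithmetic check of the statistic, using the frequencies printed above and an expected count of 120/6 = 20 per face.

```sas
/* One-way chi-square test for equal proportions.            */
/* ROLLS and ROLL are assumed names.                         */
proc freq data=rolls;
  tables roll / chisq;
run;

/* Hand check: sum of (observed - expected)**2 / expected.   */
data _null_;
  array obs{6} _temporary_ (14 21 20 29 26 10);
  expected = 120 / 6;
  chisq = 0;
  do i = 1 to 6;
    chisq = chisq + (obs{i} - expected)**2 / expected;
  end;
  p = 1 - probchi(chisq, 5);   /* 6 faces give 5 degrees of freedom */
  put chisq= p=;
run;
```

The squared differences are 36, 1, 0, 81, 36 and 100, which sum to 254, and 254/20 gives the 12.7 reported in the output.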

An example of a TABULATE procedure is given below:

                              The TABULATE Procedure

      --------------------------------------------------------------------------
      |                            |                   Name                    |
      |                            |-------------------------------------------|
      |                            | Fitzroy  |  Gipps   |   Grey   |  Hobson  |
      |                            |----------+----------+----------+----------|
      |                            |   roll   |   roll   |   roll   |   roll   |
      |----------------------------+----------+----------+----------+----------|
      |Dice Number  |              |          |          |          |          |
      |1            |N             |     10.00|     10.00|     10.00|     10.00|
      |             |--------------+----------+----------+----------+----------|
      |             |Mean          |      3.80|      2.70|      3.50|      3.50|
      |             |--------------+----------+----------+----------+----------|
      |             |Median        |      3.50|      2.00|      3.50|      4.00|
      |             |--------------+----------+----------+----------+----------|
      |             |Min           |      1.00|      1.00|      1.00|      1.00|
      |             |--------------+----------+----------+----------+----------|
      |             |Max           |      6.00|      6.00|      6.00|      5.00|
      |-------------+--------------+----------+----------+----------+----------|
      |2            |N             |     10.00|     10.00|     10.00|     10.00|
      |             |--------------+----------+----------+----------+----------|
      |             |Mean          |      3.30|      3.20|      3.90|      3.40|
      |             |--------------+----------+----------+----------+----------|
      |             |Median        |      3.50|      3.00|      4.00|      3.00|
      |             |--------------+----------+----------+----------+----------|
      |             |Min           |      1.00|      1.00|      2.00|      1.00|
      |             |--------------+----------+----------+----------+----------|
      |             |Max           |      5.00|      6.00|      6.00|      6.00|
      |-------------+--------------+----------+----------+----------+----------|
      |3            |N             |     10.00|     10.00|     10.00|     10.00|
      |             |--------------+----------+----------+----------+----------|
      |             |Mean          |      3.30|      4.10|      3.40|      3.50|
      |             |--------------+----------+----------+----------+----------|
      |             |Median        |      4.00|      4.50|      3.50|      3.00|
      |             |--------------+----------+----------+----------+----------|
      |             |Min           |      1.00|      1.00|      1.00|      1.00|
      |             |--------------+----------+----------+----------+----------|
      |             |Max           |      6.00|      6.00|      6.00|      6.00|
      --------------------------------------------------------------------------
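
A table with this layout could be requested along the following lines. This is a sketch only: the dataset ROLLS and the variables DICE (dice number), NAME and ROLL (face value thrown) are assumed names, not the ones used for the output above.

```sas
/* Rows: dice number crossed with the statistics.   */
/* Columns: name crossed with the analysis variable. */
proc tabulate data=rolls;
  class dice name;
  var roll;
  table dice * (n mean median min max),
        name * roll;
  label dice = 'Dice Number'
        name = 'Name';
run;
```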

It must be noted that STD (standard deviation) is not supported in the TABULATE procedure, which is very surprising since it is one of the basic statistics calculated from data. STD is supported in the MEANS, SUMMARY and UNIVARIATE procedures.

Now let's play with this using a DATA _NULL_ step:

          Using the Classic 'DATA _NULL_' for Output

                                       Name
      Dice              ----------------------------------
     Number  Statistic  Fitzroy   Gipps    Grey    Hobson
     -----------------------------------------------------

       1      N          10       10       10       10
              Mean        3.8      2.7      3.5      3.5
              STD         1.48     1.70     1.58     1.65
              Median      3.5      2.0      3.5      4.0
              Min         1        1        1        1
              Max         6        6        6        5

       2      N          10       10       10       10
              Mean        3.3      3.2      3.9      3.4
              STD         1.77     2.04     1.45     1.84
              Median      3.5      3.0      4.0      3.0
              Min         1        1        2        1
              Max         5        6        6        6

       3      N          10       10       10       10
              Mean        3.3      4.1      3.4      3.5
              STD         1.89     1.66     1.51     1.72
              Median      4.0      4.5      3.5      3.0
              Min         1        1        1        1
              Max         6        6        6        6
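
The general technique behind a report like this can be sketched as follows: summarise with PROC MEANS, where STD is available, then write the lines yourself with PUT statements. The dataset ROLLS and the variables DICE, NAME and ROLL are assumed names, and the layout is deliberately simpler than the one above (one line per dice and name rather than names across the page).

```sas
/* Summarise first; STD is supported in PROC MEANS. */
proc means data=rolls nway noprint;
  class dice name;
  var roll;
  output out=stats n=n mean=mean std=std median=median min=min max=max;
run;

/* Write one line per dice/name combination to the print file. */
data _null_;
  set stats;
  by dice;
  file print;
  if _n_ = 1 then
    put @1 'Dice' @8 'Name' @18 'N' @24 'Mean' @32 'STD'
        @40 'Median' @48 'Min' @54 'Max';
  if first.dice then put;   /* blank line before each dice group */
  put @1 dice 2. @8 name $10. @18 n 3. @24 mean 6.1 @32 std 6.2
      @40 median 6.1 @48 min 3. @54 max 3.;
run;
```

Because the PUT statements control every column, the same approach can be stretched to more elaborate layouts at the cost of more code.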

Working with this a little more (using some old tricks that I relied on before PROC REPORT existed), we get the following (output is in RTF format):

WPS Output, RTF Format

HTML output can be easily created using ODS HTML.
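
As a minimal sketch, the step below routes a listing to an HTML file. The file name and the dataset ROLLS are hypothetical.

```sas
/* Open the HTML destination; the file name is hypothetical. */
ods html file='dice_report.html';

proc print data=rolls noobs;
run;

/* Close the destination so the file is completed. */
ods html close;
```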

Now looking at a graph created using PROC GCHART using the JPEG format:

WPS Output, Graph
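
A chart of this kind could be requested roughly as shown below. The statements are written in SAS/GRAPH terms; that WPS 2.5 accepts the same GOPTIONS is an assumption on my part, and the file name, the dataset ROLLS and the variable ROLL are hypothetical.

```sas
/* Send the graph to a JPEG file; the file name is hypothetical. */
filename grafout 'dice_chart.jpg';
goptions device=jpeg gsfname=grafout gsfmode=replace;

/* One bar per face value; ROLLS and ROLL are assumed names. */
proc gchart data=rolls;
  vbar roll / discrete;
run;
quit;

filename grafout clear;
```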

Now an important question, and one that I am always asked when people ask me about WPS: "How does WPS stack up against SAS?"

The SAS product is large, very large. A count of the Products and Solutions listed as of this morning came to around 268, ranging from SAS® for Defense and Aerospace to Banking Solutions, and from Base SAS to SAS/Toolkit. SAS has been around since the 1970s, so its history and product range are extensive.

By contrast, World Programming has been around since 2000 and has been developing what is now called WPS since 2002.

However, it is unfair to compare the two products this way. If you set just the Base SAS, SAS/STAT, SAS/GRAPH and SAS/ACCESS (including engine support) products against WPS, things get very interesting.

If you consider that WPS has taken the syntax of the SAS language and built its own interpreter for it, then it is not surprising that some features a SAS programmer would expect from the SAS Institute product are not supported. Not all of the STAT, GRAPH or ACCESS (including database support) modules are in WPS, and certainly not all of the language is there either, but the more common and regularly used items are. Yes, there are still features that need to be addressed, but overall WPS is a product that can take your existing data (whether it is a SAS dataset, Oracle, DB2 or another format that WPS supports), use the SAS language to manipulate it, do some statistical analysis and then create output, as the examples above demonstrate.

Any comparison between SAS and WPS must also look at the cost of purchasing the two products. The actual price of SAS or WPS software is dependent on many factors, including the number of CPUs, but it must be noted that SAS runs at around 8 times the price of WPS for the first year, and around 4 times the price thereafter.

World Programming Ltd. has a website that can be found at http://www.teamwpc.co.uk.

________________________________
Updated September 17, 2010