of the Month
Merging variables from one dataset into another is one of the basic data manipulation tasks that a SAS programmer has to do. The most common way to merge on data is using the MERGE statement in the DATA step but there are three other ways that can help. First though, some data:
Dataset: PATDATA SUBJECT TRT_CODE 124263 A 124264 A 124266 B Dataset: ADVERSE SUBJECT EVENT 124263 HEADACHE 124266 FEVER 124266 NAUSEA
Each of the examples below will merge the TRT_CODE from the PATDATA dataset onto the ADVERSE dataset.
Using the DATA step the code for merging would typically be:
DATA alldata0; MERGE adverse (in=a) patdata (in=b); BY subject; IF a; RUN;
Another way would be to use SQL, so the code would look something like:
PROC SQL; CREATE TABLE alldata0 AS SELECT a.*, b.trt_code FROM adverse a left join patdata b ON a.subject=b.subject; QUIT; RUN;
A third way is using the KEY= option in the SET statement as shown in the following example:
DATA alldata0; SET adverse; SET patdata KEY=subject /UNIQUE; IF _IORC_ THEN DO; _ERROR_=0; trt_code=''; END; RUN;
Before the third example is run the dataset PATDATA must have an index created inside it, using either the INDEX statement inside a DATASETS or SQL procedure, or INDEX option inside a DATA step.
The forth example creates a format from the data and sets the treatment from the created format:
DATA fmt; RETAIN fmtname 'TRT_FMT' type 'C'; SET patdata; RENAME subject=start trt_code=label; RUN; PROC FORMAT CNTLIN=fmt; RUN; DATA alldata0; SET adverse; trt_code=put(subject,$trt_fmt.); RUN;
There are other ways of merging data but the above examples show the most common. No one method is better than the other - it depends on the size of the data being merged. The following table gives a guide on the ratio of time spent when compared with a number of observations in the ADVERSE dataset (PATDATA has 1000 patients/records):
Number of | MERGE | | | PROC Observations | Statement | PROC SQL | KEY= Option | FORMAT -------------+-----------+----------+-------------+-------- 1K | 1.00 | 2.19 | 0.25 | 1.25 1M | 86.25 | 90.56 | 51.25 | 23.06 5M | 787.94 | 402.19 | 302.31 | 264.13
Again, the above table should only be given as a guide. Happy New Year.
Updated January 5, 2006