SAS Tip of the Month
We have all done it/seen it. An example of a specification:
If raw.repeat_data_cm_rg.cend_precision=1 and init_date_ser>=inception_date and raw.repeat_data_cm_rg.cend not missing and raw.repeat_data_cm_rg.cend_year not equal to year(ads.a_pat_demo.index_date) then set cend=last day of that year (which will be 31DEC).
And another example:
GHR= (GHR_1 + GHR_2 + GHR_3 + SFR_33 + SFR_35 - 5) *5
When writing ADS specifications, a good rule of thumb is to write as if you were sitting across the table from an FDA reviewer. These people, on the whole, do not know SAS, R or any of the myriad other statistical packages, do not know the structure of the datasets, and don’t have much time. The specifications also have to be written in a way that is not specific to a programming language: a programmer working in SAS, R, STATA, SPSS or, dare I say it, VB, and not necessarily a SAS programmer, must be able to pick up the specification and get the same result. There are some exceptions to this, namely when an obscure statistic is requested and it needs a particular SAS procedure call with specific parameters.
The other issue with writing language-specific specifications is that it places a huge onus on the specification writer to get the code right, and yes, it CAN take days to rewrite code if the specification is found to need amendment.
In the first example, the specification is meaningless to a reviewer who has little time and/or knows little of a programming language. For a programmer, working out the intent is difficult at best, and many will just take the specification and write code to it. In this snippet, the intent was:
“If the conmed end date is a year value only, and the subject started treatment after the Inception Date, and the year value is not the same year as the end of treatment, then set a computed end date of the end of the conmed year.”
Within many groups it is expected that a programmer at intermediate level or higher will grasp this intent and program it, and that if they do not understand it they will ask another programmer for help. Experienced programmers enjoy passing on their knowledge and experience when asked (we may be busy at the moment you ask, but give us a little time and we will always come back and at least follow up). Sometimes even an internet search will help.
As a side note, this specification was part of a larger set of specifications on handling incomplete concomitant medication end dates. What would have been ideal is a general instruction in the programming specifications saying something like:
Concomitant Medication (CM) End Dates
If a CM end date is incomplete, the following rules will be used to set a computed end date:
  o If the month and year are present, set the computed date to the last day of the month.
  o If only the year is present, set the computed date to the last day of the year.
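To show that a rule written this way can be picked up in any language, here is a minimal sketch in Python (chosen only for illustration; the same logic applies in SAS or R). The function name impute_cm_end_date and its year/month/day parameters are hypothetical and do not come from any study dataset. The handling of a record with no year at all is an assumption, since the rule above does not cover it.

```python
import calendar
from datetime import date
from typing import Optional


def impute_cm_end_date(year: Optional[int],
                       month: Optional[int] = None,
                       day: Optional[int] = None) -> Optional[date]:
    """Return a computed CM end date from possibly incomplete date parts."""
    if year is None:
        # Assumption: with no year recorded the rule does not apply,
        # so the computed date is left missing.
        return None
    if month is None:
        # Only the year is present: use the last day of that year.
        return date(year, 12, 31)
    if day is None:
        # Month and year are present: use the last day of that month.
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, last_day)
    # Complete date: nothing to impute.
    return date(year, month, day)


print(impute_cm_end_date(2014))         # 2014-12-31
print(impute_cm_end_date(2012, 2))      # 2012-02-29 (leap year)
print(impute_cm_end_date(2014, 11, 2))  # 2014-11-02
```

Note that the code follows directly from the plain-language rule; nobody had to decode variable names or dataset references to write it.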
Looking at the second example, a question was raised during review of the specs about what happens when a record has missing data points. A look at an external reference found that the equation in that document catered for missing values, i.e. one or more of the five variables in the formula being missing, and additionally catered for the situation where fewer than half the scores were present. Within most groups a programmer is not expected to go to an external reference document (SAP, eCRF and protocol excepted) unless the specification directs them to. As it happened, this issue was detected and corrected early, but in the worst case it would have been picked up by the client or an outside reviewer.
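To illustrate how much a missing-data rule can change the result, here is a rough sketch in Python (used only for illustration). The formula itself is the one from the specification. The missing-data handling is purely an assumption made for this example, NOT the rule from the external reference: a missing item is replaced by the mean of the answered items, and the score is set to missing when fewer than half the items are present. The function name ghr_score is hypothetical.

```python
from typing import Optional, Sequence


def ghr_score(items: Sequence[Optional[float]]) -> Optional[float]:
    """Compute (sum of the five items - 5) * 5 under an ASSUMED missing-data rule.

    items holds GHR_1, GHR_2, GHR_3, SFR_33 and SFR_35, with None for missing.
    """
    present = [x for x in items if x is not None]
    # Assumed rule: fewer than half of the items answered -> score is missing.
    if len(present) * 2 < len(items):
        return None
    # Assumed rule: each missing item is replaced by the mean of the answered items.
    mean_present = sum(present) / len(present)
    total = sum(present) + mean_present * (len(items) - len(present))
    return (total - 5) * 5


print(ghr_score([3, 4, 2, 5, 1]))                # 50.0
print(ghr_score([3, 4, None, 5, None]))          # 75.0
print(ghr_score([3, None, None, None, None]))    # None
```

A programmer coding only the one-line formula would, in SAS, get a missing score for the second record, because the + operator propagates missing values. Whatever the real rule is, it has to be stated in the specification.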
Yet another cautionary tale. Another study included some ADS specs with SAS code embedded in them. The programmers had happily programmed away and output had been produced when a question arose about why so few AE records met a certain criterion. A quick investigation found that one of the key variables in the AE ADS, a flag indicating whether an AE was treatment emergent or not, did not match the SAP. This required a fix to the ADS program concerned and a rerun of the ADS and output programs.
The importance of reading and understanding the SAP cannot be overstated. It is a must; after all, this is the programming and statistical blueprint for the study. That is not to say the SAP is never wrong (I saw one where the AGE in years of a subject was computed as the number of days from randomization to date of birth / 12), but it is the agreed analysis for the study between us and the client.
What is some good guidance?
Specification writers, read the SAP and become familiar with it, and read the protocol if available. Write your specification as if you were talking to an outside reviewer, making sure you specify the intent of the derivation; you can always point to an external document for that derivation if needed. Trust your programmers to take that specification and turn it into what you intended.
Programmers, before writing code, read the SAP and become familiar with it, and also read the protocol if available. If a specification is not clear ask another programmer for help – again, experienced programmers enjoy passing on their knowledge and experience. If it is still unclear, then take it back to the specification writer because if you don’t understand it, expect an outside reviewer not to understand it either.
Whether we are a specifications writer, programmer or statistical reviewer, the ultimate goal is to produce output that is clear, accurate and timely, for us and our client.
Hope this is useful. See you next month.
Updated November 2, 2014