cdisc

Verify SAS datasets against CDISC standards

%cdisc (datlib = data library, 
        datname = dataset name);
Where      Is Type...   And represents...
datlib     C (200)      Library reference for the location where the dataset resides.
datname    C (200)      Name of the dataset to be verified. Wildcards can be specified, such as ae*.

Details
This tool verifies SAS datasets against the CDISC submission data domain models, version 3.0, as specified at: http://www.cdisc.org/pdf/V3CRTStandardV1_2.pdf. It is intended to catch deviations from the standard, including the following:

  1. Required Fields: (2.4.5) Required identifier variables including: DOMAIN, USUBJID, STUDYID and --SEQ.
  2. Subject Variable: (3.5.1.2.8) For variable names, labels and comments, use the word "Subject" when referring to "patients" or "healthy volunteers".
  3. Variable Length: (3.5.1.2.6) Variable names are limited to 8 characters with labels up to 40 characters.
  4. Yes/No: (3.5.1.3.18) Variables where the response is Yes or No (Y/N) should normally be populated for both Yes and No responses.
  5. Date Time Format: (3.5.1.4.19) Use the yymmdd10. format; yymmdd8. is also acceptable.
  6. Study Day Variable: (3.5.1.4.22) Study day variables are named --DY.
  7. Variable Names: (3.5.2) If a variable name matches a CDISC variable name, the associated label must match.
  8. Variable Label: (3.5.2) If a variable label matches a CDISC label, the associated variable name must match.
  9. Variable Type: (3.5.2) If a variable name matches a CDISC variable name, the associated type must match.
  10. Dataset Names: (3.5.2) If a dataset name matches a CDISC dataset name, the associated dataset label must match.
  11. Dataset Labels: (3.5.2) If a dataset label matches a CDISC dataset label, the associated dataset name must match.
  12. Abbreviations: (3.5.2) The following abbreviations are suggested for variable names and datasets:
    1. DM Demographics
    2. CM Concomitant Medications
    3. EX Exposure 
    4. AE Adverse Events
    5. DS Disposition
    6. MH Medical History
    7. EG ECG 
    8. IE Inclusion/Exclusion Exceptions
    9. LB Labs 
    10. PE Physical Exam 
    11. SC Subject Characteristics
    12. SU Substance Use 
    13. VS Vital Signs
  13. SEQ Values: (4.3.2.1) When the --SEQ variable is used, it must have unique values for each USUBJID within each domain.
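
To make item 13 concrete, the following is a minimal sketch of how a --SEQ uniqueness check could be written by hand. It is not the macro's actual implementation, which is not shown here. It assumes a library MYLIB containing an AE dataset whose sequence variable is AESEQ; the output table name WORK.SEQ_DUPS is hypothetical.

```sas
/* Sketch only: list USUBJID/AESEQ combinations that occur more than once. */
/* Assumes mylib.ae exists with variables USUBJID and AESEQ.               */
proc sql;
    create table work.seq_dups as
    select usubjid,
           aeseq,
           count(*) as n_records
    from mylib.ae
    group by usubjid, aeseq
    having count(*) > 1;
quit;
```

An empty WORK.SEQ_DUPS would indicate that every AESEQ value is unique within each USUBJID for this domain.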

Findings from these checks are stored in a dataset named WORK.CDISC. Each finding is identified by a column named "case", which corresponds to the items listed above.
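
One way to review the results, sketched under the assumption that %cdisc has already been run in the current session and that MYLIB is an assigned library: summarize WORK.CDISC by the "case" column, then print a few rows. Only the "case" column is documented, so the sketch does not rely on any other variable names.

```sas
/* Run the checks first (library and dataset names are illustrative). */
%cdisc (datlib=mylib,
        datname=ae);

/* Count findings per check, using the documented "case" column. */
proc freq data=work.cdisc;
    tables case / nocum nopercent;
run;

/* Print the first 20 findings to see which other columns are present. */
proc print data=work.cdisc (obs=20);
run;
```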

Example

%cdisc (datlib=mylib, 
        datname=ae);
%cdisc (datlib=mylib, 
        datname=ae*);