Paper presentations are the heart of a SAS users group meeting. PharmaSUG 2012 will feature nearly 200 paper presentations, posters, and hands-on workshops. Papers are organized into 12 academic sections and cover a variety of topics and experience levels.
Note: This list is subject to change. Last updated 20-May-2012.
Click on a section title to view abstracts for that section, or scroll down to view them all.
- Applications Development
- Coders Corner
- Data Standards and Quality
- Data Visualization and Graphics
- Hands-On Workshops
- Health Outcomes and Epidemiology
- Industry Basics
- Management and Support
- Statistics and Pharmacokinetics
- Techniques and Tutorials - Advanced
- Techniques and Tutorials - Foundations
Data Standards and Quality
Data Visualization and Graphics
Hands-On Workshops
|Paper No.||Presenter(s)||Paper Title (click for abstract)|
|HW01||Mike Molter||An Introduction to the Clinical Standards Toolkit and Clinical Data Compliance Checking|
|HW02-SAS||Lex Jansen||Using the SAS® Clinical Standards Toolkit 1.4 for define.xml creation|
|HW03||… & Paul Slagle||Creating the ADaM Time to Event Dataset: The Nuts and Bolts|
|HW04||Sunil Gupta||Ready To Become Really Productive Using PROC SQL?|
|HW05||Kirk Paul Lafler||SAS® Macro Programming Tips and Techniques|
|HW06||Arthur Carpenter||Doing More with the Display Manager: From Editor to ViewTable - Options and Tools You Should Know|
|HW07||Matthew Becker||SDTM, ADaM and define.xml with OpenCDISC(R)|
|HW08-SAS||Vincent DelGobbo||An Introduction to Creating Multi-Sheet Microsoft Excel Workbooks the Easy Way with SAS®|
Health Outcomes and Epidemiology
|Paper No.||Presenter(s)||Paper Title (click for abstract)|
|HO01||… & Seungshin Rhee||Multiple Techniques for Scoring Quality of Life Questionnaires|
|HO02||Karen Walker||7 Steps to Insights Using SAS: Progression Free Survival|
|HO04||Suhas R. Sanjee & Rajavel Ganesan & Mary N. Varughese||Tumor Assessment using RECIST: Behind the Scenes|
|…||… & Eric Elkin||Are You Discrete? Patients' Treatment Preferences and the Discrete Choice Experiment|
|HO07||Kevin Viel||Using the SAS System as a bioinformatics tool: a macro that calls the standalone Basic Local Alignment Search Tool (BLAST) setup|
|HO08||Brandon Fleming||Estimating Risk in the Presence of Unreported Zeros in the Data|
Industry Basics
|Paper No.||Presenter(s)||Paper Title (click for abstract)|
|IB01||… & Patricia Gerend||Adapt and Survive: Implementing the new ICH Development Safety Update Report (DSUR)|
|IB02||Arthur Collins||Risk Evaluation and Mitigation Strategies (REMS) and How They Impact Biostatistics and Statistical Programming Groups|
|IB03||Frank DiIorio||Metadata: Some Fundamental Truths|
|IB04||Vincent Amoruccio||Could Have, Would Have, Should Have! Adopting Good Programming Standards, Not Practices, To Survive An Audit.|
|IB05||Max Cherny||Problem with your SAS Program? Solutions abound!|
|IB07||Richann Watson||Using a Picture Format to Create Visit Windows|
|IB08||Sunil Gupta||SAS Enterprise Guide Best of Both Worlds – Is it Right for You?|
|IB09||Sneha Sarmukadam||Programmer's Safety Kit: Important Points to Remember While Programming or Validating Safety Tables|
|IB10||Kirk Paul Lafler||You Could Be a SAS® Nerd If . . .|
Management and Support
Statistics and Pharmacokinetics
Techniques and Tutorials - Advanced
|Paper No.||Presenter(s)||Paper Title (click for abstract)|
|TA01||… & Mario Widel & Sy Truong||Checksum Please: A Way to Ensure Data Integrity!|
|TA02||Kirk Paul Lafler||Exploring DATA Step Merges and PROC SQL Joins|
|TA03||Peter Eberhardt||A Cup of Coffee and PROC FCMP: I Cannot Function Without Them|
|TA04||Scott Burroughs||PIPE Dreams: Yet Another Tool for Dynamic Programming|
|TA05||Sudhir Singh||Using SAS Colon Effectively|
|TA07||Chris Speck||Diverse Report Generation with PROC REPORT|
|TA08||Joel Campbell||Perl Regular Expressions in SAS® 9.1+ - Practical Applications|
|TA09||… & Melvin Munsaka||Supplementing Programmed Assisted Patient Narratives (PANs) with Graphs Using SAS|
|TA10||Arthur Carpenter||Programming With CLASS: Keeping Your Options Open|
|…||… & Kevin Russell||Why Does SAS Say That? What Common DATA Step and Macro Messages Are Trying to Tell You|
Techniques and Tutorials - Foundations
Applications Development
AD01 : A Standard SAS Program for Corroborating OpenCDISC Error Messages
John R Gerlach, TAKE Solutions, Inc.
Ganesh Thankgavel, TAKE Solutions, Inc.
Monday, 9:15 AM - 9:45 AM, Location: Continental 8
The freeware application OpenCDISC does a thorough job of assessing the compliance of SDTM domains to the CDISC standard. However, the application generates error and warning messages that are often vague, even confusing. Thus, it is beneficial to corroborate the CDISC compliance issues using SAS. Moreover, realizing that a validation check generates similar messages across CDISC data libraries, the SAS code should be similar, as well. This paper explains a comprehensive standard SAS program that contains concise and reusable code that facilitates the process of corroborating OpenCDISC (OC) reports.
AD02 : SAS® Programming Tips, Tricks and Techniques
Kirk Paul Lafler, Software Intelligence Corporation
Monday, 10:15 AM - 11:15 AM, Location: Continental 8
The base-SAS® System offers users a comprehensive DATA step programming language, an assortment of powerful PROCs, a macro language that extends the capabilities of the SAS System, and user-friendly interfaces including SAS Display Manager and Enterprise Guide. This presentation explores a collection of proven tips, tricks and techniques related to effectively using the SAS System and its many features. Attendees will examine keyboard shortcuts to aid in improved productivity; the use of subroutines and copy libraries to standardize and manage code inventories; data summarization techniques; the application of simple reusable coding techniques using the macro language; troubleshooting and code debugging techniques; along with other topics.
AD03 : SAS® Data Query and Edit Checks with HTML5
Sy Truong, Meta-Xceed, Inc.
Ale Gicqueau, Clinovo
Monday, 9:45 AM - 10:15 AM, Location: Continental 8
Querying data used to require sophisticated algorithms and programming with SQL expertise or SAS DATA STEP techniques. Now, there are many user-friendly graphical tools to build expressions without having to program code to access relational databases. However, there are few tools for creating queries from a web interface against SAS data. This paper describes a real-world project where data managers extract data from a relational database and transfer it to SAS datasets. Data managers need a user-friendly query tool to verify the cleanliness of the resulting SAS data. Since data managers are not SAS programmers, the HTML5 web interface eases the learning curve, allowing them to perform tasks such as creating compound query expressions, deriving temporary imputed variables, and saving and combining saved queries. These are some of the methods described in this paper as it explains how to use the SAS DATA STEP, HTML5 and XML to accomplish an effective user experience. The initial goal of this project was to empower data managers to perform tasks only SAS programmers were able to do. However, the user-friendly interface extends usage to a larger audience of non-technical power users.
AD04 : How Readable and Comprehensible is a SAS® Program? A Programmatic Approach to Getting an Insight into a Program
Rajesh Lal, Experis
Raghavender Ranga, Experis
Monday, 11:15 AM - 11:45 AM, Location: Continental 8
In any programming language, there are some general guidelines for writing "better" programs. Well-written programs are easy to re-use, modify and comprehend. SAS programmers commonly have guidelines for indentation, program headers, comments, dead-code avoidance, efficient coding, etc. It would be great if we could gauge a SAS program’s quality by quantifying various factors, generalizing across individual programming styles. This paper aims to quantify various qualitative characteristics of a SAS program using Perl regular expressions, and provide insight into a SAS program. This utility, when used across a large project, can serve as a tool to gauge the quality of programs and help the project lead take corrective measures to ensure that SAS programs are well written, easily comprehensible, well documented and efficient.
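The general idea of scanning program text with Perl regular expressions can be sketched as below. This is not the authors' utility; the file name is hypothetical, and the pattern is a crude heuristic that only catches comments starting a line.

```sas
/* Illustrative sketch (not the authors' utility): count how many
   lines of a program begin with a comment, using PRXMATCH.
   The input path is hypothetical. */
data _null_;
   infile 'C:\study\programs\demog.sas' truncover end=eof;
   input line $char32767.;
   total + 1;
   /* match lines starting with '*', '/*', or '%*' comments */
   if prxmatch('/^\s*(\*|\/\*|%\*)/', line) then comments + 1;
   if eof then put 'NOTE: ' comments= total=;
run;
```

A real quality metric would combine several such counts (header blocks, indentation depth, dead code) into a per-program score.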
AD05 : The Application of SAS Perl Regular Expression in Clinical Trial Studies - Batch Processing SAS Programs
Zhong Yan, Pharmanet/i3
Kimberly Jones, Pharmanet/i3
Monday, 3:30 PM - 4:00 PM, Location: Continental 8
In clinical trial research, it is sometimes necessary to batch process SAS programs within a study in order to ensure consistent updates. The process of manually opening each program and searching for the places that need to be updated is time consuming and prone to errors. SAS Perl Regular Expression is a powerful tool that can be used to scan files for matches with an identifiable pattern and replace them with customized choices. This paper introduces the application of SAS Perl Regular Expression to batch process SAS programs using SDD SAS programs as an example.
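The kind of pattern-based batch replacement described above can be sketched with PRXCHANGE. This is a minimal illustration, not the authors' actual process; the file paths and macro names are hypothetical.

```sas
/* Minimal sketch: read one program, replace every call to a
   hypothetical macro %OLDMAC with %NEWMAC, and write the updated
   program to a new location. Wrapping this step in a macro that
   loops over a directory listing gives the batch process. */
data _null_;
   infile 'C:\study\programs\demog.sas' truncover;
   input line $char32767.;
   /* s/// substitution; -1 means replace all matches on the line */
   line = prxchange('s/%oldmac\b/%newmac/i', -1, line);
   file 'C:\study\programs\updated\demog.sas';
   put line;
run;
```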
AD06 : A SAS® Tool to Allocate and Randomize Samples to Illumina® Microarray Chips
Huanying Qin, Baylor Health Care System
Greg Stanek, Baylor Health Care System
Derek Blankenship, Baylor Health Care System
Monday, 4:00 PM - 4:30 PM, Location: Continental 8
DNA/RNA Microarray has become a common tool for identifying differentially expressed genes under different experimental conditions. Several microarray platforms exist and differ by probe implementation and target-labeling strategies. When it comes to sample allocation, the Illumina BeadChip is the most unusual because each chip can hold 12 samples, while for other platforms each chip holds one sample or a combination of 2 paired samples. This adds complexity to sample randomization because each chip serves as a block, and often the number of samples for each study group is not balanced, i.e., each group does not have the same number of samples. The PLAN procedure (PROC PLAN) can aid in the design and randomization of sample allocation for balanced studies but is limited for unbalanced studies. This paper describes a SAS macro tool we developed to implement an optimizing algorithm to allocate and randomize samples to Illumina chips for unbalanced study designs.
AD07 : Migrating to a 64-bit Operating System: Quick Pointers
Jacques Thibault, Sunovion Pharmaceuticals Inc.
Monday, 1:15 PM - 1:45 PM, Location: Continental 8
The 64-bit operating environment has become the new commodity platform, and beginning with SAS 9.2 you can choose to run 64-bit SAS on Windows x64. Since the 64-bit environment removes the 2GB memory limit that exists on 32-bit machines, taking advantage of it may sound appealing. It is certainly worth discussing and exploring what you are getting yourself into prior to migrating to such an environment, and how best to prepare for a smooth migration in order to reduce the number of unexpected surprises. This paper discusses what you should consider when changing your current 32-bit Windows operating system to a new 64-bit Windows platform, and the potential impacts you should anticipate for your SAS programs and related data files.
AD08 : Maximize the power of %SCAN using the WORDSCAN utility
Priya Saradha, Tech Data Services
Monday, 1:45 PM - 2:15 PM, Location: Continental 8
This paper demonstrates the power of the %SCAN macro function using the user-defined utility WORDSCAN, and defines the most effective way to pass input parameters to any macro. WORDSCAN is a powerful utility macro which splits a given string into new macro variables and counts the total number of individual words. The utility’s key features are: any character can be used as a delimiter in the input string, and any character (bound by the rules of SAS variable naming conventions) can be used as a prefix for the new macro variable names. Using the %SCAN macro function, the utility searches through the input string for the specified delimiter and identifies the words to be created as new macro variables. Each word thus identified is stored in a separate macro variable. The macro variables are named by the utility using the prefix, provided as an input, followed by a sequence number. This naming convention helps the user identify the order in which each word appears in the given string. Along with the new macro variables, the utility also outputs the number of words read from the input string into a macro variable. Thus, WORDSCAN can increase the robustness and efficiency of user-defined macros using the power of the %SCAN macro function.
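A minimal sketch of the idea behind such a utility follows. The macro and parameter names here are illustrative, not the author's actual implementation.

```sas
/* Sketch: split a delimited string into numbered global macro
   variables (&prefix.1, &prefix.2, ...) plus a word count in
   &prefix.n, driven by %SCAN. */
%macro wordscan(string=, dlm=%str( ), prefix=w);
   %local i word;
   %let i = 1;
   %let word = %scan(&string, &i, &dlm);
   %do %while(%length(&word) > 0);
      %global &prefix&i;
      %let &prefix&i = &word;
      %let i = %eval(&i + 1);
      %let word = %scan(&string, &i, &dlm);
   %end;
   %global &prefix.n;
   %let &prefix.n = %eval(&i - 1);
%mend wordscan;

%wordscan(string=AE#CM#DM, dlm=#, prefix=dom)
%put dom1=&dom1 dom2=&dom2 dom3=&dom3 domn=&domn;
```

Here the `#` delimiter and `dom` prefix demonstrate the two key features the abstract describes: an arbitrary delimiter and an arbitrary (valid) variable-name prefix.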
AD10 : Experience of Generating an XML file and Uploading Serious and Frequent Adverse Events to ClinicalTrials.gov
Juan Wu, Medtronic Inc
Min Lai, Medtronic Inc
Tuesday, 11:15 AM - 11:45 AM, Location: Continental 8
The Food and Drug Administration Amendments Act (FDAAA) requires that basic results for applicable clinical trials for drugs, devices, and biologics be posted to ClinicalTrials.gov after September 27, 2008, and serious adverse events (SAE) and frequent adverse events (FAE) be posted to ClinicalTrials.gov after Sep 27, 2009. To comply with the requirements, a cross-functional team was established at Medtronic to implement the necessary infrastructure. While regulatory or clinical staff performs the data entry for basic results posting using a Web-based Results Registration System at Medtronic, our statistical programming team produced validated SAS® macros and generated an XML file for posting serious and frequent adverse events to ClinicalTrials.gov. The posting process also ensures the SAE/FAE results are published to ClinicalTrials.gov accurately and efficiently. This paper will discuss the lessons learned in generating and uploading an XML file for SAE and FAE posting to ClinicalTrials.gov, will describe a flow chart to illustrate the process, and will include an explanation of the macros.
AD12 : A SAS Macro Utility to Append SAS-Generated RTF Outputs and Create the Table of Contents
Sudhakar Anbazhagan, Ockham Group, Chennai, India
Shridhar Patel, Ockham Group, Cary, NC
Tuesday, 1:15 PM - 1:45 PM, Location: Continental 8
In Clinical Trial reporting, the analysis and the results are conveyed through tables and data listings. Depending upon the complexity, the number of tables and listings may vary from study to study. However, it is extremely useful to organize all the individual tables and listings that SAS generates into a single file with a table of contents. Traditionally, these outputs would be compiled either manually or by making use of a stored Word VBA process. The disadvantage of these methods is that they are either time-consuming or prone to errors. Although output methods vary, reports generated in RTF have gained popularity. SAS uses very basic RTF tags to produce elegant outputs, and since RTF files are in one way plain text files, we can exploit the powerful text-handling features of SAS to gather the information needed for creating the TOC page by identifying certain RTF tags which hold that information. Additionally, tags can be inserted to create hyperlinks and page numbers before compiling the individual outputs into one single, comprehensive file by programmatically manipulating a few RTF tags. A number of papers have been published that address this task, but the utility macro explained in this paper produces a comprehensive TOC with hyperlinks and page numbers, with the RTF outputs appended in the same document. The process can be customized to fit custom ODS RTF outputs and the reporting requirements of an organization, using the fundamentals explained here. Requires: SAS 9.1 or above. Skill level: Beginner.
AD14 : Creating Define.pdf with SAS® Version 9.3 ODS RTF
Elizabeth Li, PharmaStat, LLC
Carl Chesbrough, PharmaStat, LLC
Tuesday, 1:45 PM - 2:15 PM, Location: Continental 8
It is becoming more common for FDA submissions to include define.xml data definition documents for Study Data Tabulations Model (SDTM) data, Analysis Data Model (ADaM) data, and even legacy data. Although define.xml documents can help FDA reviewers to navigate submission datasets, documents and variable derivations, they usually do not print out properly on paper. One solution is to generate a define.pdf document with the same content as the define.xml. The PDF file includes bookmarks and hyperlinks to facilitate online review, and it can also be printed out for hardcopy review. Providing define.pdf documents can help sponsors remove obstacles to FDA’s review of their regulatory submissions. This paper presents tips for using SAS® Version 9.3 Output Delivery System (ODS) RTF to generate an RTF file, and then using Acrobat’s PDF Maker to convert it to a define.pdf document. It discusses why to use ODS RTF instead of ODS PDF. It demonstrates how to create a user-defined style using SAS® PROC TEMPLATE. It also shows how to use RTF commands and escape codes to set up bookmarks and hyperlinks to internal as well as external locations. These features are important to an online review of any document. Key words: define.pdf, SAS® ODS, RTF, define.xml, bookmark, hyperlink
AD15 : Database Tectonics: Assessing Drift in Analysis Data
Brian Fairfield-Carter, ICON Clinical Research
Jasmin Fredette, ICON Clinical Research
Tuesday, 4:30 PM - 5:30 PM, Location: Continental 8
In long-term projects, particularly those with separate analyses conducted at major study milestones (i.e. at the end of acute-treatment and long-term extension phases), project documentation needs to account for changes to the analysis data resulting from evolving data standards, as well as those resulting from changes to the underlying raw data. The database at the time of the final/extension lock includes records from acute-phase visits, and because of ‘data drift’ (changes to raw data and to derivation logic) these acute-phase records may differ from those at the time of the acute lock. Since they may ultimately affect statistical inference, these differences need to be documented (often retrospectively), but the process is complicated because (1) differences resulting from changes to the raw data are confounded with those resulting from changes to data-handling rules and derivation logic, and (2) differences in analysis data accumulate across dependency layers (i.e. where lower-level analysis datasets are used in the creation of higher-level analysis datasets). This paper describes ‘cross-comparison’, where acute-lock programs are run on extension-lock raw data and extension-lock programs on acute-lock raw data, and the resulting datasets are compared back against the original acute-lock analysis datasets, as a mechanism for distinguishing differences arising from changes to program logic from those arising from changes to raw data. The use of a simple ‘pre-processor’ in implementing cross-comparison is illustrated, since the technique is ‘retro-fitted’ to an existing project and, by ‘augmenting’ SAS code at run-time, the task of copying and modifying legacy code can be avoided.
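One leg of such a cross-comparison can be sketched with PROC COMPARE. The library paths and dataset names below are hypothetical; the actual project would drive many such comparisons programmatically.

```sas
/* Sketch: after running the acute-lock derivation program against
   extension-lock raw data (output in XCOMP), compare the result
   back to the original acute-lock analysis dataset. Both datasets
   are assumed sorted by the ID variables. */
libname acute 'C:\study\acutelock\adam';
libname xcomp 'C:\study\crosscomp\adam';

proc compare base=acute.adlb compare=xcomp.adlb
             listall maxprint=(50, 32000);
   id usubjid paramcd avisitn;
run;
```

Differences reported here stem from the raw data alone (the program logic is held fixed), which is exactly the separation the cross-comparison is designed to achieve.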
AD16 : Clinical Laboratory Results, Old and New Pains
Mario Widel, Roche Molecular Systems
Scott Bahlavooni, Genentech
Tuesday, 3:30 PM - 4:00 PM, Location: Continental 8
Analyzing clinical laboratory results should be a straightforward task; however, a quick perusal of industry papers and presentations contradicts this statement. Issues that surfaced over ten years ago continue to plague the industry. Further, adoption of the CDISC SDTM Laboratory Test Results (LB) domain adds another level of complexity. In this paper, the authors will readdress old challenges associated with laboratory data, including variability of reported units and reference ranges and their impact on analyzability, as well as discuss the new task of tabulating laboratory data in a standard model.
AD17 : Program Development with SAS Enterprise Guide and SAS/Connect in a Combined PC and UNIX Environment
Roger Muller, First Phase Consulting
Tuesday, 9:30 AM - 10:00 AM, Location: Continental 8
SAS Enterprise Guide (EG) has matured into a developmental tool useful to the advanced SAS programmer. This paper examines the use of SAS EG on the PC in combination with a Unix server which: (1) also has SAS, (2) is the storage depot for production runs of the final code including logs and outputs, and (3) contains the raw data which is being summarized. Workflow will be shown both with and without the use of SAS/Connect which serves to minimize the flow of large amounts of data between machines and takes advantage of the most powerful processor. Strong differentiation will be made on the advanced capabilities of exploring SAS data sets in EG with the Query Builder as opposed to the Viewtable facility in PC SAS. Approaches to structuring SAS code that is under development to smoothly transition to Unix will be covered.
AD18 : OpenCDISC Plus
Annie Guo, ICON Clinical Research
Tuesday, 4:00 PM - 4:30 PM, Location: Continental 8
OpenCDISC Validator is a collection of tools widely used in the industry to validate clinical trial data in compliance with the CDISC standards, including SDTM, ADaM, SEND, Define.xml, and others. The output from OpenCDISC is an Excel or CSV file, containing warnings, errors and other messages, one row per message per observation and variable(s) violating each rule. That often produces a lengthy report, and it can be laborious to cross-check SAS® data sets to resolve the messages. This paper focuses on SDTM and proposes an enhancement, OpenCDISC Plus. It merges the OpenCDISC report with SDTM data in SAS, and creates Excel files – OpenCDISC Plus SDTM – that are essentially the same as the original report, except for the addition of SDTM data alongside each message. It eliminates the need to access the physical SAS data sets or to write ad-hoc programs. Instead, simply check the SDTM data listed in the Plus reports for an immediate view of every observation that has violated a rule, and pinpoint the source of the messages. Further, we can compare two versions of the Plus SDTM reports and flag the differences between them. That is OpenCDISC Plus Comparison.
AD19 : Making a List, Checking it Twice: Efficient Production and Verification of Tables and Figures Using SAS
Linda Collins, PharmaStat, LLC
Elizabeth Li, PharmaStat, LLC
Wednesday, 8:00 AM - 8:30 AM, Location: Continental 8
Managing all the ‘moving parts’ of clinical data analysis is a daunting task. ‘Moving parts’ include inputs (raw data files, specifications), outputs (analysis files and reports) and processes (programs and macros). Each process and output must be verified as correct, and each program executed in the proper sequence. Project leaders must be able to track the verification status of each output. In this paper, we present techniques for managing analysis through an ‘inventory’ spreadsheet of analysis metadata. The metadata is used to generate ‘driver’ programs that perform data selection and produce the tables, listings and figures (TLFs). Analysis programs not only generate TLFs, but also create SAS® datasets with a consistent format for storing summary results. The results datasets make it possible to automate independent programming verification. In addition, it becomes possible to track the status of the entire project in a summary report by using information from both the metadata and the results datasets. Furthermore, the inventory approach is expected to be highly useful when standards for analysis results metadata in define.xml are published.
AD20 : CDISC SDTM Conversion in ISS/ISE Studies: Tools
Balaji Ayyappan, PharmaNet/i3
Wednesday, 8:30 AM - 9:00 AM, Location: Continental 8
CDISC SDTM (Study Data Tabulation Model) conversion for an Integrated Summary of Safety (ISS) or Integrated Summary of Efficacy (ISE) is always a challenging job. The most taxing part is the standardization of different non-CDISC-compliant raw datasets into CDISC-compliant domains and variables for all the legacy studies, involving large amounts of data and diverse systems of data collection - a time-consuming process. Having faced this challenge year after year, we have developed a set of tools and reports to perform and validate the SDTM conversion for ISS/ISE studies of varying proportions and complete the task in a manner that is efficient and accurate. This paper will walk you through the functionalities of this utility. The technologies used are SAS, Microsoft Excel, VBA and SAS IOM. The target audience is intermediate to advanced SAS programmers. Knowledge of VBA and SAS IOM fundamentals is a plus, but not a limiting factor. This paper will explain the tools and reports generated at various stages of ISS/ISE implementation: (i) Stage I: Pre-processing, (ii) Stage II: SDTM Implementation, (iii) Stage III: Post-processing.
AD21 : Things Dr Johnson Did Not Tell Me: An Introduction to SAS® Dictionary Tables
Peter Eberhardt, Fernwood Consulting Group Inc.
Wednesday, 9:00 AM - 10:00 AM, Location: Continental 8
SAS maintains a wealth of information about the active SAS session, including information on libraries, tables, files and system options; this information is contained in the Dictionary Tables. Understanding and using these tables will help you build interactive and dynamic applications. Unfortunately, Dictionary Tables are often considered an ‘advanced’ topic by SAS programmers. This paper will help novice and intermediate SAS programmers get started on their mastery of the Dictionary Tables.
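As a taste of what the Dictionary Tables offer, the same session metadata can be reached two ways: through PROC SQL, or through the equivalent SASHELP views in a DATA step.

```sas
/* List all datasets in the WORK library with their observation
   counts, via the DICTIONARY.TABLES view in PROC SQL. */
proc sql;
   select memname, nobs
      from dictionary.tables
      where libname = 'WORK' and memtype = 'DATA';
quit;

/* The same information is exposed to the DATA step through the
   SASHELP views (here SASHELP.VTABLE). */
data work.tablist;
   set sashelp.vtable;
   where libname = 'WORK' and memtype = 'DATA';
   keep memname nobs;
run;
```

Note that `libname` and `memtype` values in these views are stored in upper case.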
AD22 : Using JMP® Partition to Grow Decision Trees in Base SAS®
Mira Shapiro, Analytic Designers LLC
Tuesday, 9:00 AM - 9:30 AM, Location: Continental 8
The decision tree is a popular technique used in data mining, often used to pare down to a subset of variables for more complex modeling efforts. If your organization has licensed only Base SAS and SAS/STAT, you may be surprised to find that there is no procedure for decision trees. However, if you are a licensed JMP 9 user, you can build and test a decision tree with JMP. The Modeling->Partition analysis provides an option for creating SAS Data Step scoring code. Once created, the scoring code can be run in Base SAS. This discussion will provide a brief overview of decision trees and illustrate how to create a decision tree with Partition in JMP and then create the SAS Data Step scoring code.
AD23 : Accessing Microsoft Excel Workbook Cell Attributes from within SAS
Timothy Harrington, Dataceutics Inc.
Wednesday, 10:15 AM - 10:45 AM, Location: Continental 8
Data in a Microsoft Excel worksheet is now readily accessible to SAS® v9 using PROC IMPORT, and to SAS v9 and earlier using the DDE triplet. Although worksheet cell contents can be read into a SAS dataset, text attributes such as bold, italics, underscore, strikethrough, font, size, and color are not, at least in present releases of SAS, readily accessible. This paper discusses how cell attributes in Microsoft Excel® can be determined by a SAS program reading an Excel worksheet. SAS is PC SAS version 9.1 and later. MS Office is 2006 and later. Platform is Windows Vista and Windows 7.
AD24 : Let’s Compare Two SAS® Libraries!
Kavitha Madduri, Boehringer Ingelheim Pharmaceuticals Inc.
(Presented by Mithun Ranga)
Wednesday, 10:45 AM - 11:15 AM, Location: Continental 8
Here is a typical scenario that we come across very often – you have twelve datasets in one library and twenty datasets in another library. You want to answer these questions: a) Which datasets are in one library and not the other? b) For the common datasets: i. Is the number of observations the same? ii. Are the variable attributes the same? iii. Are there any differences between the datasets if the number of observations and the variable attributes are the same? One can write SAS® code to answer each of the questions above. But these situations are so common that you do not want to spend time every time to find the differences between two SAS® libraries. It would be easy if we had a SAS® macro that would do it automatically for us. This paper introduces the COMPARE_LIB macro to compare datasets in two SAS® libraries. This macro compares the datasets in two SAS® libraries and produces reports to answer each question mentioned above. This macro works on SAS® version 8 and above.
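The first question above (which datasets exist in only one library) can be sketched with a dictionary-table query. This is not the COMPARE_LIB macro itself; the librefs LIB1 and LIB2 are hypothetical.

```sas
/* Sketch: find datasets present in one library but not the other,
   using a full join of the two libraries' dataset lists. */
proc sql;
   create table only_in_one as
   select coalesce(a.memname, b.memname) as memname,
          case when a.memname is null then 'LIB2 only'
               else 'LIB1 only' end as which
      from (select memname from dictionary.tables
            where libname = 'LIB1' and memtype = 'DATA') as a
      full join
           (select memname from dictionary.tables
            where libname = 'LIB2' and memtype = 'DATA') as b
      on a.memname = b.memname
      where a.memname is null or b.memname is null;
quit;
```

The common datasets that survive this filter would then be fed to PROC COMPARE and to attribute checks against DICTIONARY.COLUMNS for the remaining questions.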
AD26-SAS : Enabling Analytics: Integrating and Automating the Clinical Data Analysis and Reporting Process
Matt Gross, SAS
Monday, 2:15 PM - 3:15 PM, Location: Continental 8
While SAS is fundamental to delivering insight and analysis throughout the Clinical Data Analysis and Reporting process in the Life Sciences and Health Care industries, these activities are usually done in isolation from one another, with little oversight and coordination of activities and personnel. Manual steps and multiple systems are the norm, resulting in inefficiency and opportunities for mistakes or duplication of effort. SAS is dedicated to improving both our analytics and the environment for the processes which leverage SAS. SAS now provides a new environment to integrate people, data and processes into a central system. This presentation will focus on how SAS’ newest offerings, SAS Drug Development 4.1 and Clinical Data Integration 2.3, along with the underlying clinical analytics platform, enable multiple participants and stakeholders both within and between organizations to work together, coordinate activities, share information and automate processes to reduce the time to develop, publish and share the results of their data analysis and reporting activities. See how new interfaces, capabilities and an integrated SAS authoring and execution environment will enable companies to more efficiently manage their work and the distribution of analytic-focused projects throughout their business.
AD27-SAS : From InForm EDC Data to SAS Datasets using SAS Clinical Data Integration – Real Time
Jitesh Nagaria, SAS
Tuesday, 10:15 AM - 11:15 AM, Location: Continental 8
In the pharmaceutical industry, getting real-time data from EDC systems into SAS datasets is still a major challenge. A significant amount of time and resources is involved in synchronizing data from EDC systems to SAS datasets. As the industry becomes more and more competitive, having the right data in hand for analysis as soon as it has been entered is crucial and challenging. This paper discusses how to integrate and synchronize data from the InForm EDC system through SAS Clinical Data Integration and convert it into SAS datasets. It will describe the process flow and technical details required to set up the connection between SAS Clinical Data Integration and an InForm EDC trial. It will also discuss how to fetch the metadata and clinical data in the form of ODM XML, parse the ODM XML, and produce SAS datasets for each clinical domain. The integration involves technologies such as SOAP, XML, XSL, SAS programs and the SAS Clinical Standards Toolkit.
AD28 : Is Your Failed Macro Due To Misjudged "Timing"?
Arthur Li, City of Hope
Tuesday, 2:15 PM - 3:15 PM, Location: Continental 8
The SAS® macro facility, which includes macro variables and macro programs, is one of the most useful tools for developing your own applications. Beginning SAS programmers often don’t realize that the most important part of learning the macro language is understanding macro language processing, rather than just learning the syntax. The gaps in understanding include how SAS statements are transferred from the input stack to the macro processor and the DATA step compiler, what role the macro processor plays during this process, and when best to utilize the interface to interact with the macro facility during DATA step execution. In this talk, these issues are addressed by creating simple macro applications step by step.
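A classic illustration of this 'timing' issue, sketched here (not taken from the talk), is the difference between when a macro variable is assigned and when a reference to it resolves.

```sas
/* CALL SYMPUTX assigns the macro variable during DATA step
   EXECUTION, but any &n written inside this same step would have
   been resolved earlier, during word scanning. Referencing &n
   after the step boundary gives the expected value. */
data _null_;
   set sashelp.class end=last;
   if last then call symputx('n', _n_);
run;

%put NOTE: sashelp.class has &n observations.;
```

Understanding why `&n` must be referenced after the RUN boundary is exactly the kind of processing knowledge, as opposed to syntax knowledge, that the abstract emphasizes.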
AD29 : Innovative Techniques: Doing More with Loops and Arrays
Arthur Carpenter, CA Occidental Consultants
Tuesday, 8:00 AM - 9:00 AM, Location: Continental 8
DO loops and ARRAY statements are common tools in the DATA step. Together they allow us to iteratively process large amounts of data with a minimum amount of code. You have used loops and arrays dozens of times, but do you use them effectively? Do you take advantage of their full potential? Do you understand what is really taking place when you include these statements in your program? Through a series of examples let’s look at some techniques that utilize DO loops and ARRAYs. As we discuss the techniques shown in the examples we will also examine the use of the loops and the arrays by highlighting some of the advanced capabilities of these statements. Included are examples of DO and ARRAY statement shortcuts and ‘extras’, the DOW loop, transposing data, processing across observations, key indexing, creating a stack, and others.
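As one illustration of the techniques listed (a sketch, not code from the paper; dataset and variable names are invented), the DOW loop moves the SET statement inside a DO-UNTIL so that an entire BY group is processed in a single DATA step iteration:

```sas
* One row per subject with the subject's mean score, in a single pass.   ;
* Assumes SCORES is sorted by SUBJID and has a numeric SCORE variable.   ;
data subj_means;
   do nrec = 1 by 1 until (last.subjid);
      set scores;
      by subjid;
      total = sum(total, score);   * accumulates across the loop ;
   end;
   mean_score = total / nrec;
   keep subjid mean_score;
run;
```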
AD30 : Why the Bell Tolls 108 times? Stepping Through Time with SAS®
Peter Eberhardt, Fernwood Consulting Group Inc.
Monday, 4:30 PM - 5:30 PM, Location: Continental 8
For many SAS programmers, new or even advanced, the use of SAS date and datetime variables is often very confusing. This paper addresses the problems that most programmers have. It starts by looking at the basic underlying difference between the data representation and the visual representation of date, datetime, and time variables. From there it discusses how to change data representations into visual representations through the use of SAS formats. Since date manipulation is core to many business processes, the paper also discusses date arithmetic, first by demonstrating the use of simple arithmetic to increment dates, then by moving on to SAS functions that create, extract, and manipulate SAS date and datetime variables. Finally, the paper demonstrates the use of the %SYSFUNC macro function and the %LET statement to present date, datetime, and time variables. This paper is introductory and focuses on new SAS programmers; however, some advanced topics are also covered.
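A small sketch of the distinction the paper draws (dates chosen for illustration): the stored value of a SAS date is a day count from 01JAN1960, and formats only change how it is displayed.

```sas
data _null_;
   d = '01JAN2012'd;
   put d=;                           * raw stored value: d=18993 ;
   put d= date9.;                    * displayed value:  d=01JAN2012 ;
   nextmon = intnx('month', d, 1);   * first day of the next month ;
   put nextmon= date9.;              * nextmon=01FEB2012 ;
run;

%put Today is %sysfunc(today(), worddate.);   * formatting at macro level ;
```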
Coders Corner
CC01 : Use Your Cores! An Introduction to Multi-core Processing with SAS
Erik Dilts, INC Research
Monday, 9:15 AM - 9:30 AM, Location: Continental 7
Computing power has changed direction the past few years. In the past, chip manufacturers designed faster and faster processors (CPUs), but they reached the theoretical limit of that method. This is how the multiple-core CPU in today’s computers came to be. A multi-core CPU is essentially the same as having multiple CPUs. SAS® 9 can take advantage of these cores in several different ways. This paper will explain how to use multi-core processing to your advantage. In addition it will demonstrate the pros and cons of the various techniques available under Windows, and when to use them.
CC02 : Automatic Version Control and Track Changes of CDISC ADaM Specifications for FDA Submission
Xiangchen (Bob) Cui, Vertex Pharmaceuticals
Min Chen, Vertex Pharmaceuticals
Monday, 9:30 AM - 9:45 AM, Location: Continental 7
CDISC ADaM specifications documentation is at the core of the programming activities in an FDA submission. It serves as the primary source for ADaM programming and validation, define.xml, and the reviewer guide, and also helps in establishing metadata traceability. It is preferable to create ADaM specifications in Word® format to facilitate both reviewing and approving of the specifications documentation by statisticians and/or validators. Since derivation rules may be complex and subject to constant change throughout the ADaM programming activities, it is desirable to automatically keep track of different versions of the ADaM programming specifications. Version tracking is all the more beneficial for sponsors when ADaM programming is outsourced to external vendors. This paper introduces a SAS macro to automatically detect and report any revisions between the new and old versions of the programming specifications. It helps to generate the necessary documents for version control, establishes traceability, and achieves high efficiency in FDA submission.
CC03 : Automatic Detection and Identification of Variables with All Missing Values in SDTM/ADaM Datasets for FDA Submission
Min Chen, Vertex Pharmaceuticals Inc.
Xiangchen (Bob) Cui, Vertex Pharmaceuticals Inc.
Tuesday, 8:45 AM - 9:00 AM, Location: Continental 7
When submitting clinical study data in electronic format to the FDA, it is preferable to submit as few unnecessary variables with all-missing values as possible. Such a variable is called an empty variable. It is highly desirable to automate the process of detecting and identifying empty variables in the submitted datasets, and handling them appropriately is critical for an FDA submission. CDISC introduced the concept of a core variable in SDTM domains and ADaM datasets: a variable is categorized as Required, Expected, or Permissible in an SDTM domain and as Required, Conditionally Required, or Permissible in an ADaM dataset. Applying the core variable categories to these empty variables supports a better decision on how to handle them in an FDA submission, and hence ensures technical accuracy and submission quality. This paper introduces a SAS macro to automatically detect and identify empty variables in SDTM domains and ADaM datasets. Five scenarios of empty variables are illustrated, and their corresponding resolutions, based on the core variable categories in SDTM/ADaM datasets, are provided to the reader as a reference.
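The detection step might be sketched as follows for the character variables of a single dataset (the WORK.DM names are placeholders; the paper's macro is more general and also maps each empty variable to its core category):

```sas
* Gather the dataset's character variable names from the dictionary. ;
proc sql noprint;
   select name into :charvars separated by ' '
   from dictionary.columns
   where libname = 'WORK' and memname = 'DM' and type = 'char';
quit;

* Flag any variable that never has a non-missing value. ;
data _null_;
   set dm end=eof;
   array cvar {*} $ &charvars;
   array seen {&sqlobs} _temporary_;
   do i = 1 to dim(cvar);
      if not missing(cvar{i}) then seen{i} = 1;
   end;
   if eof then do i = 1 to dim(cvar);
      if seen{i} ne 1 then put 'Empty variable: ' cvar{i}=;
   end;
run;
```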
CC04 : Identifying New and Updated Records in a Subject Listing
Mark McLaughlin, Biogen Idec
Monday, 9:45 AM - 10:00 AM, Location: Continental 7
Subject listings are a vital means for investigators to review study data, although their usefulness decreases as the output gets longer and more unwieldy. Similarly, as the length of the output increases, investigators are often concerned with reviewing only the most recent data points, information that gets increasingly difficult to find as the listings span multiple pages. This paper examines a method to both identify new records added since the prior listing was run, and flag those records that are no longer new but have since been updated or changed. It uses both DATA STEP programming and the output from PROC COMPARE, with a little PROC SQL thrown in. The result is a data listing that includes the variables of interest, and an additional column which indicates what has changed since the last iteration of the data listing. If the observation is a new one (one that didn’t exist last time around) then the column is flagged with the letter ‘N’ (for new record). If the observation is not a new record but rather an already existing record where some variable changed, then the column is flagged with the variable name (or names) that contain the altered information. This method will allow investigators to more efficiently read through potentially lengthy data listings and focus on the records they are most often interested in, those that are new or those that were changed/corrected.
CC05 : Using Skype to run SAS programs
Romain Miralles, Genomic Health
Monday, 10:00 AM - 10:15 AM, Location: Continental 7
Skype is a well-known method used to talk to friends, family members and co-workers. It is one of the best applications available to make voice calls over the Internet. In this paper we will present a new, innovative way to use SAS with Skype. Here, we have developed a solution that allows users to run SAS remotely through Skype. After installing the DLL from the API on the application website, programmers can create scripts to control Skype. By sending a specific message to a pre-defined user, programmers can execute SAS on demand. This paper will explain how to use Skype to run SAS programs. It will describe the installation and configuration of the DLL, provide the VB script needed to communicate with Skype, and illustrate a real-case scenario in which this technique is used.
CC06 : Watermarking and Combining Multiple RTF Outputs in SAS ®
Ajay Gupta, PPD INC
Monday, 10:15 AM - 10:30 AM, Location: Continental 7
In order to expedite the review process, we often get requests from the client or business user to combine all tables and listings into one RTF document. Unfortunately, SAS does not provide a function to combine multiple RTF documents. Also, before finalization of the study, draft listings and tables are generated frequently for review. Unknowingly using these draft listings and tables produced in SAS® for business needs increases the risk of misusing or misinterpreting the results. A viable solution to reduce that risk is to embed a visible watermark in the SAS outputs, such as a pale image or text displayed and/or printed behind the text. This paper introduces a method for combining multiple RTF documents into one RTF document and adding a watermark to the final RTF document in SAS, using Word Basic commands via DDE (Dynamic Data Exchange). If needed, these functions can be used independently.
CC07 : Importing Excel ® File using Microsoft Access ® in SAS ®
Ajay Gupta, PPD INC
Tuesday, 8:30 AM - 8:45 AM, Location: Continental 7
In the pharmaceutical/CRO industries, Excel files are widely used for data storage. There are various methods, such as PROC IMPORT and the LIBNAME engine, to convert an Excel file into a SAS dataset, but due to technical limitations in SAS, these traditional methods might create missing values when reading columns in an Excel file that contain mixed data types (both numeric and character). Especially when the Excel file is large, the loss of data is hard to debug. Also, if the validation programmer uses the same traditional method to read the Excel file, there is a possibility that the datasets will pass validation and be used in production even though data have been lost. This affects the overall quality of the deliverable and increases cost. To overcome this issue and provide an alternative method for validation, a new way to read Excel is needed. This paper introduces a method to read Excel using Microsoft Access in SAS. This convenient and reliable solution will help SAS programmers/statisticians have better control over the quality of data and save significant time with minimal coding.
CC08 : Automation of Comparing ODS RTF outputs in Batch using VBA and SAS
Dongsun Cao, UCB Pharma
Monday, 10:30 AM - 10:45 AM, Location: Continental 7
In the pharmaceutical industry, tables, listings, and graphs (TLGs), often produced in Rich Text Format (RTF), are an integral part of the new drug application (NDA) submitted to the regulatory agency for approval. Therefore, it is essential to have an efficient quality-check mechanism in place to ensure the quality of those outputs. One widely adopted quality-check process is to have two programmers perform independent programming and then visually check one RTF output against the other. Because visual checking is time consuming and error prone, it is important to automate this process. So far, several published papers address the automation, but none of them implements a process without either changing the original table structure or losing information such as titles, footnotes, or headings, thus limiting their use. In this paper, I present an automated process in which RTF outputs from two directories are compared against each other and the comparison results are reported. The technique uses a Visual Basic for Applications (VBA) macro to convert RTF outputs into text files, which are then read into SAS and compared against each other using the COMPARE procedure. The described automated process ensures that the double-programming scheme is rigorously and efficiently implemented and adds another layer of quality control to TLG production.
CC09 : The Application of Variable-Dependent Macros on SDTM Conversion
Mindy Wang, CDM
Monday, 10:45 AM - 11:00 AM, Location: Continental 7
In clinical trials and many other research fields, repetitive statements occur very often. This paper illustrates how to handle the same task with three different approaches. The first approach uses simple copy and paste, which any SAS® beginner can do. The second approach uses a simple macro program to make the task more efficient and reduce the length of the program. To further improve the program and make it more adaptable to other projects, the third approach uses variable-dependent macros, which minimize tedious typing, eliminate possible human errors, and ensure the accuracy of the data.
CC10 : Calculating Time-to-Event Parameters Using a Single DATA Step and a RETAIN Statement
Andrew Hulme, PRA
Monday, 11:00 AM - 11:15 AM, Location: Continental 7
During oncology studies, a typical measure of efficacy is the amount of time elapsed until a particular response is first achieved (time to response) and how long that response is maintained (duration of response). For subjects who do not meet the response criteria before the end of the study, a censoring flag and time must also be calculated. Thus, typical methods of calculating these parameters use several programming steps for each variable of interest. This paper presents a method to calculate all necessary parameters in a single DATA step using a RETAIN statement. A subject's records are ordered by date and the response criteria are evaluated during each iteration of the DATA step. If the criteria are met, the time-to-event values are passed to the retained variables and kept until the subject's last record; if not, the censored time is calculated on the last record and output.
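A simplified sketch of the approach (the dataset, the RSP/ADT/TRTSDT names, and the 'CR' response criterion are invented stand-ins, not the paper's code):

```sas
proc sort data=resp; by subjid adt; run;

data tte;
   set resp;
   by subjid;
   retain ttr cnsr;
   if first.subjid then do; ttr = .; cnsr = 1; end;
   if missing(ttr) and rsp = 'CR' then do;
      ttr  = adt - trtsdt + 1;   * time to first response ;
      cnsr = 0;
   end;
   if last.subjid then do;
      if cnsr = 1 then ttr = adt - trtsdt + 1;   * censored at last visit ;
      output;                    * one record per subject ;
   end;
run;
```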
CC11 : Short-circuiting PROC COMPARE: Techniques for Focusing Dataset Comparisons
Tracy Sherman, ICON Clinical Research
Brian Fairfield-Carter, ICON Clinical Research
Monday, 11:15 AM - 11:30 AM, Location: Continental 7
Electronic file comparisons are a staple of analysis dataset programming and validation, with PROC COMPARE output as the end product of most independent double-programming. This output can be frustrating to work with, particularly in early stages of program validation when discrepancies can be extensive, since PROC COMPARE is not particularly adept at matching non-discrepant records when the number of records on the ‘base’ and ‘compare’ datasets don’t match (this in contrast to many ‘diff’ utilities used in text-file comparisons, which can often accommodate record-count differences). When confronted with record-count differences, most programmers resort to temporary sub-setting and ad hoc/throw-away blocks of code to help trim down PROC COMPARE output and focus review efforts. However, these methods follow a reasonably consistent pattern, and as such it seems plausible that more generalized techniques could be developed. This paper proposes two such methods, the first designed to highlight among key variables where record-count differences exist (with the option of trimming off record-count differences to expose underlying value-level differences), and the second intended to allow for a more manageable ‘step-wise’ evaluation of discrepancies, starting with the first combination of key variables where a discrepancy exists.
CC12 : An Efficient Method to Create Titles for Multiple Clinical Reports Using PROC FORMAT within a DO Loop
Youying Yu, PharmaNet/i3
Tuesday, 9:00 AM - 9:15 AM, Location: Continental 7
Do you know how to create titles for multiple clinical reports directly from the data using SAS PROC FORMAT within a DO loop? When we work on clinical efficacy data, we often have a total score and sub-item efficacy scores, and sometimes the efficacy data contain 50 or more sub-items. Each item score and the total score go through the same analysis and produce output from the statistical models. The outputs are then put into one RTF file with one item per page. Traditionally, we create macros that hard-code the title of each table via a macro parameter. Here is a more efficient way: we can use PROC FORMAT on the data to create the title for each efficacy sub-item table automatically within a DO loop. Using this method, we avoid hard coding and possible typos in the titles. This paper introduces this tip and trick. SAS versions 8.2 and up are used in this paper.
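The idea might be sketched as follows (the ITEMS control dataset, the ITEMCD/ITEMNAME variables, and the $TTL format name are invented for illustration; the paper's macro is data driven end to end):

```sas
* Build a format mapping item codes to item names, straight from the data. ;
data ctrl;
   set items(rename=(itemcd=start itemname=label));
   retain fmtname '$ttl';
run;

proc format cntlin=ctrl;
run;

* Title each table from the format instead of hard-coding it. ;
%macro reports(n);
   %do i = 1 %to &n;
      title "Analysis of %sysfunc(putc(ITEM&i, $ttl.))";
      proc report data=eff(where=(itemcd = "ITEM&i"));
      run;
   %end;
%mend reports;
%reports(3);
```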
CC13 : Techniques for Improvising the Standard Error Bar Graph and Axis Values Completely Through SAS Annotation.
Sunil Kumar Ganeshna, PharmaNet/i3
Venkateswara Rao Chirumamilla, PharmaNet-i3
Tuesday, 9:15 AM - 9:30 AM, Location: Continental 7
This paper introduces the multiple challenges, and several useful techniques, involved in enhancing SAS/GRAPH® output when creating plots with standard error bars and symbols of the least squares means/means, and in enhancing multiple x-axis values with uneven intervals. The macro presented in this paper uses data manipulation, SAS annotation, markers and fonts, and the PLOT, AXIS, SYMBOL, and LEGEND statements; a single macro call generates an annotated dataset that is used to create an error bar graph with multiple axis values. The macro is especially valuable when a programmer is generating multiple graphs from a single program. It also covers generating multiple line types for a single treatment, differentiating between the visits.
CC14 : Quick Data Definitions Using SQL, REPORT and PRINT Procedures
Bradford Danner, PharmaNet/i3
Tuesday, 9:30 AM - 9:45 AM, Location: Continental 7
Prior to undertaking analysis of clinical trial data, in addition to review of the protocol, case report forms, and statistical analysis plan, a basic understanding of the available SAS data is crucial for statistical programmers and biostatisticians. Such an understanding can be gained readily using the DATASETS and CONTENTS procedures. To centralize the information in one place and increase the efficiency with which it may be reviewed, a quick method using the SAS DICTIONARY library with the SQL, REPORT, and PRINT procedures is presented. Through dynamic, data-driven generation of macro control variables, a hyperlinked data definitions document, in multiple formats, may be produced quickly and with minimal user input.
CC15 : Reporting Outliers in Data: A Macro to Identify and Report Distant Data Points
Howard Liang, Pharmanet-i3
Jason Bishop, Pharmanet-i3
Tuesday, 9:45 AM - 10:00 AM, Location: Continental 7
When performing statistical analysis on data, certain statistical assumptions must hold, and the identification of outliers is an integral part of grooming the data for analysis. One must check whether these outliers are valid data from the data source. Reports can be created with and without the detected outliers so statisticians and researchers can best decide on appropriate statistical methods and properly interpret the analysis results. We use an analysis of residuals generated by PROC MIXED to identify values outside of a given range, and then produce a well-formatted output from PROC REPORT to aid in data interpretation. This is incorporated into a stand-alone macro that can be ported across multiple analyses.
CC17 : Efficiently Trim Character Variable Lengths to Fit Data, Reduce Dataset Size
Wayne (Weizhen) Zhong, Octagon Research Solutions
Monday, 11:30 AM - 11:45 AM, Location: Continental 7
In environments where saving datasets with the (COMPRESS=yes) option is not permitted, unnecessarily large datasets clog up hard drives and increase processing time, costing patience and efficiency alike. Somewhere between the extremes of simply defaulting character variables to 200 characters or spending time to manually set lengths - risking truncation when data updates - there is a happy middle ground to automate the process: always let the length be as short as it can without truncation.
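For a single variable, the core idea might look like this (the AE/AETERM names are illustrative; the paper automates the same step across every character variable in a dataset):

```sas
* Find the longest actual value, then re-declare the length to match. ;
proc sql noprint;
   select max(length(aeterm)) into :maxlen
   from ae;
quit;

data ae_trim;
   length aeterm $ &maxlen;   * just long enough, so nothing is truncated ;
   set ae;
run;
```

SAS warns that multiple lengths were specified, but because &maxlen equals the longest stored value, no data are lost.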
CC18 : One Macro Call to a Table with both Frequency and Summary Elements from a Subject-Level Dataset
Wayne (Weizhen) Zhong, Octagon Research Solutions
Tuesday, 10:30 AM - 10:45 AM, Location: Continental 7
A subject-level dataset such as ADSL has one observation per subject, with data appropriate to that format: demographics, baseline characteristics, certain disposition events, and other one-record-per-subject numeric or character data. Tables summarizing these data correspondingly have both frequency and summary-statistics sections, and normally the sections are produced individually and assembled together. The code for each section is not complex, which is why this table is a good candidate for a macro. This paper presents the full code of the macro, built with simplicity of the macro call in mind. The call requires only 3 parameters: the input dataset, a variable used to define the table columns, and a list of mixed numeric/character variables, one for each section of the table. Optional features allow full control over the table presentation, with the end goal of making the program header the lengthiest section of the program.
CC19 : Don't have a "Week" Report
Barbara Ross, DataX Ltd
Tuesday, 10:45 AM - 11:00 AM, Location: Continental 7
Often we need to generate reports on an hourly, daily, and weekly basis. Hourly and daily are pretty straightforward, but when it comes to weeks, there are many ways to define them, code for them, and display them. This paper covers various methods of classifying weeks within your longitudinal data. It shows how to use the SAS WEEK function and formats to enumerate your weeks, the INTNX function to apply date-range labels and specify non-traditional weeks, and other helpful tips such as filling in sparse summary tables and placing date ranges in your title. The topics covered are appropriate for novice SAS users.
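A few of the techniques mentioned, sketched with an arbitrary date (the WEEK.2 shifted interval starts weeks on Monday):

```sas
data _null_;
   d = '18JAN2012'd;                        * a Wednesday ;
   wk     = week(d);                        * week-of-year number ;
   wstart = intnx('week.2', d, 0);          * Monday starting this week ;
   wend   = intnx('week.2', d, 0, 'end');   * Sunday ending this week ;
   put wk= wstart= date9. wend= date9.;
run;

* A date range in the title, computed at macro level. ;
title "Week beginning %sysfunc(intnx(week.2, %sysfunc(today()), 0), date9.)";
```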
CC20 : Beyond %LET: Building a GUI to Pass Run-Time Parameters to SAS Programs Using VBScript and HTML Applications
Shawn Hopkins, Seattle Genetics
Tuesday, 11:00 AM - 11:15 AM, Location: Continental 7
In a validated environment, any change to a program requires some degree of review and documentation. For programs that require changes to macro variable assignments or LIBNAMEs with each submission, maintaining validation documentation can become tedious. Also, having users edit code introduces not only the possibility of stray characters or missing semicolons, but also of invalid values being defined for macro parameters. Accepting specifications through a GUI at run time gives the program the flexibility to dynamically modify parameters within the code while ensuring that only valid values are provided for them. Using VBScript, and VBScript within an HTML application (HTA), provides functionality to collect user input and pass the values to the SAS programming environment. This approach leverages simple VBScript functions like MsgBox and the familiar file dialog boxes used in MS Office applications, and allows custom form creation with a variety of controls to meet nearly any business need. This paper focuses on developing GUIs of increasing flexibility using VBScript from within a DATA _NULL_ step and calling external HTA files to return results to the SAS environment.
CC21 : Identifying and Verifying Treatment-Emergent Adverse Events and Concomitant Medications by Date Specificity
Tom Santopoli, Octagon Research Solutions, Inc.
Tuesday, 11:15 AM - 11:30 AM, Location: Continental 7
In clinical trials, treatment-emergent adverse events and concomitant medications are often identified in a similar manner: Event start and end dates are compared to reference start and end dates. However, partial dates can add many complications. The Statistical Analysis Plan (SAP) provides instruction for the handling of partial dates. This presentation will offer a method for using date specificity as a tool to address potential shortcomings of the methodology provided in the SAP.
CC22 : A Macro to Indent and Hyphenate Lengthy Text in PROC REPORT
Stanley Yuen, Simulstat Incorporated
Tuesday, 11:30 AM - 11:45 AM, Location: Continental 7
Using the FLOW option and SPLIT character assignment in PROC REPORT, a SAS programmer can make long text values in a character variable wrap across multiple lines. This option is limited in allowing additional leading blank characters as indentation for each line of output, and it can also break up a word, carrying the remaining characters to the next line of output. When text flows, it is preferable to insert a hyphen to indicate a word fragment and avoid unsightly spacing in left-justified text. This paper discusses a macro that modifies a character variable so that the FLOW option and SPLIT character produce indented, hyphenated text in PROC REPORT, using Base SAS 9 functions in the DATA step. The macro is also flexible in designating a different indentation for the first line of text than for the other lines of each observation in a report.
CC23 : Figure Out Your Programming Logic Errors Via DATA Step Debugger
Wuchen Zhao, University of Southern California
Wednesday, 8:00 AM - 8:15 AM, Location: Continental 7
Programming errors can be categorized into syntax errors and logic errors. Syntax errors are easily fixed because they are detected by SAS® during the DATA step compilation phase, and SAS stops the program and generates an error report in the log window. Unlike syntax errors, logic errors are difficult to find and correct, since programs with logic errors often run without complaint but produce unexpected results. One useful tool for repairing logic errors is the DATA step debugger, which allows you to see the contents of the program data vector (PDV) during DATA step execution. With the debugger running, you can monitor and update data values using debugging commands, which helps you understand where the problem is so that you can fix the error. This paper illustrates some debugging techniques using the DATA step debugger.
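Invoking the debugger is a one-word change (interactive SAS sessions only; the dataset and variables below are invented):

```sas
data check / debug;   * the DEBUG option opens the DATA step debugger ;
   set adsl;
   bmi = weight / (height/100)**2;
run;

* At the debugger prompt, commands such as
*   examine bmi weight height   - inspect PDV values
*   break 3                     - set a breakpoint at a line
*   step                        - execute one statement at a time
* let you watch each observation move through the step. ;
```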
CC24 : Need a Hand to Locate Your Variables in an Entire Library?
Ying Guo, PAREXEL International
Wednesday, 8:15 AM - 8:30 AM, Location: Continental 7
The ability to quickly identify the location of a variable in an entire library is critical during table, figure, and listing (TFL) development, especially when only limited information is available for some of the variables. For example, sometimes only a partial variable name is available, or perhaps you know only one of the values of a variable. It is really time consuming to search an entire library for the expected variable. Instead of opening each dataset and searching manually, this paper presents a SAS® macro program designed to identify the location of a variable in a time-efficient and user-friendly way. It can automatically search all variables in a library regardless of the dataset structure and variable names. The macro is designed to obtain the location of a variable from one of the following conditions: 1) a known partial or full variable name; 2) a known partial or full variable label; 3) a known partial or full variable value. The output of this macro provides the dataset name and the full variable name of the variables found. Use of this macro will significantly shorten search time and help to increase TFL production efficiency.
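Condition 1, a known partial variable name, can be sketched with DICTIONARY.COLUMNS (the ADAM library and the DOSE search string are placeholders; the paper's macro also handles labels and values):

```sas
proc sql;
   select memname, name, label
   from dictionary.columns
   where libname = 'ADAM'            /* dictionary values are uppercase */
     and upcase(name) contains 'DOSE';
quit;
```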
CC25 : Formats in another format
Bettina Ernholt Nielsen, Ferring Pharmaceuticals
Wednesday, 8:30 AM - 8:45 AM, Location: Continental 7
Have you ever thought about alternative ways of using SAS formats in your programs? Did you ever use a format to do a calculation or to round your numbers based on the initial value? Often we use formats to read data, display values, group our data, or convert between numeric and character, but formats can also be used to simplify programs that would otherwise be somewhat complex or lengthy. By carefully constructing our formats to fit our needs, we can use the powerful SAS format facility to accomplish this.
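One such use, sketched here with invented cut-offs: a value-dependent rounding rule encoded in a format and applied through PUT/INPUT instead of nested IF/ELSE logic.

```sas
proc format;
   value rndto low -< 1    = '0.01'    * round small results finely ;
               1   -< 100  = '0.1'
               100 -  high = '1';
run;

data rounded;
   set raw;   * assumed input with a numeric RESULT variable ;
   * the format yields the rounding unit as text; INPUT converts it back ;
   result_r = round(result, input(put(result, rndto.), best.));
run;
```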
CC26 : The Power of Using Options COMPLETETYPES and PRELOADFMT
Naren Surampalli, Novartis Pharmaceuticals
Wednesday, 8:45 AM - 9:00 AM, Location: Continental 7
As statistical programmers in the pharmaceutical industry, our job revolves around generating reports, which involves calculating counts and percentages across different treatments. To generate counts, most of us use procedures like FREQ or SQL, but these procedures don't display a group if all of its values are missing from the dataset. To display zeros for a missing group or treatment in our reports, we end up writing additional code, which is not very efficient. This paper discusses a better way to count, handle missing values efficiently, and create totals and subtotals easily using one or two steps.
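A hedged sketch of the two options (the AE dataset, TRT01A/AESEV variables, and the severity format are illustrative): COMPLETETYPES makes PROC MEANS emit zero-count rows for CLASS combinations absent from the data, while PRELOADFMT additionally supplies levels that never occur at all, taken from a format.

```sas
proc format;
   value $sevf 'MILD' = 'Mild'  'MODERATE' = 'Moderate'  'SEVERE' = 'Severe';
run;

proc means data=ae(keep=trt01a aesev) noprint completetypes;
   class trt01a;                 * zero rows for absent combinations ;
   class aesev / preloadfmt;     * plus severities absent from the data ;
   format aesev $sevf.;
   output out=counts;            * _FREQ_ carries the counts, zeros included ;
run;
```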
CC27 : “V” Can Transpose: Transposing Data Sets Using Functions
Neha Mohan, PharmaNet/i3
Wednesday, 9:00 AM - 9:15 AM, Location: Continental 7
When a formatted character variable is transposed simultaneously with other variables, PROC TRANSPOSE returns unformatted values of the character variable. When the formatted values associated with the character variable are desired, one must either transpose the corresponding variable separately, or create a new variable holding the formatted values and then transpose this new variable along with the others. This paper demonstrates the use of three "V" functions, namely VVALUE, VNAME, and VLABEL, in transposing such variables: VVALUE directly returns the formatted values associated with the variables (which PROC TRANSPOSE would not), while VNAME and VLABEL maintain the list of original variable names and their labels, respectively (which PROC TRANSPOSE would return). In addition, these functions enable the original variables to be retained in the resulting transposed dataset without any additional processing. The use of these functions can be further extended to the DOW loop as well.
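A small sketch of VVALUE and VNAME (the DM dataset, SEX variable, and $SEXF format are invented for illustration): both return character data about a variable, so the name and the formatted value can be carried as data through a transpose.

```sas
proc format;
   value $sexf 'M' = 'Male'  'F' = 'Female';
run;

data dm2;
   set dm;                  * assumed input with a SEX variable ;
   format sex $sexf.;
   vn = vname(sex);         * the variable's name ;
   fv = vvalue(sex);        * the FORMATTED value: Male / Female ;
run;
```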
CC29 : Sending Emails in SAS® to Facilitate Clinical Trial
Frank Fan, Clinovo
Wednesday, 9:15 AM - 9:30 AM, Location: Continental 7
Email has drastically changed our ways of working and communicating. In clinical trial data management, delivering reports to users is part of our daily activities. Most of our colleagues prefer receiving reports by email rather than on the web or in a shared drive. Email is indeed one of the most convenient ways of communication. In this paper, we will present how to use SAS V9 to send reports via email. Before an email can be sent via SAS, the SAS configuration file needs to be modified. After addressing the configuration file, SAS syntax to send an email will be discussed. We will use several examples to illustrate how we can implement such practices.
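A minimal sketch of the EMAIL access method (the address, subject, and attachment path are placeholders, and the EMAILSYS/EMAILHOST system options must already be set in the configuration file, as the paper describes):

```sas
filename report email
   to      = 'data.manager@example.com'
   subject = 'Weekly enrollment report'
   attach  = '/outputs/enroll.rtf';

data _null_;
   file report;   * writing to the fileref sends the message ;
   put 'Please find the weekly enrollment report attached.';
run;
```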
CC30 : A Mean Way to Count, Enumerating the Values of Multiple Variables Using Formats with the MEANS Procedure
Rod Norman, InVentiv Health
Wednesday, 9:30 AM - 9:45 AM, Location: Continental 7
Perhaps the most common task of the clinical SAS programmer is to count things: adverse events, laboratory value grade shifts, demographics, and so on, over a wide variety of scenarios. To accomplish these tasks, programmers should consider using features of the MEANS procedure in conjunction with the MULTILABEL option of the VALUE statement within the FORMAT procedure. By invoking PROC MEANS on datasets that include only the variables listed in CLASS statements, with no VAR statement, the efficiently produced output contains only counts for all combinations of the CLASS variables. By using the PRELOADFMT and MLF keyword options on CLASS variables, a variety of counting scenarios can be applied to the input data. A TYPES statement identifies just the combinations of variables that are desired in the output dataset created with OUTPUT OUT=. This routine is unburdened by macro variables and offers transparent code that can be easily modified should specifications change, counting scenarios increase, or other variables be added.
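A hedged sketch of the combination (the ADSL dataset, TRT01P/AGE variables, and the age cut-offs are invented): a MULTILABEL format lets one subject fall into overlapping groups, and MLF tells the CLASS statement to honor every matching label.

```sas
proc format;
   value agegrp (multilabel)
      low -  high = 'All subjects'
      low -< 65   = '< 65 years'
      65  -  high = '>= 65 years';
run;

proc means data=adsl(keep=trt01p age) noprint completetypes;
   class trt01p;
   class age / mlf;             * count each subject under every label ;
   format age agegrp.;
   types trt01p*age;            * only the treatment-by-group cells ;
   output out=agecounts;        * _FREQ_ holds the counts ;
run;
```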
CC32 : Using File Modification Date Comparisons to Alert Study Teams of Potential SAS Program Revisions Between Shared Folders
Stephen Hunt, ICON Clinical Research
Brian Fairfield-Carter, ICON Clinical Research
Wednesday, 9:45 AM - 10:00 AM, Location: Continental 7
In the absence of version control, communicating changes made to programs between development and production areas (e.g., modifications to a select few blinded-area programs that must be copied over to an unblinded area for re-delivery) can easily result in 'stale' code and outdated output without absolutely perfect coordination across the entire study team. To compound things, there is an inverse relationship between the size of a programming team and the timeliness and accuracy of communication within it. In such circumstances of concurrent development, an unblinded programmer or statistician must often rely on their own verification of the state of the files to be run in production (since members of a blinded team cannot verify or directly compare against the contents of an unblinded folder themselves). Comparing file modification dates programmatically may therefore be the best first step toward evaluating consistency across folders intended to share the same production code (such as between blinded and unblinded study teams).
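One way to retrieve a file's modification date programmatically, as the abstract suggests, is the FINFO function; a minimal sketch is below. The path is hypothetical, and the exact 'Last Modified' information-item name can vary by operating system.

```sas
/* Minimal sketch: read a file's modification date with FOPEN/FINFO.
   Path is hypothetical; the info-item name may vary by OS. */
data _null_;
   length moddate $40;
   rc  = filename('prog', '/blinded/programs/ae_listing.sas');
   fid = fopen('prog');
   if fid > 0 then do;
      moddate = finfo(fid, 'Last Modified');
      put moddate=;
      rc = fclose(fid);
   end;
   rc = filename('prog');
run;
```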
CC33 : Macro to Combine Listing and Descriptive Statistics Table in One Report
Chenille Lloyd, PharmaNet/i3
Wednesday, 10:15 AM - 10:30 AM, Location: Continental 7
A comprehensive presentation of clinical trials data is often desired in order to achieve a greater understanding of the safety and efficacy of a new drug. The most valuable reports are able to provide more than one perspective of the data while maintaining readability. The primary objective of this paper is to generate a combined report of an individual subject listing followed by a descriptive statistics table using the SAS® macro language and Base SAS®. A single presentation that concatenates a listing and table is particularly useful for pharmacokinetic analysis studies, which require easy identification of outliers and trends in drug concentration data. The macro features additional functionality by allowing the programmer to specify subject exclusions and the number of decimal places used to report results.
CC34 : Determining and Reporting Significant Decimal Places for Continuous Data in Mixed Form
Gary Moore, Moore Computing Services, Inc.
Wednesday, 10:30 AM - 10:45 AM, Location: Continental 7
Data of mixed precision is common in clinical trials. Examples of this mixed form are laboratory, ECG, and vitals data, where a parameter or type variable defines different results. These results are often recorded to the significant decimals appropriate to the data type, as determined by the site collecting the data. However, the results are often reported to a common decimal place that may be inappropriate across all of the data types. This paper presents a simple approach for determining significant decimal places for continuous data collected in mixed form and reporting the appropriate significant decimal places across all data types. This paper uses Base SAS 9.2 on Windows 7.
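One simple way to derive the maximum observed decimal places per parameter, in the spirit of the abstract above, is to measure the collected character result; the dataset LB and variables PARAMCD and LBORRES are hypothetical.

```sas
/* Minimal sketch: maximum observed decimal places per parameter,
   taken from the collected character result. LB, PARAMCD, and
   LBORRES are hypothetical names. */
data declen;
   set lb;
   dotpos = index(strip(lborres), '.');
   decs   = ifn(dotpos > 0, lengthn(strip(lborres)) - dotpos, 0);
run;

proc sql;
   create table decmax as
   select paramcd, max(decs) as maxdec
   from declen
   group by paramcd;
quit;
```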
CC35 : Export SAS data to Excel in SAS® Drug Development
Chun Feng, Celerion
Xiaopeng Li, Celerion
Wednesday, 10:45 AM - 11:00 AM, Location: Continental 7
Exporting SAS data to Excel is a common task for programmers. SAS® 9 provides multiple powerful techniques for outputting SAS data to Excel, such as PROC EXPORT, the LIBNAME engine, the Output Delivery System (ODS), Dynamic Data Exchange (DDE), and the ExcelXP tagset. This paper will discuss the capabilities and advantages of each technique in SAS® Drug Development (SDD), a relatively new regulatory-compliant SAS system for clinical research. With a new interface, a Java-powered editor, and a web-based architecture, SDD differs from PC SAS® in some aspects of creating an Excel file. New and convenient options are available in SDD, including outputting from the data explorer and defining an output data table as an Excel file in DATA steps. This paper will help SDD users export SAS data to Excel efficiently.
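Of the techniques the abstract lists, the ExcelXP tagset is perhaps the most commonly sketched; a minimal example follows. The dataset and output path are hypothetical.

```sas
/* Minimal sketch of the ExcelXP tagset technique listed above.
   Dataset AE and the output path are hypothetical. */
ods tagsets.excelxp file='/outputs/ae_summary.xls'
    options(sheet_name='Adverse Events' frozen_headers='yes');
proc print data=ae noobs label;
run;
ods tagsets.excelxp close;
```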
Data Standards and Quality
DS01 : Transform Incoming Lab Data into SDTM LB Domain with Confidence
Anthony Feliu, Genzyme
Monday, 2:15 PM - 2:45 PM, Location: Continental 1
Converting incoming lab data into a submission-ready format is often challenging and stressful to the programmer. Vendors, sponsors, and collection instruments all seem to have their own structures and nomenclature for lab data. This paper will detail a process to mold disparate data into a unified whole. In brief, verbatim values of each incoming data file are mapped one-to-one into SDTM variables. Mapped data are then compared to a dictionary of tests which is held outside the program as the “gold standard”. When a match is found, verbatim values are updated to CDISC-compliant dictionary terminology; otherwise the record is flagged for review. Next, result values for quantitative tests are compared to the “preferred units” for that test. The incoming result will either be accepted, rescaled with help from a second dictionary of conversion factors, or flagged. Similarly, qualitative test results are standardized. Moving terminology out of program code and into external dictionaries provides excellent transparency and traceability. The approach is readily implemented, and both program code and dictionaries are maintainable and extensible across protocols or product lines.
DS02 : Ir-RECIST-ible
Sanket Kale, PAREXEL
Jeffrey Moran, PAREXEL
Monday, 2:45 PM - 3:15 PM, Location: Continental 1
The majority of oncology trials with an efficacy endpoint involving solid tumor assessment follow RECIST (Response Evaluation Criteria in Solid Tumors) guidelines for tumor evaluation. RECIST provides a standard method for determining whether patients improve, remain stable, or worsen in response to therapy. RECIST requires collecting and tracking a fairly large number of attributes about each of several tumors in every patient. Unfortunately, even small errors or inconsistencies in clinical procedure or data collection, both of which are common in a clinical trial setting, can make data unusable for RECIST response determination. This paper discusses key problems that can negatively impact the use of data for RECIST response determination and describes integrity checks that can help identify and correct these problems. These integrity checks can be used in data cleaning to ensure that the maximum pool of data is available for determining response to treatment.
DS03 : SAS File Design: Guidelines for Statisticians and Data Managers
Douglas Zirbel, Independent Consultant
Monday, 3:30 PM - 4:00 PM, Location: Continental 1
Although it might at first seem harmless, careless design of clinical trials data files can result in big problems later. Misunderstandings, delays, and even erroneous data are potential consequences to the clinical research organization, the pharmaceutical company, and ultimately to patients. This paper reviews a number of mistakes made in analysis/submission file specifications, then provides guidelines and a checklist for intelligent file design, whether CDISC or other formats.
DS05 : Seven New SDTM Domains for Medical Devices
Carey Smoak, Roche Molecular Systems, Inc.
Fred Wood, Octagon Research Solutions
Rhonda Facile, CDISC
Kit Howard, Kestrel Consultants
Monday, 4:30 PM - 5:30 PM, Location: Continental 1
Medical devices are an important part of the healthcare industry, and device approvals (PMAs) by the FDA increased by more than 50% from 2000 to 2009. In May 2006, an SDTM Device sub-team was formed; it was expanded in February 2009 to include CDASH team members. The goal of this Device sub-team is to develop a set of content standards for a core set of data collection fields and submission standards. Toward that goal, six new SDTM domains have been developed: (1) Device Information (DI), (2) Device In-Use Properties (DU), (3) Device Exposure (DX), (4) Device Events (DE), (5) Device Tracking and Disposition (DT), and (6) Device Subject Relationships (DR). Two new SDTM variables (generic and unique device identifiers) will be needed for these six new SDTM domains. This presentation/paper will describe the project history, the problem we are attempting to solve, the status of these new submission domains, and next steps.
DS06 : SDTM Domain Mapping with Structured Programming Methodology
Chengxin Li, Boehringer Ingelheim
Jing-Wei Gao, Boehringer Ingelheim
Nancy Bauer, Boehringer Ingelheim
Tuesday, 10:15 AM - 10:45 AM, Location: Continental 1
This paper describes the implementation of SDTM domain mapping with structured programming methodology. Structured design and structured programming make our mapping codes easy to implement, understand, debug, and maintain. Example codes for domain mapping are provided, with which the readers can easily adapt to meet their mapping requirements.
DS07 : Generating SUPPQUAL Domains from SDTM-Plus Domains
John R Gerlach, TAKE Solutions, Inc.
Glenn O'Brien, Independent Consultant
Tuesday, 10:45 AM - 11:15 AM, Location: Continental 1
Generating Supplemental Qualifier (SUPPQUAL) domains is a staple component of any CDISC conversion project. In fact, many of the SDTM domains have a respective SUPPQUAL domain, which can be just as challenging to produce, as well as considerably larger in size. Even if there is an SDTM-Plus domain that contains the variables intended for a respective SUPPQUAL domain, the task of creating it can still be quite tedious, as well as error prone. Obviously, it would be better to automate this process in a way that guarantees compliance and accuracy. This paper explains a SAS utility for generating SUPPQUAL domains from SDTM-Plus domains.
DS08 : Avoiding a REL-WRECK; Using RELREC Well
Karl Miller, PharmaNet/i3
JJ Hantsch, PharmaNet/i3
Janet Stuelpner, SAS
Monday, 4:00 PM - 4:30 PM, Location: Continental 1
Everything that happens to a single subject in a clinical trial is interrelated. Most points of data can be reported and processed in isolation, but not all. When two or more points of data in two or more CDISC Study Data Tabulation Model (SDTM) domains require reporting their relationship, there is a Special-Purpose domain for this purpose: RELREC (Related Records). Although often employed for accidental (or inexplicit) relationships, where data destined for two domains is collected on one CRF page, an explicit relationship between data reported in two domains is also possible. A subject who takes a concomitant medication to treat an adverse event has an explicit relationship between two records in the CM and AE domains. An adverse event which causes a dose delay or titration (AE & EX), a lab value which indicates a change in dosing epoch (LB & DS), and a substance usage report which results in study termination (SU & DS) are other examples of related records. Careful consideration of what types of explicit relationships are expected and how they will be reported can prevent RELREC from turning into a REL-WRECK. In this paper, we demonstrate our method of avoiding the re-mapping of entire SDTM domains and risking mismatched relationships in your final submission data.
DS09 : Considerations in the Submission of Exposure Data in an SDTM-Compliant Format
Fred Wood, Octagon Research Solutions
Jerry Salyers, Octagon Research Solutions
Richard Lewis, Octagon Research Solutions
Monday, 1:15 PM - 2:15 PM, Location: Continental 1
The submission of data regarding subjects’ exposure to study treatments is critical in making decisions regarding their safety. Many sponsors are concerned about getting their data into an SDTM-compliant format; however, our experience with legacy-data conversions shows that many studies don’t collect data with sufficient granularity to allow a reliable assessment of actual exposure. This paper will discuss the pros and cons of various methods of collecting and submitting exposure data, as well as some of the challenges sponsors may face in converting data to be consistent with the SDTMIG Exposure domain.
DS10 : Considerations in the Submission of Pharmacokinetic (PK) Data in an SDTM-Compliant Format
Fred Wood, Octagon Research Solutions
Peter Schaefer, Pharsight
Richard Lewis, Octagon Research Solutions
Tuesday, 1:15 PM - 2:15 PM, Location: Continental 1
Pharmacokinetic data is collected for a variety of purposes in clinical trials. Some of the first-in-human trials provide absorption and elimination data for comparisons with data collected in non-human species. PK data is also used in the determination of dosing frequency for therapeutic benefits, as well as for bioequivalence testing for alternative dose forms and/or regimens. This paper will focus on some of the challenges associated with managing the collected concentration data and the subsequent calculation of PK parameters. Examples of how the collected data map into the SDTMIG PC (Pharmacokinetic Concentrations) and PP (Pharmacokinetic Parameters) datasets, and of the use of RELREC, will be presented. An overview of the development of CDISC controlled terminology for PK parameters and PK-parameter units will be provided.
DS11 : Experiences in Leading a Company-Wide First CDISC Filing from a Programming Perspective
Sho-Rong Lee, Genentech, Inc.
Wednesday, 10:15 AM - 10:45 AM, Location: Continental 1
The FDA first began accepting CDISC submissions about a decade ago. Some companies have CDISC filing experience, others are in the active preparation phase, and still others are evaluating their capacity to launch the process. Many factors need to be considered prior to launching a CDISC filing: support from management and the filing team, implementation of SDTM data mapping and ADaM analysis data derivations, CDISC awareness training in various functions, eSubmission CRT metadata documents and structure, the annotated CRF, and Controlled Terminology, to name a few. This paper will describe an approach to initiating a company-wide first CDISC submission and the design and construction of CDISC tools and processes, and will discuss challenges, successes, and lessons learned. The goal is to present insights gleaned from the recent CDISC submission experience of Genentech, a member of the Roche Group, thereby benefiting the industry.
DS13 : “Analysis-ready” – Considerations, Implementations, and Real World Applications
Ellen Lin, Amgen Inc
Beiying Ding, Amgen Inc
Tuesday, 2:15 PM - 2:45 PM, Location: Continental 1
“Analysis-ready” is one of the widely accepted key principles applied in clinical trial analysis database design. It is also emphasized in the CDISC ADaM model v2.1 and implementation guide v1.0. However, ambiguity remains in some aspects of the “analysis-ready” concept and its implementation, largely due to the often very diverse nature of collected data and planned analyses. The lack of industry consensus on the concept of analysis and on the application scope of the “analysis-ready” principle has also led to drastically different interpretations and implementations. In some situations, programming logic, such as complicated data derivation dependencies, may affect implementation of this principle as well. Over years of working on ADaM database design and regulatory submissions, we have encountered numerous challenges and decisions in trying to maximize “analysis-ready” potential in practice. In this paper we share our experiences and interpretations of “analysis-ready” for various purposes such as descriptive summary, statistical modeling, manual data review, ad hoc analysis, and data integration. We also provide real-world examples to highlight approaches employed in some typical data and analysis scenarios. All discussions are in the context of ADaM and SAS programming.
DS14 : A Complex ADaM dataset? Three different ways to create one.
Kevin Lee, Cytel
Tuesday, 4:00 PM - 4:30 PM, Location: Continental 1
The paper is intended for clinical trial SAS® programmers who create and validate complex ADaM datasets. Some ADaM datasets require complex algorithms, which can involve several steps of data manipulation and more than one SDTM domain. It is difficult to create a complex ADaM dataset that conforms to the ADaM data structures, and just as difficult to validate one. The paper will introduce three different ways to create a complex ADaM dataset. The first is to create the ADaM dataset directly from SDTM without any intermediate permanent datasets. The second is to create it from SDTM through intermediate permanent datasets such as SDTM+ or ADaM+. The third is to create the final ADaM dataset through an intermediate ADaM dataset. The paper will discuss the benefits and shortcomings of each approach and show examples.
DS15 : Building Flexible ADaM Analysis Variable Metadata
Songhui Zhu, K & L Consulting Services
Lin Yan, Celgene Corp.
Tuesday, 3:30 PM - 4:00 PM, Location: Continental 1
Building variable metadata for a clinical study is a tedious and time-consuming part of ADaM development. In particular, changes in variables and analysis rules from various sources make harmonization of metadata across multiple parallel studies challenging. There is a need to build metadata flexible enough to accommodate these dynamic changes. In this paper, methods for creating flexible variable metadata are presented: flexibility to accommodate changes in attributes from SDTM data, changes in attributes of common variables, changes in analysis time points, and changes in analysis derivation rules. In addition, principles for grouping variables are proposed.
DS16 : Common Misunderstandings about ADaM Implementation
Nate Freimark, Theorem Clinical Research
Susan Kenny, Amgen
Jack Shostak, Duke Clinical Research Institute
John Troxell, Consultant
Tuesday, 4:30 PM - 5:30 PM, Location: Continental 1
The December 2009 release of Version 1.0 of the Analysis Data Model Implementation Guide (ADaMIG), and its endorsement by the United States Food and Drug Administration (FDA), has led to widespread industry implementation. As sponsors attempt to implement the ADaM standard, there has been some variety in interpretation of aspects of the ADaMIG. The authors describe some common difficulties with implementation of ADaM, and explain best practices. The variables AVALC and DTYPE, the relationship between PARAM and AVAL, and other aspects of the ADaMIG are discussed. In each case, the authors attempt to clarify the intent of the ADaMIG, in order to assist current implementers and to set the stage for improved clarity in the next version of the standard.
DS17 : Creating Integrated Summary of Safety Database using CDISC ADaM : Challenges, Tips and Things to Watch Out For.
Rajkumar Sharma, GENENTECH
Tuesday, 2:45 PM - 3:15 PM, Location: Continental 1
Most individual trials are not powered to identify trends and rare adverse events. By combining trials, rare events can be detected and a better picture can be obtained of the general safety profile of a drug. Creating an Integrated Safety Database by pooling studies together can facilitate this. Pooling information across multiple studies into an Integrated Safety Database generally requires that relevant data from all studies have the same structure and metadata standards. Because the CDISC structure is well defined, an Integrated Safety Database can be built with relative ease when the pooled studies are in CDISC SDTM and ADaM format. However, there are challenges and issues to watch out for. This paper will cover in detail how to create an Integrated Safety Database, the challenges associated with it, things to watch out for, and tips.
DS18 : CDISC ADaM Application: Does All One-Record-per-Subject Data Belong in ADSL?
Sandra Minjoe, Octagon Research Solutions
Wednesday, 8:00 AM - 9:00 AM, Location: Continental 1
The CDISC (Clinical Data Interchange Standards Consortium) ADaM (Analysis Data Model) team has developed a one-record-per-subject structure called ADSL (Analysis Data Subject Level). The ADaM IG (Implementation Guide) version 1.0 describes many variables that are commonly used in ADSL, suggests that sponsors include additional variables that describe the subject’s trial experience, but warns against including too much in this structure. This paper helps implementers determine what one-record-per-subject data should be included in ADSL, describes cases when it would be advisable to instead put this data in another structure, and gives examples/applications of other structures that could be used. Traceability is stressed, both across analysis datasets and back to SDTM data. Some familiarity with SDTM and ADaM is assumed.
DS19 : Multiple Applications of ADaM Time-to-Event Datasets
Huei-Ling Chen, Merck
Helen Wang, Merck
Tuesday, 11:15 AM - 11:45 AM, Location: Continental 1
The CDISC ADaM team has recently created guidelines for Time to Event (TTE) datasets. In the guidelines, a TTE dataset is specifically used to support survival analysis. This paper demonstrates that the TTE dataset can be used to support Exposure-Adjusted Incidence Rate (EAIR) analysis. This exploratory use of the ADaM TTE datasets structure filled in a void in the CDISC ADaM analysis dataset structure for EAIR analysis. Detailed procedure and code for generating an EAIR analysis dataset will be presented. In addition, step-by-step instructions for generating a TTE analysis dataset and survival analysis result will also be covered in this paper.
DS20 : An Innovative ADaM Programming Tool for FDA Submission
Xiangchen (Bob) Cui, Vertex Pharmaceuticals, Inc.
Min Chen, Vertex Pharmaceuticals, Inc.
Tathabbai Pakalapati, Vertex Pharmaceuticals, Inc.
Wednesday, 10:45 AM - 11:15 AM, Location: Continental 1
It is good practice to include data definition tables (define.xml) and a reviewer’s guide along with ADaM datasets to minimize the time FDA reviewers need to become familiar with the submitted clinical data and to expedite the approval process. It is important to ensure consistency in metadata among the data definition tables, the reviewer’s guide, and the ADaM datasets. This paper describes an automated ADaM Programming Tool, consisting of six SAS macros, to streamline the process of creating programming specifications, checking specifications for compliance with FDA and CDISC requirements, deriving ADaM datasets, and generating define.xml and a reviewer’s guide. The tool also automates version control of specifications, consistency checking of controlled terminology and value-level metadata between ADaM and define files, detection of empty variables in ADaM datasets, preparation of batch files, and addition of core variables to all ADaM datasets and define.xml at the final run, thereby achieving accuracy and efficiency.
DS21-SAS : Creating Analysis Data Marts from SDTM Warehouses
Frank Roediger, SAS
Tuesday, 8:00 AM - 9:00 AM, Location: Continental 1
Getting clinical trial data into SDTM domains is only the first hurdle in the race. The next obstacle is being able to combine separate trials’ SDTM domains so that they can be analyzed. On the surface, the task of combining SDTM data sets is straightforward – simply stack each trial’s domain with the corresponding domain from all the other trials. However, as for any other type of data warehouse that is loaded over time, careful attention needs to be given to assure that strict version control is in place for field attributes, controlled terminologies, and thesaurus-based data such as MedDRA. This paper describes a two-step approach that has been used to build SDTM warehouses and then to extract analysis data marts from them. The first step is to create two comprehensive warehouses: one for ongoing trials and another for completed trials. The second step is to create analysis data marts by extracting from the warehouses the domains for a set of homogeneous trials. Analysts develop the criteria for homogeneity using information in the trials’ Trial domains (TA, TE, TI, TS, & TV). These criteria drive an extract process that creates an analysis data mart of SDTM domains that contain only the records for the trials that meet the analyst’s criteria.
DS22-SAS : Harnessing the Power of SAS ISO 8601 Informats, Formats, and the CALL IS8601_CONVERT Routine
Kim Wilson, SAS
Wednesday, 9:00 AM - 10:00 AM, Location: Continental 1
Clinical Data Interchange Standards Consortium (CDISC) is a data standards group that governs clinical research around the world. This data consists of many date, time, datetime, duration, and interval values that must be expressed in a consistent manner across many organizations. The International Organization for Standardization (ISO) approved the ISO 8601 standard for representing dates and times, and this standard is compliant with CDISC. This paper addresses how to create and manage ISO 8601 compliant date, time, and datetime values in a CDISC environment. The paper also discusses the computation of durations and intervals. Examples that use the SAS call routine CALL IS8601_CONVERT and other programming logic are also provided, along with helpful tips and suggestions. In addition, the paper presents solutions to some common date and time problems, such as handling missing date components.
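The E8601 family of informats and formats the paper covers can be sketched briefly; the datetime string below is an arbitrary example value.

```sas
/* Minimal sketch: reading and writing ISO 8601 values with the
   E8601 informats and formats. The DTC value is an arbitrary example. */
data iso;
   dtc = '2012-05-14T09:30:00';
   dt  = input(dtc, e8601dt19.);   /* numeric SAS datetime */
   d   = datepart(dt);
   put d= e8601da10. dt= e8601dt19.;
run;
```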
Data Visualization and Graphics
DG01 : Proficiency in JMP Advanced Visualization
Charles Edwin Shipp, Consider Consulting, Inc.
Wednesday, 9:00 AM - 10:00 AM, Location: Continental 2
The premier tool for robust statistical graphics is JMP. It combines easy navigation, advanced algorithms, and graphical output you can trust. After a brief introduction of JMP navigation and resources within the JMP software, we take a tour of classical and modern advanced graphical capability of JMP. We then introduce case studies that show the power of JMP, ending with graphics and analysis. To conclude, we cover directions in training and JMP user exchange.
DG02 : Statistical Graphics for Clinical Research Using ODS Graphics Designer
Wei Cheng, Isis Pharmaceuticals, Inc.
Tuesday, 10:15 AM - 11:15 AM, Location: Continental 2
Statistical graphics play an important role across various stages of clinical research. They help investigators explore and understand raw data in the early stages of statistical analysis, as well as present final analysis results in formal publications. The graphs need to be specifically designed and carefully drawn to best represent the data and analysis. While this can be done with SAS® programming using traditional SAS DATA steps and SAS/GRAPH procedures, the process is time consuming, with much time spent finding the right options or annotation syntax. With SAS 9.2M3, the ODS Graphics Designer becomes production software. This interactive “point and click” application can be used to design and create custom graphs, and it greatly enhances the ability to generate statistical graphs for clinical research effectively. In this hands-on workshop, we will show you the application interface and walk you through creating some commonly used statistical graphs for clinical research. The intended audience does not need to know SAS/GRAPH syntax, but wants to create high-quality statistical graphs for clinical trials. Examples will use scrambled real-world data in CDISC format.
DG03 : Building Reports and Documents in SAS® Enterprise Guide 4.2®: How to Create a Single Document from Multiple Reports and Tasks
R. Scott Leslie, MedImpact Healthcare Systems, Inc.
Monday, 10:15 AM - 11:15 AM, Location: Continental 1
The new report building functions of SAS® Enterprise Guide® allow greater ability to create and publish final documents from multiple project tasks and reports. This tutorial demonstrates how to leverage the power of SAS while using the handy features of Enterprise Guide to create a process flow for generating reports. Topics covered include: navigating the Enterprise Guide 4.2 menus, creating a report with the Report Builder wizard, combining reports and tasks into a single document using Report layout, and using prompts to generate custom reports from a template.
DG04 : iRobot: Listing Creator
David Gray, PPD
Zhuo Chen, PPD
Suneela Gogineni, PPD
Wednesday, 10:15 AM - 11:15 AM, Location: Continental 2
This paper introduces a method to automate the creation of a set of SAS listing programs, each generating a listing based on input specifications. An MS Word macro inputs the listing specifications and outputs an MS Excel file containing the key information needed to generate the listing programs. A SAS macro reads the MS Excel file and creates a set of SAS programs, which can then be further customized.
DG05 : A Multifaceted Approach to Generating Kaplan-Meier and Waterfall Plots in Oncology Studies
Stacey D. Phillips, Pharmanet/i3
Monday, 9:15 AM - 9:45 AM, Location: Continental 1
The new Statistical Graphics (SG) procedures and Graph Template Language (GTL) available in SAS version 9.2 have greatly increased the flexibility and ease with which graphs and figures are generated from the SAS System. This paper will demonstrate multiple methods of generating the commonly utilized Kaplan-Meier and waterfall plots from oncology data. Examples and discussion will range from traditional SAS/Graph procedures such as PROC GPLOT to Statistical Graphics (PROC SGPLOT) and ODS with GTL.
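A waterfall plot of the kind discussed above can be sketched with PROC SGPLOT; the dataset RESP and variables PCHG (best percent change) and TRT are hypothetical.

```sas
/* Minimal sketch of a waterfall plot with PROC SGPLOT.
   RESP, PCHG, and TRT are hypothetical names. */
proc sort data=resp out=wf;
   by descending pchg;
run;

data wf;
   set wf;
   subjord = _n_;       /* display order across the x-axis */
run;

proc sgplot data=wf;
   vbar subjord / response=pchg group=trt;
   refline -30 / axis=y lineattrs=(pattern=dash);  /* RECIST PR cutoff */
   xaxis display=none;
   yaxis label='Best Percent Change from Baseline';
run;
```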
DG06 : Creating SAS® Report Outline Using the Output Delivery System for Online Review
Sue Zhang, Abbott Nutrition
Monday, 9:45 AM - 10:15 AM, Location: Continental 1
The ODS RTF and PDF destinations in SAS® permit the user to create linked table of contents (TOC) pages or bookmarks that, with a single mouse click, take the user directly to the desired SAS output. This allows the user to easily move around the document online via the TOC pages or bookmarks instead of scrolling back and forth through a long SAS report. This paper presents step-by-step procedures for applying the ODS CONTENTS and TOC_DATA options and the ODS PROCLABEL statement to generate informative TOC pages and bookmarks for fast online review. Methods are detailed for (1) generating the table titles in the TOC pages and the Document Map in Word; (2) presenting the data set names in the TOC pages and the Document Map in Word for data set documentation; and (3) displaying the discrepancy notices in the bookmarks of a PDF file for two data sets compared using PROC COMPARE during the clinical study validation process. Data presented in this paper are either directly obtained or modified from SASHELP.CLASS for demonstration purposes only. PC SAS version 9.2 and Microsoft® Office Word 2003 on Windows XP Professional version 2002 were used in preparing this paper.
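A minimal sketch of the RTF TOC technique the abstract describes follows; the output path and PROC LABEL text are hypothetical, and the exact option spelling should be checked against the paper.

```sas
/* Minimal sketch: an RTF report with a linked table of contents.
   Output path and label text are hypothetical. */
ods rtf file='/outputs/report.rtf' contents=yes toc_data;
ods proclabel 'Table 14.1: Demographics';
proc print data=sashelp.class noobs;
run;
ods rtf close;
```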
DG07 : Easy-to-use Macros to Create a Line Plot with Error Bars Including Functions of Automated Jittering and Adjusting Axis Ranges
Andrew Miskell, GlaxoSmithKline
Peixin Sun, GlaxoSmithKline
Yufei Du, GlaxoSmithKline
Monday, 11:15 AM - 11:45 AM, Location: Continental 1
This paper presents three macros to create a line plot with error bars that includes automatic jittering and axis-range adjustment within the plot. The purpose of these three macros is 1) to create a mean or median line plot with error bars of standard error, standard deviation, or range that has individual panels for each value of a parameter and also pages by a different parameter; 2) to automatically jitter the X-axis variable based on the number of lines in the graph; and 3) to dynamically define axis ranges that are identical within each page-by parameter yet vary across page-by parameters, with options to incorporate up to two reference lines in the range. The jittering and axis-range macros are stand-alone and can be used apart from the first macro as well. SAS version 9.1 is used for these macros; a Windows platform is required, and they have been tested only on version 9.1 and above. Basic SAS skills and an understanding of macro calls are needed.
DG08 : Personalized Risk Graphs for Intervention Trials – in Technicolor! Using PROC GKPI and CALL EXECUTE to Get the Job Done
Janet Grubber, Durham VA HSR&D
Cynthia Coffman, Durham VA HSR&D, Duke University Medical Center
Will Yancy, Durham VA HSR&D, Duke University Medical Center
Corrine Voils, Durham VA HSR&D, Duke University Medical Center
Wednesday, 8:00 AM - 8:30 AM, Location: Continental 2
Research investigators and clinicians are often interested in producing personalized risk graphs for research participants or patients. We used the GKPI procedure in SAS version 9.2 to produce batches of risk graphs for a randomized trial of an intervention to communicate participants’ individual risks for developing type 2 diabetes based on demographic, clinical, and genetic risk factors. We pilot-tested a variety of potential graphs with 35 participants to determine, based on participant feedback, which graphs were most easily understood and most clearly communicated risk. Graphs created by PROC GKPI can only be produced by hard coding the data values within the procedure rather than by passing the values to the procedure through the “DATA=Data Set Name” option. We present a macro and CALL EXECUTE code to demonstrate a method for producing batches of graphs without having to hard code the data values within the GKPI procedure. This paper presents the final versions of these risk graphs and the associated SAS code for producing them. It is targeted towards users with an intermediate level knowledge of SAS programming.
DG10 : Taming the Box Plot
Sanjiv Ramalingam, Vertex Pharmaceuticals Inc.
Tuesday, 3:30 PM - 4:00 PM, Location: Continental 2
Box plots are used to portray the range, quartiles, and any outliers in the data. PROC BOXPLOT can be used to obtain the necessary graphic, but situations may arise where direct application of the procedure, and of the plethora of options that come with it, is not sufficient to get the desired graph. One such application is discussed in detail; understanding it should empower the reader to create any box plot he or she may wish. The methodology discussed assumes that the reader has at least a modicum of understanding of the annotate facility.
DG11 : Gilding the Lily: Boutique Programming in TAGSETS.RTF
Rohini Rao, Omeros Corporation, Seattle, WA
Paul Hamilton, Omeros Corporation, Seattle, WA
Tuesday, 4:00 PM - 4:30 PM, Location: Continental 2
PROC REPORT has long been a powerful and flexible tool in clinical programming for producing tables and listings. The ODS TAGSETS.RTF destination was introduced in SAS version 9.2, enabling programmers to produce higher-quality output with less effort. This paper demonstrates a few tips and tricks to subtly enhance the aesthetics of the reports. Topics covered include: the use of decimal tabs to properly align numerical output; the use of Unicode to insert special characters; the deletion of excess blank lines at the top and bottom of a page; the insertion of page-specific footnotes; varying the number of lines per page to improve the flow of output; and the repetition of a header row when the body of a section spans multiple pages. SAS version 9.2 on the Windows platform is used in this paper, which is geared toward an audience with an intermediate level of SAS and ODS knowledge.
DG12 : Is the Legend in your SAS/Graph(R) Output Still Telling the Right Story?
Alice Cheng, Independent Consultant
Justina Flavin, SimulStat, Inc.
Tuesday, 11:15 AM - 11:45 AM, Location: Continental 2
In clinical studies, researchers are often interested in the effect of treatment over time for multiple treatments or dosing levels. Usually, in a graphical report, the measurement of treatment effect is on the vertical axis and a second factor, such as time or visit number, on the horizontal axis. Multiple lines are displayed in the same figure; each line represents a third factor, such as treatment or dosing group. It is critical that the line appearances (color, symbol, and style) are consistent throughout the entire clinical report, as well as across clinical reports from related studies. Flavin and Carpenter (2004) showed that the GPLOT procedure, by default, does not guarantee consistency in line appearances, and they provided macro and non-macro solutions to this problem. With the introduction of Statistical Graphics in SAS® version 9.3, there are multiple approaches to resolving this problem. In this paper, the authors cover the following topics: the nature of SGPLOT (how are line attributes assigned?); revisiting the line-inconsistency problem by means of the SGPLOT procedure; five solutions to resolve line-inconsistency issues; and the SGPANEL procedure as a special case of the SGPLOT procedure.
DG13 : Waterfall Charts in Oncology Trials - Ride the Wave
Niraj Pandya, Element Technologies Inc.
Wednesday, 8:30 AM - 9:00 AM, Location: Continental 2
In oncology trials, overall drug response is usually analyzed using Response Evaluation Criteria in Solid Tumors (RECIST), under which results are expressed in 4 to 5 categories. Collapsing tumor response into categories reduces the amount of data available for interpretation. A waterfall chart is one way to address this issue: it plots tumor size across the entire spectrum, giving a much broader view of the results. The waterfall chart is a data visualization technique that depicts how a value increases or decreases for a parameter of interest. It has recently been gaining recognition for the reporting of overall drug response in oncology clinical trials because of its simple yet powerful representation of results. It can be used for performance or response analysis, especially for understanding or explaining the overall response of a parameter that varies with the effect of multiple factors. SAS® offers multiple ways to produce waterfall charts, and in this article we will focus on the use of the recently introduced SGPLOT procedure.
DG14-SAS : Creating Statistical Graphics in SAS®
Warren Kuhfeld, SAS
Tuesday, 1:15 PM - 3:15 PM, Location: Continental 2
Effective graphics are indispensable in modern statistical analysis. SAS provides statistical graphics through ODS Graphics, functionality used by statistical procedures to create statistical graphics as automatically as they create tables. ODS Graphics is also used by a family of procedures designed for graphical exploration of data. This tutorial is intended for statistical users and covers the use of ODS Graphics from start to finish in statistical analysis. You will learn how to: * Request graphs created by statistical procedures * Use the SGPLOT, SGPANEL, SGSCATTER, and SGRENDER procedures to create customized graphs * Access and manage your graphs for inclusion in web pages, papers, and presentations * Modify graph styles (colors, fonts, and general appearance) * Make immediate changes to your graphs using a point-and-click editor * Make permanent changes to your graphs with template changes
DG15-SAS : Don't Gamble with Your Output: How to Use Microsoft Formats with ODS
Cynthia Zender, SAS
Tuesday, 4:30 PM - 5:30 PM, Location: Continental 2
Are you frustrated when Excel does not use your SAS® formats for number cells? Do you lose leading zeros on ZIP codes or ID numbers? Does your character variable turn into a number in Excel? Don't gamble with your output! Learn how to use the HTMLSTYLE and TAGATTR style attributes to send Microsoft formats from SAS to Excel. This paper provides an overview of how you can use the HTMLSTYLE attribute with HTML-based destinations and the TAGATTR attribute with the TAGSETS.EXCELXP destination to send Microsoft formats from SAS to Excel using Output Delivery System (ODS) STYLE= overrides. Learn how to figure out what Microsoft format to use and how to apply the format appropriately with ODS. A job aid is included in the paper; it lists some of the most common Microsoft formats used for numeric data. The examples in this paper demonstrate PROC PRINT, PROC REPORT, and PROC TABULATE coding techniques. Other job aids are provided that list some of the most common style attributes used in STYLE= overrides and show how to investigate Microsoft formats.
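As a rough illustration of the TAGATTR technique described above (the data set and variable names are hypothetical; the paper's job aids list many more Microsoft formats):

```sas
ods tagsets.excelxp file="sites.xml";

proc print data=work.sites noobs;
  /* send the Microsoft format 00000 to Excel so ZIP codes
     keep their leading zeros instead of becoming numbers */
  var zip / style(data)={tagattr='format:00000'};
  var sitename;
run;

ods tagsets.excelxp close;
```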
Hands-On Workshops
HW01 : An Introduction to the Clinical Standards Toolkit and Clinical Data Compliance Checking
Mike Molter, d-Wise Technologies
Monday, 1:15 PM - 3:15 PM, Location: Plaza A
Since the dawn of CDISC, pharmaceutical and biotech companies as well as their vendors have tried to inject standards compliance checking into their clinical and statistical programming flows. Such efforts are not without their challenges, both from technical as well as process standpoints. With the production of data sets and tabular results taking place inside of SAS programs, it’s tempting to add code to this flow that performs these checks. While the required code can be relatively straightforward for SAS programmers with even modest programming and industry experience, all too often the management of such code and the processes around its use is where the difficulties occur. Without proper management, seemingly simple tasks such as selecting which checks to run or changing process parameters become more complicated than necessary. The Clinical Standards Toolkit (CST) is an attempt by SAS to build a stable framework for the consistent use of BASE SAS around the process of standards compliance checking by striking the proper balance between the flexibility of BASE SAS and the needed discipline of process parameter management. In this workshop we will take a tour of the CST components and execute compliance checks under multiple circumstances set by users in these components. In the end, users should know not only how to set up programs to achieve this task, but also how to manipulate files to make this process work for their own needs.
HW02-SAS : Using the SAS® Clinical Standards Toolkit 1.4 for define.xml creation
Lex Jansen, SAS
Monday, 3:30 PM - 5:30 PM, Location: Plaza A
When submitting clinical study data in electronic format to the FDA, not only must information from the trials be submitted, but also information to help reviewers understand the data. Part of this information is a data definition file: the metadata describing the format and content of the submitted data sets. When submitting data in the CDISC SDTM format, it is required to submit the data definition file in the Case Report Tabulation Data Definition Specification (CRT-DDS, also known as define.xml) format as prepared by the CDISC define.xml team. This workshop will provide an introduction to the structure and content of the define.xml file. The SAS® Clinical Standards Toolkit will then be used to create the define.xml file. The workshop will highlight new support for value-level metadata and a printable define.xml (define.pdf).
HW03 : Creating the ADaM Time to Event Dataset: The Nuts and Bolts
Nancy Brucken, Pharmanet/i3
Paul Slagle, UBC
Tuesday, 3:30 PM - 5:30 PM, Location: Plaza A
The ADaM Basic Data Structure can be used to create far more than just laboratory and vital signs analysis datasets. Often, the biggest challenge is the development of efficacy datasets, and of the commonly-used efficacy datasets, creation of a time-to-event (TTE) dataset presents many interesting problems. These TTE datasets are frequently used in survival analysis, for example, to generate Kaplan-Meier curves for oncology reports. As a continuation of this series, we will provide directions for creating the ADaM-based TTE dataset, as well as help in using that structure to support both validation and statistical review of the results. Prior experience with ADSL and creation of other BDS datasets is expected.
HW04 : Ready To Become Really Productive Using PROC SQL?
Sunil Gupta, Gupta Programming
Monday, 9:15 AM - 11:15 AM, Location: Plaza A
Using PROC SQL, can you identify at least four ways to: select and create variables, create macro variables, create or modify table structure, and change table content? Learn how to apply multiple PROC SQL programming options through task-based examples. This hands-on workshop reviews topics in table access, retrieval, structure and content, as well as creating macro variables. References are provided for key PROC SQL books, relevant webinars, podcasts as well as key SAS® technical papers.
HW05 : SAS® Macro Programming Tips and Techniques
Kirk Paul Lafler, Software Intelligence Corporation
Tuesday, 10:15 AM - 12:15 PM, Location: Plaza A
The SAS® Macro Language is a powerful tool for extending the capabilities of the SAS System. This hands-on workshop presents numerous tips and tricks related to the construction of effective macros through the demonstration of a collection of proven Macro Language coding techniques. Attendees learn how to process statements containing macros; replace text strings with macro variables; generate SAS code using macros; manipulate macro variable values with macro functions; handle global and local variables; construct arithmetic and logical expressions; interface the macro language with the DATA step and SQL procedure; store and reuse macros; troubleshoot and debug macros; and develop efficient and portable macro language code.
HW06 : Doing More with the Display Manager: From Editor to ViewTable - Options and Tools You Should Know
Arthur Carpenter, CA Occidental Consultants
Wednesday, 8:00 AM - 10:00 AM, Location: Plaza A
If you have used the interactive interface for SAS®, you have used the Display Manager. As it ships, the Display Manager is very powerful and yet fairly easy to use, with a minimal learning curve for the new user. Because it is functional ‘right out of the box’, most users do very little to customize the interface. This is a shame, because the Display Manager contains a great many hidden opportunities to make it even more powerful, even easier to use, and customized for your way of using the interface. The Display Manager builds a variety of windows, screens, and dialogue boxes to facilitate communication between SAS, the operating system, and the user. For each of the five primary windows, and extending to the dozens of secondary windows, there are options that control the content, display, and level of interaction. Options have defaults, and a majority of these can be easily changed to suit your needs. You think that you know the Display Manager, but you will be amazed at what you have yet to learn. From simple tricks that will save you hours of work, to embedding tools and macros in the Enhanced Editor, there is so very much more that we can do in the Display Manager.
HW07 : SDTM, ADaM and define.xml with OpenCDISC(R)
Matthew Becker, PharmaNet/i3
Tuesday, 1:15 PM - 3:15 PM, Location: Plaza A
Standards are an ongoing focus of the health care and life science industry. Common terms you will see and hear during industry conferences include: “SDTM,” “ADaM,” “ODM,” “LAB,” “SEND,” and “define(.xml/.pdf).” What do these terms mean? How do we create and validate these standards before submission to a client OR the FDA? Is there an easier way to ensure compliance? As a user, many of us have spent hours reading the SDTM/ADaM standards and implementation guides to generate “compliant” SAS data sets for our clinical studies. We have spent countless hours having another user QC our data structures...but is there an easier way? In this Hands-On Workshop (HOW) we are going to briefly describe a few of the key terms (SDTM, ADaM, define) and investigate the use of OpenCDISC Validator to perform the following tasks: 1. Validate SDTM 3.1.1 SAS data sets 2. Validate SDTM 3.1.2 SAS data sets 3. Validate ADaM 1.0 SAS data sets 4. Generate define.xml
HW08-SAS : An Introduction to Creating Multi-Sheet Microsoft Excel Workbooks the Easy Way with SAS®
Vincent DelGobbo, SAS
Tuesday, 8:00 AM - 10:00 AM, Location: Plaza A
This paper explains how to use Base SAS® 9 software to create multi-sheet Excel workbooks (for Excel versions 2002 and later). You learn step-by-step techniques for quickly and easily creating attractive multi-sheet Excel workbooks that contain your SAS output using the ExcelXP ODS tagset and ODS styles. The techniques that are presented in this paper can be used regardless of the platform on which SAS software is installed. You can even use them on a mainframe! Creating and delivering your workbooks on-demand and in real time using SAS server technology is discussed. Although the title is similar to previous papers by this author, this paper contains new and revised material not previously presented.
Health Outcomes and Epidemiology
HO01 : Multiple Techniques for Scoring Quality of Life Questionnaires
Brandon Welch, Rho, Inc.
Seungshin Rhee, Rho, Inc.
Tuesday, 8:00 AM - 8:30 AM, Location: Continental 3
In the clinical trials computing environment, data sets come in a variety of shapes and sizes. From laboratory data to electrocardiogram (ECG) measurements, transforming raw data to analysis-ready SAS® data sets is often complicated. New challenges arise when we receive data collected from quality of life (QOL) questionnaires. With these data we often compute scores that measure underlying constructs – such as mental or social well-being. There are many different types of questionnaires, and it is advantageous to have an arsenal of programming tools when calculating the appropriate scores. In this article, we present a mock questionnaire and common techniques to achieve appropriate calculations. Depending on the input data structure, we illustrate how to calculate scores using various techniques including ARRAY processing, PROC SQL, and simple SAS functions. The techniques we present offer a good overview of basic data step programming and SAS procedures that will educate SAS users at all levels.
HO02 : 7 Steps to Insights Using SAS: Progression Free Survival
Karen Walker, Walker Consulting LLC
Tuesday, 8:30 AM - 9:00 AM, Location: Continental 3
"7 Steps to Progression Free Survival Insights Using SAS" will show how to take any oncology source data, render it to tumor domains, create both subject-level and parameter-level analysis data, and subsequently produce stunning, insightful documents with SAS that can be used to find a cure for cancer. A growing number of people are affected by cancer, and thus we need as many people as possible to understand it. Because I feel so strongly about this, I'm giving up the goods on the best work I know. You'll find in this paper a collection of methods and programs so intuitive and eloquently assembled that anyone who can read will understand my process. Perhaps there's a talented doctor who has no time to tinker with a computer program, and yet is close to finding the right treatment. This paper is for that person. My aim is to make CDISC SDTM and ADaM data rendering easier to follow, so this paper illustrates the natural progression from data gathering, to analysis, to insights.
HO04 : Tumor Assessment using RECIST: Behind the Scenes
Suhas R. Sanjee, Merck Sharp & Dohme Corp.
Rajavel Ganesan, Merck Sharp & Dohme Corp.
Mary N. Varughese, Merck Sharp & Dohme Corp.
Tuesday, 9:00 AM - 9:30 AM, Location: Continental 3
Tumor assessment is a vital part of drug discovery in our Oncology franchise. Most efficacy endpoints, such as Progression Free Survival and Time to Progression, are based on tumor assessments, which can be quantitative or qualitative. The various steps involved in quantitative tumor assessment are discussed in this paper. In quantitative tumor assessment, patient scans/images are fed into an algorithm, which quantifies the tumor based on its characteristics. A number of image-processing algorithms are available for tumor quantification from Computed Tomography (CT), Positron Emission Tomography (PET), Magnetic Resonance Imaging (MRI), and other modalities. These algorithms are traditionally not implemented in SAS®, and the statistical programmer will have these assessments pre-calculated in SDTM data for analysis. This paper describes the implementation of a simple tumor quantification and assessment algorithm using SAS procedures. Once the tumor is quantified, tumor assessments are derived based on the RECIST criteria. Implementing an end-to-end tumor assessment process in-house using SAS provides the capability to perform sophisticated exploratory analysis of assessments derived from patient scans. Another advantage is that pharmaceutical and medical device companies can submit the SAS code used for end-to-end tumor assessment to the regulatory agencies and thus enhance transparency. This is especially helpful when the primary and key secondary endpoints of the trial are derived from tumor assessments.
HO06 : Are You Discrete? Patients' Treatment Preferences and the Discrete Choice Experiment
Beeya Na, ICON Late Phase & Outcomes Research
Eric Elkin, ICON Late Phase & Outcomes Research
Tuesday, 9:30 AM - 10:00 AM, Location: Continental 3
The discrete choice experiment (DCE) was designed for use in economics and marketing research to study consumer preferences. However, DCE has been increasingly used in health care research as a method to elicit patient preferences for characteristics of different types of treatments. In a DCE, attributes are defined for treatments (for example: frequency of administration, occurrence of side effects, how long treatment effect lasts) and levels of the attributes (for example: taking one pill once a week, once a day, or twice a day). Respondents are presented with pairs of hypothetical treatments with different combinations of each attribute level and are asked to choose their preferred treatment. Analyzing the responses allows evaluation of the relative importance of the attributes and the trade-off respondents are willing to make between the attributes. This talk will explain how to set up the data and discuss the appropriate analysis using the conditional logit model (PROC PHREG and PROC LOGISTIC).
HO07 : Using the SAS System as a bioinformatics tool: a macro that calls the standalone Basic Local Alignment Search Tool (BLAST) setup
Kevin Viel, None
Tuesday, 10:15 AM - 11:15 AM, Location: Continental 3
Aligning DNA and RNA sequences is an integral task in genomics research. For brief and practical purposes, DNA and RNA can be thought of as character strings composed of A, C, G, and T (DNA) or U (RNA). Within the constraints of the SAS System, namely 32,767 bytes for a character variable, patterns can be matched or "aligned" using, for example, regular expressions (PRXMATCH). This size limit typically suffices even for state-of-the-art sequencing projects, in which read lengths may be well under 15,000 base pairs. For larger sequences, a SAS programmer may have to consider calling a Perl program or using a tool like the Basic Local Alignment Search Tool (BLAST), a popular alignment tool. An example of an alignment to a sequence larger than the SAS limit is determining the start position of a primer within a gene, for instance the 186 Kb F8 gene. The National Center for Biotechnology Information (NCBI) of the National Institutes of Health (NIH) provides the blast+ software package. The goals of this paper are 1) to describe a SAS macro that calls BLASTn using the X statement, 2) to demonstrate an example alignment, and 3) to illustrate how the macro can be used to create primer and amplicon names. A separate paper will demonstrate how to use the BLASTn results to genotype and create database entries in an integrated suite of SAS programs used in an ongoing genomics investigation.
HO08 : Estimating Risk in the Presence of Unreported Zeros in the Data
Brandon Fleming, University of Maryland, Baltimore County (UMBC) - Department of Mathematics and Statistics
Tuesday, 11:15 AM - 11:45 AM, Location: Continental 3
When evaluating risk of injury (exposure) using sample data alone, it is desirable to have as the denominator the total population size. However, if the sample itself contains only information about the frequency of exposure (e.g. frequency of N=1 exposure, 2, 3, etc.), and those who have never been exposed would not be entered into the database, how is one to ascertain the unreported zeros (frequency of N=0)? This paper explores the use of PROC NLMIXED in estimating the risk using the truncated Poisson assumption. The performances of three candidate estimators are analyzed with simulated data that contains frequency information. PROC NLMIXED is used first to identify the truncated Poisson mean, and then to estimate the frequency of N=0, i.e. those without injury or exposure (unreported zeros).
Industry Basics
IB01 : Adapt and Survive: Implementing the new ICH Development Safety Update Report (DSUR)
Rajkumar Sharma, Genentech Inc., A member of Roche Group
Patricia Gerend, Genentech Inc., A member of Roche Group
Wednesday, 8:00 AM - 8:30 AM, Location: Continental 3
In 2011, the International Conference on Harmonization (ICH) rolled out its E2F Development Safety Update Report (DSUR) guideline. The DSUR is similar to the EU's Annual Safety Report (ASR) and the US's Investigational New Drug Annual Report (IND-AR) in that its purpose is to provide a brief overview of safety for a project on an annual basis so that health authorities can make better decisions to protect the safety of patients. However, there are some significant differences in content between the DSUR and previous annual reports. Since DSUR implementation is new to all companies, getting a handle on best practices is a widespread challenge. This paper will address the major highlights of DSUR content and suggest some ways of producing these reports efficiently within a company.
IB02 : Risk Evaluation and Mitigation Strategies (REMS) and How They Impact Biostatistics and Statistical Programming Groups
Arthur Collins, Biogen Idec, Inc.
Wednesday, 10:15 AM - 10:45 AM, Location: Continental 3
The Food and Drug Administration Amendments Act (FDAAA) of 2007 expanded the regulatory authority of the FDA with regard to risk management programs, creating a new program called Risk Evaluation and Mitigation Strategies (REMS). The REMS requirements build on the Risk Minimization Action Plans (RiskMAP) which were instituted in 2005. As a result, the requirements placed on drug makers for post-marketing safety and risk assessment have been growing. Approximately one-third of new drug approvals have REMS associated with them, 16 of the 30 existing RiskMAP programs have been transitioned to REMS programs, and the FDA may be moving towards using REMS for mitigation of off-label product use. This leads to more work for departments supporting REMS activities including Biostatistics and Statistical Programming. Often the types of data that must be collected and analyzed for REMS are non-standard, less controlled, and may come from an assortment of sources with which these groups do not typically work. While REMS present new challenges to drug makers, they also bring opportunities including enhanced labeling, better communication with patients and providers, and strengthened ties to healthcare professionals. These opportunities can be attractive to many functional areas in the company, including Regulatory Affairs, Medical Affairs, Marketing, Drug Safety, and Clinical Development. In this paper I will present some background and history of the FDA’s risk management efforts and some discussion of how the current requirements may be implemented, as well as some of the challenges and opportunities that they present.
IB03 : Metadata: Some Fundamental Truths
Frank DiIorio, CodeCrafters, Inc.
Tuesday, 2:15 PM - 3:15 PM, Location: Continental 3
An essential characteristic of a healthy workplace is its ability to increase and diversify its work load without compromising product quality. Workflow volume and quality can often be addressed by adding to existing staff and improving the computing environment. When the work flow becomes a work deluge, however, the need for a paradigm shift becomes apparent. Metadata – data describing corporate processes and data – is a potent tool for making this shift. This paper discusses a number of conceptual issues related to the design, implementation, and growth of metadata-based systems. It identifies situations where metadata can improve processes and suggests how to evaluate both the benefits and costs of implementation. The treatment of the topic is high-level. The reader will not learn coding and design techniques, but will gain an appreciation of the power of metadata-driven workflow, and will have an eyes wide open understanding of what resources need to be expended in order to achieve it. Although the scenarios used are from the pharmaceutical industry, the larger, take-away points are applicable to all sectors.
IB04 : Could Have, Would Have, Should Have! Adopting Good Programming Standards, Not Practices, To Survive An Audit.
Vincent Amoruccio, Alexion Pharmaceuticals
Wednesday, 9:00 AM - 10:00 AM, Location: Continental 3
The primary purpose of a pharmaceutical company is to develop and market drugs that treat or prevent disease safely and efficaciously. A major step in the licensing of a drug, particularly in the United States, is the inevitable audit by the FDA. While the FDA holds that adherence to Good Clinical Practice ("GCP") is a critical requirement, it falls short of providing programming standards for the SAS deliverables of a submission. SDTM and ADaM are solutions to standardizing the review of data, but not the programs themselves. The lack of regulations leaves programmers unmanaged and exposed to risk when asked to deliver SAS programs to the FDA during a submission. While many programmers are addressing this through groups, papers, websites, and blogs, there are no formal Good Programming Practices ("GPP"). Until there are, programmers must create, manage, and defend their own programming choices during an audit. It is not enough to establish programming 'practices', since auditors only care about what was done rather than what could have, would have, or should have been done. This paper will first discuss the need for GPP and the common practices appearing in current GPP discussions. It will then discuss the difference between practices and standards and suggest ways to select practices to manage as standards. It will suggest ways to check and document compliance with GPS, not GPP, and to prepare for a successful FDA audit. Finally, it will end with a call to the FDA for established programming standards.
IB05 : Problem with your SAS Program? Solutions abound!
Max Cherny, GlaxoSmithKline
Tuesday, 4:30 PM - 5:00 PM, Location: Continental 3
All SAS users at some point will come across a SAS problem that they cannot resolve on their own. This paper explains where and how to get help to solve any SAS problem. The appropriate use of such sources of help as the SAS Help Facility, the SAS-L newsgroup, the SAS.com Knowledge Base, and many others is demonstrated, and their strengths and limitations are analyzed and compared. Additionally, most SAS users do not even realize they can contact SAS Technical Support about almost any SAS problem. This underutilized approach of getting help directly from SAS is also explained.
IB07 : Using a Picture Format to Create Visit Windows
Richann Watson, PharmaNet/i3
Wednesday, 8:30 AM - 9:00 AM, Location: Continental 3
Creating visit windows is sometimes required for analysis of data. We need to make sure that we get the visit/day in the proper window so that the data can be analyzed properly. However, defining these visit windows can be quite cumbersome especially if they have to be defined in numerous programs. This task can be made easier by applying a picture format, which can save a lot of time and coding. A format is easier to maintain than a bunch of individual programs. If a change to the algorithm is required, the format can be updated instead of updating all of the individual programs containing the visit definition code.
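As a rough illustration of the windowing-by-format idea described above (this sketch uses a simple VALUE format and hypothetical data set, variable names, and window boundaries; the paper itself applies a picture format):

```sas
proc format;
  value viswin            /* study day -> nominal analysis day */
    low -  1 = '0'        /* Baseline */
      2 - 10 = '7'        /* Week 1   */
     11 - 17 = '14';      /* Week 2   */
run;

data advs;
  set vs;                                   /* assumes a study-day variable ADY */
  avisitn = input(put(ady, viswin.), 8.);   /* windowed visit number */
run;
```

If the windowing algorithm changes, only the format needs to be updated; every program that applies it picks up the new definition automatically.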
IB08 : SAS Enterprise Guide Best of Both Worlds – Is it Right for You?
Sunil Gupta, Gupta Programming
Tuesday, 1:15 PM - 2:15 PM, Location: Continental 3
Whether you are new to SAS or a seasoned SAS Programmer, you still face the same dilemma. Does SAS Enterprise Guide represent the best of both worlds to make the transition to SAS easier with a point-n-click interface or enhance your productivity with over 90 tasks? Do you follow the same traditional path taken by millions who learned SAS many decades ago or do you take the yellow brick road to analyze your data? This presentation explores the vast differences between these two cultures and how they impact your programming environment. While there are numerous benefits to using SAS Enterprise Guide, there are also some caveats to keep in mind to make the transition smoother.
IB09 : Programmer's Safety Kit: Important Points to Remember While Programming or Validating Safety Tables
Sneha Sarmukadam, Pharmanet/i3
Tuesday, 5:00 PM - 5:30 PM, Location: Continental 3
With many global pharmaceutical companies and CROs outsourcing their work, there is a need for qualified professionals to do clinical programming. Building clinical programming capability requires well-designed training, generally consisting of rigorous SAS training followed by training on safety and efficacy modules. This paper provides a checklist for the reporting and validation of safety data in the pharmaceutical domain. With the help of this checklist, programming and validation can become relatively easy. Newcomers to clinical programming, especially programmers without a clinical background, may find these guidelines very useful. Moreover, because safety outputs are more or less similar across studies, using a standard checklist can ensure accuracy and efficiency in reporting.
IB10 : You Could Be a SAS® Nerd If . . .
Kirk Paul Lafler, Software Intelligence Corporation
Wednesday, 10:45 AM - 11:15 AM, Location: Continental 3
Are you a SAS® nerd? The Wiktionary (a wiki-based Open Content dictionary) definition of “nerd” is a person who has good technical or scientific skills, but is generally introspective or introverted. Another definition is a person who is intelligent but socially and physically awkward. Obviously there are many other definitions for “nerd”, many of which are associated with derogatory terms or stereotypes. This presentation intentionally focuses not on the negative descriptions, but on the positive aspects and traits many SAS users possess. So let’s see how nerdy you actually are using the mostly unscientific, but fun, “Nerd” detector.
Management and Support
MS01 : If You Can’t Learn It From a Book, Why Are You Reading This?
Steve Noga, Rho, Inc.
Monday, 9:15 AM - 10:15 AM, Location: Continental 2
People management, an inexact science if ever there was one, is often thrust upon individuals as they move up the corporate ladder. With promotion comes added responsibility whether a person is ready for it or not. How is a person supposed to learn to manage other people? Seminars, and online courses, and books, Oh My! The choices are numerous and chances are that knowledge will be obtained. However, how did you learn to deal with missing data where none was expected? You experienced the frustration after your program produced erroneous results and you learned how to prepare for this situation the next time. Management is no different. Welcome to my observations from twenty-one years of managing SAS® programmers.
MS02 : A CareerView Mirror - Another Perspective on your Work and Career Planning
Bill Donovan, OckhamSource
Monday, 10:15 AM - 10:45 AM, Location: Continental 2
Career planning in today's tumultuous job marketplace requires a more rigorous and disciplined approach, which must begin with each individual tracking his or her particular skills and experiences. The ability to organize and inventory your entire career-related experience is the foundation of a solid career plan. The catalog of your work assignments and functional responsibilities creates a reflection of your efforts in your career to date. All of this helps to build your CareerView Mirror.
MS03 : Consulting: Critical Success Factors
Kirk Paul Lafler, Software Intelligence Corporation
Charles Edwin Shipp, Consider Consulting, Inc.
Monday, 2:45 PM - 3:15 PM, Location: Continental 2
The Internet age has changed the way many companies, and individuals, do business – as well as the type of consultant that is needed. The consultants of today and tomorrow will require different skills than the consultants of yesterday. Today's consultant may just as likely have graduated with an MBA degree as with a technical degree. As hired advisers to a company, a consultant often tackles a wide variety of business and technical problems and provides solutions for their clients. In many cases a consultant chooses this path as an attractive career alternative after toiling in industry, government and/or academia for a number of years. This presentation describes the consulting industry from the perspective of the different types of organizations (e.g., elite, Big Five accounting firms, boutique, IT, and independent) that they comprise. Specific attention will be given to the critical success factors needed by today's and tomorrow's consultant.
MS04 : Managing Programmers across the Great Divide: An Odyssey in managing CRO's, in Big and Small Pharma, Biotech and Pharma and in NJ, CA and Boston.
Todd Case, Biogenidec, Inc.
Monday, 5:00 PM - 5:30 PM, Location: Continental 2
Managing SAS programmers has always been difficult, given the somewhat paradoxical requirement to possess the technical depth of SAS programming while, as a manager, also having to understand and, most importantly, communicate in assisting with and overcoming the difficult technical challenges posed to SAS programmers in the pharmaceutical industry (a recent example is analyzing elevated Liver Function Tests [LFTs] when there can be multiple tests administered on the same day). In addition, managers of SAS programmers routinely have to present to a variety of peers and stakeholders within their own companies (not to mention at industry conferences!). The recent trend in offshoring has added at least two major new difficulties in effectively managing SAS programmers: (1) dealing with different time zones (which at times has meant trying to get a team to work all 24 hours of the day), and (2) an even greater challenge in communicating effectively. This presentation will present one manager's experience in laying the groundwork for understanding the technical nature of programming. It will then detail the travel, challenges, and successes over the course of managing SAS programmers in biotech and pharmaceutical companies, small and large, and in New Jersey (home of traditional 'Big Pharma'), San Francisco (home of 'hot biotech start-ups'), and now Boston, home of all of the aforementioned industries.
MS05 : Giving Data a Voice: Partnering with Medical Writing for Best Reporting Practices
Katherine Troyer, REGISTRAT-MAPI
Brit Minor, REGISTRAT-MAPI
Monday, 10:45 AM - 11:15 AM, Location: Continental 2
An active partnership among programmers, statisticians, and medical writers throughout the reporting process promotes efficiency and quality, and assures that the product meets both scientific and regulatory objectives. Programmers and medical writers have essential but significantly non-overlapping areas of expertise, and when their functions are highly compartmentalized, they apply their skills at separate steps in the reporting procedure. Furthermore, some reporting deliverables, such as client-specific summaries of safety or other data, are produced entirely by programmers. Statisticians interact with both functional groups, and frequently can assist with communication; however, the role and timing of statistical input can vary widely among institutions and projects. Problems may arise when programming and medical writing function too independently. For example, an analysis program design may be at odds with regulatory requirements or scientific objectives. A programmer responsible for a final deliverable may have insufficient word-processing familiarity to complete the task efficiently. A medical writer may be unaware of analytical issues and/or assumptions that affect data presentation, and may incorrectly interpret and describe results. And regardless of how well integrated the study team may be, some data issues do not emerge until summaries are produced; then discussion is required in order to plan additional programming. We believe that programmers, statisticians, and medical writers should collaborate from the beginning through the end of a reporting project, to ensure that data are analyzed and presented with the least manipulation of output required in order to create a well-organized, easy-to-interpret, and technically acceptable product.
MS06 : Managing a Blended Programming Staff of Permanent Employees and Contingent Workers
Jim Grudzinski, GlaxoSmithKline
Monday, 2:15 PM - 2:45 PM, Location: Continental 2
The focus of this presentation will be my experience managing a blended programming staff: the challenge of managing the shift to a small permanent programming staff supplemented by an expanded contingent worker staff. It will highlight the impact of shifting roles on the permanent staff and the management techniques I used to ease this transition. I’ll discuss the need to ensure that the changing environment does not impact programming deliverables. I will also focus on techniques I used to manage contingent workers effectively and the resultant concerns of relying on contingent workers for the programming staff's success.
MS07 : How to Keep the Project on Budget in the Clinical Trial Study
Kevin Lee, Cytel
Monday, 4:00 PM - 4:30 PM, Location: Continental 2
If a SAS programmer is responsible for the budget of clinical trial study in the biometric department, he or she will need to watch and keep the projects on budget throughout the study. The paper is intended for those who are interested in budgeting in the biometric department. It will discuss the reasons and possibilities in going over budget during the study such as incorrect budget, improper resourcing, poor execution, incorrect deliverables and miscommunication with sponsor. The paper will also discuss the ways to reduce the costs and save a budget. The paper will be based on CDISC clinical trial setting.
MS08 : Building a Team of Remote SAS® Programmers
Ramesh Ayyappath, ICON Clinical Research, INC.
Monday, 3:30 PM - 4:00 PM, Location: Continental 2
Virtual work arrangements have become increasingly prevalent during the last decade among SAS programmers working in pharmaceuticals and CROs, and the trend is set to accelerate in the coming years due to technological advancement and organizational cost benefits. There are tremendous benefits to this kind of work arrangement – for both the employer and the employee. However, building and maintaining a successful virtual team is a challenging task and involves significant effort at the organization, management, and employee levels. For an employer, hiring an experienced SAS programmer is hard enough, and if telework has to be factored in, the process becomes even more difficult. For an employee, telework is a great incentive, but this kind of work arrangement has its own set of problems and is not suited for everyone. In this paper, I will discuss my experience in hiring remote SAS programmers – the competencies that are vital for teleworking SAS programmers to succeed, the challenges involved in hiring teleworkers, and ways to overcome some of those challenges. I will also present, from an employee’s perspective, the key obstacles that may limit the success of a remote programmer, and methods to overcome them in order to become a more efficient and effective teleworker.
MS09 : Coaching the Individual SAS Programmer: Generalist vs. Specialist
Mark Matthews, PharmaNet/i3
Sandra Minjoe, Octagon Research Solutions
Tuesday, 8:00 AM - 9:00 AM, Location: Continental 2
Good managers recognize that all programmers are not created equally, and we can best motivate and reward employees when we recognize them as individuals. One major distinction is that some programmers are generalists and enjoy the variety that comes with different types of studies and analyses, while others are specialists and prefer to hone their skills in one particular area. Difficulties arise when trying to force a specialist into a generalist role, or a generalist into a specialist role. This paper will show managers how to determine whether a programmer is a generalist or a specialist, help them find a good fit, and recommend some resources to increase skills in each area. The two authors have chosen different career paths and share their experiences as generalist and specialist, both good and bad. Mark is a generalist, in management at a CRO, where he has worked with many different clients. Sandra is a specialist, giving training and consulting in one niche market. Each knows the joy found in doing what comes naturally vs. the pain of being asked to do what does not, and applies this learning to managing others.
MS10 : Legacy SDTM Domain Development on the Critical Path - Strategies for Success
David Izard, Octagon Research Solutions
Monday, 11:15 AM - 11:45 AM, Location: Continental 2
The design and development of SDTM domains from legacy source data and documents requires careful planning and execution in order to produce a high quality set of deliverables. The placement of this activity on the critical path introduces additional challenges that need to be addressed early and consistently revisited during study execution in order to ensure that business needs are met at the end of the day. This paper and presentation will explore the inherent challenges, risks and strategies that can be employed to ensure that the development of SDTM deliverables at this important time can effectively be managed and ultimately reduce overall time, energy & effort for all Clinical activities related to study completion.
MS11 : A Comparison of Two Commonly Used CRO Resourcing Models for SAS/ Statistical Programmers
R. Mouly Satyavarapu, PharmaNet/i3
Monday, 4:30 PM - 5:00 PM, Location: Continental 2
Why do we have Contract Research Organizations (CROs)? Pharmaceutical, biotechnology, and medical device companies are trying to streamline their costs by outsourcing the processes used to conduct and report clinical data. In this paper, I will introduce the benefits of outsourcing companies' varied clinical functional activities to a CRO. The decision to outsource is typically driven by business needs: leveraging the expertise of the CRO, limited resources within the company (technology, staff, etc.), and cost reduction. Based on the scope of work and the company's operating preferences, the sponsor/client (the company) and the CRO decide on the resourcing model that meets the needs of both parties, which is eventually written into their contracts. Broadly classified, there are two commonly used models within the CRO industry: the “Traditional Deliverable-Based Model” and the “Full-Time Equivalent (FTE) Time and Materials Model” (also called the Role-Based model). The author has a total of 8 years of industry experience working under both models. This paper presents a compilation of the experiences and differences the author has come across and perceived while working with these commonly used resourcing models, and lists the advantages and disadvantages of each.
MS12 : Managing a Team of Veteran Programmers – Some Tips for Discussion
Todd Case, Biogenidec, Inc.
Arthur Collins, Biogenidec, Inc.
Monday, 1:15 PM - 2:15 PM, Location: Continental 2
Managing is part art, part science, and all about understanding, communication, dialogue, and presenting content. This comes into play in different ways with junior and senior SAS programmers. While junior programmers (as a general rule) tend to ask their manager for more direction and advice, senior programmers may rely more on peers, networks, or user groups they have worked with in the past for technical solutions. In addition, as SAS programmers advance in their careers, they tend to have found and gained many efficiencies early on, and perhaps more valuable but less frequent efficiencies as their careers have progressed. This presentation will offer tips on effectively managing individual programmers with significant experience, as well as a team of such programmers (significant meaning each member having as much or more clinical programming experience as their manager). To do that, the paper will first pose a question some of you may be thinking: why would a team have a manager without as much clinical experience as the programmers themselves? It will then discuss several topics regarding effective management in more detail.
MS13 : Connect with SAS® Professionals Around the World with LinkedIn and sasCommunity.org
Charles E. Shipp, Shipp Consulting
Kirk Paul Lafler, Software Intelligence Corporation
Tuesday, 9:00 AM - 10:00 AM, Location: Continental 2
Accelerate your career and professional development with LinkedIn and sasCommunity.org. Establish and manage a professional network of trusted contacts, colleagues and experts. These social networking and collaborative communities enable users to connect with millions of SAS users worldwide, anytime, anywhere. This presentation explores how to maximize LinkedIn profiles and social networking content, develop a professional network of friends and colleagues, join special-interest groups, access a Wiki-based web site where anyone can add or change content on pages of the web site, share biographical information between both communities using a built-in widget, exchange ideas in Bloggers Corner, view scheduled and unscheduled events, use a built-in search facility to search for wiki-content, collaborate on projects and file sharing, read and respond to specific forum topics, and more.
Posters
PO01 : Graphical Outputs for Efficacy Analyses Using SAS/GRAPH® SG Procedures
David Collins, GlaxoSmithKline
Graphical representation of a treatment effect is typically easier to understand than a statistical analysis table. The rollout of SAS® 9.2 has provided a variety of new options for creating sophisticated analytical displays. Using PROC TEMPLATE and the SAS/GRAPH® SG procedures, one can produce graphical outputs of efficacy data for decision-making purposes. A simulated Proof-of-Concept (PoC) efficacy trial is analyzed for illustration purposes.
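A minimal sketch of the kind of SG-procedure display the abstract describes (dataset and variable names here are hypothetical, not the author's):

```sas
/* Mean change from baseline by visit and treatment, with error bars */
proc sgplot data=adeff;
  vline avisitn / response=chg stat=mean group=trt
                  markers limitstat=stderr;
  xaxis label='Analysis Visit';
  yaxis label='Mean Change from Baseline (+/- SE)';
run;
```

The same plot can be generated from a PROC TEMPLATE/GTL definition when more layout control is needed.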
PO02 : Does NODUPKEY Select the First Record in a By Group?
David Franklin, TheProgrammersCabin.com
Does the NODUPKEY option in the SORT procedure always select the first observation in a group of variables? There are some who say it does and some who say it does not. This paper looks into this question, with examples, and shows that NODUPKEY really has no effect on whether the first observation in a group of data gets selected; instead, two other options determine whether this happens, and the paper presents their effects. Because these two options are rarely specified in programming and the default values shipped with SAS are almost always used, unexpected results will occur if these values are changed.
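The paper itself identifies the options in question; the likely candidates are the EQUALS/NOEQUALS procedure option and the SORTEQUALS/NOSORTEQUALS system option. A minimal sketch with a hypothetical dataset:

```sas
/* EQUALS (the shipped default, also controlled globally by the
   SORTEQUALS system option) preserves the input order of observations
   with identical BY values, so NODUPKEY keeps the first record in
   each group; under NOEQUALS the kept record is not guaranteed. */
proc sort data=have out=want nodupkey equals;
  by usubjid;
run;

/* DUPOUT= captures the discarded duplicates for inspection */
proc sort data=have out=want2 dupout=dups nodupkey;
  by usubjid;
run;
```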
PO03 : Streamlining Regulatory Submission with CDISC/ADaM Standards for Non-standard Pharmacokinetic/Pharmacodynamic Analysis Datasets
Xiaopeng Li, Celerion
Katrina Canonizado, Celerion
Chun Feng, Celerion
Nancy Wang, Celerion
The Analysis Data Model (ADaM), which is encouraged by the FDA for clinical trial data submission, is designed to support analysis datasets. Non-standard ADaM datasets for Pharmacokinetic (PK) and Pharmacodynamic (PD) analyses have not been clearly discussed. Based on Analysis Data Model v2.1, this paper shares approaches for mapping non-standard PK/PD data to create ADaM datasets that support the analyses for Thorough QT studies, reproductive safety studies, and studies with urine collections. The approaches make the ADaM datasets for these studies consistent with Analysis Data Model v2.1 and submission ready. Detailed mapping methods and examples of these ADaM datasets will be presented.
PO04 : Algorithm to Compare the Slopes (Regression Coefficients) between the Subgroups in Simple/Multiple Regression using PROC REG
Sandeep Sawant, No
Regression analysis is the most common technique used for data analysis in clinical trials. In regression analysis, a regression line is fitted for the response variable (e.g., viral load at the end of the study) based on a few explanatory variables (e.g., dose level). A regression line Y=a+bX is fitted, where Y is the response variable, X is the explanatory variable, a denotes the intercept, and b is the slope (regression coefficient) of the line. The slope indicates the change in the value of Y if X is changed by one unit, so the slope is often a useful measure for examining the rate of change in Y. In clinical trials, comparing slopes (rates of change) for two or more subgroups (e.g., Active vs. Placebo) can be of interest for assessing the effect of a medical treatment. The SAS procedure PROC REG does not perform the desired analysis directly; some data manipulation is needed. This paper will discuss the algorithm for comparing the regression coefficients for simple/multiple regression for two or more subgroups.
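One common form of the manipulation, sketched here with hypothetical dataset and variable names (not necessarily the author's algorithm), is to add a subgroup-by-covariate interaction and test its coefficient:

```sas
data reg;
  set adeff;                    /* hypothetical analysis dataset */
  trtn  = (trt = 'Active');     /* 1 = Active, 0 = Placebo */
  inter = trtn * dose;          /* slope-difference term */
run;

proc reg data=reg;
  model vload = dose trtn inter;
  test inter = 0;               /* H0: equal slopes across subgroups */
run;
quit;
```

A significant test of the interaction term indicates that the slope of the dose-response line differs between the subgroups.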
PO05 : EXACTing a Price: Compute Fisher's Exact Test P-values Only When Needed
Robert Abelson, Human Genome Sciences
Many times when p-values are computed for a categorical analysis, it is not known in advance whether an exact or an asymptotic p-value is needed, so both are computed using the EXACT option in PROC FREQ. Sometimes SAS issues a warning that “Computing exact p-values may require much time and memory.” It would be helpful to compute exact p-values only when they are needed. The use of Fisher's Exact Test is recommended when expected cell counts of less than 5 comprise 25% or more of a table. Some statisticians are more conservative, favoring a lower cutoff percentage. Also, while many statisticians use the Pearson chi-square p-values when exact p-values are not needed, some prefer the likelihood ratio chi-square. A macro is presented which gets the expected cell counts, determines if an exact p-value is needed, computes exact p-values only when the number of cells with expected counts less than 5 exceeds the cutoff percentage, and computes chi-square (either Pearson chi-square or likelihood ratio test) p-values for the remaining tables.
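A skeleton of such a macro, with names and the default 25% cutoff parameterized (an illustrative sketch, not the author's code):

```sas
%macro exact_if_needed(data=, row=, col=, cutoff=25);
  /* Step 1: capture expected cell counts */
  ods output CrossTabFreqs=_ctf;
  proc freq data=&data;
    tables &row * &col / expected norow nocol nopercent;
  run;

  /* Step 2: percent of interior cells with expected count < 5 */
  data _null_;
    set _ctf end=eof;
    where _type_ = '11';
    ncell + 1;
    small + (expected < 5);
    if eof then call symputx('pct', 100 * small / ncell);
  run;

  /* Step 3: request the exact p-value only when the cutoff is met */
  proc freq data=&data;
    tables &row * &col / chisq;
    %if %sysevalf(&pct >= &cutoff) %then %do;
      exact fisher;
    %end;
  run;
%mend exact_if_needed;
```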
PO06 : A Practice to Create Executable SAS Programs for Regulatory Agency Reviewers
Hongyu Liu, Vertex Pharmaceuticals
Lynn Anderson, Vertex Pharmaceuticals
James O'Hearn, Vertex Pharmaceuticals
Kexi Chen, Vertex Pharmaceuticals
Providing executable SAS programs (in connection with the statistical analysis data sets and clinical table outputs) is presently not required with an NDA submission; however, they are of great benefit to a regulatory agency reviewer for expediting the review process. This paper will discuss basic considerations, challenges, approaches, and examples of practices for creating executable SAS programs for regulatory agency reviewers.
SAS Version 9.2. Audience: All.
PO07 : How to Determine Carriage Controls in SAS Transfers?
Eric Kammer, Novartis
Tiantian Sun, Novartis
It is not uncommon for the pharmaceutical industry to outsource data management activities. However, when the data comes back to the sponsor company, there are sometimes hidden surprises that cause problems in the SAS datasets transferred back from the CRO. In particular, carriage controls are sometimes hidden in the SAS datasets from the transfers or from the database applications used at the CRO. In normal data cleaning activities, the focus is on the actual concomitant medications or adverse events (experiences) that are part of a file. It is not easy for data management to track down the carriage control problems reported by the statistical programming group, as these cannot be seen by the “naked eye.” This presentation will show an approach to detecting hidden carriage controls using SAS and giving clear data cleaning direction to DM; the end result is cleaner data arriving back for tabulations and the clinical study reports. The application uses SAS 9.2 in JReview so users can run the report as needed for any study within a UNIX operating system.
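A detection pass can be as simple as scanning every character variable for carriage return, line feed, or tab bytes. The dataset and flag names below are hypothetical, illustrating the general approach rather than the authors' application:

```sas
data flagged;
  set cro.cm;                                /* hypothetical transferred dataset */
  array cvars {*} _character_;
  length badvar $32;
  do _i = 1 to dim(cvars);
    if indexc(cvars{_i}, '0D0A09'x) then do; /* CR, LF, or tab present? */
      badvar = vname(cvars{_i});
      output;                                /* one row per offending variable */
    end;
  end;
  drop _i;
run;
```

The resulting listing of record and variable names gives data management a concrete cleaning target.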
PO08 : One at a Time; Producing Patient Profiles and Narratives
JJ Hantsch, Pharmanet/I3
Janet Stuelpner, SAS
Patient profiles can take many forms depending on the purpose for which they are going to be used. For data cleansing, a profile can be a data dump for each patient so that any data anomalies can be discovered; it is important to see each and every field collected on the Case Report Form (CRF). Once the data is clean and data standards have been applied, the narrative type of profile is best for medical review, discovery of subject abnormalities, or preparation of the Clinical Study Report (CSR). Clinical reviewers are interested in how the whole patient looks, but not necessarily in each field collected on the CRF. All of this can be done with Base SAS® and macros to create the listings, formatted profiles, and narratives. Another way to create the patient profile is with a data visualization tool, JMP Clinical®. Patient profiles are customizable and can display data from any combination of the core safety CDISC domains. Along with the visual display, a configurable patient narrative can be created for each subject who experienced a serious adverse event during the clinical trial. Reviewers and medical writers enjoy the speed of this programmed process, using the write-ups as a starting point for the final patient narratives compiled in the CSR required by the FDA.
PO09 : GRAPHS MADE EASY WITH ODS GRAPHICS DESIGNER
Kevin Lee, Cytel
Graphs can provide visual patterns and clarity that are not apparent in tables and listings, but sometimes it takes too long to create one. The ODS Graphics Designer makes it much easier. This paper is intended for clinical trial SAS® programmers who are interested in creating graphs using the ODS Graphics Designer, a GUI-based interactive SAS/GRAPH tool. The code behind the ODS Graphics Designer is based on the Graph Template Language (GTL), but SAS programmers can create graphs through its point-and-click interaction without any programming. The ODS Graphics Designer allows SAS programmers to create many kinds of graphs, such as scatter plots, series plots, step plots, histograms, box plots, and more. The paper will show how to start the ODS Graphics Designer in SAS, how easy it is to create simple or complex graphs with the designer, and how to enhance graphs with features such as legends, cell properties, and plot properties. The paper will also demonstrate how to generate the GTL template code from the designer that reproduces the same graphs in SAS programming. The setting is a CDISC environment, so ADaM datasets are used as source data.
PO10 : ADaM Implications from the “CDER Data Standards Common Issues” and SDTM Amendment 1 Documents
Sandra Minjoe, Octagon Research Solutions
Over the past few years, the United States Food and Drug Administration (US FDA), specifically the Center for Drug Evaluation and Research (CDER), has been receiving more data from sponsors in a Clinical Data Interchange Standards Consortium (CDISC) or CDISC-like structure. Reviewers have had tools built and received training, but there are some technical issues with many submissions that are hindering their review process and thus their full adoption of CDISC. This prompted the issuance of a document entitled “CDER Common Data Standards Issues Document”. Working closely with CDER to address many of these issues, the CDISC Submission Data Standards (SDS) team created an Amendment to the Study Data Tabulation Model (SDTM) version 1.2 and the SDTM Implementation Guide (IG) version 3.1.2. The CDISC Analysis Data Model (ADaM) team has not created a corresponding amendment, but there are many issues noted in the CDER document that have implications on ADaM. This paper examines issues that could affect ADaM, and describes how to handle them so that data and supporting documents submitted to FDA CDER are reviewer-friendly.
PO11 : SAS UNIX-Space Analyzer – A handy tool for UNIX SAS Administrators
Airaha Chelvakkanthan Manickam, Cognizant Technology Solutions
In the fast-growing area of information technology, SAS users tend to occupy a lot of space with their SAS datasets. As time goes on, these SAS datasets grow and slowly occupy the entire space on the file system. It is a challenge for the SAS administrator to bring the file system under control, as there are numerous SAS users and everyone utilizes the file system for day-to-day use. Hence a solution is necessary to manage the UNIX file system in terms of top space-utilizing users, high-volume SAS files, how long SAS files are kept, and the owners of the oldest and newest SAS files. The proposed solution uses SAS to issue UNIX commands that capture the attributes of every file on the file system, and SAS to build a mini data warehouse around the usage of the file system. Using the data warehouse, the SAS administrator can issue simple queries to understand the utilization of the file system at any point in time. The solution also includes the creation of usage charts for presentation to upper management. The charts are produced in MS Excel and refreshed automatically using a SAS DDE connection with MS Excel. SAS Enterprise Guide 4.2 is used to automatically schedule and refresh the data warehouse on a weekly basis.
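The capture step can be sketched with a pipe to a UNIX command. The path and command below are illustrative (the `-printf` form assumes GNU find), not the authors' exact implementation:

```sas
/* Read file size, owner, and modification time straight from UNIX */
filename du pipe "find /sasdata -name '*.sas7bdat' -printf '%s %u %T@ %p\n'";

data filestats;
  infile du truncover;
  input bytes :best32. owner :$32. modified :best32. path :$256.;
  modified_dt = '01JAN1970:00:00:00'dt + modified;  /* epoch to SAS datetime */
  format modified_dt datetime20. bytes comma20.;
run;
```

From here, simple PROC SQL or PROC MEANS queries over `filestats` answer the "who, how much, how old" questions the abstract lists.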
PO12 : Automatic Consistency Checking of Controlled Terminology and Value Level Metadata between ADaM Datasets and Define.xml for FDA Submission
Xiangchen (Bob) Cui, Vertex Pharmaceuticals
Min Chen, Vertex Pharmaceuticals
When submitting clinical study data (SDTM and ADaM datasets) in electronic format to the FDA, it is preferable to submit data definition tables (define.xml) and a reviewer guide (define.pdf). These define files minimize the time needed for FDA reviewers to familiarize themselves with the data, thereby speeding up the overall review process. It is desirable to ensure the consistency between datasets and define files, and achieve technical accuracy and operational efficiency. This paper introduces a SAS macro approach to automate the process of consistency checking of controlled terminology and value level metadata between ADaM datasets and programming specifications, and further between ADaM datasets and define.xml. Define.xml is automatically generated from the programming specifications in the ADaM programming process. This process is conducted during the whole cycle of ADaM programming, instead of at a later stage. Hence it avoids the waste of time and resources for verification of the consistency and/or resolution of inconsistency at a later stage, significantly reducing the time and resources to develop SAS® programs for ADaM derivation and validation, and prepare define files. This paper also details how to develop ADaM Metadata (programming specification) for automation purpose, illustrates five scenarios of mismatches from consistency checking, and provides corresponding resolutions to these mismatches.
PO13 : Creating Metadata Driven Macros to Gain Dynamism and Reduce Complexity in Macro Calls.
Fernando Enriquez, Pharmanet/i3
It often happens in clinical programming that we have to create standard derived data sets for several studies. It is always a good idea to create a macro and use it for all studies. The problem comes when there are too many parameters to consider for a study: large lists of changing parameter values to be defined in the macro call, variable names changing in source data sets across studies, different categories to be derived, or different value ranges for creating categories in the derived data sets. Many more problems could be listed, and here is where metadata is useful to resolve such complexity in macro calls. Our metadata is a plain text file structured as a table, which the main macro can import as a data set to obtain all the parameters needed to process the derived data set with personalized specifications depending on the study; such values can represent categories to be processed with value ranges, lists of values, variable names, derivation specifications, filters, etc. This gives the main macro more dynamism, and the macro user will find it easier to adjust the metadata file as needed without touching a single line of code - not even the macro call. This paper is directed towards SAS® Base programmers with some SAS Macro Language knowledge.
PO14 : Transposing Tables from Long to Wide: A Novel Approach Using Hash Objects
Joseph Hinson, Agile-1
Changhong Shi, Merck Sharp & Dohme Corp.
Transposing tables is often a necessity in data analysis. In clinical studies, some data, for example laboratory data, are collected in a longitudinal manner. Yet a horizontal form of such data may be more suited for statistical analysis. SAS® provides the TRANSPOSE procedure for such purposes, but this approach can be quite challenging. SAS® 9 introduced the use of hash objects for the purpose of providing fast table lookups and merging without the need for pre-sorting data. In this paper, we exploited an entirely different aspect of the technique - the ability of hash objects to look at a whole table as a matrix in a DATA step rather than observation-by-observation. This allowed us to easily rearrange data in a table.
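The hash-object idea -- key each subject into an in-memory table and attach each incoming (test, result) pair as a new column -- has a direct analogue in any language with hash tables. A minimal Python sketch (variable names are hypothetical, not the paper's SAS code):

```python
def transpose_long_to_wide(records, id_key, name_key, value_key):
    """Pivot long records into one wide row per subject, using a dict
    (hash table) much as the paper uses a SAS hash object to treat the
    whole table as an in-memory structure rather than row by row."""
    wide = {}
    for rec in records:
        row = wide.setdefault(rec[id_key], {id_key: rec[id_key]})
        row[rec[name_key]] = rec[value_key]  # one new column per test
    return list(wide.values())

# Longitudinal lab data, one observation per subject-test:
labs = [
    {"subj": "001", "test": "ALT", "result": 30},
    {"subj": "001", "test": "AST", "result": 25},
    {"subj": "002", "test": "ALT", "result": 41},
]
```

Each subject ends up as a single wide row, with one column per lab test, and no pre-sorting is required.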
PO15 : Automate Validation of CDISC ADaM Variable Label Compliance
Wayne (Weizhen) Zhong, Octagon Research Solutions
A sometimes overlooked feature of the CDISC ADaM Implementation Guide is that it specifies label names for each ADaM variable. To improve quality of delivery and reduce the tedious task of checking variable labels, this paper presents techniques for producing a quick compliance report. Details include how to check ADaM variables with placeholders like xx and *, and how to detect at the dataset metadata level a few cases of ADaM non-compliance. The latest CDISC ADaM metadata can be stored in an Excel spreadsheet, and easily updated with future ADaM label rules.
PO16 : Scrambling of Un-Blinded Data without ‘Scrambling Data Integrity’!
Jaya Baviskar, Pharmanet-i3
Scrambling of data is widely and successfully implemented across several functional sectors, and the pharmaceutical domain is one area where this technique can be applied effectively. It is an efficient way to work with data while retaining data integrity, which is critically important when working with the sensitive data seen across the pharmaceutical domain. Scrambling of data is most in demand for blinded studies with a short life-span that must execute all processes in relatively short timelines. The idea behind scrambling data is to give programmers, biostatisticians, data managers and other team members a glimpse of study data without compromising data integrity. This helps to pre-empt the work on analysis/derived datasets, or to assess and design study-specific programs, thereby providing a comparatively longer time-frame than usual. Scrambling can happen across Phase I, II and III studies, and the decision to scramble is usually initiated by the biostatistician or the project manager. With this concept in mind, the paper elaborates on ways to scramble data, the types of scrambling utilized, the type of data considered for scrambling, readily available SAS® functions, and associated general information.
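One simple scrambling tactic consistent with the abstract's goal -- permute a sensitive column so its distribution survives but the subject-to-value link is broken -- can be sketched as follows (illustrative only; the paper discusses SAS functions for this):

```python
import random

def scramble_column(records, key, seed=None):
    """Return a copy of the records with the values of one column
    randomly permuted across rows.  The column's overall distribution
    is preserved, but the link between subject and value is broken."""
    rng = random.Random(seed)   # fixed seed makes the scramble repeatable
    values = [r[key] for r in records]
    rng.shuffle(values)
    return [dict(r, **{key: v}) for r, v in zip(records, values)]
```

For example, shuffling a treatment-assignment column lets programmers draft analysis-dataset code against realistic-looking data while the true assignments stay blinded.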
PO17 : Basic Debugging Techniques
Beatriz Garcia, Pharmanet/i3
Alberto Hernandez, Pharmanet/i3
When we are working with a large amount of data, tracking down logic errors in our code may be difficult: the size of a dataset may make it impractical to print, for example, and manually tracing through record processing can take a long time. Since debugging is the process of removing logic errors from a program, this paper describes some useful techniques, such as a simple %PUT or PUT statement or the DATA step debugger (SAS 9.2), to facilitate and speed up the debugging of your SAS code.
PO18 : Study on Good Programming Practices in Health and Life Sciences
Arthur Collins, Biogen Idec, Inc.
Mark Foxwell, Cmed
Beate Hientzsch, Accovion
Cindy Song, Sanofi
Most organizations with programming personnel have a document where they describe their best practices for developing high quality and efficient code. This document typically provides guidance on programming techniques such as code design, naming conventions, appropriate use of macros, hard coding restrictions, defensive programming strategies, maximizing efficiency, etc. The Steering Board for Good Programming Practices in Health and Life Sciences (GPP Board) is an industry group with representatives from a diverse array of health and life sciences organizations working to develop common good programming practices applicable to programs developed by statistical programming for the analysis and reporting of clinical trial data as well as for data integration or preparation of clinical data for e-submissions. One of our goals is to create a consolidated guideline which potentially could replace those documents created and maintained by each organization. This poster presents background on the GPP Board and discusses some of the basic tenets of GPP. It will also describe a study that we are conducting to collect data on current programming practices in health and life sciences organizations.
Statistics and Pharmacokinetics
SP02 : Leverage SAS/Genetics Package to Perform Biomarker Drug Response Analysis on Pharmacogenomic Data
Deepali Gupta, Sr. SAS Programmer
Shirish Nalavade, Sr. SAS Programmer
Monday, 2:15 PM - 2:45 PM, Location: Continental 3
The primary objective of genetic analysis is to infer how an individual's genetic profile affects the subject's response to drug treatment and, moreover, how safe and effective treatment dosage varies depending on their respective genetic makeup. The ALLELE procedure serves to characterize the markers themselves or the population from which they were sampled, and can further serve as the basis for joint analysis of markers and traits. The procedure uses the notation and concepts described by Weir (1996) as the reference for all equations and methods. This paper provides an introduction to PROC ALLELE and demonstrates how to adopt this procedure to analyze pharmacogenomic (marker) data. We illustrate how to construct tables of allele and genotype frequencies, perform Hardy-Weinberg Equilibrium (HWE) analyses and linkage disequilibrium analyses between each pair of markers, and look at the key statistical estimates used for inference.
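The basic quantities PROC ALLELE reports -- allele frequencies and a Hardy-Weinberg chi-square -- are easy to state concretely. A sketch for a single biallelic marker (an illustration of the statistics, not the procedure's own code):

```python
def allele_and_hwe(genotype_counts):
    """From genotype counts {'AA': nAA, 'Aa': nAa, 'aa': naa} for one
    biallelic marker, compute the frequency of allele A and the
    Hardy-Weinberg equilibrium chi-square statistic (1 df)."""
    nAA = genotype_counts["AA"]
    nAa = genotype_counts["Aa"]
    naa = genotype_counts["aa"]
    n = nAA + nAa + naa
    p = (2 * nAA + nAa) / (2 * n)        # each AA carries two A alleles
    q = 1 - p
    observed = {"AA": nAA, "Aa": nAa, "aa": naa}
    expected = {"AA": p * p * n, "Aa": 2 * p * q * n, "aa": q * q * n}
    chi2 = sum((observed[g] - expected[g]) ** 2 / expected[g]
               for g in expected)
    return p, chi2
```

A population exactly in HWE (e.g. 25/50/25 for AA/Aa/aa) gives p = 0.5 and a chi-square of zero.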
SP03 : Computing Initial Values for Pharmacokinetic ML Nonlinear Regression via genetic algorithms and parallel genetic algorithms
Francisco Juretig, Nielsen
Monday, 9:15 AM - 9:45 AM, Location: Continental 3
In the case of no random effects, estimation can be carried out via nonlinear least squares (PROC NLIN) or nonlinear maximum likelihood (PROC NLMIXED). In either case the choice of starting values is usually problematic. Derivative-based methods yield approximately quadratic convergence, as long as the starting values are well chosen. Yet there are two potential problems: if the likelihood is highly irregular, the algorithm can get stuck at a local maximum; and if the starting values are too far from the optimum, the algorithm may not converge at all. If either problem arises, there are two possible solutions: (1) linearizing the model and estimating it via ordinary least squares, or (2) using a big grid to compute good initial estimates. The first entails potential high correlation because of the increased dimensionality of the problem, and an almost certain bias. The second is probably better, but requires too much computing power. Genetic algorithms, however, allow a better exploration of the likelihood and provide an improvement per generation. It will be shown how to obtain good initial estimates using a genetic algorithm, especially for the most common pharmacokinetic problems. For nonlinear fixed-effects models these values will usually yield the maximum likelihood solution. If a multi-processor computer is available, it makes sense to parallelize the problem: finally, a macro that can split a genetic maximum likelihood problem across N processors will be presented.
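The genetic-algorithm idea can be illustrated on a one-compartment model C(t) = A*exp(-k*t). This toy Python version (a sketch, not the paper's SAS macro; population size, mutation scale, and bounds are all assumptions) evolves candidate (A, k) pairs toward low residual error, producing starting values for a derivative-based fit:

```python
import math
import random

def one_compartment(t, A, k):
    """Mono-exponential concentration-time model C(t) = A * exp(-k*t)."""
    return A * math.exp(-k * t)

def sse(params, data):
    """Sum of squared residuals of the model against (t, conc) pairs."""
    A, k = params
    return sum((c - one_compartment(t, A, k)) ** 2 for t, c in data)

def genetic_initial_values(data, bounds, pop_size=80, generations=120, seed=1):
    """Crude genetic algorithm: evolve candidate (A, k) pairs toward low
    residual error.  The winner is meant as a *starting value* for
    PROC NLIN / PROC NLMIXED, not a final estimate."""
    rng = random.Random(seed)
    lo, hi = zip(*bounds)
    pop = [[rng.uniform(lo[i], hi[i]) for i in range(2)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda p: sse(p, data))
        survivors = pop[: pop_size // 2]                 # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            child = [(a[i] + b[i]) / 2 for i in range(2)]  # blend crossover
            i = rng.randrange(2)
            child[i] += rng.gauss(0, 0.05 * (hi[i] - lo[i]))  # mutate one gene
            child[i] = min(max(child[i], lo[i]), hi[i])       # stay in bounds
            children.append(child)
        pop = survivors + children
    return min(pop, key=lambda p: sse(p, data))
```

Because the population explores the whole bounded region, the search is far less sensitive to irregular likelihood surfaces than a single derivative-based start.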
SP04 : Application of Meta-Analysis in Clinical Trials
Marina Komaroff, Noven Pharmaceuticals
Monday, 1:45 PM - 2:15 PM, Location: Continental 3
Drug development is a long and expensive process in which each step should be carefully planned. During the design stage of a clinical trial, the sample size calculation has to be performed based on the primary objective of the trial, targeting the desired power for detecting a clinically meaningful difference between the test drug and standard/control at a fixed Type I error rate. However, information about the test drug in pilot studies is limited, and most of the time statisticians make "the best guess" about the effect size of a new drug, which leads to a wrong sample size calculation and then to failure of the trial. Meta-analysis can help. By combining all existing information about the test drug, meta-analyses intend to give a better estimate of the effect size for a new drug, which determines the required sample size. Until recently, one of the obstacles was that meta-analysis is not an easy task, and special software is usually required to perform it. The goal of this paper is to demonstrate that the concept of meta-analysis is apprehensible, and that SAS® can be used to perform meta-analysis on a regular basis. A user-friendly SAS macro calculates the effect size and determines what sample size is needed to reach the goal of the clinical trial. For visual presentation of the results, the macro generates accompanying forest plots of effect sizes and a plot of anticipated sample size. For the sake of validation, the results from meta-analysis were compared with an analysis of subject-level (pooled) data; the conclusions from the two approaches came out the same, demonstrating the validity and strength of meta-analysis. This paper suggests that as data sets accumulate with ongoing research, the effect size calculated in meta-analysis can be treated as "best evidence" and should be taken into consideration when designing the next clinical trial, leading to a successful NDA submission.
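The pooling step at the heart of a fixed-effect meta-analysis is compact enough to show directly. A sketch using standard inverse-variance weighting (the paper's macro also handles effect-size calculation, sample size, and forest plots, which this sketch omits):

```python
import math

def fixed_effect_meta(effects, variances):
    """Inverse-variance fixed-effect meta-analysis: each study's effect
    is weighted by 1/variance, so precise studies count for more.
    Returns the pooled effect, its standard error, and a 95% CI."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return pooled, se, (pooled - 1.96 * se, pooled + 1.96 * se)
```

The pooled effect and its standard error are exactly the inputs a subsequent power/sample-size calculation needs.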
SP06 : Arriving ‘Normally’ at t-Test for a Treatment Combination
Naina Pandurangi, No
Monday, 11:15 AM - 11:45 AM, Location: Continental 3
In clinical trials involving more than one active treatment, statistics for a treatment combination are also analyzed. When the inferences (t-test p-values) for continuous data of a treatment combination are based on normality checks of the data, there is a slight probability of going wide of the mark. This paper discusses two areas of caution and the resultant incorrect decisions or oversights: 1) Passing the input dataset for the normality check: if not careful, this may lead to a false normality decision. 2) Picking the t-test p-value based on the F test for equality of variances in the case of a lack of data: if not careful, this may lead to oversight of the p-value. Note: As this paper only enforces the 'look before you leap' notion while working on these tests using SAS, the theory also holds in two independent situations, i.e. 1) checking for normality when a treatment combination is analyzed, and 2) the t-test p-value on a lack of data.
SP07 : Pattern-Mixture Models as Linear Combinations of Least Squares Means from MMRM with Delta Method Variance Estimation
Bohdana Ratitch, Quintiles
Michael O'Kelly, Quintiles
Monday, 10:15 AM - 11:15 AM, Location: Continental 3
Methods for dealing with missing data in clinical studies that incorporate various Missing Not at Random (MNAR) assumptions about the missingness mechanism are becoming increasingly recommended for sensitivity analyses and even as primary analyses in certain therapeutic areas. MNAR assumptions can be modeled within several statistical frameworks, one of which is known as pattern-mixture models (PMMs). Certain PMM-based analyses for continuous outcomes can be formulated in such a way that the estimate of the difference between treatment arms is expressed as a linear combination of Least Squares Means (LSMs) for different effects of a longitudinal model with correlated errors, weighted by the appropriate proportions of study drop-outs and completers. This approach requires special considerations for the estimation of the variance because the proportions of drop-outs and completers used in the linear combination of LSMs are themselves multinomial random variables and their variances need to be incorporated into the overall estimate. This can be done using a delta approximation method for variance estimation. In this paper, we present details of implementing such analyses (including the delta variance estimation method) using exclusively SAS/STAT® core functionality, such as PROC MIXED, DATA steps, and PROC FCMP. To illustrate this approach, we use an example of MNAR assumptions that take into account the reasons for discontinuation from the study.
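As an illustration of the variance issue the abstract describes, consider the simplest two-pattern case, where a treatment-arm estimate is p*mu_d + (1-p)*mu_c for drop-out and completer LSMs. Assuming the estimated proportion p is independent of the LSMs (an assumption of this sketch, not necessarily of the paper's implementation), a first-order delta-method expansion gives the variance shown below:

```python
def pmm_combined_variance(p, n, mu_d, var_d, mu_c, var_c):
    """Delta-method variance of  p*mu_d + (1-p)*mu_c  when p is an
    estimated drop-out proportion (binomial, variance p(1-p)/n) and the
    two LSMs are independent of p.  The gradient of the combination is
    (mu_d - mu_c, p, 1-p) with respect to (p, mu_d, mu_c)."""
    return (p ** 2 * var_d
            + (1 - p) ** 2 * var_c
            + (mu_d - mu_c) ** 2 * p * (1 - p) / n)
```

The third term is exactly the extra variance contributed by treating the weights as random rather than fixed; dropping it understates the uncertainty.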
SP08 : Using SAS® to Calculate Noncompartmental Urine Parameters
Vanessa Rubano, Boehringer Ingelheim Pharmaceuticals Inc.
Modesta Wiersema, Boehringer Ingelheim Pharma GmbH & Co. KG
Monday, 9:45 AM - 10:15 AM, Location: Continental 3
Clinical Pharmacology is an integral part of clinical trials and the approval of a new drug. To obtain regulatory approval for a New Drug Application (NDA), a sponsor is required to show the elimination of drug and metabolites within the human body’s matrices, such as urine and feces, supported by plasma pharmacokinetic (PK) endpoints. The industry standard used for calculating PK endpoints is the application of clinical trial data to WinNonLin® (WNL), a software modeling package that is based on mathematical equations. The authors of this paper aim to provide an introduction to PK urine assessments and describe a method, using SAS® programming, to derive PK endpoints for individual and cumulative amounts excreted (Ae), fractions of the administered dose excreted (fe) and renal clearance (CLR) of drug from the plasma matrix to urine for a complete profile of a subject or patient. The methods outlined within this paper were implemented using DATA step programming elements in conjunction with user defined macros.
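The endpoints named in the abstract reduce to simple arithmetic once interval concentrations and volumes are in hand. A hedged sketch (units assumed consistent, and CLR computed as total Ae divided by plasma AUC over the profile -- one common formulation; the paper derives these with DATA step code and macros):

```python
def urine_pk(intervals, dose, auc_plasma):
    """Noncompartmental urine parameters from (concentration, volume)
    pairs, one per collection interval:
      Ae  = amount excreted, summed over intervals (conc * volume)
      fe  = fraction of the administered dose excreted, Ae / dose
      CLR = renal clearance, Ae / plasma AUC over the same profile."""
    ae_per_interval = [conc * vol for conc, vol in intervals]
    ae_total = sum(ae_per_interval)
    return {
        "Ae": ae_total,
        "fe": ae_total / dose,
        "CLR": ae_total / auc_plasma,
    }
```

Cumulative Ae at each timepoint is just the running sum of the per-interval amounts.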
SP09 : Automated Forward Selection for Generalized Linear Models with Categorical and Numerical Variables using PROC GENMOD
Manuel Sandoval, Pharmanet/i3
Monday, 1:15 PM - 1:45 PM, Location: Continental 3
Generalized linear models are a powerful tool for measuring relationships between variables, as they can handle non-normal distributions without altering the properties of the variables involved. When applied to risk factor analysis, they can help determine the most important factors contributing to the incidence, prevalence or acquisition of a particular medical condition. This paper presents a particular case in which the aforementioned factors are unknown and a selection must be made from a pool containing both numerical and class variables. Since the model uses an option that is only present in PROC GENMOD (and not in PROC LOGISTIC, for example), an algorithm for selecting variables needed to be created from scratch. The proposed macro was built for such a case. Several factors, both numerical and categorical, were tested using forward selection, with defined criteria for entering the model and for keeping a variable in the model. The macro also distinguishes the numeric and categorical variables so as to include only the latter in the CLASS statement of PROC GENMOD.
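The selection loop itself is independent of the modeling procedure. A generic sketch, with a toy score function standing in for the PROC GENMOD fit criterion (all names and contribution values are hypothetical):

```python
def forward_selection(candidates, score, enter_threshold):
    """Generic forward selection: at each step add the candidate that
    improves the model score the most (lower score = better fit),
    stopping when no candidate improves it by at least enter_threshold."""
    selected = []
    current = score(selected)
    remaining = list(candidates)
    while remaining:
        scored = [(score(selected + [c]), c) for c in remaining]
        best_score, best = min(scored)
        if current - best_score < enter_threshold:
            break                      # no candidate meets the entry criterion
        selected.append(best)
        remaining.remove(best)
        current = best_score
    return selected

# Toy stand-in for a model deviance: starts at 10 and drops by each
# selected variable's (hypothetical) contribution.
contributions = {"age": 5.0, "sex": 3.0, "noise": 0.1}

def toy_score(selected):
    return 10.0 - sum(contributions[c] for c in selected)
```

In the paper's setting, `score` would refit PROC GENMOD with the candidate added (categorical candidates going into the CLASS statement) and return the fit criterion.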
SP12 : A SAS® Macro for Biomarker Analysis using Maximally Selected Chi-Square Statistic With Application in Oncology
Quan (Jenny) Zhou, Eli Lilly and Company
Bala Dhungana, Eli Lilly and Company
Monday, 2:45 PM - 3:15 PM, Location: Continental 3
Biomarker assessment has become an essential tool for evaluating treatment effects in subpopulations of potential drug responders in oncology studies. It is believed that treatment effects can differ between patient subgroups with different genotypes, and therefore biomarkers may help predict treatment effects in these subpopulations. There are many statistical methods for selecting biomarkers as candidate classifiers when identifying subgroups. One approach is to transform each continuous biomarker into a binary covariate by selecting a threshold with certain optimal properties and fit it in the survival model, one biomarker at a time. Selection of a threshold can be done by using maximally selected chi-square statistic, as proposed in Miller and Siegmund, 1982. In this paper, we demonstrate how to implement the maximum chi-square method with a SAS macro that fits proportional hazards Cox regression models on time-to-event endpoints, determines the biomarker threshold to classify patients into subgroups, and then performs analysis to test for the biomarker effect and treatment effect within and between the biomarker patient subgroups using Kaplan-Meier and Proportional Hazards models. This macro can be applied more broadly to evaluate treatment effects in subgroups formed by a set of continuous covariates in the context of survival analysis. It can also be easily modified to fit logistic models for binary and ordinal outcomes.
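The threshold search at the core of the method can be sketched directly. Note that the full method (Miller and Siegmund, 1982) also adjusts the significance level for the multiplicity of candidate cutpoints, which this sketch omits; here the outcome is a simple binary responder flag rather than a time-to-event endpoint:

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    num = n * (a * d - b * c) ** 2
    den = (a + b) * (c + d) * (a + c) * (b + d)
    return num / den if den else 0.0

def max_selected_threshold(biomarker, outcome):
    """Scan candidate cutpoints of a continuous biomarker and return the
    cut that maximizes the chi-square between the dichotomized
    biomarker (x <= cut) and a binary outcome."""
    best_cut, best_chi = None, -1.0
    for cut in sorted(set(biomarker))[:-1]:   # top value gives no split
        a = sum(1 for x, y in zip(biomarker, outcome) if x <= cut and y)
        b = sum(1 for x, y in zip(biomarker, outcome) if x <= cut and not y)
        c = sum(1 for x, y in zip(biomarker, outcome) if x > cut and y)
        d = sum(1 for x, y in zip(biomarker, outcome) if x > cut and not y)
        chi = chi_square_2x2(a, b, c, d)
        if chi > best_chi:
            best_cut, best_chi = cut, chi
    return best_cut, best_chi
```

The selected cut then defines the biomarker-positive and biomarker-negative subgroups carried into the survival models.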
SP13 : Communicating Clinical Trial Safety Results the Statistical Graphic Way
Mat Soukup, FDA
Monday, 3:30 PM - 4:30 PM, Location: Continental 3
Within a regulatory setting, it is important to communicate findings observed in clinical trials to our clinical colleagues in addition to our statistical brethren. Findings may be in terms of potential safety signals, highlights of key efficacy findings, or even summaries of patient discourse. The key is that compelling statistical graphics can provide transparent representations of the data collected in clinical trials, thereby improving our ability to make sound decisions on the safety and efficacy of drug products. In this presentation, I will present various graphical approaches that characterize the data in ways which are all-important in making regulatory decisions. In addition, I will provide highlights from the FDA/Industry/Academia Safety Graphics Working Group, which is developing a palette of statistical graphics for displaying clinical trial safety results.
Techniques and Tutorials - Advanced
TA01 : Checksum Please: A Way to Ensure Data Integrity!
Carey Smoak, Roche Molecular Systems, Inc.
Mario Widel, Roche Molecular Systems, Inc.
Sy Truong, Meta-Xceed, Inc.
Tuesday, 4:30 PM - 5:30 PM, Location: Continental 7
Want to ensure data integrity? Want to know if a file has been altered? Checksum, please! Checksums have a variety of applications, such as verifying that an application has been installed correctly and providing a way to tell whether or not a file has been altered. For example, checksums can be used in a SAS program to verify that a .csv file has not been altered before importing it into a SAS dataset. SAS itself does not provide checksums, but fortunately every operating system has some utility available to compute them. We will provide a bit of background on their uses and concentrate on a particular example.
TA02 : Exploring DATA Step Merges and PROC SQL Joins
Kirk Paul Lafler, Software Intelligence Corporation
Monday, 4:30 PM - 5:30 PM, Location: Continental 7
Explore the various DATA step merge and PROC SQL join processes. This presentation examines the similarities and differences between each, and provides examples of effective coding techniques. Attendees examine the objectives and principles behind merging and joining, as well as the coding constructs associated with inner and outer merges and joins, and hash techniques.
TA03 : A Cup of Coffee and PROC FCMP: I Cannot Function Without Them
Peter Eberhardt, Fernwood Consulting Group Inc.
Tuesday, 3:30 PM - 4:30 PM, Location: Continental 7
How much grief have you put yourself through trying to create macro functions to encapsulate business logic? How many times have you uttered "If only I could call this DATA step as a function"? If any of these statements describe you, then the new features of PROC FCMP are for you. If none of these statements describe you, then you really need the new features of PROC FCMP. This paper will get you started with everything you need to write, test, and distribute your own "data step" functions with the new (SAS® 9.2) PROC FCMP. This paper is intended for beginner to intermediate programmers, although anyone wanting to learn about PROC FCMP can benefit.
TA04 : PIPE Dreams: Yet Another Tool for Dynamic Programming
Scott Burroughs, GlaxoSmithKline
Monday, 2:15 PM - 2:45 PM, Location: Continental 7
Statisticians are often divided into Bayesians and Frequentists when it comes to study design and analysis beliefs. As a SAS programmer, you could put me in the Dynamic camp. This is my 6th presentation at a SUG, and all have had something to do with dynamic programming. Dynamic programming is letting the ever-changing and often unknown data drive the results -- no hardcoding! There is a dynamic tool that I've used when I needed to read in non-SAS data sets or files from a certain directory where I didn't necessarily know the names of the files or how many there were. The PIPE command in SAS is a tool to read in files from a directory when the names and quantity are unknown and changing.
TA05 : Using SAS Colon Effectively
Sudhir Singh, Merck Serono
Monday, 2:45 PM - 3:15 PM, Location: Continental 7
The SAS® colon is not clearly documented or indexed in the SAS® manual. Although the colon is one of the most powerful and useful symbol operators available to the programmer, some of its features are rarely found in typical SAS® code. In this paper I explain how the colon can be used as an operator modifier, a label indicator, a format modifier, a keyword component, a variable name wildcard, an array bound delimiter, or a special log indicator. Mastering these usages can give your code a needed functionality and/or efficiency lift.
TA07 : Diverse Report Generation with PROC REPORT
Chris Speck, PAREXEL International
Monday, 4:00 PM - 4:30 PM, Location: Continental 7
Automation is often the goal of SAS programming. If we could just hit “Submit” and watch our program generate all our tables and listings while making the right decisions at run time, we could get a lot more accomplished. Of course, with the SAS macro facility, we can already do this…up to a point. We can equip our macros with macro logic, we can feed them different parameters, and then watch as they produce one output after another. This works great when all your outputs are based on the same data set and require the same number of columns. But what if they don’t? What if you need to automate report generation from a large number of different data sets? What if you must allow for any number of columns? Developing such a program using macro logic would be cumbersome indeed. You would need %IF %THEN blocks for every contingency, and your program would get so bogged down in logic that you’d be better off without automation at all. This paper will demonstrate how SAS programmers can easily and gracefully automate diverse report generation. The methods discussed in this paper use a patient profile program as a primary example and make use of the REPORT procedure, the SQL procedure, and SASHELP views. Code in this paper will work in version 9 or higher, on Windows or UNIX. This paper will be most suitable for intermediate and advanced SAS programmers.
TA08 : Perl Regular Expressions in SAS® 9.1+ - Practical Applications
Joel Campbell, Advanced Analytics
Tuesday, 2:15 PM - 2:45 PM, Location: Continental 7
Perl Regular Expressions (PRX) are powerful tools available in SAS® and many other programming languages and utilities which allow precise and flexible pattern matching. This paper discusses the relative advantages of PRXMATCH() versus INDEX() and presents practical applications to serve as a starting point for programmers of all experience levels to begin incorporating the PRX functions in their own code. A method to create dynamic regular expressions from data step variables is presented in addition to a method of pulling information from a string with a single line of code via PRXCHANGE(). The main aim of this paper is to provide all programmers with resources and motivation to begin using the PRX functions in their code.
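Python's `re` module uses the same Perl-style pattern syntax as SAS's PRX functions, so the ideas in the paper can be previewed outside SAS. A small sketch contrasting plain substring search (the INDEX-style approach) with pattern matching and one-line extraction (the example strings are hypothetical):

```python
import re

text = "Subject 001-023 enrolled on 2012-05-14"

# INDEX-style substring search finds literal text only:
assert text.find("2012") != -1

# A regular expression matches the *pattern* of a date, wherever it sits:
date = re.search(r"\d{4}-\d{2}-\d{2}", text)
assert date is not None and date.group(0) == "2012-05-14"

# PRXCHANGE-style extraction/rewriting in a single statement:
subj = re.sub(r".*?(\d{3}-\d{3}).*", r"\1", text)
assert subj == "001-023"
```

The same `\d{3}-\d{3}` pattern keeps matching when the subject number changes, which is exactly the flexibility a literal substring search cannot offer.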
TA09 : Supplementing Programmed Assisted Patient Narratives (PANs) with Graphs Using SAS
Faye Yeh, Takeda Global Research and Development, Inc
Melvin Munsaka, Takeda Global Research and Development, Inc
Tuesday, 2:45 PM - 3:15 PM, Location: Continental 7
When addressing safety issues of concern, it is often necessary to provide narratives on specific patients. Indeed, the ICH-E3 guideline calls for provision of narratives of deaths, other serious adverse events, and certain other significant adverse events judged to be of special interest. To facilitate this, it is useful to provide programmed assisted narratives (PANs) in the form of targeted patient profiles. PANs are also useful for internal safety review, safety monitoring, and data integrity checks. They are basically by-patient numeric and text listings organized by domain and include targeted data needed for narratives. They allow for consolidation of information about the important domains in one place without having to look at multiple listings for the subject’s data. This may include the patient identifier, demographic information, and other relevant information pertaining to the events of special interest, such as concomitant and previous illnesses, and details of the timing and duration of relevant concomitant and previous medication. In order to get maximum benefit in writing the narratives and to quickly identify key characteristics of importance, it is useful to integrate the numeric and text output in the PANs with graphs. The use of graphs allows for multidimensional data visualization and quick assessment of relationships in the data which may not be easy to pick out from text and listing format only. This paper will discuss the integration of numeric and text PANs with graphs using SAS. SAS code for integrating text and numeric output with graphs in the PANs will be illustrated.
TA10 : Programming With CLASS: Keeping Your Options Open
Arthur Carpenter, CA Occidental Consultants
Monday, 1:15 PM - 2:15 PM, Location: Continental 7
Many SAS® procedures utilize classification variables when they are processing the data. These variables control how the procedure forms groupings, summarizations, and analysis elements. For statistics procedures they are often used in the formation of the statistical model that is being analyzed. Classification variables may be explicitly specified with a CLASS statement, or they may be specified implicitly from their usage in the procedure. Because classification variables have such a heavy influence on the outcome of so many procedures, it is essential that the analyst have a good understanding of how classification variables are applied. Certainly there are a number of options (system and procedural) that affect how classification variables behave. While you may be aware of some of these options, a great many are new, and some of these new options and techniques are especially powerful. You really need to be open to learning how to program with CLASS.
TA11-SAS : Why Does SAS Say That? What common DATA Step and Macro Messages Are Trying to Tell You
Charley Mullin, SAS
Kevin Russell, SAS
Tuesday, 1:15 PM - 2:15 PM, Location: Continental 7
SAS notes, warnings, and errors are written to the log to help SAS programmers understand what SAS is expecting to find. Some messages are for information, some signal potential problems, some require you to make changes in your SAS code, and some might seem obscure. This paper explores some of these notes, warnings, and errors that come from DATA step and macro programs. This paper deciphers them into easily understood explanations that enable you to answer many of your questions.
Techniques and Tutorials - Foundations
TF01 : Macro Quoting to the Rescue: Passing Special Characters
Mary Rosenbloom, Edwards Lifesciences LLC
Arthur Carpenter, CA Occidental Consultants
Monday, 10:45 AM - 11:15 AM, Location: Continental 9
We know that we should always try to avoid storing special characters in macro variables. We know that there are just too many ways that special characters can cause problems when the macro variable is resolved. Sometimes, however, we just do not have a choice. Sometimes the characters must be stored in the macro variable whether we like it or not. And when they appear we need to know how to deal with them. We need to know which macro quoting functions will solve the problem, and even more importantly why we need to use them. This paper takes a quick look at the problems associated with the resolution and use of macro variables that contain special characters such as commas, quotes, ampersands, and parentheses.
TF02 : What to Do with a Regular Expression
Scott Davis, Experis
Monday, 10:15 AM - 10:45 AM, Location: Continental 9
As many know, SAS has provided support for regular expressions for some time now. There are numerous papers that expose the basic concepts as well as some more advanced implementations of regular expressions. The intent of this paper is to narrow the gap between the very beginning and the advanced. In the past you might have solved a programming problem with a combination of SUBSTR/SCAN and other functions. Now a regular expression may be used to greatly reduce the amount of code needed to accomplish the same task. Think of this paper as a recipe or guide book that can be referenced for some real-life examples that will hopefully get you thinking about ways to create your own regular expressions.
TF03 : Simplifying Effective Data Transformation Via PROC TRANSPOSE
Arthur Li, City of Hope
Tuesday, 9:00 AM - 10:00 AM, Location: Continental 9
You can store data with repeated measures for each subject, either with repeated measures in columns (one observation per subject) or with repeated measures in rows (multiple observations per subject). Transforming data between formats is a common task because different statistical procedures require different data shapes. Experienced programmers often use ARRAY processing to reshape the data, which can be challenging for novice SAS® users. To avoid using complex programming techniques, you can also use the TRANSPOSE procedure to accomplish similar types of tasks. In this talk, PROC TRANSPOSE, along with its many options, will be presented through various simple and easy-to-follow examples.
TF04 : The SAS® DATA Step: Where Your Input Matters
Peter Eberhardt, Fernwood Consulting Group Inc.
Monday, 9:15 AM - 10:15 AM, Location: Continental 9
Before the warehouse is stocked, before the stats are computed and the reports run, before all the fun things we do with SAS® can be done, the data need to be read into SAS. A simple statement, INPUT, and its close cousins FILENAME and INFILE, do a lot. This paper will show you how to define your input file and how to read through it, whether you have a simple flat file or a more complex formatted file.
TF05 : Interesting Technical Mini-Bytes of Base SAS – From DATA Step to Macros
Airaha Chelvakkanthan Manickam, Cognizant Technology Solutions
Tuesday, 5:00 PM - 5:30 PM, Location: Continental 9
Over the last several decades, SAS Institute has improved and added thousands of features to Base SAS. It is almost impossible for someone to know all of the tricks and tips of Base SAS. This paper is intended to highlight some useful tips and advanced features of Base SAS that the author has come across in his experience working with SAS. These technical mini-bytes are divided into two sections: 1) DATA step mini-bytes and 2) macro mini-bytes. Readers will find interesting examples ranging from common to advanced DATA step techniques, such as CALL SET, dynamically accessing SAS datasets, and Perl pattern matching. Similarly, simple to advanced macro tips, including CALL EXECUTE and the macro quoting functions, are discussed.
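One of the macro mini-bytes named above, CALL EXECUTE, can be sketched as follows; the control dataset and its DSNAME variable are hypothetical:

```sas
/* CALL EXECUTE builds code at run time and stacks it for execution
   after the DATA step ends: here, one PROC PRINT per listed dataset. */
data _null_;
  set control;   /* hypothetical: one row per dataset name in DSNAME */
  call execute(cats('proc print data=', dsname, '(obs=5); run;'));
run;
```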
TF06 : Making Your SAS® Data JMP® Through Hoops
Mira Shapiro, Analytic Designers LLC
Monday, 4:30 PM - 5:00 PM, Location: Continental 9
Longtime SAS users can benefit by adding JMP to their repertoire. JMP provides an easy-to-use and robust environment for data exploration, graphics and analytics without the need for programming expertise. This paper will provide an introduction to JMP 9 with an emphasis on features that SAS users will find useful. During this presentation, users will learn how to read their SAS data, import Excel spreadsheets, transform their data, explore distributions, create reports and create sophisticated graphics all in the JMP environment. Users will be introduced to the tools within the JMP 9 environment that provide a pathway to quickly learn how to use the product and some of its unique features.
TF07 : Get SAS®sy with PROC SQL
Amie Bissonett, PharmaNet/i3
Monday, 11:15 AM - 11:45 AM, Location: Continental 9
As a data analyst in genetic clinical research, I was often working with familial data, connecting parents with their children and looking for trends. When I realized that much of this complex programming could be accomplished quite easily with the SQL procedure, I started learning the syntax, and I’ve been a PROC SQL fan ever since. I’ve been suggesting SQL solutions to colleagues for their programming tasks for years, and now I’m writing a paper to share the power of PROC SQL with you. This paper will give you some simple PROC SQL steps to get you started and some more complex steps to show you a few of the ways that PROC SQL can make your programs easier and more efficient.
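The parent-to-child linking described above maps naturally onto a single SQL join; this sketch uses hypothetical table and column names, not the author's data:

```sas
/* One join replaces a sort/merge sequence: attach each parent's
   genotype to the child record via the shared family ID. */
proc sql;
  create table trio as
  select c.famid, c.subjid, c.genotype,
         p.subjid   as parent_id,
         p.genotype as parent_genotype
  from children as c
       inner join parents as p
       on c.famid = p.famid;
quit;
```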
TF08 : Merge vs. Join vs. Hash Objects: A Comparison using "Big" Medical Device Data
James Johnson, Medtronic, Inc.
Monday, 2:15 PM - 2:45 PM, Location: Continental 9
Due to the use of memory-based operations, hash objects have the potential to reduce the time it takes to combine and sort data files. This presentation compares the performance of three methods in SAS used to retrieve a subset of data from a SQL Server database table with over 300 million records. The methods compared are 1) merging data in a DATA step and sorting with PROC SORT; 2) a PROC SQL inner join with an ORDER BY clause; and 3) hash object programming.
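The third method can be sketched like this (dataset and key names are hypothetical, not from the paper): the small lookup table is loaded into memory once, and the large table is subset in a single sequential pass:

```sas
/* Hash-object subsetting: no sort of the large table is required. */
data subset;
  if _n_ = 1 then do;
    declare hash h(dataset: 'work.keep_ids');  /* small lookup table   */
    h.defineKey('device_id');
    h.defineDone();
  end;
  set big.events;                 /* hypothetical very large table     */
  if h.find() = 0;                /* keep rows whose key is in the hash */
run;
```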
TF09 : Building the Better Macro: Best Practices for the Design of Reliable, Effective Tools
Frank DiIorio, CodeCrafters, Inc.
Monday, 1:15 PM - 2:15 PM, Location: Continental 9
The SAS® macro language has power and flexibility. It can generate or supplement code for practically any type of SAS application, and it is an essential part of the serious programmer's tool box. Collections of macro applications and utilities can prove invaluable to an organization wanting to routinize work flow and react quickly to new programming challenges. When badly implemented, however, the language demonstrates a chaos-inducing capacity unrivalled by other components of the SAS System: its flexibility is also one of its implementation hazards. The syntax, while sometimes rather baroque, is reasonably straightforward and imposes relatively few spacing, documentation, and similar requirements on the programmer. In the absence of rules imposed by the language, the result is often awkward and ineffective coding. Some amount of self-imposed structure must be applied during the program design process, particularly when writing systems of interconnected applications. This paper presents a collection of macro design guidelines and coding best practices. It is written primarily for programmers who create systems of macro-based applications.
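One common guideline of the kind the paper advocates, made concrete in a sketch (the macro name and parameters are hypothetical): a header comment, required-parameter checks, and an early exit before any work is done:

```sas
%macro freqrpt(data=, var=, out=_freq);
  /*----------------------------------------------------------
    freqrpt: one-way frequency counts of &VAR in &DATA -> &OUT
  ----------------------------------------------------------*/
  %if %superq(data)= or %superq(var)= %then %do;
    %put ERROR: (freqrpt) DATA= and VAR= are required.;
    %return;                       /* fail fast, with a clear message */
  %end;
  proc freq data=&data;
    tables &var / noprint out=&out;
  run;
%mend freqrpt;
```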
TF10 : Getting Your Data in Shape With PROC TRANSPOSE
Nancy Brucken, PharmaNet/i3
Tuesday, 8:00 AM - 9:00 AM, Location: Continental 9
How many times have you been able to transpose your datasets successfully on the first try with PROC TRANSPOSE? With the increasing use of CDISC SDTM and ADaM datasets, we frequently need to go from “narrow” to “wide” datasets, as when transposing lab data for a listing, or from “wide” to “narrow” datasets, as when transposing vital signs data from its original source to an SDTM VS domain. This paper explores some of the more commonly-used features of PROC TRANSPOSE, and also covers several situations where PROC TRANSPOSE cannot easily handle the required dataset restructuring.
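The wide-to-narrow direction mentioned above, vital signs to an SDTM-style VS structure, can be sketched as follows (variable names are hypothetical):

```sas
/* One column per test becomes one row per test, per subject-visit.
   _NAME_ carries the test code; COL1 carries the result. */
proc transpose data=vitals_wide
               out=vitals_narrow(rename=(_name_=vstestcd col1=vsstresn));
  by subjid visitnum;
  var sysbp diabp pulse;
run;
```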
TF11 : Validating SAS Macros and Updated SAS Version
Sy Truong, Meta-Xceed, Inc.
Carey Smoak, Roche Molecular Systems
Monday, 2:45 PM - 3:15 PM, Location: Continental 9
Rapid updates to SAS software in a regulated, controlled environment within the pharmaceutical industry create validation challenges. Users require the efficiencies of automated macros and the functionality of new versions of SAS, but the complexity of computing environments on a dynamic hardware and software platform makes it difficult to maintain a validated SAS system. This paper will share practical methods and techniques used in the validation of SAS macros and the upgrade of SAS to a new version on a production server. Maintaining a validated SAS system is critical to performing analysis of clinical data in a regulated environment. The lessons learned are based on real-world SAS upgrade experiences, shared here to save time and to ensure the integrity of analysis performed in an environment that is constantly being updated.
TF12 : Atypical Applications of PROC TRANSPOSE
John King, Ouachita Clinical Data Services, Inc.
Tuesday, 3:30 PM - 4:30 PM, Location: Continental 9
Did you know that PROC TRANSPOSE can be used to verify the existence of variables in a SAS® data set? It can process a list of variable name(s) or a “SAS Variable List” or a combination of the two. You can also check if the variables in a list are numeric or character. Using the return code from PROC TRANSPOSE a program can branch based on success or failure of PROC TRANSPOSE. Did you know you can use PROC TRANSPOSE and a data step to determine if a list of variables is in one variable list but not another? This paper shows how to accomplish these tasks and other atypical uses of PROC TRANSPOSE.
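The existence check described above rests on the step's return code; a minimal sketch (the misspelled variable is deliberate and hypothetical):

```sas
/* OBS=0 keeps the probe cheap; a variable missing from the VAR list
   makes the step fail, which &SYSERR records for later branching. */
proc transpose data=sashelp.class(obs=0) out=_chk;
  var name age notavar;   /* NOTAVAR does not exist in SASHELP.CLASS */
run;
%put NOTE: transpose return code is &syserr;
```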
TF13 : Rediscovering the DATA _NULL_ for Creating a Report, And Putting That Text File Into RTF Format In a Single Datastep
David Franklin, TheProgrammersCabin.com
Tuesday, 1:15 PM - 2:15 PM, Location: Continental 9
Before PROC REPORT existed, most text output files were created with PROC PRINT and PROC TABULATE, or, where a report needed a bit more sophistication, with DATA _NULL_. This paper rediscovers DATA _NULL_ as it is used for creating a text output file and presents a few macros, tricks, and techniques that will make your report shine, including a macro for wrapping text onto multiple lines, doing “Page x of y” processing, centering or right-aligning text, page breaking where you want it, and carrying group values over from one page to another. Finally, the output produced from a full example will be transformed into an RTF file by a macro in a single DATA step.
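The skeleton of such a DATA _NULL_ report, as a sketch (column positions and the demo dataset are illustrative only):

```sas
/* FILE PRINT routes PUT output to the listing; HEADER= names a label
   executed at the top of each page; LINESLEFT= drives page breaks. */
data _null_;
  set sashelp.class;
  file print header=hdr linesleft=ll;
  put @1 name $10. @14 age 3. @20 height 6.1;
  if ll le 2 then put _page_;
  return;
  hdr:
    put @1 'Name' @14 'Age' @20 'Height' / @1 26*'-';
  return;
run;
```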
TF14 : Comparing Datasets: Using PROC COMPARE and Other Helpful Tools
Deb Cassidy, PPD
Wednesday, 9:00 AM - 10:00 AM, Location: Continental 9
There may be many reasons to compare datasets, including working in a double-programming environment, determining whether your code revisions worked as expected, and determining the impact of raw-data updates. PROC COMPARE works great in many cases, and you need nothing more than: proc compare data=new_data comp=old_data; run; However, sometimes you get so many pages of differences that you are at a loss as to where to begin. If you want your datasets to be identical, this paper will cover examples of PROC COMPARE options and other helpful tools for getting to everyone’s favorite lines of output: NOTE: No unequal values were found. All values compared are exactly equal. If you are expecting differences, the paper will cover ways of making those differences easier to see. This presentation assumes no prior experience with PROC COMPARE and only an introductory knowledge of SAS.
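When the one-liner above produces too many pages, a few options help narrow the output; this sketch reuses the abstract's dataset names, and the ID variable is a hypothetical key:

```sas
/* Match observations by key rather than position, and cap the noise. */
proc compare base=old_data compare=new_data
             listall              /* report vars/obs found in only one dataset */
             maxprint=(50, 200);  /* at most 50 diffs per variable, 200 total  */
  id subjid;                      /* hypothetical key; both datasets sorted by it */
run;
```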
TF15 : A Hitchhiker's Guide for Performance Assessment & Benchmarking SAS® Applications
Viraj Kumbhakarna, JP Morgan Chase
Wednesday, 8:00 AM - 9:00 AM, Location: Continental 9
Almost every IT department today needs some kind of IT infrastructure to support the organization's business processes. For a typical IT organization, the ensemble of hardware, software, and networking facilities constitutes the IT infrastructure, which is set up to develop, test, deliver, monitor, control, and support IT services. Sometimes multiple applications are hosted on a common server platform. With a continual increase in the user base, an ever-increasing volume of data, and a perpetual increase in the number of processes required to support growing business needs, there often arises a need to upgrade the IT infrastructure. The paper discusses a stepwise approach to conducting the performance assessment and benchmarking exercise required to assess the current state of the IT infrastructure (hardware and software) prior to an upgrade. It considers the following steps toward a planned approach to process improvement. 1) Phase I, assessment and requirements gathering: a) understand the as-is process; b) assess the AIX UNIX server configuration. 2) Phase II, performance assessment and benchmarking: a) server performance (server, memory, disk, and resource utilization, plus network traffic); b) process performance (CPU usage, memory usage, disk space).
TF16 : Reading and Writing RTF Documents as Data: Automatic Completion of CONSORT Flow Diagrams
Arthur Carpenter, CA Occidental Consultants
Dennis Fisher, CSULB
Monday, 5:00 PM - 5:30 PM, Location: Continental 9
Whenever the results of a randomized clinical trial are reported in scientific journals, the published paper must adhere to the CONSORT (CONsolidated Standards Of Reporting Trials) statement. The statement includes a flow diagram, and the generation of these CONSORT flow diagrams is always problematic, especially when the trial is not the typical two-arm parallel design. Templates of the typical two-arm design flow diagram are generally available as RTF documents; however, the completion of the individual fields within the diagram is both time consuming and prone to error. The SAS Macro language was used to read an RTF template file for the CONSORT flow diagram of choice, fill in the fields using information available to the SAS program, and then rewrite the table as a completed RTF CONSORT flow diagram. This paper describes the process of reading and writing RTF files.
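The core read-substitute-rewrite idea can be sketched in one step; the file names and the placeholder token here are hypothetical, not the authors' conventions:

```sas
/* Treat the RTF template as plain text: copy it line by line,
   replacing a placeholder field with a computed value. */
data _null_;
  infile 'consort_template.rtf' lrecl=32767 length=len;
  file 'consort_filled.rtf' lrecl=32767;
  input line $varying32767. len;
  line = tranwrd(line, '{SCREENED}', '412');  /* hypothetical field/value */
  len2 = lengthn(line);
  put line $varying32767. len2;               /* write without padding */
run;
```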
TF17 : The Use and Abuse of the Program Data Vector
Jim Johnson, Ephicacy Consulting Group, Inc.
Tuesday, 10:15 AM - 11:15 AM, Location: Continental 9
Have you ever wondered why SAS does the things it does, why your programs “get away with” the things that they do, or why SAS would not do what you wanted it to? A key operational component of SAS is the program data vector. Without it SAS would not function as we know it. With knowledge of how the program data vector functions, programmers can better understand how SAS works. This paper will help you understand how the program data vector works, how DATA steps use it, and how you can exploit, manipulate, and trick it. Many examples are included; the magic behind the DOW-loop and other mysteries will be discussed. There is something in this paper for all levels of programmers, from the very beginner to the most advanced.
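As a taste of the DOW-loop mentioned above, here is the classic form as a minimal sketch (dataset and variable names are hypothetical):

```sas
/* The SET statement sits inside a DO-UNTIL, so values accumulate in
   the PDV across a BY group and the step outputs once per group. */
data totals;
  do until (last.subjid);
    set visits;              /* hypothetical data, sorted by SUBJID */
    by subjid;
    total = sum(total, value);
  end;
  /* implicit OUTPUT here: one row per subject with the group total */
run;
```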
TF18 : The Use and Abuse of the Program Data Vector - Examples
Jim Johnson, Ephicacy Consulting Group, Inc.
Tuesday, 11:15 AM - 11:45 AM, Location: Continental 9
This presentation is an extension of the 50 minute "Use and Abuse" presentation and will go into detail of some of the examples in the main paper, including the magic behind the DOW-loop and other mysteries, that could not fit in the earlier presentation. There is something in this paper for all levels of programmers from the very beginner to the most advanced.
TF19 : Cross-Platform Data Migration in a Clinical Environment
Frederick Pratter, Utopia Ltd
Srinivas Chittela, Purdue Pharma LP
Tuesday, 4:30 PM - 5:00 PM, Location: Continental 9
Moving SAS datasets, catalogs, macro libraries and other research data from one platform to another is sufficiently complex in itself, but adding the FDA requirement that all clinical data must continue to be available for years after the study is completed makes the process even more so. This paper relates the steps that were required to move 50,000+ SAS datasets in over 3,000 libraries from HP-UX to Linux. The files were created in Windows and UNIX using V6, V8 and V9 SAS. Each data type required individual handling, and tables needed a different treatment from catalogs. The procedures developed and the results obtained should be of interest to any organization confronting this kind of cross-platform migration.
TF20-SAS : PROC REPORT Unwrapped: Exploring the Secrets behind One of the Most Popular Procedures in Base SAS® Software
Allison Booth, SAS
Tuesday, 2:15 PM - 3:15 PM, Location: Continental 9
Have you ever wondered why a numeric variable is referenced in different forms within a compute block? Do you know the difference between a DATA step variable and a variable that is listed in the COLUMN statement? Then this paper is for you! Welcome to PROC REPORT Unwrapped. We are looking at PROC REPORT and uncovering some of the behind-the-scenes details about this classic procedure. We'll explore the components associated with PROC REPORT and discover ways to move column headers and to change default attributes with styles and CALL DEFINE statements. We'll also dig deep into example code and explore the new ability to use multilabel formatting for creating subgroup combinations. So for anyone who has ever written PROC REPORT code, stay tuned. It's PROC REPORT Unwrapped!
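The different-forms puzzle in the first sentence can be sketched like this (the report itself is an illustrative toy, not from the paper): with ANALYSIS usage, a compute block must reference the column as variable.statistic:

```sas
proc report data=sashelp.class nowd;
  column name age height;
  define age    / analysis mean 'Age';
  define height / analysis mean 'Height';
  compute height;
    /* the column is HEIGHT.MEAN here, not HEIGHT */
    if height.mean > 60 then
      call define(_col_, 'style', 'style=[background=lightyellow]');
  endcomp;
run;
```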
TF21-SAS : Using the New Features in PROC FORMAT
Rick Langston, SAS
(Presented by Kim Wilson)
Monday, 3:30 PM - 4:30 PM, Location: Continental 9
This paper describes several examples using functions as labels within PROC FORMAT definitions. Also described is the new feature allowing for Perl regular expressions for informatting data, as well as other new features in PROC FORMAT for SAS® 9.3.
TF22-SAS : The Greatest Hits: ODS Essentials Every User Should Know
Cynthia Zender, SAS
Wednesday, 10:15 AM - 11:15 AM, Location: Continental 9
Just when you think you know every song (feature) in the ODS hit parade, you discover that there’s an option or destination or feature that has you singing its praises because the feature boosted your reports to the next level. This paper covers some of the essential features and options of ODS that every user needs to know to be productive. This paper shows concrete code examples of the ODS “Greatest Hits.” Come to this session and learn some of the essential reasons why ODS and Base SAS® rock!