Paper presentations are the heart of a SAS users group meeting. PharmaSUG 2015 will feature over 200 paper presentations, posters, and hands-on workshops. Papers are organized into 14 academic sections and cover a variety of topics and experience levels. Detailed schedule information will be added in May.
Note: This information is subject to change. Last updated 11-May-2015.
Click on a section title to view abstracts for that section, or scroll down to view them all.
- Applications Development
- Beyond the Basics
- Career Planning
- Data Standards
- Data Visualizations & Graphics
- Hands-on Training
- Healthcare Analytics
- Industry Basics
- Management & Support
- Quick Tips
- Statistics & Pharmacokinetics
- Submission Standards
- Techniques & Tutorials
Beyond the Basics
Career Planning
|Paper No.||Author(s)||Paper Title (click for abstract)|
|CP01||Kirk Paul Lafler & Charlie Shipp||What's Hot, What's Not - Skills for SAS® Professionals|
& Darpreet Kaur
|CP03||Carey Smoak||Managing the Evolution of SAS® Programming|
& Otis Evans & Mandy Bowen |CRO, TLF, SOP? OMG!: A Beginner's Guide to the Clinical Research Organization.|
& James Zuazo |Prospecting for Great Programmers - Advice from a Hiring Manager to Make You a Solid Gold Addition to the Team|
& Chelsea Jackson |Where are the SAS Jobs?|
|CP07||Yingqiu Yvette Liu||Clinical knowledge helps SAS programmers in the pharmaceutical industry get the job done better|
& Kirk Paul Lafler |Downloading, Configuring, and Using the Free SAS® University Edition Software|
|CP10||Kirk Paul Lafler||A Review of "Free" Massive Open Online Content (MOOC) for SAS Learners|
Data Visualizations & Graphics
Hands-on Training
|Paper No.||Author(s)||Paper Title (click for abstract)|
|HT01||Kriss Harris||Picture this: Hands on SAS Graphics Session|
|HT02||Kirk Paul Lafler||Application Development Techniques Using PROC SQL|
& Mary Rosenbloom |Are You a Control Freak? Control Your Programs - Don't Let Them Control You!|
|HT04||Sergiy Sirichenko||Usage of OpenCDISC Community Toolset 2.0 for Clinical Programmers|
& Xue Yao |DS2 with Both Hands on the Wheel|
& Louis Semidey |Introduction to Interactive Drill Down Reports on the Web|
|HT07||Andrew Kuligowski||Using INFILE and INPUT Statements to Introduce External Data into SAS®|
|HT08-SAS||Vince Delgobbo||Creating Multi-Sheet Microsoft Excel Workbooks with SAS®: The Basics and Beyond Part 2|
Healthcare Analytics
|Paper No.||Author(s)||Paper Title (click for abstract)|
& John R Gerlach |Statistical Analyses Across Overlapping Time Intervals Based on Person-Years|
& Meenal Sinha |Using SAS® to Analyze the Impact of the Affordable Care Act|
|HA03||Dave Handelsman||Now You See It, Now You Don't -- De-Identifying Data to Support Clinical Trial Data Transparency Activities|
& Daniel Sturgeon |Medication Adherence in Cardiovascular Disease: Generalized Estimating Equations in SAS®|
|HA05||Kathy Fraeman||A General SAS® Macro to Implement Optimal N:1 Propensity Score Matching Within a Maximum Radius|
|HA06||Tracee Vinson-Sorrentino||The Path To Treatment Pathways|
|HA07||Jennifer Popovic||Distributed data networks: A paradigm shift in data sharing and healthcare analytics|
|HA09||Art Carpenter||Simple Tests of Hypotheses for the Non-statistician: What They Are and Why They Can Go Bad|
Industry Basics
|Paper No.||Author(s)||Paper Title (click for abstract)|
|IB01||Brian Shilling||The 5 Most Important Clinical Programming Validation Steps|
|IB02||Soujanya Konda||USE of SAS Reports for External Vendor Data Reconciliation|
& Min Lai |Tackling Clinical Lab Data in Medical Device Environment|
|IB04||Rajinder Kumar||SAS Grid : Simplified|
& Kevin Lee & Vikash Jain |Two different use cases to obtain best response using RECIST 1.1 in SDTM and ADaM|
& Benedikt Trenggono |The Disposition Table - Make it Easy|
|IB08||Julie Chen||Improving Data Quality - Missing Data Can Be Your Friend!|
|IB09||Rajkumar Sharma||Tips on creating a strategy for a CDISC Submission|
& Rajinder Kumar |Proc Compare: Wonderful Procedure!|
Management & Support
Statistics & Pharmacokinetics
Submission Standards
|Paper No.||Author(s)||Paper Title (click for abstract)|
|SS01||Nina Worden||Getting Loopy with SAS® DICTIONARY Tables: Using Metadata from DICTIONARY Tables to Fulfill Submission Requirements|
|SS02||Ryan Hara||Japanese submission/approval processes from programming perspective|
|SS04||David Izard||Begin with the End in Mind - Using FDA Guidance Documents as Guideposts when Planning, Delivering and Archiving Clinical Trials|
& Tatiana Scetinina |OSI Packages: What You Need to Know for Your Next NDA or BLA Submission|
& Max Kanevsky |The Most Common Issues in Submission Data|
& Frank Roediger |Getting Rid of Bloated Data in FDA Submissions|
|SS09-SAS||Lex Jansen||SAS Tools for Working with Dataset-XML files|
|SS10-SAS||Ken Ellis||Using SAS Clinical Data Integration to Roundtrip a Complete Study: Study Metadata (Define-XML) and Study Data (Dataset-XML)|
Techniques & Tutorials
Applications Development
AD01 : The Implementation of Display Auto-Generation with ADaM Analysis Results Metadata Driven Method
Chengxin Li, sas programmer
Monday, 8:00 AM - 8:20 AM, Location: Oceans Ballroom 3
Standard data structures as defined by CDISC should lead to standard programs and, in turn, to auto-generation of SDTM and ADaM data sets and of tables, listings, and figures. This paper provides a display auto-generation solution implemented with SAS macros. The solution is driven by ADaM analysis results metadata and illustrated with a survival analysis display. Together with display auto-generation, this paper also describes methods for handling dynamic footnoting, dynamic reporting, and related tasks.
AD02 : A Way to Manage Clinical Project Metadata in SAS Enterprise Guide
David Wang, Sanofi
Monday, 8:30 AM - 8:50 AM, Location: Oceans Ballroom 3
This paper will show how we use an existing standard (SDTM, ADaM, or a company-standard ADS) as a starting point to build and maintain project metadata that can be used across all studies under the same project. It uses a real submission project to demonstrate how this approach was applied during ADS development and how much time it saved. The approach guarantees that SAS variable attributes (e.g., label, length, and type) are consistent across studies. It also helps the project programming lead manage all studies during the creation of the ADS and its define documents, as well as during the ADS pooling process. All changes to the metadata were managed with SAS Enterprise Guide, without third-party software. Including the same SAS macro in each ADS program has made this a very convenient tool to use. Major SAS code is included in the paper.
AD03 : Proc STREAM: The Perfect Tool for Creating Patient Narratives
Joseph Hinson, inVentiv Health Clinical, Princeton, NJ
Monday, 9:00 AM - 9:20 AM, Location: Oceans Ballroom 3
STREAM is a new procedure in SAS 9.4 which allows free-form text embedded with macro elements to be streamed to an external file. On the surface, one could immediately say: "big deal; I can easily do that with PUT statements, or even via text assignments". But this nifty tool can do far more -- embedding even macro programs inside text. Furthermore, when the macro-embedded text is processed by the SAS Word Scanner, the SAS syntax rules get ignored. The procedure simply streams text directly to an external file, bypassing entirely the SAS compiler in the process but engaging only the Macro Processor. Thus, Proc STREAM really shines when dealing with text containing such SAS syntax-violators as HTML or XML tags. Patient narratives also present challenging text processing. Life would have been much easier if each patient had at most one adverse event, took only one concomitant medication, and possessed only a single medical history situation. But in the real world, each patient's profile would most likely include multiple records of medical history, concomitant medication, and adverse events. With Proc STREAM, macro programs, rather than just macro variables, can be embedded in text to handle multiple data, generate tables, and insert figures, as demonstrated in this paper.
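The mechanism the abstract describes can be sketched in a few lines; this is a minimal illustration (with made-up macro variables, not the paper's actual narrative template) of how PROC STREAM resolves macro references embedded in free-form text and streams the result to an external file.

```sas
/* Minimal PROC STREAM sketch (SAS 9.4). USUBJID and TRT are
   illustrative macro variables, not from the paper. */
filename narr temp;

%let usubjid = 1001-003;
%let trt     = Drug A 10 mg;

proc stream outfile=narr;
BEGIN
Subject &usubjid was randomized to &trt and completed
the treatment period with no serious adverse events.
;;;;
```

The text between BEGIN and the four semicolons bypasses the DATA step compiler entirely; only the macro processor touches it, which is why HTML/XML tags or other syntax-violating text pass through unharmed.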
AD04 : Accelerate define.xml generation using Define Ready
Senthilkumar Karuppiah, TAKE SOLUTIONS
Monday, 9:30 AM - 9:50 AM, Location: Oceans Ballroom 3
With the FDA and other regulatory agencies mandating electronic submissions and standardized data, pharmaceutical companies and sponsors are gearing up for submissions based on the CDISC SDTM and ADaM standards. An important component of such submissions is the generation of the CRT Data Definition Document (define.xml). Generating define.xml from a metadata document poses challenges for SAS programmers in the area of automation. This paper discusses a detailed, automated mechanism for generating and validating define.xml from a user interface (GUI) and, on the back end, with SAS-integrated modules and a pre-loaded metadata file. The paper also takes a quick look at how Define Ready automatically generates and validates the components of define.xml: dataset metadata, variable-level metadata, codelists, value lists, origins, and comments. Define Ready also ensures that projects are organized by sponsor, therapeutic area, and study, and supports multi-user capabilities along with audit trails and validation reports in a regulated environment.
AD05 : Have SAS Annotate your Blank CRF for you! Plus dynamically add color and style to your annotations.
Steven C. Black, Agility-Clinical
Monday, 10:15 AM - 11:05 AM, Location: Oceans Ballroom 3
Creating Blank CRF annotations by hand is an arduous process and can take many hours of precious time. Once again, SAS comes to the rescue! Using the SDTM specifications data, we can have SAS create all of the annotations needed and place them on the appropriate pages within the blankcrf.pdf. In addition, we can dynamically color and adjust the font size of the annotations. This process uses SAS, a touch of Adobe Acrobat's FDF language, and Acrobat Reader. In this paper I will go through each of the steps needed and provide detailed explanations of exactly how to accomplish each task.
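The general shape of the approach can be sketched as a DATA step that writes an FDF file from a specification dataset. This is a rough, hedged illustration, not the author's code: the SPEC dataset, its variables (PAGE, XPOS, YPOS, XPOS2, YPOS2, ANNOT), and the output path are all assumptions, and the FDF keys shown follow the general FDF annotation layout and may need adjustment for a given Acrobat version.

```sas
/* Hypothetical sketch: emit a minimal FDF annotation file from a
   specification dataset. FDF page numbers are zero-based. */
data _null_;
  file "/tmp/annotations.fdf";
  if _n_ = 1 then do;
    put '%FDF-1.2';
    put '1 0 obj';
    put '<< /FDF << /Annots [';
  end;
  set spec end=eof;   /* assumes PAGE, XPOS, YPOS, XPOS2, YPOS2, ANNOT */
  put '<< /Type /Annot /Subtype /FreeText /Page ' page
      ' /Rect [' xpos ypos xpos2 ypos2 ']'
      ' /Contents (' annot +(-1) ') >>';
  if eof then do;
    put '] >> >>';
    put 'endobj';
    put 'trailer';
    put '<< /Root 1 0 R >>';
    put '%%EOF';
  end;
run;
```

Opening the resulting FDF in Acrobat (with the blankcrf.pdf available) imports the annotations onto the referenced pages.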
AD06 : A SAS Macro-based Tool to Automate Generation of Define.xml V2.0 from ADaM Specification for FDA Submission
Min Chen, Alkermes
Xiangchen (Bob) Cui, Alkermes, Inc
Monday, 3:30 PM - 4:20 PM, Location: Oceans Ballroom 3
Both define.xml and a reviewer guide (define.pdf) are integral parts of any electronic FDA submission, in addition to the SDTM and ADaM datasets. Standardized, well-defined, and detailed define files minimize the time FDA reviewers need to familiarize themselves with the data and speed up the overall review process. The programming specification documentation serves as part of the reviewer guide, as well as documentation for programming validation. It is crucial to ensure consistency of variable attributes among the datasets, define files, and programming specification, and highly desirable to automate this process to ensure technical accuracy and operational efficiency. Automating define.xml generation has been a challenge for statistical programming since Define-XML v2.0, released in March 2013, significantly enhanced version 1.0. This paper introduces a metadata-driven, SAS macro-based tool that automates the creation of define.xml v2.0 from CDISC-compliant ADaM specifications for FDA electronic submission. The ADaM specifications were designed around the new features of Define-XML v2.0. The tool avoids wasting resources on manual creation and/or verification of consistency at a later stage, facilitates regulatory submissions, and achieves high submission quality.
AD07 : SDTM Annotations: Automation by implementing a standard process
Geo Joy, Novartis Pharmaceuticals
Andre Couturier, Novartis Pharmaceuticals
Monday, 11:15 AM - 11:35 AM, Location: Oceans Ballroom 3
Annotating a blank Case Report Form -- a collection of unique CRF pages stored in a file named blankcrf.pdf -- is an important part of submission activity, and a lot of productive time is spent manually annotating each page. Maintaining consistency across annotations and validating them against the submission data sets are manual, error-prone activities. This paper describes an effective way to automate the entire SDTM annotation process by setting up an annotation database and reusing the annotations from earlier studies, with the help of SAS® and a PDF editing tool. This is particularly useful because the majority of standard CRF pages are reused in multiple trials, so programmers can annotate those pages in almost no time.
AD08 : SAS Reports on your fingertips? - SAS BI is the answer for creating Immersive mobile report
Swapnil Udasi, Inventiv health
Monday, 2:15 PM - 2:35 PM, Location: Oceans Ballroom 3
The widespread use of smartphones and tablets has created a shift in how information is consumed. Nowadays it is very important for users to be able to access the latest information wherever they are located. So is it possible to access SAS reports from your mobile phone or tablet? The answer is YES! With the help of SAS BI, we can now access information at our fingertips and make decisions at any place and any time. We can interact, navigate, filter, and drill down into reports to make faster decisions. We can also easily annotate reports to share thoughts, ideas, and questions, add audio/video comments, and share them via email.
AD09 : The Dependency Mapper: How to save time on changes post database lock.
Apoorva Joshi, Biogen IDEC
Shailendra Phadke, Biogen Idec
Monday, 4:30 PM - 4:50 PM, Location: Oceans Ballroom 3
We place great emphasis on building a perfect road map to database lock: we keep all stakeholders in sync, nail down the specs, and simulate dry runs before database lock. This helps us produce the final deliverables, such as tables, listings, figures, and datasets, in a seamless fashion post database lock. In our race to create these deliverables in record time, we usually forget to account for one thing: human error. Changes to any part of the deliverable post database lock can be frustrating and could potentially delay timelines. What is the impact of a small change, such as correcting a label typo in ADLB (the analysis dataset for labs), compared with something more severe, such as a new derivation in ADSL (the subject-level analysis dataset)? By creating a dependency map, an extensive map of data sources and their dependent programs, one can greatly reduce the time consumed by changes post database lock. This paper explores the advantages of creating a dependency map and different methods to create such maps programmatically. Sample use cases explain the various time-saving advantages of dependency maps in detail. While detailed code is not provided in this paper, the important snippets of code necessary to get started are.
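One simple way to start building such a map, sketched here under assumptions of my own (a UNIX host, an illustrative program directory, and ADSL as the changed source), is to scan every program file for references to the affected dataset:

```sas
/* Hypothetical sketch: list the programs that reference ADSL.
   The directory path and the ls pipe assume a UNIX environment. */
filename progs pipe 'ls /studies/abc123/programs/*.sas';

data adsl_dependents;
  infile progs truncover;
  input progname $256.;            /* one program path per record */
  found = 0;
  infile prg filevar=progname end=done truncover;
  do until (done or found);
    input line $500.;
    if find(upcase(line), 'ADSL') then found = 1;  /* crude text match */
  end;
  if found then output;
  keep progname;
run;
```

The FILEVAR= trick reopens each program listed by the pipe; a real dependency mapper would refine the match (word boundaries, libname prefixes) and record the direction of each dependency.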
AD10 : Qualification Process for Standard Scripts in the Open Source Repository with Cloud Services
Hanming Tu, Accenture
Dante Di Tommaso, F. Hoffmann-La Roche, Ltd.
Dirk Spruck, Accovion GmbH
Christopher Hurley, MMS Holdings Inc.
Nancy Brucken, inVentiv Health Clinical
Tuesday, 8:00 AM - 8:50 AM, Location: Oceans Ballroom 3
This paper describes the steps, platform and progress of initiating a qualification process for standard scripts hosted in the Google Code repository with cloud services. The open source repository is used as a collaborative development platform for hosting the specialized programs to be used as analytic tools for clinical trial research, reporting, and analysis through cloud services. We will present how to access the repository, how to contribute to the repository and more importantly how to ensure the quality of the scripts being stored in the repository.
AD11 : When Reliable Programs Fail: Designing for Timely, Efficient, and Push-Button Recovery
Troy Hughes, No Affiliation
Tuesday, 9:00 AM - 9:50 AM, Location: Oceans Ballroom 3
Software quality comprises a combination of both functional and performance requirements, which together specify not only what software should do, but how well it should do it. An often overlooked performance requirement, recoverability describes the efficiency with which software can resume functioning following a catastrophic failure. Mathematically, recoverability can be represented as mean time to recovery (MTTR), which reflects the average amount of time required to resume program functioning. A second recoverability metric describes the level of effort or amount of time developers must invest to bring about that resumption of functioning. Recoverability can be facilitated through both hardware and SAS infrastructure optimization, but SAS programs as well must be optimized to ensure SAS processes intelligently resume after a failure. SAS software development best practices are described that reduce MTTR, including a modular approach to software design, the use of control tables that serialize process successes and failures, and data-driven engines that ensure efficient recovery. Best practices are also described that reduce or eliminate the effort required by developers to return software to functioning, through the automation of cleanup and resumption processes. Even reliable, robust programs can fail due to unforeseeable environmental and other circumstances, so recoverability mechanisms can be put in place that minimize both the time and effort involved in resuming functioning.
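The control-table idea can be sketched as a wrapper macro; this is my own minimal illustration, not the author's design, and it assumes a permanent dataset PERM.CONTROL with character variables STEP and STATUS and that each process step is itself a macro:

```sas
/* Hypothetical sketch: skip any step already recorded as COMPLETE,
   so a rerun after a failure resumes where it left off. */
%macro run_step(step);
  %local done;
  proc sql noprint;
    select count(*) into :done trimmed
      from perm.control
      where step = "&step" and status = 'COMPLETE';
  quit;
  %if &done = 0 %then %do;
    %&step;                               /* run the step macro */
    proc sql;
      insert into perm.control values ("&step", 'COMPLETE');
    quit;
  %end;
%mend run_step;

%run_step(import_raw)
%run_step(derive_adsl)
%run_step(produce_tables)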
AD12 : Agile Software Development for Clinical Trials
Troy Hughes, No Affiliation
Monday, 2:45 PM - 3:05 PM, Location: Oceans Ballroom 3
Agile methodologies for software development, including the Scrum framework, have grown in use and popularity since the 2001 Manifesto for Agile Software Development. More than having attained ubiquity, Agile has demonstrably defined software development in the 21st century with its core foci of collaboration, value-added software, and flexibility gained through incremental development and delivery. Although Agile principles can be extrapolated easily to other disciplines, Agile nomenclature, literature, and application too often remain focused on software development alone. References to Agile often depict an idealized "developer" archetype whose responsibilities seem focused narrowly on the production of releasable code, in a role that often omits operational and other non-developmental activities. In contrast, SAS practitioners who work in the pharmaceutical and broader clinical trials fields include not only hardline developers but also researchers, biostatisticians, statistical programmers, data analysts, and other professionals who often view code generation as a process, not a product. For these professionals, success is measured instead through data analysis, data-driven decision making, the release of analytic products, and the ultimate advancement of research. Moreover, clinical trials programming and data management often demand an intimate knowledge of health care, health policy, statistics, and a host of federal regulations and guidelines. Notwithstanding this shift in focus from technical to business value, Agile methodologies can greatly benefit software development in the pharmaceutical and clinical trials industries.
AD13-SAS : Patient Profiles and SAS Visual Analytics
Pritesh Desai, SAS
Monday, 1:15 PM - 2:05 PM, Location: Oceans Ballroom 3
Clinical trials data are collected from many different sources. Once the trial begins, all of the data need to be cleaned, explored, and reviewed before being processed. Patient profiles are used during many phases of the process. These profiles can take many forms depending upon the reviewer and the purpose. An ideal patient profile would contain all current data for each subject, empowering reviewers to rapidly assess a patient's status while still allowing access to any level of detail desired. This paper will explain how end users of various technical skill levels can use SAS Visual Analytics (VA) to achieve this, as well as other productivity improvements throughout the clinical trial process.
AD14-SAS : Making Shared Collaborative Research Environments Usable for Researchers and Sponsors
Matt Gross, SAS
Kishore Papineni, Astellas
Tuesday, 10:15 AM - 11:05 AM, Location: Oceans Ballroom 3
Pharmaceutical companies are coming together to share anonymised individual-level clinical trial data with external researchers - often referred to as clinical trial data transparency initiatives - to enable them to conduct further research that may advance scientific understanding and improve patient care. These companies have set up systems to provide researchers with an analytical environment that gives secure access to the individual-level data. These environments need both to protect individual privacy and confidentiality and to meet the needs of researchers by, for example, providing the tools to analyze the data. As companies take the first steps in providing access to data in these environments, initial usability studies as well as early user surveys have been undertaken by the sponsors of the environments to get feedback from researchers and inform the future development of the systems. This presentation will discuss a variety of aspects of how to make these environments usable and valuable to the researcher, including the usability studies and surveys and their results, and the proposed evolution of the environment to respond to feedback from the first research groups and to allow expansion to even more types of data and researchers.
Beyond the Basics
BB01 : The Knight's Tour in Chess -- Implementing a Heuristic Solution
John R Gerlach, Dataceutics, Inc.
Monday, 8:00 AM - 8:50 AM, Location: Oceans Ballroom 1
The Knight's Tour is a sequence of moves on a chess board such that a knight visits each square only once. Using a heuristic method, it is possible to find a complete path, beginning from any arbitrary square on the board and landing on the remaining squares only once. However, the implementation poses challenging programming problems. For example, it is necessary to discern viable knight moves, which change throughout the tour. Even worse, the heuristic approach does not guarantee a solution. This paper explains a SAS® solution that finds a knight's tour beginning from every square on a chess board.
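A common heuristic for this problem is Warnsdorff's rule: always move to the reachable square with the fewest onward moves. The sketch below is my own DATA step illustration of that rule (not necessarily the paper's approach), starting from square (1,1); as the abstract notes, the heuristic can get stuck, which the code reports.

```sas
/* Warnsdorff-style knight's tour sketch on an 8x8 board. */
data _null_;
  array b[8,8] _temporary_;                       /* missing = unvisited */
  array dr[8] _temporary_ (1 2 2 1 -1 -2 -2 -1);  /* knight move offsets */
  array dc[8] _temporary_ (2 1 -1 -2 -2 -1 1 2);
  r = 1; c = 1; b[r,c] = 1;
  do step = 2 to 64;
    best = .; br = .; bc = .;
    do m = 1 to 8;                                /* candidate moves */
      nr = r + dr[m]; nc = c + dc[m];
      if 1 <= nr <= 8 and 1 <= nc <= 8 and b[nr,nc] = . then do;
        deg = 0;                                  /* onward degree */
        do k = 1 to 8;
          tr = nr + dr[k]; tc = nc + dc[k];
          if 1 <= tr <= 8 and 1 <= tc <= 8 and b[tr,tc] = . then deg + 1;
        end;
        if best = . or deg < best then do;        /* keep min-degree move */
          best = deg; br = nr; bc = nc;
        end;
      end;
    end;
    if br = . then do;
      put 'Tour stuck at step ' step;
      stop;
    end;
    r = br; c = bc; b[r,c] = step;
  end;
  put 'Visited all 64 squares.';
run;
```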
BB02 : A New Era: Open access to clinical Trial Data - A case study
Jacques Lanoue, Novartis
Aruna Kumari Panchumarthi, Novartis
Monday, 9:00 AM - 9:20 AM, Location: Oceans Ballroom 1
Access to the underlying (patient-level) data that are collected in clinical trials provides opportunities to conduct further research that can help advance medical science or improve patient care. This helps ensure the data provided by research participants are used to maximum effect in the creation of knowledge and understanding. Researchers can use anonymised patient-level data and supporting documents from clinical studies to conduct further research. This paper will present: challenges in data sharing; challenges around anonymization; defining a new business process from scratch; defining a global standard while acknowledging division specifics; and the implementation, with its lessons learned and do's and don'ts.
BB03 : A Toolkit to Create a Dynamic Excel Format Metadata to Assist SDTM Mapping Process
Huei-Ling Chen, Merck
Helen Wang, Sanofi
Monday, 9:30 AM - 9:50 AM, Location: Oceans Ballroom 1
SDTM (Study Data Tabulation Model) is the recommended standard dataset structure when sponsors (e.g., pharmaceutical companies) submit a new drug application to the FDA. Data management teams and clinical SAS programming teams frequently encounter data mapping tasks to convert raw datasets into SDTM-structured datasets. A toolkit that quickly summarizes the raw datasets into metadata in an Excel spreadsheet can help enable these mapping tasks. This paper demonstrates a toolkit that automatically searches an entire designated SAS data library and creates dynamic Excel-format metadata summarizing the data. This metadata has two main features: 1) an overview spreadsheet listing all datasets, with hyperlinks to the individual dataset spreadsheets; 2) for each dataset, an individual spreadsheet presenting the variable attributes, with a link back to the overview spreadsheet.
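The core of such a toolkit can be sketched with DICTIONARY.COLUMNS and ODS EXCEL; this is a minimal illustration of the idea (the library name RAWLIB and output path are placeholders, and it omits the hyperlinking the paper describes):

```sas
/* Sketch: one sheet of variable attributes per dataset in a library. */
proc sql;
  create table meta as
  select memname, varnum, name, type, length, label
  from dictionary.columns
  where libname = 'RAWLIB'        /* libnames are stored uppercase */
  order by memname, varnum;
quit;

ods excel file="/tmp/rawlib_metadata.xlsx"
          options(sheet_interval='bygroup');
proc print data=meta noobs;
  by memname;
run;
ods excel close;
```

SHEET_INTERVAL='bygroup' starts a new worksheet for each dataset; the overview sheet with hyperlinks would be layered on top of this.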
BB04 : Process and programming challenges in producing define.xml
Mike Molter, d-Wise Technologies
Monday, 10:15 AM - 11:05 AM, Location: Oceans Ballroom 1
Wednesday, 8:00 AM - 8:50 AM, Location: Oceans Ballroom 11
Companies all over the pharmaceutical industry are looking for solutions for creating define.xml. The unfamiliarity of a foreign format like XML; the business rules surrounding the content; the intricacies of the metadata - all of these paint a picture of a daunting task that requires specialized software to be operated by specialists in the field. In this paper we'll see that while budget-saving Base SAS solutions do exist, the task is not trivial. We'll also see that while programming challenges are part of the puzzle, challenges in setup and process are where much of the task's energy and resources need to be dedicated.
BB05 : A Methodology of Laboratory Data Reporting of Potentially Clinical Significant Abnormality (PCSA) for Clinical Study Report
Xiangchen (Bob) Cui, Alkermes, Inc
Min Chen, Alkermes
Monday, 1:15 PM - 2:05 PM, Location: Oceans Ballroom 1
In February 2005, the Center for Drug Evaluation and Research (CDER) issued Reviewer Guidance: Conducting a Clinical Safety Review of a New Product Application and Preparing a Report on the Review, which points out that reporting of Potentially Clinically Significant Abnormalities (PCSA) from laboratory, vital sign, and ECG values for selected parameters is critical to clinical safety review. This paper describes a methodology with a step-by-step process to build a standard library of PCSA criteria datasets for selected serum chemistry and hematology parameters, automatically generate CDISC-compliant standard variables in ADLB for the PCSA criteria, and create SAS programming templates for PCSA tables and their listings. The standardization of the PCSA criteria dataset facilitates the generation of the CDISC-compliant standard variables in ADLB and the PCSA table template. With this strategy, standard SAS programs for both the ADaM datasets and the PCSA tables can be easily developed across studies, ensuring technical accuracy and operational efficiency.
BB06 : Implementing Union-Find Algorithm with Base SAS DATA Steps and Macro Functions
Chaoxian Cai, AFS Inc
Monday, 2:15 PM - 3:05 PM, Location: Oceans Ballroom 1
Union-Find is a classic algorithm used to form the union of disjoint sets and to find the connected components of a graph from a given set of vertices and edges. The algorithm is often used in data manipulations that involve graphs, trees, hierarchies, and linked networks. A tree data structure can be used to implement the algorithm. A SAS dataset is a tuple data structure that is excellent for table manipulations, but it cannot be directly applied to implement the Union-Find algorithm. In this paper, we will discuss the programming techniques available in Base SAS for implementing Union-Find. We will explain how to implicitly represent graphs and trees using SAS arrays, and how to build hierarchical trees and perform queries on trees using SAS DATA steps and macro functions. We will also discuss the programming techniques that have been used to minimize the average running time of the algorithm.
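The array-based representation the abstract alludes to can be sketched as follows; this is my own minimal illustration (a six-vertex graph with three edges, iterative find, no path compression), not the paper's implementation:

```sas
/* Union-Find on a temporary array: parent[i] = i means i is a root. */
data _null_;
  array parent[6] _temporary_ (1 2 3 4 5 6);
  infile datalines end=last;
  input u v;
  /* find the root of each endpoint */
  ru = u; do while (parent[ru] ne ru); ru = parent[ru]; end;
  rv = v; do while (parent[rv] ne rv); rv = parent[rv]; end;
  if ru ne rv then parent[ru] = rv;        /* union the two trees */
  if last then do i = 1 to 6;              /* report components */
    r = i; do while (parent[r] ne r); r = parent[r]; end;
    put 'Vertex ' i 'is in the component rooted at ' r;
  end;
datalines;
1 2
2 3
5 6
;
```

Vertices that report the same root belong to the same connected component; here {1,2,3}, {4}, and {5,6}.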
BB07 : Is Your Failed Macro Due To Misjudged "Timing"?
Arthur Li, City of Hope
Tuesday, 1:15 PM - 2:05 PM, Location: Oceans Ballroom 1
The SAS® macro facility, which includes macro variables and macro programs, is the most useful tool for developing your own applications. Beginning SAS programmers often don't realize that the most important part of learning the macro facility is understanding macro language processing, rather than just learning the syntax. The gaps in understanding include how SAS statements are transferred from the input stack to the macro processor and the DATA step compiler, what role the macro processor plays during this process, and when best to use the interfaces that interact with the macro facility during DATA step execution. In this talk, these issues are addressed by building simple macro applications step by step.
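The classic timing issue the abstract refers to can be shown in a few lines; this small example (of my own, not from the talk) illustrates why a macro variable created during DATA step execution cannot be resolved in the same step:

```sas
/* CALL SYMPUTX assigns CNT at execution time; a reference to &cnt
   inside this same step would have been resolved at word-scanning
   time, before CNT existed. */
data _null_;
  set sashelp.class end=eof;
  if eof then call symputx('cnt', _n_);
run;

/* After the step has run, the reference resolves correctly. */
%put NOTE: sashelp.class has &cnt observations.;
```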
BB09 : Things Are Not Always What They Look Like: PROC FORMAT in Action
Peter Eberhardt, Fernwood Consulting Group Inc
Lucheng Shao, Ivantis,Inc
Tuesday, 2:15 PM - 3:05 PM, Location: Oceans Ballroom 1
When we first learn SAS, we quickly learn the value of built-in formats for converting our underlying data into visual representations fit for human consumption. For many SAS programmers, that is as far as their understanding of formats goes. In this paper we will show how to create and use formats and informats with PROC FORMAT, and how these artifacts can do more than just change the appearance of a variable.
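One example of a format doing "more than appearance" is using it as a lookup table with the PUT function; the grouping below is an illustration of mine, not one taken from the paper:

```sas
/* A user-defined format turned into a recode table via PUT. */
proc format;
  value agegrp low-<13 = 'Child'
               13-<20  = 'Teen'
               20-high = 'Adult';
run;

data groups;
  set sashelp.class;                 /* has a numeric AGE variable */
  length agegroup $8;
  agegroup = put(age, agegrp.);      /* lookup, not just display */
run;

proc freq data=groups;
  tables agegroup;
run;
```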
BB11 : Generic Macros for Data Mapping
Qian Zhao, J&J
Jun Wang, J&J
Ruofei Hao, J&J
Monday, 11:15 AM - 11:35 AM, Location: Oceans Ballroom 1
Data mapping and its QC can be a daunting task. When data are mapped, e.g., to SDTM, the variable values, variable names, variable types, and data structures may all change. Writing generic macros to perform data mapping and QC of the mapping that work for any data set may seem impossible. This paper explores a technique that combines external control files and generic macros to perform data mapping and QC, and the technique can be extended to other uses. Two examples illustrate the concept. In the first, we demonstrate how to QC a data set mapped from a horizontal structure to a vertical structure by isolating a single variable at a time and cycling through all variables. In the second, we demonstrate how to write a macro against a data mapping specification to generate the mapping program.
BB13 : A Unique Way to Annotate Case Report Forms (CRFs) in PDF, Using Forms Data Format (FDF) Techniques
Boxun Zhang, Seattle Genetics
Tyler Kelly, Seattle Genetics
Tuesday, 8:00 AM - 8:20 AM, Location: Oceans Ballroom 1
One of the essential tasks for programmers, as part of internal processes and/or as a component of a regulatory submission, is to annotate CRFs. While there are various approaches to accomplish this, Adobe Acrobat is commonly used. To overcome the labor-intensiveness of incorporating PDF annotations manually, creating an FDF file provides a repository in which to store and manage the annotations. Because these annotations are mapped by page number, it is still challenging to automatically assign the annotations back to the CRFs as desired, and across similar studies. By determining the degree of similarity based on the text strings of the CRFs, it is possible to establish accurate mappings of annotations to CRFs. This paper describes a simple method using the SAS® COMPGED function for fuzzy matching and explores the dynamic possibilities of incorporating PDF annotations into CRFs.
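COMPGED returns a generalized edit distance between two strings, so similar CRF page texts score low and unrelated ones score high. A tiny illustration (my own strings, not the paper's data):

```sas
/* Lower COMPGED score = more similar text. */
data _null_;
  near = compged('Vital Signs', 'Vital Sings');     /* transposed letters */
  far  = compged('Vital Signs', 'Adverse Events');  /* unrelated pages */
  put near= far=;
run;
```

A matching rule would pair each new CRF page with the existing page whose text yields the smallest COMPGED score, subject to a threshold.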
BB14 : Perl Regular Expression in SAS® applications
Yang Wang, Seattle Genetics
Abdul Ghouse, Seattle Genetics
Tuesday, 8:30 AM - 8:50 AM, Location: Oceans Ballroom 1
SAS® provides text functions to perform search, replace, and other manipulations on character strings. SAS® also provides a set of functions that work with Perl regular expressions (the PRX functions) to perform string tasks with more flexibility and power. In this paper, we use applications and macro examples to demonstrate the advantages of the PRX functions. An automated emailing utility: a SAS® application that automates a process that selectively sends report files to different email addresses on a specific schedule; the user-defined Perl regular expression strings used to select the target report files are maintained in a lookup file. A smart Excel® reader: Excel® allows mixed data types in one column, but a SAS® dataset column allows only a single data type; in order to read all cells in their intended format, this SAS® macro identifies the cell format with PRX functions and casts the right data type for the column. An efficient SAS® macro input-parameter checker: we provide examples showing that PRX functions have great advantages over SAS® text functions. In conclusion, because Perl regular expression patterns can be written as static strings, users can pass static patterns that provide logic to a SAS® program instead of nesting one or more string functions.
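The file-selection idea can be sketched with PRXMATCH and PRXCHANGE; the file name and patterns below are illustrative inventions of mine, not from the paper:

```sas
/* Match a report file name against a pattern, then rewrite part of it. */
data _null_;
  length newname $60;
  fname = 'site12_report_2015.pdf';
  if prxmatch('/^site\d+_report_\d{4}\.pdf$/', strip(fname)) then
    put fname 'matches the expected naming pattern';
  /* replace the 4-digit year with a literal tag */
  newname = prxchange('s/_\d{4}\./_FINAL./', -1, fname);
  put newname=;
run;
```

Because the pattern is an ordinary string, it can live in a lookup file and be swapped without touching program logic, which is the utility's key design point.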
BB15 : Creating Data-Driven SAS Code with CALL EXECUTE
Hui Wang, Biogen
Tuesday, 9:00 AM - 9:20 AM, Location: Oceans Ballroom 1
In the SAS programming language, the basic function of CALL EXECUTE is to resolve its argument and then execute the resolved value. It is one of the CALL routines that can interact with the macro facility within a DATA step. Therefore, in data-driven programming practice, it is widely used for tasks such as running macros conditionally or passing DATA step values to macros. This paper will discuss how to take advantage of the features of CALL EXECUTE to apply data-driven programming strategies. Examples include: 1) creating a dataset frame based on provided metadata; 2) evaluating assessment values based on external criteria; 3) merging comments from an Excel® file back to the corresponding data records regardless of variable attributes.
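A minimal sketch of the data-driven pattern the abstract describes (the macro `%mkreport` and the dataset `metadata` are hypothetical stand-ins, not the paper's own names):

```sas
/* One macro call is generated per metadata row; %NRSTR defers
   macro execution until after the DATA step finishes. */
data _null_;
   set metadata;   /* assumed column: dsname (e.g. ADAE, ADLB) */
   call execute(cats('%nrstr(%mkreport(ds=', dsname, '))'));
run;
```

Without %NRSTR the macro would begin executing while the DATA step is still running, which is a common source of timing bugs with CALL EXECUTE.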
BB16 : Unpacking an Excel Cell: Dealing with Multi-Line Excel Cells in SAS
Lucheng Shao, Ivantis, Inc.
Tuesday, 9:30 AM - 9:50 AM, Location: Oceans Ballroom 1
In clinical trials it is very common for a subject to have multiple observations of a variable at a given visit. This can end up as multiple lines in a single cell of the original dataset. If the original dataset is a Microsoft Excel file, multiple lines in one cell lead to unexpected output when the file is imported into SAS®. This paper will show examples like this and how to deal with such problems. The critical step is to generate an array in SAS with each of the multiple lines as an element of the array; you can then formulate the final SAS output from those array elements in whatever way you choose. This paper is intended for SAS users who are familiar with Base SAS and want to learn more about how to create a new SAS function using the FCMP procedure.
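One way the unpacking step might be sketched (the names `imported` and `cell` are hypothetical; `'0A'x` is the line-feed character that Excel embeds for Alt+Enter line breaks):

```sas
data long;
   set imported;
   length value $200;
   /* Emit one observation per embedded line in the cell */
   do i = 1 to countw(cell, '0A'x);
      value = strip(scan(cell, i, '0A'x));
      output;
   end;
   drop i;
run;
```

Rows whose cell is entirely blank produce no output here; a real program would decide whether to keep them as missing records.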
BB17 : Fresh Cup of Joe: Utilizing Java to automate Define.XML for SDTM Origin mapping from SAS® aCRF PDFs.
Tony Cardozo, Theorem Clinical Research
Tuesday, 10:15 AM - 10:35 AM, Location: Oceans Ballroom 1
Define.XML specifications provide an extended CDISC Operational Data Model (ODM) to describe clinical data and statistical analyses submitted for FDA review. Part of the Define.XML process is identifying and mapping Study Data Tabulation Model (SDTM) variables to their SAS annotated Case Report Form (aCRF) origin. This process can be labor intensive and time consuming. This paper first introduces the programming language Java as a powerful alternative for processing and manipulating PDF documents, and then demonstrates how it may be used to fully automate the SAS® aCRF to SDTM variable origin mapping process. In addition to explaining how to implement Java, this paper provides specific examples of how to use iText, a free Java PDF library, to extract all SAS aCRF annotations while setting all XML-accessible hyperlinks/destinations within the PDF document. Lastly, it covers how SAS® can utilize the extracted annotations to determine all one-to-one, one-to-many, and many-to-many SDTM variable to SAS aCRF relationships. This allows for a fully programmatic solution to Define.XML variable origin mapping.
BB18 : Not Just Merge - Complex Derivation Made Easy by Hash Object
Lu Zhang, PPD
Tuesday, 10:45 AM - 11:05 AM, Location: Oceans Ballroom 1
The hash object is well known as a data look-up technique in the DATA step because of its many advantages. Before SAS 9.2, it was mostly used for efficient data merging, due to the limitation that hash object keys had to be unique. Since SAS 9.2, hash objects in the DATA step can perform data look-up even when the keys are not unique. As a result, many complex derivations in our daily work become more straightforward and simpler than before. In this paper, practical examples from analysis database derivations illustrate how this works.
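A hedged sketch of the non-unique-key look-up the abstract alludes to, using the `multidata:'yes'` option (dataset and variable names are illustrative, not taken from the paper):

```sas
/* Attach every concomitant-medication record to each subject row. */
data derived;
   if _n_ = 1 then do;
      declare hash h(dataset:'conmed', multidata:'yes');
      h.defineKey('usubjid');
      h.defineData('cmtrt', 'cmstdtc');
      h.defineDone();
   end;
   set adsl;
   length cmtrt $100 cmstdtc $19;
   call missing(cmtrt, cmstdtc);
   rc = h.find();                /* first record for this key */
   do while (rc = 0);
      output;                    /* one row per matching medication */
      rc = h.find_next();
   end;
   drop rc;
run;
```

As written, subjects with no matching medication produce no output; an OUTPUT statement before the loop would keep them with missing medication fields.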
BB19-SAS : Clinical Trials Analysis Driven by CDISC Data Standards
Kelci Miclaus, JMP
Tuesday, 11:15 AM - 11:35 AM, Location: Oceans Ballroom 1
JMP® Clinical is a combined solution of JMP and SAS software tools that relies on the SDTM and/or ADaM data standard for a wide range of analytical reviews of clinical data. Clinical analysis reports utilize the JMP software interface and drive the creation and execution of an extensive set of source SAS macro programs. The analyses target medical monitoring and regulatory medical review, signal detection, data quality and fraud detection, and risk-based monitoring (RBM). Generating cross-domain data also leads into more advanced subgroup analysis and predictive modeling routines. In this presentation, we will highlight several aspects that make this solution unique. These include an "intelligent" system that tracks how SAS and JMP algorithms are employed based on CDISC (SDTM, ADaM, ADaM BDS, SEND) variable presence and usage; sophisticated statistically-driven reports that direct medical review with several entry-points to patient profiles and patient AE narratives; and integrations with SDD, CDI, and SAS Metadata servers. Additionally, we will highlight how SAS programmers can incorporate their own programs that leverage all of these capabilities.
BB20 : Macro Programming Best Practices: Styles, Guidelines and Conventions Including the Rationale Behind Them
Don Henderson, Henderson Consulting Services LLC
Art Carpenter, CA Occidental Consultants
Monday, 3:30 PM - 5:20 PM, Location: Oceans Ballroom 1
Coding in the SAS Macro Language can be a rich and rewarding experience. The complexity of the language allows for multiple solutions to many programming situations. But which solution is best, is there even a "best" solution, and how do you decide? Should you choose a technique based on ease of programming or because of the efficiency of the solution? What is the best practice for laying out your code, selecting your statements, and organizing the way you use the symbol tables? Learn about macro language programming styles, guidelines and conventions and why they are important. Since not everyone always agrees on programming practices, this paper will focus on the rationales behind the techniques so you can make an informed choice based on your environment.
Career Planning
CP01 : What's Hot, What's Not - Skills for SAS® Professionals
Kirk Paul Lafler, Software Intelligence Corporation
Charlie Shipp, Consider Consulting Corporation
Tuesday, 3:30 PM - 3:50 PM, Location: Oceans Ballroom 1
As a new generation of SAS® user emerges, current and prior generations of users have an extensive array of procedures, programming tools, approaches and techniques to choose from. This presentation identifies and explores the areas that are hot and not-so-hot in the world of the professional SAS user. Topics include Enterprise Guide, PROC SQL, PROC REPORT, Macro Language, DATA step programming techniques such as arrays and hash, SAS University Edition software, support.sas.com, sasCommunity.org®, LexJansen.com, JMP®, and Output Delivery System (ODS).
CP02 : E-Learn SAS
Vikas Gaddu, Anova Groups
Darpreet Kaur, Anova Groups
Tuesday, 4:00 PM - 4:20 PM, Location: Oceans Ballroom 1
This paper focuses on how to create SAS e-learning tutorials, which tools to use, and how to organize your content so that it is captivating for all audiences. A tutorial consists of a topic, which can have multiple modules, and each module has multiple lessons. Each lesson should end with a quiz to assess each student's progress. Your Learning Management System should be able to store these quiz scores and track progress using standard models like SCORM. All of this is generic and applies to any e-learning solution; what makes creating e-learning content for SAS programming different? For programming you need good examples to make your point, a well-thought-out sample project that mimics real-world work, and an interface that allows students to ask questions and submit their code for review by an expert. You will also need a number of software tools to create an e-learning lesson: for example, screen-recording software like Camtasia or Captivate to record your SAS code and output; a Wacom tablet and Sketchbook Pro to make illustrations and figures; and goanimate.com to add animation and humor. Finally, you need a good script and an LMS authoring tool like Articulate Storyline to bring everything together and convert it into a SCORM module, plus a website that can host the content and track each user's activity.
CP03 : Managing the Evolution of SAS® Programming
Carey Smoak, InClin, Inc.
Tuesday, 4:30 PM - 4:50 PM, Location: Oceans Ballroom 1
Over the past 30+ years, I have seen many innovations in SAS as a software tool. While SAS software has become more and more innovative, the SAS programming profession in the pharmaceutical / biotechnology industry has both evolved and de-evolved in some important ways. The evolution is mainly due to advances in SAS as a software tool. Thus the SAS programmer is challenged to keep up with new innovations in the SAS software. At the same time, the daily work of a SAS programmer may be de-evolving due to working in a specification driven environment and standardization (CDISC and macros). Standardization and specifications are good, but SAS programmers may become bored in environments that are highly standardized. The recent growth in the data scientist profession may offer a way for SAS programmers in the pharmaceutical / biotechnology industry to grow their careers and do more than they are currently doing. This paper will present some ideas for growing the SAS programming profession.
CP04 : CRO, TLF, SOP? OMG!: A Beginner's Guide to the Clinical Research Organization.
Stephen Terry, INC Research
Otis Evans, INC Research
Mandy Bowen, INC Research
Tuesday, 5:00 PM - 5:20 PM, Location: Oceans Ballroom 1
Navigating the operational activities of a Clinical Research Organization (CRO) can perhaps be daunting to newly hired entry level employees; in addition to being a complex web of interrelated entities and divisions of labor, the industry is replete with acronyms, protocols, and esoteric jargon. To hopefully help alleviate this confusion, we present this paper as an introduction to the general workings of the CRO industry and its place in the larger clinical trial process. We will focus on the involvement of the sponsor, the CRO, and the Food and Drug Administration (FDA) in the development, analysis, and approval processes.
CP05 : Prospecting for Great Programmers - Advice from a Hiring Manger to Make You a Solid Gold Addition to the Team
Christopher Hurley, MMS Holdings Inc.
James Zuazo, MMS Holdings
Wednesday, 10:45 AM - 11:05 AM, Location: Oceans Ballroom 1
Eureka! SAS programming in the Pharmaceutical industry is a very rewarding career. For the manager, when hiring or dealing with the day to day, having extraordinary SAS programming resources to count on is as good as gold. When looking at a stack of CVs, how do we shake the pan to find the ones that are truly sparkling? That is, how to select the ones to interview? What makes a good hire? Once integrated into the team, how does the programmer earn respect, gain trust, and elevate to be the real deal and not just iron pyrite? Stand out with some nuggets of advice to differentiate yourself in the eyes of your manager and peers. This paper is for everybody. It breaks down concepts from the manager and employee perspectives, before and after a programmer is added to a team. This is important because job changes occur frequently in the course of a career. Shine from the very start and get brighter over time! Topics will be explored on how to develop as a programmer, how to communicate with your clients and peers, and how to develop your leadership potential. The ideas unearthed in this presentation will set you on your quest to build a solid gold career.
CP06 : Where are the SAS Jobs?
Beth Ward, Chiltern
Chelsea Jackson, Chiltern
Wednesday, 8:00 AM - 8:20 AM, Location: Oceans Ballroom 1
Who is hiring? Hiring trends have shifted in the past several years for pharmaceutical, biotechnology, and clinical research organizations. Functional Service Provider models have grown over the years and are showing strong results. This paper outlines the different job options for a SAS® professional, describes trends happening in the industry and why employers outsource their work, and offers insight to help professionals align themselves with these potential employers.
CP07 : Clinical knowledge helps SAS programmers in the pharmaceutical industry get job done better
Yingqiu Yvette Liu, PA
Wednesday, 8:30 AM - 8:50 AM, Location: Oceans Ballroom 1
SAS programming in the pharmaceutical industry is not a pure programming job like programming in the IT industry. It requires more understanding and knowledge on the business side, i.e., the complicated nature of clinical trials and clinical trial data. In this sense, Clinical Trial Data Analyst is a title that better reflects the nature of the job. SAS programmers in the pharmaceutical industry gain a unique perspective when they have a thorough understanding of clinical trials and sufficient knowledge of the disease targeted by a drug in clinical trials. This unique view brings a quicker grasp and deeper understanding of the clinical data, and it may also empower SAS programmers to catch and solve hidden issues that would otherwise go unnoticed. In this paper, the author provides examples of how clinical knowledge helps SAS programmers in the pharmaceutical industry get the job done better. Hopefully it will inspire more colleagues in the pharmaceutical SAS programming field to enhance their clinical knowledge and understanding, which may result in higher-quality deliverables and faster personal career growth.
CP08 : Downloading, Configuring, and Using the Free SAS® University Edition Software
Charlie Shipp, Consider Consulting Corporation
Kirk Paul Lafler, Software Intelligence Corporation
Wednesday, 10:15 AM - 10:35 AM, Location: Oceans Ballroom 1
The announcement of SAS Institute's free "SAS University Edition" is an exciting development for SAS users and learners around the world! The software bundle includes Base SAS, SAS/STAT, SAS/IML, Designer Studio (user interface), and SAS/ACCESS for Windows, with all the popular features found in the licensed SAS versions. This is an incredible opportunity for users, statisticians, data analysts, scientists, programmers, students, and academics everywhere to use (and learn) for career opportunities and advancement. Capabilities include data manipulation, data management, comprehensive programming language, powerful analytics, high quality graphics, world-renowned statistical analysis capabilities, and many other exciting features. This presentation discusses and illustrates the process of downloading and configuring the SAS University Edition. Additional topics include the process of downloading the required applications, "key" configuration strategies to run the "SAS University Edition" on your computer, and the demonstration of a few powerful features found in this exciting software bundle. We conclude with a summary of tips for success in downloading, configuring and using the SAS University Edition.
CP10 : A Review of "Free" Massive Open Online Content (MOOC) for SAS Learners
Kirk Paul Lafler, Software Intelligence Corporation
Wednesday, 9:00 AM - 9:50 AM, Location: Oceans Ballroom 1
Leading online providers now offer SAS users "free" access to content for learning how to use and program in SAS. This content is available to anyone in the form of massive open online content (or courses) (MOOC). Not only is all the content offered for "free", it is designed with the distance learner in mind, empowering users to learn with a flexible and self-directed approach. As noted on Wikipedia.org, "A MOOC is an online course or content aimed at unlimited participation and made available in an open access forum using the web." This presentation illustrates how anyone can access a wealth of learning technologies including comprehensive student notes, instructor lesson plans, hands-on exercises, PowerPoints, audio, webinars, and videos.
Data Standards
DS01 : Understanding SE, TA, TE Domain
Jacques Lanoue, Novartis
Monday, 8:00 AM - 8:20 AM, Location: Oceans Ballroom 4
The FDA clearly requests that the Subject Elements table (SE) be part of the submission data. In order to derive the SE domain, the trial design domains Trial Elements (TE) and Trial Arms (TA) also need to be defined and included in the data. This paper will discuss how the trial design domains TE and TA contribute to the derivation of the SE domain, and will provide one interpretation of the derivations needed to achieve a compliant and useful SE domain.
DS02 : SDTM TE, TA, and SE domains: Demystifying the Development of SE
Kristin Kelly, Accenture Life Sciences
Jerry Salyers, Accenture Life Sciences
Fred Wood, Accenture Life Sciences
Tuesday, 8:00 AM - 8:50 AM, Location: Oceans Ballroom 4
There is a necessary relatedness among the SDTM Trial Design domains, TA (Trial Arms) and TE (Trial Elements), and the special-purpose domain, SE (Subject Elements), that serves to convey both the planned (TE/TA) and actual (SE) treatment paths for a subject within a study. The SE domain is derived from the subject-level general observation class domains based on 1) the Start and End rules in TE, and 2) the Transition and Branching rules (TATRANS and TABRANCH) in the TA domain. The SDTMIG principles state that there are no gaps between Elements, meaning that the start date/time of an Element (SESTDTC) is the same as the end date/time of the previous Element (SEENDTC). The SE domain can be cumbersome to create if the code attempts to account for both the Start Rule and End Rule for each Element. This paper will discuss a streamlined process for deriving SESTDTC and SEENDTC by accounting for only the Start Rule (SESTDTC) for each Element, while including the End Rule (SEENDTC) in the code only if the subject does not fulfill the planned Start Rule of their next Element (e.g., due to study discontinuation, some other unplanned occurrence), or if it is the last Element.
DS03 : Considerations in Conforming Data from Multiple Implantable Medical Devices to CDISC Standards Using SAS®
Julia Yang, Medtronic Inc.
Monday, 2:45 PM - 3:05 PM, Location: Oceans Ballroom 4
Both pharmaceutical and medical device trial analyses are at the subject level, capturing a drug's or device's impact on the patient. Medical device trials are unique in that their analyses are also at the device level. Efficacy and safety data are affected by how each device is implanted and by the interactions of multiple related devices. Therefore, a device-level analysis dataset (ADDL) is additionally required for device trials, similar to the required use of the subject-level analysis dataset (ADSL). It can be challenging and time-consuming to plan and create SDTM and ADaM data for the first time. However, once the ADaM data set is created, the SAS programs can be reused in reporting for multiple different device trials, which should result in substantial savings of both time and resources. This paper addresses: 1) the need in device trials for an SDTM dataset referred to as Device Exposure as Collected (DC), a counterpart to Exposure as Collected (EC) in drug development; 2) the use of the BDS structure to develop an Interim Device-Level Analysis Dataset (ADDLINT), which facilitates the creation of ADDL from the DC dataset; 3) the use of ADDL as the foundation for device clinical trial analysis. Key words: Implantable Medical Devices, SAS®, SDTMIG-MD, Device Exposure as Collected (DC), Interim Device-Level Analysis Dataset (ADDLINT), Device-Level Analysis Dataset (ADDL), Device Events, Device Survival
DS04 : Using ADDL Model for Device Analysis
Priya Gopal, Theorem Clinical
Tuesday, 9:30 AM - 9:50 AM, Location: Oceans Ballroom 4
The ADaM device sub-team has been reviewing the need for a per-subject, per-device analysis model for device trials. One important difference in device trials is that more than one device may be used per subject in a clinical trial. Each device implanted in a subject can have different attribute settings: attributes like shape, diameter, and size can be set differently for each device implanted in the same subject. Each device can be separately implanted, explanted, modified, and controlled externally for the same subject. Based on analysis needs seen across different use cases, a draft device-level analysis model was submitted to the ADaM team for review. In this paper, I discuss some of the analysis scenarios seen in device trial analysis and how the proposed ADaM ADDL device model could be used for those scenarios.
DS05 : LOCF vs. LOV in Endpoint DTYPE Development with CDISC ADaM Standards
Maggie Ci Jiang, Teva Pharmaceuticals
Monday, 10:15 AM - 11:05 AM, Location: Oceans Ballroom 4
The CDISC ADaM Implementation Guide v1.0 (IG) defines standards for how to use the DTYPE terminology when deriving endpoint values in ADaM BDS datasets, and provides examples to illustrate how to apply them. However, the definitions and examples in the ADaM IG are limited to a single definition or an individual case, such as DTYPE for AVERAGE, LOV, or ENDPOINT. What about situations where the SAP defines multiple criteria? For example, the SAP may define the Last Observed non-missing Value (LOV) as the endpoint value while also allowing Last Observation Carried Forward (LOCF) to be applied to a scheduled visit when the expected value is missing. Which terminology should then be used for the derived endpoint DTYPE: LOCF, LOV, or just ENDPOINT? This paper presents a review of experiences implementing endpoint DTYPE in ADaM BDS datasets and discusses the comprehensive use of DTYPE terminology step by step with practical examples. No particular SAS version is required. The audience is expected to have read and understood the basic concepts of the CDISC ADaM Implementation Guide and to have an interest in further discussion.
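For orientation, a bare-bones LOCF derivation of the kind the abstract contrasts with LOV might look like this (dataset and variable names are assumptions for the sketch; a real derivation must also honor analysis windows and the SAP's specific rules):

```sas
proc sort data=adlb; by usubjid paramcd avisitn; run;

data adlb_locf;
   set adlb;
   by usubjid paramcd;
   length dtype $8;
   retain _last .;
   if first.paramcd then _last = .;
   if missing(aval) and not missing(_last) then do;
      aval  = _last;             /* carry last observation forward */
      dtype = 'LOCF';
   end;
   else _last = aval;
   drop _last;
run;
```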
DS06 : Considerations in ADaM Occurrence Data: Handling Crossover Records for Non-Typical Analysis
Karl Miller, inVentiv Health
Richann Watson, Experis
Monday, 1:15 PM - 1:35 PM, Location: Oceans Ballroom 4
With the release of the new ADaM Occurrence Data Model for public comment in the first quarter of 2014, the new model is clearly established to encompass adverse events, as well as concomitant medications, along with other data into this standard occurrence analysis structure. Commonly used analysis for this type of occurrence structure data can be found utilizing subject counts by category, based on certain criteria (i.e. treatment, cohort, or study period). In most cases, the majority of the analysis data will be in a one-to-one relationship with the source SDTM domain record. In this paper, the authors will discuss the creation of ADaM occurrence data for specific cases outside of the common or typical analysis, where analysis requires a record in SDTM data, which spans across multiple study treatments, periods or phases, to be replicated for inclusion in non-typical analysis for a record being analyzed under multiple study treatments, periods or phases. From the assignment and imputation of key timing variables (i.e. APERIOD, APHASE, ASTDT), through the appropriate derivation of indicator variables and occurrence flags (i.e. ANLzzFL, TRTEMFL, ONTRTFL, and AOCCRFL, AOCCPFL, etc.) the authors guide you through this non-typical process in order to maintain efficiency along with ensuring the traceability in the generation of this analysis-ready data structure.
DS07 : The Best Practices of CDISC ADaM Validation Checks: Past, Present, and Future
Shelley Dunn, d-Wise
Ed Lombardi, Agility-Clinical, Inc.
Monday, 1:45 PM - 2:35 PM, Location: Oceans Ballroom 4
The CDISC ADaM Validation Check document was last revised in 2012 (V1.2). The current checks include specific guidelines and rules for validation of ADaM data sets based on the ADaM Implementation Guide V1.0. As new ADaM documents are released the ADaM Compliance Sub-Team has been preparing machine-testable failure criteria to support new requirements. What are the best practices when it comes to ADaM validation? Will running validation checks, based on the CDISC ADaM Validation Check document, be sufficient to validate any analysis data structure? Many companies in the industry rely on a single set of checks provided by an outside vendor to validate their ADaM data sets. Unfortunately this practice alone is not sufficient to ensure all ADaM structure requirements are followed. ADaM validation is more than just running a tool and hoping for no errors. While the ADaM Validation Checks provide a comprehensive list of checks, not every nuance of ADaM particulars can be tested through machine-testable failure criteria. Therefore it becomes necessary to supplement these checks with both sponsor specific checks as well as manual data structure reviews. This presentation will focus on ways to improve the quality of validating analysis data structures by providing a look at the past, present, and future of ADaM compliance checks. Regardless of where you are on the spectrum, from being new to ADaM to being an author of the ADaMIG, this presentation will guide you to understand what you can do to help implement best practices around the ADaM Validations checks.
DS08 : Proper Parenting: A Guide in Using ADaM Flag/Criterion Variables and When to Create a Child Dataset
Richann Watson, Experis
Karl Miller, inVentiv Health
Paul Slagle, inVentiv Health
Monday, 3:30 PM - 4:20 PM, Location: Oceans Ballroom 4
There has always been confusion as to when to use the various flags (ANLzzFL, CRITyFL) or category variables (AVALCATy) reserved for the ADaM basic data structure (BDS). Although some of these variables can be used interchangeably, it helps to have rules that keep their use consistent. Furthermore, in some situations the creation of a new data set is the better solution; such data sets are referred to as parent-child data sets throughout the paper. This paper focuses on the rules the authors follow when developing ADaM data set specifications for the proper use of ANLzzFL, CRITy/CRITyFL, and AVALCATy, and for when a parent-child data set is the more feasible option.
DS09 : ADaM Example for a Complex Efficacy Analysis Dataset
Milan Mangeshkar, Exelixis
Sandra Minjoe, Accenture
Tuesday, 11:15 AM - 11:35 AM, Location: Oceans Ballroom 4
This paper walks through a complex analysis need and an ADaM solution. The analysis need involved summarizing both pain data and pain medications data across multiple timepoints. Our methodology was to first understand our analysis need, and then determine whether we could use ADaM structures to get us there, while always following the ADaM fundamental principles. The design work was a joint effort of the study lead statistician, study lead programmer, and an ADaM consultant. Our solution made use of interim ADaM BDS datasets to collect and consolidate the data from each SDTM domain before combining into the ADaM dataset actually used for analysis. These interim datasets provided traceability between the SDTM datasets and the ADaM dataset used for analysis, and were also instrumental for data review and listing generation.
DS10 : PhUSE De-Identification Working Group: Providing de-identification standards to CDISC data models
Jacques Lanoue, Novartis
Jean-Marc Ferran, Consultant & Owner, Qualiance; Director of Special Projects, PhUSE
Monday, 8:30 AM - 8:50 AM, Location: Oceans Ballroom 4
In this era of data transparency and sharing data with researchers, companies are defining their processes and de-identification guidance in order to comply with data privacy regulations. In particular, researchers can request access to data across sponsors, and differences in both data models and de-identification techniques can make the analyses cumbersome and error-prone. With the CDISC data models now adopted across the industry, PhUSE launched a dedicated Working Group in July 2014 to define de-identification standards for CDISC data models, starting with SDTM. Participants from pharmaceutical companies, CROs, software vendors, CDISC specialists, data privacy specialists, and academia have joined forces to define a set of rules against SDTM to provide the industry with a consistent approach to data de-identification and to increase consistency across anonymized datasets. Each domain and variable holding potentially Personally Identifying Information (PII) has been rated in terms of its impact on data privacy. Based on that rating, each variable is allocated standard de-identification rules, and the rationale and the impact on data utility are documented. This presentation will elaborate on the Working Group's main findings, the current deliverables, and the prospects for taking this first initiative to the next stage.
DS11 : It Depends On Your Analysis Need
Sandra Minjoe, Accenture
Monday, 4:30 PM - 4:50 PM, Location: Oceans Ballroom 4
As an ADaM consultant, I'm regularly asked by friends and coworkers fairly general questions that they think are "quick" questions. My usual response is often "It depends on your analysis need," which I suspect is a bit disappointing to hear. However, my vague answer is because the question itself doesn't have enough detail for me to reply with exactly one correct or best answer. In this paper I delve into some of these questions that I've been asked and describe some of the different answers each has, depending on your analysis need.
DS12 : DIABETES: Submission Data Standards and Therapeutic End Points
Naveed Khaja, SW-T Consulting Group
Tuesday, 9:00 AM - 9:20 AM, Location: Oceans Ballroom 4
Diabetes mellitus is a chronic metabolic disorder characterized by hyperglycemia. It is caused by defective insulin secretion, resistance to insulin action, or a combination of both. Most patients with diabetes mellitus have either type I diabetes (insulin-dependent or early onset) or type II diabetes (with a complex pathophysiology). While drugs with different mechanisms of action are available, the most recent candidates under development target glycemic control based on changes in HbA1c, where the primary data end point is reduction of HbA1c. Clinical trial data are now submitted to the regulatory authorities using CDISC standards. These standards provide rules for structuring information so that data can be entered consistently, reducing variability across trials submitted to the agency. The Study Data Tabulation Model (SDTM) developed by CDISC covers all the information collected during a study. A User Guide for diabetes was recently developed under the CFAST initiative, describing how to use the CDISC standards by identifying the common data elements. The purpose of this paper is twofold. First, it briefly describes the development landscape of diabetic drugs with a focus on primary end points. Second, it discusses the data collected, along with the standards published for the diabetes therapeutic area, with specific emphasis on SDTM.
DS13 : Implementing Various Baselines for ADaM BDS datasets
Songhui Zhu, A2Z Scientific Inc
Tuesday, 10:15 AM - 11:05 AM, Location: Oceans Ballroom 4
Most ADaM datasets can be implemented using the BDS data structure. In BDS datasets, if there is only one baseline per subject per parameter, the variables ABLFL, BASE, and BASEC can be used to implement baselines, and the implementation is fairly straightforward. However, when there are multiple baselines, such as time-matched baselines or baselines for crossover studies, the implementation can be tricky. In some cases new records need to be created, while in other cases they do not; meanwhile, BASETYPE and DTYPE need to be populated appropriately. In this paper, the author explains how to implement baselines for the various cases using examples.
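A simple single-baseline case, for contrast with the multi-baseline situations the paper tackles, might be sketched as follows (the names `bds`, `adt`, and `trtsdt` are assumptions for this sketch):

```sas
proc sort data=bds; by usubjid paramcd adt; run;

/* Last non-missing pre-treatment value per parameter is the baseline */
data basefl;
   set bds;
   by usubjid paramcd;
   where not missing(aval) and adt <= trtsdt;
   if last.paramcd;
   base = aval;
   keep usubjid paramcd base;
run;

data bds_base;
   merge bds basefl;
   by usubjid paramcd;
   if not missing(base) then chg = aval - base;
run;
```

With multiple baselines (e.g. one per BASETYPE), this one-row-per-parameter merge no longer suffices, which is where the techniques the paper describes come in.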
DS14 : What to Expect in SDTMIG v3.3
Fred Wood, Accenture Life Sciences
Monday, 9:00 AM - 9:50 AM, Location: Oceans Ballroom 4
The next version of the SDTMIG (version 3.3) is expected in Q3 of this year. The content has been posted for review in three batches. The rationale for this posting process for versions of the SDTMIG and the SDTM will be described. A number of new domains and concepts that have been added to SDTMIG v3.3 will be discussed. Included are the physiology-based Findings domains (Nervous System Findings (NV), Ophthalmic Examinations (OE), Respiratory Measurements (RE), and Cardiovascular Findings (CV)) and the new Interventions domain, Procedure Agents (AG). Functional Tests and Clinical Classifications are two new Findings domains that will be managed in the same way the Questionnaires (QS) domain is, with the Category variable indicating the type of measurements or tests. The definitions of the Tumor domains (TU, TR) have been expanded to include non-tumor lesions. The paper will present an overview of several new concepts. Included will be Disease Milestones, which involves new variables and new special-purpose domains, first introduced in the Diabetes Therapeutic-Area User Guide (TAUG). The option of including non-standard variables in parent domains, thus avoiding the need for SUPP-- datasets, was sent out for public comment, and the outcome of that will be discussed. The concept of a focus of interest within a subject (FOCID), which has the same meaning across all domains, was first used in the OE domain, but also has applicability to SEND data.
DS15 : Considerations in Submitting Non-Standard Variables: Supplemental Qualifiers, Findings About, or a Custom Findings Domain
Jerry Salyers, Accenture Life Sciences
Fred Wood, Accenture Life Sciences
Richard Lewis, Accenture Life Sciences
Tuesday, 2:15 PM - 3:05 PM, Location: Oceans Ballroom 4
As discussed by one of the authors in 2013, the SDTM Implementation Guide (SDTMIG) provides for a standard mechanism and structure for submitting non-standard variables. That paper discussed common issues seen when submitting SUPP-- datasets, from using an inappropriate IDVAR to the practice of submitting data "as collected" (i.e. "coded") though often largely uninterpretable. Examples highlighted some of the unexpected outcomes when the parent domain and data from the supplemental domain are merged together (as during the course of review) based on the identified "merge key" in the IDVAR variable. With the advent and growing knowledge and use of the Findings About (FA) domain, many sponsors are challenged in determining where non-standard data best fit. It's not uncommon to see sponsors submitting data in FA when supplemental datasets would have been preferred and would not have required the creation of RELREC records. Similarly, we've seen FA used when a custom Findings domain would have sufficed, as the --OBJ variable would not have added any clarity to the data. This paper will highlight several criteria that can be used to determine how best to represent such non-standard data.
DS16 : What is the "ADAM OTHER" Class of Datasets, and When Should it be Used?
John Troxell, Accenture
Tuesday, 1:15 PM - 2:05 PM, Location: Oceans Ballroom 4
As is well known by now, the CDISC ADaM team has defined four classes of ADaM datasets: ADSL (Subject-Level Analysis Dataset), BDS (Basic Data Structure), ADAE (Adverse Events Analysis Dataset), and OTHER. The ADAE class is soon to be generalized into the OCCDS (Occurrence Data Structure) class. The ADSL, BDS and ADAE/OCCDS structures are defined and widely used. However, the OTHER class is by nature relatively unspecified and mysterious. This paper explores philosophical and practical questions surrounding the OTHER class of ADaM datasets, including considerations for deciding when its use is appropriate, and when it is not.
DS18 : ADaM Implementation Roundtable Discussion
Nancy Brucken, inVentiv Health Clinical
Richann Watson, Experis
Steve Kirby, Theorem Clinical Research
Tuesday, 3:30 PM - 5:20 PM, Location: Oceans Ballroom 4
Many aspects of analysis requirements are study-specific, and some may require highly-customized supporting datasets to generate the necessary tables, listings and figures. For that reason, items such as specific variables to include and the number of analysis datasets required for a given study are not defined in Version 1.0 of the ADaM Implementation Guide. Join us for an informal discussion of ADaM implementation challenges, and bring your own ADaM data set questions for consideration.
DS19-SAS : Managing Custom Data Standards in SAS® Clinical Data Integration
Melissa R. Martinez, SAS
Monday, 11:15 AM - 11:35 AM, Location: Oceans Ballroom 4
SAS® Clinical Data Integration (CDI) is built on the idea of transforming source data into CDISC standard domains such as SDTM and ADaM. However, you can also define custom data standards to use within CDI. This paper describes several scenarios for using custom data standards, such as incorporating company-specific requirements, developing therapeutic area or compound level domains, and even using study-specific data standards. This paper also describes in detail how to set up the required data sets to define a custom data standard, register the custom data standard with SAS Clinical Standards Toolkit (CST), and import the custom data standard into SAS Clinical Data Integration. In addition, best practices are discussed for both managing the security of SAS Clinical Standards Toolkit Global Library that is central to the SAS Clinical Data Integration application and for the overall process of developing custom data standards.
Data Visualizations & Graphics

DV01 : Variable-Width Plot - SAS® GTL Implementation
Songtao Jiang, Boston Scientific
Tuesday, 11:15 AM - 11:35 AM, Location: Oceans Ballroom 3
If you do statistical analyses using S-PLUS, you may be familiar with the BLiP graphical tool. The variable-width plot (VWP) included in this tool was developed to graphically present one-dimensional data. A unique feature is its capability of drawing the distribution plot with width proportional to the density at particular locations. Free of certain limitations of commonly used graphical methods such as the scatter plot, histogram, or boxplot, the VWP is a very useful tool for studying data distributions. While the variable-width plot in the BLiP tool has been implemented in S-PLUS, this type of plot is not available in SAS. In this paper, we implement the VWP using SAS GTL. Its properties and benefits are discussed by comparing it to other existing graphical methods.
DV02 : Creating Sophisticated Graphics using Graph Template Language
Kristen Much, Rho, Inc.
Kaitlyn McConville, Rho, Inc.
Tuesday, 1:15 PM - 1:35 PM, Location: Oceans Ballroom 3
Graph Template Language (GTL) is an excellent tool for customizing the underlying attributes of graphics produced by SAS/GRAPH. However, many find learning this relatively new (in production since SAS 9.2) language a challenge. This paper will take an example based approach to creating complex single- and multi-cell statistical graphics. Focus will be placed on syntax and options available in GTL, overlaying graphs of different types, and creating graphs with more complex layouts. The examples provided using data from the Immune Tolerance Network (ITN) and Autoimmune Disease Clinical Trials (ADCT) will enable you to take your graphs to the next level using GTL.
DV04 : An Enhanced Forest Plot Macro Using SAS
Janette Garner, Gilead Sciences
Tuesday, 2:15 PM - 2:35 PM, Location: Oceans Ballroom 3
A forest plot allows for a quick visualization of results across multiple subgroups. Additional information such as the actual values of the forest plot or the response rates in each treatment group is usually not included in a basic forest plot. Building upon forest plot code written by Sanjay Matange, a platform-independent macro was developed to allow the user to create a forest plot with the additional information mentioned above. Moreover, the plot can be enhanced with a bar plot of response rates in each treatment group and numerical labels of the values for the forest plot. It utilizes SG procedures available in SAS 9.3 (in particular, the HIGHLOW statement) and a standardized input dataset. The macro can accommodate the data based on the differences between two treatment groups or the odds/risk ratio. Readability has also been improved by adding reference bands to the graph.
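The HIGHLOW-based approach the abstract mentions can be sketched as follows (variable names such as subgroup, est, lcl, and ucl are illustrative, not from the macro itself):

```sas
/* Minimal forest-plot skeleton with PROC SGPLOT (SAS 9.3+). */
proc sgplot data=forest noautolegend;
   highlow y=subgroup low=lcl high=ucl;   /* confidence intervals */
   scatter y=subgroup x=est;              /* point estimates      */
   refline 1 / axis=x;                    /* no-effect reference  */
   xaxis type=log label="Hazard Ratio (95% CI)";
   yaxis discreteorder=data display=(nolabel);
run;
```

The paper's macro builds on this skeleton, adding the side tables, bar plots of response rates, and reference bands described above.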
DV05 : Techniques of Preparing Datasets for Visualizing Clinical Laboratory Data
Amos Shu, MedImmune LLC
Zhouming(Victor) Sun, Medimmune
Tuesday, 2:45 PM - 3:05 PM, Location: Oceans Ballroom 3
Visualizing clinical laboratory data helps clinicians and statisticians quickly understand the results of lab tests. Some people have been able to easily generate figures for clinical laboratory data by using SAS® Graph Template Language and SAS/GRAPH® SG Procedures. The most difficult part of generating graphs, however, is not the SAS/GRAPH® Language, but the process of creating the datasets that are used to generate the graphs. This paper discusses the techniques of preparing datasets that will be used to generate five types of figures for clinical laboratory data. These same techniques are applicable to visualizing electrocardiography and vital signs data as well.
DV06 : Have a Complicated Graph? Annotate Can be Great!
Scott Burroughs, GlaxoSmithKline
Tuesday, 3:30 PM - 3:50 PM, Location: Oceans Ballroom 3
Back in the day before PROC REPORT became popular to create tables, DATA _NULL_ was the go-to vehicle to do just about anything you wanted in a table. It was versatile and highly customizable. SAS/GRAPH procedures have been adding new functions and features throughout the years, and now we have the powerful SG line of PROCs to use. However, there still are times when they can't do exactly what we want. The ANNOTATE feature of SAS/GRAPH is still being used to append both data-driven and stand-alone objects to graphs, including data points, p-values, and other highlighting features. But could you do the entire figure using it? Certainly! This will not be my first PharmaSUG paper using ANNOTATE to do all of the data presentation.
DV07 : Getting Sankey with Bar Charts
Shane Rosanbalm, Rho, Inc
Tuesday, 4:00 PM - 4:20 PM, Location: Oceans Ballroom 3
In this paper we present a SAS macro for depicting change over time in a stacked bar chart with Sankey-style overlays. Imagine a clinical trial in which subject disease severity is tracked over time. The disease severity has valid values of 0, 1, 2, and 3. The time points are baseline, 12 months, 30 months, and 60 months. A straightforward way to represent this data would be with a vertically-oriented stacked bar chart. Visit would be used as the x-axis variable. Disease severity would be used to form the groups in the stacked bars. The y-axis measure would be percent of subjects in each group. This type of data visualization allows us to see the change within each group over time. However, the data visualization does not allow us to see which group of subjects is driving these changes. For instance, if the number of subjects at severity level 1 were to increase from baseline to month 12, how do we know whether the new subjects are coming from group 0, 2, or 3? Sankey diagrams provide a visual depiction of the magnitude of flow between nodes in a network. If we think of the groups in a stacked bar chart as these nodes, then Sankey-style overlays can be used to show how the subjects flow from one severity level to another over time. In this paper we will present just such a data visualization.
DV08 : Looking at the Big Picture - Snapshots of Patient Health in SAS®
Ruth Kurtycz, Spectrum Health
Wednesday, 8:00 AM - 8:50 AM, Location: Oceans Ballroom 3
Healthcare professionals are exceedingly familiar with the multitude of measurements, diagnostics, and categorizations, which are used in the attempt to effectively define overall health. Commonly seen are measures such as Body Mass Index (BMI), cholesterol level, smoking status, activity level, etc. Yet each measurement itself only communicates a sliver in the big picture question of "how healthy am I?" To help see the big picture, the %REPORT macro has been developed which allows the user to take a "Snapshot" of a particular patient's health at any given time. These snapshots are one-page reports that contain the most recent health diagnostics of interest for a particular patient and the dates they were taken. By displaying these measures together, healthcare providers can gain a more complete understanding of a patient's overall health. The reports also include four visual displays of common health measurements over time, which allow the patient and the healthcare provider to identify improvements in health and together create action plans to address negative trends. To utilize the methods discussed here, users should have a basic understanding of the REPORT and GREPLAY procedures and access to SAS v9.3 or greater.
DV09-SAS : Clinical Graphs are Easy with SAS 9.4
Sanjay Matange, SAS Institute Inc., Cary, NC
Wednesday, 9:00 AM - 9:50 AM, Location: Oceans Ballroom 3
Axis tables, polygon plots, text plots, and more features have been added to Statistical Graphics (SG) procedures and Graph Template Language (GTL) for SAS® 9.4. These additions are a direct result of your feedback and are designed to make creating graphs easier. Axis tables let you add multiple tables of data to your graphs, correctly aligned with the axis values and colored to match the group values in your data. Text plots can have rotated and aligned text anywhere in the graph. You can overlay jittered markers on box plots, use images and font glyphs as markers, specify group attributes without making style changes, and create entirely new custom graphs using the polygon plot. All this without using the annotation facility, which is now supported both for SG procedures and GTL. This paper guides you through these exciting new features now available in SG procedures and GTL.
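As a small sketch of the SAS 9.4 axis-table feature (dataset and variable names are assumptions for illustration):

```sas
/* A subjects-at-risk style table aligned under the x axis. */
proc sgplot data=surv;
   step x=time y=survival / group=trt;
   xaxistable atrisk / x=time class=trt location=outside;
   xaxis label="Months since randomization";
   yaxis label="Survival probability";
run;
```

Before 9.4, aligning such a table with the axis typically required annotation or GTL layout tricks; XAXISTABLE handles the alignment automatically.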
DV10 : Graphical Presentation of Clinical Data in Oncology Trials
Murali Kanakenahalli, Seattle Genetics Inc
Avani Kaja, Seattle Genetics Inc
Tuesday, 4:30 PM - 4:50 PM, Location: Oceans Ballroom 3
Graphs are gaining ground in usage for presenting data in oncology clinical trials. As the options in ODS graphics and Graph Template Language (GTL) have grown over the last few years, so has the demand for details in graphs. Improvements in ODS and GTL have also provided an opportunity to create various types of graphs with ease. This paper discusses different kinds of graphs like KM plots, waterfall plots, swimmer plots, etc. which are used to present data on survival, adverse events, and response. The paper also showcases our attempts to use the Standardized ODS template for all the plots along with Graph Template Language templates.
DV11 : Leveraging Visualization Techniques to tell my data story: Survival Analysis Interpretation made easy through simple programming
Vijayata Sanghvi, PRA International
Wednesday, 10:15 AM - 10:35 AM, Location: Oceans Ballroom 3
This presentation focuses on new and customizable key features of PROC LIFETEST within the SAS/STAT package, available since SAS 9.2. These features support generating the survival plot with the number of subjects at risk, multiple comparisons of survival curves, Hall-Wellner confidence bands, individual versus group plots, panel plots, and many more, all of which can easily be adopted in programming with no annotation techniques. The objective of this paper is to acquaint the end user with the very simple options available at their disposal, to help alleviate an often daunting programming effort. We will share many graphical presentation scenarios incorporating these techniques and options, which will make the programmer's and clinical analyst's job simple and easy.
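A minimal sketch of the annotation-free survival plot described here (dataset and variable names such as ADTTE, AVAL, and CNSR are illustrative):

```sas
/* Survival plot with at-risk numbers and Hall-Wellner bands,
   all produced by PROC LIFETEST plot options (SAS 9.2+). */
ods graphics on;
proc lifetest data=adtte
   plots=survival(atrisk=0 to 60 by 12 cb=hw test);
   time aval * cnsr(1);   /* 1 = censored */
   strata trtp;           /* compare survival curves by arm */
run;
```

Everything on the plot, including the at-risk table and the log-rank p-value, comes from the PLOTS= options rather than from annotation.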
DV12 : R you ready to show me Shiny
Jeff Cai, Pharmacyclics Pharmaceuticals
Wednesday, 10:45 AM - 11:05 AM, Location: Oceans Ballroom 3
DV13 : Forest Plots: Old Growth versus GMO (genetically modified organism)
Scott Horton, Experis
Tuesday, 5:00 PM - 5:20 PM, Location: Oceans Ballroom 3
By their nature, forest plots (e.g., plotting hazard ratios with their confidence intervals for different factors) produced via the SAS/GRAPH GPLOT procedure invite the use of the Annotate Facility to produce figures that are informative and inviting. We will explore different ways to produce forest plots that vary in the amount of annotation used, and assess the pluses and minuses of each approach.
Hands-on Training

HT01 : Picture this: Hands on SAS Graphics Session
Kriss Harris, SAS Specialists Ltd.
Monday, 8:00 AM - 9:30 AM, Location: Palani
Would you like to be more confident in producing graphs and figures? Do you understand the differences between SGPLOT, SGSCATTER and SGPANEL? Would you like to know the different layout options in Graph Template Language (GTL)? Would you like to know how to easily create industry standard graphs such as Adverse Event timelines, Kaplan-Meier plots and Waterfall plots? Finally, would you like to learn all of these methods in a relaxing open environment that fosters questions? Great, then this topic is for you. In this Hands-on SAS Graphics Session you will be skillfully guided through SGPLOT, SGSCATTER, SGPANEL and GTL. You will also complete fun and challenging SAS Graphics exercises to enable you to retain what you have learned more easily. This Hands-on SAS Graphics session is structured so that you will learn how to create the standard plots that your managers request, how to easily create simple ad-hoc plots for your customers and also how to create complex graphics. You will be shown how to annotate your plots, including how to add Unicode characters to your plots. You will find out how to create reusable templates, which can be used by your team. Years of information have been carefully condensed into this 90-minute Hands-on SAS Graphics session. This session will give you insight and will be very interactive! Feel free to bring some of your challenging graphical questions along!
HT02 : Application Development Techniques Using PROC SQL
Kirk Paul Lafler, Software Intelligence Corporation
Monday, 10:15 AM - 11:45 AM, Location: Palani
Structured Query Language (SQL) is a database language found in the base-SAS software. It permits access to data stored in data sets or tables using an assortment of statements, clauses, options, functions, and other language constructs. This hands-on workshop (HOW) demonstrates core concepts as well as SQL's many applications, and is intended for SAS users who desire an overview of this exciting procedure's capabilities. Attendees learn how to construct SQL queries; create complex queries, including inner and outer joins; apply conditional logic with CASE expressions; identify FIRST.row, LAST.row, and BETWEEN.rows in BY-groups; create and use views; and construct simple and composite indexes.
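Two of the workshop's topics, an outer join and a CASE expression, can be sketched together (table and column names are illustrative):

```sas
/* Left join demographics to adverse events, deriving a severity
   flag with a CASE expression. */
proc sql;
   create table dm_ae as
   select a.usubjid,
          a.arm,
          case when b.aesev = 'SEVERE' then 'Y'
               else 'N'
          end as sevfl
   from demog as a
        left join adverse as b
        on a.usubjid = b.usubjid;
quit;
```

The left join keeps every subject from the demographics table even when no adverse event record matches.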
HT03 : Are You a Control Freak? Control Your Programs - Don't Let Them Control You!
Art Carpenter, CA Occidental Consultants
Mary Rosenbloom, Edwards Lifesciences, LLC
Tuesday, 8:00 AM - 9:30 AM, Location: Palani
You know that you want to control the process flow of your program. When your program is executed multiple times, with slight variations, you will need to control the changes from iteration to iteration, the timing of the execution, and the maintenance of output and LOGS. Unfortunately in order to achieve the control that you know that you need to have, you will need to make frequent, possibly time consuming, and potentially error prone manual corrections and edits to your program. Fortunately the control you seek is available and it does not require the use of time intensive manual techniques. List processing techniques are available that give you control and peace of mind that allow you to be a successful control freak. These techniques are not new, but there is often hesitancy on the part of some programmers to take full advantage of them. This paper reviews these techniques and demonstrates them through a series of examples.
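One common list-processing pattern in the spirit of this paper is driving a reporting macro from a control data set instead of hand-editing the program for each run. A minimal sketch (the macro, dataset, and variable names are hypothetical):

```sas
/* One row of CONTROL per report to produce: dsn, rpttitle. */
%macro run_one(dsn=, title=);
   title "&title";
   proc print data=&dsn;
   run;
%mend run_one;

data _null_;
   set control;
   /* Generate one macro call per control row; %NRSTR defers
      macro execution until the DATA step has finished. */
   call execute(cats('%nrstr(%run_one)(dsn=', dsn,
                     ', title=', rpttitle, ')'));
run;
```

Changing which reports run, or adding a new one, then means editing data rather than code.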
HT04 : Usage of OpenCDISC Community Toolset 2.0 for Clinical Programmers
Sergiy Sirichenko, Pinnacle 21
Wednesday, 8:00 AM - 11:00 AM, Location: Palani
All programmers have their own toolsets like a collection of macros, helpful applications, favorite books or websites. OpenCDISC Community is a free and easy to use toolset which is useful for clinical programmers who work with CDISC standards. In this Hands-On Workshop (HOW) we'll provide an overview of installation, tuning, usage and automation of OpenCDISC Community applications, including:
- Validator - ensure your data is CDISC compliant and FDA submission ready
- Define.xml Generator - create metadata in standardized define.xml v2.0 format
- Data Converter - generate Excel, CSV or Dataset-XML format from SAS XPT
- ClinicalTrials.gov Miner - find information across all existing clinical trials
HT05 : DS2 with Both Hands on the Wheel
Peter Eberhardt, Fernwood Consulting Group Inc
Xue Yao, Statistician
Tuesday, 10:15 AM - 11:45 AM, Location: Palani
The DATA step has served SAS® programmers well over the years, and although it is handy, the new and powerful DS2 is a significant alternative, introducing an object-oriented programming environment. DS2 enables users to effectively manipulate complex data and efficiently manage programming through additional data types, programming structure elements, user-defined methods, and shareable packages, as well as threaded execution. This tutorial is based on our experience getting started with DS2 and learning to use it to access, manage, and share data in a scalable and standards-based way. It helps SAS users of all levels get started with DS2 and understand its basic functionality by practicing its features.
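A minimal DS2 sketch showing the object-oriented structure the tutorial introduces, with a user-defined method (dataset and variable names are illustrative):

```sas
proc ds2;
   data work.out / overwrite=yes;
      declare double bmi;

      /* User-defined method: reusable within this data program. */
      method bmi_calc(double wt_kg, double ht_m) returns double;
         return wt_kg / (ht_m * ht_m);
      end;

      /* RUN is the implicit per-row method, akin to a DATA step. */
      method run();
         set work.vitals;
         bmi = bmi_calc(weight, height);
      end;
   enddata;
   run;
quit;
```

Unlike the classic DATA step, logic lives in declared methods, which is what makes DS2 packages shareable across programs.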
HT06 : Introduction to Interactive Drill Down Reports on the Web
Michael Sadof, MGS Associates, Inc.
Louis Semidey, The Semidey Group
Tuesday, 1:15 PM - 2:45 PM, Location: Palani
HT07 : Using INFILE and INPUT Statements to Introduce External Data into SAS®
Andrew Kuligowski, HSN
Monday, 1:15 PM - 2:45 PM, Location: Palani
SAS® has numerous capabilities to store, analyze, report, and present data. However, those features are useless unless that data is stored in, or can be accessed by SAS. This presentation is designed to review the INFILE and INPUT statements. It has been set up as a series of examples, each building on the other, rather than a mere recitation of the options as documented in the manual. These examples will include various data sources, including DATALINES, sequential files, and CSV files.
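A small sketch of the CSV case covered in the presentation (the filename and layout are assumptions for illustration):

```sas
/* Read a comma-separated file with a header row.  DSD honors
   quoted fields and treats consecutive commas as missing;
   TRUNCOVER prevents short lines from wrapping to the next row. */
data subjects;
   infile 'subjects.csv' dsd firstobs=2 truncover;
   input usubjid :$20. age sex :$1. visitdat :yymmdd10.;
   format visitdat yymmdd10.;
run;
```

The colon modifiers read delimited values with informats without forcing fixed columns.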
HT08-SAS : Creating Multi-Sheet Microsoft Excel Workbooks with SAS®: The Basics and Beyond Part 2
Vince Delgobbo, SAS
Monday, 3:30 PM - 5:00 PM, Location: Palani
This presentation explains how to use Base SAS®9 software to create multi-sheet Excel workbooks. You learn step-by-step techniques for quickly and easily creating attractive multi-sheet Excel workbooks that contain your SAS® output using the ExcelXP Output Delivery System (ODS) tagset. The techniques can be used regardless of the platform on which SAS software is installed. You can even use them on a mainframe! Creating and delivering your workbooks on-demand and in real time using SAS server technology is discussed. Although the title is similar to previous presentations by this author, this presentation contains new and revised material not previously presented.
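The core ExcelXP pattern can be sketched in a few lines (file name and sheet names are illustrative):

```sas
/* Two worksheets in one workbook via the ExcelXP ODS tagset. */
ods tagsets.excelxp file='demo.xls' style=minimal
    options(sheet_name='Class');
proc print data=sashelp.class;
run;

ods tagsets.excelxp options(sheet_name='Cars');
proc print data=sashelp.cars(obs=10);
run;

ods tagsets.excelxp close;
```

Each procedure's output lands on its own worksheet, and because the tagset emits XML rather than native binary Excel, it works on any platform where SAS runs, including the mainframe.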
Healthcare Analytics

HA01 : Statistical Analyses Across Overlapping Time Intervals Based on Person-Years
John Reilly, Dataceutics, Inc.
John R Gerlach, Dataceutics, Inc.
Tuesday, 1:45 PM - 2:05 PM, Location: Oceans Ballroom 2
Consider an ongoing observational study consisting of subjects who take a drug to treat a disease that causes serious health problems, such as renal failure, or even death. A scheduled analysis produces a report showing several items of interest (e.g., incidence rate) across overlapping time intervals that are based on person-years, that is, a subject's exposure to the therapy. Consequently, a subject may contribute to multiple time intervals. Although a subject might have more than one event, only the earliest event is used. Conversely, a subject might not have any events, yet still contributes to the overall person-years for the appropriate time intervals. This paper explains an intuitive method for augmenting the analysis data set so that the overlapping time intervals are represented accordingly.
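The augmentation idea can be sketched as outputting one record per overlapping interval a subject's exposure reaches (dataset, variable names, and cutpoints are assumptions for illustration):

```sas
/* EXPOSURE: one row per subject with total exposure in years
   (expyrs) and years to earliest event, if any (evtyr). */
data py_long;
   set exposure;
   array cut{3} _temporary_ (1 3 5);   /* 0-1, 0-3, 0-5 yr windows */
   do i = 1 to dim(cut);
      if expyrs > 0 then do;
         interval = cut{i};
         /* person-years this subject contributes to the window */
         py = min(expyrs, cut{i});
         /* only the earliest event, and only if inside the window */
         event = (not missing(evtyr) and evtyr <= cut{i});
         output;
      end;
   end;
run;
```

Summing py and event by interval then yields incidence rates per interval, with subjects correctly contributing to every window they overlap.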
HA02 : Using SAS® to Analyze the Impact of the Affordable Care Act
John Cohen, Advanced Data Concepts, LLC
Tuesday, 3:30 PM - 3:50 PM, Location: Oceans Ballroom 2
The Affordable Care Act that is being implemented now is expected to fundamentally reshape the health care industry. All current participants -- providers, subscribers, and payers -- will operate differently under a new set of key performance indicators (KPIs). This paper uses public data and SAS® software to illustrate an approach to creating a baseline for the health care industry today so that structural changes can be measured in the future to establish the impact of the new laws.
HA03 : Now You See It, Now You Don't -- De-Identifying Data to Support Clinical Trial Data Transparency Activities
Dave Handelsman, d-Wise, Inc.
Tuesday, 2:15 PM - 3:05 PM, Location: Oceans Ballroom 2
Clinical trial data transparency initiatives will only be successful if the data being shared is properly de-identified in order to protect patient confidentiality and comply with national regulations, while still supporting investigation and analysis. In this emerging area, however, the rules regarding clinical trial data de-identification can be confusing, open to interpretation and difficult to understand. At a basic level, de-identification means that someone reviewing trial data should not be able to match an individual patient's data to a real-life individual. This means not only obfuscating patient ID information (patient number and site number, for example), but also masking all dates, eliminating references to sensitive terms like "HIV", and a wide variety of additional, and often confusing, rules. All of these data modifications must be done in such a way that the clinical trial data can still be successfully analyzed on its own, or when combined with additional trial data. To further complicate matters, this additional trial data may frequently be provided by multiple biopharmaceutical companies. Many companies actively engaged in clinical trial data transparency initiatives are using SAS to perform de-identification. Additionally, they have published their individual de-identification strategies in order for patients to understand how their confidentiality will be protected, and to inform researchers how they will need to prepare to analyze the data. This paper will review the company strategies and the various SAS approaches to de-identification in use today.
HA04 : Medication Adherence in Cardiovascular Disease: Generalized Estimating Equations in SAS®
Erica Goodrich, Priority Health
Daniel Sturgeon, Priority Health
Tuesday, 4:00 PM - 4:20 PM, Location: Oceans Ballroom 2
Medication non-adherence has been estimated to generate $290 billion annually in medical costs (NEHI, 2009). Not only is this an extremely expensive dilemma, it also has consequences including worsening conditions and increased hospitalizations and readmissions. Recent medical evidence has led to an increased interest in investigating individuals' adherence to cardiovascular disease-related medications. Data on subjects taking cardiovascular medications were sampled from a regional health plan's medical databases. Binary data using a 70% possession ratio (PR) cutoff for determining medication adherence were analyzed over semi-quarterly time frames. Results of Generalized Estimating Equation (GEE) models are explained through SAS® output.
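A GEE for repeated binary adherence outcomes of the kind described here might be sketched with PROC GENMOD (dataset and variable names are illustrative, and the working correlation shown is one of several reasonable choices):

```sas
/* Logistic GEE: one row per subject per time period, with an
   exchangeable working correlation across periods. */
proc genmod data=adhere descending;
   class subjid period trt;
   model adherent = trt period / dist=binomial link=logit;
   repeated subject=subjid / type=exch corrw;
run;
```

The REPEATED statement is what turns an ordinary logistic model into a GEE, accounting for within-subject correlation across the semi-quarterly time frames.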
HA05 : A General SAS® Macro to Implement Optimal N:1 Propensity Score Matching Within a Maximum Radius
Kathy Fraeman, Evidera
Tuesday, 4:30 PM - 4:50 PM, Location: Oceans Ballroom 2
A propensity score is the probability that an individual will be assigned to a condition or group, given a set of covariates when the assignment is made. For example, the type of drug treatment given to a patient in a real-world setting may be non-randomly based on the patient's age, gender, geographic location, overall health, and/or socioeconomic status when the drug is prescribed. Propensity scores are used in observational studies to reduce selection bias by matching different groups based on these propensity score probabilities, rather than matching patients on the values of the individual covariates. Although the underlying statistical theory behind propensity score matching is complex, implementing propensity score matching with SAS® is relatively straightforward. An output data set of each patient's propensity score can be generated with SAS using PROC LOGISTIC, and a generalized SAS macro can do optimized N:1 propensity score matching of patients assigned to different groups. This paper gives the general PROC LOGISTIC syntax to generate propensity scores, and provides the SAS macro for optimized propensity score matching. A published example of the effect of comparing unmatched and propensity score matched patient groups using the SAS programming techniques described in this paper is presented.
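The PROC LOGISTIC step the paper describes for generating propensity scores can be sketched as follows (covariate and dataset names are illustrative):

```sas
/* Model the probability of treatment assignment from baseline
   covariates; PS_OUT gets one propensity score per patient. */
proc logistic data=cohort descending;
   class sex region / param=ref;
   model treated = age sex region comorbid;
   output out=ps_out pred=ps;   /* ps = P(treated | covariates) */
run;
```

The matching macro then pairs treated and untreated patients whose ps values fall within the chosen maximum radius.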
HA06 : The Path To Treatment Pathways
Tracee Vinson-Sorrentino, IMS Health
Tuesday, 5:00 PM - 5:20 PM, Location: Oceans Ballroom 2
Refills, switches, restarts, and continuation are valuable and necessary metrics when analyzing pharmaceutical treatment paths. Calculating duration of therapy by product and the length of time between products, aka "therapy starts," is an important step toward those metrics. Whether as stand-alone metrics or as data points along the way to a bigger picture, they are integral business needs for pharmaceutical clients. The very nature of rx claims gives us many overlapping incidents of therapy, and those overlapping dates can skew the duration-of-therapy calculation if it is done incorrectly. This paper will show one very successful way of calculating duration of therapy and length of time between products. Additionally, as a prelude to that process, it will also show how to make SAS tell you what value you need for the arrays that will go into the steps, and how to force SAS to put that value right into the array code.
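One common way to handle the overlapping-claims problem described here is to collapse claims into continuous episodes before computing duration; a sketch under assumed names (claims with fill_dt and days_supply per patient and product):

```sas
proc sort data=claims;
   by patid product fill_dt;
run;

data episodes;
   set claims;
   by patid product;
   retain ep_start ep_end;
   format ep_start ep_end date9.;
   supply_end = fill_dt + days_supply - 1;
   if first.product or fill_dt > ep_end + 1 then do;
      /* gap found: close the prior episode, start a new one */
      if not first.product then output;
      ep_start = fill_dt;
      ep_end   = supply_end;
   end;
   else ep_end = max(ep_end, supply_end);   /* extend on overlap */
   if last.product then output;
run;
```

Duration of therapy is then ep_end - ep_start + 1 per episode, and the gap between consecutive episodes gives the time between therapy starts.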
HA07 : Distributed data networks: A paradigm shift in data sharing and healthcare analytics
Jennifer Popovic, Harvard Pilgrim Health Care Institute
Tuesday, 1:15 PM - 1:35 PM, Location: Oceans Ballroom 2
Administrative claims data are rich sources of information that are used to inform study topics ranging from public health surveillance to comparative effectiveness research. Data sourced from individual sites can be limited in their scope, coverage and statistical power. Sharing and pooling data from multiple sites and sources, however, present administrative, governance, analytic and patient privacy challenges. Distributed data networks represent a paradigm shift in healthcare data sharing and are evolving at a critical time when 'big data' and patient privacy are often competing priorities. A distributed data network is one for which no central repository of data exists. Rather, data are maintained by and reside behind the firewall of each data-contributing partner in a network, who transform their source data into a common data model and permit indirect access to those data through the use of a standard query approach. Transformation of data to a common data model ensures that standardized applications, tools and methods can be applied to them. This paper focuses on the experiences of the Mini-Sentinel project as a case study for the successful design and implementation of a multi-site distributed data network. Mini-Sentinel is a pilot project sponsored by the U.S. Food and Drug Administration (FDA) to create an active surveillance system - the Sentinel System - to monitor the safety of FDA-regulated medical products. This paper showcases some flexible, scalable and reusable SAS-based open source analytic tools built and maintained by Mini-Sentinel collaborators that facilitate the analysis of data for epidemiologic studies.
HA09 : Simple Tests of Hypotheses for the Non-statistician: What They Are and Why They Can Go Bad
Art Carpenter, CA Occidental Consultants
Wednesday, 9:00 AM - 9:50 AM, Location: Oceans Ballroom 2
Hypothesis testing is a central component of most statistical analyses. The focal point of these tests is often the significance level, but what does this value really mean, and how can we effectively use it? Perhaps more importantly, what are the pitfalls and dangers in its interpretation? As we conduct statistical tests of hypotheses, are there other things that we should be doing or looking out for that will aid our decision making? After years of academic study, professional statisticians spend additional years gaining practical knowledge and experience so that they can correctly conduct, and correctly interpret the results of, statistical tests. You cannot gain these insights overnight; however, this tutorial provides a basic introduction to the concepts of hypothesis testing, as well as what you should look for, and look out for, while conducting statistical tests of hypotheses.
Industry Basics
IB01 : The 5 Most Important Clinical Programming Validation Steps
Brian Shilling, inVentiv Health Clinical
Monday, 8:00 AM - 8:20 AM, Location: Oceans Ballroom 9
The validation of a SAS programmer's work is of the utmost importance in the pharmaceutical industry. Because the industry is governed by federal laws, SAS programmers are bound by a very strict set of rules and regulations. Reporting accuracy is crucial as these data represent people and their lives. This presentation will give the 5 most important concepts of SAS programming validation that can be instantly applied to everyday programming efforts.
IB02 : Use of SAS Reports for External Vendor Data Reconciliation
Soujanya Konda, inVentiv Health Clinical
Monday, 8:30 AM - 8:50 AM, Location: Oceans Ballroom 9
As the adoption of industry data standards grows, organizations must streamline their processes for managing data effectively and with quality, often by using various techniques. The main objective of Data Management (DM) is to deliver a quality database to the SAS programming and statistical analysis teams in a timely manner, which in turn helps generate bug-free reports. The ultimate challenge is managing the third-party vendor data loaded into the database; our aim is to reconcile this vendor data (lab data, SAE data) with the related data present in our database. The various challenges, along with efficient techniques for an optimized process that avoids a great deal of manual effort, are discussed further in this paper.
IB03 : Tackling Clinical Lab Data in Medical Device Environment
Juan Wu, Medtronic Inc
Min Lai, Medtronic Inc
Monday, 9:00 AM - 9:20 AM, Location: Oceans Ballroom 9
Working with clinical laboratory data can be among the most challenging tasks for SAS® programmers when creating analysis datasets (AD) and tables, figures, and listings (TFL). Lab results are usually the largest dataset among clinical data. There are many different tests, and the same lab test can have different lab units. Many multi-centered medical device clinical studies use local labs, which result in the complication of lab units. Therefore the number of conversion factors adds up quickly. In order to tackle the complicated lab data, we developed a lab data manipulation process, which sets up a central lab test library and a lab unit conversion library to help improve proficiency in programming for AD and for TFL. This paper will illustrate this process in detail. The sample lab test library and lab unit conversion library will be provided as well. We intend to reduce the complexity of coding and enable the trained clinical staff to maintain the lab test and lab unit libraries.
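The lab unit conversion library described above is essentially a lookup table keyed by test and unit pair. A minimal Python sketch of the idea (the factors, test names, and function are illustrative assumptions, not taken from the paper's actual libraries, which are maintained in SAS):

```python
# Hypothetical conversion library: (test, from_unit, to_unit) -> factor.
# In practice this table would be maintained by trained clinical staff.
CONVERSION_FACTORS = {
    ("GLUCOSE", "mg/dL", "mmol/L"): 0.0555,    # illustrative factor
    ("CREATININE", "mg/dL", "umol/L"): 88.4,   # illustrative factor
}

def standardize(test, value, from_unit, to_unit):
    """Convert a local-lab result to the standard unit via the library."""
    if from_unit == to_unit:
        return value
    try:
        return value * CONVERSION_FACTORS[(test.upper(), from_unit, to_unit)]
    except KeyError:
        # Surface missing factors explicitly rather than silently passing
        # through unconverted values - the failure mode the abstract warns of.
        raise ValueError(f"No conversion factor for {test}: {from_unit} -> {to_unit}")

print(standardize("glucose", 100, "mg/dL", "mmol/L"))
```

Centralizing the factors in one table is what keeps the per-study conversion code from multiplying as new local labs and units are added.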
IB04 : SAS Grid : Simplified
Rajinder Kumar, Inventiv International Pharma Services Pvt. Ltd.
Monday, 9:30 AM - 9:50 AM, Location: Oceans Ballroom 9
For most organizations, a huge volume of data means big trouble, because it greatly impacts the speed and accuracy of data analysis. Given the demand for a reliable environment and faster response times with huge data volumes and large numbers of users in the same organization, SAS Grid is in great demand. This paper sheds light on some common features of SAS Grid. It also covers some limitations of SAS Grid and alternate solutions for those limitations, sharing examples of the limitations and their workarounds. For example, some SAS procedures can be difficult to use in a SAS Grid environment, and a few options or features that work fine in SAS 9.2 may need slight modification before being used on SAS Grid. This paper provides details about these problems, their root causes, and their possible solutions (if any). As the use of SAS Grid becomes more popular, these will help end users find answers to their most common questions.
IB06 : Two different use cases to obtain best response using RECIST 1.1 in SDTM and ADaM
Tom Santopoli, Accenture
Kevin Lee, Accenture
Vikash Jain, Accenture
Monday, 10:45 AM - 11:35 AM, Location: Oceans Ballroom 9
Each therapeutic area has its own unique data collection and analysis; oncology in particular has a unique way to collect and analyze data, and one of the unique data points in an oncology study is best response. This paper, based on solid tumors and RECIST 1.1, shows use cases for how best response is collected in SDTM domains and derived in ADaM datasets in a solid tumor oncology study. The paper provides a brief introduction to RECIST 1.1, including lesion types (i.e., target, non-target, and new) and their selection criteria (e.g., size and number). It offers a practical application of how tumor measurements for target and non-target lesions are collected in the TR domain, how those measurements are assessed according to RECIST 1.1, and how responses based on those assessments are ultimately represented in the RS domain. We also present a pictorial road map of the path chosen to derive responses, giving the reader perspective on the process from start to end objective. This paper also discusses a use case in which visit-level responses are derived programmatically in ADaM and a sensitivity analysis is performed comparing them against the investigator-provided visit-level responses in the SDTM RS domain. This case study will help users identify the differences between the two methodologies and resolve anomalies between investigator inference and the programmer's analytical calculations.
IB07 : The Disposition Table - Make it Easy
Endri Endri, ProXpress Clinical Research GmbH
Benedikt Trenggono, ProXpress Clinical Research GmbH
Wednesday, 8:00 AM - 8:50 AM, Location: Oceans Ballroom 4
The disposition table is one of the most complex tables in statistical programming, because it provides an overview of clinical data and events from numerous sources. Disposition tables include (but are not limited to) overviews of patients who completed the various study epochs, and the number of adverse events and serious adverse events related to study medication. Some disposition tables require the merging of several datasets, and there are often other complexities, such as the requirement to apply imputation rules, which must be taken into account. These complexities lead to an increased workload and more time spent on programming the table. This presentation explains the steps required to create a disposition table using SAS as a labor- and time-saving tool, not just to produce the table but to generate the table-creation code automatically based on the data specifications as well as the data itself. The idea behind it is the retrieval of the content and structure of SAS datasets using simple text-comparison algorithms. For a better understanding of the algorithms, the presentation includes simple examples with figures and explanations, as well as step-by-step instructions for automated programming with practical examples of disposition tables. The purpose of the paper is to present ideas on how the manual programming process can be simplified by the use of artificial intelligence, the aim being to make the life of the statistical analyst easier and to provide more time for validation and review with less time spent on writing programs.
IB08 : Improving Data Quality - Missing Data Can Be Your Friend!
Julie Chen, Clinical Programmer, UBC, PA
Wednesday, 9:00 AM - 9:20 AM, Location: Oceans Ballroom 4
In daily clinical programming work, we frequently encounter missing data. Missing data can be found anywhere: in a dataset, a listing, or a table. It can be generated at every step, from data collection to SDTM/ADaM dataset creation to table/listing production. In some cases, the existence of missing data is a sign of poor data quality. However, if we find missing data and determine where it comes from, it can be a very good source of information and actually help us improve data quality. This paper will demonstrate how missing data can be generated, where it is most likely to occur, and some strategies that can help us find it. Most importantly, missing data can be informative, and investigating it can improve data quality.
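A first step in the strategies the abstract mentions is simply tallying where missing values occur. A small Python sketch of that idea (the variable names, demo records, and the definition of "missing" are illustrative assumptions, not from the paper):

```python
def missing_summary(records, missing_values=(None, "", ".")):
    """Tally missing values per variable across a list of record dicts.

    Mirrors the common pre-SDTM/ADaM check of counting blanks per variable
    so that unexpected gaps surface before they propagate to tables.
    """
    counts = {}
    for rec in records:
        for var, val in rec.items():
            if val in missing_values:
                counts[var] = counts.get(var, 0) + 1
    return counts

# Hypothetical adverse-event records with two kinds of missingness.
demo = [
    {"USUBJID": "001", "AESTDTC": "2015-01-10", "AESEV": ""},
    {"USUBJID": "002", "AESTDTC": None, "AESEV": "MILD"},
]
print(missing_summary(demo))  # {'AESEV': 1, 'AESTDTC': 1}
```

Knowing *which* variables carry the gaps, and at which production step they appear, is what turns missing data into the quality signal the abstract describes.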
IB09 : Tips on creating a strategy for a CDISC Submission
Rajkumar Sharma, Nektar Therapeutics
Wednesday, 9:30 AM - 9:50 AM, Location: Oceans Ballroom 4
A submission to the FDA for an NDA (New Drug Application) or a BLA (Biologics License Application) is very exciting as well as very critical for a company. Companies and programmers dealing with a submission for the first time are often confused about how to develop a strategic plan and how to achieve a successful filing. A clear strategy and proper planning are critical for a timely submission. This paper will discuss some basics of electronic submissions and then provide guidance on how to plan for a successful filing. It will also discuss what to look out for while working with an outside vendor preparing electronic-submission deliverables.
IB11 : PROC COMPARE: A Wonderful Procedure!
Anusuiya Ghanghas, Inventiv International Pharma Services Pvt. Ltd.
Rajinder Kumar, InVentiv International Pharma Services Pvt. Ltd
Wednesday, 10:15 AM - 11:05 AM, Location: Oceans Ballroom 4
In the pharmaceutical industry, we work with huge volumes of data. Sometimes we want to cross-check whether the data are as per specification, or, after making modifications, whether only the desired changes were made. Here one of SAS's wonderful procedures, PROC COMPARE, comes into the picture. It compares two datasets for equality or difference with respect to the number of observations, number of variables, dataset label, variable attributes, and finally the values of those variables. Used properly, this procedure can be a great help; used improperly, it can be quite problematic. Every SAS programmer is happy to see "NOTE: No unequal values were found. All values compared are exactly equal." in the PROC COMPARE output. But getting this message is not enough; many other things also need to be checked before the comparison can be considered complete. This paper helps in understanding everything else PROC COMPARE reports and, when there are mismatches, how to handle them so that the above message reflects a full comparison.
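The checks the abstract lists - matching variables, observation counts, and values - can be illustrated with a small Python analog (a sketch for intuition only; it reproduces none of PROC COMPARE's actual options or output format, and the function name is hypothetical):

```python
def compare_datasets(base, compare):
    """Report structural and value differences between two 'datasets'
    (lists of dicts), loosely mirroring what PROC COMPARE checks."""
    report = []
    base_vars = set(base[0]) if base else set()
    comp_vars = set(compare[0]) if compare else set()
    # Structural checks first: variables and observation counts.
    if base_vars - comp_vars:
        report.append(f"Variables only in base: {sorted(base_vars - comp_vars)}")
    if comp_vars - base_vars:
        report.append(f"Variables only in compare: {sorted(comp_vars - base_vars)}")
    if len(base) != len(compare):
        report.append(f"Observation counts differ: {len(base)} vs {len(compare)}")
    # Then value-by-value comparison on the shared variables.
    for i, (b, c) in enumerate(zip(base, compare)):
        for var in sorted(base_vars & comp_vars):
            if b.get(var) != c.get(var):
                report.append(f"Obs {i + 1}, {var}: {b.get(var)!r} != {c.get(var)!r}")
    if not report:
        report.append("NOTE: No unequal values were found.")
    return report
```

As the abstract stresses, the "no unequal values" note alone is not enough: differing variable sets or observation counts must be examined as well, which is why the sketch reports them separately from value mismatches.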
Management & Support
MS01 : It's Not That You Know It, It's How You Show It
Jim Christine, dataceutics.com
Monday, 8:00 AM - 8:20 AM, Location: Oceans Ballroom 2
Without a doubt, technical skills are a foundation for success in business. Effective communication is a consistently growing need within the technical community. The way ideas are related, both one-on-one and in groups, can have significant impact on the success of any project. It is not merely about the quantity of the communication, nor the accuracy of the content, but the manner by which the interaction takes place. Electronic mail, teleconferencing, and instant messaging remove important ingredients from the communication process. While many find it sufficiently intimidating to speak in person before an audience of any size, it may be more daunting to present a complex set of ideas without the benefit of facial expressions, voice intonation, or hand gestures. How something is said is oftentimes more important than what is actually said. This paper, and the accompanying presentation, will review effective ways to share information with individuals and project teams. It includes the common aspects of in-person and long-distance communication and provides methods to overcome the challenges of electronic mail, teleconferencing, and instant messaging. The paper will be useful for management at any level, project and technical leads, and individual contributors.
MS02 : Are You An Indispensable SAS Programmer ?
R. Mouly Satyavarapu, inVentiv Health Clinical
Monday, 9:30 AM - 9:50 AM, Location: Oceans Ballroom 2
Job security is a serious concern for many SAS® programmers in today's workforce, especially knowing that many positions are either being outsourced to low-cost regions like China and India or being cut or revamped. Does that mean working to become an indispensable SAS® programmer is the solution to this concern? To become indispensable, many programmers try to: monopolize a particular skill; be willing to go the extra mile; have a good attitude; be a thought leader; stay current with technology and trends; be a team player; offer solutions; be committed; do the work that matters most, not the work that's easy; continually improve oral and written communication skills; and be consistently reliable and trustworthy. I wholeheartedly agree that each of us should become the best we can be, that our work should be developed and refined to the point that it's viewed as an art, and that we should be seen as the contributor or architect behind the product. By all means, be a great employee and always seek to become a more valuable contributor, but we shouldn't delude ourselves into thinking that we are "the only or the best" contributor on the team. In this paper, I will outline the perspectives of the programmer, the manager, and the organization on being an indispensable SAS® programmer, and for each perspective I will call out the associated challenges for the programmer, the manager, and the organization.
MS03 : How to harness your company's Wiki for world domination (or even better write the best reusable code possible.)
Kurtis Cowman, PRA Health Sciences
Monday, 8:30 AM - 9:20 AM, Location: Oceans Ballroom 2
How can a group of laymen be smarter than the experts? From betting markets to Wikipedia, every day we see real-world examples of how large groups that are properly motivated and regulated can accomplish nearly impossible tasks. On "Who Wants to Be a Millionaire," why was the "Ask the Audience" option statistically correct more often than the "Ask an Expert" option? Yet daily we ask only our most senior staff, our experts, to design macros and reusable code without properly evaluating whether their effort is indeed the most successful. This paper will show you how, by utilizing a company-accessible web application such as a wiki, together with the proper management guidance, you can harness the knowledge of your global programming staff from top to bottom to quite literally write the best reusable code your company has to offer.
MS04 : No Regrets: Hiring for the Long Term in Statistical Programming
Chris Moriak, AstraZeneca
Graham Wilson, AstraZeneca
Elizabeth Meeson, AstraZeneca
Monday, 10:15 AM - 11:05 AM, Location: Oceans Ballroom 2
Recruiting statistical programmers for lead positions who can be subject matter experts, plan work, and direct teams is a formidable task. Finding the right combination of programming experience, clinical development knowledge, and leadership can prove elusive. Even after finding candidates, making an accurate hiring decision can be difficult. In a widely quoted study on recruiting by the Recruiting Roundtable, 50% of hirings are regretted by either the employer or the new employee within just six months. With the average opening taking 58 work days to fill at companies of more than 5,000 employees, can a late-phase pharmaceutical company better its odds of making the right selection over just flipping a coin? Improving one's odds means creating a better assessment process for both the company and the candidate. The company must assess whether a candidate meets minimum qualifications, has the capability to perform the job, and fits the business's culture. To accomplish this for statistical programming, the hiring process needs to assess six capability categories: Programming, Clinical Development, Leadership, Delivery Skills, Functional Technology, and Communication. Through coordinated questions at specific screening stages, a company is better able to assess a candidate's capabilities while giving the candidate better insight into job expectations and company culture. This paper will present the selection process being implemented to significantly reduce the chances of regret at a large, late-phase pharmaceutical company.
MS05 : Development of a Clinical SAS University training program in Eastern Europe
Erfan Pirbhai, Experis Clinical
Sergey Glushakov, Intego Group
Donnelle Ladouceur, Experis Clinical
Monday, 1:15 PM - 2:05 PM, Location: Oceans Ballroom 2
Many specialized positions require training above and beyond standard University degree programs for even entry-level work. We are starting to see various industries turning to Universities for specialized training to develop their workforce. To build on the success of our programming team in Eastern Europe and establish a pool of resources for the future, we have developed and implemented a Clinical SAS training program at a University local to our offshore office. The program enrolled students in the Mathematics graduate program. Curriculum was developed to include SAS, Statistics, Clinical Development, English, and Clinical SAS programming - culminating in SAS certification, Internship opportunities, and client placement. A varied approach was utilized for instruction including traditional text and classroom instruction but also overseas interactive web-ex presentations, optional Hot Topic sessions, social media groups, and senior team members as mentors. This presentation will take the audience through the process and sometimes creative solutions we took to make this program a reality - from the collaboration with the University and recruitment/selection of students, to the outcome for the inaugural class. We believe this type of training program could be successfully utilized anywhere but that strong partners and collaboration are critical. This topic is thought to be of interest to those in management as well as programming and Biostatistics with no particular level of skill or background required.
MS06 : Advantages and Disadvantages of Two Commonly Used CRO Resourcing Models in the Pharmaceutical SAS Programming Environment
Evelyn Guo, AMGEN
Mark Matthews, GCE Solutions
Monday, 2:15 PM - 3:05 PM, Location: Oceans Ballroom 2
In the current pharmaceutical and biotech industry, outsourcing data services to a CRO remains common. Two models, the "Deliverable-Based Model" and the "Full-Time Equivalent (FTE) Time and Materials Model" (also called the role-based model), are widely used today. A previous PharmaSUG presentation (PharmaSUG 2012, Satyavarapu, "A Comparison of Two Commonly Used CRO Resourcing Models for SAS/Statistical Programmers") identified the differences between these two resourcing models, specifically from a CRO point of view. Our paper will emphasize the advantages and disadvantages from a sponsor point of view and discuss how these variances affect project work and collaboration. By comparing the two models, this paper will provide valuable considerations for sponsors choosing the CRO resourcing model that best fits, and will share ways to bridge the gaps that exist in the models. The two authors have worked in the pharmaceutical environment for a total of 15 years and the CRO environment for 9 years, in individual contributor, team lead, and supervisory roles in statistical programming. They will share experiences encountered while working with multiple pharmaceutical companies and CROs.
MS07 : How to Build and Manage a Successful and Effective Offshore FSP Service for US Sponsors out of India
Ganesh Gopal, Ephicacy Consulting Group
Debiprasad Roy, Medivation Inc.
Monday, 3:30 PM - 3:50 PM, Location: Oceans Ballroom 2
Over the last few years, the Functional Service Provider (FSP) model has been a successful, sought-after approach, adding a new dimension to the relationship between a sponsor and a service provider. FSP has become a significant byword in the data management, statistical programming, and biostatistics value segments. The model is gaining momentum because the client - be it a sponsor or a CRO - gains the advantage of a flexible workforce while keeping central control through an outcome-based approach. With increasing cost pressures, an FSP model based out of a low-cost center such as India, if effectively managed and executed, can provide access to a qualified global workforce that benefits the entire sponsor-CRO value chain. Through a case study exploring the successful collaboration between a sponsor and a service provider, we outline the machinery of this model, which has worked well for the past 3 years and is experiencing a growth path reinforced by key management methods. In the paper we describe the challenges and emphasize the strategies adopted individually and collaboratively by both companies in shaping the protocol of engagement. The perspectives of both the sponsor and the service provider are juxtaposed for retrospective analysis and prospective planning.
MS08 : How to build an "offshore" team with "onshore" quality
Lulu Swei, PRA Health Sciences
Monday, 4:00 PM - 4:20 PM, Location: Oceans Ballroom 2
In today's competitive global market, many pharmaceutical companies and CROs are building offshore clinical programming teams for various reasons. Oftentimes, we hear of offshore outsourcing deals gone bad. Over the past year and a half, PRAHS has successfully built a clinical programming team through a joint venture with WuXi AppTec in China. We currently have clinical programming teams in Shanghai, Beijing, and Wuhan. They have truly become an extension of our global programming team, enabling us to provide around-the-clock service for our sponsors. This paper will discuss the challenges we faced, and our solutions, throughout the process in the following areas: 1. Recruitment and retention; 2. Training and certification; 3. Global and local management support; 4. Resourcing process and policy; 5. Project governance.
MS09 : Managing a Remote Workforce: Making Sense Out of Today's Business Environment
Dave Polus, Experis ManpowerGroup
Tuesday, 8:00 AM - 8:50 AM, Location: Oceans Ballroom 2
Telecommuting has become a given facet of our working environment. The author remembers a time not so long ago (OK, it WAS a long time ago) when a department was located together in the same office, and "working from home" wasn't in our vernacular. Technology has allowed workforces to become virtual; the past 10-15 years have seen an exponential increase in the number of programmers who rarely, if ever, see the inside of an office building. While this is a positive from almost every aspect, it also creates some interesting management opportunities. This paper will attempt to dissect telecommuting, from its origins and evolution, to our current state, and where we're headed as an industry.
MS10 : An Analysis of Clinical Programmer and Other Technical Job Descriptions: Lessons Learned for Improved Employment Postings
Troy Hughes, No Affiliation
Monday, 4:30 PM - 5:20 PM, Location: Oceans Ballroom 2
The pharmaceutical and clinical trials industry employs a host of technical and non-technical personnel, from study directors, principal investigators, and managers to physicians, marketers, and accountants, to clinical programmers, biostatisticians, data analysts, and auditors. While not all industry positions require advanced programming or data manipulation skills, many do in varying degrees, and SAS has underpinned and advanced the clinical trials industry for decades. Competent, tech-savvy personnel stand at the heart of any successful study and, to this end, job descriptions that most accurately capture and portray job requirements and responsibilities will attract a more successful workforce. Despite the highly technical nature of SAS-centric jobs in the clinical trials industry, employment postings for these positions often use nomenclature that omits basic software engineering concepts such as quality assurance, quality control, user acceptance testing, functional testing, development and production environments, or descriptions of the software development life cycle (SDLC) or project management methodologies that are espoused. This text describes an analysis of over 3,000 clinical trials and pharmaceutical technical positions that utilize SAS, posted on Indeed.com during the six months between July and December 2014, and compares the language of these postings with over 20,000 non-pharmaceutical object-oriented programming (OOP) job descriptions over the same time period. It concludes with recommendations that clinical trials positions requiring advanced SAS software development or data analytic skills should better articulate hard programmatic skills in employment postings, including non-programmatic functions of the SDLC such as code review, code testing, code propagation, and documentation.
MS11 : Schoveing Series 1: Motivating and Inspiring Statistical Programmers in the Biopharmaceutical Industry
Priscilla Gathoni, Manager, Statistical Programming
Tuesday, 9:30 AM - 9:50 AM, Location: Oceans Ballroom 2
Is it possible, as a people manager, to arouse enthusiasm among the associates who work with you? What creates passion, enjoyment, acceptance, and a love for the work we do in analyzing clinical trial data? Is it possible to wake up and say, "I love my job!"? This paper explores seven habits that a people manager can adopt, and also instill in statistical programmers, to create a healthy and balanced work life. Each habit explores ways in which effective results can be achieved through a change in human behavior and the thought process. The paper shows that bad habits can be broken and good habits formed. We can choose the right actions, which in turn influence our reactions. We have dominion over our thoughts, feelings, and life - making conscious choices is inevitable if you want to change. The legacy and attitude we carry at work transcend to our homes, children, and family. Through learning, teaching, and applying these habits, both the people manager and the statistical programmer can be assured of a lifetime of harmony, peace, and a strong character rooted in daily self-improvement. Embrace proactive, helpful, and effective behaviors that will eventually strengthen and build others in a positive and friendly way.
MS12 : "Firefighters" - clinical trial analysis teams
Iryna Kotenko, Experis Clinical A Manpower Group Company
Tuesday, 10:15 AM - 11:05 AM, Location: Oceans Ballroom 2
It is a quite typical situation in our industry: a team approaching the deadline of some reporting event experiences a lack of resources and is forced to work beyond regular hours. The reasons include underestimation during early project planning, unforeseen circumstances, errors in communication, and many others. The consequence is constant stress for all project participants, which affects the quality of work. In this article I discuss the possibility and feasibility of developing a team of dedicated specialists who could be brought into project work for the short term to act as "firefighters," based in a CRO or in the biometric analytical department of a pharmaceutical company. Bringing a new employee into a project at a stage close to the deadline seems troublesome and ineffective, but is it really so? The purpose of this paper is to construct a working model for the so-called "firefighter team" so that its work is effective and it solves the problem of lack of resources on the project. I will also assess the positive and negative aspects of this type of team and the potential demand for this type of service.
MS14 : Steps Required to Achieve Operational Excellence (OpEx) Within Clinical Programming
Opeyemi Ajoje, Accenture Life Sciences
Monday, 11:15 AM - 11:35 AM, Location: Oceans Ballroom 2
Lean, Six Sigma, Shingo, The Toyota Way, and other methods under the umbrella of Operational Excellence (OpEx) have traditionally been applied to manufacturing operations. Other areas of company operations have implemented OpEx with great success: reduced cycle times, fewer quality issues, faster flow of information, and a culture of continuous improvement. Why haven't we heard much about OpEx within clinical programming? Aren't the benefits OpEx can deliver of interest to clinical programming operations - understanding and exceeding customer expectations, improving quality, reducing errors, identifying and driving out waste from processes, simplifying operations to make them easier to manage, and reducing cost, cycle time, and time-to-client? This paper will cover five basic steps required to achieve Operational Excellence in order to manage a clinical programming department effectively.
MS16-SAS : A View Into SAS® Modern Architectures - Adding 'Server' SAS to Your Artifacts
Matt Becker, SAS
Tuesday, 9:00 AM - 9:20 AM, Location: Oceans Ballroom 2
Back in the late 1980s, I worked for a large CRO doing clinical data integration using SAS computing. As a company we used PC SAS connected to a file server in a small office building. Over the years, as SAS technology advanced, our company moved to server SAS with network-attached file storage. In 2006, SAS introduced SAS Grid, a modernized SAS server platform that manages and optimizes jobs processed across multiple servers. Over the past two years, health and life science organizations have been moving to SAS Grid for failover, workload balancing, prioritization, and scalability. In this talk, I will walk through the past, review the present, and look to the future of SAS computing.
MS17 : Panel Discussion: Managing in a Virtual Workplace
Matt Becker, SAS
Kent Letourneau, PRA
Wednesday, 8:30 AM - 10:20 AM, Location: Oceans Ballroom 9
With the high demand for SAS programming resources and the emergence of high-speed internet in homes, more and more companies are allowing their programmers to work remotely from home offices. In many companies it is not uncommon for a programming manager to have staff reporting to them from multiple locations in different parts of the US, or even from multiple countries. This trend in the industry has led to the need to develop new strategies and policies for the management of this virtual workforce. Expert panelists, made up of managers and a programmer with many years of experience either managing remote employees or working as a remote employee, will engage in a lively and informational discussion about the nuances of effective management of remote programmers. Questions and answers will be intertwined within the topical discussions to create a dynamic interaction with the audience.
MS18 : PhUSE CSS Round Table Discussion - Statistical Computing Environment
Mark Matthews, GCE Solutions
Wayne Woo, Novartis Vaccines
Andre Couturier, Novartis
Gina Wood, Novartis
Wednesday, 8:30 AM - 10:20 AM, Location: Oceans Ballroom 10
Many companies use a Statistical Computing Environment (SCE) to assist in the execution, management, and reporting of statistical programming activities. Several commercial products provide this functionality, but many companies develop their own systems and processes. A PhUSE CSS project has been created within the Emerging Trends and Technologies Working Group to explore the expectations for an SCE. Our road map consists of understanding the broad use of various SCEs and identifying the common gaps and best practices, in order to define the requirements and functionality of an optimal SCE. Join us for a discussion around this topic, hear the PhUSE project team's findings, and participate in the discussions leading to the success of our mission.
Posters
PO01 : TEAE: Did I flag it right?
Arun Raj Vidhyadharan, inVentiv Health Clinical
Sunil Mohan Jairath, inVentiv Health Clinical
Everyone knows what a treatment-emergent adverse event (TEAE) is: an event that first appears during treatment, having been absent before, or that worsens relative to the pre-treatment state. However, there are extensions to this definition that vary from study to study and that one might not think of while deriving the flag. In this paper, we examine the various factors that contribute to defining a treatment-emergent adverse event.
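As a minimal illustration of the basic definition above (not the authors' code; the ADaM-style names TRTSDT, ASTDT, and TRTEMFL are hypothetical), the simplest form of the flag just compares the event start date to the treatment start date:

```sas
/* Basic TEAE flag sketch: event started on or after first dose.     */
/* Study-specific extensions (lag windows, worsening severity, etc.) */
/* would layer on top of this.                                       */
data ae_flagged;
   set ae;                            /* one record per adverse event */
   length trtemfl $1;
   if .z < trtsdt <= astdt then trtemfl = 'Y';
   else trtemfl = 'N';
run;
```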
PO02 : Creating a Break in the Axis
Amos Shu, MedImmune LLC
Breaking the vertical and/or horizontal axis can simplify a figure, improve aesthetics, and save space. SAS sample programs 48330 and 38765 provide examples of breaking the vertical axis. However, the ENTRY and DRAWLINE statements can create a much better break.
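A minimal GTL sketch of the approach the abstract names (coordinates, data, and styling are illustrative assumptions, not the authors' code):

```sas
proc template;
   define statgraph breaky;
      begingraph;
         layout overlay;
            scatterplot x=age y=height;
            /* Short white segment masks part of the axis line, */
            /* simulating a break (wall-percent coordinates).   */
            drawline x1=0 y1=35 x2=0 y2=45 /
                     drawspace=wallpercent
                     lineattrs=(color=white thickness=4px);
            /* text marker near the masked segment */
            entry halign=left "//" / valign=center;
         endlayout;
      endgraph;
   end;
run;

proc sgrender data=sashelp.class template=breaky;
run;
```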
PO03 : A Visual Reflection on SAS/GRAPH® History
Haibin Shu, AccuClin Global Services LLC
John He, AccuClin Global Services LLC
The authors use four procedures to reflect upon the different stages of SAS/GRAPH development. These four procedures, widely used by programmers for figures, represent important features that were developed and evolved over a long period of time. A visual reflection based on real-project examples will not only help us better understand these valuable tools but also help promote the use of advanced visual analytics features in a more appealing and accurate way.
PO04 : Automation of paper dossier production for Independent Review Charter
Ting Ma, Pharmacyclics
Jeff Cai, Pharmacyclics
Michelle Zhang, Pharmacyclics
Linda Gau, Pharmacyclics
An independent review charter (IRC) is recommended by the FDA to minimize bias in the interpretation of radiological findings. For oncology clinical trials, paper dossiers documenting subjects' clinical information must be submitted to the IRC for assessment on an ongoing basis whenever pre-defined milestones are reached. When information from the same subject is sent to the IRC more than once, tracked changes may need to be displayed on the paper dossier to assist the data review. Tracking the historical submissions for every subject and generating the final delivery PDF files with highlighted changes can be quite time-consuming. This paper presents an automated process to produce the tracked-changes paper dossiers in batch using SAS 9.3, regardless of each subject's submission history. The automation should greatly improve the efficiency of this programming task for the IRC. This paper is intended for an audience with an intermediate level of SAS programming skills. Key words: Independent Review Charter (IRC), Dynamic Data Exchange (DDE), Visual Basic for Applications (VBA), X command
PO05 : A Web-Based Approach to Fighting Analysis Chaos
Gordon Fancher, Seattle Genetics
Rajeev Karanam, Seattle Genetics
Shawn Hopkins, Seattle Genetics
Review of analyses by cross-functional teams is essential to provide quality output for industry conferences, manuscripts, regulatory submissions, and internal decision-making. Documentation and tracking of reviews are important to ensure all comments are addressed. In addition to comments, documentation typically includes records of the items being produced, team members performing the work, tasks assigned and performed by each team member, the project timeline and progress relative to the timeline, and specifications and metadata pertaining to each item being produced. Datasets, tables, listings, figures, and SAS® macros and other utilities are items that have the potential to be tracked. There is a need to track these analyses over time for multiple purposes. Such purposes include determining common items used to prioritize development of infrastructure, developing standards for timelines and metrics for evaluating workload and resourcing, and analyzing performance on both team and individual team member levels. Both documentation and tracking can be complicated and time consuming, particularly if various portions of the information are captured in different places. This paper describes the design and development process Seattle Genetics' Clinical Programming group used to create a web-based application that can be used for documentation, tracking, output review, and managing all of the information in one place. The paper also discusses various aspects of the application, such as usage by assigned role, support for company processes and business rules, benefits, and plans for enhancement.
PO06 : Programming Pharmacokinetic (PK) Timing and Dosing variables in Oncology studies: Demystified
Kiran Cherukuri, Seattle Genetics
Pharmacokinetic (PK) analysis is a major part of clinical trials, intended to characterize the time course of drug and/or metabolite concentrations after drug administration in order to obtain information on drug disposition in humans. This paper discusses how timing and dosing variables are derived through a generic program to facilitate PK analysis. In oncology studies, the last timepoint of one cycle is often also the first timepoint of the subsequent cycle. Accommodating multiple timepoints for a single PK concentration data point on two different timescales is important to facilitate analysis, but presents a programming challenge. This challenge is described and solutions are provided. In addition, the significance of these variables as they are used for concentration-time profiles and the derivation of PK parameters such as AUC, Cmax, and Tmax is presented.
PO07 : OSI Requests : Create bookmarked PDF using ODS Rearrange and replay your output using PROC DOCUMENT
Jacques Lanoue, Novartis
Aruna Kumari Panchumarthi, Novartis
Sponsors are now receiving FDA OSI requests in a systematic fashion. The requirements are very specific about the structure of the deliverable. This paper is a technical presentation of how we used SAS PROC DOCUMENT to achieve the desired output.
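A minimal sketch of the ODS DOCUMENT capture-and-replay pattern the abstract refers to (file names and the stored procedure are hypothetical; the authors' actual restructuring is more involved):

```sas
/* Capture procedure output into an item store */
ods document name=work.tlfdoc(write);
proc freq data=sashelp.class;
   tables sex;
run;
ods document close;

/* Replay the stored output into a bookmarked PDF */
ods pdf file="deliverable.pdf" bookmarkgen=yes;
proc document name=work.tlfdoc;
   replay;
quit;
ods pdf close;
```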
PO08 : Exchange of data over internet using web service(e.g., SOAP and REST) in SAS environment
Kevin Lee, Accenture Life Sciences
We are living in a world of abundant information, and the ability to seamlessly exchange information between customers, partners, and internal business units is vital to any organization's success. Today, much of this information can be accessed and exchanged between different systems over the internet using web services, which allow different systems to exchange data over the internet. The paper will show how SAS can exchange data with other software systems over the internet using web services. It will introduce the basic concepts of web services and their methods of communication: SOAP (Simple Object Access Protocol) and REST (Representational State Transfer). First, the paper will briefly describe SOAP and its structure: the HTTP header and the SOAP envelope. It will show examples of how SAS programmers send a request to a web service and receive the response using SOAP in a SAS environment, and how SAS programmers can create a SOAP request using SoapUI, open-source software that allows users to create SOAP requests and test connectivity with a web service. The paper will explain how the FILENAME statement and the SOAPWEB function send a SOAP request file and receive a response file, and will describe the structure of the XML SOAP response. It will also show the structure of REST and demonstrate how SAS programmers write SAS code to get data from another system using REST, introducing the SAS FILENAME statement with its URL access method and DEBUG option.
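For the REST side, a minimal sketch of reading a service response through the FILENAME URL access method (the endpoint address is a placeholder):

```sas
/* DEBUG writes the HTTP exchange to the SAS log */
filename resp url "http://example.com/api/studies" debug;

data studies_raw;
   infile resp lrecl=32767 truncover;
   input line $char32767.;     /* raw response text (JSON/XML) */
run;

filename resp clear;
```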
PO09 : Adding Subversion® Operations to the SAS® Enhanced Editor
Oscar Cheung, PPD
Ken Borowiak, PPD
Subversion® (SVN) is an open-source version control system whose use is growing amongst clinical trial programming departments. SVN can be used for versioning many file types, such as Microsoft Word and Excel documents, XML, RTF, and SAS® data sets and logs. In particular, clinical trial programmers have in SVN a valuable tool for tracking the development and maintenance of SAS programs. This paper demonstrates how to enable two of the most common SVN operations, SVN ADD and SVN COMMIT, in the SAS Enhanced Editor.
PO10 : Summing up - SDTM Trial Summary Domain
Yi Liu, Celerion
Jenny Erskine, Celerion Inc.
Stephen Read, Celerion Inc.
Trial Design Model (TDM) datasets, specifically the Trial Summary Information (TS) domain, can pose a challenge for SAS programmers and SDTM reviewers, given the complexity of study designs and the classification of study characteristics detailed in protocols, Statistical Analysis Plans (SAPs), study data, and other supporting reference dictionaries and specifications. This paper gives a brief background and overview of key TS domain requirements and offers practical guidance for building, populating, and reviewing the TS domain. It also includes recommendations for clearer definition of TS parameters in line with recent SDTM and OpenCDISC validation enhancements, and considerations around configurable data files that can help drive the appropriate definition and documentation of TS data across a variety of protocols.
PO11 : Efficiently Produce Descriptive Statistic Summary Tables with SAS Macros
Chunmao Wang, Quintiles
In the pharmaceutical industry, two types of summary tables are commonly used: one for categorical variables, computing frequency counts and percentages, and another for continuous variables, computing means, standard deviations, medians, minimums and maximums, confidence limits, ranges, etc. Producing these tables is simple yet tedious, and sometimes cumbersome and time-consuming, as many variables and many conditions may be requested. Hence a template or tool that helps create these tables is very useful. In this paper we introduce two simple SAS macros that can easily be incorporated into your SAS programs to generate descriptive statistic summary tables. They will not only save a lot of time but also improve quality.
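As a rough sketch of the kind of categorical-variable macro described (the macro name, parameters, and cell format are hypothetical, not the authors' code):

```sas
%macro catfreq(data=, var=, trt=);
   /* counts and row percentages per treatment/category cell */
   proc freq data=&data noprint;
      tables &trt*&var / outpct out=freq_&var;
   run;

   /* build "n (%)" display text */
   data freq_&var;
      set freq_&var;
      length cell $40;
      cell = catx(' ', put(count, best. -l),
                       cats('(', put(pct_row, 5.1), ')'));
   run;
%mend catfreq;

%catfreq(data=sashelp.class, var=sex, trt=age);
```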
PO12 : Utilizing SAS® for Cross-Report Verification in a Clinical Trials Setting
Daniel Szydlo, Fred Hutchinson Cancer Research Center
Iraj Mohebalian, Fred Hutchinson Cancer Research Center
Marla Husnik, Fred Hutchinson Cancer Research Center
Phase III clinical trials require frequent and diligent monitoring of study data that includes the generation of Data Safety Monitoring Board (DSMB) reports. The importance of data accuracy, frequency of generation, and sheer length of DSMB reports combine to make their verification crucial and time-consuming. MTN-020, a Phase III, double-blind, placebo-controlled, randomized HIV prevention clinical trial, is an example of a study requiring the production of biannual DSMB reports. In an effort to improve the accuracy and efficiency of DSMB report production we developed procedures utilizing SAS® to automate a number of verification checks. This paper presents the framework and the SAS® tools we developed for verifying DSMB reports for the MTN-020 study that are generalizable to other research reports. The tools we developed to support DSMB report production included the following: SAS® programs that created standardized results datasets for the generation of tables, listings, and figures (TLFs); driver files that determined the verification checks to be performed; SAS® programs that performed verification checks on results datasets; and HTML-formatted output that clearly showed whether and where errors occurred. We present specific examples of our tools using mock data from the MTN-020 study that utilize SAS® facilities such as the Macro Language and Output Delivery System (ODS).
PO13 : Leveraging ADaM principles to make analysis database and table programming more efficient
Andy Hulme, PPD
The purpose of the CDISC Analysis Data Model (ADaM) is to create an analysis database that is both traceable to source data and analysis-ready. In most studies, the source data will consist mostly of SDTM domains and thus it may seem practical to create an analysis database that is only an extended version of SDTM. A method like this will tend to neglect the analysis-ready principle of ADaM, specifically that ADaM domains should be designed using the planned table outputs such that analyses can be recreated from the analysis database with minimal additional programming. Using ADaM principles, this paper discusses several techniques that can increase the efficiency of analysis database programming and make table programming simpler. Examples include storing of treatment information in ADSL, date imputation, creating new records in Basic Data Structure (BDS) domains for new analysis parameters, and deriving specific ADaM variables to prevent inefficient rework in downstream programs.
PO14 : Subset without Upsets: Concepts and Techniques to Subset SDTM Data
Jhelum Naik, PPD
Sajeet Pavate, PPD
For certain studies, the critical endpoints may occur at an interim visit rather than the end of the study, or the safety data may need to be monitored at periodic intervals. These trial designs require subsetting CDISC SDTM (Study Data Tabulation Model) data to generate CDISC ADaM (Analysis Data Model) data and subsequent analyses. We demonstrate via case studies and operational details that it is more efficient to subset the SDTM data than the raw data. We explain two approaches to subsetting SDTM data: visit-based and hybrid. The visit-based approach hinges on identifying a unique "cutoff" date per subject, based on the subject's visit dates, to subset the data across all SDTM domains; this date can be generated from the subject's status in the study. The hybrid approach uses the visit-based cutoff (a unique date per subject) for the efficacy domains and a common "calendar" cutoff date for the safety domains (as categorized in the study's Statistical Analysis Plan). We highlight the differences between the two approaches, e.g., the handling of log pages. We show how the data can be further cut to keep only a subset of subjects in the final data. We provide operational details on the subsetting technique, issues to watch for in key domains such as SE, EX, and AE, quality checks to ensure data consistency across SDTM domains, issues in OpenCDISC, and the handling of missing and/or partial dates.
PO15 : Update: Development of White Papers and Standard Scripts for Analysis and Programming
Nancy Brucken, inVentiv Health Clinical
Michael Carniello, Astellas Pharma US
Mary Nilsson, Eli Lilly & Company
Hanming Tu, Accenture
A PhUSE Computational Science Symposium (CSS) Working Group is creating white papers describing recommended analysis and reporting methods for frequently-collected types of data included in clinical trials and regulatory submissions. An online platform for sharing code has also been created, making these standards easy to implement. This paper provides an update as of April 2015 on the progress made in these efforts.
PO16 : Automating biomarker research visualization process
Xiaohui Huang, Gilead Sciences
Jigar Patel, Gilead Sciences
Biomarker assays have become more and more popular in clinical trial designs for drug discovery. They are widely used in different therapeutic areas to support decision making regarding drug candidates and to accelerate drug development, in addition to reducing costs. Graphic presentation is essential in every biomarker study report, and analysis of biomarker data is typically fast paced with tight timelines. A utility macro that facilitates the visualization process and promotes standardization to fulfill regulatory requirements is therefore very useful. This paper presents a macro toolkit for generating the statistical graphics most commonly used in biomarker data analysis. Instead of relying on the default template, the code is developed in SAS 9.2 and uses Graph Template Language (GTL) to take control of the graphics' appearance. One feature of this macro is that it automates symbol assignments so that different populations are displayed consistently across different plots. The paper describes key elements of the macro's functionality and the techniques used in its development. This macro tool is not only for programmers but also for clinical scientists who do not necessarily have SAS programming skills but want to do some exploratory research on biomarker data themselves.
PO17 : Spelling Checker Utility in SAS® using VBA Macro and SAS® Functions
Ajay Gupta, PPD Inc
In the pharmaceutical/CRO industry, typographical errors are quite common during data entry, data set programming, and table programming. These errors, if not caught in time, can add programming and review time and affect the quality of deliverables. To find misspelled words in SAS data sets and tables, programmers normally review the data visually, which is tedious and error prone. This paper introduces a method to report all misspelled words in a SAS data set or Word document from within SAS, using WordBasic commands via DDE (Dynamic Data Exchange) and SAS functions. If needed, the macro can be extended to find misspelled words in other file formats, e.g., Microsoft Excel.
PO19 : Macro to Read Mock Shells in Excel and Create a Copy in Rich Text Format
Rakesh Mucha, Sarah Cannon Research Institute
Jeffrey Johnson, Sarah Cannon Research Institute
It is common practice for statisticians in the pharmaceutical industry to create mock shells as part of the Statistical Analysis Plan. Mock shells not only let reviewers know what tables and listings (TLs) to expect from the analysis, but also give them a chance to recommend necessary changes based on the protocol and/or clinical expertise, and they help statistical/SAS programmers generate the actual tables and listings. Even though actual TLs are typically presented in Rich Text Format (RTF), some statisticians prefer producing mock shells in Excel for various reasons: it is what they have traditionally done, and creating mock shells in Excel is relatively easy compared to RTF. Yet it is preferable to create mock shells in RTF: sponsors and programmers prefer shells in RTF or PDF over Excel, and it is good to have mock shells in the format in which the actual TLs will be presented. In this paper we introduce a macro, %xl2rtf, developed in SAS 9.3, that reads mock shells in Excel and creates a copy of the shells in RTF. This gives statisticians the flexibility of presenting mock shells in RTF without having to create them in RTF directly. The paper assumes basic SAS programming knowledge, so users can make the macro work for their study by updating the input and output paths of the shells and making minor updates for their environment or shells.
PO21 : Evaluating SDTM SUPP Domain for ADaM - Trash Can or Buried Treasure
Xiaopeng Li, Celerion
Yi Liu, Celerion
Chun Feng, Celerion
The Study Data Tabulation Model (SDTM) is the commonly expected industry standard for clinical study electronic submissions to the US Food and Drug Administration (FDA). SDTM has its own standard rules on data structure for capturing and categorizing variables across all SDTM parent domains. The Analysis Data Model (ADaM), the FDA-recommended analysis submission data model, is generated from SDTM. Due to the distinctive structure of SDTM data, programmers who generate ADaM-compliant data sets frequently encounter difficulties locating or deriving ADaM-oriented variables from SDTM parent domains, often because customized or sponsor-specific analysis information needed in ADaM data may not be captured or allowed in the parent domains. Maintaining such information in SDTM supplemental domains therefore becomes an efficient solution within the SDTM data structure. The paper discusses the importance of supplemental domains for data traceability and analysis support, and illustrates how beneficial SDTM supplemental domains are to ADaM programmers using real-life clinical data examples.
PO22 : Challenges in Developing ADSL with Baseline Data
Hongyu Liu, Vertex at Boston
Hang Pang, Vertex at Boston
For a CDISC compliant submission, the ADSL (Analysis Data Subject Level) domain is the minimum required analysis dataset. However, developing an ADSL dataset can be quite challenging. One of the challenges is how to acquire and organize the baseline data for ADSL creation. Should ADSL be developed using SDTM data as its source, so that all ADaM (Analysis Data Model) domains are programmed independently without recursion? Or should ADSL be developed partially depending on other ADaM domains' programming? This paper will discuss various models that a sponsor may consider during ADSL dataset generation.
PO23 : Analysis Methods for a Sequential Parallel Comparison Design (SPCD)
James Zuazo, MMS Holdings
Harry Haber, MMS Holdings
Christopher Hurley, MMS Holdings Inc.
There are many considerations to take into account when planning a clinical trial, and an important consideration is trial design. Adaptive designs are sometimes chosen, as they include features that will increase the efficiency of a trial. One such example is the Sequential Parallel Comparison Design (SPCD), an adaptive design that allows for re-randomization of specific placebo subjects ("non-responders") from an early stage to placebo or treatment in a subsequent stage of the trial. This increases the number of subjects who receive active treatment and improves the power of the trial. Trials using the SPCD design can be analyzed using a number of different methods, including Ordinary Least Squares (OLS), Seemingly Unrelated Regression (SUR) and Repeated Measures Linear Model (MMRM). These methods are described in three papers that feature the SPCD design. This poster will describe, display and compare the methods. The SPCD design is currently in its early stages of implementation and acceptance. The efficiencies of this design will ensure an increase in popularity and a realization of the benefits through its application. This poster will give guidance on how to analyze studies that use this design.
PO24 : Introducing a Similarity Statistic to Compare Data Libraries for Improving Program Efficiency for Similar Clinical Trials
Taylor Markway, PRA Health Sciences
Amanda Johnson, PRA Health Sciences
Measuring the similarity between studies allows programmers to choose the best development strategies. If similarity is overestimated, development strategies focused on code reuse will lead to a breakdown in quality or timelines by overextending resources in an attempt to maintain an unrealistic pace. Alternatively, underestimating similarity creates inefficiency through duplication of effort. This paper introduces a similarity statistic (the Study Similarity Factor, or SSF) to quantify the similarity between two studies. The statistic is designed to compare any two SAS® data libraries, with no requirement to follow an industry standard such as CDISC, although its value is enhanced by the widespread adoption of such standards. The paper details the Base SAS® code used to generate the SSF and shows how the SSF can be tailored to an organization's particular needs. In the examples presented throughout this paper, the SSF is a normalized frequency of matching combinations of unique data sets and variables, as well as variable-only matches. Through live study examples, the paper shows how a systematic standard for measuring study similarity leads to more informed decision making and improved efficiency in software development. This tool can be made part of the planning process to make programming in clinical research more efficient by answering the question 'How similar is Study A to Study B?' The answer can then be used to assess a particular development strategy, for example, using the same resource team on studies with overlapping timelines.
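One Jaccard-style way to compute such a statistic from DICTIONARY.COLUMNS (a sketch under my own simplifying assumptions; the authors' exact SSF formula may differ, and LIBA/LIBB are placeholder librefs):

```sas
proc sql noprint;
   /* dataset/variable combinations present in both libraries */
   select count(*) into :nmatch trimmed
   from dictionary.columns a, dictionary.columns b
   where a.libname='LIBA' and b.libname='LIBB'
     and a.memname=b.memname and a.name=b.name;

   /* distinct combinations present in either library */
   select count(*) into :ntotal trimmed
   from (select distinct memname, name
         from dictionary.columns
         where libname in ('LIBA','LIBB'));
quit;

%put NOTE: SSF = %sysevalf(&nmatch / &ntotal);
```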
PO25 : Enhancing Infrastructure for Growth
Amber Randall, Axio Research
Bill Coar, Axio Research
As many leadership experts suggest, growth happens only when it is intentional. It is vital to the immediate and long-term success of our employees as well as our employers. As SAS programming leaders, we have a responsibility to encourage individual growth and to provide the opportunity for it. With an increased workload yet fewer resources, initial and ongoing training tend to be deemphasized as we are pressured to meet project timelines. The current workforce continues to evolve with time and technology. More important than simply providing the opportunity for training, individual trainees need motivation for any training program to succeed. Although many existing principles for growth remain true, how such principles are applied needs to evolve with the current generation of SAS programmers. The primary goal of this poster is to identify critical components we feel are necessary for developing an effective training program that meets the individual needs of the current workforce. Rather than proposing a single 12-step program that works for everyone, we feel identifying key components for enhancing existing training infrastructure is a step in the right direction.
Quick Tips
QT02 : iSplit Macro: to split large SAS datasets
Hany Aboutaleb, Biogen Idec
Monday, 1:15 PM - 1:25 PM, Location: Oceans Ballroom 9
On July 18, 2012, the FDA (CBER) issued guidance imposing certain requirements on electronic submissions. Under this guidance, an analysis data set must be split if it is greater than 1 GB. As a result, it is important for programmers to have a convenient and reliable tool to perform this conversion. At Biogen Idec we have developed such a tool in the form of a utility macro, which can be used to efficiently and effectively split data sets that exceed the FDA-imposed size limitation.
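A simplified sketch of splitting by observation count (the macro name is borrowed from the paper's title; the real tool would size each chunk to stay under the 1 GB limit rather than use a fixed row cap):

```sas
%macro isplit(data=, nper=100000);
   %local i nobs nparts;
   proc sql noprint;
      select nobs into :nobs trimmed
      from dictionary.tables
      where libname='WORK' and memname=upcase("&data");
   quit;
   %let nparts = %sysfunc(ceil(%sysevalf(&nobs / &nper)));
   /* one DATA step, one pass: route rows to numbered chunks */
   data %do i=1 %to &nparts; &data._&i %end; ;
      set &data;
      %do i=1 %to &nparts;
         %if &i > 1 %then else;
         if _n_ <= &i * &nper then output &data._&i;
      %end;
   run;
%mend isplit;

data big; do x=1 to 250000; output; end; run;
%isplit(data=big);   /* creates big_1, big_2, big_3 */
```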
QT04 : Let SAS Generate XML Code for ACCESS Audit Trail Data Macro
Sijian Zhang, VAPHS
Monday, 2:00 PM - 2:10 PM, Location: Oceans Ballroom 9
Microsoft Access databases are often used for data entry and storage in small and medium-sized projects. As a component of information security, a data audit trail usually needs to be built in. Since Access 2010, data macros at the table level have been available, which is very helpful for this purpose. However, data macro setup can be very tedious when the number of variables is large. An alternative is to export a small Access data macro as an XML template, expand the template to cover all audit fields, and then import the expanded XML code back into the Access database. This paper gives a brief introduction to audit trails in Access and shows how the XML code is generated with the SAS macro facility.
QT05 : EXTENDED ATTRIBUTES: A New Metadata Creation Feature in SAS 9.4 for Data Sets and Variables
Joseph Hinson, Princeton, NJ
Monday, 2:15 PM - 2:25 PM, Location: Oceans Ballroom 9
Wouldn't it be nice to run the CONTENTS procedure and obtain attributes such as "Core", "Source", "DataType", "Derivation", "URL", "SAScode", "Programmer", et cetera? With SAS 9.4 extended attributes, one is no longer limited to the familiar predefined attributes "Name", "Label", "Type", "Length", and "Format" for variables. In fact, with extended attributes, one can store an entire SAS program as a data set attribute! These custom data set and variable attributes are created with the SAS 9.4 DATASETS procedure through its new XATTR statements. The user-defined attributes are embedded in the data set and can be read from PROC CONTENTS, the DICTIONARY.XATTRS table, or the SASHELP.VXATTR view of that table. In the dictionary table, extended attributes have the member type "EXTATTR", and the resulting data set gets the extension ".sas7bxdat". For clinical programming, the real usefulness of this feature is the ability to embed CDISC metadata into data sets, greatly facilitating the creation of define.xml.
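A small sketch of the XATTR syntax described (SAS 9.4; the attribute names and values here are illustrative, not from the paper):

```sas
data work.class;
   set sashelp.class;
run;

proc datasets lib=work nolist;
   modify class;
   /* data set level extended attributes */
   xattr set ds source='CRF annotated page 12'
             programmer='J. Doe';
   /* variable-level extended attribute */
   xattr set var age (derivation='Collected on CRF');
quit;

/* the attributes surface in PROC CONTENTS and DICTIONARY.XATTRS */
proc contents data=work.class;
run;
```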
QT06 : PROC SQL for SQL Die-hards
Barbara Ross, Advance America
Jessica Bennett, Advance America
Monday, 2:30 PM - 2:40 PM, Location: Oceans Ballroom 9
Inspired by Christianna Williams's paper on transitioning to PROC SQL from the DATA step, this paper aims to help SQL programmers transition to SAS by using PROC SQL. SAS adopted the Structured Query Language (SQL) by means of PROC SQL back in Version 6. PROC SQL syntax closely resembles SQL; however, there are some SQL features that are not available in SAS. Throughout this paper, we outline common SQL tasks and how they differ in PROC SQL, and introduce useful SAS features that are not available in SQL. Topics covered are appropriate for novice SAS users.
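One example of a PROC SQL convenience absent from standard SQL is the CALCULATED keyword, which lets a query reuse a column alias defined in the same SELECT:

```sas
proc sql;
   select name,
          height * 2.54              as height_cm,
          calculated height_cm / 100 as height_m
   from sashelp.class
   where calculated height_cm > 150;   /* alias reused in WHERE too */
quit;
```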
QT07 : Creating the Perfect Table Using ODS to PDF in SAS 9.4®
Elizabeth Dennis, EMB Statistical Solutions, LLC
Maddy Dennis, Pharmapace
Monday, 2:45 PM - 2:55 PM, Location: Oceans Ballroom 9
Submitting output to the FDA in PDF format has become more common recently. Unfortunately, when RTF files are electronically converted to PDF, unwanted format changes can occur, such as border lines no longer being visible. Creating the table directly with ODS PDF is a better technique. However, PROC REPORT statements written to create RTF tables produce different results when creating a PDF file. Using SAS 9.4®, this paper discusses the ODS PDF statement along with the PROC REPORT statements that create a perfectly formatted table conforming to the FDA Portable Document Format Specifications.
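A minimal sketch of sending PROC REPORT output straight to PDF (the style overrides are illustrative choices, not the FDA specification):

```sas
options nodate nonumber;
ods pdf file="demog.pdf" notoc;
proc report data=sashelp.class
            style(report)=[rules=groups frame=hsides];
   column name sex age;
   define name / display "Subject";
   define sex  / display "Sex";
   define age  / display "Age (years)";
run;
ods pdf close;
```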
QT08 : Hands Free: Automating Variable Name Re-Naming Prior to Export
John Cohen, Advanced Data Concepts, LLC
Monday, 3:00 PM - 3:10 PM, Location: Oceans Ballroom 9
Production datasets often come to us with data in the form of rolling 52 weeks, 12 or 24 months, or the like. For ease of use, the variable names may be generic (something like VAR01, VAR02, etc., through VAR52, or VAR01 through VAR12), with the actual dates corresponding to each column maintained in some other fashion, often in the variable labels, a dataset label, or some other construct. Not having to rewrite your program each week or month to use these data properly is a huge benefit, until you need to capture the date information in the variable names themselves (so far VAR01, VAR02, etc.) prior to, say, exporting to MS/Excel® (where the new column names may instead need to be JAN2011, FEB2011, etc.). If creating the correct corresponding variable/column names each week or month were a manual task, the toll on efficiency and accuracy could be substantial. As an alternative, we use a "program-to-write-a-program" approach to capture date information from the incoming SAS® dataset (from two likely alternate sources) and have our program complete the rest of the task seamlessly, week after week (or month after month). By employing this approach we can continue to use incoming data with generic variable names and output our results with specific (and correct!) variable names, all hands free.
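A sketch of one such program-to-write-a-program, assuming the true dates live in the variable labels (the dataset and naming details are hypothetical):

```sas
/* example input: generic names, dates carried in the labels */
data work.weekly;
   var01=1; var02=2;
   label var01='JAN2011' var02='FEB2011';
run;

/* build "old=new" rename pairs from the metadata */
proc sql noprint;
   select catx('=', name, label)
      into :renames separated by ' '
   from dictionary.columns
   where libname='WORK' and memname='WEEKLY'
     and upcase(name) like 'VAR%';
quit;

proc datasets lib=work nolist;
   modify weekly;
   rename &renames;      /* var01=JAN2011 var02=FEB2011 */
quit;
```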
QT09 : Using Meta-data to Identify Unused Variables on Input Data Sets
Keith Hibbetts, Inventiv Health Clinical
Monday, 3:30 PM - 3:40 PM, Location: Oceans Ballroom 9
Meta-data is a very powerful tool in the creation of clinical trial data. It's commonly used to define data standards, and can also be used to increase the level of automation and efficiency in data transformation. This paper will explore an additional benefit to using meta-data in data transformation: it can be used to programmatically identify any variables in the input datasets that weren't utilized. Identifying these unused variables provides a valuable quality check to help ensure that no important data is being lost in the data transformation.
QT10 : A Simple Macro to Select Various Variables Lists
Ting Sa, Cincinnati Children's Hospital Medical Center
Yanhong Liu, Cincinnati Children's Hospital Medical Center
Monday, 3:45 PM - 3:55 PM, Location: Oceans Ballroom 9
Often we need to select a subset of variables from a dataset. SAS software has provided the "SAS variable list" from which we can select the variables without typing their names individually. However, the "SAS variable list" cannot be used in all SAS procedures, such as the SELECT statement in the SQL procedure. Also, there is no "SAS variable list" available to select variables that share a common suffix or middle part in their names. In this paper, we introduce a macro that not only incorporates the "SAS variable list" to select a subset of variables, but also can be used to select variables that share common patterns in their names. Additionally, the results from the macro can be applied to all SAS procedures.
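One way a macro like this might gather pattern-matched names is through DICTIONARY.COLUMNS, whose results can feed any procedure (a sketch only; the dataset, variables, and suffix here are invented):

```sas
/* Collect names of WORK.LABS variables ending in _BL into a
   macro variable usable in a PROC SQL SELECT clause.
   The underscore is a LIKE wildcard, so it must be escaped. */
proc sql noprint;
   select name into :bl_vars separated by ', '
   from dictionary.columns
   where libname = 'WORK'
     and memname = 'LABS'
     and upcase(name) like '%^_BL' escape '^';
quit;

/* The list now works where a SAS variable list would not */
proc sql;
   create table baseline as
   select usubjid, &bl_vars
   from work.labs;
quit;
```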
QT11 : Don't Get Blindsided by PROC COMPARE
Josh Horstman, Nested Loop Consulting
Roger Muller, Data-to-Events.com
Monday, 1:45 PM - 1:55 PM, Location: Oceans Ballroom 9
"NOTE: No unequal values were found. All values compared are exactly equal." In the clinical trial world, that message is the holy grail for the programmer tasked with independently replicating a production dataset to ensure its correctness. Such a validation effort typically culminates in a call to PROC COMPARE to ascertain whether the production dataset matches the replicated one. It is often assumed that this message means the job is done. Unfortunately, it is not so simple. The unwary programmer may later discover that significant discrepancies slipped through. This paper will briefly overview some common pitfalls in the use of PROC COMPARE and explain how to avoid them.
QT12 : Let SAS "MODIFY" Your Excel file
Nelson Lee, Genentech
Monday, 4:15 PM - 4:25 PM, Location: Oceans Ballroom 9
It is common to export SAS data to Excel by creating a new Excel file. However, there are times when we want to update the input Excel file instead of creating a new one, so that we can preserve certain data attributes such as text highlighting, font color, etc. This paper shares a quick tip that utilizes the SAS MODIFY statement to update a column in an Excel file where the input file and the output file are the same. The advantage of this approach, documented in this paper, is that it reads data from an edit check specifications document (an Excel file) and updates a targeted column in the same document. This paper is written for audiences with beginner skills; the code is written using SAS Version 9.2 on the Windows operating system.
QT13 : Is Your SAS® System Reliable?
Wayne Zhong, Accretion Softworks
Monday, 1:30 PM - 1:40 PM, Location: Oceans Ballroom 9
If you ask your SAS Server to add 1 + 1, how often will you receive the answer 2? As professionals working in the pharmaceutical industry, we take for granted that our computer systems will follow the Code of Federal Regulations (CFR) Title 21 Part 11, namely that our SAS systems possess 'accuracy, reliability, consistent intended performance'. Computer systems however are complex and difficult to validate completely by IT, and quality lapses result in SAS Systems that get it right "most of the time". This paper discusses ways a SAS System can perform inconsistently, its outward symptoms, and how to prevent it all by running a QA test program. The program is in SAS and full code is provided.
QT14 : Getting the Most Out of PROC SORT: A Review of Its Advanced Options
Max Cherny, GlaxoSmithKline
Monday, 4:00 PM - 4:10 PM, Location: Oceans Ballroom 9
The paper describes the use of some underutilized, yet extremely useful, options of PROC SORT. These options include SORTSEQ, CASE_FIRST, NUMERIC_COLLATION, DUPOUT, NOUNIQUEKEY, and others. The paper provides easy and reusable examples for each of these options.
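As a flavor of the options the paper covers, a hedged sketch (dataset and BY variable names are invented, not taken from the paper):

```sas
/* DUPOUT= routes the observations that NODUPKEY removes into a
   separate dataset, so deleted duplicates can be reviewed */
proc sort data=adsl out=adsl_unique dupout=adsl_dups nodupkey;
   by usubjid;
run;

/* SORTSEQ=LINGUISTIC with NUMERIC_COLLATION=ON orders mixed
   alphanumeric values naturally (VISIT 2 before VISIT 10) */
proc sort data=sv out=sv_sorted
          sortseq=linguistic (numeric_collation=on);
   by visit;
run;

/* NOUNIQUEKEY keeps only observations whose BY key occurs more
   than once -- handy for isolating the duplicates themselves */
proc sort data=ae out=ae_multi nouniquekey;
   by usubjid aedecod;
run;
```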
QT15 : A Macro to Easily Generate a Calendar Report
Ting Sa, Cincinnati Children's Hospital Medical Center
Tuesday, 8:00 AM - 8:10 AM, Location: Oceans Ballroom 9
In this paper, a macro is introduced to generate a calendar report in two different formats. The first format displays the entire month in one plot and we call this a "month-by-month" calendar report. The second format displays the entire month in one row and we call this an "all-in-one" calendar report. To use the macro, you just need to prepare a simple data set that has three columns: one column identifies the ID, one column includes the date, and one column specifies the notes for the dates. On the generated calendar reports, you can include notes and apply different styles to certain dates. The macro also provides an option to control whether months with no data in your data set are shown on the reports.
QT19 : Using the power of SGPLOT new features in SAS 9.4 made customized graphic programming easier in clinical efficacy and exposure-response analyses
Peter Lu, Novartis Pharmaceuticals Corporation
Xingji Han, Novartis Pharmaceuticals Corporation
Hong Yan, Novartis Pharmaceuticals Corporation
Tuesday, 10:30 AM - 10:40 AM, Location: Oceans Ballroom 9
Generating customized statistical graphs for clinical efficacy and exposure-response analyses is essential in clinical study reports. It often requires advanced SAS graphic programming skills and is time-consuming. This paper explores several new features in SGPLOT implemented in SAS 9.4 which allow us to generate customized graphs without writing lengthy code, complex macros, or customizing the graphic template. Examples are given to demonstrate how to align an axis text-table for a Kaplan-Meier (K-M) curve, insert customized legends, overlay graphs, add unicode symbols, and jitter data points without manipulating data. In the first example, the XAXISTABLE and TEXT statements in SGPLOT are used to align an axis text-table (e.g., median time-to-event, number of patients, and time-point event-free rate estimates) and insert customized legends into a K-M curve without using annotation, dynamic formats, or modifying the graphic template. The second example uses a new function in SGPLOT to create overlaid graphs (e.g., a boxplot and a scatter plot) for the exposure-response analysis (drug concentration and QT change from baseline vs time). The third example uses the SYMBOLCHAR statement to add various symbols (e.g., unicode) and the JITTER option to easily jitter the data points in a scatter plot. These new features in SGPLOT can be easily used by any level of SAS user, from beginner to expert.
QT20 : Simplicity is the Soul of Efficiency: Simple Tips and Tricks for Efficient SAS® Programming
Shefalica Chand, Seattle Genetics, Inc.
Tuesday, 8:45 AM - 8:55 AM, Location: Oceans Ballroom 9
Efficient SAS® programming is one of the keys to success for SAS® programmers and, in turn, for the organizations they are associated with. SAS® programmers are always looking for tips and tricks to make their job more effective and less time-consuming, without compromising quality. The time and resources saved can then be put toward learning new skill sets, improving existing processes, and other infrastructure development ideas and initiatives. TextPad® is one such simple tool; when used in conjunction with SAS®, it can do wonders. This paper sheds light on various aspects of using TextPad® for efficient SAS® (statistical/clinical) programming techniques. Some of the TextPad® utilities discussed in this paper are:
- Executing SAS® from TextPad® and creating a "Run SAS" button in the toolbar
- Viewing multiple files (SAS, LST, LOG)
- Creating quick tags and shortcuts using TCL files
- Comparing multiple files (SAS, LST, LOG)
- Synchronized scrolling to facilitate manual comparison
- Quick indentation fixes by block selection
- Text search features: the "Find" and "Find in files" options
- Direct links to SAS® programs in the search results of the "Find in files" option
- Replacing text strings in one or more files
- Color coding to identify keywords of concern (e.g., WARNING, ERROR, UNINITIALIZED, MISSING)
QT21 : Sorting big datasets. Do we really need it?
Daniil Shliakhov, Experis Clinical A Manpower Group Company
Tuesday, 9:00 AM - 9:10 AM, Location: Oceans Ballroom 9
Very often, working with big data causes difficulties for SAS programmers. As we all know, large datasets can be manipulated and analyzed using SAS, but these SAS programs may take many hours to run. Often, timelines are short and delivery dates cannot be moved, so decreased running time is critical. One of the most time-consuming tasks for the programmer is sorting large datasets. This paper describes different techniques for quick sorting, such as using indexes, or even avoiding sorting altogether by using hash objects. The main goal of this paper is to find the best methods for sorting large datasets within SAS depending on the data structure (for example, vertical or horizontal) and the number of variables. To identify the efficiency of the different quick-sorting techniques, this paper also compares the sorting process with and without them. The comparison will show how much time is consumed sorting different types of large SAS datasets.
QT22 : Creating output datasets using SQL (Structured Query Language) only.
Andrii Stakhniv, Experis Clinical A Manpower Group Company
Tuesday, 9:15 AM - 9:25 AM, Location: Oceans Ballroom 9
PROC SQL is one of the most powerful procedures in SAS. With this tool we can easily manipulate data and create a large number of outputs. I will illustrate how we can create final datasets based on three common types of safety outputs using SQL (Structured Query Language) only: - Outputs with a fixed structure (like Disposition outputs); - Outputs with descriptive statistics (like Demographics or Laboratory outputs); - Outputs with a flexible 2-level hierarchy structure (like Adverse Events outputs). The approaches and tricks presented here can be utilized in everyday work as they are easy to implement and understand. Additionally, this information can be a helpful resource for those who are new to statistical programming analysis.
QT23 : A Macro to Produce a SAS® Data Set Containing the List of File Names Found in the Requested Windows or UNIX Directory
Mike Goulding, Experis ManpowerGroup
Tuesday, 9:30 AM - 9:40 AM, Location: Oceans Ballroom 9
Clinical programmers often need to perform a particular process for each file that exists in a specific directory, on Windows or UNIX. For example, consider a directory that contains SAS® V5 transport files, which need to be converted back to standard data sets, perhaps as part of doing a final quality check prior to regulatory submission. To run through the conversion step for all these files dynamically, somehow the programmer must first create a data structure which contains the file names in the target directory. This paper presents the macro dir_contents, which captures all file names from the requested directory, and returns the file names as observations within a SAS® data set. In this structure, the data set can be readily processed by a subsequent macro loop, to perform whatever procedure might be appropriate. The macro performs basic error checking, and supports filtering the requested directory by file extension. The macro obtains the information using SAS® software functions rather than system-specific commands, to ensure complete portability between Windows and UNIX.
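The portable, function-based approach the abstract describes could look roughly like this (a sketch, not the actual dir_contents macro; the path is invented):

```sas
/* List the files in a directory into a dataset using portable
   SAS file I/O functions -- no X command or PIPE required,
   so the same code runs on Windows and UNIX */
data dir_contents(keep=fname);
   length fname $256;
   rc  = filename('dirref', '/studies/abc123/transport');
   did = dopen('dirref');               /* open the directory */
   if did > 0 then do i = 1 to dnum(did);
      fname = dread(did, i);            /* i-th member name    */
      output;
   end;
   rc = dclose(did);
   rc = filename('dirref', '');         /* clear the fileref   */
run;
```

A subsequent macro loop can then read WORK.DIR_CONTENTS and process each file name in turn.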
QT24 : Reproducibly Random Values
Ting Bai, Gilead Sciences, Inc.
William Garner, Gilead Sciences, Inc
Tuesday, 9:45 AM - 9:55 AM, Location: Oceans Ballroom 9
For questionnaire data, multiple responses may be provided. However, a single value, chosen at random, is required for analysis. We propose a method to randomly select between the multiple responses which yields the same value at a subsequent analysis, when additional data has been collected.
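One common way to make such a pick reproducible across data cuts is to derive the seed from a stable key such as the subject ID, so the draw for a subject never changes as new data arrive. This is a sketch under that assumption (variable names invented), not necessarily the authors' exact method:

```sas
data picked;
   set responses;   /* assumed: one row per subject with usubjid, n_resp */
   /* per-subject seed from the digits of the ID; MOD keeps it
      inside the valid positive-integer seed range */
   seed = mod(input(compress(usubjid, , 'kd'), best32.), 2147483646) + 1;
   call ranuni(seed, u);        /* u is deterministic for a given seed */
   pick = ceil(u * n_resp);     /* select response 1..n_resp */
run;
```

Because CALL RANUNI takes the seed as an updatable argument, each row's draw depends only on that row's seed value, unlike RAND/STREAMINIT, which seeds once per DATA step.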
QT25 : Transitions in Depressive Symptoms After 10 Years of Follow-up Using PROC LTA
Seungyoung Hwang, Johns Hopkins Bloomberg School of Public Health
Tuesday, 10:15 AM - 10:25 AM, Location: Oceans Ballroom 9
PROC LTA is by far the most popular and powerful SAS procedure for latent transition analysis, used throughout a wide variety of scientific disciplines. However, few have provided an easy-to-understand explanation, with example SAS code, of examining transitions in latent statuses. This paper provides an in-depth analysis, with explanation of the SAS code, examining transitions in latent statuses of depressive symptoms after 10 years of follow-up using PROC LTA. The author also examined whether clinical characteristics predicted membership in the different statuses and predicted transitions between latent statuses over time. Examples of using PROC LTA are drawn from the Baltimore Epidemiologic Catchment Area Study. This paper gently guides all SAS users - even those with a limited background in statistics or who have never used SAS - through a step-by-step approach to using SAS for latent transition analysis and interpreting the results. Moreover, this paper is ideally suited to students beginning their study of the social and behavioral health sciences, and to professors and research professionals working in the fields of epidemiology, clinical psychology, or health services research.
QT26 : Keyboard Macros - The most magical tool you may have never heard of - You will never program the same again (It's that amazing!)
Steven C. Black, Agility-Clinical
Tuesday, 8:30 AM - 8:40 AM, Location: Oceans Ballroom 9
Stop cutting and pasting old code into new programs, stop worrying about forgetting or misplacing really neat code - let SAS remember it for you, all through Keyboard Macros! The term Keyboard Macro is a mild misnomer, as they are not really macros, or at least not in the way most SAS programmers think of macros. They should be called Keyboard Rememberalls - except they actually tell you what you forgot, and they can hold all of your secrets. In this paper I will demonstrate how to create, use, and transfer keyboard macros, and discuss a few of the deeper, darker aspects of Keyboard Macros, with the intent that readers will use them and make them a part of their programming repertoire.
QT27 : The concept of the "Dynamic" SAS Programming
Sergey Sian, Senior Statistical Programmer at Quintiles
Tuesday, 10:45 AM - 10:55 AM, Location: Oceans Ballroom 9
Writing SAS code that generates SAS programs, which are then run to produce output, is not new. It is, however, sometimes necessary and certainly a useful tool. The concept of "dynamic" SAS programming will be discussed in this paper, along with why it is useful and how to do it. This will be augmented with two examples - one a series of Adverse Event tables, the other a series of Lab tables. Along the way there will also be some discussion of the SAS options available to record what has been run and how it ran.
QT28 : Which TLFs Have Changed Since Your Previous Delivery? Get a Quick YES or NO for Each TLF
Tom Santopoli, Accenture
Tuesday, 2:00 PM - 2:10 PM, Location: Oceans Ballroom 9
Ever need to make a delivery where only some of the TLFs are expected to have changed since the previous delivery? This paper presents a method to obtain a quick YES or NO for each TLF to indicate which TLFs have and have not changed since the previous delivery. Prior to investigating specific changes between TLFs, such as differences in lines and cells, a quick YES or NO for each TLF is a very useful overview to ensure that only the appropriate TLFs have been impacted by any updates. A quick YES or NO can be obtained even for Figures. A couple quick tips to obtain the results of PROC COMPARE and to use CALL EXECUTE for data-driven macro calls are presented as the primary coding techniques.
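The PROC COMPARE return code plus CALL EXECUTE pattern the abstract mentions might be sketched as follows (macro, library, and dataset names are invented):

```sas
/* Compare one dataset pair and record YES/NO from the &SYSINFO
   return code: 0 means the datasets matched exactly */
%macro chk(ds);
   proc compare base=prev.&ds compare=curr.&ds noprint;
   run;
   data result_&ds;
      length tlf $32 changed $3;
      tlf     = "&ds";
      changed = ifc(&sysinfo = 0, 'NO', 'YES');
   run;
%mend chk;

/* Data-driven calls: one %chk per name in a control dataset.
   %NRSTR delays the macro call until after this step ends, so
   &SYSINFO is read at the right time for each comparison. */
data _null_;
   set tlf_list;                 /* one row per TLF dataset name */
   call execute(cats('%nrstr(%chk(', dsname, '))'));
run;
```

Stacking the RESULT_* datasets then yields the one-row-per-TLF YES/NO overview.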
QT29 : Sensitivity Training for PRXers
Ken Borowiak, PPD
Tuesday, 11:00 AM - 11:10 AM, Location: Oceans Ballroom 9
Any SAS® user who intends to use the Perl style regular expressions through the PRX family of functions and call routines should be required to go through sensitivity training. Is this because those who use PRX are mean and rude? Nay, but the regular expressions they write are case sensitive by default. This paper discusses the various ways to flip the case sensitivity switch for the entire or part of the regular expression, which can aid in making it more readable and succinct.
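The two switches the abstract alludes to - the trailing modifier and the inline toggle - can be shown in a few lines (a sketch; the text and patterns are invented):

```sas
data _null_;
   text = 'Serious Adverse Event';
   /* case-sensitive by default: position 0 means no match */
   pos1 = prxmatch('/serious/', text);
   /* the i modifier after the closing delimiter makes the
      whole expression case-insensitive */
   pos2 = prxmatch('/serious/i', text);
   /* (?i) flips case insensitivity on from that point in the
      pattern; (?-i) would flip it back off */
   pos3 = prxmatch('/(?i)serious adverse/', text);
   put pos1= pos2= pos3=;
run;
```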
QT30 : FILLPATTERNS in SGPLOT Graphs
Pankhil Shah, PPD
Tuesday, 11:15 AM - 11:25 AM, Location: Oceans Ballroom 9
With the updates to PROC SGPLOT in SAS 9.3, there has been substantial change in graph programming. Code size and complexity have been reduced compared to PROC GPLOT, and with little effort one can create much better quality graphs with PROC SGPLOT. However, this transition has a few drawbacks, as PROC SGPLOT doesn't share some of the useful features of PROC GPLOT. One of them is the FILLPATTERNS shown in graphs produced by PROC GPLOT. Currently, PROC SGPLOT doesn't support any direct styling option for showing patterns to differentiate between groups, and styling with the SG attribute map data set used in the SG procedures does not help in this regard either. The main purpose of this paper is to provide a workaround for this issue by creating new templates, or modifying existing predefined SAS templates, for the SG procedures.
QT31 : Copying Files Programmatically in SAS® Drug Development (SDD)
Wayne Woo, Novartis Vaccines
Tuesday, 11:30 AM - 11:40 AM, Location: Oceans Ballroom 9
SAS® Drug Development (SDD) is a statistical computing environment and repository that provides a 21 CFR Part 11 compliant system for analysis and reporting of clinical trials data. Sometimes, the tradeoff of implementing a rigorously controlled system involves sacrificing features that have become commonplace in the toolkit of programmers working in other systems. One such feature unavailable in SDD is the X command, which forbids use of the operating system's file copying command via %SYSEXEC and the FILENAME PIPE device. This led the author to search for workarounds to programmatically copy non-SAS-dataset files within SDD. This paper explores a few coding techniques that can be used to programmatically copy files within SDD.
QT32 : Accelerating production of Safety TFLs in Bioequivalence and early phase
Denis Martineau, Algorithme Pharma
Tuesday, 1:15 PM - 1:25 PM, Location: Oceans Ballroom 9
Due to the short time span between study start and delivery of the report to the client, plus the average number of studies per month, we needed to create synergy among the different teams involved in these studies. This enabled us to automate most of the Tables, Figures and Listings production using SAS.
QT33 : Simulate PRELOADFMT Option in PROC FREQ Using PROC FORMAT
Ajay Gupta, PPD Inc
Tuesday, 1:30 PM - 1:40 PM, Location: Oceans Ballroom 9
In the pharmaceutical/CRO industries, table programming usually starts when only partial data is available, and it is common that we need to summarize data based on all possible combinations of values. However, if no data has been collected for some levels of these values, by default the existing SAS procedures cannot display them in a summary table. A viable solution is to add the PRELOADFMT option to these procedures. But some procedures, like PROC FREQ, which is widely used to summarize data and calculate statistics, do not have the PRELOADFMT option, and SAS does not provide a direct function to perform this task. Using existing SAS procedures such as PROC FORMAT, PROC DATASETS, and PROC FREQ, however, we can create a SAS macro which performs the task easily without much programming work.
QT34 : Grooming of Coarse Data by Smart Listing using SAS system
Soujanya Konda, inVentiv Health Clinical
Tuesday, 1:45 PM - 1:55 PM, Location: Oceans Ballroom 9
Organizations strive to deliver a quality database, and one of the crucial and most typical parts of doing so is data cleaning. Though the phrase "data cleaning" sounds simple, cleaning the entire data and providing quality data is very tedious. However, we can apply new techniques to manage the data effectively and efficiently and provide discrepancy-free data for further processing. The role of Data Management (DM) is to deliver a quality database to the SAS programming and statistical analysis teams for their analyses. The DM team must conduct reviews often and repeatedly on the same reports, and locating the discrepant data takes a lot of time; an optimized process uses smart listings built so that each fresh run identifies newly entered and modified records. In turn, this avoids a lot of manual effort. The various challenges and efficient techniques are clearly discussed further in this paper.
QT36 : mlibcompare - keep an eye on the changes of your data
Wen Shi, Accenture Life Science
Tuesday, 2:15 PM - 2:25 PM, Location: Oceans Ballroom 9
When developing a regular submission package, we usually follow the order of generating SDTM first, then ADaM, then the TLF outputs. In real practice, we may keep receiving many cuts of new data and still need to guarantee that our programs function. Mlibcompare will help you quickly identify the structural changes of the datasets in your library. It combines the results from several SAS procedures and generates an Excel summary report with the Microsoft Excel DDE interface.
QT37 : I/O, I/O, it's off to work we go: digging into the ATTRC function
Karleen Beaver, PPD
Tuesday, 2:30 PM - 2:40 PM, Location: Oceans Ballroom 9
Several file I/O functions allow the SAS® programmer to gain data set properties directly without running a procedure on the data set or retrieving the data set observations. These functions can be called in a DATA step or can be used in open code. This paper illustrates the improvements in efficiency and capability when using the file I/O function ATTRC, specifically the attr-name parameter values LABEL and SORTEDBY, compared to running the CONTENTS procedure and accessing DICTIONARY tables in the SQL procedure.
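For instance, the two attr-name values the abstract highlights can be pulled without a procedure call at all (a sketch using the shipped SASHELP.CLASS dataset):

```sas
data _null_;
   /* open the dataset and query its attributes directly,
      with no PROC CONTENTS or DICTIONARY query */
   dsid = open('sashelp.class');
   if dsid > 0 then do;
      ds_label = attrc(dsid, 'LABEL');     /* the dataset label        */
      sort_by  = attrc(dsid, 'SORTEDBY');  /* BY variables, if sorted  */
      put ds_label= sort_by=;
      rc = close(dsid);
   end;
run;
```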
QT38 : Statistical Review and Validation of Patient Narratives
Indrani Sarkar, inVentiv Health Clinical
Bradford Danner, Sarah Cannon Research Institute
Tuesday, 2:45 PM - 2:55 PM, Location: Oceans Ballroom 9
Patient narratives are often a required component of a safety review or a clinical summary report being submitted to a regulatory agency. Events that often require a narrative are deaths, other serious adverse events, and significant adverse events determined to be of special interest because they are clinically relevant. The manner in which they are reported can vary, but they nearly always include records of a patient's demographics, medical history, drug administration, and adverse events over the course of a clinical study. Combining the various sources of data into one cohesive presentation can be challenging, and the validation and review of each individual profile produced can be even more difficult and tedious. Considering all the various sources of data which need to be checked and cross-referenced, the time and effort can become substantial. The main purpose of this paper is to briefly describe a method, utilizing SAS®, by which we first confirm that the list of subjects and events compiled by study clinicians is consistent with the study data, followed by a presentation of outputs from which comparisons may be made during statistical review and/or validation.
QT40 : Regaining Some Control Over ODS RTF Pagination When Using Proc Report
Gary Moore, PharmaSUG
Tuesday, 3:00 PM - 3:10 PM, Location: Oceans Ballroom 9
When creating RTF files using SAS Proc Report, some of the control over pagination is lost because third party programs like Microsoft Word determine the pagination during rendering. Often, this pagination does not appear appropriate or esthetically pleasing. We can regain some of that control using the break option with Proc Report and manually setting page break points in the data. This requires an examination of pagination while answering some questions about the data and report. What elements factor into the determination of pagination? What types of pagination are possible? What type of pagination is required? This paper examines these questions and provides a SAS Macro that will assist in manually setting break points for pagination.
QT41 : Automated Checking Of Multiple Files
Kathyayini Tappeta, Percept Pharma Services
Tuesday, 3:30 PM - 3:40 PM, Location: Oceans Ballroom 9
Clinical trial data analysis most often has tight deadlines with closely spaced data transfers. When last-minute modifications are made by different team members, an automated check of the required content in several files increases efficiency under tight timelines, obviating the need to open and check each file. The basic concept for such checks involves opening multiple files at a given location, extracting information from them into a single file, and performing the required checks so that no individual file needs to be opened. It has several applications, including checking multiple log files for errors and warnings, checking multiple PROC COMPARE output files for notes, and, in the case of macros written for tables, listings, and graphs, checking that parameter values were entered correctly. This paper provides code for automating the checking of files, with example applications.
QT42 : Automating Clinical Trial Reports with ODS ExcelXP Tagset
John O'Leary, Department of Veterans Affairs
Tuesday, 3:45 PM - 3:55 PM, Location: Oceans Ballroom 9
This paper offers a solution for generating multiple Excel workbooks by combining SAS Base tools and a Visual Basic for Applications (VBA) macro. Beginning and Intermediate SAS programmers will benefit from learning an approach that was implemented to create basic Excel workbooks for each of forty research sites in a Veterans Affairs clinical research trial. While there are many techniques that can be used in SAS, the example offered in this paper utilizes Proc Report, ODS Excel XP Tagset, the SAS Macro Facility and a VBA macro to convert Excel XML files to standard Excel workbooks. Although this example involves the reporting of non-adherent study participants in a large clinical research trial, the technique and associated SAS code can be adapted for SAS programmers in a variety of environments who need to automate reporting for multiple entities.
QT44 : Get a Data Dictionary using PROC CONTENTS: Easily
Beatriz Garcia, None
Jose Alberto Hernandez, None
Tuesday, 4:00 PM - 4:10 PM, Location: Oceans Ballroom 9
SAS programmers often have to work with a minimum of documentation, and even the most experienced programmers find obstacles in the programming leading to the TLGs. In this example, we are given no variable names in the specifications (also known as the spec file), or the annotated CRF is missing. A data dictionary is a very good starting point for analyzing our data. This paper presents a macro using PROC CONTENTS for creating a data dictionary that builds knowledge of the project and can often reduce repetitive questions to the lead programmer.
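The core of such a macro can be quite small; PROC CONTENTS with OUT= already produces one row per variable across a whole library (a sketch; the library name is invented):

```sas
/* A minimal data dictionary: one row per variable, with name,
   type, length, format, and label, for every dataset in a library */
proc contents data=sdtm._all_ noprint
              out=dict(keep=memname name type length format label);
run;

proc sort data=dict;
   by memname name;
run;

proc print data=dict noobs;
run;
```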
QT45 : A Practical Approach to Create Adverse Event Summary by Toxicity Grade Table
Zhengxin (Cindy) Yang, inVentiv Health, Clinical
Tuesday, 4:15 PM - 4:25 PM, Location: Oceans Ballroom 9
The adverse event summary by toxicity grade table is a common requirement in clinical safety studies. The table displays the severity grade of organ toxicity per CTCAE (Common Terminology Criteria for Adverse Events) as sub-categories in AE summary tables. These tables can be difficult to create because of the added sub-categories of organ toxicity grades, and variation in grade presentation across study requirements adds further complexity to the table programming. This paper presents a practical approach for generating AE grade tables using modular macro calls. It uses modules to address various formatting issues with AE grade tables, and a set of guidelines is presented to simplify the programming process.
QT47 : Customizing the Graph Templates for a Kaplan-Meier Failure Plot
Hugh Geary, Novella Clinical, a Quintiles company
Tuesday, 4:30 PM - 4:40 PM, Location: Oceans Ballroom 9
Although PROC LIFETEST in SAS/STAT® 13.1 is generally used to generate Kaplan-Meier survival plots, it can also generate 'failure' plots. Survival here denotes the time to an event, such as death, and the plot is descending because subjects are removed from the count as they reach the event. In a failure plot, subjects are added to the count as they reach the event - they have "failed" in the sense of failing to avoid the event - so the failure plot is an ascending curve. The failure plot is also used to display 'Time to Recovery', which is a more positive frame of reference. For the survival plot we have the option of modifying templates directly or using macros; for the failure plot, however, we must modify the failure plot templates directly. This paper explores modifying the failure templates. The code has been written using Base SAS in SAS 9.3 and should be usable by anyone familiar with PROC LIFETEST.
Statistics & Pharmacokinetics
SP01 : Multilevel Randomization
Marina Komaroff, Noven Pharmaceuticals
Lois Lynn, Noven Pharmaceuticals, Inc.
Monday, 10:15 AM - 10:35 AM, Location: Oceans Ballroom 11
Randomization in clinical trials is essential for the success and validity of a study. PROC PLAN is an important SAS® procedure that generates randomization schedules for a variety of experimental designs. This procedure was developed for the major types of randomization - simple, block, and stratified randomization, where the latter controls and balances the influence of covariates. In addition to the SAS® documentation, multiple papers have been written explaining how to adapt and enhance the procedure with DATA steps and/or PROC FORMAT. Clinical research in transdermal medicine introduces a situation where multilevel randomization is required for levels like treatment, location (arm, thigh, back, etc.), and side (left, right, upper, center, etc.) of a patch application, while retaining balance at each level and combination of levels. Schedules get especially complicated for cross-over studies, where the location and side of patch application need to be rotated by period and balanced as well. To the authors' knowledge, there are no published papers that accommodate these requirements. This paper introduces a novel concept of multilevel randomization, provides SAS code utilizing PROC PLAN, and gives a few examples of increasing complexity to generate balanced multilevel randomization schedules. The authors are convinced that this paper will be useful to SAS-friendly researchers conducting similar studies that require multilevel randomization.
SP02 : MMRM: Macro for Selecting Best Covariance Structure with the Method of Interest
Linga Reddy Baddam, Inventiv Health Clinical
Sudarshan Reddy Shabadu, inVentiv Health Clinical
Chunxue Shi, inVentiv Health Clinical
Monday, 10:45 AM - 11:05 AM, Location: Oceans Ballroom 11
In clinical trial analysis of longitudinal continuous data, the Mixed Model Repeated Measures (MMRM) approach is often used for continuous endpoints when a dependent variable is collected multiple times. It is usually up to the statistician to specify the criterion for identifying the best covariance structure for the chosen model among all candidate covariance structures, arranged in a particular order of interest or as specified in the Statistical Analysis Plan and/or Protocol. To achieve this in SAS, clinical programmers have to develop a program that dynamically checks each covariance structure in that order and produces the estimates and main-effects information based on the best covariance structure selected for the given model. The features of the %MMRM macro are discussed in detail in this paper.
SP03 : Missing data for repeated measures: single imputation vs multiple imputation and their implications on statistical significance.
Giulia Tonini, Menarini Ricerche
Simona Scartoni, Menarini Ricerche
Angela Capriati, Menarini Ricerche
Andrea Nizzardo, Menarini Ricerche
Camilla Paoli, Menarini Ricerche
Monday, 11:15 AM - 11:35 AM, Location: Oceans Ballroom 11
Missing data is often a major issue in clinical trials, especially when the outcome variables come from repeated assessments. Single imputation methods are widely used; in particular, when data collection is interrupted at a certain time point, Last Observation Carried Forward (LOCF) is usually applied. Regulatory agencies advise using the most conservative approach to impute missing data. As a drawback, single imputation methods do not take into account imputation variability. In this work we compare single imputation versus multiple imputation methods in order to verify the effect on the subsequent inferential analysis, especially in terms of the statistical significance of results. In particular, we verify whether a more conservative single imputation method is also conservative in terms of statistical significance relative to multiple imputation, where the higher variability can reduce the probability of obtaining a significant result. We simulated a dataset representing a clinical trial testing the analgesic efficacy of a combination of drugs on moderate to severe pain after surgery. Pain is measured using a VAS scale. Analysis of covariance is applied to the primary efficacy variable, which is VAS change from baseline. Both methods for handling missing data are applied; multiple imputation in SAS uses PROC MI. We finally present the statistically significant results. Analyzing results from several simulated datasets, we found that multiple imputation consistently reduces the probability of finding statistical significance.
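The multiple-imputation arm of such a comparison typically follows the PROC MI / analyze-by-imputation / PROC MIANALYZE pattern. The sketch below uses invented dataset and variable names (PAIN, TRT01, BASELINE, VAS_CHG) and a 0/1 treatment flag so that PROC REG can stand in for the covariate-adjusted model:

```sas
/* Step 1: generate 5 imputed copies of the data */
proc mi data=pain seed=54321 nimpute=5 out=mi_pain;
   var trt01 baseline vas_chg;
run;

/* Step 2: fit the covariate-adjusted model within each imputation */
proc reg data=mi_pain outest=est covout noprint;
   model vas_chg = trt01 baseline;
   by _imputation_;
run;

/* Step 3: combine the per-imputation estimates with Rubin's rules */
proc mianalyze data=est;
   modeleffects trt01 baseline;
run;
```

The combined standard errors from PROC MIANALYZE carry the extra between-imputation variability that the abstract contrasts with single imputation.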
SP04 : Means Comparisons and No Hard Coding of Your Coefficient Vector - It Really Is Possible!
Frank Tedesco, United Biosource Corporation
Monday, 1:15 PM - 2:05 PM, Location: Oceans Ballroom 11
When doing mean comparisons using a linear vector of coefficients, we typically need to modify our SAS PROC statements to account for different study designs, for instance when doing a submission (Phases I, II, III). What if you didn't need to hard code your linear vector in every program specific to the analysis? The number of linear coefficients in the vector is dependent on the number of treatment levels and cohort(s) being analyzed; in other words, the levels of your independent variable(s) will change with each study design. This paper will use pharmacokinetic data as an example, with the added difficulty of analyzing drug metabolites, which often have parameter values with measurements below quantifiable limits. Often this causes treatment levels and cohorts to drop from the analysis. The goal of this paper is to demonstrate that the use of the macro facility can provide SAS program stability and reduce SAS program maintenance, ultimately resulting in time and cost savings across studies.
SP05 : Using ANCOVA to assess Regression to the Mean
Kathryn Schurr, Spectrum Health
Monday, 2:45 PM - 3:05 PM, Location: Oceans Ballroom 11
Regression to the mean (RTM) is a statistical phenomenon in which extreme measurements tend to be followed by values closer to the average, making results appear statistically significant. There are many documented occurrences of this effect throughout history, and it can lead researchers to potentially erroneous conclusions: a change appears to have occurred in a population or sample, but in fact the change is, in part, due to random occurrence. By implementing SAS and PROC GLM, users can conduct an ANCOVA to determine whether or not RTM is present. This will help prevent users from making presumptuous statements regarding observed change within their study.
SP06 : Confidence Intervals Are a Programmer's Friend
Xinxin Guo, Quintiles
Monday, 2:15 PM - 2:35 PM, Location: Oceans Ballroom 11
A confidence interval (CI) is a type of interval estimate of a population parameter and is one of the most common terms statistical programmers face in everyday practice. This paper will present a collection of SAS code to calculate the CIs of a proportion obtained by PROC FREQ (special handling for a category with zero count is also considered), based on the assumption that the proportion follows a binomial distribution, and the CI of an incidence rate obtained by a handy formula, based on the assumption that the number of events occurring in a fixed interval of time follows a Poisson distribution. The paper will also look at PROC GENMOD and PROC PLM and compare the results from these two procedures against the code given.
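For the binomial piece, the BINOMIAL option of PROC FREQ reports asymptotic (Wald) and exact Clopper-Pearson limits; the dataset and response values below are invented for illustration:

```sas
/* Invented example: CI for the proportion of responders (AVALC = 'Y') */
proc freq data=adrs;
   tables avalc / binomial(level='Y') alpha=0.05;
run;
```

For the zero-count case the abstract mentions, one common workaround is to add a dummy record for the empty level with a zero weight so the level still appears in the table.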
SP07 : Growing Needs in Drug Industry for NONMEM Programmers Using SAS®
Sharmeen Reza, Cytel Inc.
Monday, 4:00 PM - 4:20 PM, Location: Oceans Ballroom 11
Nonlinear Mixed Effects Modeling (NONMEM) is a type of population pharmacokinetics/pharmacodynamics (pop-PK/PD) analysis used in clinical pharmacology research. At different stages of drug development, the population PK approach, coupled with PD modeling, allows integrated analysis, interpretation, and prediction about drug safety, efficacy, dose-concentration relationship, and dosing strategy. Analysis reports help regulatory agencies evaluate new drug submissions, review the safety/effectiveness of a drug, and guide drug labeling. Pharmacologists work with NONMEM® software, which requires pop-PK/PD data in text (ASCII/CSV) format, whereas agencies request that data be submitted in SAS transport files. SAS is used to create, maintain, update, and recreate the data sets required for modeling purposes, and facilitates the creation of the regulatory format as well as the text file. Utilizing SAS also assists in maintaining data integrity, handling large data, and tracking data manipulation and derivation through the log. The use of SAS for the creation of pop-PK/PD analysis data sets that are consumed by NONMEM® software has led to a greater demand for specialized 'NONMEM' programmers. These professionals are equipped to pool data from multiple clinical studies, manipulate data coming in diverse formats, and combine and validate records into a single SAS output data set. This paper explains a NONMEM data set structure, some core variables, group interaction, and a programmer's tasks and challenges.
SP08 : How Latent Analyses Within Survey Data Can Be Valuable Additions To Any Regression Model
Deanna Schreiber-Gregory, National University
Monday, 4:30 PM - 4:50 PM, Location: Oceans Ballroom 11
The current study looks at several ways to investigate latent variables in longitudinal surveys and their use in regression models. Three different analyses for latent variable discovery will be briefly reviewed and explored. The procedures explored in this paper are PROC LCA, PROC LTA, PROC CATMOD, PROC FACTOR, PROC TRAJ, and PROC SURVEYLOGISTIC. The analyses defined through these procedures are latent profile analyses, latent class analyses, and latent transition analyses. The latent variables will then be included in three separate regression models. The effect of the latent variables on the fit and use of the regression model, compared to a similar model using observed data, will be briefly reviewed. The data used for this study were obtained via the National Longitudinal Study of Adolescent Health, a study distributed and collected by Add Health. Data were analyzed using SAS 9.3. This paper is intended for any level of SAS user, and is written for an audience with a background in behavioral science and/or statistics.
SP09-SAS : Current Methods in Survival Analysis Using SAS/STAT® Software
Changbin Guo, SAS
Monday, 8:00 AM - 9:50 AM, Location: Oceans Ballroom 11
Interval censoring occurs in clinical trials and medical studies when patients are assessed only periodically. As a result, an event is known to have occurred only within two assessment times. Traditional survival analysis methods for right-censored data are not applicable, and so specialized methods are needed for interval-censored data. This tutorial describes these techniques and their recent implementation in SAS software, both for estimation and comparison of survival functions as well as for proportional hazards regression. Competing risks arise in studies when individuals are subject to a number of potential failure events and the occurrence of one event may impede the occurrence of other events. A useful quantity in competing-risks analysis is the cumulative incidence function, which is the probability sub-distribution function of failure from a specific cause. This tutorial describes how to compute the nonparametric estimate of the cumulative incidence function and discusses a SAS macro which implements it and provides tests for group comparison. In addition, this tutorial describes two approaches that are available with the PHREG procedure for evaluating the relationship of covariates to the cause-specific failure. The first approach models the cause-specific hazard, and the second approach models the cumulative incidence (Fine and Gray 1999).
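As of SAS/STAT 13.1, the Fine and Gray (1999) model referenced above is available directly in PROC PHREG through the EVENTCODE= option; the dataset and variable names here are illustrative only:

```sas
/* Illustrative names: T = follow-up time, STATUS = 0 (censored),
   1 = event of interest, 2 = competing event */
proc phreg data=bmt;
   class group;
   model t*status(0) = group / eventcode=1;  /* Fine-Gray subdistribution model */
run;
```

Omitting EVENTCODE= and instead censoring the competing events gives the cause-specific hazard model, the first of the two approaches the tutorial describes.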
SP10 : %PIC_NPMLE: A SAS Macro For Nonparametric Estimation In Partly Interval-Censored Survival Data
Liang Zhu, St. Jude Children's Research Hospital
Yimei Li, St. Jude Children's Research Hospital
Qinlei Huang, St Jude Children's Research Hospital
Monday, 3:30 PM - 3:50 PM, Location: Oceans Ballroom 11
Partly interval-censored (PIC) data arise frequently in follow-up studies. An example of such data is provided by the St. Jude Lifetime Cohort study (SJLIFE), which follows childhood cancer survivors for late effects such as growth hormone deficiency (GHD). For some patients, the exact time of the first occurrence of GHD is reported, but for others the exact time is unknown and is recorded as occurring between two clinic visits. PIC data is an important topic in medical research; however, no statistical software is available for analyzing it. In this paper, we provide two SAS® macros to calculate the nonparametric maximum-likelihood estimator (NPMLE) of the survival function for PIC data. We estimate the NPMLE by using an iterative convex minorant (ICM) algorithm and an EM iterative convex minorant (EM-ICM) algorithm based on two different likelihood functions. Our simulation studies showed that both of the proposed algorithms provide consistent estimates of survival functions, and that the %PIC_NPMLE macro using the EM-ICM algorithm computes much faster than the other macro. Finally, we illustrate how to use the %PIC_NPMLE macro by applying it to GHD data from the SJLIFE study mentioned above.
Submission Standards
SS01 : Getting Loopy with SAS® DICTIONARY Tables: Using Metadata from DICTIONARY Tables to Fulfill Submission Requirements
Nina Worden, Santen
Tuesday, 9:00 AM - 9:50 AM, Location: Oceans Ballroom 11
Standardizing data allows FDA reviewers to quickly understand the contents; however, this is only the case when the standards are followed. Lapses in compliance can create delays in the review process or, worse, give the reviewers concerns about the quality of the data. The DICTIONARY tables that are accessible within Base SAS® provide data about your data that can be utilized to check, or create, compliant datasets. This paper will provide examples of requirements and how, with a little looping, issues can be identified and requirements can be met.
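A small sketch of the idea (the library name and length threshold are invented for illustration): querying DICTIONARY.COLUMNS can flag, for example, character variables whose declared lengths look suspiciously large for a submission:

```sas
/* List character variables declared longer than 200 bytes
   in the (assumed) SDTM library */
proc sql;
   select memname, name, length
   from dictionary.columns
   where libname = 'SDTM'
     and type    = 'char'
     and length  > 200;
quit;
```

The same tables expose dataset names, labels, and variable attributes, so similar queries can drive the looping checks the paper describes.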
SS02 : Japanese submission/approval processes from programming perspective
Ryan Hara, Novartis Pharma AG
Tuesday, 10:15 AM - 11:05 AM, Location: Oceans Ballroom 11
Japan is one of the world's biggest pharmaceutical markets and, as such, development and approval of new drugs in Japan is one of the top priorities for pharmaceutical companies. The intent of this paper is to present Japan-specific submission requirements and the review/approval process of the Japanese health authority PMDA (Pharmaceuticals and Medical Devices Agency) to a non-Japanese audience, especially programmers. Thus, the main focus of this paper is the specific programming requirements for a submission/approval in Japan that may differ from global processes. This knowledge may help global teams to better understand the specifics of the Japanese pharmaceutical market and, in turn, strengthen the collaboration of global groups with Japanese teams and colleagues.
SS04 : Begin with the End in Mind - Using FDA Guidance Documents as Guideposts when Planning, Delivering and Archiving Clinical Trials
David Izard, Accenture
Tuesday, 2:15 PM - 3:05 PM, Location: Oceans Ballroom 11
Clinical professionals, including statisticians and programmers, work tirelessly contributing to protocol & SAP development, CRF & clinical database design, analysis dataset design & implementation, and the generation of tables, figures and listings in support of clinical trials. If the compound under study successfully navigates clinical and regulatory hurdles, most of these items will make their way into an FDA submission. Historically there have been many sources of information on how to organize and present these assets for FDA consideration, many of which seemingly contradict each other, making the requirements for regulatory submission cloudy at best. The FDA has recently changed this landscape, releasing final versions of three guidance documents that cover the requirement to file applications electronically, the use and provision of standardized study data as part of submissions, and clearer, more comprehensive technical considerations for the inclusion of both clinical & non-clinical data in regulatory filings. This paper will examine the assets that are created during the execution of a clinical trial and hold them up against the submission requirements dictated in the recently finalized "Standardized Study Data" and "Study Data Technical Conformance Guide" guidance documents. I will examine the paradigm shift that has occurred at the FDA, moving from a reactive to a proactive submission planning and review environment, and the impact that has, not just on preparing your clinical trial assets for submission, but on the steps you must take during planning, execution and archiving of your trial to ensure you meet the FDA's expectations come submission time.
SS05 : OSI Packages: What You Need to Know for Your Next NDA or BLA Submission
Thaddea Dreyer, AstraZeneca
Tatiana Scetinina, Astra Zeneca
Tuesday, 3:30 PM - 4:20 PM, Location: Oceans Ballroom 11
The FDA's Office of Scientific Investigations (OSI) has responsibility to select investigator sites for inspection as part of the FDA's Biomedical Research Initiative (BIMO) and needs each sponsor's help in order to do it quickly and efficiently. Sponsors need to provide a variety of site-level information so the OSI can identify sites of interest - for example, high enrolling sites, sites whose data may be swaying the results, sites with prior inspection issues, etc. In late 2012, the FDA released new OSI materials (guidance for industry, specifications, and a webinar) that describe what sponsors need to include in their NDA/BLA submission to support the OSI's efforts. This paper will provide an overview of the released OSI package components (Parts I, II, and III) that should go into a submission, and it will describe how AstraZeneca prepared an OSI package for a recent NDA submission. The emphasis will be on programming deliverables with attention towards Part II (subject level data listings by site) and Part III (summary level clinical site dataset). There will be examples of Q&A with the OSI division, timings for delivery, the need for cross-functional input, and some lessons learned. After reading this paper, you'll be better prepared to put together an OSI package for your submission.
SS06 : The Most Common Issues in Submission Data
Sergiy Sirichenko, Pinnacle 21
Max Kanevsky, Pinnacle 21 LLC
Tuesday, 4:30 PM - 4:50 PM, Location: Oceans Ballroom 11
On December 17th, the FDA made its long-awaited announcement that future submissions will be required in standardized format. The FDA published the technical requirements for standardized submission data in a new binding guidance and supporting documents, including the Data Standards Catalog and Study Data Technical Conformance Guide. They also encouraged sponsors to communicate with the review divisions on study-specific data questions. Even though the guidance is new, most sponsors have already migrated to standardized submissions, with the level of compliance rising rapidly. This has enabled the FDA to improve the efficiency of the review process by developing automated review and analysis tools, which have been operationalized by the FDA JumpStart service. This presentation will share our experience of the most common data quality issues we observed during JumpStart across many regulatory submissions. We also provide recommendations on how to ensure high-quality submission data by evaluating the risk or potential impact of each issue and how each can be corrected.
SS08-SAS : Getting Rid of Bloated Data in FDA Submissions
Ben Bocchicchio, SAS
Frank Roediger, SAS
Tuesday, 8:00 AM - 8:50 AM, Location: Oceans Ballroom 11
As FDA submissions have become more automated, the amount of data that the FDA has to store has grown rapidly. One of the FDA's attempts to manage this growing mountain of data is to issue Technical Specifications Documents that stipulate rules that submission data need to follow. Two of these rules are designed to reduce data bloating: 1) Make character variables just wide enough to store the longest value they need to accommodate; and 2) Do not re-submit data sets that have not changed. These rules can cause an increase in the work required to prepare a submission package for the FDA, and it is tempting to conclude that the solution to the problem should be more storage, not more rules: "Storage is cheap - just buy more." But storage is just one component of the overall problem. For example, a submission's data is initially loaded into the EDR (Electronic Data Room), but the review process sends copies of the data to numerous destinations within the FDA's internal networks. When you consider that security software needs to screen all network traffic, it is apparent that the solution for bloated submission data is not more storage. This presentation will present some utility processes that can help you avoid including bloated data in your FDA submissions.
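One hedged sketch of rule 1 (the library, dataset, and variable names are invented for illustration): measure the longest value actually stored, then re-declare the variable at exactly that length:

```sas
/* Find the longest stored value of USUBJID ... */
proc sql noprint;
   select max(length(usubjid)) into :maxlen trimmed
   from sdtm.dm;
quit;

/* ... then shrink the variable to that length: a LENGTH statement
   placed before SET resets the declared width */
data dm_slim;
   length usubjid $&maxlen;
   set sdtm.dm;
run;
```

The utility processes the presentation covers would apply this kind of check systematically across every character variable in a submission package.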
SS09-SAS : SAS Tools for Working with Dataset-XML files
Lex Jansen, SAS
Tuesday, 1:15 PM - 2:05 PM, Location: Oceans Ballroom 11
Dataset-XML defines a standard format for transporting tabular data in XML between any two entities, based on the CDISC ODM XML format. That is, in addition to supporting the transport of data sets as part of a submission to the FDA, it may also be used to facilitate other data interchange use cases. For example, the Dataset-XML data format can be used by a CRO to transmit SDTM or ADaM data sets to a sponsor organization. Dataset-XML supports SDTM, ADaM, and SEND CDISC data sets but can also be used to exchange any other type of tabular data set. The metadata for a data set contained within a CDISC Dataset-XML document must be specified using the CDISC Define-XML standard. Each CDISC Dataset-XML file contains data for a single data set, but a single CDISC Define-XML file describes all the data sets included in the folder. Both CDISC Define-XML v1.0 and CDISC Define-XML v2.0 are supported for use with CDISC Dataset-XML. This presentation will introduce the Dataset-XML standard and present SAS-based tools to transform between SAS data sets and Dataset-XML documents, including validation.
SS10-SAS : Using SAS Clinical Data Integration to Roundtrip a Complete Study: Study Metadata (Define-XML) and Study Data (Dataset-XML)
Ken Ellis, SAS
Wednesday, 9:00 AM - 9:50 AM, Location: Oceans Ballroom 11
SAS® Clinical Data Integration 2.6 now supports the complete CDISC Define-XML 2.0 specification for both the import and the creation of define.xml files. Support for value-level metadata and supplemental documents, as well as for the new CDISC Dataset-XML specification, has been added, giving the user a powerful means by which to import and create complete study definitions. Topics covered include: import of CDISC Define-XML file metadata, including domains, value-level metadata, supplemental documents, computational algorithms and code lists; import of accompanying domain data from CDISC Dataset-XML files; creation of a complete CDISC Define-XML file including all imported metadata (roundtrip); and creation of CDISC Dataset-XML files from the imported domain tables (roundtrip).
Techniques & Tutorials
TT01 : Team-work and Forensic Programming: Essential Foundations of Indestructible Projects
Brian Fairfield-Carter, inVentiv Health Clinical
Tuesday, 10:15 AM - 11:05 AM, Location: Oceans Ballroom 10
Everyone probably agrees that project success hinges on teamwork and communication. The concept of teamwork often emphasizes leadership and organizational structure over individual responsibility, but the scale and complexity of analysis programming projects mean that effective teamwork demands individual autonomy and initiative and should ideally involve a cooperative effort among peers. And despite generally recognizing the importance of communication, we rarely see pragmatic suggestions on communication techniques. With autonomy comes responsibility: while 'exercising initiative' means that we do things as we recognize a need, we also have to try and anticipate the impact of our actions (choice of programming constructs & style, methods of program organization, etc.) on other team-members. We have to avoid 'cleverness' for its own sake, and instead focus on producing code that will be intelligible to anyone who has to maintain or adapt it. Effective teamwork requires effective communication. To communicate in a technical setting, we need to gather and present evidence, evaluate premises, and draw conclusions via logical deduction; in other words, we need to treat communication as a forensic exercise. Illustrating ideas with 'forensic evidence' is a powerful way of identifying ambiguity and differences in interpretation, and should therefore be the backbone of any technical discussion. This paper addresses the practical question of how to program as part of a team, discussing programming traits that contribute to rather than detract from a team effort, and proposes 'forensic programming' as a vital technique for communicating between team members.
TT02 : PHANTOM OF THE ODS - How to run cascading compute blocks off of common variables in the data set for complex tasks.
Robin Sandlin, St Jude Children's Research Hospital in Memphis
Tuesday, 11:15 AM - 11:35 AM, Location: Oceans Ballroom 10
This paper teaches a SAS®-based Stored Process formatting technique that can be used stand-alone or combined with other style techniques. The methodology is a prototype framework for SAS® programmers who may not be able to achieve similar results using more sophisticated techniques (such as inline "^S" styling). The paper gives examples and complete code, leveraging users' knowledge in two different areas: the DATA area, using basic PROC SQL techniques, and the REPORT area, using the compute block framework (albeit multiple blocks), in what is called "Phantom of the ODS." Because of the way compute blocks resolve, the execution of a block prior to its application effectively allows only one style statement per block without resorting to advanced techniques such as inline styles. This effect, combined with the one-compute-block-per-variable rule, normally results in one style statement per column or grouping variable. The technique introduces "phantom data" in the PROC SQL code, cloned from existing data. The phantom variables are then listed in the PROC REPORT COLUMN statement but set as non-printing in the DEFINE statement: they are not visible to the report user, yet they remain available for multiple compute-block styling, which is visible, one block for each phantom variable (no limit, actually). The one-compute-block-per-variable design paradigm is thus shifted. Anyone with basic SQL knowledge and general knowledge of single compute-block processing should find this technique quickly and effectively implementable.
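A toy version of the pattern, using SASHELP.CLASS as stand-in data (the styling rule is invented, and the real paper chains several such phantom blocks):

```sas
/* Clone AGE as a phantom variable in PROC SQL */
proc sql;
   create table rpt as
   select name, age, age as age_ph
   from sashelp.class;
quit;

proc report data=rpt;
   column name age age_ph;
   define age_ph / noprint;          /* phantom: hidden from the reader */
   compute age_ph;
      /* the phantom's compute block styles a *visible* column */
      if age_ph > 14 then
         call define('age', 'style', 'style=[background=lightyellow]');
   endcomp;
run;
```

Each additional phantom clone contributes one more compute block, which is how the technique sidesteps the one-block-per-variable limit.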
TT03 : PROC SQL: Make it a monster using these powerful functions and options
Arun Raj Vidhyadharan, inVentiv Health Clinical
Sunil Jairath, inVentiv Health Clinical
Monday, 11:15 AM - 11:35 AM, Location: Oceans Ballroom 10
PROC SQL is indeed a powerful tool in SAS. However, we can make it even more powerful by using certain PROC SQL functions and options. This paper explores such functions and options that take PROC SQL to the next level.
TT04 : SAS® Programming Tips, Tricks and Techniques for Programmers
Kirk Paul Lafler, Software Intelligence Corporation
Monday, 9:00 AM - 9:50 AM, Location: Oceans Ballroom 10
Explore a collection of proven tips, tricks and techniques offered in the base-SAS® Software. Attendees learn system options to aid in improved productivity; SQL procedure and table options to influence the optimizer to use specific join algorithms; accessing variable attributes in a table with a single statement; using table integrity constraints to prevent data integrity issues; and using user-defined tools with SAS meta-data Dictionary tables and SASHELP views.
TT05 : DATA Step Merging Techniques: From Basic to Innovative
Art Carpenter, CA Occidental Consultants
Monday, 1:15 PM - 2:05 PM, Location: Oceans Ballroom 10
Merging or joining data sets is an integral part of the data consolidation process. Within SAS® there are numerous methods and techniques that can be used to combine two or more data sets. We commonly think that within the DATA step the MERGE statement is the only way to join these data sets, while in fact the MERGE is only one of numerous techniques available to us to perform this process. Each of these techniques has advantages, and some have disadvantages. The informed programmer needs to have a grasp of each of these techniques if the correct technique is to be applied. This paper covers basic merging concepts and options within the DATA step, as well as a number of techniques that go beyond the traditional MERGE statement. These include fuzzy merges, double SET statements, and the use of key indexing. The discussion will include the relative efficiencies of these techniques, especially when working with large data sets.
TT06 : Using Arrays to Quickly Perform Fuzzy Merge Look-ups: Case Studies in Efficiency
Art Carpenter, CA Occidental Consultants
Monday, 2:15 PM - 3:05 PM, Location: Oceans Ballroom 10
Merging two data sets when a primary key is not available can be difficult. The MERGE statement cannot be used when BY values do not align, and data set expansion to force BY value alignment can be resource intensive. The use of DATA step arrays, as well as other techniques such as hash tables, can greatly simplify the code, reduce or eliminate the need to sort the data, and significantly improve performance. This paper walks through two different types of examples where these techniques were successfully employed. The advantages will be discussed as will the syntax and techniques that were applied. The discussion will allow the reader to further extrapolate to other applications.
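One of the alternatives the paper covers can be sketched with a hash-object lookup, which needs no sorting of either table (the dataset, key, and variable names here are invented):

```sas
/* Invented names: MAIN (driver data), LOOKUP (id -> label reference table) */
data matched;
   if _n_ = 1 then do;
      declare hash h(dataset:'lookup');   /* load reference table once */
      h.defineKey('id');
      h.defineData('label');
      h.defineDone();
   end;
   set main;
   length label $40;
   call missing(label);
   rc = h.find();   /* 0 when id is found; label is then filled in */
run;
```

An array-based lookup follows the same shape, indexing a temporary array by the key value instead of calling the hash object.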
TT07 : Inside the DATA Step: Pearls of Wisdom for the Novice SAS Programmer
Josh Horstman, Nested Loop Consulting
Britney Gilbert, Juniper Tree Consulting, LLC
Monday, 3:30 PM - 4:20 PM, Location: Oceans Ballroom 10
Why did my merge fail? How did that variable get truncated? Why am I getting unexpected results? Understanding how the DATA step actually works is the key to answering these and many other questions. In this paper, two independent consultants with a combined three decades of SAS programming experience share a treasure trove of knowledge aimed at helping the novice SAS programmer take their game to the next level by peering behind the scenes of the DATA step. We'll touch on a variety of topics including compilation vs. execution, the program data vector, and proper merging techniques, with a focus on good programming practices that will help the programmer steer clear of common pitfalls.
TT08 : A Collection of Items From a Programmers' Notebook
David Franklin, Quintiles Real World Late Phase
Cecilia Mauldin, Independent Statistical Programmer
Monday, 4:30 PM - 5:20 PM, Location: Oceans Ballroom 10
Every programmer should have a notebook with notes and pieces of code that are of interest to them. Some will be categorized as personal, notes only of interest to them, but most are of interest to a broader range of SAS® programmers. This paper brings together ten of the most useful tips from the notebooks of two SAS programmers with nearly fifty years of combined experience, tips that will be useful to beginner and experienced SAS programmer alike. Topics include advice for using the SAS Editor; combining SAS datasets; reviewing SAS logs; creating Excel files from SAS datasets and SAS datasets from Excel files; creating editor macros in Enterprise Guide; splitting the information from &sysinfo; keeping only the character, numeric, or printable characters in a string of text; creating empty data sets; and that little bit of code that calculates the first, middle, or end date of a month. Bring along your notebook and take away an idea or two.
TT09 : Defensive Coding By Example: Kick the Tires, Pump the Brakes, Check Your Blind Spots and Merge Ahead!
Nancy Brucken, inVentiv Health Clinical
Donna Levy, inVentiv Health Clinical
Tuesday, 8:00 AM - 8:50 AM, Location: Oceans Ballroom 10
As SAS® programmers and statisticians, we rarely write programs that are run only once and then set aside. Instead, we are often asked to develop programs very early in a project, on immature data, following specifications that may be little more than a guess as to what the data is supposed to look like. These programs will then be run repeatedly on periodically updated data through the duration of the project. This paper offers strategies not only for making those programs more flexible, so they can handle some of the more commonly encountered variations in the data, but also for setting traps to identify unexpected data points requiring further investigation. We will also touch upon some good programming practices that can benefit both the original programmer and others who may have to touch the code. In this paper, we will provide explicit examples of defensive coding that will aid in kicking the tires, pumping the brakes, checking your blind spots and merging ahead for quality programming from the beginning.
TT10 : Essential Guide to Good Programming Practice
Shafi Chowdhury, Shafi Consultancy Limited
Mark Foxwell, PRA Health Sciences
Cindy Song, Sanofi
Tuesday, 3:30 PM - 4:20 PM, Location: Oceans Ballroom 10
Due to the increasing needs of the pharmaceutical industry and the success of SAS, an ever-growing number of programmers are joining the industry all over the world and inevitably moving from one organization to another, not to mention code that is shared across companies and with the regulatory agencies. Therefore, to ensure continuity and allow code to be maintained over time, it is essential for organizations that a programming standard be followed. If there is an industry-standard GPP guideline that is widely recognized and adopted, then organizations can be sure that when new programmers join, they can both use existing code with ease and develop new code that others already in the organization can easily follow. In addition, of course, their existing programmers can share code easily with each other. The PhUSE GPP steering board has developed a GPP guideline that has been available for more than a year. It highlights the main standards that should be followed and what should be avoided. The steering board consists of very experienced programmers from large and small CROs and pharmaceutical organizations, so they have seen what works and what causes problems. These few easy-to-follow standards offer the industry a simple way to ensure we continue to develop code that can be easily maintained and shared over time, saving time and cost. The new guideline can either be treated as fundamental principles to be extended by company-specific rules or simply be used as is.
TT11 : Lessons Learned from the QC Process in Outsourcing Model
Faye Yeh, Takeda Development America
Tuesday, 1:45 PM - 2:35 PM, Location: Oceans Ballroom 10
As more and more companies in the pharmaceutical industry (sponsors) adopt an outsourcing model for clinical studies, improving the quality control (QC) process has become one of the main discussions between sponsors and their vendors, in most cases the clinical research organizations (CROs). This presentation discusses questions often asked by statistical programmers:
- How to improve the communication between sponsors and CROs?
- How to improve the clarity of data and TLF (tables, listings, and figures) specification documents?
- How to manage the QC processes at both sponsor and CRO sites?
- What are efficient ways to verify CRO deliverables?
Having worked as a statistical programmer at both pharmaceutical companies and CROs for more than 20 years, the author will share experiences and lessons learned on these four topics. The paper presents suggestions on how to improve the quality control process in the outsourcing model. It does not intend to pass judgment on the quality of work produced or the processes currently used by sponsors or CROs.
TT12 : PROC TRANSPOSE® For Fun And Profit
John Cohen, Advanced Data Concepts, LLC
Tuesday, 9:00 AM - 9:20 AM, Location: Oceans Ballroom 10
Occasionally we are called upon to transform data from one format into a "flipped," sort-of mirror image: if the data were organized in rows and columns, we need to transpose those same data to be arranged instead in columns and rows. A perfectly reasonable view of incoming lab data, ATM transactions, or web "click" streams may look "wrong" to us. Alternatively, extracts from external databases and production systems may need massaging prior to proceeding in SAS®. Finally, certain SAS procedures may require a precise data structure; there may be particular requirements for data visualization and graphing (such as date or time being organized horizontally along the row rather than as values in a date/time variable); or the end user/customer may have specific deliverable requirements. Traditionalists prefer using the DATA step and combinations of ARRAY, RETAIN, and OUTPUT statements. This approach works well, but for simple applications it may require more effort than is necessary. For folks who intend to do much of the project work in, say, MS/Excel®, the resident transpose option when pasting data is a handy shortcut. However, if we want a simple, reliable method in SAS which, once understood, will require little ongoing validation with each new run, then PROC TRANSPOSE is a worthy candidate. We will step through a series of examples, elucidating some of the internal logic of this procedure and its options. We will also touch on some of the issues that cause folks to shy away and rely on other approaches.
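The rows-to-columns flip the abstract describes can be sketched with a small example (illustrative only; the dataset and names are hypothetical):

```sas
/* Sketch: turn one row per subject per visit into one row per subject
   with one column per visit, via PROC TRANSPOSE with BY and ID.       */
data labs;
  input subject $ visit $ value;
  datalines;
S01 V1 5.1
S01 V2 5.4
S02 V1 4.8
S02 V2 5.0
;
run;

proc transpose data=labs out=labs_wide(drop=_name_) prefix=val_;
  by subject;   /* one output row per subject        */
  id visit;     /* visit values become column names  */
  var value;
run;
/* LABS_WIDE now has one row per SUBJECT, with columns VAL_V1 and VAL_V2 */
```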
TT13 : Looking Beneath the Surface of Sorting
Andrew Kuligowski, HSN
Monday, 10:15 AM - 11:05 AM, Location: Oceans Ballroom 10
Many things that appear to be simple turn out to be a mask for various complexities. For example, as we all learned early in school, a simple drop of pond water reveals a complete and complex ecosystem when viewed under a microscope. A single snowflake contains a delicate crystalline pattern. Similarly, the decision to use data in a sorted order can conceal an unexpectedly involved series of processing and decisions. This presentation will examine multiple facets of the process of sorting data, starting with the most basic use of PROC SORT and progressing into options that can be used to extend its flexibility. It will progress to look at some potential uses of sorted data, and contrast them with alternatives that do not require sorted data. For example, we will compare the use of the BY statement vs. the CLASS statement in certain PROCs, as well as investigate alternatives to the MERGE statement to combine multiple datasets together.
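The BY-versus-CLASS contrast mentioned above can be sketched in a few lines (a minimal illustration using the SASHELP.CLASS sample dataset, not code from the paper):

```sas
/* BY requires the input to be sorted on the grouping variable... */
proc sort data=sashelp.class out=sorted;
  by sex;
run;

proc means data=sorted mean;
  by sex;          /* depends on the PROC SORT step above */
  var height;
run;

/* ...whereas CLASS produces the same grouped statistics without a sort. */
proc means data=sashelp.class mean;
  class sex;       /* no sorting required */
  var height;
run;
```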
TT14-SAS : Reusability, Macros, and Automation in SAS Clinical Data Integration
Melissa R. Martinez, SAS
Tuesday, 1:15 PM - 1:35 PM, Location: Oceans Ballroom 10
SAS Clinical Data Integration is a graphical user interface based programming environment that provides built-in transformations for the integration and manipulation of data. For those of us who have only ever used SAS by writing code directly, this can be a big change to our processes. Not only are we used to writing our own SAS code in our own style, we typically have developed processes that include tools such as macros, schedulers, and reusable programs. The great news is that many of these same techniques can (and should) be incorporated into a SAS Clinical Data Integration environment. This paper provides examples and instructions for using macros, creating reusable content, and making use of automation tools available in the Business Intelligence Platform on which SAS Clinical Data Integration is based.
TT15-SAS : The REPORT Procedure: A Primer for the Compute Block
Jane Eslinger, SAS
Monday, 8:00 AM - 8:50 AM, Location: Oceans Ballroom 10
It is well-known in the world of SAS® programming that the REPORT procedure is one of the best procedures for creating dynamic reports. However, you might not realize that the compute block is where all of the action takes place! Its flexibility enables you to customize your output. This paper is a primer for using a compute block. With a compute block, you can easily change values in your output with the proper assignment statement and add text with the LINE statement. With the CALL DEFINE statement, you can adjust style attributes such as color and formatting. Through examples, you learn how to apply these techniques for use with any style of output. Understanding how to use the compute-block functionality empowers you to move from creating a simple report to creating one that is more complex and informative, yet still easy to use.
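The CALL DEFINE and LINE statements described above can be sketched in a short compute block (an illustrative example on the SASHELP.CLASS sample dataset, not from the paper):

```sas
/* Sketch: a compute block that styles one column conditionally with
   CALL DEFINE and appends a text line with LINE.                     */
proc report data=sashelp.class nowd;
  column name age height;
  define age / display;

  compute age;
    /* highlight the AGE cell when the value exceeds 14 */
    if age > 14 then
      call define(_col_, 'style', 'style={background=lightyellow}');
  endcomp;

  compute after;
    line 'Heights are in inches.';  /* text added below the report */
  endcomp;
run;
```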
TT16 : PROC PRINT - the Granddaddy of all Procedures, Enhanced and Still Going Strong!
David Franklin, Quintiles Real World Late Phase
Tuesday, 9:30 AM - 9:50 AM, Location: Oceans Ballroom 10
The PRINT procedure, or PROC PRINT, has been around since SAS® first began and is considered one of the granddaddy procedures. Although this procedure has been replaced in part by the REPORT procedure, there is still a lot you can do with it. This paper first looks at a simple dump of data, then dresses it up with statements like BY and ID to produce publication-ready output. Next, output is cranked up a notch to demonstrate how PROC PRINT enhancements can be used to produce HTML (with graphics and links), RTF, and PDF (with bookmarks). Along the way, the paper will also touch on post-processing techniques to make your output come alive.
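The progression from a simple dump to destination-specific output can be sketched as follows (a minimal illustration on the SASHELP.CLASS sample dataset; file names are hypothetical and the paper's own examples go further):

```sas
/* BY-group printing needs sorted input */
proc sort data=sashelp.class out=class;
  by sex;
run;

/* Route the same PROC PRINT to HTML and PDF via ODS */
ods html file='class.html';
ods pdf  file='class.pdf';

proc print data=class noobs label;
  by sex;            /* one sub-report per BY group   */
  id name;           /* NAME replaces the OBS column  */
  var age height weight;
run;

ods pdf close;
ods html close;
```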