List of abstracts subject to change. Last updated 1-Sep-2016.
- Data Analysis, Statistics
- Data Visualization, Graphics and Reporting
- Management and Career Development
- Preparation and Regulatory Standards including CDISC
- Programming Techniques
Data Analysis, Statistics
What's the BIG Deal with Missing Data?
Mina Chen, Roche
Peter Eberhardt, Fernwood Consulting Group Inc
In clinical trials, a major problem in data analysis is missing values, caused by unrecorded measurements at planned visits or by patients dropping out of the study before completion. Missing values can have a surprising impact on how data is handled, which may bias treatment comparisons and affect the overall statistical results of the study. SAS can represent missing values in a number of ways; there are various functions, options and techniques associated with missing values, and each procedure has its own way of handling them. However, not all SAS programmers are aware of the system options, DATA step functions, and DATA step routines that specifically deal with missing values. This paper covers the basics of detecting missing values and shows how to make effective use of the functions and tools within SAS for working with them.
Hands-on Example of Matrix Operation in SAS with Different Approaches
Huan Cheng, Quintiles
We are not always lucky enough to find that the model used in a study is already implemented by a SAS procedure. Sometimes we need to translate a mathematical representation into executable SAS code. Most of the technical statistical literature uses matrix algebra, which is also a simple and efficient way to do the calculation. For matrix operations in SAS, the SAS/IML language is the first that comes to mind, but there are other ways to perform them. This paper first gives a brief overview of the IML and PROC FCMP approaches, with simple syntax and usage tips. Next, it uses the coefficient estimates of simple linear regression as an example to illustrate how to perform the operation with both IML and PROC FCMP. Finally, it briefly discusses the two approaches to help users choose the proper method.
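As a rough illustration of the regression example mentioned in this abstract, the sketch below solves the normal equations (X'X)b = X'y for simple linear regression. It is written in Python purely for illustration (the paper itself works in SAS/IML and PROC FCMP), and the function name `ols_simple` and the sample data are assumptions, not taken from the paper.

```python
# Simple linear regression y = b0 + b1*x via the normal equations (X'X) b = X'y.
# The design matrix is X = [1, x], so X'X is 2x2 and can be inverted by hand.
# ols_simple is a hypothetical helper name used only for this sketch.

def ols_simple(x, y):
    """Return (intercept, slope) obtained from the normal equations."""
    n = len(x)
    sx = sum(x)
    sxx = sum(v * v for v in x)
    sy = sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    # X'X = [[n, sx], [sx, sxx]] and X'y = [sy, sxy]
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are identical, so X'X is singular")
    # Apply the closed-form inverse of the 2x2 matrix X'X to X'y
    b0 = (sxx * sy - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    return b0, b1

# Made-up data lying exactly on the line y = 1 + 2x
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
b0, b1 = ols_simple(x, y)
```

With a matrix language such as SAS/IML the same computation is usually a single expression; the hand-written 2x2 inverse here only keeps the sketch free of external libraries.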
Unbiased Estimation after Bivariate Phase II Clinical Trial Designs Incorporating Toxicity and Response
Jie Hao, Kennesaw State University
Herman Ray, Kennesaw State University
The single arm, two-stage clinical trial design is a popular methodology to evaluate oncology treatments in the phase II setting. The designs are typically augmented with an ad hoc toxicity monitoring rule which is imposed outside of the formal two-stage design but there are several designs that formally incorporate both endpoints simultaneously. There are several problems with implementing these designs in practice that outweigh their benefits. One of the issues is appropriate point estimation after the conduct of the trial that accounts for the multiple examinations of the data and the bias caused by the efficacy boundaries. An unbiased estimator is proposed and examined through simulation studies. It is compared to the maximum likelihood estimator.
Three Approaches to Estimate Power by SAS
Xiang Liu, Eli Lilly
There are different ways to estimate power. This poster introduces three approaches to estimating power with SAS. The first, and most common, is to use a SAS procedure, PROC POWER. Second, when the power formula is known, translating the formula into SAS code is a good option. Third, when no SAS procedure supports the design and the formula is unknown, simulation is the best solution. Examples illustrate the three approaches.
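The second and third approaches can be sketched side by side. The example below is in Python rather than SAS and assumes a two-sided, two-sample z-test with known standard deviation; the function names, the effect size and the sample size are illustrative assumptions, not values from the poster.

```python
import random
from statistics import NormalDist

def power_formula(delta, sigma, n_per_arm, alpha=0.05):
    """Power of a two-sided two-sample z-test with known sigma (closed form)."""
    z = NormalDist()
    se = sigma * (2.0 / n_per_arm) ** 0.5      # standard error of the mean difference
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = delta / se
    # Probability that the test statistic falls in either rejection region
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

def power_simulation(delta, sigma, n_per_arm, alpha=0.05, n_sim=2000, seed=2016):
    """Estimate the same power by simulating trials and counting rejections."""
    rng = random.Random(seed)                   # fixed seed so the run is reproducible
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma * (2.0 / n_per_arm) ** 0.5
    rejections = 0
    for _ in range(n_sim):
        control = [rng.gauss(0.0, sigma) for _ in range(n_per_arm)]
        treated = [rng.gauss(delta, sigma) for _ in range(n_per_arm)]
        diff = sum(treated) / n_per_arm - sum(control) / n_per_arm
        if abs(diff / se) > z_crit:
            rejections += 1
    return rejections / n_sim

# Assumed scenario: effect size 0.5, sigma 1, 64 subjects per arm
p_formula = power_formula(0.5, 1.0, 64)
p_sim = power_simulation(0.5, 1.0, 64)
```

The two estimates should agree up to simulation error, which shrinks as `n_sim` grows. Simulation becomes the practical choice when, as the abstract notes, no closed-form expression is available.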
Using SAS to Calculate Statistical Properties of Traditional 3+3 Design
Tao Tan, Jiangsu Hengrui
The traditional 3+3 design is popular thanks to its simplicity in both modeling and execution. It is therefore widely used in Phase I dose-escalation studies to help identify the maximum tolerated dose (MTD) of a drug for a specific route of administration. It is also used to characterize the most frequent and dose-limiting toxicities, which in turn inform the design of Phase II studies. This paper shows how to use SAS® to compute and present statistical properties of the 3+3 design without de-escalation. These properties include the likelihood of a dose being chosen as the MTD, the expected sample size at each dose level, the expected dose-limiting toxicity (DLT) incidence at the MTD, and both the overall DLT incidence and the DLT incidence at each dose level. These statistical properties give investigators insight into the design. The paper also uses SAS to simulate such a trial.
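Two of the properties listed above can be computed exactly from the true DLT rates. The Python sketch below is not the paper's SAS code; it assumes one common form of the 3+3 rules without de-escalation: escalate after 0 DLTs in 3 patients or 1 DLT in 6, otherwise stop and declare the previous dose the MTD. The function names and the DLT rates are hypothetical.

```python
def escalation_prob(p):
    """P(escalating past a dose whose true DLT rate is p) under the assumed rules."""
    q = 1.0 - p
    no_dlt_in_first_three = q ** 3
    one_dlt_then_three_clean = 3 * p * q ** 2 * q ** 3
    return no_dlt_in_first_three + one_dlt_then_three_clean

def three_plus_three(dlt_probs):
    """Return (mtd_prob, expected_n) for the dose levels in dlt_probs.

    mtd_prob[i] is the probability that dose i is declared the MTD, where
    index 0 means the trial stopped at the first dose (no dose selected) and
    the last index means escalation passed the highest dose tested.
    expected_n[j] is the expected number of patients treated at dose j + 1.
    """
    esc = [escalation_prob(p) for p in dlt_probs]
    reach = [1.0]                                # probability of reaching each dose
    for e in esc[:-1]:
        reach.append(reach[-1] * e)
    mtd_prob = [reach[i] * (1 - esc[i]) for i in range(len(dlt_probs))]
    mtd_prob.append(reach[-1] * esc[-1])
    # 3 patients always; 3 more when exactly one DLT occurs in the first cohort
    expected_n = [reach[j] * (3 + 3 * (3 * p * (1 - p) ** 2))
                  for j, p in enumerate(dlt_probs)]
    return mtd_prob, expected_n

# Hypothetical true DLT rates for four dose levels
mtd_prob, expected_n = three_plus_three([0.05, 0.10, 0.20, 0.35])
```

Conventions differ on what to report when the highest dose is cleared, so the last element of `mtd_prob` should be read according to the protocol in use.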
General AUC calculated based on the trapezoidal rule
Jianfeng Ye, SANOFI
Generally, the trapezoidal rule is used to calculate the area under a PK curve. Because PK test values are actual measurements, all observed values should be positive. In practice, we may also want to calculate the AUC from a derived value. For example, the change from baseline is derived from the baseline value and the observed value, so it can be either positive or negative. The trapezoidal rule therefore cannot be applied directly to this kind of data. In this paper, we introduce a method in which adjacent original observations with different signs generate new dummy observations using triangle geometry; the AUC is then calculated from the original and dummy observations with the trapezoidal rule.
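The dummy-observation idea can be sketched as follows, in Python rather than SAS. The sketch assumes the goal is to report the area above and the area below zero separately, and it places the dummy observation where the straight line between two adjacent observations crosses zero; `split_auc` and the sample data are illustrative assumptions.

```python
def split_auc(times, values):
    """Return (area_above, area_below) relative to zero by the trapezoidal rule.

    Where two adjacent observations have opposite signs, a dummy observation
    with value 0 is inserted at the point where the connecting line crosses zero.
    """
    points = list(zip(times, values))
    above = below = 0.0
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if v0 * v1 < 0:
            # Zero crossing by linear interpolation (similar triangles)
            tc = t0 + (t1 - t0) * v0 / (v0 - v1)
            pieces = [(t0, v0, tc, 0.0), (tc, 0.0, t1, v1)]
        else:
            pieces = [(t0, v0, t1, v1)]
        for a, va, b, vb in pieces:
            area = 0.5 * (va + vb) * (b - a)     # trapezoid (or triangle) area
            if area >= 0:
                above += area
            else:
                below += -area
    return above, below

# Made-up change-from-baseline profile that crosses zero twice
above, below = split_auc([0, 1, 2, 3], [2.0, -2.0, -2.0, 2.0])
```

The net signed area, `above - below`, equals what the plain trapezoidal rule gives on the original points; the dummy observations matter only when the positive and negative parts need to be reported separately.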
Survival Analysis from Statistician to Programmer
Amanda Yi, Roche
Survival analysis is very common in clinical trials. Programmers are often asked to program tables and figures that are often referred to simply as Kaplan-Meier plots. There can be a huge gap between the language of a statistician and that of a programmer. This paper will provide a basic understanding of survival analysis by using example data and code to compare what the programmer sees with what the statistician would describe. The output data from PROC LIFETEST will be explained.
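For readers who want to see what lies behind a Kaplan-Meier plot, here is a minimal Python sketch of the product-limit estimator. It is not PROC LIFETEST output and not the paper's code; the function name and the tiny data set are made up for illustration.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    events uses 1 for an observed event and 0 for a censored observation.
    Subjects censored at time t are counted as at risk at time t.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = n_leaving = 0
        # Gather everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk   # product-limit step
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve

# Five subjects: events at times 2, 3 and 5; censored at times 3 and 8
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Each tuple corresponds to one downward step of the survival curve. Censored subjects produce no step, but they reduce the number at risk for later steps.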
Are Your Random Numbers Created by SAS® Good Enough?
Yanze Zhang, MSD
Random numbers are crucial to many statistical analyses and simulations. SAS generates pseudorandom numbers using deterministic algorithms. There are two ways to create random number streams in SAS: random-number functions and CALL routines. The stream of random numbers is controlled by a seed. If random-number functions are used in a DATA step to generate the stream, the only seed that matters is the first seed in the DATA step. Examples in this paper demonstrate this characteristic of the random-number functions, which is easily overlooked by programmers and statisticians and may lead to discrepancies in results. Reproducibility is actually an advantage of pseudorandom numbers: because the stream can be reproduced, results based on those random numbers can be compared and validated. CALL routines, on the other hand, can generate multiple streams of random numbers in a DATA step, which carries a risk of the streams overlapping when they are long. Examples in this paper also demonstrate the loss of independence between streams in some special cases. The Mersenne Twister pseudorandom number generator (RNG) used by SAS is also used by many other languages and software packages, such as C++, R, MATLAB, Stata and Python. This paper also explores random number generation in R and, for illustration, applies the tests in the R package randtoolbox to assess the quality of random numbers generated in SAS.
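The role of the seed can be illustrated with Python's standard `random` module, which is also based on the Mersenne Twister. The sketch shows Python's behavior only and is not a statement about how SAS functions or CALL routines manage their streams; the helper name `stream` and the seed values are arbitrary.

```python
import random

def stream(seed, n=5):
    """Return the first n uniform draws from a generator started at seed."""
    rng = random.Random(seed)       # independent Mersenne Twister instance
    return [rng.random() for _ in range(n)]

first = stream(12345)
second = stream(12345)              # same seed: the stream is reproduced exactly
other = stream(54321)               # different seed: a different stream

reproducible = first == second
distinct = first != other
```

Two generators started from the same seed are perfectly dependent rather than independent, which is one simple way for supposedly separate streams to lose their independence.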
How would you 'Respond' to hematology response assessment?
Lynna Zhao, Roche
Unlike solid tumor studies, where RECIST is typically used to assess tumor response, hematology studies use different criteria for different therapeutic areas. For instance, Cheson 2007, Lugano 2014 and Juweid et al 2007 are used for Lymphoma, while IWCLL 2008 is used for Leukemia. This paper aims to highlight the response assessment differences between oncology solid tumor studies and hematology studies. Response data from a hematology study will be presented in order to address the differences and challenges from both programming and statistical perspectives.
ESTIMATE/CONTRAST Statement too Complex? Tell the Effect Level and Let SAS Do It for You
Aide Zhou, Sanofi
Eric Liao, Sanofi
Writing ESTIMATE/CONTRAST statements can be challenging when the model is complex or the effect has many levels, especially for newcomers. We developed a SAS macro that uses a two-step approach. The first PROC MIXED step produces the coefficient data set, which the macro then uses to build the ESTIMATE/CONTRAST statement automatically. All you need to do is tell the macro which levels of the effect (the actual data values in your data set) you want to estimate or compare. The macro selects or creates the correct coefficient column for each level and generates the ESTIMATE/CONTRAST statement, which is then used in the second PROC MIXED step to perform the analysis. The macro is easy to use and helps minimize mistakes in the analysis.
Data Visualization, Graphics and Reporting
Create a Graph Template on the Fly
Kathy Chen, Pharmacyclics, Inc., an AbbVie Company
While Proc Template Graphics has enabled us to create customized graphs, the tendency is to have many static templates with minor differences. Instead, it makes more sense to create a template to work with your data in real time. Creating a Graphic Template on the fly based on actual data can greatly improve programming efficiency. This paper will show you how to dynamically create a graphic template.
How to make GTL programming easy - An introduction to a GTL macro
Kevin Chen, Boehringer Ingelheim (China) Investment Co., Ltd
GTL is a powerful SAS graphics tool with a flexible syntax for creating complex, high-quality graphics. How can you get started with GTL programming quickly and efficiently? A GTL macro can help you produce GTL graphs quickly and also obtain the GTL code directly. The macro is very easy to use: you only need to include it in your SAS program, and a series of windows will pop up. In these windows you can select the graph model and graph parameters, and then the title, footnote, legend, x-variable, y-variable, reference line, drop line, etc. Finally, you get the graph directly, along with the GTL code in the output window. The macro meets the basic requirements of the graph models, and no manual is needed because all the parameter information is provided in the windows.
Big Insights from Small Data with RStudio Shiny
Mina Chen, Roche
Accelerating analysis and interpreting data faster help a company stay competitive in drug development. This requires statistical programmers to explore and adopt new and suitable approaches to delivering data analysis according to stakeholders' requirements. A key approach to data analytics is the increasing use of visualization tools, in conjunction with analyst skills, to support faster decision-making. As customer demands evolve, statistical programmers are required to communicate and interact proactively with key stakeholders and clearly identify their needs. Our practice is to explore and use a newly developed data analysis approach, RStudio Shiny, to support ad hoc and exploratory analysis based on specific requirements. This paper outlines how statistical programmers support effective data review and fast decision-making by using RStudio Shiny applications.
Creating Reader-friendly Clinical Summary Statistics Tables Using PROC REPORT and ODS RTF
Shiran Chen, Cincinnati Childrens Hospital Medical Center
In clinical trial reporting, a standard summary table displays the continuous and/or categorical variables of interest and the statistics in rows, with the treatments and the associated p-values for treatment comparison in columns. Such a summary table with statistics comparison is always generated before conducting any inferential analysis among treatments. Therefore, it is beneficial to design the table so it enables statisticians and researchers to read the document easily and process information quickly, especially when there is a need to summarize a large number of variables. One can enhance the readability of the summary statistics table by suppressing certain table gridlines to "group" results of the same variable together, formatting row headers as bold text for emphasis, creating leading space for sub row headers, and adding superscripts to indicate footnotes, etc. This paper applies these methods to create a reader-friendly version of the clinical summary table in Microsoft Word using ODS RTF and PROC REPORT techniques, with a focus on the usage of STYLES in the REPORT and COMPUTE statements.
An efficient way to create split Axis in Graph
Ryan Feng, Novartis
As a programmer, you sometimes have to play the role of an artist when producing graphs for customers. For example, when we need to present both intensive and sparse lab data on the same graph, it is necessary to 'censor' the time periods in which no assessment was performed. SAS users have applied a variety of strategies to interrupt the axis at a specific time point or value and continue at a different one, but none of them is ideal. Fortunately, new options available since SAS 9.4 M3 make the programmer's life easier. The purpose of this paper is to demonstrate how to use these new options like an artist.
How to customize your business process more effectively by BPMN with SAS® LSAF
Yeqian Gu, SAS Research and Development (Beijing) Co., Ltd
As we all know, the new drug development process in the pharmaceutical industry is lengthy, complex and disjointed. Every organization has common business processes designed to meet its objectives. However, these processes are cross-functional, and human interaction increases their complexity. The need for change may come from almost anywhere, inside or outside the organization, and may incur a series of management costs. It is therefore very important to customize and optimize your business processes effectively throughout the entire life cycle. This paper presents the BPMN (Business Process Model and Notation) compliant workflow in SAS LSAF (Life Science Analytics Framework), a powerful and flexible feature that can help you design, manage and monitor your processes. It gives customers the ability to manage the creation, modification and deployment of any templates they want. It is also readily understandable by all business users, from the business analysts who draft the process, to the technical developers who will perform the process, and finally to the business people who will manage and monitor it. In this paper I will introduce how to customize and optimize your business process using BPMN with SAS LSAF.
Smart Statistical Graphics - a story between SAS and Spotfire in data visualization
Yi Gu, Roche (China) Holding Ltd
Known as one of the best analytics software packages, SAS has always been challenged by emerging analytic tools. TIBCO Spotfire 6.5, an analytics and business intelligence platform that enables interactive data visualization, has entered the competition in the pharmaceutical industry in recent years. With its increasing use in safety monitoring and dose escalation, TIBCO Spotfire shows its strength in exploratory analysis. On the other hand, SAS software holds its ground with its efficiency, strong statistical analysis and data processing capabilities. This article compares the two software packages in terms of data visualization and explores their possible integration.
Animate Your Safety Data
Kriss Harris, SAS Specialists Ltd.
When reporting your safety data, do you ever feel sorry for the person who has to read all of the laboratory listings and/or summaries? Or do you ever wonder whether there might be a better way to visualize the safety data? Let's help make the reviewer's life easier and also understand the safety data better with the use of animation. This paper shows how you can use animation in SAS 9.4 to report your safety data, for example by visualizing a patient's laboratory results, vital sign results and electrocardiogram results and seeing how those safety results change over time. In addition, you will learn how to animate adverse events over time, and how to show the relationships between adverse events and laboratory results using animation. Animating your data will bring your data to life and help to better more lives!
Kriss Harris, SAS Specialists Ltd.
Would you like to be more confident in producing graphs and figures? Do you understand the differences between the OVERLAY, GRIDDED, LATTICE, DATAPANEL, and DATALATTICE layouts? Would you like to know how to easily create life sciences industry standard graphs such as adverse event timelines, Kaplan-Meier plots, and waterfall plots? Finally, would you like to learn all these methods in a relaxed environment that fosters questions? Great-this topic is for you! In this hands-on workshop, you will be guided through the Graph Template Language (GTL). You will also complete fun and challenging SAS graphics exercises to enable you to more easily retain what you have learned. This session is structured so that you will learn how to create the standard plots that your manager requests, how to easily create simple ad hoc plots for your customers, and also how to create complex graphics. You will be shown different methods to annotate your plots, including how to add Unicode characters to your plots. You will find out how to create reusable templates, which can be used by your team. Years of information have been carefully condensed into this 90-minute hands-on, highly interactive session. Feel free to bring some of your challenging graphical questions along!
Change mindset: Review the data based on visualization
Siye Liu, Roche (China) Holding Ltd.
Data visualization is already mainstream in many industries, and we benefit from the real results that companies have achieved with this approach. So, are you still working the traditional way in your daily work? Or will you change your mindset and adopt data visualization to help yourself and support other functions in your team? This paper uses SAS and other visualization tools, such as Spotfire and Excel, to generate a few examples that show the value of reviewing data through visualization compared with the traditional way.
Annotate your SGPLOT and GTL Graphs
Sanjay Matange, SAS
The SG procedures and GTL provide multiple plot statements for creating many different kinds of graphs. These plot statements can be used together in creative ways to build your graph. However, even with this ability to customize, there are times when you need more than what you can get using just the plot statements. You need a way to add custom information anywhere on the graph. With SAS 9.3, the SG procedures support the ability to annotate the graph using data set based information. With SAS 9.4, GTL also supports this feature. This annotation functionality is designed in a way similar to the annotate facility available with the SAS/GRAPH procedures. There are a few differences and enhancements. If you already know annotation from SAS/GRAPH, or if you are new to it, this paper will show you how to add custom annotations to your graphs.
Clinical Graphs with ODS Graphics Designer
Sanjay Matange, SAS
You just got the study results and want to get some quick graphical views of the data before you begin the analysis. Do you need a crash course in the SG procedures just to get a simple histogram? What to do? The ODS Graphics Designer is the answer. With this interactive application, you can create many graphs including histograms, scatter plots and commonly used clinical graphs using a 'drag-and-drop' process. You can render your graph in batch with new data and output the results to any open destination. You can view the generated GTL code as a leg up to GTL programming. You can do all this without cracking the book or breaking a sweat. This hands-on-workshop takes you step-by-step through the application features.
The NextGEN of Lumberjacks: Are You Using the Right Tool For the Right Tree in the Forest?
A forest plot is a graphical display of estimated results from analyses addressing the same question. Originally, it was used in meta-analysis to display the treatment effects from independent studies. It is also a popular graphical approach for displaying the results of subgroup analyses in randomized controlled trials, to address concerns about the generalizability of findings to various populations. In this paper we demonstrate the simplicity of creating various forest plots using the Graph Template Language (GTL) in SAS 9.4.
Clip Extreme Values for a More Readable Box Plot
Mary Rose Alpha Sibayan, PPD
Thea Arianna Valerio, PPD
The BOXPLOT procedure creates box-and-whisker plots of measurements that display the mean, quartiles, minimum, maximum and outlier values for one or several groups. When handling interim or "dirty" data, there can be numerous outliers that affect the readability of the plots. This can result in compressed plots and wide y-axis ranges, from which no clear conclusions can be drawn unless summary statistic values are presented. The CLIPFACTOR option clips extreme values to produce readable and useful plots without affecting the summary statistic values. This paper explores the CLIPFACTOR option of PROC BOXPLOT and presents a macro that computes the optimal factor value based on the data. In addition, the macro produces a data set containing the extreme observations that were clipped from the plots.
Scalable Vector Graphics (SVG) using SAS
Yang Wang, Seattle Genetics
Vinodita Bongarala, Seattle Genetics
Scalable Vector Graphics (SVG) is an XML-based vector graphic format, compatible with most modern browsers. Unlike pixel-based graphics, which lose resolution when enlarged, Scalable Vector Graphics can be magnified without loss of quality for any screen resolution and size. Starting with SAS 9.2, Scalable Vector Graphics is supported through the Vector Graphics device. SVG can be produced by SG procedures, Graph Template Language (GTL) and traditional SAS/GRAPH procedures like PROC GPLOT/GCHART. This paper explains the difference between the two kinds of graphs and discusses the pros and cons of using them for clinical outputs. SAS features such as GTL and SG procedures can be efficiently used to produce high quality SVG. Examples that use traditional SAS/GRAPH procedures such as PROC GPLOT/GCHART to produce SVG graphs are also included. This paper will serve the following purposes:
- Help decide what kind of graphics format to produce to meet specific needs
- Quick Start guide to learn to use GTL to create vector graphics
- Quick conversion of current pixel-based graphics to vector graphics
How do dynamic data visualizations support CDISC standard datasets review?
Stanley Wei, Novartis
Data visualizations have become quite popular and important recently, especially for speeding up safety data review and the standard data analysis process with CDISC SDTM/ADaM structured datasets. Spotfire is one of the powerful tools that can be used for this purpose. With predefined Spotfire templates and built-in integration of SAS scripts, we can deliver our results more efficiently and effectively, especially with real-time, dynamically changing visualizations deployed in a web player.
Utilizing SAS® and Groovy to combine multiple PDF/RTF reports to one bookmarked PDF document
Weiquan Xin, JNJ
As part of clinical trial reporting, large numbers of PDF/RTF outputs are created. At the completion of a major study milestone, medical writers, clients, and regulatory agencies often require us to combine all reports into a single user-friendly document for easy delivery and review. One solution is to combine them into a bookmarked PDF document using Adobe Acrobat software. However, manually generating a PDF document from multiple SAS output files is a time-consuming task, and the time needed increases substantially with the number of SAS output files included. This paper presents an alternative, efficient method that uses a script running on Java to combine multiple PDF/RTF documents into one bookmarked PDF document. PROC GROOVY, a built-in SAS mechanism for the Groovy dynamic language that runs on the Java Virtual Machine (JVM), was used to quickly combine the multiple documents into one cumulative file. We have replaced the manual process by automating the ordering of multiple SAS outputs and the creation of bookmarks within the PDF document. If needed, this script can be used independently.
Management and Career Development
Journey of a statistical programmer
Wei Diao, Roche (China) Holding
The pharmaceutical industry is changing and evolving in the current analytics age, as a result the statistical programmer's role is also changing. As statistical programmers, we need to prepare ourselves for these changes in our industry and to our updated roles to make sure our names are on the signposts of tomorrow, not the exit signs of yesterday. In this paper we will review some of the hard and soft skills the statistical programmer needs for a successful career journey.
An Automatic QC Status Tracking and Management System
Yonghong Hao, GCP ClinPlus Co., Ltd.
Jun Wu, GCP ClinPlus Co., Ltd.
Yulin Li, GCP ClinPlus Co., Ltd.
Jingfang Wu, GCP ClinPlus Co., Ltd.
Tracking the development status, QC status and comments for hundreds of programs across multiple projects is challenging work, because the programs and their outputs are changed frequently by multiple programmers every day during the project life cycle. Recording and reviewing the status manually is tedious and labor intensive, and it is difficult to ensure accuracy and timeliness. Improving the efficiency of project progress communication is also a major concern. Based on these needs, we developed a tracking system using VBA and PHP. It stores the tracking data in an independent SQL server database, which can be reviewed and modified through MS Excel and HTML. The tracking database can also be used for other analyses or summaries for management purposes. With this system, we can achieve the following goals: (1) automatically track the status and latest modification time of the programs, the corresponding outputs and the comparison status (pass/fail) in a timely manner without any manual input; (2) promptly detect time logic errors among programs, outputs and QC status; (3) manage a centralized pool of TFL titles, footnotes, populations and input data sets, with any changes to them saved and flagged automatically for rerun; (4) through a hyperlink icon in the toolbar area, access a web-based issue log tool for tracking comments and their status, where programmers can add, modify and close comments for each program or review a summary of all comments for the project; (5) let the project management team easily find the up-to-date progress of each project via a web-based project progress interface with a visual progress bar, and see the progress details for each program.
How to Present Yourself as a Better Choice SAS Programmer in a Competitive Job Market
Lei Wang, The Lotus Group
Abstract: I have been recruiting for almost 14 years in the pharmaceutical industry focusing on statistician and SAS programmer positions in the US. It's a relatively new and hot industry in China. I want to use my experience to help young Chinese SAS programmers who have the talent and ability to understand the job market in the US and develop and present their skills to meet the need of the industry - very good, young programmers with a fresh perspective and new ideas.
Making Quicker Progress in Career Path for SAS Programmer
Victor Wu, Beijing Data Science Express Consulting Co., Ltd
When chatting with friends who are SAS programmers, I have heard many complaints about slow career progress and slow promotion to the next level. What is the root cause, and how can it be resolved? In this presentation, the author performs a root cause analysis of this issue and discusses what you need to know at each stage, cultivating good habits, developing short-term and long-term career development plans, making full use of daily work, and seeking internal and external help.
Preparation and Regulatory Standards including CDISC
From Data Collection to Regulatory Submission - a Journey
Todd Case, Vertex Pharmaceuticals
An NDA/BLA is the end result of an enormous effort to submit and present summaries of clinical trial results and data to the FDA (and almost certainly other geographies will soon be requiring actual data). But just how big of an effort is this? From First Patient/First Visit until Last Patient/Last Visit, hundreds of millions of data points are collected, derived, summarized and written up in a CSR, etc. For example, a recent analysis of a 'routine' drug program found that a single study had 48,970,722 data points that were either collected or created. It doesn't take a great imagination to calculate how much data will be created if multiple studies are included in a submission. This presentation will focus on the flow of clinical trial data throughout its journey from visit/collection to CSR to NDA/BLA, provide useful metrics to show how much time is saved by using data standards, and provide some recommendations about how to further improve efficient data flow.
The Dream of a Common Vision: (Evolving) Standards
Todd Case, Vertex Pharmaceuticals
On July 21, 2004 the US Food and Drug Administration (FDA) announced a format, called the Study Data Tabulation Model (SDTM), that sponsors can use to submit data to the agency. Twelve years later the FDA is only now (as of December 17, 2016) enforcing the requirement of standardized electronic data submissions in SDTM format. In addition to SDTM, there are now multiple sources (and versions) of data standards which impact data supporting applications to the FDA: the FDA Data Standards Catalog (primary list and source of standards), the SDTM itself (Version 1.4), the SDTM Implementation Guide (SDTMIG - Version 3.2), the Analysis Data Model (ADaM) - Version 2.1, the ADaM Implementation Guide (Version 1.1), the FDA Guidance for Industry (December, 2014), the Study Data Technical Conformance Guide (March, 2016) and the Prescription Drug User Fee Act (PDUFA), Version V for fiscal years 2013-2017. At times these documents, guidances and laws are somewhat contradictory, and it's up to the Sponsor (engaging with the FDA when appropriate) to determine which 'standard' (of the standards) to adopt, which version(s) to use and when to update versions. This paper will provide an approach to determine the appropriate data standards and versions.
Analysis of Patient Reported Outcome Endpoints in Clinical Trial
Tongda Che, Merck
Peng Wan, Merck
A patient-reported outcome (PRO) is an endpoint/outcome that is reported directly by the patient, without interpretation by a physician or anyone else. This kind of endpoint is widely used in clinical trials for approvals or labels and also plays an important role in cost-effectiveness assessments. In this paper, we introduce the development of PROs, discuss the CDISC standards that can be used, and describe issues that may be encountered in a real trial.
Building Your Own CDISC Application with SAS® Clinical Standards Toolkit
Jing Gao, SAS Research & Development (Beijing)
The SAS® Clinical Standards Toolkit (CST) is a framework that supports clinical research activities and focuses on the CDISC standards. The CST provides validation checks from WebSDM, OpenCDISC and SAS on CDISC data, especially SDTM and ADaM, as well as the generation of XML-based standards files and the validation of XML files against an XML schema. A nice plus is that the CST is a separately orderable component that is available at no additional charge to currently licensed SAS customers. However, because the CST is a set of SAS macros, metadata and configuration files, interacting with it requires a technical background in Base SAS and the SAS macro language. So can we make the CST more intuitive and easier to use? One solution is to add an easy-to-use graphical user interface (GUI) that invokes the CST macros, which are provided as open source and are accessible to the user. Compared with processing clinical data, creating a GUI is simple. Many programming languages, such as Java and Python, are well suited to GUI development. Python is a powerful programming language that is easy to learn and use, so in this paper I will introduce how to use Python to build a CDISC GUI application based on the CST.
MRCT and Project Centric Approach
Jingwei Gao, Boehringer Ingelheim (China) Investment Co., Ltd.
Project Centric Approach (PCA) has become a standard practice at Boehringer Ingelheim. PCA defines a framework for centralizing and harmonizing specifications and implementations at the project level, including designs, analyses, SAPs, TOC, ADS plan, table shells, database, resources, planning and operation. With project standards, PCA maximizes consistency across studies and continuity over time, improves quality, simplifies trial-level activities, eliminates duplication, and facilitates work on non-standard topics. Regional differences and needs based on intrinsic and extrinsic factors can be considered and built into PCA, covering MRCT elements from the early stage as much as possible. PCA has obvious benefits associated with regional contributions and globalization. The presentation will share why, what and how BI implements PCA and the overall impact.
ADaM Tips for Exposure Summary
Kriss Harris, SAS Specialists Ltd.
Have you ever created the Analysis Dataset for the Exposure Summary also known as ADEXSUMM? Do you understand the different Dose Intensity calculations such as Dose Intensity (mg/m^2 per day), Dose Intensity (mg/kg per day), and Dose Intensity (mg per day)? Do you need to calculate parameters such as Relative Dose Intensity (%), Cumulative Dose, Number of Dose Delays, Number of Dose Reductions and Number of Omitted Doses? This paper will show you how you can create the ADEXSUMM dataset, which includes the parameters that are frequently required and reported. The ADEXSUMM dataset will be created using the source SDTM domains, EX and SUPPEX. After producing your ADEXSUMM dataset you are going to want to produce TFLs based on ADEXSUMM, such as TFLs which include in the output the Number of Cycles Received, Duration of Therapy, Total Prescribed Amount, Relative Dose Intensity (%), etc. This paper will show you SAS tips that you can use to produce your exposure summary outputs.
ADaM Compliance VS Easy-Reviewed in Design and Construct Efficacy Analysis Datasets
Li Huadan, MSD
CDISC ADaM provides a fundamental principle that enables statistical analysis of the data, while at the same time allowing data reviewers to have a clear understanding of the data's lineage from collection to analysis to results. When developing analysis datasets, compliance with the ADaM-defined structure is the first consideration. However, for efficacy analysis data, if we do not also think about how to help the reviewer understand the data's lineage from collection to results, the analysis dataset will not be easy to review and will need more explanation in the ADRG. This paper will demonstrate how to design and construct analysis datasets that are both ADaM compliant and easy to review.
Practical Experience in Implementing CDISC Standards for Global Cardiovascular and Respiratory Trials
David Ju, ERT Inc.
We have been working with sponsors worldwide over the last 10 years to provide clinical data in CDISC format. We have come to appreciate, and anticipate, that providing "CDISC format" can mean different things to different organizations. In fact, our standard response to a new request for CDISC is to ask, "Which standard in CDISC are you interested in?" Many sponsors who have limited exposure to CDISC will respond, "Can you just provide the data in a standard CDISC format?" If your organization is at a similar point in implementing CDISC, we will present our experiences both as an organization providing various CDISC data sets and as a member of various CDISC working groups. We will provide practical recommendations on following CDISC and highlight potential challenges you may encounter, along with potential solutions, when following and implementing CDISC standards.
Development Approach on the Sponsor-Defined Extensions of the Model
CDISC provides detailed guidance on implementation for both SDTM and ADaM, and it is possible to customize the model to match a sponsor's requirements. In this paper, we introduce a development approach for sponsor-defined extensions of the model that conform to the rules provided by the Implementation Guide. The paper also covers the factors to consider when implementing the standards within an organization, and discusses possible future states.
FDA Submission Overview - eCTD preparation
Cindy Song, sanofi
This paper gives an overview of an FDA e-submission package: what is included in the submission and how to prepare the package. The content and structure of a typical e-submission package will be presented.
Automation of SDTM Programming in Oncology Disease Response Domain
Yiwen Wang, Eli Lilly
Yu Cheng, Eli Lilly
Ju Chen, Eli Lilly
The Study Data Tabulation Model (SDTM) is an evolving global standard which is widely used for regulatory submissions. Automating SDTM programming is essential to maximize programming efficiency and improve data quality. This paper presents the core programming logic used to automatically create the Disease Response (RS) domain, which is described in CDISC SDTM IG v3.2 and represents the collected tumor data, either quantitatively measured or qualitatively assessed, in most clinical trials in the oncology therapeutic area. The automation is realized by using SAS® macro facilities, which include 1) the environment setting; 2) meta-data automation; 3) functionality oriented macros; 4) SUPPQUAL dataset automation; 5) structured end-to-end automation; 6) log check. The SAS® macros presented in this paper could also be applied to the automation of other SDTM oncology domains (i.e. TU, TR) or be extended to the domains in the SDTM Findings Observation Class.
Proposal for streamlining the SDRG and ADRG authoring process
Stanley Wei, Novartis
The SDRG and ADRG are now part of the e-submission package, providing FDA reviewers a single point of orientation to the submission datasets. These documents incorporate additional information as well as some duplication from other submission documents. Authoring these two key documents can be time-consuming and tedious, considering that most of the included information may exist in different datasets and documents. Keeping up with changes on an ongoing basis and avoiding inconsistency may be a problem as well. To simplify the authoring process, a centralized metadata-driven method is proposed to streamline and automate this activity. It will also be more efficient and effective for users to manage all of the included information in both the SDRG and ADRG and to make sure all of the content is consistent across different sources. A user-friendly interface could be developed to further facilitate this process by incorporating information retrieved from the study-level metadata repository, SDTM and ADaM datasets, central comments log, OpenCDISC validation reports, etc., when applicable.
How to handle the mapping of the eCRF form with the combined information
Lifang Wu, Roche
When we do SDTM mapping, it can be difficult to decide how to map information that we do not see frequently, and we may end up mapping it to a far-fetched domain. However, this is not the best way to capture the information when its parts can go to specific domains separately. I will share my experience of how to handle a form with combined information through one example. We should also be aware that understanding how the study collects the data is important for the mapping.
Strategy to Start to Implement CDISC Standards
Victor Wu, Beijing Data Science Express Consulting Co., Ltd
CDISC is a set of data standards mainly for clinical trials, covering protocol representation, CRF design, data tabulation, analysis data, data exchange and archiving. Both the US FDA and Japan PMDA have announced the requirement to submit data following CDISC standards, and CDISC standards are also recommended in the newly released CFDA guidance. When and how should a company start to implement CDISC standards? In this presentation, we are going to discuss strategies on timing, the rationale, and the implementation of CDISC standards in a company from A to Z.
ADaM Validation Check introduction
Xiang Xu, Novartis Pharma Co., Ltd
CDISC's Analysis Data Model (ADaM) specifies the fundamental principles and standards to follow in the creation of analysis datasets. ADaM Validation Checks v1.3 contains a list of requirements which can be used to validate datasets against the subset of these rules that are objective and unambiguously evaluable. This paper will summarize the kinds of rules that must be followed while implementing ADaM.
Define.xml Content Validation - CRF Page Check
Xianhua Zeng, PAREXEL
Shenglin Zhang, PAREXEL
The FDA document stated that "sponsors should make certain that every data variable's codelist, origin, and derivation is clearly and easily accessible from the define file". There are some validation tools that can be used to validate the define.xml, such as SAS Clinical Standards Toolkit and Pinnacle 21 Community. However, these tools do not validate the contents of the define file, so it is important to supplement them with other validation checks to ensure the accuracy of your define.xml. Doing these additional validations manually can be cumbersome and time consuming. This paper presents a SAS macro, ChkDefCrfPage, which extracts define.xml contents, and annotations and page information from blankcrf.pdf, into SAS datasets using XMLMap, and then checks that CRF page numbers are consistent between define.xml and the annotated CRF.
Programming Techniques
How Did This Get Made?
Matt Becker, SASA
What data was used for my result? What macros or SAS code did I pull in? What was created? Our time is sometimes spent reviewing code for
SAS Studio(R): We Program!
Matt Becker, SASA
Have you investigated SAS Studio!?! From the 1980's into the 2010's I used SAS Display Manager (PC SAS front end) for all of my clinical table, listing, figure and database program development. I became accustomed to the program editor, log window, output window... being able to view my working and saved data sets via this programming IDE. I resisted new coding editors through the years UNTIL SAS Studio came to fruition! SAS Studio is a web-based application that accesses your SAS environment - cloud, local server(s) or PC. With the environment, you can access your data libraries, files and existing programs... and write new programs! Additionally, SAS Studio contains pre-defined tasks that generate code for you. Have a specific set of clinical programming code you always use? Snippets! Want to define a personal or global task for AE table summarization? Define it within SAS Studio!
PROC DS2 - A powerful tool you need to know
Weston Chen, Novartis
Most programmers know the DATA step well; however, powerful as it is, it has not fundamentally changed over the years. DS2 has arrived as a powerful tool for advanced data manipulation. DS2 is a new programming language based on object-oriented concepts, wherein objects are instances of classes and methods are the basic program execution units. It is designed to make programs simpler to develop and easier to understand, which eases maintenance down the road and lends itself to more robust programs. In these slides, you will learn how to use PROC DS2 to submit DS2 language statements, and how to use PROC HPDS2 to execute DS2 language statements within the SAS High-Performance Analytics environment.
Two macros for dataset validation
I would like to share two macros for free. These two macros help validators QC datasets. They have the following functionalities: 1. Quickly locate discrepancies between the BASE dataset and the COMPARE dataset. 2. List all types of discrepancies for any specified variables. 3. Quickly find all unmatched variables using a minimum number of records. These two macros will improve the dataset validation process for validators.
An Introduction to Creating Multi-Sheet Microsoft Excel Workbooks the Easy Way with SAS®
Vince Delgobbo, SAS
Transferring SAS data and analytical results between SAS and Microsoft Excel can be difficult, especially when SAS is not installed on a Windows platform. This presentation provides basic information on how to use Base SAS 9 software to create multi-sheet Excel workbooks (for Excel versions 2002 and later). You will learn techniques for quickly and easily creating attractive, multi-sheet Excel workbooks that contain your SAS output using the ExcelXP tagset. The techniques can be used regardless of the platform on which SAS software is installed. You can even use them on a mainframe! More in-depth information on this topic will be presented if time permits.
You Want ME to use SAS® Enterprise Guide??
Vince Delgobbo, SAS
Starting with SAS®, one copy of SAS Enterprise Guide is included with each PC SAS license. At some sites, desktop PC SAS licenses are being replaced with a single server-based SAS license and desktop versions of Enterprise Guide. This presentation will introduce you to the Enterprise Guide product, and provide you with some good reasons why you should consider using it.
DS2 with Both Hands on the Wheel
Peter Eberhardt, Fernwood Consulting Group Inc
Xue Yao, Winnipeg Regional Health
The DATA Step has served SAS® programmers well over the years, and although it is handy, the new, exciting, and powerful DS2 is a significant alternative to the DATA Step by introducing an object-oriented programming environment. It enables users to effectively manipulate complex data and efficiently manage the programming through additional data types, programming structure elements, user-defined methods, and shareable packages, as well as threaded execution. This tutorial is developed based on our experiences with getting started with DS2 and learning to use it to access, manage, and share data in a scalable and standards-based way. It facilitates SAS users of all levels to easily get started with DS2 and understand its basic functionality by practicing the features of DS2.
SAS Automation and More
Kangmin Fan, Abbott
SAS programmers/statistical analysts spend much time on repetitive tasks, such as rerunning programs with different parameters and creating similar reports. Using different techniques, we automate these tasks to free programmers from tedious, manual burdens. These techniques include the Windows task scheduler, script/batch files, a method to find a specific input in a folder, SAS tools like the email file engine, macros to identify error/warning messages in the SAS log file, etc. This paper will demonstrate how we use these tools to benefit programmers.
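One of the techniques listed above, scanning a SAS log for error and warning messages, can be sketched as follows. This is a minimal illustration in Python rather than the authors' SAS macros; the log excerpt is made up, and the sketch assumes that messages of interest start a line with `ERROR` or `WARNING`, optionally followed by a message number, and a colon.

```python
import re

# Hypothetical log excerpt, invented for illustration only.
log_text = """NOTE: The data set WORK.ADSL has 120 observations and 15 variables.
WARNING: Variable AGE already exists on file WORK.ADSL.
NOTE: DATA statement used (Total process time):
ERROR: File WORK.ADAE.DATA does not exist.
"""

# Matches "ERROR:" / "WARNING:" at the start of a line, with an optional
# message number such as "ERROR 180-322:".
issue_pattern = re.compile(r"^(ERROR|WARNING)(?: \d+-\d+)?:\s*(.*)$")

def scan_log(text):
    """Return (line number, severity, message) for each flagged log line."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        match = issue_pattern.match(line)
        if match:
            findings.append((line_number, match.group(1), match.group(2)))
    return findings

findings = scan_log(log_text)
for line_number, severity, message in findings:
    print(f"line {line_number}: {severity}: {message}")
```

A real check would usually also flag selected NOTE lines (for example, uninitialized variables), which this sketch leaves out.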
The Simple Application of SAS DDE in Excel
Danting Guo, Fountain Medical Development, Inc., Nanjing, China
The DDE mechanism in SAS enables SAS to control Excel or Word. Basically, what you can do with Excel/Word using the keyboard and mouse can be replicated programmatically in SAS. Although exporting data from a SAS data set to an Excel file is relatively simple, SAS output often needs to be edited to meet the statistical requirements. Someone then has to manually create and format an Excel spreadsheet, which can be time consuming and error prone. This paper introduces how to use DDE within SAS to control Excel and provides some examples, such as how to send data or results generated in SAS to specific cell locations in an Excel worksheet, which may greatly reduce the amount of manual editing.
A Macro to Import Data from CSV File to SAS
Liang Guo, FMD
Pei Zhang, FMD
CSV is a universal and relatively simple file format that is widely used in data transfer. Generally, a CSV file can be imported into SAS very easily by using PROC IMPORT. However, if line breaks exist in the data, part of the imported data may be misaligned or the program may abort with an error. This paper discusses a macro for importing CSV data into SAS that overcomes the line break problem while keeping most of the capability of PROC IMPORT.
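The line break problem described above is easy to reproduce outside SAS. The sketch below uses Python, not the authors' macro, and assumes the embedded line breaks sit inside double-quoted fields; the sample records are made up. It contrasts naive line splitting with a quote-aware parser, and shows one possible clean-up (replacing embedded breaks with spaces) before the file is handed to a line-oriented importer.

```python
import csv
import io

# Hypothetical sample: the second data record has a line break inside a quoted field.
raw = 'id,comment\n1,"no issues"\n2,"first line\nsecond line"\n3,"done"\n'

# Naive line-based splitting sees 5 lines for a header plus 3 records.
naive_lines = raw.splitlines()

# A quote-aware parser keeps the embedded line break inside a single field.
rows = list(csv.reader(io.StringIO(raw, newline="")))

# One possible clean-up: replace embedded line breaks with spaces.
cleaned = [[field.replace("\n", " ") for field in row] for row in rows]

print(len(naive_lines), len(rows))
print(cleaned[2])
```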
Macro Quoting: Which Function Should we use?
Pengfei Guo, MSD R&D (China) Co., Ltd.
There are several macro quoting functions in SAS, and even some experienced programmers cannot tell them apart exactly. We usually look up the SAS help documentation when we encounter a log issue while using them. This article will summarize the differences and list some error-prone situations. At the end, we also provide a stepwise method to help users decide easily.
Confessions of a Macro Developer - The trials and tribulations of building a large macro system
Rowland Hale, inVentiv Health Clinical
Developing a robust, scalable, flexible and user-friendly macro library is a significant undertaking. This paper draws on recent practical experience of such a project and, without going into coding specifics, describes the major considerations, challenges and pitfalls encountered by the project developers along the way and how they were resolved. Areas discussed include project justification and stakeholder buy-in through to the collection of user requirements, development, validation, release and beyond. Certainly, before the project can be given the go-ahead, the resource cost must be weighed up against the new system's potential benefits, and these can be significant. Major efficiency gains both for programmers and validators, as well as helping to ensure standards are adhered to or consistency maintained within a study, project or even companywide can be expected, and we consider how users are benefiting from the system we were involved in building. More specifically, the paper covers topics ranging from initial design, parameter names and value checking, validation via test scripts and user acceptance testing, audit-proof documentation, handling post-release bugs and change requests, the benefits of adding automation to the validation and documentation processes (and how to do it using Excel as a central development database) and hurdles presented by migrating to a new environment. Working in an Agile programming environment is discussed as is best SAS® macro programming practice in general. Emphasis is placed throughout on user-friendliness, not only of the macros themselves, but also of the package as a whole. This includes the provision of training, user manuals and support from the development team, user-friendliness being an important aspect which contributes greatly to the overall success of the project; fundamental for the resulting system is that users see its benefits and want to use it!
A Grid Computing Tool to Batch Run a List of SAS® Programs
Huashan Huo, Pharmaceutical Product Development, LLC.
Zhongyu Li, PPD
Lu Zhang, PPD
In clinical research, it is very common that a large number of SAS programs must be repeatedly batch run due to program modifications or new data updates. In the past few years, several papers by pharmaceutical industry programmers have described methods and tools for automating this process. The purpose of this paper is to introduce a tool for running a large number of SAS programs concurrently on a grid. The SAS computing tasks are distributed among multiple computers on a network, all under the tool's management. Not only can workloads be distributed across a grid of computers, but overall execution time is also greatly reduced. This paper describes how to implement Multi-User Workload Balancing and Parallel Workload Balancing without SAS/Share and SAS Grid Manager.
PPG, a metadata driven solution for patient profile generation
Weina Jia, Sanofi
Eric Li, Sanofi
Yihong Chi, Sanofi
Vincent Fan, Sanofi
To facilitate medical review, data management programmers need to provide patient profiles (PP) to the medical reviewer. In most cases, delivering a set of PPs for one study takes at least 5 days, as the data manager needs to prepare the specification and the programmer then needs to develop the SAS code accordingly. With PP Generator (PPG), specification preparation and routine programming are no longer needed. PPG creates PPs directly from the RAVE study design specification (SDS) and source data extracted from the clinical database. This minimizes mistakes and greatly reduces repetitive programming, improving programming efficiency and accuracy and saving development time (taking no more than 3 hours). PPG is also flexible, providing an interface to customize the PP population, content, styles and formats to meet the data manager's individual review needs.
A Web Interface for Efficient Clinical Trial Program Development
Hao Jin, Fountain Medical Development, Inc.
Ping Ni, Fountain Medical Development, Inc.
To improve the efficiency of programming in clinical trials, a web page interface was developed that provides several important functions to reduce unnecessary time. The interface has two parts. One is a set of hyperlinks to required documents, such as the protocol, SAP, TLG shells and programming tracking sheet. The program creates the links automatically by using key words to locate the documents and get their paths, and the links can be recreated if the related documents are updated. The other is a set of shortcuts to utility tools, including batch run, result comparison and difference checking. These functions can make the programming process more efficient. The web interface can be generated using SAS with a few parameters, so it is very easy to set up when starting project development. The functions of this interface can also be extended.
Performing Pattern Matching by Using Perl Regular Expressions
Arthur Li, City of Hope
SAS® provides many DATA step functions to search and extract patterns from a character string, such as SUBSTR, SCAN, INDEX, TRANWRD, etc. Using these functions to perform pattern matching often requires utilizing many function calls to match a character position. However, using the Perl Regular Expression (PRX) functions or routines in the DATA step will improve pattern matching tasks by reducing the number of function calls and making the program easier to maintain. In this talk, in addition to learning the syntax of Perl Regular Expressions, many real-world applications will be demonstrated.
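To illustrate how a single regular expression can replace several position-based function calls, here is a minimal sketch in Python. Python's `re` module uses a Perl-style syntax that is close to, but not identical with, the SAS PRX functions, so this is an analogy rather than the talk's own code. The dosing strings, the pattern and the helper name `extract_dose` are all assumptions made up for illustration.

```python
import re

# Hypothetical free-text dosing records, invented for illustration.
records = ["Drug A 50 mg daily", "Drug B 2.5 mg/kg weekly", "No dose recorded"]

# One pattern captures the amount and the unit in a single pass:
#   (\d+(?:\.\d+)?)  a whole or decimal number
#   \s*              optional whitespace
#   (mg/kg|mg)       the unit, longer alternative listed first
dose_pattern = re.compile(r"(\d+(?:\.\d+)?)\s*(mg/kg|mg)\b")

def extract_dose(text):
    """Return (amount, unit) if a dose is found, otherwise None."""
    match = dose_pattern.search(text)
    if match is None:
        return None
    return float(match.group(1)), match.group(2)

doses = [extract_dose(record) for record in records]
print(doses)
```

Doing the same with index-and-substring style functions would need separate steps to find the number, find the unit and handle the no-match case.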
Why not Picture Format?
Gaoyang Li, BHC
The importance and benefits of the FORMAT procedure are well known. Nearly every SAS programmer knows how to use SAS built-in formats/informats or create custom formats/informats. However, it seems that the picture format has been left out of their toolkit. Why not the picture format? The rationale of the PICTURE statement is different from that of VALUE/INVALUE, and it offers some wonderful ways to display or output data. In this paper, the logic of the picture format is explained step by step so that you can use it confidently without worrying about errors. A few examples with different options follow to demonstrate its efficiency.
Blind Data Review in clinical trials
Jiashu Li, Lilly China
Dong Guo, Lilly China
Blind data review is an important procedure in clinical trials that connects the data management process to statistical analysis. Although the statistical function is not the owner of the data, blind data review can help identify issues, review protocol violations, and examine exploratory trends. In short, it oversees the quality of clinical trials during the period between the first patient visit and the breaking of the blind. How can blind data review be done more efficiently? Our experience with blind data review is discussed here.
How to give SAS ambiguous instructions and still being a big winner (literally delegate everything to SAS)
Hui Liu, Eli Lilly
SAS is eager to be your AlphaGo (the artificial intelligence that defeated the world champion at the game of Go). The main reason you two are not getting there is that you are too nice to SAS. The most valuable and expensive factor in modern programming is human time. However, we waste a lot of time doing things that SAS can do itself without our intervention. When we give instructions to SAS, the most efficient way is to tell SAS only the information it does not already have, not what it knows or could get by searching, guessing or learning by itself. SAS is actually able to think the way Google or other search engines do. When you type some keywords into Google, it returns things which have a high probability of being useful to you. That approach applies to SAS as well. So, let's show the SAS application "who is the real boss!" and not hesitate to be bossy.
Custom useful SAS functions using the PROC FCMP procedure
Qixia Shi, FMD
Guanyu Su, FMD
SAS programmers often do repetitive programming work, and usually rely on the DATA step or macros to ease it. However, DATA step code is difficult to reuse in other programs, so it would be useful and time-saving if small program units could be reused. Macros are easier to reuse, but they are not independent of the main program, they involve non-DATA step syntax, and reusing macros can generate a large number of intermediate data sets. It would be great to avoid these intermediate data sets, as all we really want is one result. Multiple subroutines and functions can be declared using PROC FCMP in SAS 9.2 or higher. With PROC FCMP and the RUN_MACRO function, we can create independent and reusable custom subroutines, which enables programmers to read, write, and maintain complex code more easily. This paper will present some useful functions, created with PROC FCMP, for handling calculations and for standardizing and simplifying code. With these functions a single statement can achieve what we want, which reduces repetitive work and saves time.
Big Data Processing Techniques
Eduard Joseph Siquioco, PPD
Programmers will often encounter big data: data sets with many observations and variables. These are mostly used in conjunction with complex programming tasks and result in long run times and long programming times. Different SAS functions, options, and techniques can help reduce the run time when used correctly. This paper will explore basic techniques such as using SAS options, and go in depth on beginner-friendly techniques that programmers often overlook.
Maximizing the use of SAS Shortcuts
Eduard Joseph Siquioco, PPD
Using the point-and-click method of opening datasets can be tedious, especially if you have a lot of files and datasets in the viewer. This paper will explore the different commands available through SAS DM statements to ease this task. Typing the DM statements each time we want to run them can also become tedious, so the paper will also explore other shortcuts, such as keyboard abbreviations and assigning function keys to perform these DM tasks.
The Impact of Change from wlatin1 to UTF-8 Encoding in SAS Environment
Hui Song, PRA HEALTH SCIENCES
Anja Koster, PRA Health Sciences
As clinical trials become globalized, there is a steadily growing need to support multiple languages in the collected clinical data. The default encoding for a dataset in SAS is "wlatin1". wlatin1 is used in the "western world" and can only handle Western European (ASCII/ANSI) characters correctly. UTF-8 encoding can fulfill the multilingual need. UTF-8 is a universal encoding that can handle characters from all possible languages, including English. It is backward compatible with ASCII characters. However, UTF-8 is a multi-byte character set while wlatin1 is a single-byte character set. This major difference in data representation poses several challenges for SAS programmers when they (1) import and export files to and from wlatin1 encoding, (2) read wlatin1-encoded datasets in a UTF-8 SAS environment, and (3) create wlatin1-encoded datasets to meet clients' needs. In this paper, we will present concrete examples to help readers understand the difference between UTF-8 and wlatin1 encoding and provide practical solutions to address the challenges above.
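The single-byte versus multi-byte difference can be made concrete with a short Python sketch. It is not taken from the paper; it relies on SAS wlatin1 corresponding to Windows code page 1252 (`cp1252` in Python), and the sample strings and the helper name `byte_lengths` are made up for illustration.

```python
# Hypothetical sample values: plain ASCII, an accented Latin letter, and Chinese text.
samples = ["Protocol", "café", "临床"]

def byte_lengths(text):
    """Return (characters, bytes in cp1252 or None if not encodable, bytes in UTF-8)."""
    utf8_len = len(text.encode("utf-8"))
    try:
        wlatin1_len = len(text.encode("cp1252"))
    except UnicodeEncodeError:
        # The text contains characters that the single-byte code page cannot represent.
        wlatin1_len = None
    return len(text), wlatin1_len, utf8_len

report = {sample: byte_lengths(sample) for sample in samples}
for sample, lengths in report.items():
    print(sample, lengths)
```

Because the same text can need more bytes in UTF-8, a character variable whose length was sized in bytes for wlatin1 data may be too short once the data is transcoded, which is one source of the truncation issues such a migration has to handle.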
What's on behind - Revisit DATA step processing to avoid unexpected errors in a Data step
Xuan Sun, PPD Inc.
In our daily work, it is common to integrate information from multiple data sources. With the DATA step, such manipulations can be handled efficiently. However, neglecting the DATA step processing flow may sometimes lead to unexpected results. In this paper, the compilation and execution phases of DATA step processing will be illustrated. Examples will also be given to show scenarios in which mistakes are made.
Make the Most Out of Your Data Set Specification
Thea Arianna Valerio, PPD
A data set specification usually contains multiple data sets and variables that will be derived for a certain clinical study. Aside from ensuring the data set derivations are correct and robust, it is also important that the data set metadata is consistent with the specification. Manually verifying the data set and variable attributes against the specification is time consuming and puts quality at risk. It is also inefficient to type all the variable names, labels, lengths and formats in the programs, especially when working on a dataset with many variables. Through the macros presented in this paper, these steps can be automated and thus allow more time to review the variable derivations in the data set program. The specification is used as the input file to create macro variables that will hold the information on the attributes of all variables in a data set. This way, any changes in the specifications during the course of the study are automatically captured by the macro. Certain techniques are also presented to make the most out of your data set specification.
Using SAS to Assemble Output Report Files into One PDF File with Bookmarks
Sam Wang, Merrimack Pharmaceuticals
Kaniz Khalifa, Leaf Clinical Services, Inc.
Assembling output report files (such as RTF or PDF) into a single PDF file is frequently needed in clinical trial development. The most common current practice for this task is to use Adobe products or other third-party applications. However, this makes creating or controlling the bookmarks very time consuming for programmers, because the default bookmarks from those applications may use the file names or certain lines of the titles from the outputs. Also, using third-party applications will incur additional cost and require more time to learn and understand new language syntaxes. Our approach to assembling the output report files needs no third-party applications and uses only SAS code and procedures. Before generating the output files, we store the table/listing/figure numbers and their titles in macro variables to serve as the bookmarks. When generating the RTF or PDF output files, we add an ODS DOCUMENT statement to hold the output objects in their original creation structure. After all output files are generated, we use PROC DOCUMENT to replay those output objects into one destination PDF file. The order of the object files and their bookmarks in the PDF file is based on the stream order controlled by the SAS code.
Hash Object Working Efficiently with Large SAS Datasets
Yazhen Wang, FMD
Hao Xu, FMD
Many SAS users experience challenges when working with large SAS datasets of a gigabyte or more. Storing and processing such datasets generally takes considerable time and effort, which can seriously affect project timelines. To handle these constraints, users can reduce the size of a dataset by shortening variable lengths without losing any information, or by using SAS options such as COMPRESS. Other programming tips and techniques rely on well-known SAS statements. In addition, the hash object introduced in SAS version 9 provides an efficient and convenient mechanism for quick data storage and retrieval. This paper introduces hash object basics, presents a few common hash object methods, and builds rules of thumb for when to apply this technology in your programs.
Interactive SAS Analysis Service (iSAS) within Unix SAS Server
Stanley Wei, Novartis
Interactive SAS Analysis Service (iSAS) is a SAS macro-based package that enables programmers to run programs interactively in a Unix SAS environment. It incorporates frequently used Unix commands, bash scripts, the in-house working environment, template code, and utility macros to make statistical reporting activities more straightforward and user-friendly. On most occasions it works much as the PC SAS interface does. It is designed to be a good companion to PC SAS, especially for those who are keen on higher efficiency and enjoy coding and typing commands at the terminal.
Using Hash object to reduce run time
Xiaotian Wu, PPD
Yu Zhu, PPD
SAS programmers frequently work with large datasets. However, processing extremely large datasets can take a huge amount of time and effort, and can therefore affect delivery timelines. Even simple DATA steps and PROC SORT steps can be problematic and time-consuming. Causes range from I/O issues to running out of work space or other unexpected errors. Run times can often be reduced by adhering to good programming practice and writing concise, efficiently structured programs. Because run time has two main components, Input/Output (I/O) time and CPU time, this paper outlines several techniques that can be incorporated into a program to reduce one or the other. Besides simple tips such as using DROP/KEEP statements, the COMPRESS option, and the TAGSORT option, this paper mainly discusses hash object programming for beginners, since a hash object can be more efficient than SQL joins and MERGE steps. Hash objects, or hash tables, are data structures that provide a way to search data efficiently. A hash object, which consists of key items and data items, is a type of array that a program can access using keys. The programming language applies a hash function that maps the keys to positions in the array. It is a great choice when performing a many-to-one merge. This paper also gives examples to illustrate how the hash object works.
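The key-based lookup behind a many-to-one merge can be illustrated outside SAS. The following is only a minimal sketch in Python, not the authors' SAS code: a Python dict plays the role of the hash object, and the table contents, the `hash_merge` function, and the field names are all hypothetical.

```python
# Illustrative sketch only: a Python dict stands in for the SAS hash object.
# All data, names, and the hash_merge helper are hypothetical.

# Small lookup table with one row per key (the "one" side).
subjects = [
    {"usubjid": "001", "arm": "A"},
    {"usubjid": "002", "arm": "B"},
]

# Large table with many rows per key (the "many" side).
labs = [
    {"usubjid": "001", "test": "ALT", "value": 30},
    {"usubjid": "002", "test": "ALT", "value": 41},
    {"usubjid": "001", "test": "AST", "value": 25},
    {"usubjid": "003", "test": "ALT", "value": 38},
]


def hash_merge(big, small, key):
    """Many-to-one merge: load the small table into a hash table once,
    then read the big table a single time, with no sorting required."""
    lookup = {row[key]: row for row in small}  # key item -> data items
    merged = []
    for row in big:
        match = lookup.get(row[key])  # average constant-time lookup
        if match is not None:  # keep only rows whose key was found
            merged.append({**row, **match})
    return merged


merged = hash_merge(labs, subjects, "usubjid")
```

In this sketch neither table needs to be sorted, and the large table is read only once, which is the usual reason a hash lookup is considered for this kind of merge.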
An effective way to produce laboratory shift table
Amanda Yi, Roche
A shift table is often required in clinical trials, especially for laboratory data analysis. The purpose of a shift table is to show how results change from baseline to the post-baseline visits in the study. This paper starts from an example and explains the anatomy of a shift table, then gives a step-by-step explanation of how to create one quickly.
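As a rough illustration of what a shift table contains, the sketch below cross-tabulates baseline category against post-baseline category. It is written in Python rather than SAS and is not taken from the paper; the category names, the `shift_table` function, and the sample records are assumptions made for the example.

```python
from collections import Counter

# Hypothetical reference-range categories, in display order.
CATEGORIES = ["LOW", "NORMAL", "HIGH"]


def shift_table(records):
    """Count subjects for each (baseline, post-baseline) category pair.
    Rows are baseline categories; columns are post-baseline categories."""
    counts = Counter((r["baseline"], r["post"]) for r in records)
    return [[counts[(b, p)] for p in CATEGORIES] for b in CATEGORIES]


# Hypothetical data: one record per subject for a single lab test and visit.
records = [
    {"baseline": "NORMAL", "post": "NORMAL"},
    {"baseline": "NORMAL", "post": "HIGH"},
    {"baseline": "LOW", "post": "NORMAL"},
    {"baseline": "HIGH", "post": "HIGH"},
    {"baseline": "NORMAL", "post": "NORMAL"},
]

table = shift_table(records)
for label, row in zip(CATEGORIES, table):
    print(f"{label:<8}", row)
```

Cells on the diagonal count subjects whose category did not change; off-diagonal cells count the shifts.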
Proportion difference and confidence interval based on Cochran-Mantel-Haenszel method in stratified multi-center clinical trial
Huiping Zhang, Fountain Medical Development, Inc.
In stratified randomized multi-center clinical trials, we often account for the stratification factor of center when estimating the common treatment effect. In particular, a strata-adjusted proportion difference can be obtained as a weighted average of the stratum-specific proportion differences. Several weighting strategies are available, such as Cochran-Mantel-Haenszel (CMH) weights. However, no option in the FREQ procedure can be used for this. In this paper, we develop a program to perform the CMH-weighted analysis and compute the 95% asymptotic confidence interval for the proportion difference based on a Wald-type statistic.
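The weighted-average idea can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' SAS program: it uses the usual CMH weights w_k = n1k * n2k / (n1k + n2k) and assumes a simple Wald-type variance built from the stratum-level binomial variances. The function name `cmh_risk_difference` and the sample counts are hypothetical, and the paper's exact variance formula may differ.

```python
from math import sqrt
from statistics import NormalDist


def cmh_risk_difference(strata, level=0.95):
    """CMH-weighted proportion difference with a Wald-type confidence interval.

    strata: list of (x1, n1, x2, n2) tuples, i.e. responders and group size
    for treatment 1 and treatment 2 within each stratum.
    Assumed variance: sum(w_k^2 * var_k) / (sum(w_k))^2, where var_k is the
    sum of the two binomial variances in stratum k.
    """
    sum_w = 0.0
    sum_wd = 0.0
    sum_w2var = 0.0
    for x1, n1, x2, n2 in strata:
        p1, p2 = x1 / n1, x2 / n2
        w = n1 * n2 / (n1 + n2)  # CMH weight for this stratum
        var = p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2
        sum_w += w
        sum_wd += w * (p1 - p2)
        sum_w2var += w * w * var
    diff = sum_wd / sum_w
    se = sqrt(sum_w2var) / sum_w
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    return diff, se, (diff - z * se, diff + z * se)


# Hypothetical two-center example: (x1, n1, x2, n2) per center.
strata = [(20, 50, 10, 50), (30, 100, 20, 100)]
diff, se, (lower, upper) = cmh_risk_difference(strata)
print(f"difference={diff:.4f}, se={se:.4f}, CI=({lower:.4f}, {upper:.4f})")
```

With these weights, larger and more balanced centers contribute more to the pooled difference than small or unbalanced ones.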