List of abstracts subject to change. Last updated 26-May-2017.
- Applications Development & Technical Techniques
- Coder’s Corner
- Data Standards/CDISC and Regulatory Submission
- Data Management & Validation
- Data Visualization and Graphics
- Hands-On Workshop (HOW)
- Management and Career Development
- Statistics & Pharmacokinetics
Applications Development & Technical Techniques
Solving the 1,001-Piece Puzzle in 10 (or Fewer) Easy Steps: Using SASv9.cfg, autoexec.sas, SAS Registry, and Options to Set Up Base SAS®
Peter Eberhardt, Fernwood Consulting Group Inc
Are you frustrated with manually setting options to control your SAS® Display Manager sessions but become daunted every time you look at all the places you can set options and window layouts? In this paper, we look at various files SAS accesses when starting, what can (and cannot) go into them, and what takes precedence after all are executed. We also look at the SAS Registry and how to programmatically change settings. By the end of the paper, you will be comfortable in knowing where to make the changes that best fit your needs.
FREQUENTLY USED SAS OPTIONS IN CLINICAL TRIAL PROGRAMMING
Wanqing Li, Amgen
SAS has many system options (464 in SAS 9.4) that control the way SAS performs operations. They fall into categories such as Communications, Environment control, Files, Graphics, Input control, Log and procedure output control, Macro, and System administration. Most options never need to be changed from their defaults, but dozens are really useful in SAS programming for data analysis: they can help you debug macros easily, manage the many macros in a study, control the output display style, suppress log messages you do not want to see, interact between SAS and Windows, and so on. This paper gives an overview of SAS options, summarizes the options most frequently used by clinical trial programmers in drug development, and explains in detail how to use them to achieve your goals.
HASH in PROC FCMP and Its Application
Qing Liu, Eli Lilly and Company
HASH, a technique that improves the performance of table lookup, search and sort operations, is often used in the DATA step of SAS programs. While HASH can improve the efficiency of a SAS program, it requires more syntax, and that syntax is complicated compared with the general SAS programming language, so HASH programs are often hard to maintain. This paper introduces a new approach to using HASH in SAS®: the HASH object in PROC FCMP. This approach joins the power of HASH and PROC FCMP and provides an efficient and convenient way to perform table lookup. A practical application is also provided to demonstrate how HASH in PROC FCMP can create efficiency and convenience.
Networked Table-shell Exploration
Jianfeng Ye, SANOFI
In recent years, internet technology has developed rapidly and has deeply affected the work patterns of other industries. Every process in drug clinical trials is accelerating under the influence of the Internet. As an important part of clinical trials, the statistical analysis process is precise and complex. In the design stage of a clinical trial, statisticians need to explain to programmers how the information is to be presented in tables, listings and graphs; this specification is generally known as the table-shell. It is the final presentation of the data on which programming is based. Usually, statisticians use the table-shell of a similar clinical study as a reference when designing the table-shell of the current study, and modify it according to the SAP requirements. This reduces the workload significantly. However, even the current process of designing statistical analysis table-shells is still time-consuming, manually repetitive and tedious. After digging into the table-shell design process, we found that the table is generally the relatively complex part, consisting of titles, tabulation and footnotes; the tabulation includes the table head and body. Listings and graphs have a structure similar to that of tables. So we can split a table into different parts and store them in XML format. When generating a new table-shell, the statistician only needs to select a combination of the stored parts. This greatly simplifies table-shell design. Meanwhile, modifications due to differences between the reference table and the new requirements can easily be accomplished with a web scripting language. In addition, a large collection of clinical table-shells shared on an internet platform will build a rich table-shell database. Each newly generated table-shell further enriches the database, which ultimately benefits the whole industry.
Merge multiple RTF files, convert to PDF and add bookmarks
Xiang Wang, FMD
We routinely combine many RTF output files from statistical analysis into a single RTF file to expedite the review process requested by the client or business user. To make the merged file easy to browse, we need to add a TOC (table of contents) that links to the corresponding tables. We also need to convert the merged file to PDF and add bookmarks, so that it cannot be modified easily. However, merging multiple RTF files causes some problems, for example supporting both English and Chinese, and the large amount of time needed to add bookmarks manually. This paper introduces the process of combining RTF files and some techniques for common problems. It covers three parts: merging multiple RTF files, converting the result to PDF, and adding bookmarks. Many outputs are shown to explain how the problems are solved. Finally, the advantages and disadvantages of the approach are presented.
RTF TO SAS dataset - an efficient way for tracking outputs
Weston Chen, Novartis
The Rich Text Format (RTF) is a document file format developed by Microsoft almost 30 years ago for cross-platform document interchange. Most word processors, such as Microsoft Word, are able to read and write RTF documents. The RTF Specification is a method of encoding formatted text and graphics for easy transfer between applications. An RTF file consists of unformatted text, control words, control symbols, and groups. In the pharmaceutical industry, almost all Tables, Figures and Listings (TFLs) are created in RTF format. These RTF files are typically compared visually with the output generated from the same specifications by a second programmer or statistician. When a large number (or size) of RTF files is involved, this method is very time-consuming, inefficient and prone to mistakes. This presentation introduces a method for converting RTF files into SAS datasets, so that quality can, to some degree, be ensured simply by comparing the resulting SAS datasets.
Journey to the center of the earth – Deep understanding of SAS language processing mechanism
Di Chen, SAS Beijing R&D
SAS is a highly flexible and extensible programming language with a rich library of encapsulated programming procedures. It is designed for data processing and statistical analysis. However, SAS is procedure-oriented, and if you do not clearly understand the SAS language processing mechanism, it is easy to get unexpected results even when the code has no syntax errors. This paper mainly introduces how the SAS language is processed and resolved in the background.
Build SAS IDE by Sublime Text
Xinghu Liu, Taimei Technology
We all know that SAS is the statistical programmer's everyday tool and language. With a powerful integrated development environment (IDE), you can achieve quick, quality results much more easily. SAS offers users many choices, such as the SAS Enhanced Editor, SAS Enterprise Guide and SAS Studio, but many users, especially some SAS geeks, do not get on well with these tools. Sublime Text is a sophisticated text editor for code, markup and prose. It has a lot of raving fans, and the words “best code editor ever” are dropped regularly online. This paper introduces how to use Sublime Text to build a customized IDE. First, for file management and navigation, Sublime Text helps users keep work organized and find their way around projects. Second, the code editor, the core function of Sublime Text, provides a boatload of features that help users write code more efficiently, such as syntax highlighting, multiple views, commenting, and code autocompletion for functions, formats, procedures, macros and programs. Third, users can run SAS code from Sublime Text; the code runs in batch mode and all output files (.log, .lst, etc.) are put in the same folder as the SAS code. Finally, Sublime Text can edit and submit R and Python code interactively alongside SAS.
To check the existence of objects in SAS systems
Jianjun Tan, Sanofi
This article aims to summarize SAS techniques that are useful for checking the existence of objects in a SAS system. Examples are provided to demonstrate their usage; all examples were tested in SAS version 9.4. The types of objects include SAS datasets, variables, formats, informats, macros, macro variables, library references, directory paths, specific files within a directory, etc. The main relevant SAS techniques are: 1) dictionary tables in the SQL procedure and their associated SASHELP views; 2) SAS functions. Other basic techniques regarding properties of objects in a SAS system are also mentioned for reference.
A convenient tool to check the pagination issue in TLF
Yong Cao, Xuan Sun, PPD Inc.
Pagination issues happen when there are too many records to fit on a single page. The common way to check for such issues is to open the RTF outputs one by one and review them manually. However, this is time-consuming and laborious, and issues may sometimes be overlooked through negligence. To solve this problem, this paper introduces a tool that checks for pagination issues automatically. When a BY statement is used in PROC REPORT within the context of ODS RTF (or ODS TAGSETS.RTF), SAS starts a new section whenever it comes to a new set of BY values. The key to precisely controlling the contents of a page is to keep all records within the same section on a single page. The core of the tool is a piece of VBScript that retrieves the number of sections and the number of pages from an RTF output. It can be wrapped in a SAS macro so that it is convenient to invoke within a SAS program. What’s more, the macro also provides functionality to check outputs in batch.
Writing Efficient Queries in SAS® Using PROC SQL with Teradata
Mina Chen, Roche
The emergence of big data, as well as advancements in data science approaches and technology, is providing pharmaceutical companies with an opportunity to gain novel insights that can enhance and accelerate drug research and development. The pharmaceuticals industry has seen an explosion in the amount of available data beyond that collected from traditional, tightly controlled clinical trial environments. Investing in data enrichment, integration, and management will allow the industry to combine real-time and historical information for deeper insight as well as competitive advantage. On the other hand, pharmaceutical companies face unique big data challenges as they collect more data than ever before. There has never been a greater need for efficient data analysis approaches. Together, SAS® and Teradata provide a combined analytics solution that supports big data analysis and reporting. This paper will introduce, based on real study experience, how to connect to Teradata from SAS and how to conduct analysis with SQL in a more efficient way.
Project Automation? Scripting helps you out
Sean Yan, Shaozun Zhang, Jason Wang, WuXi Clinical Development Services Co., Ltd
Before delivery of SDTM/ADaM, the lead programmer needs to carry out a series of repeated processes for the deliverable package. This can be labor-intensive and time-consuming when an intermediate step fails and forces a restart. This paper introduces an automation method that uses scripting to execute a given series of SAS programs, produce the Pinnacle 21 validation report with the desired parameters, and process the validation report using an existing commented validation report. This allows full automation of the project, from program execution through domain validation to validation report processing, with a simple double click.
SAS Application to Automate a Comprehensive Review of DEFINE and All of its Components
Walter Hufford, Vincent Guo, Mijun Hu, Novartis Pharmaceuticals
The DEFINE is a large electronic document comprised of many different but interrelated components such as the annotated Case Report Form, Data Reviewer's Guide and metadata. Further complicating the situation, most electronic submissions contain several studies plus an SCS and an SCE, each of which requires its own DEFINE. Reviewing the DEFINE to ensure consistency, accuracy and completeness within a single DEFINE as well as across DEFINEs is both time-consuming and resource-intensive (and often mind-numbing if you have a large submission with many DEFINEs to review). Automating review of the DEFINE can be achieved with a few simple, easy to develop SAS® macros. The result is a much quicker review requiring substantially fewer resources. In addition, findings are noted in a standardized manner, allowing for quicker issue resolution and tracking. We will describe our DEFINE review tool in detail and provide code snippets from our macros which should allow you to implement a similar tool with little effort.
Coder’s Corner
SYMply PUT: GET the most out of SYMPUTX and SYMGETN
Rob Howard, Veridical Solutions
SYMPUTX and SYMGETN are simple, yet extremely powerful and useful functions. The CALL SYMPUTX routine allows you to quickly and easily store dataset values into macro variables, while the SYMGETN function retrieves these values. Using these functions saves time and reduces the chance of data calculation errors in even the most complex programs. By working through a practical example, you will see the utility and flexibility of these indispensable functions.
Automatic Retention with MERGE statement
Yuni Chen, Roche(China) Holding Ltd.,
As a SAS programmer, we may get very much familiar with MERGE statement, which will be used by quite a few of us whenever we need to combine some variables. Sometimes, however, we may get UNEXPECTED result when doing data manipulation to the variable joined from merge statement. In this paper, the author will use examples to demonstrate the AUTOMATIC RETENTION with merge statement and how to resolve the possible unexpected result. It is always a good way to learn from other’s mistakes!
Extract Information from aCRF by using EXCEL VBA
Kai Zhou, Annie Xu, PAREXEL
We need to extract information from the annotations of the aCRF, such as CRF page, value list and code list, to state the origin of SDTM variables when building the metadata. We usually search for such information directly in the PDF file, which makes the task very tedious, time-consuming and inaccurate. This paper introduces a method to automate and accelerate the process by using EXCEL VBA.
The Other Six Senses of SAS
Deepak Ananthan, Karthik Raja Subbiah, Sachin Bharadia, Zifo RnD Solutions
SAS® provides a wide variety of features through many built-in functions. It is so vast that it is humanly impossible for one person to know all of them. The tools available to us are of no help if we are not aware of how to use them to solve our problems. This paper takes some specific challenges we face as CDISC programmers and shows how SAS® helps us address them. 1. Displaying dates in specific formats in tables, listings or reports can be really challenging. Each sponsor has its own preferences, and SAS® provides a variety of options to display dates in different formats. 2. Sorting visits by visit name can be a challenge because of the way SAS® sorts character variables. How we wish SAS® would treat numerals within a character value as numbers while sorting! Yes, we have an answer. 3. Transposing CRF data for FINDINGS domains in SAS® can be cumbersome if you use the OUTPUT statement. The number of lines of code can be huge, depending on the number of TESTCDs, and maintaining this code can be challenging. How can this be avoided using ARRAYS? 4. Sticking with FINDINGS domains, RESULT variables are typically arranged in order. Is there a way to tell SAS® to take all variables from a certain starting variable (say HEIGHT) to another variable (say WEIGHT) and perform the same operation on all of them? 5. How can BYVAL help us in PROC REPORT to print the same structure of results for different categories such as Hematology, Urine Analysis, Chemistry, etc.?
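The visit-sorting problem in item 2 can be illustrated outside SAS. The abstract does not say which SAS feature the paper uses, so the following is only a minimal Python sketch of the general idea (compare digit runs as numbers rather than as characters). The visit names and the helper `natural_key` are hypothetical.

```python
import re

def natural_key(text):
    """Split text into alternating non-digit and digit runs.

    Digit runs become integers, so "Visit 10" sorts after "Visit 2".
    """
    parts = re.split(r"(\d+)", text)
    return [int(p) if p.isdigit() else p.lower() for p in parts]

# Hypothetical visit names, not taken from the paper.
visits = ["Visit 10", "Visit 2", "Screening", "Visit 1"]

plain_sorted = sorted(visits)                      # character-by-character order
natural_sorted = sorted(visits, key=natural_key)   # numerals compared as numbers

print(plain_sorted)
print(natural_sorted)
```

With plain character sorting, "Visit 10" lands before "Visit 2"; with the natural key, the visits come out in numeric order.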
One Step to Produce Shift Table by Using PROC REPORT
Haiqiang Luo, PPD Inc.
Shift tables play a very important role in clinical trial analysis. A shift table displays the number of subjects in each range (e.g. low, normal, or high) or grade of interest at baseline and their shift or transition at selected time points or time intervals. The purpose of the shift table is to illustrate changes from baseline and to help make reasonable inferences. The common programming logic is to first get the frequencies from PROC SUMMARY (or PROC SQL, PROC FREQ), then calculate percentages in a DATA step, transpose the dataset with PROC TRANSPOSE, and finally produce the report with PROC REPORT. This paper simplifies these tedious steps and uses a single PROC REPORT step to produce the shift table from the analysis dataset.
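To make the structure of a shift table concrete, here is a small Python sketch of the underlying cross-tabulation (baseline category by post-baseline category, with counts and percentages). It is not the PROC REPORT approach the paper describes. The records are invented, and using the total number of subjects as the percentage denominator is an assumption for illustration.

```python
from collections import Counter

CATEGORIES = ["Low", "Normal", "High"]

# Hypothetical records: (subject, baseline category, post-baseline category).
records = [
    ("001", "Normal", "High"),
    ("002", "Normal", "Normal"),
    ("003", "Low", "Normal"),
    ("004", "High", "High"),
    ("005", "Normal", "High"),
]

def shift_table(rows):
    """Return {(baseline, post): (n, percent)} for every category pair."""
    counts = Counter((base, post) for _, base, post in rows)
    total = len(rows)  # assumed denominator: all subjects
    table = {}
    for base in CATEGORIES:
        for post in CATEGORIES:
            n = counts.get((base, post), 0)
            pct = 100.0 * n / total if total else 0.0
            table[(base, post)] = (n, round(pct, 1))
    return table

table = shift_table(records)

# Print one row per baseline category, one column per post-baseline category.
for base in CATEGORIES:
    cells = ["%d (%.1f%%)" % table[(base, post)] for post in CATEGORIES]
    print(base.ljust(8), "  ".join(cells))
```

Every category pair gets a cell, including pairs with zero subjects, which is the usual reason the frequency, percentage and transpose steps are needed in the first place.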
Could you make joining datasets more efficient, especially many-to-many joins?
Wei Duan, PPD
Joining datasets to get variables is very common in SAS programming. Usually we implement the join with a DATA step MERGE or an SQL join. However, these are not always efficient. Depending on whether the join key is unique, dataset joins can be divided into two groups. The first group includes one-to-one and one-to-many joins. The second group covers many-to-many joins. This paper focuses on using FORMAT and SET SET for one-to-one and one-to-many joins, and also presents the POINT CONTROL and HASH OBJECTS methods for many-to-many joins. The advantages, disadvantages and applicable conditions of each method are summarized at the end of the paper.
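The idea behind a hash-based many-to-many join can be sketched in Python, where a dictionary of lists plays the role of a hash object that allows multiple rows per key. This is an illustration of the general technique only, not the paper's SAS code. The datasets, variable names and the function `many_to_many_join` are hypothetical.

```python
from collections import defaultdict

def many_to_many_join(left, right, key):
    """Inner join where the key may repeat on both sides.

    The right-hand rows are indexed once by key; each left-hand row is
    then paired with every right-hand row that shares its key value.
    """
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)

    joined = []
    for lrow in left:
        for rrow in index.get(lrow[key], []):
            merged = dict(rrow)
            merged.update(lrow)  # left-hand values win on name clashes
            joined.append(merged)
    return joined

# Hypothetical adverse event and concomitant medication records.
ae = [
    {"USUBJID": "01", "AETERM": "Headache"},
    {"USUBJID": "01", "AETERM": "Nausea"},
    {"USUBJID": "02", "AETERM": "Rash"},
]
cm = [
    {"USUBJID": "01", "CMTRT": "Aspirin"},
    {"USUBJID": "01", "CMTRT": "Ondansetron"},
    {"USUBJID": "03", "CMTRT": "Ibuprofen"},
]

joined = many_to_many_join(ae, cm, "USUBJID")
for row in joined:
    print(row["USUBJID"], row["AETERM"], row["CMTRT"])
```

Subject 01 has two rows on each side and therefore yields four joined rows; subjects 02 and 03 have no match and are dropped, as in an inner join.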
Creating a list of files within a folder and inserting EXCEL hyperlinks to open each file
Vincent Fan, Qingan Chang, Sanofi
It is a common requirement to generate a list of files contained within a folder. This paper will demonstrate how to access the directory information, how to create a Microsoft EXCEL file of the list, and especially, how to insert hyperlinks in the EXCEL list to open these files.
SAS %Macro for Exhaustive Statistics Research with Large Datasets
Hyungmi An, Myeongrae Yi, GMCCTC
Before performing statistical analysis, we should first examine the data structure, identify the variables and obtain their summary statistics. Because this requires tedious and repetitive tasks, an automated SAS %macro is useful for exhaustive statistics research with large datasets. We introduce a SAS %macro that consists of 2 steps: listing and classifying variables, then tabulating their summary statistics.
How to read RTF files into a SAS dataset?
Zhao Chunpeng, Boehringer Ingelheim
Here is my macro. It can read a monospace RTF file into a SAS dataset on its own, splitting the different columns of the RTF file into corresponding variables in the SAS dataset. It can also read an in-text table into a SAS dataset. There are only 4 macro parameters, including 2 input and output parameters. Limited: RTF file with >1 pages. Example: %readrtf_v3(filename=&E_PGMSAF/&RTF..rtf, retain_str=&R_STR, pagn=1/1, outds=rtfds_1) This macro will be shared freely at the meeting.
CMH: The Proc Freq is Extended
Linga Reddy Baddam, Inventiv Health Clinical
In the clinical industry we often use the CMH test to test the association between two variables (an exposure variable and a response variable) based on a stratification variable, e.g. comparing the response rates between two treatment groups in a multi-center study using the study center as strata. There are also scenarios with more than 2 levels in either the exposure variable or the response variable; in that case the generalized CMH test is used to compare response rates among treatment groups. In SAS, the FREQ procedure performs the CMH analysis and provides results for three different alternative hypotheses: Nonzero Correlation, Row Mean Scores Differ and General Association. There is often confusion about which of the results generated by the FREQ procedure to use for the given data so that the biostatistician's intent is preserved and accurate conclusions can be drawn. This paper aims to help beginners understand the basic usage of the CMH test, the right way to apply it, and how to choose the appropriate statistic to reach accurate conclusions.
Advanced Usage of RTF codes in ODS
Billy Xin, Taimei MobileMD System
In pharmaceutical programming work, study results are reported through the Output Delivery System (ODS) with PROC REPORT or DATA _NULL_. Both can be used to create high-quality outputs in Rich Text Format (RTF). Proponents of PROC REPORT contend that the procedure is more user-friendly and that its code is easier to understand and consequently to debug, while proponents of the DATA _NULL_ approach argue that it allows the user to fully customize reports, thus avoiding the need to tell a customer (internal or external). Rather than comparing the two tools, this presentation introduces some advanced tips for producing RTF files. Several examples are presented to demonstrate which tool is the best fit for a given scenario, and advanced tips are offered on how to enhance the output of each tool and how to streamline the use of your reporting tool of choice. This presentation is based on SAS version 9.2 or above, is not limited to any particular operating system, and is intended for both beginner programmers and experienced pharmaceutical SAS users.
The encapsulation of hash module macro
Mijun Hu, Eason Yang, Novartis Pharmaceuticals
The hash object has been presented often in recent SAS conferences. Most presentations give examples and explanations of how to apply it, yet people may still struggle with the unusual syntax. Statements such as if _n_=1, length and call missing are frequently needed to support the hash merge and to clear log issues (uninitialized...). This motivates encapsulating the hash merge process in a macro that takes only three parameters: 1. the dataset you want to merge with; 2. the key variables; 3. the new variables you would like to bring into the dataset. That’s all we need to complete a merge without bothering with the complex hash syntax.
Data Standards/CDISC and Regulatory Submission
How to define Treatment Emergent Adverse Event (TEAE) in crossover clinical trials
Mengya Yin, Wen Tan, Ultragenyx Pharmaceutical
A TEAE is defined as an event that emerges during treatment, having been absent at pretreatment, or that worsens relative to the pretreatment state. For crossover studies, things get a little more complicated: we need to specify the period in which the TEAE occurs, so as to figure out which treatment really triggered it. Meanwhile, what if adverse events occur during the washout/rest period of a crossover trial? Do these AEs count as TEAEs? If so, which treatment should we consider to have caused them? This paper discusses these questions further and suggests some options for defining TEAEs in such scenarios.
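One way to see why the washout question matters is to write down a candidate rule and look at what it does. The Python sketch below implements just one possible option, assumed purely for illustration and not taken from the paper: an AE is attributed to the most recent treatment started on or before its onset date, so washout events are assigned to the preceding treatment. It considers onset dates only and ignores worsening of pre-existing events. The dosing dates and the function `attribute_ae` are hypothetical.

```python
from datetime import date

# Hypothetical dosing periods for one subject, sorted by first dose:
# (treatment, first dose date, last dose date).
periods = [
    ("Drug A", date(2017, 1, 1), date(2017, 1, 14)),
    ("Drug B", date(2017, 2, 1), date(2017, 2, 14)),
]

def attribute_ae(onset, periods):
    """Return (treatment, phase) for an AE onset date.

    Assumed rule: the AE belongs to the latest period whose first dose is
    on or before the onset; after that period's last dose the phase is
    'washout/follow-up'. Onsets before any dose are 'pre-treatment'.
    """
    attributed = (None, "pre-treatment")
    for treatment, first_dose, last_dose in periods:
        if onset < first_dose:
            break
        phase = "on-treatment" if onset <= last_dose else "washout/follow-up"
        attributed = (treatment, phase)
    return attributed

for onset in (date(2016, 12, 25), date(2017, 1, 20), date(2017, 2, 3)):
    print(onset, attribute_ae(onset, periods))
```

Under this rule an AE starting on 20 January, during washout, is attributed to Drug A. A different rule, for example excluding washout events from TEAEs altogether, would change that result, which is exactly the kind of choice the abstract raises.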
Introduction to Therapeutic Area Data Standards User Guide for QT Studies (TAUG-QT)
David Ju, ERT Inc.
Since late 2011 CDISC has offered therapeutic area standards in order to facilitate the use of CDISC standards throughout the processes of clinical research. The CFAST QT Studies Team prepared the Therapeutic Area Data Standards User Guide for QT Studies (TAUG-QT) Version 1.0 (Provisional) in 2014. The User Guide includes sections on background information as well as an ECG overview; however, as QT is a complex domain, it is to be expected that users will face difficulties in attempting to understand the area in a short time. This paper aims to provide a summarized introduction to TAUG-QT for specific users such as data managers and statistical programmers, with tips, helpful tricks and explanatory examples.
Constructive Practice of Generating ADaM Datasets
Amos Shu, AstraZeneca
Many programmers perhaps still do not know that there are actually three ADaM standard data structures, because the latest ADaM Implementation Guide (ADaMIG) (v 1.1, 2016) still describes only two of them: the subject-level analysis dataset (ADSL) and the Basic Data Structure (BDS). However, a separate document, “ADaM Data Structure for Occurrence Data”, describes the third standard data structure, the Occurrence Data Structure (OCCDS). Questions remain as to what the derived and collected variables are in each dataset and how many ADaM datasets should be created for a study. It is virtually impossible to have a universal standard format for all ADaM datasets because of the different analysis needs of different therapeutic areas. There is also much more room for personal discretion in creating ADaM datasets than in creating their corresponding SDTM datasets. ADaM datasets contain both source and derived data, so some variables may contain the same values, for example, AVALC, VSORRES, and VSSTRESC in an ADVS dataset. This tends to confuse programmers, statisticians, and clinicians because they do not like or are not aware of traceability considerations in the creation of ADaM datasets. It is also time-consuming to go over such long documents as ADaM and ADaMIG. This paper provides a step-by-step guide to generating ADaM datasets.
Effective Statistical Programming Preparation and Support for CFDA Self-Inspections and Onsite Inspections
Peng Wan, MSD
On July 22, 2015, the CFDA requested that the country's pharmaceutical firms perform self-inspections of their China-based clinical trial sites. Following that, the CFDA's Center for Food and Drug Inspection (CFDI) conducted onsite inspections for those applications. The purpose is to ensure the integrity and quality of the clinical trials and to confirm the completeness, accuracy and/or reliability of the study data. The Statistical Programming team needs to collaborate with other functions (e.g. Statistician, Data Management, Clinical Scientist, CRA) to prepare for and support the inspections. The Statistical Programming team plays a critical role in this activity because we are familiar with the clinical trial data, which is the starting point for the analyses supporting the CSR, and we know the traceability from data collection to the accurate data package we submit to the agency. This paper introduces the normal process, best practices, common issues and lessons learned during these inspections, from a statistical programming perspective.
Extension Studies - CDISC Submission Challenges and Scenarios
Deepak Ananthan, Arvind Sri Krishna Mani, Zifo RnD Solutions
There are several types of studies being conducted around the world, with a wide variety of data being collected from them. Challenges arise in converting this data to SDTM format. The SDTM IG comes to the rescue, but there are still a number of ambiguities left to be resolved. Ever thought about a piece of data that was not collected but was needed for submission and analysis? This might not seem out of place to somebody who has worked on SDTM conversion for a follow-up study or an extension study. To assess the long-term safety, efficacy and tolerability of drugs, sponsors continue studies for an extended duration, and because of their objectives these assessments are conducted as a separate trial, albeit with the same subjects from the main study. It is also possible that the follow-up or extension study can recruit new subjects, depending on its design. In such cases it is plausible that, for existing subjects, at a bare minimum, some data such as Demographics, Medical History and Adverse Events can be moved from the main study, as opposed to new recruits, for whom the data has to be collected in the extension study. The possible variations are numerous, depending on the design and objectives of both studies. The following questions arise: 1. How do you represent these variations across subjects in SDTM? 2. How do you integrate data from both studies for analysis? Through our presentation, we will try to work out a few possible ways of representing these variations across different submission deliverables.
Assigning VISITNUM and EPOCH
Na Duan, Ru He, WuXi AppTec
Assigning VISITNUM and EPOCH values in SDTM datasets is especially challenging when scenarios not addressed in the SDTMIG are encountered. FDA requested EPOCH in clinical subject-level observations to help reviewers easily determine the phase of the trial in which an observation occurred. How should VISITNUM values be assigned to unscheduled visits across all datasets so that they sort chronologically when the unscheduled observation date is partial or the same as the planned visit date? How should EPOCH values be assigned to observations in a general observation class domain? This document addresses practices for assigning VISITNUM to unscheduled visits and methods for defining EPOCH and Trial Elements.
Create Zero Observation dataset to achieve maximum metadata resolution in CDISC Submission
M K Sinha (Ajay), Incedo
In a world of systematic and efficient programming, the success of project delivery largely depends on the accuracy of data and the implementation of approved standards. The metadata across all the files (analysis/mapping specification file, raw data and final data) needs to match for standardization and traceability. CDISC standards for almost all of the data structures clearly state the metadata of each variable, its order of appearance in a dataset, its length, etc. To achieve the same metadata as specified by company or CDISC standard files, programmers need to be very careful when generating the dataset at the final stages. Most of the time we run into situations where there are extra variables, or where the order or metadata differs from what is actually required in the final dataset. Also, since these datasets (say SDTM, ADaM) are in most cases developed in SAS code written with reference to the analysis/mapping specification file, last-minute changes to the specification file may not be carried over to the datasets. One approach to limit this error and generate the deliverable per standards is to first develop the analysis/mapping specification file as per guidelines, then import this file into SAS to create metadata from it, and use that metadata to assign the metadata of the final dataset. This paper elaborates on the technique of first creating a zero-observation dataset with the same metadata as specified in the Excel analysis/mapping specification file and then using this dataset to create the final dataset. In this approach the analysis/mapping specification file needs to be current, and the resulting datasets will be generated as per standard. A review of the final dataset will double-check the metadata consistency not only in the resulting datasets but also in the analysis/mapping specification file.
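The zero-observation idea can be sketched in Python as a metadata-only template built from the specification, to which the derived records are then conformed. This mirrors the concept rather than the paper's SAS implementation. The specification rows and the functions `empty_template` and `conform` are hypothetical.

```python
# Hypothetical specification rows: (variable, type, length, label).
SPEC = [
    ("STUDYID", "char", 12, "Study Identifier"),
    ("USUBJID", "char", 20, "Unique Subject Identifier"),
    ("AGE", "num", 8, "Age"),
]

def empty_template(spec):
    """Build a 'zero-observation dataset': column metadata and no rows."""
    fields = ("name", "type", "length", "label")
    return {"columns": [dict(zip(fields, row)) for row in spec], "rows": []}

def conform(template, records):
    """Lay records out in the template's column order.

    Variables missing from a record become None; variables that are not
    in the specification are dropped and reported for review.
    """
    names = [col["name"] for col in template["columns"]]
    extras = sorted({key for rec in records for key in rec} - set(names))
    rows = [[rec.get(name) for name in names] for rec in records]
    return {"columns": template["columns"], "rows": rows}, extras

template = empty_template(SPEC)

# Hypothetical derived records, in arbitrary order and with a leftover variable.
records = [{"AGE": 34, "USUBJID": "S1-001", "STUDYID": "S1", "TEMPVAR": 1}]

final, extras = conform(template, records)
print([col["name"] for col in final["columns"]])
print(final["rows"])
print("Not in specification:", extras)
```

Because the column order and attributes come only from the specification, a late change to the specification flows into the final dataset the next time the template is rebuilt.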
Overview of OSI Deliverables from Programming Perspective
Wang Zhang, MSD R&D (China) Co. Ltd.
For US NDA(s)/BLA(s) and their supplemental filings to FDA CDER, applicants are requested to submit summary-level clinical site data and line listings by site along with other deliverables, so that the Office of Scientific Investigation (OSI) at FDA, which is responsible for verifying the integrity of the clinical efficacy and safety data, is able to use specific algorithms to identify high-risk sites and conduct site inspections. This paper gives a brief introduction to the contents of these OSI deliverables and, as a sharing of experience, demonstrates the preparation of the package from the statistical programming point of view.
Project Standard Implementation
Yun Ma, Boehringer Ingelheim
The Project Centric Approach (PCA) is a BI BDS working concept, implemented by Data Management, Statistics, and Programming, that strives for centralisation and harmonisation of specifications and implementations with the end in mind. The scope of this working concept is a project or any collection of projects. Key features of PCA are: a harmonised, systematic and carefully reviewed central project-level document covering all trial and project deliverables and used for implementation (e.g. SAP, ADS plan, table shells); a project database and project analysis data sets as the source of all analyses at trial and project level; early specification of project standards, with more effort and resources required up front but time saved in the late phase of a project; and a shift of tasks, resources and effort from trial to project level. PCA thereby aims to ensure consistency across studies and continuity over time, improve quality compared with fragmented, disparate processes, simplify trial-level activities, eliminate duplication, accelerate the submission process, and cut costs.
To Create define.xml Using .XML and OpenCDISC Validator
Murali Neela, SAS
In the pharmaceutical industry we have to submit data in standards such as SDTM, ADaM, ODM, LAB, and SEND. Before submitting these data to the client or FDA, we have to validate them and verify compliance. Time is also spent creating the Case Report Tabulation Data Definition Specification (define.xml), which brings its own challenges when using .xml scripts and the OpenCDISC validator. This is a very important step in clinical trials, and there is a need to understand how define.xml is created using .xml and the OpenCDISC validator. Clinical SAS programmers can face issues while generating define.xml using OpenCDISC or .xml scripts. The challenges include:
- Metadata creation
- Standard and customized tools not readily available
- Expected outputs are not clear
- Lack of training and awareness
The paper will discuss using the OpenCDISC Validator to generate define.xml for the SDTM 3.1.1, SDTM 3.1.2, SDTM 3.2 and ADaM 1.0 standards.
Multiple Methods to Get Visitnum Visit from SV Domain in SDTM
Yaohua Huang, PPD
The SV domain plays a vital role in creating all kinds of domains in SDTM mapping. Other domains need to get the standard VISITNUM, VISIT and VISITDY variables from the SV domain, especially for unscheduled visits. These variables are important for the investigator to distinguish each patient's visits. The general process of creating SV will be shown. In addition, four methods (PROC SQL, PROC SORT, PROC FORMAT and the hash object) will be introduced for obtaining these visit-related variables. Each method has its merits and drawbacks. This article gives a brief introduction to the SAS code implementing these methods.
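As a rough illustration of the hash-object idea only, the sketch below uses a Python dictionary as a stand-in for a SAS hash object. The records, the choice of subject plus visit start date as the lookup key, and the helper name `add_visit` are all hypothetical and are not taken from the paper.

```python
# Hypothetical SV records: one row per subject visit (illustrative values only).
sv = [
    {"USUBJID": "001", "SVSTDTC": "2017-01-05", "VISITNUM": 1.0, "VISIT": "SCREENING"},
    {"USUBJID": "001", "SVSTDTC": "2017-01-19", "VISITNUM": 2.0, "VISIT": "WEEK 2"},
    {"USUBJID": "001", "SVSTDTC": "2017-01-25", "VISITNUM": 2.1, "VISIT": "UNSCHEDULED 2.1"},
]

# Build the lookup once, keyed by subject and visit date (the "hash object").
visit_lookup = {
    (row["USUBJID"], row["SVSTDTC"]): (row["VISITNUM"], row["VISIT"]) for row in sv
}


def add_visit(record, date_key):
    """Return a copy of `record` with VISITNUM and VISIT filled in from the SV lookup.

    Records with no matching SV row get None for both variables.
    """
    visitnum, visit = visit_lookup.get(
        (record["USUBJID"], record[date_key]), (None, None)
    )
    return {**record, "VISITNUM": visitnum, "VISIT": visit}


# A hypothetical lab record collected at the unscheduled visit.
lb = {"USUBJID": "001", "LBDTC": "2017-01-25", "LBTESTCD": "ALT"}
lb_with_visit = add_visit(lb, "LBDTC")
```

The lookup is built once and then probed per record, which is the property that usually makes a hash-style merge attractive when the other domain is large and unsorted.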
Data Standards and FDA Guidance
Beilei Xu, Sanofi
Over the years, CDISC has published many data standards, including Therapeutic Area User Guides (TAUGs). FDA requires study data standards in clinical studies that start after December 17, 2016 for NDAs, BLAs and ANDAs. This paper reviews the FDA eStudy Guidance document, the Data Standards Catalog and the Technical Conformance Guide, and summarizes key points of the FDA requirements on data standards, traceability and compliance.
How To Derive Parameters At ADaM Level, For Generating Yearly Interval Exposure Tables
Geetha Kesireddy, GCE solutions
Have you ever faced the scenario of adding parameters in ADEX to produce yearly interval tables? By selecting appropriate permissible variables from the BDS structure, we can derive the required parameters in ADaM datasets. In this scenario, we can use the AVISIT variable to group the drug dosage by year. This paper shows how to create the required parameters in the ADEX dataset, followed by report generation that includes the number of planned injections, the number of actual injections, and the ratio of actual to planned injections (%).
Challenges and complications of handling data cut in clinical data interim analysis
Yuvaraj Murgesan, GCE Solutions
Is handling a data cut in a clinical interim analysis an easy task? Here I would like to discuss the challenges and complications of different methods of implementing a data cut for interim analysis: first, a data cut at SDTM level based on a target date; second, a data cut at ADaM level based on target visit and target visit day; and finally, custom data cut logic based on target visit, study day of target visit and a specified date.
Data Management & Validation
How Did This Get Made?
Matt Becker, SAS
What data was used for my result? What macros or SAS code did I pull in? What was created? Our time is sometimes spent reviewing code for
What’s your data flow in LSAF (How to code in LSAF)
Shuang Fu, SAS
You may or may not have heard of LSAF; its full name is SAS Life Science Analytics Framework. It is a powerful solution for the clinical industry. The integrated system can manage, analyze, report, and review clinical research information. I have heard programmers complain about how hard it is to install SAS Foundation, but with this product all you need is a supported browser. Compared with SAS Studio (browser-based SAS), it is designed specifically for the clinical industry. The product has many features; this paper introduces only the data flow in LSAF, how to code in it, and how to work effectively in it. The other advantages of LSAF are left for you to explore.
Outlier Analysis in SAS
Chao Jiang, Yan Liu, Merck
Outliers sometimes arise from noise and contamination, which can confound analysis and reporting, but sometimes they contain valuable results that may affect the efficacy comparison in clinical trials. This paper presents several detection approaches for identifying outliers and implements them using multiple SAS procedures and functions. It also introduces, from a SAS programming perspective, regression model analysis as a way to assess the effect of outliers on a model of relationships among variables. Keywords: outlier, SAS code, clinical trial.
Automation of Data Validation Programming
Jasmine Zhang, PAREXEL
Data validation is one of the important processes in Data Management. Automating SAS® programming is essential to improve programming efficiency and reduce unnecessary rework. This paper presents core programming logic for automatically creating data validation programs whose checks follow general patterns. The automation is achieved by reading specifications in a required format and automatically generating program code. The automation tool includes: (1) a required format for the data validation specification; (2) a set of macros to facilitate different validation logics; (3) a macro that auto-generates programs. The macros are built using the SAS® macro facility. By reading the data validation specification, the automation tool translates the pre-specified language into SAS programming language and writes it to a SAS file.
Strip Special & Non-Printable Characters in Your Data Set
Ying Liu, Merck
Clinical data can be collected and analyzed by internal and external vendors. The various ways of importing data may introduce special and non-printable characters into the clinical data, with potential issues for developing quality data sets and deliverables. These range from major issues, such as incorrect statistical analysis in tables, listings and figures, to minor issues, such as incorrect breaks in lines and words. To help manage data effectively, this paper summarizes several special and non-printable characters found in clinical data and explains where they come from. Corresponding solutions for handling these issues are also provided.
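A minimal sketch of the general idea, written in Python rather than SAS. It assumes that "non-printable" means the ASCII control characters (codes 0 to 31 and 127) and that replacing them with a single space is acceptable; the paper may define and handle these characters differently, and the function name is hypothetical.

```python
import re

# ASCII control characters (codes 0-31 and 127), assumed unwanted in this sketch.
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def strip_nonprintable(text, replacement=" "):
    """Replace control characters, then collapse repeated spaces and trim the ends."""
    cleaned = _CONTROL_CHARS.sub(replacement, text)
    return re.sub(r" {2,}", " ", cleaned).strip()


# A hypothetical free-text value with an embedded line break and a null byte.
raw_term = "Headache\r\n mild\x00"
clean_term = strip_nonprintable(raw_term)
```

Replacing with a space instead of deleting outright avoids gluing two words together when the stray character was acting as a line break.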
Covering Proc Compare Limits
Mary Zeychelle Hundana, PPD Pharmaceutical Development Philippines Corp.
Dataset validation is one of the key factors in ensuring that the data produced is of good quality. In SAS, PROC COMPARE includes several options and statements that make it very convenient to identify differences between two datasets. However, this procedure cannot compare values across different data types, and it cannot relate two datasets regardless of their sort order. For instance, the base dataset may have a character date variable displayed as DATE9 text while the compare dataset holds the date as a numeric variable with a DATE9 format. Although the user can see by eye that the displayed values are the same, PROC COMPARE will not compare them. Another common issue is that the comparison may report differences in records even though the two datasets contain exactly the same information in a different record order. The result should have been a 100-percent data match with sorting as the only difference, but PROC COMPARE reports a complete mismatch, because it relates observations row by row rather than comparing the data as a whole. %CompareDS is a macro designed not to replace PROC COMPARE but to cover its limits. The macro focuses on non-matching records by removing observations that have exactly the same values across all variables, whether or not they are in the same row. It also aims to make variables comparable when they hold the same data in conflicting data types.
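The idea of dropping exact matches regardless of row position can be sketched as follows. This is a Python illustration under stated assumptions, not the %CompareDS macro itself: rows are plain dictionaries, every value is compared as a string to sidestep type conflicts, and the function name `unmatched_rows` is hypothetical.

```python
from collections import Counter


def unmatched_rows(base, compare):
    """Return the rows left in each dataset after removing exact matches.

    Row order is ignored, and every value is converted to a string first so
    that 2 and "2" count as the same value (an assumption of this sketch).
    """
    def as_key(row):
        return tuple(sorted((name, str(value)) for name, value in row.items()))

    base_counts = Counter(as_key(row) for row in base)
    comp_counts = Counter(as_key(row) for row in compare)
    only_base = [dict(key) for key in (base_counts - comp_counts).elements()]
    only_comp = [dict(key) for key in (comp_counts - base_counts).elements()]
    return only_base, only_comp


# Hypothetical datasets: same ID 2 row in a different position and type.
base = [{"ID": 1, "X": "a"}, {"ID": 2, "X": "b"}]
compare = [{"ID": "2", "X": "b"}, {"ID": 1, "X": "c"}]
only_base, only_comp = unmatched_rows(base, compare)
```

Using a multiset (`Counter`) rather than a plain set keeps duplicate records honest: a row that appears twice in one dataset and once in the other still leaves one unmatched copy.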
Pooling Clinical Trial Data – Best Practice for Programmers
Mina Chen, Peter Eberhardt, Roche
In the drug development industry, pooling of data is used by many companies involved in submitting clinical trial data. There are various reasons to pool data, including the production of documents for Development Safety Update Reports (DSURs), publications and parts of an NDA submission. The role of a statistical programmer is to conduct the meta-analysis by tying pooled datasets and analyses together. Data pooling across studies is a challenging process and is often time-consuming and error-prone. The key consideration to discuss before pooling is the differences between studies, which can affect the ability to interpret meta-analysis results. These differences include but are not limited to: (a) different study designs and data structures, (b) different data collection methods across studies, (c) different standards and conventions. To ensure that data from different studies can be pooled with good quality, the study design, the data sources and the way data were collected need to be fully understood. Based on our experience, this paper will show how to pool clinical trial data efficiently, covering important points such as planning, programming strategy and validation.
Data Visualization and Graphics
Consistent Group Attributes Independent of Data Order
Yu Li, Novartis
In clinical trials, figures and graphs are often used to present the results of statistical analyses. When the GROUP= option is specified to present group information in GTL, group attributes such as color, symbol and pattern are defined by default by the GraphData1, GraphData2, …, GraphDataN style elements. Yet when specific group attributes are preferred, the attributes may change unexpectedly if the data order differs between groups or some groups are missing at a certain time point. In this paper, several methods are given for defining a consistent group appearance.
How process flow standardized our process
Yeqian Gu, SAS Research and Development (Beijing) Co., Ltd
Managing business processes takes a lot of tedious work. In the pharmaceutical industry especially, with collaboration between sponsors and CROs, tasks are becoming more distributed and complex. Pharmaceutical companies require a comprehensive approach to meet their business requirements, and it is crucial for clinical studies to run the entire process in an orderly and consistent way. Workflow definitions encapsulate a combination of automated and manual activities that can be used to manage business processes in organizations. This paper uses several typical examples to illustrate how workflow in SAS LSAF handles business processes in task allocation, clinical data management and statistical analysis. It describes how business processes are modeled, developed, deployed and managed throughout their entire life cycle. Let’s take a look at examples that will give you the agility to optimize your workflow designs.
Data Explore Matrix – Quality Control by Exploring and Mining Data in Clinical Study
Yongxu Tang, Yu Meng, Yongjian Zhou, Yang Chong, Yangfei Ma, Merck
General data review mainly focuses on two aspects: in-house study review and traditional outsourced surveillance review. This paper introduces a new method called Data Explore Matrix (DEM), based on SAS® Version 9.3, to support quality control in clinical studies. It consists of 1) High Risk Data Points Identity (HRDI), 2) Experience and Knowledge input (E&K), and 3) Tools and Models (T&M). Combining these fields of knowledge with visualization methods, DEM offers users a novel way of reviewing data and detecting signals quickly. In this paper we share some examples of using DEM, focusing on Query, Adverse Events, Exposure, Laboratory and Study Visits. The authors are convinced that DEM is useful for supporting integrated information review, fast issue identification and perspective decision making.
Text handling in Graph by using Graph template language
Li Jianduo, PAREXEL
Most of us have found it difficult to keep a hanging indent for text in a figure, and adjusting the position of text in a picture is not as easy as it is in rich text format. So we often cannot place text the way we want when producing graphs. How can we make it display as requested? This paper looks at some of the lesser used and lesser understood pieces of Graph Template Language syntax. I will focus on one or two handy plot statements, what they mean and why I have personally found them useful.
What could ODS graphics do about Box Plot?
Tongda Che, Merck
The box plot is commonly used to present the distribution of data graphically. The most common box plot shows the median, first quartile, third quartile, minimum and maximum of the data. In many cases we also need to add other features to the box plot, such as displaying outliers, overlaying a scatter plot, or displaying a summary dataset above the plot. This paper discusses how to accomplish these features with ODS Graphics.
Clinical Data Visualization using TIBCO Spotfire® and SAS®
Ajay Gupta, PPD Inc
In Pharmaceuticals/CRO industries, you may receive requests from stakeholders for real-time access to clinical data to explore the data interactively and to gain a deeper understanding. TIBCO Spotfire 7.6 is an analytics and business intelligence platform, which enables data visualization in an interactive mode. Users can further integrate TIBCO® Spotfire with SAS® (used for data programming) and create visualizations with powerful functionality e.g. data filters, data flags. These visualizations can help the user to self-review the data in multiple ways and will save a significant amount of time. This paper will demonstrate some basic visualizations created using TIBCO Spotfire and SAS using raw and SDTM datasets. This paper will also discuss the possibility of creating quick visualizations to review third party vendor (TPV) data in formats like EXCEL® and Comma Separated File (CSV).
Hands-On Workshop (HOW)
Hands-on GTL
Kriss Harris, SAS Specialists Ltd
Would you like to be more confident in producing graphs and figures? Do you understand the differences between the OVERLAY, GRIDDED, LATTICE, DATAPANEL, and DATALATTICE layouts? Would you like to know how to easily create life sciences industry standard graphs such as adverse event timelines, Kaplan-Meier plots, and waterfall plots? Finally, would you like to learn all these methods in a relaxed environment that fosters questions? Great—this topic is for you! In this hands-on workshop, you will be guided through the Graph Template Language (GTL). You will also complete fun and challenging SAS graphics exercises to enable you to more easily retain what you have learned. This session is structured so that you will learn how to create the standard plots that your manager requests, how to easily create simple ad hoc plots for your customers, and also how to create complex graphics. You will be shown different methods to annotate your plots, including how to add Unicode characters to your plots. You will find out how to create reusable templates, which can be used by your team. Years of information have been carefully condensed into this 90-minute hands-on, highly interactive session. Feel free to bring some of your challenging graphical questions along!
SAS(R) Studio - The Next Evolution of SAS Programming Environments
Matt Becker, SAS
SAS Studio is the newest SAS programming environment, and provides many tools to help you with your programming tasks. In this Hands on Training session, we'll take a tour of this enhanced programming environment, highlighting the following features: the dataset browser, where we will build filters and change what columns are displayed, the snippet manager, where we will explore existing code snippets and learn how to create and manage our own code, the task manager, where we will see how to generate code using a GUI, and then see how to build our own tasks, and the visual query builder, where we will see how to combine datasets quickly and efficiently. SAS Studio is a web-based tool, so you will be able to code and interact with SAS from just a browser. Come see how this tool can help you be a more efficient programmer.
The SAS® Hash Object: It’s Time To .find() Your Way Around
Peter Eberhardt, Fernwood Consulting Group Inc
“This is the way I have always done it and it works fine for me.” Have you heard yourself or others say this when someone suggests a new technique to help solve a problem? Most of us have a set of tricks and techniques from which we draw when starting a new project. Over time we might overlook newer techniques because our old toolkit works just fine. Sometimes we actively avoid new techniques because our initial foray leaves us daunted by the steep learning curve to mastery. For me, the PRX functions and the SAS® hash object fell into this category. In this workshop, we address possible objections to learning to use the SAS hash object. We start with the fundamentals of setting up the hash object and work through a variety of practical examples to help you master this powerful technique.
ggTables: A set of SAS Macros to Automatically Produce Statistical Tables in Clinical Research
Hongqiu Gu, Beijing Tiantan Hospital
A statistical table is one of the most important ways to show statistical results in clinical research. As statistical tables are more concise than text and contain more, and more exact, information than graphs, they are frequently used in statistical reports and academic articles. SAS has various tools to produce tables, but none of them can directly generate statistical tables that meet the publishing requirements of academic journals. Many SAS users and developers have published macros that generate statistical tables with both descriptive statistics and P values; however, most of these macros are limited to producing a baseline comparison table. In fact, besides the baseline table, there are many other kinds of statistical tables. We need to summarize these kinds of tables and then study how to produce them using the tools SAS provides. One macro is not enough; what we need is a set of SAS macros that can produce all kinds of statistical tables for us. The purpose of this paper is to introduce a set of SAS macros that can automatically produce nine kinds of highly customized, ready-to-publish statistical tables in RTF format. These user-friendly SAS macros fall into four groups: baseline statistical table macros, outcome statistical table macros, risk factor statistical table macros, and subgroup analysis statistical table macros.
Management and Career Development
Core Forces Taking You to Career Successes
ShiMin Wang, 深圳赋能教育科技有限公司
When you look around, some people succeed more, and faster, than others. What core forces can make you one of them? This session is about getting yourself out of the vague areas between system thinking and common thinking. You will learn how system thinking helps you filter out bad edges and lets the power of good thinking grow. As a result, the session benefits your flexibility of thinking, your problem solving and your agility in the face of work-life changes. Ultimately, such system thinking (the Power of Frame) will guide you toward success in your career development.
Talent Trends and Talent Needs of Multinational Enterprises
Samantha Xie, Boehringer Ingelheim
This is an era in which four generations of the workforce (those born from the 1950s to the 1990s) co-exist in the labor market. By 2025, the post-80s and post-90s generations will become the mainstream as well as the leading force. The talent market will change dramatically, with high demand for good talent. What can you do as an individual to meet market and enterprise expectations, and what can we do as companies to provide you with the best career opportunities? This presentation offers a general projected picture of the expectations of both the future market and enterprises. From the presentation, you will better understand what to prepare, how to prepare it, and where to position yourself for career success.
Statistics & Pharmacokinetics
Time-Dependent Covariates 'Survival' More in PROC PHREG
Fengying XUE, Michael Lai, Sanofi
Survival analysis is a powerful tool with many strengths, and the semi-parametric Cox model in PROC PHREG is its most popular form. What explains this enormous popularity? The most important reason is that, under the proportional-hazards assumption, it does not require you to choose a particular probability distribution. In some cases, however, such as long-term follow-up studies or covariates whose values really change over the course of the study (such as age), the proportional-hazards assumption of constant hazard ratios is frequently violated, and PHREG can handle this too. This is the second reason: it is relatively easy to incorporate time-dependent covariates. Doing so offers the chance to model how covariates change over time, leading to a more robust and accurate outcome.
Receiver Operating Characteristic (ROC) Curve Analysis using SAS
Zheng Yao, Merck Serono
From a clinical perspective, biomarkers (e.g. PD-1/L1) draw a lot of attention nowadays, especially in oncology studies. It is often of interest to use a biomarker for disease screening, diagnosis, prognosis, prediction and surveillance. However, the foundation for using a biomarker is its accuracy, i.e. its ability to correctly distinguish one condition, such as diseased, from another, such as non-diseased. It is therefore important to find a rational cut-off point for the biomarker in clinical practice. Statistically, receiver operating characteristic (ROC) curve analysis is a useful tool for selecting a cut-off point and assessing biomarker accuracy. Comparison of two or more ROC curves is also useful in clinical research. The purpose of this paper is therefore to: (1) explain the background terminology associated with ROC curve analysis and help interpret the analysis output; (2) find a rational cut-off point by performing ROC curve analysis; (3) calculate sensitivity and specificity for a predetermined cut-off point in ROC curve analysis; (4) compare two ROC curves. An example of ROC curve analysis for PD-L1 biomarker scoring is provided in the paper.
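To make the cut-off selection concrete, here is a small Python sketch. It assumes that higher scores indicate disease and that the cut-off is chosen by maximizing Youden's J (sensitivity + specificity - 1); the abstract does not say which criterion the paper uses, and the scores below are made up for illustration.

```python
def sens_spec(scores, diseased, cutoff):
    """Sensitivity and specificity when a score >= cutoff is called positive."""
    pairs = list(zip(scores, diseased))
    tp = sum(1 for s, d in pairs if d and s >= cutoff)
    fn = sum(1 for s, d in pairs if d and s < cutoff)
    tn = sum(1 for s, d in pairs if not d and s < cutoff)
    fp = sum(1 for s, d in pairs if not d and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(scores, diseased):
    """Return the observed score that maximizes Youden's J (first one on ties)."""
    def youden_j(cutoff):
        sensitivity, specificity = sens_spec(scores, diseased, cutoff)
        return sensitivity + specificity - 1

    return max(sorted(set(scores)), key=youden_j)


# Hypothetical biomarker scores and true disease status (1 = diseased).
scores = [1, 5, 10, 20, 30, 40, 50, 60, 80, 90]
diseased = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]
best_cutoff = youden_cutoff(scores, diseased)
best_sens, best_spec = sens_spec(scores, diseased, best_cutoff)
```

The same `sens_spec` helper also covers the case of a predetermined cut-off: pass the fixed value instead of searching for one.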
A SAS Macro for Area Under Curve Calculation: Demonstration with Plasma Drug Concentration and Continuous Glucose Monitoring
Ka Chun CHONG, Xiaoran Han, Shenzhen Research Institute, The Chinese University of Hong Kong
Area Under Curve (AUC) is a typical parameter for analytic representation. Among the various AUC calculation methods, the trapezoidal rule is the most popular numerical approach for approximating the integral of a curve. In this paper, we developed a SAS macro to assist in AUC calculation from a response-by-time dataset. The use of the macro is demonstrated with two examples: 1. the plasma drug concentration curve in a Phase I study, and 2. the glucose level monitoring curve in a diabetes trial.
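The linear trapezoidal rule sums, over each pair of adjacent time points, the interval width times the mean of the two responses: AUC = Σ (t[i+1] - t[i]) × (c[i] + c[i+1]) / 2. A minimal Python sketch follows; it is not the authors' SAS macro, and the function name and data are hypothetical.

```python
def auc_trapezoid(times, values):
    """Area under a response-by-time curve using the linear trapezoidal rule."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two (time, value) pairs of equal length")
    points = sorted(zip(times, values))  # order by time before integrating
    return sum(
        (t1 - t0) * (v0 + v1) / 2
        for (t0, v0), (t1, v1) in zip(points, points[1:])
    )


# Hypothetical plasma concentrations (ng/mL) at 0, 1, 2 and 4 hours post-dose.
times = [0, 1, 2, 4]
conc = [0, 10, 8, 4]
auc = auc_trapezoid(times, conc)  # 5 + 9 + 12 = 26 ng*h/mL
```

Because each interval uses its own width, unequally spaced sampling times need no special handling.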
Clinical PK-PD Model: Problems and Countermeasures
Chen Wang, Amgen
A literature review of clinical pharmacokinetic studies published from 2007 to 2014 revealed that only about 30.8% (383/1242) related pharmacokinetic (PK) and pharmacodynamic (PD) indexes through modeling. Possible reasons include: the model and the concept are difficult to understand, which affects the selection of eligible studies; the methodology requires dynamic concentration and effect data, which hampers its application to most drugs; it involves many repeated sampling points that participants are unwilling to accept, which makes it difficult to use in special populations; and most studies have small sample sizes and do not consider covariates, so the information they provide is limited. Based on this analysis, this research re-clarifies the related concepts and divides clinical PK-PD models into three categories: the dynamic concentration-reaction PK-PD model, the exposure-response PK-PD model and the mechanism PK-PD model. Of these, the exposure-response PK-PD model is more valuable and flexible, with a forward-looking design, and is expected to become the mainstream method because it can provide more information.
Probability of Success - Assurance in Clinical Trials
Jack Li, dMed Company Limited
Drug development has become increasingly costly, lengthy, and risky. The call for better decision making in research and development has never been stronger. Although conventional clinical trial design involves considerations of power, high power does not necessarily translate into a high probability of success if the treatment effect to be detected is based on the perceived ability of the drug candidate. “Assurance” takes account of uncertainty about the treatment effect and can provide a more robust assessment of the likelihood of success. It becomes a useful tool, in addition to traditional power, for supporting internal decision making. Methods for calculating assurance for different types of data will be introduced in the presentation, and examples will be used to demonstrate how to use this concept in practice.
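As a sketch of the concept only, consider a two-arm trial with a normally distributed endpoint, known standard deviation, a one-sided z-test, and a normal prior on the true treatment difference. All of these are assumptions made for illustration rather than the settings covered in the presentation. Under them, assurance is the power averaged over the prior, which has a closed form.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power(delta, sigma, n_per_arm, alpha=0.025):
    """Power of a one-sided two-sample z-test at a fixed true difference delta."""
    se = sigma * (2 / n_per_arm) ** 0.5  # standard error of the estimated difference
    return _STD_NORMAL.cdf(delta / se - _STD_NORMAL.inv_cdf(1 - alpha))


def assurance(prior_mean, prior_sd, sigma, n_per_arm, alpha=0.025):
    """Power averaged over a Normal(prior_mean, prior_sd**2) prior on the difference.

    Marginally the estimate is Normal(prior_mean, se**2 + prior_sd**2), and the
    trial succeeds when the estimate exceeds z * se.
    """
    se = sigma * (2 / n_per_arm) ** 0.5
    z = _STD_NORMAL.inv_cdf(1 - alpha)
    return _STD_NORMAL.cdf((prior_mean - z * se) / (se**2 + prior_sd**2) ** 0.5)


# Hypothetical design: 100 patients per arm, SD 1, hoped-for difference 0.5.
design_power = power(0.5, sigma=1.0, n_per_arm=100)
design_assurance = assurance(0.5, prior_sd=0.2, sigma=1.0, n_per_arm=100)
```

With a prior standard deviation of zero the two functions agree; as prior uncertainty grows, assurance for a well-powered design falls below the nominal power, which is the point the abstract makes.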
Visualization & Interactive Application in Design and Analysis Using R Shiny
Baoyue LI, Boyao SHAN, Eli Lilly China
A Path to Gain Confidence with Your Confidence Interval in Clinical Trial Analysis
Yi Gu, Roche (China) Holding Ltd
In the statistical analysis of clinical trials, a confidence interval provides useful information about the precision of both efficacy and safety outcomes. Different methods of calculating a confidence interval lead to somewhat different results. A good understanding of the confidence interval methods available in SAS is therefore very important for SAS programmers, so that they generate the right numbers for the algorithm suggested by the statistician. This paper presents the SAS procedures, from the very basic PROC MEANS to PROC GENMOD, PROC SURVEYFREQ and others, that are frequently used to compute CIs for means or proportions under different clinical trial scenarios.
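To show how much the choice of method can matter, the Python sketch below compares two common intervals for a single proportion, the Wald and the Wilson score interval. Picking these two methods is an assumption made for illustration; the abstract does not list which methods the paper covers, and the counts are invented.

```python
from statistics import NormalDist


def wald_ci(x, n, level=0.95):
    """Wald (normal approximation) interval for a proportion x/n."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    p = x / n
    half_width = z * (p * (1 - p) / n) ** 0.5
    return p - half_width, p + half_width


def wilson_ci(x, n, level=0.95):
    """Wilson score interval for a proportion x/n."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    p = x / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * (p * (1 - p) / n + z * z / (4 * n * n)) ** 0.5 / denom
    return centre - half_width, centre + half_width


# Hypothetical safety outcome: 2 events among 20 subjects.
wald_low, wald_high = wald_ci(2, 20)
wilson_low, wilson_high = wilson_ci(2, 20)
```

For a small count like this, the Wald lower limit falls below zero while the Wilson limits stay inside the interval from 0 to 1, so the two methods give visibly different numbers from the same data.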
%ScoreQoL: SAS Macro-Program that Automates the Scoring of Quality of Life Questionnaires
Mitchikou Pearl Tseng, PPD
The use of Quality of Life (QoL) questionnaires has been gaining attention in clinical research as patients’ quality of life becomes a standard endpoint in clinical trials across therapeutic areas. Furthermore, the results of analyzing these questionnaires carry significant weight, since improvements in survival and response to treatment are becoming more difficult to achieve. However, programming the scores of the different scales/domains in various QoL questionnaires is tedious and repetitive, since the scoring principle and the imputation of missing values are consistent throughout a questionnaire. The most common scoring principle is to take the average of the responses to a specified list of items/questions that represent the scale/domain being measured. The raw score is then linearly transformed using the range and the lowest possible response. Afterwards, missing values are addressed. QoL questionnaires mostly vary in the number of scales/domains being measured, the number of items/questions that constitute a scale/domain, the range of responses and the cut-off (%) of missing values to be imputed. However, there is still no SAS statistical package or macro program available that robustly and efficiently addresses these diverse characteristics of QoL questionnaires. %ScoreQoL is a macro program that automates the scoring of scales/domains of different QoL questionnaires. It includes macro parameters for the cut-off, scale name, scale code, item list, range, input and output dataset, and reversed scoring. A variable that reports each scale’s number of missing items versus the expected count is also included to aid quality checks.
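The scoring principle described above can be sketched in a few lines of Python. This is not the %ScoreQoL macro: the 0 to 100 target scale, the rule that a scale is left unscored when more than a given percentage of items is missing, and all names are assumptions made for illustration, and individual questionnaires define their own rules.

```python
def score_scale(responses, min_response, response_range,
                max_missing_pct=50.0, reverse=False):
    """Score one scale: mean of answered items, linearly transformed to 0-100.

    Returns (score, n_missing). The score is None when every item is missing
    or when the percentage of missing items exceeds max_missing_pct.
    """
    answered = [r for r in responses if r is not None]
    n_missing = len(responses) - len(answered)
    if not answered or 100.0 * n_missing / len(responses) > max_missing_pct:
        return None, n_missing
    raw_score = sum(answered) / len(answered)  # mean of the answered items
    score = (raw_score - min_response) / response_range * 100.0
    if reverse:
        score = 100.0 - score  # reversed scoring
    return score, n_missing


# Hypothetical three-item scale answered on a 1-5 response range (range = 4).
score, n_missing = score_scale([2, 4, None], min_response=1, response_range=4)
```

Returning the missing-item count alongside the score mirrors the quality-check variable the abstract mentions, so a reviewer can see how many items each score rests on.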
Some applications of Proc MCMC in clinical trial
Tao Tan, Jiangsu Hengrui
Bayesian methods have become increasingly popular in modern statistical analysis and are being applied to a broad spectrum of scientific fields and research areas. MCMC methods effectively allow generation of samples from the posterior distribution without requiring the distribution explicitly. The MCMC procedure uses the Markov chain Monte Carlo (MCMC) algorithm to draw samples from an arbitrary posterior distribution, which is defined by the prior distributions for the parameters and the likelihood function for the data that you specify. This paper introduces the SAS MCMC procedure and some simple applications in phase 1 dose finding.