PharmaSUG Hybrid Single Day Event
October 20-21, 2022
Exploring the Next Phase of Data Analytics
Our 2022 SDE, the first hybrid conference, was a blast! Thank you to our sponsors for their financial support, our presenters for their insightful talks, our volunteers for helping out, and most of all, our attendees for supporting this hybrid event. All registrants have been provided with secured access to the live recordings of the presentations. If you have not received the email with instructions for access, please contact the SDE committee. Links to all of the slide presentations are provided below. You can also view our Photo Gallery!
Friday, October 21, 2022 Single-Day Event Presentations
SDE Committee
SDE Committee (L to R): Matt Becker, Pradeep Bangalore, Margaret Hung, Pallavi Sadhab
Presentation and Seminar Descriptions
ADaM Automation Roadblocks
Gustav Bernard, IQVIA
One of the main roadblocks to ADaM automation is the ADaM standard itself, which gives teams a great deal of flexibility and multiple ways to produce the same results while still applying the standard correctly.
ADSL can be very study-specific, but about 65% of it can be automated. Occurrence datasets such as ADAE, ADMH, and ADCM are much more standard from study to study, and more than 80% can be automated. BDS datasets can be somewhat more study-specific, but breaking down the contents of a BDS dataset by complexity helps identify which datasets can be automated. For example, a BDS dataset that only retains parameters from SDTM to ADaM can have more than 80% of its content automated.
CDISC SDTM IG v3.4: Subject Visits
Ajay Gupta, Daiichi Sankyo
The Study Data Tabulation Model Implementation Guide for Human Clinical Trials (SDTMIG) Version 3.4 has been prepared by the Submissions Data Standards (SDS) team of the Clinical Data Interchange Standards Consortium (CDISC). Like its predecessors, v3.4 is intended to guide the organization, structure, and format of standard clinical trial tabulation datasets submitted to a regulatory authority. Version 3.4 supersedes all prior versions of the SDTMIG. In this presentation, I will do a quick walk-through of the updates in SDTMIG v3.4 relative to its predecessor. Later, I will go over the updated Subject Visits (SV) domain with examples, e.g., the new proposed mapping to include missed visits and how to use additional variables in SV.
Federated Learning and Virtual Data Lake
Sanjay S. Jaiswal, Accenture
Federated learning is a relatively new approach that leverages edge computing and data analytics, enabling collaboration across platforms, technologies, data standards, and territories. A virtual data lake provides data access without the requirement to physically share or transfer data. It is platform and cloud agnostic, designed as a plug-in component for existing infrastructures. The combination of virtual data lakes and federated learning allows in-situ access and data analytics. This approach enables Life Sciences and Healthcare organizations to offer personalized insights and services by providing access to holistic individual data at scale, and it helps identify high-value applications in the real world. It supports innovation by enabling new AI-based modules and services to be developed with access to a wider variety of data, and by enabling advanced solutions to be built while preserving data privacy. Finally, this approach enhances competitiveness by improving current AI algorithms, which now have access to a larger volume of (training) data, and by enhancing services based on improved insights. Business, functional, and technical details, along with actual client case studies in early R&D, clinical development, regulatory approval, drug launch, pharmacovigilance, disease prevention, diagnosis and treatment, and long-term disease management, will be discussed, demonstrating the wide applicability of this data analytics approach.
A Quick Look at Fuzzy Matching Programming Techniques Using SAS Software
Stephen Sloan, Accenture
Data comes in all forms, shapes, sizes, and complexities. Stored in files and datasets, SAS users across industries recognize that data can be, and often is, problematic and plagued with a variety of issues. Data files can be joined without problem when each file contains identifiers, or "keys", with unique values. However, many files do not have unique identifiers and need to be joined by character values, like names or e-mail addresses. These identifiers might be spelled differently, or use different abbreviation or capitalization protocols. This paper illustrates datasets containing a sampling of data issues; popular data cleaning and user-defined validation techniques; data transformation techniques; traditional merge and join techniques; an introduction to SAS character-handling functions for phonetic matching, including SOUNDEX, SPEDIS, COMPLEV, and COMPGED; and an assortment of SAS programming techniques to resolve key identifier issues and to successfully merge, join, and match less-than-perfect, or "messy", data. Although the programming techniques are illustrated using SAS code, many, if not most, of them can be applied to any software platform that supports character handling.
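The SAS functions named above have well-known open-source analogues. As a rough illustration of the underlying techniques, here is a minimal Python sketch of phonetic matching (American Soundex, the algorithm behind SOUNDEX) and edit distance (Levenshtein, as computed by COMPLEV); the sample names and the match thresholds are hypothetical, not taken from the paper.

```python
def soundex(name: str) -> str:
    """American Soundex: keep the first letter, encode the rest as digits."""
    mapping = str.maketrans("BFPVCGJKQSXZDTLMNR", "111122222222334556")
    name = name.upper()
    first = name[0]
    prev = first.translate(mapping) if first not in "AEIOUYHW" else ""
    digits = []
    for ch in name[1:]:
        if ch in "HW":
            continue              # H and W are "transparent"
        if ch in "AEIOUY":
            prev = ""             # vowels separate repeated codes
            continue
        code = ch.translate(mapping)
        if code != prev:          # collapse adjacent duplicate codes
            digits.append(code)
        prev = code
    return (first + "".join(digits) + "000")[:4]

def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert/delete/substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

# Flag record pairs whose names sound alike or differ by at most 3 edits
left  = ["Robert Smith", "Catherine Jones"]
right = ["Rupert Smyth", "Kathryn Jones"]
for a in left:
    for b in right:
        sound_alike = [soundex(x) for x in a.split()] == [soundex(y) for y in b.split()]
        close = levenshtein(a.lower(), b.lower()) <= 3
        if sound_alike or close:
            print(f"{a!r} ~ {b!r}")
```

Note that "Robert Smith" and "Rupert Smyth" pair up on the phonetic codes alone, which is exactly the kind of match a plain equality join would miss.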
Clinical Tables with the Latest in Tplyr
Michael Stackhouse, Atorus Research
A lot of the work that goes into creating clinical safety tables is redundant. At its core, many summaries can be broken down into the basics of creating descriptive statistics or counting events and looking at the proportion against some denominator. Tplyr is an R package created to make this process simple, from summarizing results to formatting the data to be presentation-ready for output. This presentation will walk through what Tplyr is, what it does, and how it can be used in an organization's clinical reporting process. A special focus will be given to the latest features in Tplyr, looking at the newest tools that have been added and how Tplyr is built to work with Shiny applications, allowing you not just to look at summary results but to dive deeper into the source data.
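Tplyr itself is an R package, but the core pattern it automates, counting distinct subjects per event and dividing by an arm-level denominator, can be sketched in a few lines of any language. The Python sketch below uses hypothetical subject and adverse-event data; it illustrates only the counting logic, not Tplyr's actual API.

```python
from collections import Counter

# Hypothetical ADSL-like data: subject -> treatment arm
arms = {"001": "Drug A", "002": "Drug A", "003": "Placebo", "004": "Placebo"}
# Hypothetical ADAE-like data: (subject, adverse event term)
events = [("001", "Headache"), ("001", "Headache"), ("001", "Nausea"),
          ("002", "Headache"), ("003", "Nausea")]

denom = Counter(arms.values())                 # subjects per arm (the denominator)
counts: dict[str, Counter] = {}
for subj, term in set(events):                 # set() -> distinct subjects per term
    counts.setdefault(term, Counter())[arms[subj]] += 1

# Format each cell as "n (pct%)", the usual safety-table presentation
for term in sorted(counts):
    cells = {arm: f"{counts[term][arm]} ({100 * counts[term][arm] / n:.1f}%)"
             for arm, n in denom.items()}
    print(term, cells)
```

Tplyr's count layers wrap this pattern (plus denominator control, nesting, and string formatting) behind a declarative interface, which is why so much of the boilerplate disappears.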
CDISC Update: Enhancing Metadata, Documenting Relationships
Diane Wold, CDISC
SDTM metadata was enhanced in SDTM v2.0, and further enhancements are planned for the next version of the SDTM. Those metadata changes will be carried into the next version of the SDTMIG. That version will be SDTMIG v4.0, the first major version since SDTMIG v1.0 in 2004.
CDISC Analysis Results Standard
Jeff Abolafia, Pinnacle 21
The CDISC Analysis Results Standard (ARS) Team is charged with enhancing Analysis Results Metadata, automating the generation of analysis results, and providing better traceability and understandability of analysis results and reporting. In this presentation we report on the progress of the CDISC ARS Team. We will provide an update of the proposed data model for storing most analysis results generated from the ADaM ADSL, BDS, and OCCDS data structures and the enhanced metadata model required to generate and understand these results.
An Expeditious Approach for Handling Pinnacle 21 Messages
Malini Narreddy, Sanofi
We, as a sponsor, understand that a Pinnacle 21 report can sometimes be a mountain to climb during study conduct. As we all know, Pinnacle 21 validation checks play a vital role in validating regulatory submission data to ensure its compliance with CDISC standards, FDA requirements, etc. Through these checks, we can potentially identify data issues and compliance issues, which can be evaluated and communicated to the respective departments for resolution. However, sometimes the checks triggered are not easily comprehensible, due to a lack of clarity in the Pinnacle 21 rules and explanations or the programmer's inexperience with the specific scenario. As a result, excess time and effort are spent generating a solution, resulting in inefficient use of budget and resources. To alleviate this tedious process, our company has implemented a "Pinnacle 21 Message Action Plan", which serves as an informative guide for programmers on how to tackle each validation check. Our company's experienced programmers have collaborated in generating this detailed guide, in which each Pinnacle 21 message is provided with additional information; this will be explored in depth with examples in the presentation. In addition to the action plan, we have created a robust tool that appends these columns to a Pinnacle 21 report when it is generated during the conduct of the study. A walk-through of a report and the described additions will be showcased in the presentation.
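The appended-columns idea can be sketched generically: keep a curated lookup keyed by validation rule ID and merge it onto each report row, with a fallback for messages not yet documented. All rule IDs, column names, and guidance text below are hypothetical placeholders, not Sanofi's actual action plan or Pinnacle 21's schema.

```python
# Hypothetical curated action plan, keyed by validation rule ID
action_plan = {
    "SD0001": {"explanation": "Subject not found in DM.",
               "action": "Confirm the subject exists in DM before other domains."},
    "SD0002": {"explanation": "Date value is incomplete.",
               "action": "Check whether a partial-date imputation rule applies."},
}
default = {"explanation": "Not yet documented.",
           "action": "Escalate to the standards team and document the outcome."}

# Hypothetical Pinnacle 21 report rows after export
report = [
    {"rule": "SD0001", "domain": "AE", "message": "Subject missing from DM"},
    {"rule": "XX9999", "domain": "LB", "message": "Uncatalogued finding"},
]

# Append the action-plan columns to every row of the report
for row in report:
    row.update(action_plan.get(row["rule"], default))

for row in report:
    print(row["rule"], "->", row["action"])
```

Because the fallback row is explicit, any message that falls through the lookup is itself a prompt to extend the action plan, so the guide grows with each study.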
Using Git with Your SAS Projects
Chris Hemedinger, SAS
Few technologies have done more to advance code collaboration and automation than Git. GitHub's popularity has drawn the attention of all types of programmers, including SAS programmers. Many SAS products have direct integration with Git, extending to GitHub. In this session we will cover:
- What is Git and why do I care?
- Using Git with SAS Enterprise Guide
- Using Git with SAS Studio
- Git functions in Base SAS
- Where to learn more
SAS and Open Source Working Together
Jim Box, SAS
Open source languages like R and Python are immensely popular and quite useful. Did you know you could write code blocks of Python and R inside of SAS programs? You can also invoke SAS analytics from open source programs. In this presentation, we will summarize all the ways SAS and open source can be used together to solve problems.
Technical Rejection Criteria for Study Data (TRC) and Beyond
Lina Cong, FDA/CDER
Study data is the most important part of a drug application submission. Submitting standardized study data can accelerate the drug review process and make the review more efficient. In order to comply with FDA Study Data Guidance and enforce the CDISC data submission standards, the FDA developed the Technical Rejection Criteria for Study Data (TRC) to help industry understand how the FDA uses eCTD validations to check conformance. On September 15, 2021, eCTD validations for study data in the TRC took effect. If a submission fails these validations, it will be rejected. This presentation covers the TRC background, SEND requirements for the TRC, an update on the rejection trend for eCTD validations for study data, and TRC rejections and top error reasons. Beyond the TRC, frequently asked questions from the eData mailbox related to study data submissions and study data standards will also be included.
Introduction to R for the Statistical Programmer
Michael Stackhouse and Jessica Higgins, Atorus Research
In this workshop, statistical programmers will be introduced to the R programming language and the tidyverse, using familiar clinical examples. Attendees will leave with a basic understanding of what R is, what the tidyverse is and why it’s important, and what the open-source landscape has to offer us in the world of clinical statistical programming. Hands-on programming examples will be offered to give attendees some basic knowledge of the tools available in R to support common clinical workflows, such as SDTM, ADaM and clinical TFLs. If you’ve never worked in R before, but want to see how it can be used in your day-to-day tasks, come join us and see what this powerful open-source language has to offer!
Benefits of taking this class
If you’re scared of R or of learning a new language, it shouldn’t be scary anymore after this class. We’ll draw parallels to help make R feel more familiar and help you understand how this all fits in with your everyday work.
Level and pre-requisites
If you’re a SAS programmer who has never touched R or RStudio, this workshop is for you. This is a ground-floor, first-contact-with-R workshop.
Understanding Electronic Submission Components for Regulatory Submission of Clinical Study Data
Prafulla Girase, Alexion AstraZeneca
A regulatory submission of clinical study data also needs to be accompanied by various other electronic submission (eSUB) components, such as Define-XML, the annotated CRF, the study data reviewer’s guide, the analysis data reviewer’s guide, etc. This seminar will take a deep dive into each of these components and educate attendees about key contents, best practices, and global considerations (i.e., FDA and PMDA) during preparation of these components. For example, attendees will learn the characteristics of a submission-ready annotated CRF (i.e., annotations, validated bookmarks/links, document properties, etc.). It will also go over key considerations related to preparing a whole eSUB package for submission, such as folder structure considerations, the final package checklist, regulatory hand-off, etc. The author also plans to share his understanding of the EMA’s upcoming raw data pilot, based on the latest publicly available guidance at the time of this seminar.