PharmaSUG Single Day Event
Is Now a Virtual Meeting!
October 22-23, 2020
e-Submission, CDISC and New Technology – Together Towards Tomorrow

Our first virtual SDE (Single-Day-Event) concluded on October 23, 2020. We had 212 attendees from 6 countries, with 74 registered for the pre-conference seminar.

A big "Thank You" to our speakers and sponsors! You helped make our first virtual SDE a success!

To those registrants who have NOT yet completed the survey, please do so by November 8. Let us hear your comments and ideas for improving a planned hybrid (virtual & onsite) conference for 2021!

Please contact us if you have additional suggestions or wish to volunteer for next year's event.

The recordings and slides from all of the presentations will be available to registered attendees from the first week of November through November 30, and the slides will be posted on the website after November 30. The recording files will be separated by presentation so that attendees can choose which talks to see. The link to access the presentations will be provided by email when the files are ready. Stay tuned.

Sponsored by:
Catalyst
Clinovo
Pinnacle21
Rho

Thursday, October 22, 2020 Pre-Conference Virtual Seminar

Time (EDT) | Presentation | Presenter
8:15 - 8:30 AM | Introduction
8:30 AM - 1:00 PM | CDISC ADaM – Implementation by Example | Richann Watson, DataRich Consulting
11:00 - 11:30 AM | Break

Friday, October 23, 2020 Single-Day Event

Time (EDT) | Presentation | Presenter
8:30-8:45 AM | Opening Session
8:45-9:15 AM | CDISC: Beyond the Standards | Diane Wold, CDISC
9:15-9:45 AM | More Traceability: Clarity in ADaM Metadata and Beyond | Richann Watson, DataRich Consulting; Wayne Zhong, Accretion Softworks; Daphne Ewing, CSL Behring; Jasmine Zhang, Boehringer Ingelheim
9:45-10:15 AM | Common Pinnacle 21 Report Issues: Shall We Document or Fix? | Ajay Gupta, PPD
10:15-10:45 AM | Next Innovation in Pharma - CDISC Data and Machine Learning | Kevin Lee, Genpact
10:45-11:15 AM | Break
11:15-11:45 AM | A Codelist’s Journey From the CDISC Library to a Study Through Python | Mike Molter, PRA Health Sciences
11:45 AM-12:15 PM | Standardization for COVID 19 Trials, Following Different Sets of Master Protocols, and Using IQVIA CRF Design, SDTM and ADaM Standard COVID Libraries | Gustav Bernard and Jim Beck, IQVIA
12:15-12:45 PM | Submit Study Data to FDA: Current Status and Upcoming Study Data Technical Rejection Criteria Enforcement | Ethan Chen, FDA
12:45-1:00 PM | Closing Session and Prize Drawing

Event Co-Chairs

Matt Becker

Margaret Hung
MLW Consulting

Sponsorship Opportunities

Participating as a sponsor is a great way to market your company's products and services. If you sign up as a sponsor for this 2020 virtual event by September 25, you get our “2-for-1 Special” – your 2020 sponsorship is automatically extended to NC SDE 2021, an onsite event planned for October 29, 2021.

Download our sponsorship program (PDF, 156K) and application form (Word doc, 63K) for further details. For questions about our sponsorship program, please contact us.

Seminar and Presentation Abstracts

CDISC: Beyond the Standards
Diane Wold, CDISC

CDISC implementers preparing e-submissions know they need to understand the models (SEND, CDASH, SDTM, ADaM, Define-XML) and the associated implementation guides, but may not be aware of other resources available on the CDISC website and wiki. This talk will include a tour of the reorganized CDISC website, highlighting features and content that have been added and are being expanded. These include materials in the members-only area, available to anyone who works for a CDISC member organization. The presentation will also touch on publicly available wiki content.

The talk will end with an overview of projects that CDISC teams and staff are currently working on. These include updates and supplements to the familiar standards as well as extensions of and additions to other resources.

More Traceability: Clarity in ADaM Metadata and Beyond
Richann Watson, DataRich Consulting
Wayne Zhong, Accretion Softworks
Daphne Ewing, CSL Behring
Jasmine Zhang, Boehringer Ingelheim

One of the fundamental principles of ADaM is that datasets and associated metadata must include traceability to facilitate the understanding of the relationships between analysis results, ADaM datasets, and SDTM datasets. The existing ADaM documents contain isolated elements of traceability, such as including SDTM sequence numbers, creating new records to capture derived analysis values, and providing excerpts of define.xml documentation.

An ADaM sub-team is currently developing a Traceability Examples Document with the goal of bringing these separate elements of traceability together and demonstrating how they function in detailed and complete examples. The examples cover a wide variety of practical scenarios; some expand on content from other CDISC documents, while others are developed specifically for the Traceability Examples Document. As members of the Traceability Examples ADaM sub-team, we are including in this PharmaSUG paper a selection of examples to show how traceability can bring transparency and clarity to your analyses.

Common Pinnacle 21 Report Issues: Shall We Document or Fix?
Ajay Gupta, PPD

Pinnacle 21, previously known as OpenCDISC Validator, provides compliance checks against CDISC outputs such as SDTM, ADaM, SEND and Define.xml. This validation tool produces a report in Excel or CSV format that classifies its findings as errors, warnings, and notices. At the initial stage of clinical programming, when the data is not very clean, this report can be very large and tedious to review. A programmer who is fairly new to this report might not be aware of some common issues and will have to depend fully on an experienced programmer to pave the way. Indirectly, this adds review time to the budget and might distract the programmer from real issues that affect data quality. In this presentation, I will discuss some common issues with the Pinnacle 21 report messages generated when validating SDTM datasets and propose some solutions based on my experience. I will also discuss some scenarios in which it is better to document the issue in the reviewer’s guide than to program a workaround. While I agree that there is no one-size-fits-all solution, my intention is to give programmers a direction that might help them find the right solution for their situation.

Next Innovation in Pharma - CDISC Data and Machine Learning
Kevin Lee, Genpact

The most popular buzzword nowadays in the technology world is “Machine Learning (ML).” Most economists and business experts foresee Machine Learning changing every aspect of our lives in the next 10 years through automating and optimizing processes. This is leading many organizations, including drug companies, to explore and implement Machine Learning in their own businesses.

The presentation will discuss how Machine Learning can lead the next innovation in pharma with CDISC data. It will start with an introduction to some of the most innovative companies and how they innovate and lead their industries using Machine Learning and data. It will then show how pharma can learn from them to innovate using Machine Learning and CDISC data. The presentation will also introduce the basic concepts of machine learning and the importance of data.

The presentation will show how CDISC data can be the perfect partner of Machine Learning for the next innovation in the pharmaceutical industry. Finally, it will discuss how biometrics departments can prepare for the next innovation and lead this data-driven Machine Learning process in the pharmaceutical industry.

A Codelist’s Journey From the CDISC Library to a Study Through Python
Mike Molter, PRA Health Sciences

The publication of the CDISC Library should be every programmer’s dream. The use of PDF-based Implementation Guidelines, or even Excel files downloaded from the CDISC website, always introduced manual hiccups into the process of standards implementation. The Library gets us one step closer to automation nirvana. In this presentation I will illustrate a small-scale proof-of-concept web application in which a study team member defines study controlled terminology subsets, not through tedious Excel operations such as copying and pasting, but rather through minimal checkbox selection of controlled terms presented to the user through a browser by an application that knows which codelists are associated with which CDISC variables. We’ll see how just a few lines of Python code can extract content from the Library; how a few more can send the contents of a codelist to an HTML form; and on the backend, how a few more can process the choices a user made. The purpose of this exercise is not to demonstrate a fully functioning production web application, but rather to give the reader a sense of what is possible. Knowledge of basic Python objects such as lists and dictionaries is helpful, but not essential.
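To give a rough feel for the kind of code this abstract describes, here is a minimal sketch using only the Python standard library. It is not the presenter's application. The codelist is a hard-coded sample standing in for content retrieved from the CDISC Library, and the dictionary keys (`submissionValue`, `preferredTerm`), the form field name `term`, and the function names `render_form` and `process_choices` are all assumptions made for illustration.

```python
from html import escape
from urllib.parse import parse_qs

# Hypothetical sample codelist. A real application would fetch this from the
# CDISC Library; the structure and key names here are assumptions.
CODELIST = {
    "name": "Sex",
    "variable": "SEX",
    "terms": [
        {"submissionValue": "F", "preferredTerm": "Female"},
        {"submissionValue": "M", "preferredTerm": "Male"},
        {"submissionValue": "U", "preferredTerm": "Unknown"},
        {"submissionValue": "UNDIFFERENTIATED", "preferredTerm": "Undifferentiated"},
    ],
}


def render_form(codelist):
    """Return an HTML form with one checkbox per controlled term."""
    lines = [
        '<form method="post">',
        f'<p>Select terms for {escape(codelist["variable"])}:</p>',
    ]
    for term in codelist["terms"]:
        value = escape(term["submissionValue"], quote=True)
        label = escape(term["preferredTerm"])
        lines.append(
            f'<label><input type="checkbox" name="term" value="{value}"> '
            f"{value} ({label})</label><br>"
        )
    lines.append('<input type="submit" value="Save subset">')
    lines.append("</form>")
    return "\n".join(lines)


def process_choices(codelist, form_body):
    """Keep only the terms whose submission values were ticked in the form.

    form_body is the URL-encoded body a browser would post,
    for example "term=F&term=M".
    """
    selected = set(parse_qs(form_body).get("term", []))
    return [t for t in codelist["terms"] if t["submissionValue"] in selected]


# Example usage: render the form, then process a simulated submission.
page = render_form(CODELIST)
study_subset = process_choices(CODELIST, "term=F&term=M")
```

In a real web application the form would be served by a web framework and the posted body would arrive through that framework's request object; the string passed to `process_choices` above simply simulates that step.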

Standardization for COVID 19 Trials, Following Different Sets of Master Protocols, and Using IQVIA CRF Design, SDTM and ADaM Standard COVID Libraries
Gustav Bernard, IQVIA
Jim Beck, IQVIA

COVID 19 trials have started with demanding turnaround expectations for studies. At IQVIA, we have created a Standardized CRF Design and CDISC SDTM and ADaM COVID 19 Standard Libraries. For each, processes have also been put in place to ensure consistency with the expected standard that will be implemented. On the ADaM side, some additional processes have been put in place, for example, for automating parameter information, creating criteria flags for BDS domains, and creating CTCAE grading for ADLB.

Submit Study Data to FDA: Current Status and Upcoming Study Data Technical Rejection Criteria Enforcement
Ethan Chen, FDA

The purpose of this session is to update industry on the area of electronic submissions to FDA and communicate upcoming enforcement of the requirement to submit study data in standardized format. FDA will walk through published documentation and tools to help industry successfully submit an eCTD submission containing study data. In addition, there remain some submission types which are not required in eCTD format (i.e. non-commercial IND). FDA CDER will introduce the recently expanded CDER NextGen Portal to accept these submission types.

CDISC ADaM – Implementation by Example
Richann Watson, DataRich Consulting

This course will provide a high-level overview of some basic ADaM concepts. It is assumed that the attendee will have a fundamental knowledge of the different ADaM structures and principles. The primary focus of the course is to illustrate the implementation of some of the concepts found in both the ADaM Implementation Guide (ADaM IG) and the ADaM Structure for Occurrence Data (OCCDS) documents. Items covered include setting up the subject-level analysis data set (ADSL) for common trial designs and walking through the process of creating a basic data structure (BDS), starting from a simple BDS and building on it to structure a data set that will support one or more analyses. Additionally, the course will demonstrate how to implement some variables that are only found in OCCDS, such as the standardized MedDRA query (SMQ) and customized query (CQ) variables.

Materials: Printed copy of the slides set up for note taking.
SAS Software Packages: N/A
Intended Audience: Individuals who have a knowledge of different ADaM structures and principles
Length and Format: Full-day lecture

Course Outline:
  • ADaM Overview
  • High-level overview of data structures
    • ADSL
      • Illustration of Common Trial Designs
    • BDS
      • Illustration of a simple BDS
      • Build on simple structure by adding additional variables such as
        • Analysis Visit Windowing Variables
        • Descriptor and Indicator Variables
        • Category and Criterion Variables
    • OCCDS
      • Illustration of a simple AE and CM data set
      • Build on a simple AE data set by adding additional variables such as
        • AEs of Special Interest Variables
        • Occurrence Flag Variables

Presenter Biographies

Jim Beck

Jim Beck is a Director at IQVIA who has been with the company and affiliates (Cenduit, IQVIA Biotech) for 18 years. He has been involved with operational and strategic work across Data Management, Biostatistics, IT and Product Development, and his current work involves SDTM process automation, Biostatistics standards & standardization and Data Management to Biostatistics data flow initiatives. Jim earned a Bachelor of Science in Electrical Engineering and a Bachelor of Science in Computer Science from North Carolina State University in Raleigh, NC.

Gustav Bernard

Gustav Bernard is an Associate Director at IQVIA who has been with the company for 16 years. His work focuses on the implementation of CDISC Standards (SDTM, ADaM and Define-XML) within the IQVIA Global Biostatistics department. He is currently working on ADaM process automation. He has also created the Define-XML 2.0 automation process within IQVIA and the ADaM Spec Generator application. Gustav earned a Bachelor of Business in computer science from the University of the Orange Free State in South Africa.

Ethan Chen

Ethan Chen provides overall leadership to CDER in streamlining electronic and traditional submissions and delivering solutions to enable rapid adoption of emerging electronic data standards. Since joining the FDA in 2012, Mr. Chen has led several critical initiatives as the CDER Informatics Architect, including Data Management, Analytics and Business Intelligence, Electronic Submission and Portal Collaboration programs. While leading the CDER Division of Data Management Service and Solution, Ethan successfully implemented the eCTD electronic submission mandate in 2017 for NDAs, BLAs and ANDAs, and again in 2018 for Commercial INDs and DMFs (excluding DMF Type III). Ethan has over 20 years of experience in Data Management, Enterprise Architecture, Solution Development and System Integration. Ethan received a BS from Shanghai Jiao Tong University, an MSE from Temple University and an MBA from the University of Maryland at College Park.

Ajay Gupta

Ajay is a Programming Technical Manager at PPD. He received his master’s degree in Biomedical Engineering from Louisiana Tech University in 2006. Since 2010, he has been a regular presenter at SAS conferences, especially PharmaSUG. He has also been a member of the PharmaSUG conference committee for the past four years, and is interested in topics related to SDTM, Pinnacle21, Visual Basic for Applications, Spotfire, Risk Based Monitoring, SAS Grid and SAS Application development.

Kevin Lee

Kevin Lee is a Data Scientist and Machine Learning Leader/Instructor/Evangelist in the pharmaceutical industry. Currently, Kevin is Assistant Vice President, AI/Machine Learning Consultant, at Genpact and teaches Machine Learning/Python/CDISC/Oncology courses at conferences and universities. Kevin is a strong advocate of leadership and innovative technologies, with which he wants to bring innovation to the pharmaceutical industry. Kevin is a lifelong learner who loves to learn and share, and he has presented about 100 publications at various conferences and meetings. Kevin earned an M.S. in Applied Statistics at Villanova University following a B.S. from the University of Pennsylvania.

Mike Molter

Mike Molter is a data standards consultant with PRA Health Sciences in Raleigh, NC. In this position, Mike works with study teams to aid in the review of ADaM data set design and the production of define.xml. He is also involved with the production of technical tools for the purpose of efficiency optimization and automation of standard processes.

Mike has been involved in SAS programming since 1999, in clinical trials since 2003, and in industry data standards since 2005. He has been a member of the CDISC XML Technologies team since 2010, and is a certified CDISC instructor for the define.xml class. His professional interests center on the use of cutting-edge technologies to optimize the use of metadata throughout the lifecycle of a clinical trial. Personal interests include cycling, swimming, running, and reading.

Richann Watson

Richann Watson is an independent statistical programmer and CDISC consultant based in Ohio. She has been using SAS since 1996, with most of her experience being in the life sciences industry. She specializes in analyzing clinical trial data and implementing CDISC standards. Additionally, she is a member of the CDISC ADaM team and various sub-teams. Richann loves to code and is an active participant and leader in the SAS User Group community. She has presented numerous papers, posters, and training seminars at SAS Global Forum, PharmaSUG, and various regional and local SAS user group meetings. Richann holds a bachelor’s degree in mathematics and computer science from Northern Kentucky University and a master’s degree in statistics from Miami University.

Diane Wold

Diane received her Ph.D. in Statistics from the University of North Carolina at Chapel Hill. She worked for Burroughs Wellcome/Glaxo Wellcome/GlaxoSmithKline in a variety of roles for over 30 years. At the GlaxoSmithKline merger, she joined the data standards group, and in 2002 she joined the CDISC SDS team. She was also involved in other CDISC teams, including the Protocol Representation Group and SHARE. In 2012 she became involved in the CFAST initiative to develop therapeutic area standards. In 2015 she joined CDISC as an employee.