Enhance your PharmaSUG experience by attending optional pre- and post-conference training seminars taught by seasoned experts. Half-day courses are only $175 with a conference registration, or $225 without a conference registration. You can sign up for classes through the conference registration system. Space is limited!
Saturday, May 21, 2022
# | Course Title | Instructor(s) | Time | Room * |
#12 | Everything is Better with Friends: Using SAS in Python Applications with SASPy and Open-Source Tooling (Beyond the Basics) | Matthew Slaughter & Isaiah Lankham | 1:00 PM - 5:00 PM | 205 |
Sunday, May 22, 2022
# | Course Title | Instructor(s) | Time | Room * |
#21 | Oncology Study Seminar for Programmers and Biostatisticians | Kevin Lee | 8:00 AM - 12:00 PM | 202 |
#22 | Hands-On Data-Driven Design: Developing More Flexible, Reusable, Configurable SAS Software | Troy Hughes | 8:00 AM - 12:00 PM | 203 |
#23 | Deep Dive into Electronic Submission Components for Regulatory Submission of Clinical Study Data | Prafulla Girase | 8:00 AM - 12:00 PM | 204 |
#31 | Python Programming Seminar – Advanced with Machine Learning | Kevin Lee | 1:00 PM - 5:00 PM | 202 |
#32 | Introduction to R for the Statistical Programmer | Mike Stackhouse | 1:00 PM - 5:00 PM | 203 |
#33 | Clinical Tables in R with GT | Phil Bowsher | 1:00 PM - 5:00 PM | 204 |
#34 | FDA & PMDA Submission Data Requirements | David Izard | 1:00 PM - 5:00 PM | 205 |
Wednesday, May 25, 2022
# | Course Title | Instructor(s) | Time | Room * |
#41 | Driving Miss Data: Data-Driven Techniques | Richann Watson | 1:00 PM - 5:00 PM | 205 |
* All rooms are located near Griffin Hall
Seminar Registration, Attendance, and Cancellation Policy
- You must register for a seminar via the PharmaSUG 2022 conference registration form online.
- You may cancel a seminar on or before May 13, 2022, and receive a full refund minus a $25 administration fee per cancelled seminar.
- You may add a seminar on or before May 13, 2022 for no additional fee. To sign up for an additional seminar after you have already registered for the conference, please email the conference organizers.
- On or before May 13, 2022, you may swap one seminar for another; however, this is considered a change in conference registration and will incur a $25 administration fee.
- After May 13, 2022, you MAY NOT SWAP seminars; however, a new seminar may be added depending on space and availability.
- There will be NO REFUNDS after May 13, 2022. However, if you are unable to attend, the seminar material will be provided to you (either by postal mail or email) without additional charge.
- Should a seminar be cancelled at any time for any reason, the sole liability of PharmaSUG and the instructor is a refund of the seminar fee, and they are NOT liable for any special or consequential damages arising from the cancellation of the seminar.
- On-site registration will be permitted based on space and availability, and payable by major credit card (MC, VISA, Discover, AMEX). However, seminar materials may not be available on-site but will be provided later to paid attendees.
- You may sign up for seminars occurring at the same time, i.e., you can attend one class and ask for material for another class, bearing in mind that tuition must be paid for both seminars.
For questions about the above seminar policy and availability, please contact Cindy Song and Natalie Martinez, Seminar Coordinators.
Course Descriptions
Using SAS in Python Applications with SASPy and Open-Source Tooling
Matthew Slaughter, Isaiah Lankham
Saturday, May 21, 2022, 1:00 PM - 5:00 PM
Are you familiar with Python syntax? Want to go beyond the basics, and use SAS and Python together like a pro?
In this hands-on class, we'll practice writing Python scripts in Google Colab (an online implementation of JupyterLab). These Python scripts will link to SAS OnDemand for Academics using the Python package SASPy developed by SAS Institute. We'll also practice using the popular Python package pandas, whose DataFrame objects are the Python equivalent of SAS datasets.
Along the way, we'll work through common data-analysis tasks using both regular SAS code and Python together with the SASPy package, highlighting important tradeoffs for each and emphasizing the value of being a polyglot programmer fluent in multiple languages. Specific examples include advanced data-manipulation techniques, using SASPy as an interface for SAS/STAT, rectangularizing complex JSON-formatted data returned by web APIs, and creating simple Python web applications incorporating SAS analytics.
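To give a feel for one of the tasks mentioned above, rectangularizing nested JSON, here is a minimal sketch using only the Python standard library. It is not taken from the class materials; the payload, the field names (`subject`, `site`, `visits`), and the `rectangularize` helper are all hypothetical and chosen purely for illustration.

```python
import json

# Hypothetical API payload: one record per subject, each with a nested
# site object and a nested list of visits.
payload = json.loads("""
[
  {"subject": "001", "site": {"id": "A", "country": "US"},
   "visits": [{"day": 1, "weight": 70.5}, {"day": 8, "weight": 70.1}]},
  {"subject": "002", "site": {"id": "B", "country": "CA"},
   "visits": [{"day": 1, "weight": 82.0}]}
]
""")


def rectangularize(records):
    """Return one flat dict per (subject, visit) pair."""
    rows = []
    for rec in records:
        for visit in rec["visits"]:
            rows.append({
                "subject": rec["subject"],
                "site_id": rec["site"]["id"],
                "country": rec["site"]["country"],
                "day": visit["day"],
                "weight": visit["weight"],
            })
    return rows


rows = rectangularize(payload)
for row in rows:
    print(row)
```

Each resulting row has the same set of keys, so the list could be loaded into a tabular structure such as a pandas DataFrame or a SAS dataset.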
This class is aimed at intermediate to advanced SAS programmers, but assumes only basic familiarity with Python syntax and pandas DataFrames. However, no knowledge of JupyterLab is assumed. Accounts for Google and SAS OnDemand for Academics will be needed to interact with code examples. All class materials, including complete setup instructions, will be made available through https://github.com/saspy-bffs/pharmasug-2022-class.
Oncology Study Seminar for Programmers and Biostatisticians
Kevin Lee
Sunday, May 22, 2022, 8:00 AM - 12:00 PM
Compared with studies in other therapeutic areas, oncology studies are generally more complex and more difficult for programmers and statisticians. There is more to understand, such as different clinical study types, specific data collection points, and specific analyses. In this seminar, programmers and statisticians will learn oncology-specific knowledge and gain a holistic view of oncology studies, from data collection through CDISC datasets to analysis. They will also find out what makes oncology studies unique and learn how to lead oncology study projects effectively.
The seminar will cover four different sub-types and their response criteria guidelines. The first sub-type, solid tumor studies, usually follows RECIST (Response Evaluation Criteria in Solid Tumors). The second sub-type, immunotherapy studies, usually follows irRC (immune-related Response Criteria). The third sub-type, lymphoma studies, usually follows Cheson. Lastly, leukemia studies follow study-specific guidelines (e.g., IWCLL for chronic lymphocytic leukemia). The seminar will show how to use response criteria guidelines for data collection and response evaluation.
Programmers and statisticians will learn how to create SDTM tumor-specific datasets (RS, TU, TR), which SDTM domains are used for which collected data, and which Controlled Terminology (e.g., CR, PR, SD, PD, NE) applies. They will also learn how to create time-to-event ADaM datasets from SDTM domains and how to use ADaM datasets to derive efficacy analyses (e.g., OS, PFS, TTP, ORR, DFS) and Kaplan-Meier curves using SAS procedures such as PROC LIFETEST and PROC PHREG.
Finally, programmers and statisticians will understand how to build end-to-end, standards-driven oncology studies, from protocol, study sub-types, response criteria, and data collection through SDTM and ADaM to analysis.
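The seminar itself works in SAS, but the arithmetic behind a Kaplan-Meier curve is compact enough to sketch in a few lines. The following Python sketch of the product-limit estimate is not from the seminar; the function name `kaplan_meier` and the small sample data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- follow-up duration for each subject
    events -- 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival) pairs, one per distinct event time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Subjects with an observed event at time t.
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        # Everyone whose follow-up ends at t (events and censored alike).
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (e.g., in months) and event flags.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.3f}")
```

Censored subjects contribute to the number at risk up to their censoring time but never reduce the survival estimate themselves, which is the core idea of the estimator.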
Hands-On Data-Driven Design: Developing More Flexible, Reusable, Configurable SAS Software
Troy Hughes
Sunday, May 22, 2022, 8:00 AM - 12:00 PM
Attend and receive a FREE copy of the author's 550-page book, SAS® Data-Driven Development: From Abstract Design to Dynamic Functionality, Second Edition, released in May 2022. Students will receive the physical book at the course, and can run all code during the course using SAS Display Manager, SAS Enterprise Guide, SAS University Edition, or SAS OnDemand for Academics.
This HANDS-ON workshop installs the student as the new SAS consultant within Scranton, Pennsylvania’s most infamous paper supply company — charged with improving software functionality and performance through data-driven software design. Navigate office intrigue and antics to gather software requirements, analyze hardcoded legacy SAS programs, and refactor and improve software through data-driven design. Students can run all examples. Help Jim, Dwight, Phyllis, and Stanley sell more paper through higher quality data-driven software!
Data-driven design describes software in which configuration items, business rules, data validation rules, data models, data dictionaries, report style, and other dynamic elements are maintained in external data structures – NOT in underlying code. Benefits include increased software flexibility, reusability, maintainability, modularity, readability, interoperability, extensibility, and configurability.
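As a rough illustration of the idea in the paragraph above, here is a minimal Python sketch in which validation rules live in a control table rather than in the code. The course itself uses SAS; this example is not from the course or the book, and the control table contents, the variable names, and the `load_rules` and `validate` helpers are hypothetical.

```python
import csv
import io

# Hypothetical control table. In practice this would be an external CSV
# file that can be edited without touching the program.
RULES_CSV = """variable,min,max
age,18,99
weight_kg,30,250
"""


def load_rules(text):
    """Parse the control table into {variable: (min, max)}."""
    reader = csv.DictReader(io.StringIO(text))
    return {r["variable"]: (float(r["min"]), float(r["max"])) for r in reader}


def validate(record, rules):
    """Return the variables that are missing or outside their configured range."""
    failures = []
    for name, (low, high) in rules.items():
        value = record.get(name)
        if value is None or not (low <= value <= high):
            failures.append(name)
    return failures


rules = load_rules(RULES_CSV)
print(validate({"age": 17, "weight_kg": 72.0}, rules))  # ['age']
```

Changing a limit or adding a new variable to check then means editing the control table only; the validation code stays the same.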
Topics include:
- Compare preferred data-driven design with undesirable hardcoded design
- Build reusable procedures, functions, and call routines (subroutines) using SAS macros and PROC FCMP (the SAS function compiler)
- Demonstrate built-in and user-defined data structures (e.g., parameters, macro lists, arrays, hash objects, control tables, configuration files, data sets, Excel, CSV, CSS)
- Use SAS components that support data-driven development (e.g., CALL EXECUTE, CNTLIN option in PROC FORMAT, SYSPARM option, SAS dictionary tables, SAS arrays, CSSSTYLE option in PROC REPORT)
- Ingest positional flat files, CSV files, SAS data sets, and other transactional files, and dynamically identify altered file format/structure through prescriptive data dictionaries
- Create color-coded, “traffic light” quality control reports that automatically identify bad data while standardizing good data
- Configure the style (e.g., format, font, color scheme, graphics) of data products using user-defined SAS formats and CSS files
Deep Dive into Electronic Submission Components for Regulatory Submission of Clinical Study Data
Prafulla Girase
Sunday, May 22, 2022, 8:00 AM - 12:00 PM
A regulatory submission of clinical study data also needs to be accompanied by various other electronic submission (eSUB) components, such as Define-XML, the annotated CRF, the study data reviewer’s guide, and the analysis data reviewer’s guide. This seminar will take a deep dive into each of these components and educate attendees about key contents, best practices, and global considerations (i.e., FDA and PMDA) when preparing them. For example, attendees will learn the characteristics of a submission-ready annotated CRF (e.g., annotations, validated bookmarks and links, document properties). The seminar will also cover key considerations in preparing a whole eSUB package for submission, such as folder structure, PDF validation practices, the final package checklist, and regulatory hand-off. The author also plans to share general insights from his practical experience of attending a face-to-face data format consultation meeting with PMDA.
Python Programming Seminar – Advanced with Machine Learning
Kevin Lee
Sunday, May 22, 2022, 1:00 PM - 5:00 PM
This seminar covers more advanced Python programming, including machine learning implementation in Python. It is recommended for those who took last year’s Python course, or for those who have some Python knowledge and want to learn more advanced programming.
Agenda for advanced Python programming seminar
- Simple review of basic Python Programming seminar
- Metadata analysis (PROC CONTENTS)
- Advanced programming – transpose, remove duplicate record, group-by
- Statistical analysis – paired t-test, Fisher's Exact Test, survival analysis
- Data visualization – scatterplot, histogram, Kaplan-Meier curves
- Machine learning introduction – concepts and theory
- Machine learning algorithms – regression, logistic regression, decision trees, random forest, XGBoost, K-means clustering, KNN
- Deep learning algorithms – Deep Neural Network (DNN), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN)
- Python machine learning modules – Sklearn, Tensorflow, Keras
- Python machine learning workshop using image data
- Deeper understanding of Python programming
- Jupyter Notebook download and experience
- Real time Python coding exercise
- Differences and similarities with SAS programming
- Data manipulation and analysis in Python
- Machine learning programming in Python
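For readers curious what the data-manipulation items on the agenda (remove duplicate records, group-by, transpose) can look like in Python, here is a minimal sketch using only the standard library. It is not taken from the seminar, which may well use pandas or other packages; the sample records and variable names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format records: (subject, visit, lab value).
records = [
    ("001", "WEEK1", 5.1),
    ("001", "WEEK2", 4.8),
    ("001", "WEEK2", 4.8),  # exact duplicate of the previous record
    ("002", "WEEK1", 6.0),
]

# Remove duplicate records while keeping first-seen order.
unique = list(dict.fromkeys(records))

# Group by subject and compute the mean value per subject.
by_subject = defaultdict(list)
for subject, visit, value in unique:
    by_subject[subject].append(value)
means = {subject: mean(values) for subject, values in by_subject.items()}

# Transpose from long to wide: one entry per subject, one key per visit.
wide = defaultdict(dict)
for subject, visit, value in unique:
    wide[subject][visit] = value

print(len(unique), "unique records")
print(dict(wide))
```

SAS programmers may recognize these three steps as rough counterparts of PROC SORT with NODUPKEY, PROC MEANS with a CLASS statement, and PROC TRANSPOSE.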
Introduction to R for the Statistical Programmer
Mike Stackhouse
Sunday, May 22, 2022, 1:00 PM - 5:00 PM
In this workshop, statistical programmers will be introduced to the R programming language and the tidyverse, using familiar clinical examples. Attendees will leave with a basic understanding of what R is, what the tidyverse is and why it’s important, and what the open-source landscape has to offer us in the world of clinical statistical programming. Hands-on programming examples will be offered to give attendees some basic knowledge of the tools available in R to support common clinical workflows, such as SDTM, ADaM and clinical TFLs. If you’ve never worked in R before, but want to see how it can be used in your day-to-day tasks, come join us and see what this powerful open-source language has to offer!
Clinical Tables in R with GT
Phil Bowsher
Sunday, May 22, 2022, 1:00 PM - 5:00 PM
RStudio will be presenting an overview of the gt R package for the R user community at PharmaSUG. This is a great opportunity to learn and get inspired about new capabilities for generating TFLs (Tables, Figures, and Listings) for inclusion in Clinical Study Reports created in R. In this workshop, we will review and reproduce a subset of common table outputs used in clinical reporting containing descriptive statistics, counts, and/or percentages.
No prior knowledge of R or RStudio is needed. This short talk will provide an introduction to gt as a flexible and powerful package for generating tables as part of your research and reporting TFL programming. The talk will provide an introduction to TFL-producing R programs and include an overview of the gt R package with applications in drug development such as safety analysis and Adverse Events. A live environment will be available for attendees to explore the tables real-time.
FDA & PMDA Submission Data Requirements
David Izard
Sunday, May 22, 2022, 1:00 PM - 5:00 PM
The binding guidance documents requiring you to provide data and related documentation based on US FDA endorsed data standards as part of your electronic submission are in effect for both clinical and non-clinical assets. These documents have moved the needle with respect to Sponsor and CRO organization obligations in terms of how they plan and execute studies as well as prepare study assets for inclusion in a regulatory submission. But it is not just the US FDA when it comes to including data in a submission; Japan's PMDA has moved beyond the pilot phase into the voluntary phase with an eye on requiring submissions based on their endorsed data standards in 2020.
This highly interactive seminar will review each asset, its role in the submission and the impact that these final guidance documents have on how the asset is handled as it weaves its way through the drug development lifecycle on its way to regulators. Simultaneously we will review the similarities and key differences executing these same tasks when interacting with Japan's PMDA. A portion of the seminar will be dedicated to a discussion of "hot off the press" topics, including a review of FDA & PMDA behavior since these documents have been finalized including Sponsor feedback during the review period. We will also explore how other global regulatory bodies are embracing standards, with a focus on Canada, Europe and China.
Audience Level: Beginner to Intermediate. Individuals who are new to the pharmaceutical industry would benefit greatly from the opportunity to put their hard work creating analysis datasets and TLFs into the context of a regulatory submission. Experienced professionals who have created submission assets in the past and are looking for a refresher on recent changes to FDA & PMDA requirements, CDISC standards, and the outlook on submission data requirements for other global regulatory bodies would also enjoy this seminar.
Driving Miss Data: Data-Driven Techniques
Richann Watson
Wednesday, May 25, 2022, 1:00 PM - 5:00 PM
We have all been there. We write a program based on the data we have. Then, we get new data and we must update the program. Making these updates can be time consuming. Not only must you update the production version of the program, but someone must also update any associated validation or QC programs. Wouldn’t it be nice if there were ways around this? This is where data-driven techniques come in handy. Using detailed examples, you will learn how to write robust code that is ready to handle an unexpected bend in the road! This half-day course will cover advanced techniques such as: discovering and using information about data sets and variables even if it's not known in advance; generating dynamic formats that are based on the data instead of hard-coded into your program; using complex looping structures to control your program flow based on the data; building code on the fly, even from within a DATA step; and much more!
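Two of the techniques named above, generating formats from the data and building code on the fly, can be sketched briefly. The course teaches these in SAS; the Python sketch below only generates the text of a PROC FORMAT step from observed values and is not taken from the course. The `build_format` helper, the format name `sexfmt`, and the sample data are hypothetical.

```python
def build_format(name, pairs):
    """Return PROC FORMAT source text for a character format.

    pairs -- iterable of (code, label) tuples observed in the data.
    Assumes codes and labels contain no single quotes.
    """
    lines = ["proc format;", f"  value ${name}"]
    # Deduplicate and sort so the generated code is stable across runs.
    for code, label in sorted(set(pairs)):
        lines.append(f"    '{code}' = '{label}'")
    lines += ["  ;", "run;"]
    return "\n".join(lines)


# Hypothetical (code, label) pairs as they might appear in incoming data.
observed = [("M", "Male"), ("F", "Female"), ("M", "Male")]

sas_code = build_format("sexfmt", observed)
print(sas_code)
```

When a new code appears in the data, the generated format picks it up automatically, so neither the production program nor its validation counterpart needs a hand edit.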