The September seminar, "Intro to Tidymodels", is sponsored by RStudio and is being offered free of charge! Registration is still required; please register using the link below.

Enhance your PharmaSUG experience by attending optional post-conference training seminars taught by seasoned experts. Half-day courses are only $125 with a conference registration, or $250 without a conference registration. Here is the schedule for the rest of the year. Registration will open for each seminar approximately 100 days in advance. When a seminar is open for registration, you'll see the "Register Now" link appear below the name. If you are planning to register for multiple seminars, you can create a record in the training seminar registration system at any time, then log into that registration record and register for each seminar individually as it becomes available. The registration system will send emails reminding everyone who has registered to sign up for the next seminar when it becomes available.

Course Date | Course Title | Instructor(s) | Time (U.S. Eastern time zone)
Sep 14, 2021 | Intro to Tidymodels (Register Now!) | Phil Bowsher | Noon - 4:30 PM
Nov 2, 2021 | Reproducible Computation at Scale with R (Register Now!) | Will Landau | 11:00 AM - 3:30 PM
Dec 1, 2021 | Hands-On Data-Driven Design: Developing More Flexible, Reusable, Configurable SAS Software (Register Now!) | Troy Hughes | 11:00 AM - 3:30 PM
Jan 11, 2022 | Clinical Graphs Using SAS | Sanjay Matange | 10:00 AM - 2:30 PM
Feb 3, 2022 | Deep Dive into Electronic Submission Components for Regulatory Submission of Clinical Study Data | Prafulla Girase | Noon - 4:30 PM
Mar 2, 2022 | Driving Miss Data: Data-Driven Techniques | Richann Watson | Noon - 4:30 PM
Apr 1, 2022 | Oncology Study Seminar for Programmers and Biostatisticians | Kevin Lee | Noon - 4:30 PM




Course Descriptions

CDISC ADaM – Implementation by Example
Richann Watson
August 4, 2021, Noon - 4:30 PM ET


This course will provide a high-level overview of some basic ADaM concepts. It is assumed that the attendee will have a fundamental knowledge of the different ADaM structures and principles. The primary focus of the course is to illustrate the implementation of some of the concepts found in both the ADaM Implementation Guide (ADaM IG) and the ADaM Structure for Occurrence Data (OCCDS) documents. Items covered include setting up ADSL for common trial designs and walking through the process of creating a basic data structure (BDS), starting from a simple BDS and building on it to structure a data set that will support one or more analyses. Additionally, the course will demonstrate how to implement some variables that are only found in OCCDS, such as the standardized MedDRA query (SMQ) and customized query (CQ) variables.


Intro to Tidymodels
Phil Bowsher
September 14, 2021, Noon - 4:30 PM ET


RStudio will give an introduction to Tidymodels. Tidymodels is a collection of packages for modeling and machine learning using tidyverse principles. In this session we'll introduce functions from rsample, recipes, parsnip, and yardstick. You'll learn how to split data, fit a model, predict, and compare outcomes with a workflow that easily allows you to change and compare model types. A training environment with R and RStudio will be provided for all participants.

Example Content:
https://github.com/rstudio-education/tidymodels-virtually


Reproducible Computation at Scale with R
Will Landau
November 2, 2021, 11:00 AM - 3:30 PM ET


Ambitious workflows in R, from Bayesian data analysis to machine learning, can be difficult to manage. A single round of computation can take several hours to complete, and routine updates to the code and data tend to invalidate hard-earned results. You can enhance the maintainability, hygiene, speed, scale, and reproducibility of such projects with the {targets} R package. {targets} resolves the dependency structure of your analysis pipeline, skips tasks that are already up to date, executes the rest with optional distributed computing and cloud storage, and abstracts output artifacts as R objects. {targets} surpasses the permanent limitations of its predecessor, {drake}, and provides increased efficiency and a smoother user experience. In this hands-on interactive workshop, you will apply your R skills and practice targets-powered automation with a machine learning project.


Hands-On Data-Driven Design: Developing More Flexible, Reusable, Configurable SAS Software
Troy Hughes
December 1, 2021, 11:00 AM - 3:30 PM ET


Attend and receive a FREE copy of the author’s 600-page book, SAS® Data-Driven Development: From Abstract Design to Dynamic Functionality, Second Edition, released in 2021. Students will receive the physical book in advance of this virtual training, and can run all code during the course using SAS Display Manager, SAS Enterprise Guide, SAS University Edition, or SAS OnDemand for Analytics.

This HANDS-ON workshop installs the student as the new SAS consultant within Scranton, Pennsylvania’s most infamous paper supply company — charged with improving software functionality and performance through data-driven software design. Navigate office intrigue and antics to gather software requirements, analyze hardcoded legacy SAS programs, and refactor and improve software through data-driven design. Students can run all examples. Help Jim, Dwight, Phyllis, and Stanley sell more paper through higher quality data-driven software!

Data-driven design describes software in which configuration items, business rules, data validation rules, data models, data dictionaries, report style, and other dynamic elements are maintained in external data structures – NOT in underlying code. Benefits include increased software flexibility, reusability, maintainability, modularity, readability, interoperability, extensibility, and configurability.

Topics include:
  • Compare preferred data-driven design with undesirable hardcoded design
  • Build reusable procedures, functions, and call routines (subroutines) using SAS macros and PROC FCMP (the SAS function compiler)
  • Demonstrate built-in and user-defined data structures (e.g., parameters, macro lists, arrays, hash objects, control tables, configuration files, data sets, Excel, CSV, CSS)
  • Use SAS components that support data-driven development (e.g., CALL EXECUTE, CNTLIN option in PROC FORMAT, SYSPARM option, SAS dictionary tables, SAS arrays, CSSSTYLE option in PROC REPORT)
  • Ingest positional flat files, CSV files, SAS data sets, and other transactional files, and dynamically identify altered file format/structure through prescriptive data dictionaries
  • Create color-coded, “traffic light” quality control reports that automatically identify bad data while standardizing good data
  • Configure the style (e.g., format, font, color scheme, graphics) of data products using user-defined SAS formats and CSS files


Clinical Graphs Using SAS
Sanjay Matange
January 11, 2022, 10:00 AM - 2:30 PM ET


Graphs for the analysis of clinical research data and for health and life science applications can range from single-cell graphs to classification panels to complex multi-cell graphs with many specific requirements. These graphs can be made by the judicious use of the Statistical Graphics (SG) Procedures or the Graph Template Language (GTL). This seminar will teach you how to use the appropriate tool to create the graph you need, using real-world examples.

In this half-day presentation we will build single-cell graphs such as the Mean Change from Baseline, Survival Plot, Swimmer plot and Waterfall Charts using the SGPLOT procedure. We will build Panels of LFT Shift by Type and displays for Lab Values using the SGPANEL procedure. Finally, we will build complex multi-cell graphs for Most Frequent Adverse Events, and a combined display for Tumor Size Change + Duration of Treatment + Baseline Tumor Load. We will discuss the advanced features available in the SG procedures and GTL to help build these graphs and how these graphs can be easily extended and customized to your individual requirements.

Audience: Graph programmers
Required: Moderate SAS programming skills.


Deep Dive into Electronic Submission Components for Regulatory Submission of Clinical Study Data
Prafulla Girase
February 3, 2022, Noon - 4:30 PM ET


A regulatory submission of clinical study data also needs to be accompanied by various other electronic submission (eSUB) components, such as Define-XML, the annotated CRF, the study data reviewer’s guide, and the analysis data reviewer’s guide. This seminar will take a deep dive into each of these components and educate attendees about key contents, best practices, and global considerations (i.e., FDA and PMDA) during preparation of these components. For example, attendees will learn the characteristics of a submission-ready annotated CRF (e.g., annotations, validated bookmarks/links, document properties). It will also go over key considerations related to preparation of a whole eSUB package for a submission, such as folder structure considerations, PDF validation practices, final package checklist, and regulatory hand-off. The author also plans to share general insights from his practical experience of attending a face-to-face data format consultation meeting with PMDA.


Driving Miss Data: Data-Driven Techniques
Richann Watson
March 2, 2022, Noon - 4:30 PM ET


We have all been there. We write a program based on the data we have. Then, we get new data and we must update the program. Making these updates can be time consuming. Not only must you update the production version of the program, but someone must also update any associated validation or QC programs. Wouldn’t it be nice if there were ways around this? This is where data-driven techniques come in handy. Using detailed examples, you will learn how to write robust code that is ready to handle an unexpected bend in the road! This half-day course will cover advanced techniques such as: discovering and using information about data sets and variables even if it's not known in advance; generating dynamic formats that are based on the data instead of hard-coded into your program; using complex looping structures to control your program flow based on the data; building code on the fly, even from within a DATA step; and much more!


Oncology Study Seminar for Programmers and Biostatisticians
Kevin Lee
April 1, 2022, Noon - 4:30 PM ET


Compared to studies in other therapeutic areas, oncology studies are generally complex and difficult for programmers and statisticians. There is more to understand and to know, such as different clinical study types, specific data collection points, and analyses. In this seminar, programmers and statisticians will learn oncology-specific knowledge in clinical studies and will gain a holistic view of oncology studies, from data collection to CDISC datasets to analysis. Programmers and statisticians will also find out what makes oncology studies unique and learn how to lead oncology study projects effectively.

The seminar will cover four different subtypes and their response criteria guidelines. The first subtype, Solid Tumor studies, usually follows RECIST (Response Evaluation Criteria in Solid Tumors). The second subtype, Immunotherapy studies, usually follows irRC (immune-related Response Criteria). The third subtype, Lymphoma studies, usually follows Cheson. Lastly, Leukemia studies follow study-specific guidelines (e.g., IWCLL for Chronic Lymphocytic Leukemia). The seminar will show how to use response criteria guidelines for data collection and response evaluation.

Programmers and statisticians will learn how to create SDTM tumor-specific datasets (RS, TU, TR), what SDTM domains are used for certain data collection, and what Controlled Terminology (e.g., CR, PR, SD, PD, NE) will be applied. They will also learn how to create Time-to-Event ADaM datasets from SDTM domains and how to use ADaM datasets to derive efficacy analyses (e.g., OS, PFS, TTP, ORR, DFS) and Kaplan-Meier curves using SAS procedures such as PROC LIFETEST and PROC PHREG.

Finally, programmers and statisticians will understand how to build end-to-end, standards-driven oncology studies, from protocol, study subtypes, response criteria, data collection, SDTM, and ADaM to analysis.





Instructor Biographies

Phil Bowsher

Phil is the Director of Healthcare and Life Sciences at RStudio and founder of the R in Pharma gathering at Harvard University. Phil is a published author and award-winning speaker, having given over 100 R talks and workshops in 4 countries to an estimated 20,000 people. His work focuses on innovation in the pharmaceutical industry, with an emphasis on interactive web applications, reproducible research and open-source education. He is interested in the use of R with applications in drug development and is a contributor to conferences promoting science through open data and software. Phil (RStudio Shiny Train-the-Trainer certified) has been one of the foremost promoters of Shiny, R Markdown, and the Tidyverse in the drug development process, documenting and explaining each in detail. He has experience at a number of technology and consulting corporations working in data science teams and delivering innovative data products. Phil has over 15 years’ experience implementing analytical programs, specializing in interactive web application initiatives and reporting needs for life science companies.


Prafulla Girase

Prafulla Girase has over 20 years of experience in the biotech industry, including experience in the statistical programming and data standards space. He has worked as an electronic submission (eSUB) lead or co-lead on multiple NDA/BLA clinical data submission packages for therapies that are currently approved and on the market. Prafulla has experience attending meetings with regulatory agencies (FDA/PMDA) regarding data standards, including attending a face-to-face data format consultation meeting with PMDA. He currently works as a Director, Data Standards and Governance at Alexion, where he is responsible for leading data standards within the Statistical Programming function. He was a co-lead for PhUSE’s Define-XML completion guideline working group and is an active volunteer in CDISC. He holds an MS degree in Pharmacy Administration from the University of Rhode Island.


Troy Hughes

Troy Martin Hughes has been a SAS practitioner for more than 20 years, has managed SAS projects in support of federal, state, and local government initiatives, and is a SAS Certified Advanced Programmer, SAS Certified Base Programmer, SAS Certified Clinical Trials Programmer, and SAS Professional V8. Since 2013, he has given more than 100 presentations, trainings, and hands-on workshops at SAS conferences, including at SAS Global Forum, SAS Analytics Experience, WUSS, SCSUG, SESUG, MWSUG, PharmaSUG, BASAS, and BASUG. He has authored two groundbreaking books that model software design and development best practices:
  • SAS® Data-Driven Development: From Abstract Design to Dynamic Functionality, Second Edition (2021)
  • SAS® Data Analytic Development: Dimensions of Software Quality (2016)
Troy has an MBA in information systems management as well as other credentials, including: PMP, PMI-RMP, PMI-PBA, PMI-ACP, SSCP, CISSP, CSSLP, Network+, Security+, CySA+, CASP+, CISA, CGEIT, CISM, CRISC, ITIL Foundation, CSM, CSD, CSPO, CSP, CSP-SM, CSP-PO, and SAFe Government Practitioner (SGF). He is a US Navy veteran with two tours of duty in Afghanistan.


Will Landau

Will Landau works at Eli Lilly and Company, where he develops methods and tools for clinical statisticians, and he is the creator and maintainer of the {targets} and {drake} R packages. Will earned his PhD in Statistics at Iowa State University in 2016, where his dissertation research applied Bayesian methods, hierarchical models, and GPU computing to the analysis of RNA-seq data.



Kevin Lee

Kevin Lee is a data scientist, statistician, Machine Learning working group lead, corporate/university trainer, and evangelist for new technology. Kevin supports the pharmaceutical industry as AVP of AI/Machine Learning Consultant at Genpact. Among all therapeutic areas, Kevin has always loved oncology studies, and he is an active supporter of oncology-specific standards such as CDISC tumor datasets, controlled terminology, and response criteria for each study type. Kevin wants to innovate in the pharmaceutical industry with AI/Machine Learning technology, and he currently leads the AI/Machine Learning working group in PhUSE. He also teaches Machine Learning and Python programming at universities and corporations. Kevin has presented about 100 papers at various conferences, including many oncology-related and Machine Learning-based papers. Kevin earned an M.S. in Applied Statistics at Villanova University following a B.S. from the University of Pennsylvania. Kevin is a lifelong learner who loves to learn and share.


Sanjay Matange

Sanjay Matange is an expert in the field of data visualization using SAS graphics software, including the SG procedures and GTL. Sanjay worked at SAS for 29 years, where he was responsible for the development of ODS Graphics. He is the co-author of four patents and the author of four SAS Press books, and was also the main author of the Graphically Speaking SAS blog for 8 years.


Richann Watson

Richann Watson is an independent statistical programmer and CDISC consultant based in Ohio. She has been using SAS since 1996, with most of her experience being in the life sciences industry. She specializes in analyzing clinical trial data and implementing CDISC standards. Additionally, she is a member of the CDISC ADaM team and various sub-teams. Richann loves to code and is an active participant and leader in the SAS User Group community. She has presented numerous papers, posters, and training seminars at SAS Global Forum, PharmaSUG, and various regional and local SAS user group meetings. Richann holds a bachelor’s degree in mathematics and computer science from Northern Kentucky University and a master’s degree in statistics from Miami University.