Enhance your PharmaSUG experience by attending optional pre- and post-conference training seminars taught by seasoned experts. Half-day courses are only $200 with a conference registration, or $300 without a conference registration. You can sign up for classes through the seminar registration system. Space is limited!

Note that the seminar registration system is separate from the conference registration system this year. You must register for the conference first in order to receive the $100 discount on the seminar registration fee.

Sunday, May 14, 2023

Course Title | Instructor(s) | Time
#11 Understanding and Creating Define-XML 2.1 using SAS® OpenCST | Lex Jansen | 8:00 AM - 12:00 PM
#12 Programming Basics for Life Sciences | Jane Eslinger | 8:00 AM - 12:00 PM
#13 Using SAS in Python Applications with SASPy and Open-Source Tooling | Matthew Slaughter & Isaiah Lankham | 8:00 AM - 12:00 PM
#14 Hands-On Functions: How to Build Your Own User-Defined FCMP Functions and Macro Functions | Troy Hughes | 8:00 AM - 12:00 PM
#21 Making Your ADaM Dataset Analysis-Ready | Sandra Minjoe & Mario Widel | 1:00 PM - 5:00 PM
#22 Oncology Study Seminar for Programmers and Biostatisticians | Kevin Lee | 1:00 PM - 5:00 PM
#23 Deep Dive into Electronic Submission Components for Regulatory Submission of Clinical Study Data | Prafulla Girase | 1:00 PM - 5:00 PM
#24 A Deep Dive into Enhancing SAS/GRAPH® and SG Procedural Output with Templates, Styles, Attributes, and Annotation | Louise S. Hadden | 1:00 PM - 5:00 PM

Wednesday, May 17, 2023

Course Title | Instructor(s) | Time
#31 Python Programming Techniques by Example for SAS® Users | Kirk Paul “sasNerd” Lafler | 1:00 PM - 5:00 PM
#32 Data Driven Programming: Getting More Done with Less Code | Joe Matise | 1:00 PM - 5:00 PM
#33 Statistics for Clinical Programmers | Jim Box | 1:00 PM - 5:00 PM
#34 Introduction to Shiny for Clinical Reporting | Phil Bowsher | 1:00 PM - 5:00 PM




Course Descriptions

Understanding and Creating Define-XML 2.1 using SAS® OpenCST
Lex Jansen
Sunday, May 14, 2023, 8:00 AM - 12:00 PM


Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. It is an open standard that was developed by the World Wide Web Consortium (W3C) in order to provide a flexible way to create common information formats and share both the format and the data on the World Wide Web, intranets, and elsewhere. CDISC is a global, open, multidisciplinary, non-profit organization that has established standards to support the acquisition, exchange, submission and archive of clinical research data and metadata.
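To make the "human-readable and machine-readable" point concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The fragment is hypothetical: the element and attribute names (`Study`, `Variable`, `name`, `type`, `label`) are invented for illustration and are not taken from ODM, Define-XML, or any other CDISC standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML fragment; names are illustrative only and do not
# follow any CDISC schema.
doc = """<Study id="S001">
  <Variable name="AGE" type="integer" label="Age in years"/>
  <Variable name="SEX" type="text" label="Sex"/>
</Study>"""

# Parse the text into a tree of elements.
root = ET.fromstring(doc)

# Collect (name, type) pairs from every Variable child element.
variables = [(v.get("name"), v.get("type")) for v in root.findall("Variable")]

print(root.get("id"), variables)
```

The same text that a person can read directly is also what the program walks as a tree, which is the property that makes XML suitable for exchanging metadata between systems.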

The CDISC Data Exchange Standards Team created and published several CDISC standards in an XML representation. These XML standards include the Operational Data Model (ODM) and several ODM extensions:
  • Define-XML
  • Controlled Terminology in XML (CT-XML)
  • Dataset-XML
  • Analysis Results Metadata for Define-XML 2.x
The Open SAS® Clinical Standards Toolkit is an open-source framework to support Health and Life Sciences industry data model standards. It is targeted at advanced SAS programmers and supports working with several XML-based CDISC standards, such as ODM, Define-XML, Dataset-XML and CT-XML. This presentation will first introduce XML and will then give an overview of XML standards that are relevant to Define-XML for validation (XML Schema, Schematron) and transformation (XSL stylesheets). We will also introduce CDISC XML-based standards: ODM, Define-XML and the Analysis Results Metadata extension for Define-XML 2.x.


Programming Basics for Life Sciences
Jane Eslinger
Sunday, May 14, 2023, 8:00 AM - 12:00 PM


This half-day seminar is designed for new programmers and programmers who are new to the Life Sciences. We will cover the fundamentals of combining data sets, working with dates, and conditional logic. We will also cover formats and functions for converting data. Attendees will learn best practice programming techniques. This seminar is structured to give you the opportunity to write code, so after each concept type is explained, you will have the opportunity to practice coding. PharmaSUG will not be providing laptops, so please bring your own device so you can get the most out of this seminar.


Using SAS in Python Applications with SASPy and Open-Source Tooling
Matthew Slaughter, Isaiah Lankham
Sunday, May 14, 2023, 8:00 AM - 12:00 PM


Interested in learning Python? How about learning to make Python and SAS work together?

In this class, we’ll practice writing Python scripts using Google Colab (https://colab.research.google.com/), which is a free online implementation of JupyterLab, and we’ll link to SAS OnDemand for Academics (https://welcome.oda.sas.com/) to access the SAS analytical engine. We’ll also learn to use the popular pandas package, whose DataFrame objects are the Python equivalent of SAS datasets. Along the way, we’ll work through common data-analysis tasks using both regular SAS code and Python together with the SASPy package, highlighting important tradeoffs for each and emphasizing the value of being a polyglot programmer fluent in multiple languages. This will include a beginner-friendly overview of Python syntax and data structures. SASPy is a module developed by SAS Institute for the Python programming language, providing an alternative interface to the SAS system. With SASPy, SAS procedures can be executed in Python scripts using Python syntax, and data can be transferred between SAS datasets and their Python DataFrame equivalent. This allows SAS programmers to take advantage of the flexibility of Python for flow control, and Python programmers can incorporate SAS analytics into their scripts and applications.

This class is aimed at SAS programmers of all skill levels, including those with no prior experience using Python or JupyterLab. However, some examples will assume familiarity with the Output Delivery System, PROC SQL, and the SAS Macro Facility. Accounts for Google and SAS OnDemand for Academics will be needed to interact with code examples, and instructions for creating accounts will be distributed in advance. All class materials will also be accessible.


Hands-On Functions: How to Build Your Own User-Defined FCMP Functions and Macro Functions
Troy Hughes
Sunday, May 14, 2023, 8:00 AM - 12:00 PM


Attend and receive a FREE copy of the author’s 550-page book, SAS® Data-Driven Development: From Abstract Design to Dynamic Functionality, Second Edition, released in 2022! Students will receive the physical book at the training!

“User-defined” functions are those functions that are created by SAS users, as contrasted with “built-in” functions that are part of out-of-the-box Base SAS. SAS provides two methods for building user-defined functions: the SAS macro language and the SAS Function Compiler (PROC FCMP). This introductory course demonstrates how to build user-defined functions (and subroutines), including both macro functions and FCMP functions. No prior experience with the SAS macro language or PROC FCMP syntax is required.

User-defined functions improve software reusability — that is, the ability of code modules to be reused in future software projects, and to be reused by multiple SAS users within a team or organization. Reusability enables a function to be developed once but used repeatedly, which reduces the workload of the SAS users who are writing programs, by enabling us to rely on previously built (and fully tested) code modules. Thus, user-defined functions lead to not only more flexible and configurable software but also a more productive, efficient SAS team.

This HANDS-ON workshop enables students to run all programs in real-time using SAS Display Manager, SAS Enterprise Guide, or SAS OnDemand for Academics. Macro function topics comprise approximately 1/3 of the course, and include:
  • Gentle introduction to the SAS macro language, including differentiation between SAS macros and SAS macro functions
  • Differentiation between positional and keyword parameters
  • Defining optional parameters and default parameter values
  • Passing macro lists and two-dimensional data structures to functions
  • Use of the PARMBUFF option in the %MACRO statement to facilitate multi-element arguments
  • Macro function argument validation, exception handling, and use of global macro variables as return values / return codes
FCMP function topics comprise approximately 2/3 of the course, and include:
  • Gentle introduction to PROC FCMP syntax and the construction of user-defined functions and subroutines (with the FUNCTION and SUBROUTINE statements, respectively)
  • Use of the VARARGS option in the FUNCTION statement to enable multi-element arguments to be passed to functions, and OUTARGS option to modify multiple arguments (within a subroutine)
  • Passing character and/or numeric data types to functions
  • Passing arrays to functions, and utilizing arrays within functions
  • Declaring, initializing, and referencing hash objects within functions
  • Calling functions and subroutines from the DATA step, and from %SYSFUNC and %SYSCALL
  • Calling functions from PROC FORMAT



Making Your ADaM Dataset Analysis-Ready
Sandra Minjoe, Mario Widel
Sunday, May 14, 2023, 1:00 PM - 5:00 PM


One of the fundamental principles of ADaM is to be “analysis-ready”. But what does that mean, why is it important, and how do you determine if your analysis dataset is indeed “analysis-ready”? This seminar delves into what the ADaM documents say about being “analysis-ready”, including what type of dataset manipulation is allowed (and not allowed) to happen between the ADaM dataset and the statistical output. It describes how to choose the appropriate dataset structure and recommends variables that will help efficiently create different types of analysis output, such as tables and figures. It also describes situations where “analysis-ready” doesn’t apply. This seminar includes examples and exercises to demonstrate how to know whether your dataset meets the ADaM “analysis-ready” fundamental principle.


Oncology Study Seminar for Programmers and Biostatisticians
Kevin Lee
Sunday, May 14, 2023, 1:00 PM - 5:00 PM


Compared to other therapeutic studies, oncology studies are generally complex and difficult for programmers and statisticians. There is more to understand and to know, such as different clinical study types, specific data collection points, and analysis. In this seminar, programmers and statisticians will learn oncology-specific knowledge in clinical studies and will gain a holistic view of oncology studies, from data collection through CDISC datasets to analysis. Programmers and statisticians will also find out what makes oncology studies unique and learn how to lead oncology study projects effectively.

The seminar will cover four different sub-types and their response criteria guidelines. The first sub-type, solid tumor studies, usually follows RECIST (Response Evaluation Criteria in Solid Tumors). The second sub-type, immunotherapy studies, usually follows irRC (immune-related Response Criteria). The third sub-type, lymphoma studies, usually follows Cheson. Lastly, leukemia studies follow study-specific guidelines (e.g., IWCLL for chronic lymphocytic leukemia). The seminar will show how to use response criteria guidelines for data collection and response evaluation.

Programmers and statisticians will learn how to create SDTM tumor-specific datasets (RS, TU, TR), what SDTM domains are used for certain data collection, and what Controlled Terminology (e.g., CR, PR, SD, PD, NE) will be applied. They will also learn how to create time-to-event ADaM datasets from SDTM domains and how to use ADaM datasets to derive efficacy analysis (e.g., OS, PFS, TTP, ORR, DFS) and Kaplan-Meier curves using SAS procedures such as PROC LIFETEST and PROC PHREG.

Finally, programmers and statisticians will understand how to build end-to-end standards-driven oncology studies from protocol, study sub-types, response criteria, data collection, SDTM and ADaM to analysis.


Deep Dive into Electronic Submission Components for Regulatory Submission of Clinical Study Data
Prafulla Girase
Sunday, May 14, 2023, 1:00 PM - 5:00 PM


A regulatory submission of clinical study data also needs to be accompanied by various other electronic submission (eSUB) components such as Define-XML, annotated CRF, study data reviewer’s guide, analysis data reviewer’s guide, etc. This seminar will take a deep dive into each of these components and educate attendees about key contents, best practices and global considerations (i.e., FDA and PMDA) during preparation of these components. For example, attendees will learn the characteristics of a submission-ready annotated CRF (i.e., annotations, validated bookmarks/links, document properties, etc.). It will also go over key considerations related to preparation of a whole eSUB package for a submission, such as folder structure considerations, PDF validation practices, final package checklist, regulatory hand-off, etc. The author also plans to share general insights from his practical experience of attending a face-to-face data format consultation meeting with PMDA.


A Deep Dive into Enhancing SAS/GRAPH® and SG Procedural Output with Templates, Styles, Attributes, and Annotation
Louise S. Hadden
Sunday, May 14, 2023, 1:00 PM - 5:00 PM


Enhancing output from SAS/GRAPH® has been the subject of many a SAS® paper over the years, including my own and those written with co-authors. The more recent graphic output from PROC SGPLOT is often "camera-ready" without any user intervention, but occasionally there is a need for additional customization. SAS/GRAPH is a separate SAS product for which a specific license is required, while the SG procedures and the Graph Template Language (GTL) are available to all Base SAS users. This class will explore both new opportunities within Base SAS for creating remarkable graphic output and the creation of visualizations with SAS/GRAPH. Techniques in SAS/GRAPH, the SG procedures, and GTL, such as PROC TEMPLATE, PROC GREPLAY, PROC SGRENDER, SAS-provided annotation macros, and the concept of "ATTRS" in SG procedures, will be explored, compared, and contrasted. As background, a discussion of the evolution of SG procedures and the rise of GTL will be provided. The format of the class will primarily be lecture, but attendees will be provided with a suite of programs and sample data to optionally run examples during the course of the class. The focus of the class will be on procedures, tools and techniques that create data visualizations in SAS, emphasizing opportunities for customization using SAS/GRAPH and ODS graphics procedural and template-based outputs as a base.


Python Programming Techniques by Example for SAS® Users
Kirk Paul “sasNerd” Lafler
Wednesday, May 17, 2023, 1:00 PM - 5:00 PM


As a general-purpose programming language used by millions of users and developers around the world, Python offers clear syntax, scalability, versatility, and powerful libraries that add tremendous value for SAS users. Python’s use cases include analytics, data science, web development, game development, web scraping, text processing, image recognition, artificial intelligence, machine learning, and Internet of Things. Python’s dominance seems to be propelled by the fact that it is easy to learn, has a large and growing user community, and is freely available as open source. Attendees learn valuable programming techniques through the application of real-world examples. Topics include data access, exploratory data analysis, data cleaning and transformation, applying logic with comparison and logical operators, functions as powerful building blocks, identifying the most frequent value in a list, producing descriptive statistics, creating subsets, sorting data, appending / concatenating data, transposing data structures, and merging / joining data.
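As a rough illustration of two of the listed topics (identifying the most frequent value in a list and producing descriptive statistics), the following sketch uses only the Python standard library. The sample values and variable names are made up for this example and are not drawn from the course materials.

```python
from collections import Counter
from statistics import mean, median

# Made-up sample data for illustration only.
visits = ["screening", "week 4", "week 4", "week 8", "week 4", "week 8"]
ages = [34, 51, 47, 62, 29]

# Counter tallies each distinct value; most_common(1) returns the top
# (value, count) pair in a one-element list.
most_common_value, count = Counter(visits).most_common(1)[0]

# A small dictionary of descriptive statistics, loosely comparable to
# what PROC MEANS reports for a numeric variable.
summary = {
    "n": len(ages),
    "mean": mean(ages),
    "median": median(ages),
    "min": min(ages),
    "max": max(ages),
}

print(most_common_value, count)
print(summary)
```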


Data Driven Programming: Getting More Done with Less Code
Joe Matise
Wednesday, May 17, 2023, 1:00 PM - 5:00 PM


As a SAS programmer, you likely work with data on a daily basis. You clean the data, you reshape the data, you write reports with the data, perhaps you format the data for submission to regulators or other recipients. In those programs, you have to define filenames, libnames, dataset names, dates, parameters... information that is critical to your program operating correctly, but is not really programming itself - it is really a form of data. Moving that data out of your program and into data structures can make programs significantly shorter and less complex, make updating them easier, and save you significant time (and money!). In this class, I'll cover how to reduce the amount of code you need to write and increase the amount of work your code can do - by using the data you already have. I will talk about why to use data instead of code, about different common patterns that can be identified that are good for moving into data, and present several different ways to implement writing code from data. I will walk the class through several examples addressing common opportunities to move code into data, and show some resources that exist to help learn more about the subject. This class is intended for intermediate to advanced programmers, but should be understandable by anyone who knows SAS. Some macro language elements will be presented, but no prior knowledge of SAS macros is needed.
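The idea of moving parameters out of the program and into data can be sketched in a few lines. The example below is written in Python rather than SAS and is not the instructor's method: the control table, the library names `raw` and `work`, and the `generate_code` helper are all hypothetical, chosen only to show one generic way of writing code from data.

```python
# Hypothetical control table: each row holds parameters that would
# otherwise be hard-coded in a separate copy of the same program step.
control = [
    {"dataset": "dm", "sort_key": "usubjid"},
    {"dataset": "ae", "sort_key": "usubjid aestdtc"},
    {"dataset": "lb", "sort_key": "usubjid lbtestcd"},
]

# One template replaces three near-identical hand-written steps.
TEMPLATE = "proc sort data=raw.{dataset} out=work.{dataset}; by {sort_key}; run;"


def generate_code(rows, template=TEMPLATE):
    """Return one generated statement per control-table row."""
    return [template.format(**row) for row in rows]


statements = generate_code(control)
for line in statements:
    print(line)
```

Adding a fourth dataset then means adding one row to the control table instead of editing the program logic.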


Statistics for Clinical Programmers
Jim Box
Wednesday, May 17, 2023, 1:00 PM - 5:00 PM


Ever wonder about the statistics behind some of the tables you create for a study? Want to know what p-values really mean? Interested in understanding how study sample sizes are determined? Join us for an overview of statistical methods where we’ll look at how the concepts of probability and statistics are used in clinical research. We won’t focus too much on the math and equations, but will work to get a basic understanding of some of the following topics:
  • Basics of Probability
  • Data Distributions: Normal and Binomial data
  • Descriptive & Inferential Statistics
  • Categorical Data Analysis
  • Hypothesis Testing
  • Continuous Data Analysis
  • Sample Size Estimation
  • Data Correlations
  • Simple Linear Regression
  • Intro to Logistic Regression



Introduction to Shiny for Clinical Reporting
Phil Bowsher
Wednesday, May 17, 2023, 1:00 PM - 5:00 PM


RStudio will be presenting a workshop providing an overview of Shiny for the R user community at PharmaSUG. This is a great opportunity to learn and get inspired about new capabilities for creating compelling analyses with applications in drug development. No prior knowledge of R, RStudio or Shiny is needed. This short course will provide a hands-on introduction to Shiny with a focus on creating apps for inclusion in submissions to regulatory bodies. The hands-on course will include an overview of how to build Shiny apps. Workshop attendees will be provided with a free RStudio training instance. A training server will be available for attendees to use live during the session, and nothing needs to be installed prior to the workshop.





Instructor Biographies

Phil Bowsher

Phil is the Director of Healthcare and Life Sciences at RStudio and founder of the R in Pharma gathering at Harvard University. Phil is a published author and award-winning speaker, having given over 100 R talks and workshops in 4 countries to an estimated 20,000 people. His work focuses on innovation in the pharmaceutical industry, with an emphasis on interactive web applications, reproducible research and open-source education. He is interested in the use of R with applications in drug development and is a contributor to conferences promoting science through open data and software. Phil (RStudio Shiny Train-the-Trainer certified) has been one of the foremost promoters of Shiny, R Markdown, and the Tidyverse in the drug development process, documenting and explaining each in detail. He has experience at a number of technology and consulting corporations working in data science teams and delivering innovative data products. Phil has over 15 years’ experience implementing analytical programs, specializing in interactive web application initiatives and reporting needs for life science companies.


Jim Box

Jim is a data scientist with the Life Sciences Industry Consultants at SAS. Prior to that he spent 20 years in the CRO industry, primarily as a study statistician. He holds master's degrees in Statistics from Duke University and Analytics from North Carolina State University and is a frequent presenter at PharmaSUG and other industry conferences.


Jane Eslinger

Jane Eslinger is a Senior Technical Training Consultant at SAS Headquarters in Cary, North Carolina. Jane has authored two books: The SAS® Programmer's PROC REPORT Handbook: Basic to Advanced Reporting Techniques and The SAS® Programmer's PROC REPORT Handbook: ODS Companion. Her SAS certifications include Advanced Programmer for SAS®9, SAS® Certified Data Scientist, and SAS® Certified Advanced Visual Business Analyst.


Prafulla Girase

Prafulla Girase has 20+ years of experience in the biotech industry, including experience in the statistical programming and data standards space. He has worked as an electronic submission (eSUB) lead or co-lead on five NDA/BLA clinical data submission packages for therapies that are currently approved and on the market. Prafulla has experience attending meetings with regulatory agencies (FDA/PMDA) regarding data standards, including a face-to-face data format consultation meeting with PMDA. He currently works as Director, Data Standards and Governance at Alexion AstraZeneca Rare Disease, where he is responsible for leading data standards and governance within Statistical Programming.


Louise S. Hadden

Louise Hadden has been using, and loving, SAS since the days of punch cards and computers the size of a not-so-tiny house. She spends most of her time in support of health policy analytics at Abt Associates Inc., and loves a good SAS reporting challenge. She is also the girl with the SAS tattoo!


Troy Hughes

Troy Martin Hughes has been a SAS practitioner for more than 20 years, has managed SAS projects in support of federal, state, and local government initiatives, and is a SAS Certified Advanced Programmer, SAS Certified Base Programmer, SAS Certified Clinical Trials Programmer, and SAS Professional V8. Since 2013, he has given more than 100 presentations, trainings, and hands-on workshops at SAS conferences, including at SAS Global Forum, SAS Analytics Experience, WUSS, SCSUG, SESUG, MWSUG, PharmaSUG, BASAS, and BASUG. He has authored two groundbreaking books that model software design and development best practices:


Lex Jansen

Lex Jansen is an independent consultant, currently working as Senior Director, Data Science Development at CDISC. Before, Lex was a Principal Solution Consultant at SAS Institute, Health and Life Sciences. In this role, he helped customers implement SAS software for clinical research, such as SAS Life Science Analytics Framework (LSAF). Prior to this role he was one of the developers of the SAS Clinical Standards Toolkit. Lex was also one of the Java developers of the SAS Life Science Analytics Framework. Prior to working at SAS he was a Senior Consultant, Clinical Data Strategies at Octagon Research Solutions, Inc. In this position, Lex worked on client consulting projects dealing with the assessment, design and/or implementation of CDISC standards. Before his employment with Octagon, he held various positions in the 16 years that he worked at the pharmaceutical company Organon. Lex holds an MSc in Mathematics from the Eindhoven University of Technology in the Netherlands. Since 2008 Lex has been an active member of the CDISC Data Exchange Standards Team, where he has been active in the development of various CDISC standards: Define-XML 2.0/2.1, Dataset-XML and the Analysis Results Metadata extension for Define-XML 2.0. Lex owns the website www.lexjansen.com, which is well-known in the SAS community and contains more than 36,000 links to papers that were presented at major SAS User Group conferences.


Kirk Paul “sasNerd” Lafler

Kirk Paul Lafler is a SAS, SQL and Python consultant, application developer, programmer, and educator; an adjunct professor at San Diego State University; and an advisor and adjunct professor at the University of California San Diego Extension. He teaches SAS, SQL, Python, Excel and cloud-based courses, workshops, and webinars around the world. Kirk has been a SAS consultant, application developer, and programmer since 1979; an SQL user since 1985; a Python programmer since 2017; and an author of several books, including Exploratory Data Analysis (EDA) By Example (PB&J Press, 2023) and PROC SQL: Beyond the Basics Using SAS, Third Edition (SAS Press, 2019), along with numerous papers and articles on a variety of SAS, SQL, and Python topics. Kirk has served as an invited speaker, educator, keynote, and section leader at SAS conferences, and is the recipient of 27 “Best” contributed paper, hands-on workshop (HOW), and poster awards.


Isaiah Lankham

Isaiah Lankham is a polyglot data analyst for the University of California’s systemwide office in Oakland, CA, specializing in data analysis and visualization using Tableau, SAS, and Python. Initially trained as a mathematician and educator, Isaiah is also an adjunct faculty member for the Statistics Department at California State University, East Bay, regularly teaching graduate SAS programming courses.


Kevin Lee

Kevin Lee is a Data Scientist, statistician, Machine Learning working group lead, corporate/university trainer and evangelist in new technology. Kevin supports the pharmaceutical industry as AVP of AI/Machine Learning Consultant at Genpact. Among all the therapeutic areas, Kevin always loves oncology studies, and he is an active supporter of oncology-specific standards such as CDISC tumor datasets, controlled terminology and response criteria for each study type. Kevin wants to innovate the pharmaceutical industry with AI/Machine Learning technology, and he currently leads the PHUSE AI/Machine Learning Working Group. He also teaches Machine Learning and Python programming at universities and corporations. Kevin has presented about 100 papers at various conferences, including many oncology-related and Machine Learning-based papers. Kevin earned an M.S. in Applied Statistics at Villanova University following a B.S. from the University of Pennsylvania. Kevin is a lifetime learner who loves to learn and share.


Joe Matise

Joe has over 15 years of experience as a SAS developer, working primarily in healthcare research and social science research with a focus on data. He has presented at SAS Global Forum, WUSS, MWSUG, SESUG, TASS, BASUG, and more. In his free time, he is a proud parent of two pre-teens, whom he gets to play with while his wife helps develop new ways to fight cancer. He lives in the East Bay region of California.


Sandra Minjoe

Sandra Minjoe started programming in the pharma/biotech industry in 1993, and is a Senior Principal Clinical Data Standards Consultant at ICON PLC. Sandra is the former CDISC ADaM Team Lead, has been part of the ADaM team since 2001, proposed structures that became ADSL and OCCDS, and continues to work on sub-teams. In addition to her CDISC involvement, Sandra is an emeritus PharmaSUG Executive Committee member.


Matthew Slaughter

Matthew Slaughter, MSBA, is an Advanced SAS Certified Programmer and a Statistical Research Analyst at the Kaiser Permanente Center for Health Research in Portland, Oregon. Specializing in clinical prediction modeling, Matthew provides data management, programming, and analytical support to research projects in various topic areas.


Mario Widel

Mario Widel is a statistical programmer at Reata Pharmaceuticals. He has been involved in CDISC-related activities since 2007. In his current role, Mario focuses on process development for submission data and documentation. He is a member of the ADaM team and a CDISC-authorized SDTM and ADaM instructor, and has presented at numerous conferences including PharmaSUG, JSM, SAS Global Forum and PhUSE.