PharmaSUG Japan
2021 Virtual Event
Thursday, November 11, 2021
Data Science for Everybody - Data Science You Can Start Tomorrow -
The 4th annual PharmaSUG Japan SDE was held as a virtual event this year, and was a great success! Presentations are available from the links below. Please help us plan for next year's event by completing our post-event survey by November 30, 2021. See you in 2022!
Thank you to our 2021 Premier Sponsor:

Thursday, November 11, 2021 Single-Day Event Presentations

Presentation (click for abstract) | Presenter (click for bio) | Slides
R Programming for Validation | Kentaro Arai, Novartis Pharma K.K. | Slides (PDF, 1.3 MB)
Democratize Advanced Analytics for Everyone While Maintaining the Integrity and Reproducibility of Evidence using SAS | Toshiaki Habu and Toru Tsunoda, SAS Institute Japan | Slides (PDF, 2.7 MB)
Building an Organization to Accelerate Data Utilization | Hideki Ninomiya, Datack | Slides (PDF, 2.1 MB)
Quickly Build an Experimental Cloud Environment for Data Pre-Processing and Analysis | Hidenori Koizumi and Chiaki Ishio, Amazon Web Services | Slides (PDF, 5.1 MB)
Data Scientist in Shionogi : Education, Skills & Training | Ayaka Yamashita and Yoshitake Kitanishi, Shionogi & Co., Ltd. | Slides (PDF, 2.4 MB)

Presentation Abstracts and Seminar Information

R Programming for Validation
Kentaro Arai, Novartis Pharma K.K.

R is open-source software that provides thousands of implementations for statistical and data analysis. SAS is commonly used for the analysis of clinical trials; however, R may have several advantages over SAS. For example, R is free, makes it easy to create graphics, and has a large user base. We are currently exploring effective ways to use R in clinical trials. In this presentation, we will use two use cases to explain how R can be used to validate statistical results in clinical trials. The first case is the validation of the statistical outputs of a clinical trial, in which almost all outputs were validated with R. The second case is the validation of ADaM data. We will share the details of these use cases and the pros and cons of using R.
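As a rough illustration of what validating statistical outputs can involve, the sketch below compares two sets of summary statistics produced by independent programs and reports any discrepancies. It is written in Python purely for illustration. The function name `compare_results`, the tolerance, and the example values are all assumptions and are not taken from the presentation.

```python
import math


def compare_results(production, validation, rel_tol=1e-8):
    """Return a list of discrepancies between two dictionaries of statistics."""
    issues = []
    for key in sorted(set(production) | set(validation)):
        if key not in production:
            issues.append(f"{key}: missing from production")
        elif key not in validation:
            issues.append(f"{key}: missing from validation")
        elif not math.isclose(production[key], validation[key], rel_tol=rel_tol):
            issues.append(f"{key}: {production[key]} != {validation[key]}")
    return issues


# Hypothetical summary statistics from two independently written programs.
production = {"n": 120, "mean_change": -1.2345678, "p_value": 0.0312}
validation = {"n": 120, "mean_change": -1.2345678, "p_value": 0.0321}

issues = compare_results(production, validation)
for issue in issues:
    print(issue)
```

Comparing with a relative tolerance instead of exact equality avoids flagging harmless floating-point differences between tools, while a genuine mismatch such as the p-value above is still reported.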

Democratize Advanced Analytics for Everyone While Maintaining the Integrity and Reproducibility of Evidence using SAS
Toshiaki Habu, SAS Institute Japan
Toru Tsunoda, SAS Institute Japan

A data-driven approach is key to making successful decisions, especially in medicine and the life sciences. Statistical programming tools such as Base SAS are widely used. However, not everyone within an organization has been trained in statistical programming, and it has not been cost-effective to give all business users programming skills. On the other hand, data visualization tools cannot cover the need for complicated or advanced analytics, and it is not straightforward to maintain the integrity and reproducibility of evidence when using open source software. In this presentation, we would like to introduce SAS’ latest solutions with user-friendly interfaces, including SAS Enterprise Guide, SAS Studio and SAS Health: Cohort Builder. These solutions support the democratization of advanced analytics while maintaining the integrity and reproducibility of evidence within an organization.

Building an Organization to Accelerate Data Utilization
Hideki Ninomiya, Datack

More and more pharmaceutical companies are focusing on data utilization. On the other hand, some companies are still facing challenges. They have created specialized departments for data utilization, but these departments have not yet been able to collaborate with existing departments and have not produced results. Data utilization is limited to a few departments and members. There are many analysis needs in each department within the company, but not enough people to set up protocols. In this session, we will discuss, from the perspective of organizational development, how to accelerate the use of data, the skills and human resources required for data utilization, workflow, and how to build an infrastructure that can accelerate data utilization.

Quickly Build an Experimental Cloud Environment for Data Pre-Processing and Analysis
Hidenori Koizumi, Amazon Web Services
Chiaki Ishio, Amazon Web Services

There is a growing need to analyze large amounts of data from various sources, such as COVID-19 cohort study data and genome datasets, and to quickly transform the data into actionable insights. However, it takes time to prepare a server and set up analysis tools in an experimental environment. Data scientists also spend much of their time on data preparation, such as cleaning, normalizing, and transforming. In this session, we will introduce the latest customer case studies that use a variety of AWS services to help users analyze big data quickly. We will also demonstrate building and deploying an experimental environment, with code samples that data scientists can leverage for quick analysis.

Data Scientist in Shionogi : Education, Skills & Training
Ayaka Yamashita, Shionogi & Co., Ltd.
Yoshitake Kitanishi, Shionogi & Co., Ltd.

Recently, digital transformation has progressed in the pharmaceutical industry, and the concept of pharmaceuticals has changed. With the evolution of information technology, there are many opportunities to handle a variety of data, and the ability to combine and utilize that data is important. In other words, data literacy and data science are key. Therefore, data scientists in our company run a data science program to develop analytics translators who can discuss issues from both business and data-driven perspectives. Data science enables analytics translators to work efficiently and create innovation. The important thing in data science activities is to run the cycle of hypothesis and verification with high quality and high speed. In this session, we will introduce some examples of data science cycles and the concept of the "Shionogi Data Science Training Program".






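  • Run the cycle of forming and testing hypotheses based on data
  • Get an overview of the data and check its quality (check distributions and missing values)
  • Aggregation, visualization, and modeling with a programming language or a BI tool
  • Reporting and presentation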
*CRISP-DM: Cross-Industry Standard Process for Data Mining
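
The data overview and quality check step listed above (distributions and missing values) could look something like the following minimal Python sketch. It uses only the standard library. The helper `profile_column`, the column names, and the toy records are hypothetical and are not taken from the seminar materials.

```python
from statistics import mean, median


def profile_column(values):
    """Summarize one numeric column: size, missing count, and basic distribution."""
    observed = [v for v in values if v is not None]
    summary = {"n": len(values), "missing": len(values) - len(observed)}
    if observed:
        summary.update(
            min=min(observed),
            median=median(observed),
            mean=mean(observed),
            max=max(observed),
        )
    return summary


# Hypothetical toy records; None marks a missing value.
records = [
    {"age": 34, "weight_kg": 61.5},
    {"age": 51, "weight_kg": None},
    {"age": None, "weight_kg": 72.0},
    {"age": 45, "weight_kg": 58.5},
]

profiles = {
    column: profile_column([row[column] for row in records])
    for column in ("age", "weight_kg")
}

for column, summary in profiles.items():
    print(column, summary)
```

Reporting the missing count next to the distribution summary makes it easier to judge whether a column is complete enough to support the hypothesis being tested.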


LEVEL: Beginner - Intermediate

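  • You are motivated and want to take on data science in the future
  • You have at least one year of experience in data management or data analysis
  • You have basic knowledge of some programming language such as SAS, R, or Python (you can read and understand code)
  • You can bring a PC set up with an environment for data analysis, such as SAS, R, or Python
  • Sample code will be provided in Python (we will consider also providing it in SAS, depending on the languages participants request). You may use any programming language or tool for the hands-on work.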

Presenter Biographies

Kentaro Arai

Kentaro Arai is a Statistical Programmer in Data Sciences & Scientific Operations (DSSO) at Novartis Pharma K.K. (NPKK). He creates analysis reports for clinical trials and supports electronic data submissions. In addition, he leads the use of R programming in his department.

Toshiaki Habu

Toshiaki Habu is the manager of the Life Sciences section in the consulting department at SAS Institute Japan. He previously worked at a pharmaceutical company and has extensive IT experience in the life sciences industry.

Chiaki Ishio

Chiaki Ishio is a Solutions Architect on the Process Manufacturing and Healthcare Life Sciences team at Amazon Web Services Japan. She helps customers design and build their systems on AWS.

Yoshitake Kitanishi

Yoshitake Kitanishi is the Vice President and Head of the Data Science Department at Shionogi & Co., Ltd., and has been using SAS/R/Spotfire for 15+ years and Matlab/Python for 5+ years.

Hidenori Koizumi

Hidenori Koizumi is a Prototyping Solutions Architect in Japan’s Public Sector, and an expert in developing solutions for the research field based on his scientific background. He has been developing solutions with code, using tools such as AWS CDK.

Tadashi Matsuno

Tadashi Matsuno was previously a recruiter and career consultant in the Human Resources Department at Shionogi & Co., Ltd. He now belongs to the Data Science Department, where he has been developing analysis platforms and promoting the use of cutting-edge analysis technologies.

Hideki Ninomiya

After working in neurosurgery, Hideki Ninomiya worked on an online disease encyclopedia and telemedicine at Medley Inc. As a data scientist at 3ida Inc, he worked on data analysis and AI development for companies. He founded Datack, Inc., which supports database research for pharmaceutical companies. While conducting medical data analysis, he realized the importance of data maintenance and organization. He was selected as one of the 30 doctors involved in Japan's medical innovation by "Medicine 4.0".

Ryo Soejima

Ryo Soejima has worked at Shionogi for 4+ years as a Medical Science Liaison and 2+ years as a Data Scientist. He is interested in machine learning and aims to become a Kaggle Master.

Toru Tsunoda

Toru Tsunoda is the customer advisor for SAS Institute Japan. After working for several management consulting firms, he joined SAS Institute Japan. He is responsible for the health and life science industries.

Ayaka Yamashita

Ayaka Yamashita has been working for Shionogi for 4+ years as a Data Scientist. She has been developing integrated database systems and focusing on training data scientists.