Do you have a tip that you find particularly useful? If so, you can share your handy tip(s) on PharmaSUG’s Tuesday Tips page. Submit tips about SAS, R, Python, use of CDISC standards, regulatory submissions, or anything else that you have found helpful and that might be useful to someone else.
New tips will be published every Tuesday, while older tips will be archived and accessible to viewers on the PharmaSUG website. Watch this space for further details!
Weekly Tip for Oct. 20, 2020
Check your statistical assumptions!
Regression analysis is one of the main steps (alongside data cleaning, preparation, and descriptive analysis) in any analytic plan, regardless of the plan's complexity. Choosing the wrong type of regression model, or violating a model's assumptions, can have detrimental effects on the results and future direction of an analysis. It is therefore important to understand the assumptions of these models and to know the procedures available for testing whether those assumptions are violated.
Keep in mind, each model has its own set of assumptions that must be met! For example:
- Common Assumptions of Parametric Tests: normality, homogeneity of variance, homogeneity of variance-covariance matrices, linear relationships, absence of multicollinearity, absence of autocorrelation, and randomization
- Logistic Regression Assumptions: dependent variable structure, observation independence, absence of multicollinearity, linearity between continuous independent variables and the log odds, and large sample size
- Linear Regression Assumptions: linearity, multivariate normality, absence of multicollinearity and autocorrelation, homoscedasticity, and measurement level
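As a rough illustration of what testing two of these assumptions can look like, here is a minimal Python sketch using only the standard library. It screens a pair of predictors for multicollinearity with the variance inflation factor (which, for exactly two predictors, reduces to 1 / (1 - r^2), where r is their Pearson correlation) and computes the Durbin-Watson statistic on a set of residuals as a check for first-order autocorrelation. The function names and input values are hypothetical and are not part of the original tip; in practice you would likely rely on the diagnostic procedures built into SAS, R, or a Python statistics package.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def vif_two_predictors(x1, x2):
    """Variance inflation factor when the model has exactly two predictors.

    With two predictors, VIF = 1 / (1 - r^2). Perfectly collinear inputs
    (r = +/-1) make the denominator zero, so this sketch does not handle them.
    """
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

def durbin_watson(residuals):
    """Durbin-Watson statistic: values near 2 suggest little first-order
    autocorrelation; values toward 0 or 4 suggest positive or negative
    autocorrelation, respectively."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator
```

A common rule of thumb treats a VIF well above 1 (many analysts use cutoffs of 5 or 10) as a sign that the predictors overlap too much, but the right threshold depends on the analysis.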
If you find that a model assumption has been violated, have no fear! Most violations can be corrected with minimal impact on the interpretability of your results. Corrections range from minor value transformations and the exclusion of noisy variables or observations to the choice of a more appropriate model type.
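To make the idea of a value transformation concrete, the following Python sketch (standard library only) compares the sample skewness of a right-skewed variable before and after a natural-log transformation. The data values and the `skewness` helper are made up for illustration and are not from the original tip. Keep in mind that a log transformation requires strictly positive values and changes how model coefficients are interpreted.

```python
import math

def skewness(values):
    """Sample skewness computed as m3 / m2**1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed measurements (all strictly positive).
raw = [1, 2, 2, 3, 3, 4, 5, 8, 15, 40]

# Natural-log transformation pulls in the long right tail.
logged = [math.log(v) for v in raw]

skew_raw = skewness(raw)
skew_logged = skewness(logged)
print(f"skewness before: {skew_raw:.2f}, after log transform: {skew_logged:.2f}")
```

For these values the transformed variable is noticeably less skewed than the original, which is the kind of change that can bring a variable closer to a normality assumption.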
Don’t forget to test your assumptions! If you run a model on data that do not meet its theoretical assumptions, your results may do more harm than running no model at all!
This week’s tip was contributed by Deanna Schreiber-Gregory. Deanna is a government contractor and independent consultant who specializes in statistics, research methods, and data management.