Introduction

Introduction to Data Science & Addiction Research Methods

Large multimodal biomedical datasets call for interdisciplinary research training. In this textbook, “large multimodal datasets” refers to studies that follow thousands of participants and collect many different kinds of measures from the same people: survey and interview data; behavioral and cognitive tasks; electronic health records or indicators; genetic material, biospecimens, and biomarkers; geocoded neighborhood and policy context; smartphone data and data from other novel technologies; and neuroimaging measures such as structural or functional brain variables. When these data streams are combined, they create research opportunities that were previously difficult or impossible, including linking social environments to biology, testing developmental pathways over time, and examining how risk and protection operate across levels from policy to brain to behavior.

For addiction research in particular, the availability of such datasets changes what training must cover. Researchers must be able to move across domains, translate theory into measurable variables, choose designs and tests that fit the data, evaluate alternative explanations, and communicate findings responsibly. This textbook is intended as an introductory-level resource that helps students build those fundamentals while learning how to think about and work with large, interdisciplinary datasets. The course also has a GitHub repository.

This textbook is designed to be used alongside Vu & Harrington, Introductory Statistics for the Life and Biomedical Sciences (2021). Most modules are paired with specific sections from Vu & Harrington so that statistical concepts are introduced when students need them for the applied research tasks in this course. Table 1 below lists the module-by-module pairings to guide your reading each week.

Vu, J., & Harrington, D. (2021). Introductory statistics for the life and biomedical sciences (1st ed., Version August 8, 2021). OpenIntro.

Table 1. Module pairing schedule with Vu & Harrington (2021)
| Module | DSARM module topic (this textbook) | Paired statistical concepts from Vu & Harrington (2021) |
|---|---|---|
| M1 | The Research Process & Data Ethics | None (DSARM reading only on ethical data practices) |
| M2 | Measuring Addiction & Youth Substance Use | Exploratory data analysis; numerical and categorical summaries; graphical displays (§1.2–1.6, pp. 10–45) |
| M3 | Attitudes & Environments; Data Cleaning | None (DSARM reading only on data cleaning and missing data) |
| M4 | Behavioral Genetics & Addiction | Foundations of probability; rules of probability; random variables (§2.1–2.2, pp. 80–105) |
| M5 | Polygenic Traits & Addiction | Discrete and continuous probability distributions; binomial and normal distributions (§3.1–3.4, pp. 123–151) |
| M6 | Social Determinants of Addiction | Statistical inference; sampling variability; confidence intervals (§4.1–4.2, pp. 172–185) |
| M7 | Public Policy & Addiction | Hypothesis testing framework; test statistics; Type I and Type II errors (§4.3–4.4, pp. 185–200) |
| M8 | Brain’s Reward System & Addiction | One- and two-sided tests; interpreting p-values; decision rules (§5.1–5.3, pp. 208–225) |
| M9 | Cue-Based Habits & Impulse Control | Statistical power; one-way ANOVA; comparing multiple group means (§5.4–5.6, pp. 225–242) |
| M10 | Capstone | Integration of inference, hypothesis testing, and ANOVA concepts (DSARM synthesis) |

License

Data Science & Addiction Research Methods Copyright © by Jesse Liss is licensed under a Creative Commons Attribution 4.0 International License, except where otherwise noted.