{"id":281,"date":"2026-03-01T13:43:51","date_gmt":"2026-03-01T13:43:51","guid":{"rendered":"https:\/\/openpub.libraries.rutgers.edu\/dsarm12\/?post_type=chapter&#038;p=281"},"modified":"2026-03-03T20:27:19","modified_gmt":"2026-03-03T20:27:19","slug":"capstone","status":"publish","type":"chapter","link":"https:\/\/openpub.libraries.rutgers.edu\/dsarm12\/chapter\/capstone\/","title":{"raw":"Capstone","rendered":"Capstone"},"content":{"raw":"<h2 data-start=\"0\" data-end=\"31\">Reading Objectives<\/h2>\r\n<ul>\r\n \t<li data-start=\"0\" data-end=\"197\">\r\n<p data-start=\"2\" data-end=\"197\"><strong data-start=\"2\" data-end=\"45\" data-is-only-node=\"\">Study Design and Inference:<\/strong> Use the inference hierarchy to explain what DSARM analyses can and cannot claim, and apply responsible language that matches an observational design.<\/p>\r\n<\/li>\r\n \t<li data-start=\"199\" data-end=\"390\">\r\n<p data-start=\"201\" data-end=\"390\"><strong data-start=\"201\" data-end=\"249\" data-is-only-node=\"\">Confounding and Causal Pathways:<\/strong> Define confounding, apply the H1\u2013H3 screening approach to evaluate a plausible third variable, and distinguish confounders from mediators.<\/p>\r\n<\/li>\r\n \t<li data-start=\"392\" data-end=\"631\">\r\n<p data-start=\"394\" data-end=\"631\"><strong data-start=\"394\" data-end=\"438\" data-is-only-node=\"\">Capstone Analysis Blueprint:<\/strong> Develop and carry out a defensible analysis plan using the DSARM wave structure (age 16 and age 21) and the course\u2019s allowed tests, including basic validity checks and transparent reporting.<\/p>\r\n<\/li>\r\n \t<li data-start=\"633\" data-end=\"875\" data-is-last-node=\"\">\r\n<p data-start=\"635\" data-end=\"875\" data-is-last-node=\"\"><strong data-start=\"635\" data-end=\"687\" data-is-only-node=\"\">Poster Simulation and Dissemination:<\/strong> Translate results into the required poster simulation components, emphasizing figure-centered communication, clear 
captions, and conclusions that reflect uncertainty and inference limits.<\/p>\r\n<\/li>\r\n<\/ul>\r\n<h2>Key Terms<\/h2>\r\n<ul>\r\n \t<li data-start=\"1403\" data-end=\"1545\">\r\n<p data-start=\"1405\" data-end=\"1545\"><strong data-start=\"1405\" data-end=\"1424\">Confounder (C):<\/strong> A third variable associated with both the IV and DV that can bias the observed IV\u2013DV association if not accounted for.<\/p>\r\n<\/li>\r\n \t<li data-start=\"1547\" data-end=\"1694\">\r\n<p data-start=\"1549\" data-end=\"1694\"><strong data-start=\"1549\" data-end=\"1570\">Confounding bias:<\/strong> Distortion of an IV\u2013DV association due to a confounder, making an effect look larger, smaller, or present when it is not.<\/p>\r\n<\/li>\r\n \t<li data-start=\"1696\" data-end=\"1843\">\r\n<p data-start=\"1698\" data-end=\"1843\"><strong data-start=\"1698\" data-end=\"1711\">Mediator:<\/strong> A variable on the causal pathway between IV and DV; controlling for a mediator can remove part of the mechanism linking IV to DV.<\/p>\r\n<\/li>\r\n<\/ul>\r\n<h2 data-start=\"0\" data-end=\"31\">1. Introduction: Capstone Project Overview<\/h2>\r\n<p data-start=\"33\" data-end=\"262\">Module 10 is the capstone for DSARM. The goal is to integrate what you have learned about study design, observational inference, confounding, basic hypothesis testing, and responsible communication into one coherent mini-project.<\/p>\r\n<p data-start=\"264\" data-end=\"648\">Your final deliverable is a <strong data-start=\"292\" data-end=\"313\">poster simulation<\/strong>, meaning you will build the core components of a scientific poster as a slide deck (not a full conference poster). This format is intentional. 
It trains you to communicate one clear research story, supported by well-labeled figures and careful interpretation, while staying honest about what observational data can and cannot justify.<\/p>\r\n<p data-start=\"650\" data-end=\"1131\">To complete the capstone, you will work with a synthetic ABCD-style dataset. You can access the dataset in the <a href=\"https:\/\/github.com\/jl2578\/dsarm-codespaces\">course GitHub repository<\/a>. It is designed to feel like a real, large-scale longitudinal dataset while remaining safe for teaching. You will choose a research question, identify one main predictor (IV) and one main outcome (DV), include at least one plausible third variable to consider as an alternative explanation, and then use the course\u2019s allowed tests to produce figures and results that you can explain clearly.<\/p>\r\n\r\n<table><caption><strong>Table 1.<\/strong> DSARM synthetic dataset overview and variable shopping checklist.<\/caption>\r\n<thead>\r\n<tr>\r\n<th style=\"text-align: left\">DSARM synthetic dataset<\/th>\r\n<th style=\"text-align: left\">Start with the data dictionary<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td style=\"vertical-align: top\">\r\n<ul>\r\n \t<li><strong>Time 1:<\/strong> age 16 wave<\/li>\r\n \t<li><strong>Time 2:<\/strong> age 21 wave<\/li>\r\n \t<li><strong>Longitudinal structure:<\/strong> the same individuals are measured twice; most projects will use Time 1, Time 2, or a Time 2 minus Time 1 difference score<\/li>\r\n \t<li><strong>Data types:<\/strong> survey, behavioral measures, imaging ROI variables, substance use, environment<\/li>\r\n \t<li><strong>Note:<\/strong> no twin subsample in this dataset<\/li>\r\n<\/ul>\r\n<\/td>\r\n<td style=\"vertical-align: top\">\r\n<ul>\r\n \t<li>Pick <strong>one main DV<\/strong>, <strong>one IV<\/strong>, and <strong>at least one plausible third variable (C)<\/strong> to evaluate alternative explanations<\/li>\r\n \t<li>Check levels of measurement and feasibility 
with the <strong>allowed course tests<\/strong><\/li>\r\n \t<li>Confirm which wave(s) each variable comes from and whether a difference score is appropriate for your question<\/li>\r\n<\/ul>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<h2 data-start=\"111\" data-end=\"186\">2. Study Design and Levels of Inference (Refresher \u2192 Inference Hierarchy)<\/h2>\r\n<p data-start=\"221\" data-end=\"494\">Your conclusions can only be as strong as your design. This section offers a quick refresher on study design (introduced in Module 2) to orient you toward the inference hierarchy you will use to frame your data analyses.<\/p>\r\n\r\n<h3 data-start=\"496\" data-end=\"546\">2.A. Experimental vs. Observational Study Designs<\/h3>\r\n<ul data-start=\"548\" data-end=\"1539\">\r\n \t<li data-start=\"548\" data-end=\"980\">\r\n<p data-start=\"374\" data-end=\"883\"><strong data-start=\"374\" data-end=\"423\">Experimental design (best for causal claims).<\/strong> In an experimental study, researchers <strong data-start=\"462\" data-end=\"472\">assign<\/strong> an exposure or intervention to participants, ideally using <strong data-start=\"532\" data-end=\"553\">random assignment<\/strong>. Random assignment aims to make groups comparable at baseline, so any later differences in outcomes are less likely to be \u201cfalse alarms\u201d caused by pre-existing group differences. When an experiment is well-executed, outcome differences can be more credibly linked to the intervention itself.<\/p>\r\n<\/li>\r\n \t<li data-start=\"548\" data-end=\"980\">\r\n<p data-start=\"374\" data-end=\"883\"><strong data-start=\"885\" data-end=\"936\">Observational design (what ABCD and DSARM are).<\/strong> In an observational study, researchers <strong data-start=\"976\" data-end=\"1010\">measure what naturally happens<\/strong> without assigning exposure. 
This design can reveal real-world patterns and associations, but it is also vulnerable to <strong data-start=\"1129\" data-end=\"1157\">alternative explanations<\/strong>. People who differ on the exposure often differ on other characteristics too, which can make an association look causal when it is actually driven by hidden baseline differences or \u201cthird-variable\u201d influences. That is why observational findings require more cautious interpretation.<\/p>\r\n<\/li>\r\n<\/ul>\r\n<h3 data-start=\"1541\" data-end=\"1617\">2.B. Within observational: Cross\u2011sectional vs. Longitudinal<\/h3>\r\n<ul>\r\n \t<li><strong data-start=\"1540\" data-end=\"1582\">Cross-sectional studies (single wave).<\/strong> Cross-sectional designs measure exposure and outcome at the same time. They provide a snapshot of relationships in a population, but they generally cannot establish temporal order. If X and Y are measured simultaneously, it is hard to know whether X preceded Y, Y preceded X, or whether both reflect other background factors.<\/li>\r\n<\/ul>\r\n<ul>\r\n \t<li data-start=\"1540\" data-end=\"1946\"><strong data-start=\"1948\" data-end=\"1990\">Longitudinal studies (multiple waves).<\/strong> Longitudinal designs measure the same individuals repeatedly over time. This can strengthen inference because it helps establish temporal order (X measured before Y) and allows you to study change. Still, time ordering alone does not guarantee causation, because third-variable influences can still shape both the predictor and the outcome over time.<\/li>\r\n<\/ul>\r\n<h3 data-start=\"2381\" data-end=\"2456\">2.C. 
Design features that strengthen inference (the bridge to the hierarchy)<\/h3>\r\n<p data-start=\"2458\" data-end=\"2522\">Certain design features move a study closer to causal inference:<\/p>\r\n\r\n<ul data-start=\"2524\" data-end=\"3430\">\r\n \t<li data-start=\"2524\" data-end=\"2693\">\r\n<p data-start=\"2526\" data-end=\"2693\"><strong data-start=\"2526\" data-end=\"2547\">Random assignment<\/strong> reduces systematic baseline differences between groups, which reduces bias from hidden group differences.<\/p>\r\n<\/li>\r\n \t<li data-start=\"2694\" data-end=\"3006\">\r\n<p data-start=\"2696\" data-end=\"3006\"><strong data-start=\"2696\" data-end=\"2729\">Natural and quasi-experiments<\/strong> exploit external events that create \u201cas-if random\u201d exposure. A classic example is Oregon\u2019s Medicaid lottery, which allowed researchers to compare lottery-selected vs non-selected groups in a way that closely resembles random assignment.<\/p>\r\n<\/li>\r\n \t<li data-start=\"3007\" data-end=\"3187\">\r\n<p data-start=\"3009\" data-end=\"3187\"><strong data-start=\"3009\" data-end=\"3026\">Time ordering<\/strong> (longitudinal data) helps answer \u201cwhat came first,\u201d but it still does not remove all alternative explanations by itself.<\/p>\r\n<\/li>\r\n \t<li data-start=\"3188\" data-end=\"3430\">\r\n<p data-start=\"3190\" data-end=\"3430\"><strong data-start=\"3190\" data-end=\"3239\">Better comparisons and statistical adjustment<\/strong> (clear comparison groups, covariates, and sensitivity checks) can reduce bias from third-variable influences, but they cannot eliminate it completely.<\/p>\r\n<\/li>\r\n<\/ul>\r\n<h3 data-start=\"3432\" data-end=\"3493\">2.D. 
Inference hierarchy (stronger \u2192 weaker for causal claims)<\/h3>\r\n<p data-start=\"3495\" data-end=\"3550\">A common hierarchy for causal strength looks like this:<\/p>\r\n\r\n<ol data-start=\"3552\" data-end=\"4299\">\r\n \t<li data-start=\"3552\" data-end=\"3672\">\r\n<p data-start=\"3555\" data-end=\"3672\"><strong data-start=\"3555\" data-end=\"3581\">Randomized experiments<\/strong> (strongest for causal claims when conducted well).<\/p>\r\n<\/li>\r\n \t<li data-start=\"3673\" data-end=\"3828\">\r\n<p data-start=\"3676\" data-end=\"3828\"><strong data-start=\"3676\" data-end=\"3707\">Natural \/ quasi-experiments<\/strong> (can approach causal inference when \u201cas-if random\u201d assumptions are plausible).<\/p>\r\n<\/li>\r\n \t<li data-start=\"3829\" data-end=\"4026\">\r\n<p data-start=\"3832\" data-end=\"4026\"><strong data-start=\"3832\" data-end=\"3870\">Longitudinal observational studies<\/strong> (temporal order helps; causal insight is limited and depends on how well alternative explanations are addressed).<\/p>\r\n<\/li>\r\n \t<li data-start=\"4027\" data-end=\"4152\">\r\n<p data-start=\"4030\" data-end=\"4152\"><strong data-start=\"4030\" data-end=\"4071\">Cross-sectional observational studies<\/strong> (association only; direction unclear).<\/p>\r\n<\/li>\r\n \t<li data-start=\"4153\" data-end=\"4299\">\r\n<p data-start=\"4156\" data-end=\"4299\"><strong data-start=\"4156\" data-end=\"4194\">Case studies \/ descriptive reports<\/strong> (excellent for hypothesis generation, not for causal testing).<\/p>\r\n<\/li>\r\n<\/ol>\r\n[caption id=\"attachment_284\" align=\"aligncenter\" width=\"1024\"]<img class=\"wp-image-284 size-large\" src=\"https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-09_00_20-AM-1024x683.png\" alt=\"Infographic showing a hierarchy of study designs from stronger to weaker causal inference: randomized experiment, natural or quasi-experiment, longitudinal 
observational, cross-sectional observational, and case study or descriptive. A side panel lists responsible language to use (associated with, linked to, correlated with, predicts, is consistent with) and language to avoid (caused, led to, reduced, increased).\" width=\"1024\" height=\"683\" \/> Figure 1. Hierarchy of study designs ranked by strength of causal inference, with guidance on responsible language for interpreting findings. Created with generative AI.[\/caption]\r\n<h3 data-start=\"4493\" data-end=\"4540\">2.E. What this means for DSARM capstone projects<\/h3>\r\n<p data-start=\"4542\" data-end=\"4630\">DSARM capstone projects will fall at <strong data-start=\"4584\" data-end=\"4596\">#3 or #4<\/strong> in this hierarchy. You can strengthen inference by:<\/p>\r\n\r\n<ul data-start=\"4631\" data-end=\"4930\">\r\n \t<li data-start=\"4631\" data-end=\"4705\">\r\n<p data-start=\"4633\" data-end=\"4705\">using multiple waves when possible (so you can speak to temporal order),<\/p>\r\n<\/li>\r\n \t<li data-start=\"4706\" data-end=\"4765\">\r\n<p data-start=\"4708\" data-end=\"4765\">defining clear comparison groups (more apples-to-apples),<\/p>\r\n<\/li>\r\n \t<li data-start=\"4766\" data-end=\"4930\">\r\n<p data-start=\"4768\" data-end=\"4930\">reporting unadjusted vs adjusted models to show whether the association is robust to plausible alternative explanations; if the association persists after adjustment, that strengthens the credibility of your interpretation.<\/p>\r\n<\/li>\r\n<\/ul>\r\n<h3 data-start=\"4932\" data-end=\"4994\">2.F. 
Responsible language (keep your claims inside your design)<\/h3>\r\n<p data-start=\"4996\" data-end=\"5042\">Your interpretations should match your design:<\/p>\r\n\r\n<ul data-start=\"5044\" data-end=\"5334\">\r\n \t<li data-start=\"5044\" data-end=\"5146\">\r\n<p data-start=\"5046\" data-end=\"5146\">Prefer: \u201cis associated with,\u201d \u201cis linked to,\u201d \u201ccorrelates with,\u201d \u201cpredicts,\u201d \u201cis consistent with.\u201d<\/p>\r\n<\/li>\r\n \t<li data-start=\"5147\" data-end=\"5334\">\r\n<p data-start=\"5149\" data-end=\"5334\">Avoid causal verbs like \u201ccauses,\u201d \u201cleads to,\u201d \u201creduces,\u201d or \u201cincreases\u201d unless you truly have a randomized or strong quasi-experimental design.<\/p>\r\n<\/li>\r\n<\/ul>\r\n<p data-start=\"5336\" data-end=\"5355\">Example contrast:<\/p>\r\n\r\n<ul data-start=\"5356\" data-end=\"5487\">\r\n \t<li data-start=\"5356\" data-end=\"5441\">\r\n<p data-start=\"5358\" data-end=\"5441\">Appropriate: \u201cHigher baseline screen time predicts lower attention at follow-up.\u201d<\/p>\r\n<\/li>\r\n \t<li data-start=\"5442\" data-end=\"5487\">\r\n<p data-start=\"5444\" data-end=\"5487\">Overreach: \u201cScreen time reduced attention.\u201d<\/p>\r\n<\/li>\r\n<\/ul>\r\n<h2 data-start=\"0\" data-end=\"61\">3. Confounding: Identifying and Addressing Third Variables<\/h2>\r\n<h3 data-start=\"63\" data-end=\"119\">3.A. Why confounding matters (one motivating example)<\/h3>\r\n<p data-start=\"121\" data-end=\"955\">Imagine you find a strong pattern in ABCD: adolescents who report frequent energy drink use at Time 1 also report higher anxiety at Time 2. It is very tempting to tell a simple story, such as \u201cenergy drinks cause later anxiety.\u201d The problem is that observational datasets are full of clustered life circumstances, so an apparent relationship can be a \u201cfalse positive\u201d for causation. 
Teens who drink a lot of energy drinks might also be sleeping less, under more academic pressure, experiencing higher baseline anxiety, living with more family stress, or embedded in peer contexts that increase both energy drink use and anxiety risk. In other words, the association may be real, but the causal explanation may be wrong. This is why observational findings can sound causal even when they are not.<\/p>\r\n\r\n<h3 data-start=\"957\" data-end=\"1015\">3.B. Definition (what a confounder is and what it does)<\/h3>\r\n<p data-start=\"1017\" data-end=\"1785\">A confounder is a third variable that is not the exposure and not the outcome, but is related to both in a way that biases the association we estimate. A useful way to think about it is \u201cshared causes.\u201d If the exposure and the outcome share a common cause, then part of the observed exposure\u2013outcome relationship can reflect background risk differences rather than an effect of the exposure itself. This third-variable bias can make an association look larger than it really is, create an association that is mostly spurious, or even hide and reverse an association, depending on how the shared causes are distributed across the groups you are comparing.<\/p>\r\n\r\n\r\n[caption id=\"attachment_286\" align=\"aligncenter\" width=\"1024\"]<img class=\"wp-image-286 size-large\" src=\"https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-04_42_05-PM-1024x683.png\" alt=\"Diagram showing a confounding variable example. Independent variable: social media use. Dependent variable: sleep quality. Confounder: stress level. Arrows show stress level influencing both social media use and sleep quality, and a dashed arrow from social media use to sleep quality. Text notes that a confounder influences both IV and DV and that ignoring the confounder can bias the association.\" width=\"1024\" height=\"683\" \/> Figure 2. 
Illustration of a confounder (stress level) influencing both the independent variable (social media use) and the dependent variable (sleep quality), demonstrating how failing to adjust for confounding can bias observed associations in observational research.[\/caption]\r\n<p data-start=\"1787\" data-end=\"2589\">This is one reason randomized experiments sit at the top of most inference hierarchies. Random assignment is designed to balance both known and unknown confounders across groups. Randomization makes it less likely that pre-existing baseline differences explain outcome differences. In observational studies like ABCD, you do not get that randomization, so you rely on measurement and adjustment. This entails identifying plausible third variables, measuring them well, and testing whether your main association changes when you account for them. Even then, observational work cannot guarantee that every relevant third variable has been measured, so responsible interpretation requires some humility.<\/p>\r\n\r\n<h3>3.C. Common ways researchers test and account for confounding<\/h3>\r\nIn observational research, you cannot rely on random assignment to balance background differences between groups, so you have to reduce bias through analysis choices. A common starting point is <strong>regression adjustment<\/strong>, where you include plausible confounders as covariates so the IV\u2013DV association is estimated while holding those third variables constant. In its simplest form, this looks like a multiple regression model such as DV = \u03b2\u2080 + \u03b2\u2081(IV) + \u03b2\u2082(C) + \u2026. The key diagnostic is what happens to the IV estimate (\u03b2\u2081) once the confounder enters the model. If \u03b2\u2081 shrinks substantially or becomes non-significant, the original association may have been largely explained by C. 
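That shrinkage diagnostic can be illustrated with a minimal simulation (a sketch only: IV, DV, and C here are placeholder names, not DSARM variables, and the data are generated so that C is the sole driver of both IV and DV):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Pure-confounding scenario: C influences both IV and DV,
# while IV has no direct effect on DV at all.
C = rng.normal(size=n)
IV = 0.8 * C + rng.normal(size=n)
DV = 0.8 * C + rng.normal(size=n)

def slope_of_first(y, *predictors):
    # OLS with an intercept; returns the coefficient of the first predictor.
    X = np.column_stack([np.ones_like(y), *predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

b1_unadjusted = slope_of_first(DV, IV)     # Model A: DV ~ IV
b1_adjusted = slope_of_first(DV, IV, C)    # Model B: DV ~ IV + C

print(f"beta1 unadjusted: {b1_unadjusted:.2f}")  # clearly nonzero
print(f"beta1 adjusted:   {b1_adjusted:.2f}")    # shrinks toward zero
```

In this simulation the unadjusted \u03b2\u2081 is clearly nonzero even though IV has no direct effect, and it collapses toward zero once C enters the model, which is exactly the pattern the diagnostic is looking for. 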
If \u03b2\u2081 stays similar, the association is more robust to that confounder, though it can still be vulnerable to unmeasured third variables.\r\n\r\nTo make this logic transparent, researchers typically use <strong>model comparison<\/strong> and report results from both an unadjusted and an adjusted model. Model A estimates DV ~ IV, which describes the raw association. Model B estimates DV ~ IV + C (and possibly additional confounders). You then compare the IV coefficient, uncertainty, and model N across Model A and Model B and describe how the interpretation changes.\r\n\r\nResearchers also sometimes use <strong>stratification or matching<\/strong> to make comparisons more apples-to-apples. The idea is to compare high vs. low exposure groups within levels of the confounder, such as comparing high vs. low screen time among youth who have similar sleep duration. This approach can be useful for intuition and for simple visualizations. In this course, regression adjustment will usually be the primary tool.\r\n\r\nFinally, when working with longitudinal data, many researchers use <strong>baseline outcome control<\/strong> as an additional guardrail. If you have a baseline measure of the DV, you can model the follow-up DV while controlling for the baseline DV, for example: Time 2 attention ~ Time 1 screen time + Time 1 attention + confounders. This reduces confounding from stable individual differences that influence the outcome and reframes the question as whether the IV predicts change or divergence over time. A short caution is worth remembering. Baseline control can over-control in cases where the IV already influenced the baseline DV, but in most observational follow-up analyses it is a strong default.\r\n<h3>3.D. The \u201ccrude\u201d H1\u2013H3 approach<\/h3>\r\nBefore you use regression adjustment, it helps to build intuition for what a confounder looks like in the data. 
In the capstone, we use a deliberately \u201ccrude\u201d screening approach that asks a simple question: does a third variable C show up in the story on both sides, meaning it is related to the predictor and the outcome? The goal here is not to prove causation or to claim you have identified the one true confounder. The goal is to develop a disciplined habit of checking whether your main IV\u2013DV relationship could plausibly reflect shared background differences rather than a direct effect.\r\n\r\nWe do this using a small bundle of hypothesis tests (our H1\u2013H3 framework in the capstone). First, you test <strong>H1<\/strong>, the primary association you actually care about, by estimating a simple unadjusted model such as DV ~ IV. Next, you test whether your candidate confounder is tied to the predictor by checking <strong>H2<\/strong> (IV ~ C). Then you test whether the same candidate is tied to the outcome by checking <strong>H3<\/strong> (DV ~ C). If both H2 and H3 are supported, C becomes a plausible confounder because it is associated with both sides of the relationship you are trying to interpret. At that point, your original H1 association may be partly or largely explained by C, even if H1 was statistically significant.\r\n\r\nOnce you have that basic intuition, you move from \u201cscreening\u201d to \u201caccounting.\u201d You fit an adjusted model such as DV ~ IV + C (and possibly additional confounders) and compare the IV estimate in the adjusted model to the IV estimate in the unadjusted model. If the IV effect shrinks a lot, that suggests the unadjusted association was strongly sensitive to C. If it stays similar, the association is more robust to that particular alternative explanation. This simple workflow is not the final word on confounding, but it is a reliable first step that will keep your interpretations honest and your methods transparent.\r\n<h3>3.E. Confounders vs. 
mediators (do not block the mechanism by accident)<\/h3>\r\nNot every third variable belongs in your \u201ccontrol variables\u201d list. Some variables are true confounders, meaning they sit outside the relationship you care about and create background differences that can bias the IV\u2013DV association. Others are mediators, meaning they are part of the causal pathway through which the IV exerts its influence on the DV. The distinction matters because controlling for a mediator can accidentally remove the very process you are trying to understand.\r\n\r\nA <strong>confounder<\/strong> is an outside cause that influences both the IV and the DV. If you do not account for it, you risk attributing an association to the IV when it may actually reflect shared background causes. In contrast, a <strong>mediator<\/strong> is a step in the chain from IV to DV. If your research question is about the total relationship between IV and DV, controlling for the mediator can \u201cblock\u201d the pathway and make the IV look less important than it truly is.\r\n\r\n[caption id=\"attachment_288\" align=\"aligncenter\" width=\"1024\"]<img class=\"wp-image-288 size-large\" src=\"https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-05_36_51-PM-1024x575.png\" alt=\"Side-by-side diagram comparing a confounder and a mediator. Left panel shows a confounder (shared risk context such as stress or neighborhood exposure) influencing both peer substance use (IV) and personal substance use (DV), creating potential bias. Right panel shows a mediator (attitudes toward drugs) lying on the pathway between peers\u2019 substance use (IV) and personal substance use (DV), with a note that controlling for the mediator blocks part of the causal effect.\" width=\"1024\" height=\"575\" \/> Figure 3. 
Diagram illustrating the difference between a confounder and a mediator: a confounder influences both the independent and dependent variables and can bias associations if unaccounted for, whereas a mediator lies on the causal pathway and represents part of the mechanism linking the independent variable to the outcome. Created with generative AI.[\/caption]\r\n\r\nHere is a quick example. Suppose your IV is peer substance use and your DV is your own substance use. A plausible mediator is your attitudes toward drugs. Peers can shape attitudes, and attitudes can shape behavior. If you control for attitudes in a model, you may be removing part of the mechanism by which peers influence use. That is not wrong in every situation, but it changes the question you are answering. Instead of estimating the total association between peers and your own use, you are estimating what is left after stripping out the attitude pathway.\r\n<h2>4. Capstone Analysis Blueprint (From Research Question to Results)<\/h2>\r\nThe capstone deliverable of this course is a poster simulation slide deck that contains the core poster components. This section focuses on producing the results that will populate those slides. Because our synthetic ABCD dataset is observational, your main job is to (1) make your analysis plan explicit before you run models, (2) run a small set of primary tests that match your design, and (3) communicate what your results do and do not justify.\r\n<h3>4.A. Define the research question and variables (operationalize early)<\/h3>\r\nStart by writing your research question as a single sentence that can be answered with DSARM variables. 
Because DSARM includes <strong>Time 1 (age 16)<\/strong> and <strong>Time 2 (age 21)<\/strong>, you can frame questions that use one wave or both waves, and you can treat the five-year span as meaningful when it fits your topic.\r\n\r\nHere are flexible one-sentence templates you can use:\r\n<ul>\r\n \t<li><strong>Cross-sectional (single wave):<\/strong> \u201cAt age 16 (Time 1), is X associated with Y?\u201d or \u201cAt age 21 (Time 2), is X associated with Y?\u201d<\/li>\r\n \t<li><strong>Prospective prediction (two waves):<\/strong> \u201cDoes X at age 16 (Time 1) predict Y at age 21 (Time 2)?\u201d<\/li>\r\n \t<li><strong>Change over time (two waves):<\/strong> \u201cDoes X at age 16 (Time 1) predict change in Y from age 16 to age 21?\u201d<\/li>\r\n \t<li><strong>Co-change (requires X and Y at both waves):<\/strong> \u201cDo changes in X from age 16 to 21 track with changes in Y over the same period?\u201d<\/li>\r\n \t<li><strong>Group differences in change:<\/strong> \u201cDo groups defined at Time 1 (for example high vs low X) differ in how much Y changes from 16 to 21?\u201d<\/li>\r\n<\/ul>\r\nOnce you have the question, convert it into concrete analytic ingredients:\r\n<ul>\r\n \t<li>Identify your <strong>IV<\/strong> (predictor\/exposure) and <strong>DV<\/strong> (outcome).<\/li>\r\n \t<li>List your planned <strong>covariates<\/strong>, especially variables you will adjust for as alternative explanations.<\/li>\r\n \t<li>Confirm how each variable is measured (continuous, binary, ordinal, categorical).<\/li>\r\n \t<li>Write down exactly which wave each variable comes from: <strong>Time 1 (age 16)<\/strong> and\/or <strong>Time 2 (age 21)<\/strong>.<\/li>\r\n \t<li>Decide whether you need derived variables (sum scores, composites, or change scores like Time 2 minus Time 1), and write down the exact recipe.<\/li>\r\n<\/ul>\r\nA practical rule: if you cannot state exactly how the DV is measured and which wave it comes from, you do not yet have an analysis 
plan.\r\n<h3>4.B. Choose design and timeframe (match the question)<\/h3>\r\nNow that you have defined your IV, DV, covariates, and wave(s), choose a design that matches that question and record the inference limits that come with it. DSARM and ABCD are <strong>observational<\/strong> datasets. Exposures are measured rather than assigned, so your conclusions should be framed as <strong>associations<\/strong> or <strong>predictions<\/strong>, not causal effects.\r\n\r\nYour main design choice is straightforward:\r\n<ul>\r\n \t<li><strong>Cross-sectional<\/strong> means you are analyzing a single wave (Time 1 at age 16, or Time 2 at age 21). This supports clear descriptive and associative claims at that age, but it does not establish direction.<\/li>\r\n \t<li><strong>Longitudinal (two-wave)<\/strong> means you are using both waves across the five-year span. This supports statements about temporal ordering, such as whether Time 1 predicts Time 2, but it still does not prove causation.<\/li>\r\n<\/ul>\r\nConcrete decisions to record up front:\r\n<ul>\r\n \t<li>Is this <strong>cross-sectional<\/strong> (single wave) or <strong>two-wave longitudinal<\/strong> (Time 1 \u2192 Time 2)?<\/li>\r\n \t<li>If two-wave longitudinal, what is the exact ordering (for example, <strong>Time 1 predictor \u2192 Time 2 outcome<\/strong>)?<\/li>\r\n \t<li>Will you include <strong>baseline outcome control<\/strong> when available (for example, modeling the Time 2 outcome while controlling the Time 1 level of the same outcome)? If yes, state why, since it changes the interpretation toward change over time rather than simple prediction.<\/li>\r\n<\/ul>\r\n<h3>4.C. Build a reproducible project plan before running models<\/h3>\r\nIn research, \u201creproducible\u201d means a reader can trace every statistic and finding in your results back to a saved output that was generated by code. This is not busywork. 
It is how you prevent results from drifting as you revise notebooks, rerun cells, or update plots. Equally important, this is how other researchers verify your findings.\r\n\r\nSet up the structure before you analyze:\r\n<ul>\r\n \t<li>Create a clear folder structure (for example: <code>data_raw\/<\/code>, <code>data_clean\/<\/code>, <code>scripts\/<\/code>, <code>outputs\/<\/code>, <code>figures\/<\/code>, <code>poster\/<\/code>).<\/li>\r\n \t<li>Use consistent filenames that record wave(s), variable set, and version\/date (for example: <code>t1_age16_screen_attention_v1.csv<\/code>).<\/li>\r\n \t<li>Keep an analysis log (a short markdown file is fine) that records:\r\n<ul>\r\n \t<li>variables and waves used (and why),<\/li>\r\n \t<li>cleaning rules and exclusions,<\/li>\r\n \t<li>derived-variable recipes,<\/li>\r\n \t<li>model formulas you ran,<\/li>\r\n \t<li>final analytic N for each model.<\/li>\r\n<\/ul>\r\n<\/li>\r\n \t<li>Use a \u201cno handcrafted figures\u201d rule. Every plot should be generated from code and saved with a stable filename.<\/li>\r\n<\/ul>\r\n<h3>4.D. Execute the analysis sequence (what you actually run)<\/h3>\r\nThis is the core workflow. It is intentionally simple so you can do it well and explain it clearly.\r\n\r\n<strong>1) Data audit and cleaning plan<\/strong>\r\nConfirm coding, missingness rules, and exclusions. Track sample size changes as you clean, because changing N changes interpretation.\r\n\r\n<strong>2) Descriptives and visuals<\/strong>\r\nSummarize distributions for IV, DV, and key covariates. Make at least one plot matched to the question type (scatterplot, boxplot, bar chart with uncertainty). The plot should reflect your design choice, meaning it should clearly indicate whether you are describing a single-wave association or a Time 1 to Time 2 relationship.\r\n\r\n<strong>3) Primary model (H1)<\/strong>\r\nFit the unadjusted association: <strong>DV ~ IV<\/strong>. 
Save the estimate, uncertainty (CI if available), p-value if used, and model N.\r\n\r\n<strong>4) Confounding checks and adjusted model<\/strong>\r\nFit the adjusted model: <strong>DV ~ IV + confounders<\/strong> (and baseline DV if longitudinal). Compare unadjusted vs adjusted IV estimates and describe what changed.\r\n\r\n<strong>5) Sensitivity \/ robustness (optional, lightweight)<\/strong>\r\nRun one extra check only, such as an alternative operationalization or one additional covariate set. Label anything beyond the primary plan as exploratory.\r\n<h3 data-start=\"622\" data-end=\"699\">4.E. Assumptions and validity checks (for t-tests, ANOVA, and correlations)<\/h3>\r\n<p data-start=\"701\" data-end=\"981\">Every statistical test makes assumptions. You do not need perfection, but you do need to know when a result might be fragile. In this capstone, your checks should match the tests you actually use, which are group comparisons (t-tests\/ANOVA) and simple associations (correlations).<\/p>\r\n<p data-start=\"983\" data-end=\"1009\">Minimal checks to include:<\/p>\r\n\r\n<ul data-start=\"1011\" data-end=\"1699\">\r\n \t<li data-start=\"1011\" data-end=\"1103\">\r\n<p data-start=\"1013\" data-end=\"1103\"><strong data-start=\"1013\" data-end=\"1026\">Outliers:<\/strong> check for extreme values that could drive group differences or correlations.<\/p>\r\n<\/li>\r\n \t<li data-start=\"1104\" data-end=\"1216\">\r\n<p data-start=\"1106\" data-end=\"1216\"><strong data-start=\"1106\" data-end=\"1129\">Distribution shape:<\/strong> use a histogram or boxplot to see skew and heavy tails (especially for small samples).<\/p>\r\n<\/li>\r\n \t<li data-start=\"1217\" data-end=\"1347\">\r\n<p data-start=\"1219\" data-end=\"1347\"><strong data-start=\"1219\" data-end=\"1263\">Equal variances (for group comparisons):<\/strong> compare group spreads; if variances differ, use <strong data-start=\"1312\" data-end=\"1330\">Welch\u2019s t-test<\/strong> when 
applicable.<\/p>\r\n<\/li>\r\n \t<li data-start=\"1348\" data-end=\"1468\">\r\n<p data-start=\"1350\" data-end=\"1468\"><strong data-start=\"1350\" data-end=\"1366\">Group sizes:<\/strong> note when one group is much smaller than another, since this can affect stability and interpretation.<\/p>\r\n<\/li>\r\n \t<li data-start=\"1469\" data-end=\"1556\">\r\n<p data-start=\"1471\" data-end=\"1556\"><strong data-start=\"1471\" data-end=\"1488\">Independence:<\/strong> note clustering (site, school, family) as a limitation if relevant.<\/p>\r\n<\/li>\r\n \t<li data-start=\"1557\" data-end=\"1699\">\r\n<p data-start=\"1559\" data-end=\"1699\"><strong data-start=\"1559\" data-end=\"1580\">For correlations:<\/strong> inspect a scatterplot to ensure the relationship is not being driven by a single outlier or a weird nonlinear pattern.<\/p>\r\n<\/li>\r\n<\/ul>\r\n<p data-start=\"1701\" data-end=\"1751\">If a check raises concerns, document what you did:<\/p>\r\n\r\n<ul data-start=\"1753\" data-end=\"1951\">\r\n \t<li data-start=\"1753\" data-end=\"1810\">\r\n<p data-start=\"1755\" data-end=\"1810\">use a more robust option (for example, Welch\u2019s t-test),<\/p>\r\n<\/li>\r\n \t<li data-start=\"1811\" data-end=\"1890\">\r\n<p data-start=\"1813\" data-end=\"1890\">rerun after a clearly justified recode or exclusion rule (with transparency),<\/p>\r\n<\/li>\r\n \t<li data-start=\"1891\" data-end=\"1951\">\r\n<p data-start=\"1893\" data-end=\"1951\">or keep the analysis but interpret cautiously and say why.<\/p>\r\n<\/li>\r\n<\/ul>\r\n<h3>4.F. Multiple testing and transparency rules (so results mean something)<\/h3>\r\nWhen you run many tests, the chance of a false positive increases. This is basic probability, not a character flaw. 
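A quick simulation illustrates the inflation, and what a correction buys you. SciPy is assumed here, and every test below compares two groups drawn from the same null distribution, so any "significant" result is a false positive by construction:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_studies, n_tests = 1000, 20

false_alarm = 0    # studies with at least one uncorrected p < .05
survives_bonf = 0  # studies with at least one p < .05 / n_tests
for _ in range(n_studies):
    # 20 independent t-tests where the null hypothesis is true for every test
    ps = np.array([
        stats.ttest_ind(rng.normal(size=30), rng.normal(size=30)).pvalue
        for _ in range(n_tests)
    ])
    false_alarm += ps.min() < 0.05
    survives_bonf += ps.min() < 0.05 / n_tests  # Bonferroni-corrected threshold

print(false_alarm / n_studies)    # near 1 - 0.95**20, i.e. about 0.64
print(survives_bonf / n_studies)  # pulled back near 0.05
```

With 20 truly null tests, roughly two out of three simulated "studies" produce at least one p &lt; .05, while the Bonferroni threshold brings the family-wise false-alarm rate back near 5%.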
The fix is to keep your primary analysis tight, label exploratory work honestly, and use a correction when you are doing many related comparisons.\r\n\r\nGuardrails to build into your plan:\r\n<ul>\r\n \t<li>Limit your primary test set in advance (one primary DV or one primary model).<\/li>\r\n \t<li>Label analyses as <strong>confirmatory<\/strong> (planned) versus <strong>exploratory<\/strong> (hypothesis-generating).<\/li>\r\n \t<li>If you run many related tests, use a correction strategy and name it:\r\n<ul>\r\n \t<li><strong>Bonferroni<\/strong> (strict), or<\/li>\r\n \t<li><strong>False Discovery Rate (FDR)<\/strong> control (common in multi-test settings).<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ul>\r\nA simple reporting norm is to put the correction rule in Methods and keep the interpretation conservative, especially for exploratory results.\r\n<h3>4.G. Interpretation guardrails (ethics + uncertainty)<\/h3>\r\nYour interpretation should match your design, your analytic choices, and your uncertainty. Start by reporting what you actually analyzed. 
State the final analytic sample size used in each key test and explain why it changed, because missing data, exclusions, and recoding decisions are part of the meaning of the study, not just technical details.\r\n<ul>\r\n \t<li><strong>Statistical uncertainty:<\/strong>\r\n<ul>\r\n \t<li>Emphasize effect sizes and the stability of the pattern shown in the figure.<\/li>\r\n \t<li>Do not treat a single p-value threshold as a truth machine.<\/li>\r\n \t<li>If available, report uncertainty information such as confidence intervals or standard errors.<\/li>\r\n \t<li>Do not let uncertainty metrics replace substantive interpretation of the effect.<\/li>\r\n<\/ul>\r\n<\/li>\r\n \t<li><strong>Causal uncertainty:<\/strong>\r\n<ul>\r\n \t<li>Because the dataset is observational, associations may reflect alternative explanations.<\/li>\r\n \t<li>Keep verbs aligned to the inference hierarchy.<\/li>\r\n \t<li>\u201cAssociated with,\u201d \u201cdiffers from,\u201d and \u201cpredicts\u201d (when Time 1 precedes Time 2) are usually appropriate.<\/li>\r\n \t<li>Avoid \u201ccauses\u201d when describing these observational analyses.<\/li>\r\n<\/ul>\r\n<\/li>\r\n \t<li><strong>Generalizability:<\/strong>\r\n<ul>\r\n \t<li>Be explicit about what your findings do and do not generalize to.<\/li>\r\n \t<li>Note that the dataset is synthetic and context-specific.<\/li>\r\n \t<li>Acknowledge the wave structure spanning age 16 to age 21 when discussing scope and limits.<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ul>\r\nFinally, apply an ethical lens to interpretation. Avoid stigmatizing language when describing group differences, and focus on mechanisms, context, and uncertainty. A strong capstone conclusion reads as careful and credible because it is honest about limits, transparent about decisions, and disciplined about claims.\r\n<h2>5. What a scientific poster is (and what it is not)<\/h2>\r\nSection 5 shows how to turn the outputs from Section 4 into the required poster simulation components. 
A scientific poster is a conference communication format built for speed and interaction. In most poster sessions, people are moving, scanning, and deciding quickly what is worth a closer look. A good poster is designed to be readable at a glance and useful during conversation. It functions as a visual aid for the short spoken explanations you give when someone stops at your poster.\r\n\r\nThat is why posters exist alongside papers and talks. A paper is built for depth and permanence. It can include full methods, nuance, and detailed analysis, and readers engage with it over time. A talk is built for a guided story delivered to a captive audience in a fixed time slot. A poster sits in between. It is a \u201csnapshot\u201d of the work that helps you engage colleagues in dialogue, get feedback, and spark follow-up discussions. <a href=\"https:\/\/www.training.nih.gov\/creating-a-scientific-poster\/\">NIH\u2019s guidance<\/a> makes the same point operationally: you should be able to deliver a short verbal explanation of the work to people who \u201cattend\u201d your poster session.\r\n\r\nYour poster should communicate one central claim that matches your inference level, supported by one to two figures that carry the evidence. Everything else is supporting material that helps a reader understand what you did and why it matters. If you try to fit three different research stories on one poster, you usually end up with a crowded wall of text that is hard to scan and even harder to discuss. The poster is not a full paper shrunk down. 
It is an intentionally distilled story that invites questions.\r\n\r\n[caption id=\"attachment_295\" align=\"aligncenter\" width=\"1024\"]<img class=\"wp-image-295 size-large\" src=\"https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-2-2026-04_55_48-PM-1024x683.png\" alt=\"Flat vector infographic showing a three-column scientific poster layout on the left with labeled sections including Title, Background, Research Question and Hypotheses, Methods, Results with three figure placeholders, Discussion, Limitations, Conclusions, and a footer. Arrows point to seven callout boxes on the right that briefly explain the purpose of each poster section.\" width=\"1024\" height=\"683\" \/> Flat vector infographic illustrating the standard layout of a scientific poster, with arrows linking each section to concise explanations of its purpose. The visual emphasizes results as the core evidence and presents the poster as a structured progression from research question to interpretation. Created with generative AI.[\/caption]\r\n\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3>The DSARM Poster Simulation assignment (what you are building)<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\nIn this course, you are not creating a full conference poster. You are building the core elements of a scientific poster as a slide-based poster simulation. That is deliberate. 
It keeps the focus on the fundamentals of dissemination, meaning telling a clear research story with evidence, while avoiding the distractions of advanced poster software, print formatting, and layout micro-decisions.\r\n\r\nYour slide deck maps directly onto standard poster sections:\r\n<ul>\r\n \t<li><strong>Title and Abstract:<\/strong> the top of the poster, which tells the reader what the project is and why it matters.<\/li>\r\n \t<li><strong>Research Question and Hypotheses:<\/strong> the \u201cwhat are we testing\u201d section.<\/li>\r\n \t<li><strong>Participants and Measures:<\/strong> the essential methods content needed to interpret the results.<\/li>\r\n \t<li><strong>Results slides for H1, H2, H3:<\/strong> three result claims, each supported by one visualization and a short caption.<\/li>\r\n \t<li><strong>Discussion and Conclusion:<\/strong> interpretation, limitations, and what the findings imply.<\/li>\r\n \t<li><strong>Citations:<\/strong> credit and traceability.<\/li>\r\n \t<li><strong>AI Use Attestation:<\/strong> a transparency statement about how you worked.<\/li>\r\n<\/ul>\r\nOne benefit of this structure is that it trains you to separate roles: a slide is not a place to dump everything you did. Each slide has a job, and the whole deck functions like a poster session conversation.\r\n\r\n<\/div>\r\n<\/div>\r\n&nbsp;\r\n<h3>5.A. Why posters matter for scientific research dissemination<\/h3>\r\nPosters matter because they let researchers share new work quickly, visually, and at high volume in conference settings. They are designed for fast scanning plus conversation, so they help an audience grasp the research question, approach, and main result in a short amount of time.\r\n\r\nPosters also function as a structured \u201cargument test.\u201d Space constraints force you to make choices explicit. You have to state the question clearly, define the variables, describe the design, and show the key evidence. 
That constraint is a feature.\r\n\r\nPoster sessions are also a feedback engine. Researchers routinely use posters to get real-time critique that improves a project before it becomes a manuscript or a formal talk. In practice, the best poster conversations often revolve around measurement choices, alternative explanations, and what the findings do and do not justify. This is why posters are a staple in research training.\r\n\r\nFinally, posters help bridge expert and non-expert audiences. A well-designed figure and a plain-language caption can communicate a finding more accessibly than a dense methods section. This matters for dissemination because research does not only live in journals. It also moves through labs, departments, conferences, and community-facing spaces, and posters are one of the most common formats for that movement.\r\n<h3>5.B. Poster anatomy as a narrative arc<\/h3>\r\nScientific posters are structured stories, not templates. A template can help with layout, but it cannot tell you what the story is. The story is the sequence of ideas that a reader can follow in a single pass, even if they only give you 30 seconds. The poster format rewards clarity because it is designed for quick scanning and short conversations, not for long reading. A simple, reliable arc is: <strong>background \u2192 research question \u2192 methods \u2192 results \u2192 interpretation \u2192 limitations.<\/strong>\r\n\r\nWhat people look for in 30 seconds is different from what they ask in conversation. In a fast scan, readers typically look for (1) the title, (2) the question, (3) one clear figure, and (4) a bottom-line statement. In conversation, they usually ask about design choices and credibility: why this question, what variables and waves, what you controlled for, how you handled alternative explanations, and what remains uncertain.\r\n<h3>5.C. 
Figures and captions as the core of the poster<\/h3>\r\nFigures are the heart of a poster because they communicate patterns faster than text. The best posters have one figure per claim, not many weak figures that compete for attention. If you have three claims, you should have three figures. That matches your H1, H2, H3 structure nicely.\r\n\r\nCaptions should be short and functional. A good caption answers three questions in one to two sentences: what is plotted, who is included, and which variables and waves are shown. Captions should not be mini-discussions. They should orient the reader so they can interpret the figure correctly.\r\n\r\nFor this assignment, do not report p-values in captions. Instead, focus on what the figure shows in plain language: direction, magnitude, uncertainty when available, and sample size. If you want to communicate statistical support, the caption can mention confidence intervals or describe the size of the estimated relationship, but keep it simple.\r\n\r\nBasic readability norms matter more than students expect. Axes should be labeled clearly, units should be included when relevant, legends should be readable, and variable names should be consistent with the rest of the deck. If the reader cannot interpret the plot in five seconds, the plot is not doing its job.\r\n<h3>5.D. Poster readiness: the 2-minute walkthrough<\/h3>\r\nA poster simulation works best when you can explain it out loud in a tight, two-minute story. 
The simplest structure mirrors the deck: start with the question, summarize the design and measures, walk through the three results slides, then give a careful conclusion with limitations.\r\n\r\nA practical pacing model:\r\n<ul>\r\n \t<li>20 seconds: background and research question<\/li>\r\n \t<li>20 seconds: dataset and measures<\/li>\r\n \t<li>60 seconds: results, one sentence per figure (H1, H2, H3)<\/li>\r\n \t<li>20 seconds: conclusion and what remains uncertain<\/li>\r\n<\/ul>\r\nPrepare for three predictable questions.\r\n\r\n<strong>\u201cWhat did you find?\u201d<\/strong>\r\nAnswer with one sentence that matches the inference tier, then point to the single most important figure.\r\n\r\n<strong>\u201cHow did you test confounding?\u201d<\/strong>\r\nAnswer by describing the unadjusted versus adjusted comparison. Mention which covariate(s) you used and what changed in the IV estimate.\r\n\r\n<strong>\u201cWhat is still uncertain?\u201d<\/strong>\r\nName the biggest remaining alternative explanation, limitation, or generalizability constraint.\r\n\r\nEnd with one reproducibility sentence you can say out loud, such as: \u201cAll figures and model outputs in this deck were generated from my analysis code, and the saved outputs and figure files are stored in my project folders so the results can be traced and reproduced.\u201d","rendered":"<h2 data-start=\"0\" data-end=\"31\">Reading Objectives<\/h2>\n<ul>\n<li data-start=\"0\" data-end=\"197\">\n<p data-start=\"2\" data-end=\"197\"><strong data-start=\"2\" data-end=\"45\" data-is-only-node=\"\">Study Design and Inference:<\/strong> Use the inference hierarchy to explain what DSARM analyses can and cannot claim, and apply responsible language that matches an observational design.<\/p>\n<\/li>\n<li data-start=\"199\" data-end=\"390\">\n<p data-start=\"201\" data-end=\"390\"><strong data-start=\"201\" data-end=\"249\" data-is-only-node=\"\">Confounding and Causal Pathways:<\/strong> Define confounding, apply the 
H1\u2013H3 screening approach to evaluate a plausible third variable, and distinguish confounders from mediators.<\/p>\n<\/li>\n<li data-start=\"392\" data-end=\"631\">\n<p data-start=\"394\" data-end=\"631\"><strong data-start=\"394\" data-end=\"438\" data-is-only-node=\"\">Capstone Analysis Blueprint:<\/strong> Develop and carry out a defensible analysis plan using the DSARM wave structure (age 16 and age 21) and the course\u2019s allowed tests, including basic validity checks and transparent reporting.<\/p>\n<\/li>\n<li data-start=\"633\" data-end=\"875\" data-is-last-node=\"\">\n<p data-start=\"635\" data-end=\"875\" data-is-last-node=\"\"><strong data-start=\"635\" data-end=\"687\" data-is-only-node=\"\">Poster Simulation and Dissemination:<\/strong> Translate results into the required poster simulation components, emphasizing figure-centered communication, clear captions, and conclusions that reflect uncertainty and inference limits.<\/p>\n<\/li>\n<\/ul>\n<h2>Key Terms<\/h2>\n<ul>\n<li data-start=\"1403\" data-end=\"1545\">\n<p data-start=\"1405\" data-end=\"1545\"><strong data-start=\"1405\" data-end=\"1424\">Confounder (C):<\/strong> A third variable associated with both the IV and DV that can bias the observed IV\u2013DV association if not accounted for.<\/p>\n<\/li>\n<li data-start=\"1547\" data-end=\"1694\">\n<p data-start=\"1549\" data-end=\"1694\"><strong data-start=\"1549\" data-end=\"1570\">Confounding bias:<\/strong> Distortion of an IV\u2013DV association due to a confounder, making an effect look larger, smaller, or present when it is not.<\/p>\n<\/li>\n<li data-start=\"1696\" data-end=\"1843\">\n<p data-start=\"1698\" data-end=\"1843\"><strong data-start=\"1698\" data-end=\"1711\">Mediator:<\/strong> A variable on the causal pathway between IV and DV; controlling for a mediator can remove part of the mechanism linking IV to DV.<\/p>\n<\/li>\n<\/ul>\n<h2 data-start=\"0\" data-end=\"31\">1. 
Introduction: Capstone Project Overview<\/h2>\n<p data-start=\"33\" data-end=\"262\">Module 10 is the capstone for DSARM. The goal is to integrate what you have learned about study design, observational inference, confounding, basic hypothesis testing, and responsible communication into one coherent mini-project.<\/p>\n<p data-start=\"264\" data-end=\"648\">Your final deliverable is a <strong data-start=\"292\" data-end=\"313\">poster simulation<\/strong>, meaning you will build the core components of a scientific poster as a slide deck (not a full conference poster). This format is intentional. It trains you to communicate one clear research story, supported by well-labeled figures and careful interpretation, while staying honest about what observational data can and cannot justify.<\/p>\n<p data-start=\"650\" data-end=\"1131\">To complete the capstone, you will work with a synthetic ABCD-style dataset. You can access the dataset in the <a href=\"https:\/\/github.com\/jl2578\/dsarm-codespaces\">course GitHub repository<\/a>. It is designed to feel like a real, large-scale longitudinal dataset while remaining safe for teaching. 
You will choose a research question, identify one main predictor (IV) and one main outcome (DV), include at least one plausible third variable to consider as an alternative explanation, and then use the course\u2019s allowed tests to produce figures and results that you can explain clearly.<\/p>\n<table>\n<caption><strong>Table 1.<\/strong> DSARM synthetic dataset overview and variable shopping checklist.<\/caption>\n<thead>\n<tr>\n<th style=\"text-align: left\">DSARM synthetic dataset<\/th>\n<th style=\"text-align: left\">Start with the data dictionary<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"vertical-align: top\">\n<ul>\n<li><strong>Time 1:<\/strong> age 16 wave<\/li>\n<li><strong>Time 2:<\/strong> age 21 wave<\/li>\n<li><strong>Longitudinal structure:<\/strong> the same individuals are measured twice; most projects will use Time 1, Time 2, or a Time 2 minus Time 1 difference score<\/li>\n<li><strong>Data types:<\/strong> survey, behavioral measures, imaging ROI variables, substance use, environment<\/li>\n<li><strong>Note:<\/strong> no twin subsample in this dataset<\/li>\n<\/ul>\n<\/td>\n<td style=\"vertical-align: top\">\n<ul>\n<li>Pick <strong>one main DV<\/strong>, <strong>one IV<\/strong>, and <strong>at least one plausible third variable (C)<\/strong> to evaluate alternative explanations<\/li>\n<li>Check levels of measurement and feasibility with the <strong>allowed course tests<\/strong><\/li>\n<li>Confirm which wave(s) each variable comes from and whether a difference score is appropriate for your question<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2 data-start=\"111\" data-end=\"186\">2. Study Design and Levels of Inference (Refresher \u2192 Inference Hierarchy)<\/h2>\n<p data-start=\"221\" data-end=\"494\">Your conclusions can only be as strong as your design. 
This section offers a quick refresher on study design (introduced in Module 2) to orient you toward the inference hierarchy you will use to frame your data analyses.<\/p>\n<h3 data-start=\"496\" data-end=\"546\">2.A. Experimental vs. Observational Study Designs<\/h3>\n<ul data-start=\"548\" data-end=\"1539\">\n<li data-start=\"548\" data-end=\"980\">\n<p data-start=\"374\" data-end=\"883\"><strong data-start=\"374\" data-end=\"423\">Experimental design (best for causal claims).<\/strong> In an experimental study, researchers <strong data-start=\"462\" data-end=\"472\">assign<\/strong> an exposure or intervention to participants, ideally using <strong data-start=\"532\" data-end=\"553\">random assignment<\/strong>. Random assignment aims to make groups comparable at baseline, so any later differences in outcomes are less likely to be \u201cfalse alarms\u201d caused by pre-existing group differences. When an experiment is well-executed, outcome differences can be more credibly linked to the intervention itself.<\/p>\n<\/li>\n<li data-start=\"548\" data-end=\"980\">\n<p data-start=\"374\" data-end=\"883\"><strong data-start=\"885\" data-end=\"936\">Observational design (what ABCD and DSARM are).<\/strong> In an observational study, researchers <strong data-start=\"976\" data-end=\"1010\">measure what naturally happens<\/strong> without assigning exposure. This design can reveal real-world patterns and associations, but it is also vulnerable to <strong data-start=\"1129\" data-end=\"1157\">alternative explanations<\/strong>. People who differ on the exposure often differ on other characteristics too, which can make an association look causal when it is actually driven by hidden baseline differences or \u201cthird-variable\u201d influences. That is why observational findings require more cautious interpretation.<\/p>\n<\/li>\n<\/ul>\n<h3 data-start=\"1541\" data-end=\"1617\">2.B. Within observational: Cross\u2011sectional vs. 
Longitudinal<\/h3>\n<ul>\n<li><strong data-start=\"1540\" data-end=\"1582\">Cross-sectional studies (single wave).<\/strong> Cross-sectional designs measure exposure and outcome at the same time. They provide a snapshot of relationships in a population, but they generally cannot establish temporal order. If X and Y are measured simultaneously, it is hard to know whether X preceded Y, Y preceded X, or whether both reflect other background factors.<\/li>\n<\/ul>\n<ul>\n<li data-start=\"1540\" data-end=\"1946\"><strong data-start=\"1948\" data-end=\"1990\">Longitudinal studies (multiple waves).<\/strong> Longitudinal designs measure the same individuals repeatedly over time. This can strengthen inference because it helps establish temporal order (X measured before Y) and allows you to study change. Still, time ordering alone does not guarantee causation, because third-variable influences can still shape both the predictor and the outcome over time.<\/li>\n<\/ul>\n<h3 data-start=\"2381\" data-end=\"2456\">2.C. Design features that strengthen inference (the bridge to the hierarchy)<\/h3>\n<p data-start=\"2458\" data-end=\"2522\">Certain design features move a study closer to causal inference:<\/p>\n<ul data-start=\"2524\" data-end=\"3430\">\n<li data-start=\"2524\" data-end=\"2693\">\n<p data-start=\"2526\" data-end=\"2693\"><strong data-start=\"2526\" data-end=\"2547\">Random assignment<\/strong> reduces systematic baseline differences between groups, which reduces bias from hidden group differences.<\/p>\n<\/li>\n<li data-start=\"2694\" data-end=\"3006\">\n<p data-start=\"2696\" data-end=\"3006\"><strong data-start=\"2696\" data-end=\"2729\">Natural and quasi-experiments<\/strong> exploit external events that create \u201cas-if random\u201d exposure. 
A classic example is Oregon\u2019s Medicaid lottery, which allowed researchers to compare lottery-selected vs non-selected groups in a way that closely resembles random assignment.<\/p>\n<\/li>\n<li data-start=\"3007\" data-end=\"3187\">\n<p data-start=\"3009\" data-end=\"3187\"><strong data-start=\"3009\" data-end=\"3026\">Time ordering<\/strong> (longitudinal data) helps answer \u201cwhat came first,\u201d but it still does not remove all alternative explanations by itself.<\/p>\n<\/li>\n<li data-start=\"3188\" data-end=\"3430\">\n<p data-start=\"3190\" data-end=\"3430\"><strong data-start=\"3190\" data-end=\"3239\">Better comparisons and statistical adjustment<\/strong> (clear comparison groups, covariates, and sensitivity checks) can reduce bias from third-variable influences, but they cannot eliminate it completely.<\/p>\n<\/li>\n<\/ul>\n<h3 data-start=\"3432\" data-end=\"3493\">2.D. Inference hierarchy (stronger \u2192 weaker for causal claims)<\/h3>\n<p data-start=\"3495\" data-end=\"3550\">A common hierarchy for causal strength looks like this:<\/p>\n<ol data-start=\"3552\" data-end=\"4299\">\n<li data-start=\"3552\" data-end=\"3672\">\n<p data-start=\"3555\" data-end=\"3672\"><strong data-start=\"3555\" data-end=\"3581\">Randomized experiments<\/strong> (strongest for causal claims when conducted well).<\/p>\n<\/li>\n<li data-start=\"3673\" data-end=\"3828\">\n<p data-start=\"3676\" data-end=\"3828\"><strong data-start=\"3676\" data-end=\"3707\">Natural \/ quasi-experiments<\/strong> (can approach causal inference when \u201cas-if random\u201d assumptions are plausible).<\/p>\n<\/li>\n<li data-start=\"3829\" data-end=\"4026\">\n<p data-start=\"3832\" data-end=\"4026\"><strong data-start=\"3832\" data-end=\"3870\">Longitudinal observational studies<\/strong> (temporal order helps; causal insight is limited and depends on how well alternative explanations are addressed).<\/p>\n<\/li>\n<li data-start=\"4027\" data-end=\"4152\">\n<p data-start=\"4030\" 
data-end=\"4152\"><strong data-start=\"4030\" data-end=\"4071\">Cross-sectional observational studies<\/strong> (association only; direction unclear).<\/p>\n<\/li>\n<li data-start=\"4153\" data-end=\"4299\">\n<p data-start=\"4156\" data-end=\"4299\"><strong data-start=\"4156\" data-end=\"4194\">Case studies \/ descriptive reports<\/strong> (excellent for hypothesis generation, not for causal testing).<\/p>\n<\/li>\n<\/ol>\n<figure id=\"attachment_284\" aria-describedby=\"caption-attachment-284\" style=\"width: 1024px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-284 size-large\" src=\"https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-09_00_20-AM-1024x683.png\" alt=\"Infographic showing a hierarchy of study designs from stronger to weaker causal inference: randomized experiment, natural or quasi-experiment, longitudinal observational, cross-sectional observational, and case study or descriptive. 
A side panel lists responsible language to use (associated with, linked to, correlated with, predicts, is consistent with) and language to avoid (caused, led to, reduced, increased).\" width=\"1024\" height=\"683\" srcset=\"https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-09_00_20-AM-1024x683.png 1024w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-09_00_20-AM-300x200.png 300w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-09_00_20-AM-768x512.png 768w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-09_00_20-AM-65x43.png 65w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-09_00_20-AM-225x150.png 225w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-09_00_20-AM-350x233.png 350w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-09_00_20-AM.png 1536w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption id=\"caption-attachment-284\" class=\"wp-caption-text\">Figure 1. Hierarchy of study designs ranked by strength of causal inference, with guidance on responsible language for interpreting findings. Created with generative AI.<\/figcaption><\/figure>\n<h3 data-start=\"4493\" data-end=\"4540\">2.E. What this means for DSARM capstone projects<\/h3>\n<p data-start=\"4542\" data-end=\"4630\">DSARM capstone projects will fall in <strong data-start=\"4584\" data-end=\"4596\">#3 or #4<\/strong>. 
You can strengthen inference by:<\/p>\n<ul data-start=\"4631\" data-end=\"4930\">\n<li data-start=\"4631\" data-end=\"4705\">\n<p data-start=\"4633\" data-end=\"4705\">using multiple waves when possible (so you can speak to temporal order),<\/p>\n<\/li>\n<li data-start=\"4706\" data-end=\"4765\">\n<p data-start=\"4708\" data-end=\"4765\">defining clear comparison groups (more apples-to-apples),<\/p>\n<\/li>\n<li data-start=\"4766\" data-end=\"4930\">\n<p data-start=\"4768\" data-end=\"4930\">reporting unadjusted vs adjusted models to show whether the association is robust to plausible alternative explanations; if the IV estimate persists after adjustment, your interpretation gains credibility.<\/p>\n<\/li>\n<\/ul>\n<h3 data-start=\"4932\" data-end=\"4994\">2.F. Responsible language (keep your claims inside your design)<\/h3>\n<p data-start=\"4996\" data-end=\"5042\">Your interpretations should match your design:<\/p>\n<ul data-start=\"5044\" data-end=\"5334\">\n<li data-start=\"5044\" data-end=\"5146\">\n<p data-start=\"5046\" data-end=\"5146\">Prefer: \u201cis associated with,\u201d \u201cis linked to,\u201d \u201ccorrelates with,\u201d \u201cpredicts,\u201d \u201cis consistent with.\u201d<\/p>\n<\/li>\n<li data-start=\"5147\" data-end=\"5334\">\n<p data-start=\"5149\" data-end=\"5334\">Avoid causal verbs like \u201ccauses,\u201d \u201cleads to,\u201d \u201creduces,\u201d or \u201cincreases\u201d unless you truly have a randomized or strong quasi-experimental design.<\/p>\n<\/li>\n<\/ul>\n<p data-start=\"5336\" data-end=\"5355\">Example contrast:<\/p>\n<ul data-start=\"5356\" data-end=\"5487\">\n<li data-start=\"5356\" data-end=\"5441\">\n<p data-start=\"5358\" data-end=\"5441\">Appropriate: \u201cHigher baseline screen time predicts lower attention at follow-up.\u201d<\/p>\n<\/li>\n<li data-start=\"5442\" data-end=\"5487\">\n<p 
data-start=\"5444\" data-end=\"5487\">Overreach: \u201cScreen time reduced attention.\u201d<\/p>\n<\/li>\n<\/ul>\n<h2 data-start=\"0\" data-end=\"61\">3. Confounding: Identifying and Addressing Third Variables<\/h2>\n<h3 data-start=\"63\" data-end=\"119\">3.A. Why confounding matters (one motivating example)<\/h3>\n<p data-start=\"121\" data-end=\"955\">Imagine you find a strong pattern in ABCD: adolescents who report frequent energy drink use at Time 1 also report higher anxiety at Time 2. It is very tempting to tell a simple story, such as \u201cenergy drinks cause later anxiety.\u201d The problem is that observational datasets are full of clustered life circumstances, so an apparent relationship can be a \u201cfalse positive\u201d for causation. Teens who drink a lot of energy drinks might also be sleeping less, under more academic pressure, experiencing higher baseline anxiety, living with more family stress, or embedded in peer contexts that increase both energy drink use and anxiety risk. In other words, the association may be real, but the causal explanation may be wrong. This is why observational findings can sound causal even when they are not.<\/p>\n<h3 data-start=\"957\" data-end=\"1015\">3.B. Definition (what a confounder is and what it does)<\/h3>\n<p data-start=\"1017\" data-end=\"1785\">A confounder is a third variable that is not the exposure and not the outcome, but is related to both in a way that biases the association we estimate. A useful way to think about it is \u201cshared causes.\u201d If the exposure and the outcome share a common cause, then part of the observed exposure\u2013outcome relationship can reflect background risk differences rather than an effect of the exposure itself. 
This third-variable bias can make an association look larger than it really is, create an association that is mostly spurious, or even hide and reverse an association, depending on how the shared causes are distributed across the groups you are comparing.<\/p>\n<figure id=\"attachment_286\" aria-describedby=\"caption-attachment-286\" style=\"width: 1024px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-286 size-large\" src=\"https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-04_42_05-PM-1024x683.png\" alt=\"Diagram showing a confounding variable example. Independent variable: social media use. Dependent variable: sleep quality. Confounder: stress level. Arrows show stress level influencing both social media use and sleep quality, and a dashed arrow from social media use to sleep quality. Text notes that a confounder influences both IV and DV and that ignoring the confounder can bias the association.\" width=\"1024\" height=\"683\" srcset=\"https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-04_42_05-PM-1024x683.png 1024w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-04_42_05-PM-300x200.png 300w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-04_42_05-PM-768x512.png 768w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-04_42_05-PM-65x43.png 65w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-04_42_05-PM-225x150.png 225w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-04_42_05-PM-350x233.png 350w, 
https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-04_42_05-PM.png 1536w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption id=\"caption-attachment-286\" class=\"wp-caption-text\">Figure 2. Illustration of a confounder (stress level) influencing both the independent variable (social media use) and the dependent variable (sleep quality), demonstrating how failing to adjust for confounding can bias observed associations in observational research.<\/figcaption><\/figure>\n<p data-start=\"1787\" data-end=\"2589\">This is one reason randomized experiments sit at the top of most inference hierarchies. Random assignment is designed to balance both known and unknown confounders across groups. Randomization makes it less likely that baseline differences (e.g., biases) explain outcome differences. In observational studies, like ABCD, you do not get that randomization, so you rely on measurement and adjustment. This entails identifying plausible third variables, measuring them well, and testing whether your main association changes when you account for them. Even then, observational work cannot guarantee that every relevant third variable has been measured, so responsible interpretation requires some humility.<\/p>\n<h3>3.C. Common ways researchers test and account for confounding<\/h3>\n<p>In observational research, you cannot rely on random assignment to balance background differences between groups, so you have to reduce bias through analysis choices. A common starting point is <strong>regression adjustment<\/strong>, where you include plausible confounders as covariates so the IV\u2013DV association is estimated while holding those third variables constant. In its simplest form, this looks like a multiple regression model such as DV = \u03b2\u2080 + \u03b2\u2081(IV) + \u03b2\u2082(C) + \u2026 . 
The key diagnostic is what happens to the IV estimate (\u03b2\u2081) once the confounder enters the model. If \u03b2\u2081 shrinks substantially or becomes non-significant, the original association may have been largely explained by C. If \u03b2\u2081 stays similar, the association is more robust to that confounder, though it can still be vulnerable to unmeasured third variables.<\/p>\n<p>To make this logic transparent, researchers typically use <strong>model comparison<\/strong> and report results from both an unadjusted and an adjusted model. Model A estimates DV ~ IV, which describes the raw association. Model B estimates DV ~ IV + C (and possibly additional confounders). You then compare the IV coefficient, uncertainty, and model N across Model A and Model B and describe how the interpretation changes.<\/p>\n<p>Researchers also sometimes use <strong>stratification or matching<\/strong> to make comparisons more apples-to-apples. The idea is to compare high vs. low exposure groups within levels of the confounder, such as comparing high vs. low screen time among youth who have similar sleep duration. This approach can be useful for intuition and for simple visualizations. In this course, regression adjustment will usually be the primary tool.<\/p>\n<p>Finally, when working with longitudinal data, many researchers use <strong>baseline outcome control<\/strong> as an additional guardrail. If you have a baseline measure of the DV, you can model the follow-up DV while controlling for the baseline DV, for example: Time 2 attention ~ Time 1 screen time + Time 1 attention + confounders. This reduces confounding from stable individual differences that influence the outcome and reframes the question as whether the IV predicts change or divergence over time. A short caution is worth remembering. Baseline control can over-control in cases where the IV already influenced the baseline DV, but in most observational follow-up analyses it is a strong default.<\/p>\n<h3>3.D. 
The \u201ccrude\u201d H1\u2013H3 approach<\/h3>\n<p>Before you use regression adjustment, it helps to build intuition for what a confounder looks like in the data. In the capstone, we use a deliberately \u201ccrude\u201d screening approach that asks a simple question: does a third variable C show up in the story on both sides, meaning it is related to the predictor and the outcome? The goal here is not to prove causation or to claim you have identified the one true confounder. The goal is to develop a disciplined habit of checking whether your main IV\u2013DV relationship could plausibly reflect shared background differences rather than a direct effect.<\/p>\n<p>We do this using a small bundle of hypothesis tests (our H1\u2013H3 framework in the capstone). First, you test <strong>H1<\/strong>, the primary association you actually care about, by estimating a simple unadjusted model such as DV ~ IV. Next, you test whether your candidate confounder is tied to the predictor by checking <strong>H2<\/strong> (IV ~ C). Then you test whether the same candidate is tied to the outcome by checking <strong>H3<\/strong> (DV ~ C). If both H2 and H3 are supported, C becomes a plausible confounder because it is associated with both sides of the relationship you are trying to interpret. At that point, your original H1 association may be partly or largely explained by C, even if H1 was statistically significant.<\/p>\n<p>Once you have that basic intuition, you move from \u201cscreening\u201d to \u201caccounting.\u201d You fit an adjusted model such as DV ~ IV + C (and possibly additional confounders) and compare the IV estimate in the adjusted model to the IV estimate in the unadjusted model. If the IV effect shrinks a lot, that suggests the unadjusted association was strongly sensitive to C. If it stays similar, the association is more robust to that particular alternative explanation. 
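<\/p>\n<p>In code, this screen-then-adjust loop is only a few model fits. The sketch below is a minimal illustration, not the official course workflow: it assumes Python with the statsmodels formula interface, and the column names dv, iv, and c are made-up placeholders rather than actual DSARM variable names. The data are simulated so that c genuinely drives both the predictor and the outcome.<\/p>

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate a confounded dataset: c influences both iv and dv,
# so the unadjusted iv estimate should come out inflated.
rng = np.random.default_rng(0)
n = 500
c = rng.normal(size=n)                          # candidate confounder
iv = 0.6 * c + rng.normal(size=n)               # exposure, partly driven by c
dv = 0.2 * iv + 0.5 * c + rng.normal(size=n)    # outcome, driven by iv and c
df = pd.DataFrame({'dv': dv, 'iv': iv, 'c': c})

h1 = smf.ols('dv ~ iv', data=df).fit()        # H1: primary association (unadjusted)
h2 = smf.ols('iv ~ c', data=df).fit()         # H2: is c tied to the predictor?
h3 = smf.ols('dv ~ c', data=df).fit()         # H3: is c tied to the outcome?
adj = smf.ols('dv ~ iv + c', data=df).fit()   # adjusted model: DV ~ IV + C

b_unadj = h1.params['iv']
b_adj = adj.params['iv']
print(f'unadjusted iv estimate: {b_unadj:.2f}')
print(f'adjusted iv estimate:   {b_adj:.2f}')
```

<p>Because the simulation builds c into both iv and dv, the unadjusted iv estimate is noticeably larger than the adjusted one, which recovers something close to the simulated effect. That is exactly the "shrinks substantially" pattern described above.<\/p>\n<p>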
This simple workflow is not the final word on confounding, but it is a reliable first step that will keep your interpretations honest and your methods transparent.<\/p>\n<h3>3.E. Confounders vs. mediators (do not block the mechanism by accident)<\/h3>\n<p>Not every third variable belongs in your \u201ccontrol variables\u201d list. Some variables are true confounders, meaning they sit outside the relationship you care about and create background differences that can bias the IV\u2013DV association. Others are mediators, meaning they are part of the causal pathway through which the IV exerts its influence on the DV. The distinction matters because controlling for a mediator can accidentally remove the very process you are trying to understand.<\/p>\n<p>A <strong>confounder<\/strong> is an outside cause that influences both the IV and the DV. If you do not account for it, you risk attributing an association to the IV when it may actually reflect shared background causes. In contrast, a <strong>mediator<\/strong> is a step in the chain from IV to DV. If your research question is about the total relationship between IV and DV, controlling for the mediator can \u201cblock\u201d the pathway and make the IV look less important than it truly is.<\/p>\n<figure id=\"attachment_288\" aria-describedby=\"caption-attachment-288\" style=\"width: 1024px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-288 size-large\" src=\"https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-05_36_51-PM-1024x575.png\" alt=\"Side-by-side diagram comparing a confounder and a mediator. Left panel shows a confounder (shared risk context such as stress or neighborhood exposure) influencing both peer substance use (IV) and personal substance use (DV), creating potential bias. 
Right panel shows a mediator (attitudes toward drugs) lying on the pathway between peers\u2019 substance use (IV) and personal substance use (DV), with a note that controlling for the mediator blocks part of the causal effect.\" width=\"1024\" height=\"575\" srcset=\"https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-05_36_51-PM-1024x575.png 1024w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-05_36_51-PM-300x168.png 300w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-05_36_51-PM-768x431.png 768w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-05_36_51-PM-65x36.png 65w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-05_36_51-PM-225x126.png 225w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-05_36_51-PM-350x196.png 350w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-1-2026-05_36_51-PM.png 1456w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption id=\"caption-attachment-288\" class=\"wp-caption-text\">Figure 3. Diagram illustrating the difference between a confounder and a mediator: a confounder influences both the independent and dependent variables and can bias associations if unaccounted for, whereas a mediator lies on the causal pathway and represents part of the mechanism linking the independent variable to the outcome. Created with generative AI.<\/figcaption><\/figure>\n<p>Here is a quick example. Suppose your IV is peer substance use and your DV is your own substance use. A plausible mediator is your attitudes toward drugs. Peers can shape attitudes, and attitudes can shape behavior. 
If you control for attitudes in a model, you may be removing part of the mechanism by which peers influence use. That is not wrong in every situation, but it changes the question you are answering. Instead of estimating the total association between peers and your own use, you are estimating what is left after stripping out the attitude pathway.<\/p>\n<h2>4. Capstone Analysis Blueprint (From Research Question to Results)<\/h2>\n<p>The capstone deliverable of this course is a poster simulation slide deck that contains the core poster components. This section focuses on producing the results that will populate those slides. Because our synthetic ABCD dataset is observational, your main job is to (1) make your analysis plan explicit before you run models, (2) run a small set of primary tests that match your design, and (3) communicate what your results do and do not justify.<\/p>\n<h3>4.A. Define the research question and variables (operationalize early)<\/h3>\n<p>Start by writing your research question in one sentence in a way that can be answered with DSARM variables. 
Because DSARM includes <strong>Time 1 (age 16)<\/strong> and <strong>Time 2 (age 21)<\/strong>, you can frame questions that use one wave or both waves, and you can treat the five-year span as meaningful when it fits your topic.<\/p>\n<p>Here are flexible one-sentence templates you can use:<\/p>\n<ul>\n<li><strong>Cross-sectional (single wave):<\/strong> \u201cAt age 16 (Time 1), is X associated with Y?\u201d or \u201cAt age 21 (Time 2), is X associated with Y?\u201d<\/li>\n<li><strong>Prospective prediction (two waves):<\/strong> \u201cDoes X at age 16 (Time 1) predict Y at age 21 (Time 2)?\u201d<\/li>\n<li><strong>Change over time (two waves):<\/strong> \u201cDoes X at age 16 (Time 1) predict change in Y from age 16 to age 21?\u201d<\/li>\n<li><strong>Co-change (requires X and Y at both waves):<\/strong> \u201cDo changes in X from age 16 to 21 track with changes in Y over the same period?\u201d<\/li>\n<li><strong>Group differences in change:<\/strong> \u201cDo groups defined at Time 1 (for example high vs low X) differ in how much Y changes from 16 to 21?\u201d<\/li>\n<\/ul>\n<p>Once you have the question, convert it into concrete analytic ingredients:<\/p>\n<ul>\n<li>Identify your <strong>IV<\/strong> (predictor\/exposure) and <strong>DV<\/strong> (outcome).<\/li>\n<li>List your planned <strong>covariates<\/strong>, especially variables you will adjust for as alternative explanations.<\/li>\n<li>Confirm how each variable is measured (continuous, binary, ordinal, categorical).<\/li>\n<li>Write down exactly which wave each variable comes from: <strong>Time 1 (age 16)<\/strong> and\/or <strong>Time 2 (age 21)<\/strong>.<\/li>\n<li>Decide whether you need derived variables (sum scores, composites, or change scores like Time 2 minus Time 1), and write down the exact recipe.<\/li>\n<\/ul>\n<p>A practical rule: if you cannot state exactly how the DV is measured and which wave it comes from, you do not yet have an analysis plan.<\/p>\n<h3>4.B. 
Choose design and timeframe (match the question)<\/h3>\n<p>Now that you have defined your IV, DV, covariates, and wave(s), choose a design that matches that question and record the inference limits that come with it. DSARM and ABCD are <strong>observational<\/strong> datasets. Exposures are measured rather than assigned, so your conclusions should be framed as <strong>associations<\/strong> or <strong>predictions<\/strong>, not causal effects.<\/p>\n<p>Your main design choice is straightforward:<\/p>\n<ul>\n<li><strong>Cross-sectional<\/strong> means you are analyzing a single wave (Time 1 at age 16, or Time 2 at age 21). This supports clear descriptive and associative claims at that age, but it does not establish direction.<\/li>\n<li><strong>Longitudinal (two-wave)<\/strong> means you are using both waves across the five-year span. This supports statements about temporal ordering, such as whether Time 1 predicts Time 2, but it still does not prove causation.<\/li>\n<\/ul>\n<p>Concrete decisions to record up front:<\/p>\n<ul>\n<li>Is this <strong>cross-sectional<\/strong> (single wave) or <strong>two-wave longitudinal<\/strong> (Time 1 \u2192 Time 2)?<\/li>\n<li>If two-wave longitudinal, what is the exact ordering (for example, <strong>Time 1 predictor \u2192 Time 2 outcome<\/strong>)?<\/li>\n<li>Will you include <strong>baseline outcome control<\/strong> when available (for example, modeling the Time 2 outcome while controlling the Time 1 level of the same outcome)? If yes, state why, since it changes the interpretation toward change over time rather than simple prediction.<\/li>\n<\/ul>\n<h3>4.C. Build a reproducible project plan before running models<\/h3>\n<p>In research, \u201creproducible\u201d means a reader can trace every statistic and finding in your results back to a saved output that was generated by code. This is not busywork. It is how you prevent results from drifting as you revise notebooks, rerun cells, or update plots. 
Equally important, this is how other researchers verify your findings.<\/p>\n<p>Set up the structure before you analyze:<\/p>\n<ul>\n<li>Create a clear folder structure (for example: <code>data_raw\/<\/code>, <code>data_clean\/<\/code>, <code>scripts\/<\/code>, <code>outputs\/<\/code>, <code>figures\/<\/code>, <code>poster\/<\/code>).<\/li>\n<li>Use consistent filenames that record wave(s), variable set, and version\/date (for example: <code>t1_age16_screen_attention_v1.csv<\/code>).<\/li>\n<li>Keep an analysis log (a short markdown file is fine) that records:\n<ul>\n<li>variables and waves used (and why),<\/li>\n<li>cleaning rules and exclusions,<\/li>\n<li>derived-variable recipes,<\/li>\n<li>model formulas you ran,<\/li>\n<li>final analytic N for each model.<\/li>\n<\/ul>\n<\/li>\n<li>Use a \u201cno handcrafted figures\u201d rule. Every plot should be generated from code and saved with a stable filename.<\/li>\n<\/ul>\n<h3>4.D. Execute the analysis sequence (what you actually run)<\/h3>\n<p>This is the core workflow. It is intentionally simple so you can do it well and explain it clearly.<\/p>\n<p><strong>1) Data audit and cleaning plan<\/strong><br \/>\nConfirm coding, missingness rules, and exclusions. Track sample size changes as you clean, because changing N changes interpretation.<\/p>\n<p><strong>2) Descriptives and visuals<\/strong><br \/>\nSummarize distributions for IV, DV, and key covariates. Make at least one plot matched to the question type (scatterplot, boxplot, bar chart with uncertainty). The plot should reflect your design choice, meaning it should clearly indicate whether you are describing a single-wave association or a Time 1 to Time 2 relationship.<\/p>\n<p><strong>3) Primary model (H1)<\/strong><br \/>\nFit the unadjusted association: <strong>DV ~ IV<\/strong>. 
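<\/p>\n<p>Fitting that unadjusted model and capturing everything you will report can be sketched as follows. This is a hedged illustration, assuming Python with statsmodels; the columns iv and dv are hypothetical stand-ins for your cleaned analytic file, and the simulated data exist only so the snippet runs end to end.<\/p>

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Placeholder data standing in for the cleaned analytic file;
# 'iv' and 'dv' are hypothetical column names, not DSARM variables.
rng = np.random.default_rng(1)
df = pd.DataFrame({'iv': rng.normal(size=300)})
df['dv'] = 0.3 * df['iv'] + rng.normal(size=300)

m = smf.ols('dv ~ iv', data=df).fit()       # primary model H1: DV ~ IV
est = m.params['iv']                        # point estimate for the IV
ci_low, ci_high = m.conf_int().loc['iv']    # 95% confidence interval
p = m.pvalues['iv']                         # p-value for the IV term
n = int(m.nobs)                             # model N (analytic sample size)

print(f'b = {est:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}], p = {p:.3g}, N = {n}')
```

<p>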
Save the estimate, uncertainty (CI if available), p-value if used, and model N.<\/p>\n<p><strong>4) Confounding checks and adjusted model<\/strong><br \/>\nFit the adjusted model: <strong>DV ~ IV + confounders<\/strong> (and baseline DV if longitudinal). Compare unadjusted vs adjusted IV estimates and describe what changed.<\/p>\n<p><strong>5) Sensitivity \/ robustness (optional, lightweight)<\/strong><br \/>\nRun one extra check only, such as an alternative operationalization or one additional covariate set. Label anything beyond the primary plan as exploratory.<\/p>\n<h3 data-start=\"622\" data-end=\"699\">4.E. Assumptions and validity checks (for t-tests, ANOVA, and correlations)<\/h3>\n<p data-start=\"701\" data-end=\"981\">Every statistical test makes assumptions. You do not need perfection, but you do need to know when a result might be fragile. In this capstone, your checks should match the tests you actually use, which are group comparisons (t-tests\/ANOVA) and simple associations (correlations).<\/p>\n<p data-start=\"983\" data-end=\"1009\">Minimal checks to include:<\/p>\n<ul data-start=\"1011\" data-end=\"1699\">\n<li data-start=\"1011\" data-end=\"1103\">\n<p data-start=\"1013\" data-end=\"1103\"><strong data-start=\"1013\" data-end=\"1026\">Outliers:<\/strong> check for extreme values that could drive group differences or correlations.<\/p>\n<\/li>\n<li data-start=\"1104\" data-end=\"1216\">\n<p data-start=\"1106\" data-end=\"1216\"><strong data-start=\"1106\" data-end=\"1129\">Distribution shape:<\/strong> use a histogram or boxplot to see skew and heavy tails (especially for small samples).<\/p>\n<\/li>\n<li data-start=\"1217\" data-end=\"1347\">\n<p data-start=\"1219\" data-end=\"1347\"><strong data-start=\"1219\" data-end=\"1263\">Equal variances (for group comparisons):<\/strong> compare group spreads; if variances differ, use <strong data-start=\"1312\" data-end=\"1330\">Welch\u2019s t-test<\/strong> when applicable.<\/p>\n<\/li>\n<li 
data-start=\"1348\" data-end=\"1468\">\n<p data-start=\"1350\" data-end=\"1468\"><strong data-start=\"1350\" data-end=\"1366\">Group sizes:<\/strong> note when one group is much smaller than another, since this can affect stability and interpretation.<\/p>\n<\/li>\n<li data-start=\"1469\" data-end=\"1556\">\n<p data-start=\"1471\" data-end=\"1556\"><strong data-start=\"1471\" data-end=\"1488\">Independence:<\/strong> note clustering (site, school, family) as a limitation if relevant.<\/p>\n<\/li>\n<li data-start=\"1557\" data-end=\"1699\">\n<p data-start=\"1559\" data-end=\"1699\"><strong data-start=\"1559\" data-end=\"1580\">For correlations:<\/strong> inspect a scatterplot to ensure the relationship is not being driven by a single outlier or a weird nonlinear pattern.<\/p>\n<\/li>\n<\/ul>\n<p data-start=\"1701\" data-end=\"1751\">If a check raises concerns, document what you did:<\/p>\n<ul data-start=\"1753\" data-end=\"1951\">\n<li data-start=\"1753\" data-end=\"1810\">\n<p data-start=\"1755\" data-end=\"1810\">use a more robust option (for example, Welch\u2019s t-test),<\/p>\n<\/li>\n<li data-start=\"1811\" data-end=\"1890\">\n<p data-start=\"1813\" data-end=\"1890\">rerun after a clearly justified recode or exclusion rule (with transparency),<\/p>\n<\/li>\n<li data-start=\"1891\" data-end=\"1951\">\n<p data-start=\"1893\" data-end=\"1951\">or keep the analysis but interpret cautiously and say why.<\/p>\n<\/li>\n<\/ul>\n<h3>4.F. Multiple testing and transparency rules (so results mean something)<\/h3>\n<p>When you run many tests, the chance of a false positive increases. This is basic probability, not a character flaw. 
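<\/p>\n<p>You can verify that arithmetic directly. For m independent tests each run at alpha = .05, the chance of at least one false positive is 1 - (1 - alpha) to the power m, which a few lines of plain Python make concrete (no data or libraries required):<\/p>

```python
# Family-wise false-positive risk for m independent tests at alpha = .05:
# P(at least one false positive) = 1 - (1 - alpha) ** m
alpha = 0.05
for m in (1, 5, 10, 20):
    risk = 1 - (1 - alpha) ** m
    print(f'{m:2d} tests -> {risk:.0%} chance of at least one false positive')
    # roughly: 1 test -> 5%, 5 -> 23%, 10 -> 40%, 20 -> 64%
```

<p>By ten related tests the family-wise risk has already passed 40 percent, which is why a tight primary test set and a named correction strategy matter.<\/p>\n<p>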
The fix is to keep your primary analysis tight, label exploratory work honestly, and use a correction when you are doing many related comparisons.<\/p>\n<p>Guardrails to build into your plan:<\/p>\n<ul>\n<li>Pre-limit your primary test set (one primary DV or one primary model).<\/li>\n<li>Label analyses as <strong>confirmatory<\/strong> (planned) versus <strong>exploratory<\/strong> (hypothesis-generating).<\/li>\n<li>If you run many related tests, use a correction strategy and name it:\n<ul>\n<li><strong>Bonferroni<\/strong> (strict), or<\/li>\n<li><strong>False Discovery Rate (FDR)<\/strong> control (common in multi-test settings).<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>A simple reporting norm is to put the correction rule in Methods and keep the interpretation conservative, especially for exploratory results.<\/p>\n<h3>4.G. Interpretation guardrails (ethics + uncertainty)<\/h3>\n<p>Your interpretation should match your design, your analytic choices, and your uncertainty. Start by reporting what you actually analyzed. 
State the final analytic sample size used in each key test and explain why it changed, because missing data, exclusions, and recoding decisions are part of the meaning of the study, not just technical details.<\/p>\n<ul>\n<li><strong>Statistical uncertainty:<\/strong>\n<ul>\n<li>Emphasize effect sizes and the stability of the pattern shown in the figure.<\/li>\n<li>Do not treat a single p-value threshold as a truth machine.<\/li>\n<li>If available, report uncertainty information such as confidence intervals or standard errors.<\/li>\n<li>Do not let uncertainty metrics replace substantive interpretation of the effect.<\/li>\n<\/ul>\n<\/li>\n<li><strong>Causal uncertainty:<\/strong>\n<ul>\n<li>Because the dataset is observational, associations may reflect alternative explanations.<\/li>\n<li>Keep verbs aligned to the inference hierarchy.<\/li>\n<li>\u201cAssociated with,\u201d \u201cdiffers from,\u201d and \u201cpredicts\u201d (when Time 1 precedes Time 2) are usually appropriate.<\/li>\n<li>Avoid using \u201ccauses\u201d for these observational analyses.<\/li>\n<\/ul>\n<\/li>\n<li><strong>Generalizability:<\/strong>\n<ul>\n<li>Be explicit about what your findings do and do not generalize to.<\/li>\n<li>Note that the dataset is synthetic and context-specific.<\/li>\n<li>Acknowledge the wave structure spanning age 16 to age 21 when discussing scope and limits.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>Finally, apply an ethical lens to interpretation. Avoid stigmatizing language when describing group differences, and focus on mechanisms, context, and uncertainty. A strong capstone conclusion reads as careful and credible because it is honest about limits, transparent about decisions, and disciplined about claims.<\/p>\n<h2>5. What a scientific poster is (and what it is not)<\/h2>\n<p>Section 5 shows how to turn the outputs from Section 4 into the required poster simulation components. A scientific poster is a conference communication format built for speed and interaction. 
In most poster sessions, people are moving, scanning, and deciding quickly what is worth a closer look. A good poster is designed to be readable at a glance and useful during conversation. It functions as a visual aid for the short spoken explanations you give when someone stops at your poster.<\/p>\n<p>That is why posters exist alongside papers and talks. A paper is built for depth and permanence. It can include full methods, nuance, and detailed analysis, and readers engage with it over time. A talk is built for a guided story delivered to a captive audience in a fixed time slot. A poster sits in between. It is a \u201csnapshot\u201d of the work that helps you engage colleagues in dialogue, get feedback, and spark follow-up discussions. <a href=\"https:\/\/www.training.nih.gov\/creating-a-scientific-poster\/\">NIH\u2019s guidance<\/a> makes the same point operationally: you should be able to deliver a short verbal explanation of the work to people who \u201cattend\u201d your poster session.<\/p>\n<p>Your poster should communicate one central claim that matches your inference level, supported by one to two figures that carry the evidence. Everything else is supporting material that helps a reader understand what you did and why it matters. If you try to fit three different research stories on one poster, you usually end up with a crowded wall of text that is hard to scan and even harder to discuss. The poster is not a full paper shrunk down. 
It is an intentionally distilled story that invites questions.<\/p>\n<figure id=\"attachment_295\" aria-describedby=\"caption-attachment-295\" style=\"width: 1024px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-295 size-large\" src=\"https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-2-2026-04_55_48-PM-1024x683.png\" alt=\"Flat vector infographic showing a three-column scientific poster layout on the left with labeled sections including Title, Background, Research Question and Hypotheses, Methods, Results with three figure placeholders, Discussion, Limitations, Conclusions, and a footer. Arrows point to seven callout boxes on the right that briefly explain the purpose of each poster section.\" width=\"1024\" height=\"683\" srcset=\"https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-2-2026-04_55_48-PM-1024x683.png 1024w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-2-2026-04_55_48-PM-300x200.png 300w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-2-2026-04_55_48-PM-768x512.png 768w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-2-2026-04_55_48-PM-65x43.png 65w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-2-2026-04_55_48-PM-225x150.png 225w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-2-2026-04_55_48-PM-350x233.png 350w, https:\/\/openpub.libraries.rutgers.edu:443\/wp-content\/uploads\/sites\/28\/2026\/03\/ChatGPT-Image-Mar-2-2026-04_55_48-PM.png 1536w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption id=\"caption-attachment-295\" class=\"wp-caption-text\">Flat vector infographic illustrating the 
standard layout of a scientific poster, with arrows linking each section to concise explanations of its purpose. The visual emphasizes results as the core evidence and presents the poster as a structured progression from research question to interpretation. Created with generative AI.<\/figcaption><\/figure>\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3>The DSARM Poster Simulation assignment (what you are building)<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p>In this course, you are not creating a full conference poster. You are building the core elements of a scientific poster as a slide-based poster simulation. That is deliberate. It keeps the focus on the fundamentals of dissemination, meaning telling a clear research story with evidence, while avoiding the distractions of advanced poster software, print formatting, and layout micro-decisions.<\/p>\n<p>Your slide deck maps directly onto standard poster sections:<\/p>\n<ul>\n<li><strong>Title and Abstract:<\/strong> the top of the poster, which tells the reader what the project is and why it matters.<\/li>\n<li><strong>Research Question and Hypotheses:<\/strong> the \u201cwhat are we testing\u201d section.<\/li>\n<li><strong>Participants and Measures:<\/strong> the essential methods content needed to interpret the results.<\/li>\n<li><strong>Results slides for H1, H2, H3:<\/strong> three result claims, each supported by one visualization and a short caption.<\/li>\n<li><strong>Discussion and Conclusion:<\/strong> interpretation, limitations, and what the findings imply.<\/li>\n<li><strong>Citations:<\/strong> credit and traceability.<\/li>\n<li><strong>AI Use Attestation:<\/strong> a transparency statement about how you worked.<\/li>\n<\/ul>\n<p>One benefit of this structure is that it trains you to separate roles: a slide is not a place to dump everything you did. 
Each slide has a job, and the whole deck functions like a poster session conversation.<\/p>\n<\/div>\n<\/div>\n<h3>5.A. Why posters matter for scientific research dissemination<\/h3>\n<p>Posters matter because they let researchers share new work quickly, visually, and at high volume in conference settings. They are designed for fast scanning plus conversation, so they help an audience grasp the research question, approach, and main result in a short amount of time.<\/p>\n<p>Posters also function as a structured \u201cargument test.\u201d Space constraints force you to make choices explicit. You have to state the question clearly, define the variables, describe the design, and show the key evidence. That constraint is a feature.<\/p>\n<p>Poster sessions are also a feedback engine. Researchers routinely use posters to get real-time critique that improves a project before it becomes a manuscript or a formal talk. In practice, the best poster conversations often revolve around measurement choices, alternative explanations, and what the findings do and do not justify. This is why posters are a staple in research training.<\/p>\n<p>Finally, posters help bridge expert and non-expert audiences. A well-designed figure and a plain-language caption can communicate a finding more accessibly than a dense methods section. This matters for dissemination because research does not only live in journals. It also moves through labs, departments, conferences, and community-facing spaces, and posters are one of the most common formats for that movement.<\/p>\n<h3>5.B. Poster anatomy as a narrative arc<\/h3>\n<p>Scientific posters are structured stories, not templates. A template can help with layout, but it cannot tell you what the story is. The story is the sequence of ideas that a reader can follow in a single pass, even if they only give you 30 seconds.
The poster format rewards clarity because it is designed for quick scanning and short conversations, not for long reading. A simple, reliable arc is: <strong>background \u2192 research question \u2192 methods \u2192 results \u2192 interpretation \u2192 limitations.<\/strong><\/p>\n<p>What people look for in 30 seconds is different from what they ask in conversation. In a fast scan, readers typically look for (1) the title, (2) the question, (3) one clear figure, and (4) a bottom-line statement. In conversation, they usually ask about design choices and credibility: why this question, what variables and waves, what you controlled for, how you handled alternative explanations, and what remains uncertain.<\/p>\n<h3>5.C. Figures and captions as the core of the poster<\/h3>\n<p>Figures are the heart of a poster because they communicate patterns faster than text. The best posters have one figure per claim, not many weak figures that compete for attention. If you have three claims, you should have three figures. That matches your H1, H2, H3 structure nicely.<\/p>\n<p>Captions should be short and functional. A good caption answers three questions in one to two sentences: what is plotted, who is included, and which variables and waves are shown. Captions should not be mini-discussions. They should orient the reader so they can interpret the figure correctly.<\/p>\n<p>For this assignment, do not report p-values in captions. Instead, focus on what the figure shows in plain language: direction, magnitude, uncertainty when available, and sample size. If you want to communicate statistical support, the caption can mention confidence intervals or describe the size of the estimated relationship, but keep it simple.<\/p>\n<p>Basic readability norms matter more than students expect. Axes should be labeled clearly, units should be included when relevant, legends should be readable, and variable names should be consistent with the rest of the deck. 
If the reader cannot interpret the plot in five seconds, the plot is not doing its job.<\/p>\n<h3>5.D. Poster readiness: the 2-minute walkthrough<\/h3>\n<p>A poster simulation works best when you can explain it out loud in a tight, two-minute story. The simplest structure mirrors the deck: start with the question, summarize the design and measures, walk through the three results slides, then give a careful conclusion with limitations.<\/p>\n<p>A practical pacing model:<\/p>\n<ul>\n<li>20 seconds: background and research question<\/li>\n<li>20 seconds: dataset and measures<\/li>\n<li>60 seconds: results, one sentence per figure (H1, H2, H3)<\/li>\n<li>20 seconds: conclusion and what remains uncertain<\/li>\n<\/ul>\n<p>Prepare for three predictable questions.<\/p>\n<p><strong>\u201cWhat did you find?\u201d<\/strong><br \/>\nAnswer with one sentence that matches your study\u2019s tier in the inference hierarchy, then point to the single most important figure.<\/p>\n<p><strong>\u201cHow did you test confounding?\u201d<\/strong><br \/>\nAnswer by describing the unadjusted versus adjusted comparison.
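To make that comparison concrete, here is a minimal, hedged sketch in Python using simulated data. The variable names (iv, dv, conf) and effect sizes are illustrative assumptions, not DSARM variables: because the confounder drives both the IV and the DV, the unadjusted slope overstates the IV effect, and adding the confounder as a covariate pulls the estimate back toward the true value.

```python
# Hedged sketch (not the course's official analysis code): simulate a
# confounded IV-DV relationship, then compare unadjusted vs adjusted slopes.
import numpy as np

rng = np.random.default_rng(0)
n = 2000

conf = rng.normal(size=n)                         # confounder C
iv = 0.7 * conf + rng.normal(size=n)              # IV partly driven by C
dv = 0.3 * iv + 0.6 * conf + rng.normal(size=n)   # true IV effect = 0.3

def ols_slope(y, *predictors):
    """Least-squares slope of y on the first predictor, holding the rest constant."""
    X = np.column_stack([np.ones_like(y), *predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]  # coefficient on the first predictor

unadjusted = ols_slope(dv, iv)        # biased upward because C is omitted
adjusted = ols_slope(dv, iv, conf)    # closer to the true 0.3

print(f"unadjusted IV slope: {unadjusted:.2f}")
print(f"adjusted IV slope:   {adjusted:.2f}")
```

In this simulation the two slopes differ noticeably, and reporting both numbers, plus the covariate that moved them, is exactly the comparison the question is asking about.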
Mention which covariate(s) you used and what changed in the IV estimate.<\/p>\n<p><strong>\u201cWhat is still uncertain?\u201d<\/strong><br \/>\nName the biggest remaining alternative explanation, limitation, or generalizability constraint.<\/p>\n<p>End with one reproducibility sentence you can say out loud, such as: \u201cAll figures and model outputs in this deck were generated from my analysis code, and the saved outputs and figure files are stored in my project folders so the results can be traced and reproduced.\u201d<\/p>\n","protected":false},"author":30,"menu_order":1,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-281","chapter","type-chapter","status-publish","hentry"],"part":279,"_links":{"self":[{"href":"https:\/\/openpub.libraries.rutgers.edu\/dsarm12\/wp-json\/pressbooks\/v2\/chapters\/281","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/openpub.libraries.rutgers.edu\/dsarm12\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/openpub.libraries.rutgers.edu\/dsarm12\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/openpub.libraries.rutgers.edu\/dsarm12\/wp-json\/wp\/v2\/users\/30"}],"version-history":[{"count":15,"href":"https:\/\/openpub.libraries.rutgers.edu\/dsarm12\/wp-json\/pressbooks\/v2\/chapters\/281\/revisions"}],"predecessor-version":[{"id":317,"href":"https:\/\/openpub.libraries.rutgers.edu\/dsarm12\/wp-json\/pressbooks\/v2\/chapters\/281\/revisions\/317"}],"part":[{"href":"https:\/\/openpub.libraries.rutgers.edu\/dsarm12\/wp-json\/pressbooks\/v2\/parts\/279"}],"metadata":[{"href":"https:\/\/openpub.libraries.rutgers.edu\/dsarm12\/wp-json\/pressbooks\/v2\/chapters\/281\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/openpub.libraries.rutgers.edu\/dsarm12\/wp-json\/wp\/v2\/media?parent=281"}],"wp:term":[{"taxonomy":"chapter-type","embed
dable":true,"href":"https:\/\/openpub.libraries.rutgers.edu\/dsarm12\/wp-json\/pressbooks\/v2\/chapter-type?post=281"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/openpub.libraries.rutgers.edu\/dsarm12\/wp-json\/wp\/v2\/contributor?post=281"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/openpub.libraries.rutgers.edu\/dsarm12\/wp-json\/wp\/v2\/license?post=281"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}