Potential outcomes
The pair of outcomes a unit would show with and without treatment, Y(1) and Y(0). Causal effects are contrasts between them (Splawa-Neyman 1923; Rubin 1974).
65 terms from the course, defined for development practitioners who design, read and commission impact evidence.
The outcome that did not happen: how a treated unit would have fared untreated, or vice versa. The thing every causal claim must reconstruct.
For any unit you observe only one potential outcome, never both, so an individual effect can never be measured directly (Holland 1986). A missing-data problem, not a sample-size one.
The difference a treatment makes to an outcome: τ = Y(1) − Y(0) for a unit, or an average over a group.
Average treatment effect: the mean of Y(1) − Y(0) over the whole population — the effect of treating everyone versus no one.
Average treatment effect on the treated: the mean effect among those who actually received treatment. Usually what “did the programme work?” means.
Average treatment effect on the untreated: the effect treatment would have had on those who did not get it. Relevant to expansion decisions.
The quantity you are trying to estimate (e.g. the ATT), named before any method or data. Doing so first is the discipline of design-based inference.
Stable Unit Treatment Value Assumption: one unit's outcome is unaffected by others' treatment (no interference) and treatment means one well-defined thing (no hidden versions) (Rubin 1980).
Y = D·Y(1) + (1−D)·Y(0): the observed outcome is whichever potential outcome the treatment switch selects; the other half is the counterfactual.
The gap between how the treated would have fared untreated and how the untreated actually fared. Why a naive treated-minus-untreated comparison is not the effect.
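A small worked illustration may help tie these entries together. The sketch below uses a hypothetical six-unit table in which both potential outcomes are written down, something real data never allow, and checks that the naive treated-minus-untreated gap equals the ATT plus selection bias. All numbers and variable names are invented for illustration.

```python
# Hypothetical units: (d, y0, y1) = (treated?, outcome if untreated, outcome if treated).
# Seeing both y0 and y1 is only possible because the data are made up.
units = [
    (1, 10, 14),
    (1, 12, 15),
    (1, 11, 16),
    (0, 6, 9),
    (0, 7, 9),
    (0, 8, 12),
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Switching equation: Y = D*Y(1) + (1-D)*Y(0) picks the outcome we actually observe.
observed = [(d, d * y1 + (1 - d) * y0) for d, y0, y1 in units]

# Naive comparison: mean observed outcome of the treated minus that of the untreated.
naive = mean(y for d, y in observed if d == 1) - mean(y for d, y in observed if d == 0)

# ATT: mean of Y(1) - Y(0) among the treated.
att = mean(y1 - y0 for d, y0, y1 in units if d == 1)

# Selection bias: gap in untreated outcomes Y(0) between treated and untreated units.
selection_bias = mean(y0 for d, y0, _ in units if d == 1) - mean(
    y0 for d, y0, _ in units if d == 0
)

print(naive, att, selection_bias)
```

In this invented table the naive gap is 8 while the ATT is 4: the other 4 is selection bias, because the treated units would have fared better even without treatment.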
Whether an effect could be recovered with infinite data, given the assumptions. A question about design, prior to estimation.
Also ignorability: treatment is as good as randomly assigned once you condition on observed covariates, {Y(1),Y(0)} ⊥ D | X. The key assumption behind matching and regression.
For every covariate value both treated and untreated units exist, so like can be compared with like. Without it, effects rest on extrapolation.
A variable is uncorrelated with the error term, so its variation is unrelated to unobserved drivers of the outcome.
An instrument affects the outcome only through the treatment, never directly. Untestable, and the heart of every IV argument.
A common cause of both treatment and outcome that, left unadjusted, biases the comparison.
The specific condition a design needs to be true for its comparison to recover a causal effect — parallel trends for DiD, continuity for RDD, and so on.
Same as unconfoundedness: once X is held fixed, treatment is independent of the potential outcomes.
“Other things equal” — the ideal a causal comparison approximates by holding confounders fixed.
Directed acyclic graph: arrows encode assumed causal directions among variables, with no cycles. A tool for reasoning about what to control for (Pearl 2009).
A variable that causes both treatment and outcome; a backdoor path you must block by conditioning.
A variable caused by two others (X → C ← Y). Conditioning on it opens a spurious path and creates bias.
A variable on the causal path from treatment to outcome. Controlling for it removes part of the very effect you want to measure.
A non-causal path from treatment to outcome through a common cause. Blocking all of them identifies the effect.
A graphical rule: condition on a set of variables that blocks every backdoor path, opens no collider and includes no descendant of the treatment, and the effect is identified.
A variable that is a mediator or collider; controlling for it introduces bias rather than removing it (Angrist & Pischke, ch. 3).
The graph rule that determines whether two variables are independent given a conditioning set.
Treatment assigned by chance, so treated and control groups are comparable in expectation on everything, observed and not. The benchmark design.
Assigning treatment by a chance device. It makes potential outcomes independent of treatment, neutralising selection bias by design.
Randomising within groups to guarantee balance on key covariates and improve precision.
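One way to implement this is to shuffle units within each group and treat the first half. A minimal sketch, assuming hypothetical village labels, a single stratifying variable (region) and a fixed seed so the draw can be reproduced; the function name and data are illustrative, not taken from the course.

```python
import random


def stratified_assignment(units, stratum_of, seed=2024):
    """Randomly assign half of each stratum to treatment (1), the rest to control (0)."""
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of[unit], []).append(unit)

    assignment = {}
    for members in strata.values():
        shuffled = list(members)
        rng.shuffle(shuffled)
        n_treated = len(shuffled) // 2  # odd-sized strata get one extra control
        for position, unit in enumerate(shuffled):
            assignment[unit] = 1 if position < n_treated else 0
    return assignment


# Hypothetical villages and the region each belongs to.
region = {
    "A": "north", "B": "north", "C": "north", "D": "north",
    "E": "south", "F": "south", "G": "south", "H": "south",
}
assignment = stratified_assignment(list(region), region)

# Count treated villages per region to confirm balance on the stratifying variable.
treated_by_region = {}
for village, arm in assignment.items():
    treated_by_region[region[village]] = treated_by_region.get(region[village], 0) + arm
print(treated_by_region)
```

Which villages end up treated depends on the seed, but every region contributes exactly half of its villages to each arm, which is the balance guarantee the definition describes.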
Estimating effects by pairing treated units with untreated units of similar covariates, then comparing outcomes.
The probability of treatment given covariates, P(D=1|X). Matching or weighting on it removes confounding under unconfoundedness (Rosenbaum & Rubin 1983).
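Reweighting each unit by the inverse of the probability of the treatment it actually received, 1/e(X) for treated units and 1/(1 − e(X)) for untreated ones, where e(X) is the propensity score. The result is a synthetic sample in which treatment is unconfounded.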
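The reweighting can be sketched on a toy data set. The example below assumes the propensity score is known for every unit (in practice it has to be estimated) and that the only confounder is a binary X; the records and the constant effect of 2 built into them are hypothetical.

```python
# Hypothetical records: (x, d, y, e) with e = P(D=1 | X=x) assumed known.
# Outcomes were constructed as Y = 2 + 4*x + 2*d, so the true effect is 2.
records = [
    (0, 1, 4.0, 0.25),
    (0, 0, 2.0, 0.25),
    (0, 0, 2.0, 0.25),
    (0, 0, 2.0, 0.25),
    (1, 1, 8.0, 0.75),
    (1, 1, 8.0, 0.75),
    (1, 1, 8.0, 0.75),
    (1, 0, 6.0, 0.75),
]
n = len(records)

# Naive comparison of raw group means, confounded by x.
treated = [y for _, d, y, _ in records if d == 1]
untreated = [y for _, d, y, _ in records if d == 0]
naive = sum(treated) / len(treated) - sum(untreated) / len(untreated)

# Inverse-probability-weighted ATE: treated units are weighted by 1/e,
# untreated units by 1/(1 - e), and both sums are divided by the full sample size.
ipw_ate = (
    sum(d * y / e for _, d, y, e in records)
    - sum((1 - d) * y / (1 - e) for _, d, y, e in records)
) / n

print(naive, ipw_ate)
```

The naive gap is 4 because high-X units are both more likely to be treated and better off anyway; weighting recovers the effect of 2 that was built into the data.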
Regressing the outcome on treatment and covariates to net out observed confounders. Identifies the effect only under unconfoundedness.
Using a variable that shifts treatment but affects the outcome only through it, to recover effects when treatment is endogenous.
The standard IV estimator: predict treatment from the instrument, then regress the outcome on the predicted treatment.
Local average treatment effect: the effect IV recovers, for compliers only — those whose treatment is moved by the instrument (Imbens & Angrist 1994).
Units that take treatment if and only if the instrument assigns them to it. IV speaks only about them, not always-takers or never-takers.
Using a cutoff in a running variable that assigns treatment, comparing units just above and below as if randomised (Thistlethwaite & Campbell 1960).
Sharp: treatment switches deterministically at the cutoff. Fuzzy: the cutoff only shifts the probability of treatment, so it is used as an instrument.
The variable that determines treatment at a threshold, and the window around the cutoff used for the comparison.
Comparing the before-after change in a treated group to the change in a control group, differencing out fixed gaps and common trends.
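In the simplest two-group, two-period case the estimate is just arithmetic on four means. A minimal sketch with hypothetical group means; it recovers a causal effect only if the two groups would have moved in parallel without treatment.

```python
# Hypothetical mean outcomes by group and period.
means = {
    ("treated", "before"): 40.0,
    ("treated", "after"): 55.0,
    ("control", "before"): 30.0,
    ("control", "after"): 38.0,
}

# Before-after change within each group removes fixed gaps between the groups.
change_treated = means[("treated", "after")] - means[("treated", "before")]
change_control = means[("control", "after")] - means[("control", "before")]

# Subtracting the control change removes the trend common to both groups.
did = change_treated - change_control

print(change_treated, change_control, did)
```

The treated group improved by 15 and the control group by 8, so the difference-in-differences estimate is 7; the 10-point gap that existed before treatment plays no role.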
DiD's identifying assumption: absent treatment, treated and control groups would have moved in parallel. Supported, never proved, by pre-trends.
A regression with unit and time fixed effects, the common DiD estimator; biased under staggered timing with heterogeneous effects (Goodman-Bacon 2021).
Plotting effects by time relative to treatment, to inspect pre-trends and the dynamics around the intervention.
Building a weighted combination of untreated units that tracks the treated unit before treatment, as a data-driven counterfactual (Abadie et al. 2010).
Using machine learning for the nuisance models with orthogonalisation and sample splitting to estimate effects (Chernozhukov et al. 2018).
Units dropping out before outcomes are measured. If dropout differs between treatment arms, it reintroduces selection bias even in an RCT.
Treatment affecting untreated units through markets or networks, violating SUTVA and contaminating the control group.
Assigned units not taking treatment, or unassigned units taking it, breaking the link between assignment and exposure.
The effect of being assigned to treatment, regardless of take-up. Robust to non-compliance and the policy-relevant number for an offer.
The effect among those who actually complied; recovered from the ITT by scaling for take-up under IV assumptions.
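The scaling step is a single division, often called the Wald estimator: the ITT on the outcome divided by the difference in take-up between assignment arms. A minimal sketch on hypothetical records, assuming the usual IV conditions hold; the data were constructed so that treatment raises the outcome by 2.

```python
# Hypothetical records: (z, d, y) = (assigned to the offer, took treatment, outcome).
records = [
    (1, 1, 6.0), (1, 1, 6.0), (1, 1, 6.0), (1, 0, 4.0),
    (0, 1, 6.0), (0, 0, 4.0), (0, 0, 4.0), (0, 0, 4.0),
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# ITT: difference in mean outcomes by assignment, ignoring actual take-up.
itt = mean(y for z, _, y in records if z == 1) - mean(y for z, _, y in records if z == 0)

# First stage: how much assignment shifts the share actually treated.
take_up_gap = mean(d for z, d, _ in records if z == 1) - mean(
    d for z, d, _ in records if z == 0
)

# Scaling the ITT by the take-up gap gives the effect for compliers.
wald = itt / take_up_gap

print(itt, take_up_gap, wald)
```

Assignment raises take-up by 50 percentage points and the mean outcome by 1, so the effect among compliers is 2, twice the ITT.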
Instruments only loosely related to treatment, which inflate variance and bias IV toward the naive estimate. Diagnosed via the first-stage F-statistic.
Checking for bunching in the running variable at an RDD cutoff, which would signal units sorting across it and break the design.
Applying a design where no effect should exist (a fake cutoff, an untreated period) to check it returns a null.
Whether treated and control groups have similar covariate distributions. A balance table is the first check on any RCT or matching design.
Whether the estimate is unbiased for the population studied. The first thing a design must earn.
Whether a result transports to other populations, places or scales. High internal validity does not guarantee it.
The smallest true effect a study can detect at a chosen power and significance level (conventionally 80% power at the 5% level), set before data collection in a power calculation.
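A rough calculation can be sketched with the standard normal-approximation formula, MDE = (z for 1 − α/2 + z for power) × σ × sqrt(1 / (P(1 − P)N)). This assumes individual-level randomisation, a two-sided test and equal outcome variance in both arms; clustered designs need a further adjustment not shown here. The function name and defaults are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_effect(n, sd, share_treated=0.5, alpha=0.05, power=0.80):
    """Normal-approximation MDE for a two-arm, individually randomised design."""
    z = NormalDist()
    # Critical value for the two-sided test plus the quantile for the desired power.
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    # Standard error of a difference in means with share_treated of n in treatment.
    standard_error = sd * sqrt(1 / (share_treated * (1 - share_treated) * n))
    return multiplier * standard_error


# Outcome measured in standard-deviation units (sd = 1), even split between arms.
mde_400 = minimum_detectable_effect(n=400, sd=1.0)
mde_1600 = minimum_detectable_effect(n=1600, sd=1.0)
print(round(mde_400, 3), round(mde_1600, 3))
```

With 400 units the sketch gives an MDE of roughly 0.28 standard deviations, and quadrupling the sample to 1,600 halves it, which is why small effects are expensive to detect.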
A registered specification of hypotheses and analyses filed before seeing outcomes, to limit specification search.
Effects that vary across subgroups; the conditional average treatment effect describes them, increasingly estimated with causal ML.
When a point estimate needs assumptions you cannot defend, reporting the range the data support instead (Manski).
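The simplest version is the worst-case bound for an outcome with a known range: each unobserved potential outcome is replaced first by the lowest and then by the highest value it could take. A minimal sketch, assuming a binary treatment and an outcome bounded between 0 and 1; the inputs are hypothetical and the function name is illustrative.

```python
def worst_case_ate_bounds(mean_treated, mean_untreated, share_treated,
                          y_min=0.0, y_max=1.0):
    """Bounds on the ATE with no assumption about how treatment was selected."""
    p = share_treated
    # E[Y(1)] is observed for the treated; for the untreated it lies in [y_min, y_max].
    y1_low = mean_treated * p + y_min * (1 - p)
    y1_high = mean_treated * p + y_max * (1 - p)
    # E[Y(0)] is observed for the untreated; for the treated it lies in [y_min, y_max].
    y0_low = mean_untreated * (1 - p) + y_min * p
    y0_high = mean_untreated * (1 - p) + y_max * p
    return y1_low - y0_high, y1_high - y0_low


# Hypothetical inputs: half the sample treated, 75% success among the treated,
# 25% among the untreated.
lower, upper = worst_case_ate_bounds(0.75, 0.25, 0.5)
print(lower, upper)
```

The bounds here run from −0.25 to 0.75. Without further assumptions their width always equals the range of the outcome, so they never rule out a zero effect; narrowing them requires assumptions that have to be defended.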
Trying many models and reporting the most favourable, which invalidates inference. Pre-registration and robustness checks guard against it.
Standard errors that allow correlation within groups (villages, schools), required when treatment is assigned at the cluster level.