Subject Pack . S7 . Interactive

Public Policy Evaluation

Policy evaluation vs programme evaluation, political economy awareness in design, implementation tracking and regulatory impact, and welfare-effect estimation approaches. Walk out with a policy evaluation design brief.

4 modules . ~3 hours . Interactive . India-context

Your Capstone

Policy Evaluation Design Brief

Walk in with a public policy or government scheme. Walk out with an evaluation design brief covering the policy-programme distinction, political economy mapping, implementation tracking, and welfare estimation approach.

Module 1 . ~25 min

Policy evaluation vs programme evaluation

A programme is a bounded intervention with defined beneficiaries, a budget, and a start/end date. A policy is a rule, incentive structure, or institutional arrangement that shapes behaviour across an entire population. Evaluating the two requires different logics.

Key differences

Dimension	Programme evaluation	Policy evaluation
Scope	Bounded: specific geographies, beneficiaries	Universal or near-universal: affects entire populations
Counterfactual	Non-participants or comparison group	Often no untreated group (universal policies); use pre-post, cross-state, or regression discontinuity
Implementation	Controlled by implementing agency	Mediated through bureaucracy, politics, and state capacity
Timeline	2-5 years typical	Effects unfold over decades; political cycles interrupt
Data	Primary collection usually feasible	Often relies on secondary survey and administrative data (NSSO, PLFS, Census, SECC)

Indian policy evaluation landscape

India has a growing but patchy culture of policy evaluation. NITI Aayog's Development Monitoring and Evaluation Office (DMEO) commissions evaluations of centrally sponsored schemes. The Programme Evaluation Organisation (PEO) has been folded into DMEO. State-level evaluation capacity varies enormously -- Kerala and Tamil Nadu have strong traditions; many states have none.

Worked example

PM-KISAN (a Rs 6,000/year cash transfer to farmer families) was implemented nationally in February 2019. Randomisation was not possible because the scheme is a universal entitlement. Evaluations have used: (a) PLFS panel data comparing pre- and post-implementation periods, (b) regression discontinuity at the 2-hectare cutoff (under the initial design, which had a landholding cap), and (c) difference-in-differences across states with faster vs slower DBT rollout. Each design answers a different question and has different limitations.
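
Design (b) can be sketched in a few lines. The example below is a minimal local-linear regression discontinuity on synthetic data, assuming eligibility applies at or below a 2-hectare cutoff. The outcome, the true jump of 3.0, the bandwidth of 0.5 hectares, and the function names are all illustrative assumptions, not figures or methods from any actual PM-KISAN evaluation.

```python
import random


def ols_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rd_estimate(landholding, outcome, cutoff=2.0, bandwidth=0.5):
    """Local-linear RD: fit a line on each side of the cutoff and
    compare the two fitted values at the cutoff itself.
    Eligible (treated) units are those at or below the cutoff."""
    pairs = list(zip(landholding, outcome))
    # Centre the running variable so each intercept is the fitted value at the cutoff.
    eligible = [(x - cutoff, y) for x, y in pairs if cutoff - bandwidth <= x <= cutoff]
    ineligible = [(x - cutoff, y) for x, y in pairs if cutoff < x <= cutoff + bandwidth]
    at_cutoff_eligible, _ = ols_line(*zip(*eligible))
    at_cutoff_ineligible, _ = ols_line(*zip(*ineligible))
    return at_cutoff_eligible - at_cutoff_ineligible


# Synthetic data (hypothetical): outcome rises with landholding,
# plus a jump of TRUE_JUMP for farmers at or below the cutoff.
rng = random.Random(0)
TRUE_JUMP = 3.0
land = [rng.uniform(0.0, 4.0) for _ in range(4000)]
y = [10 + 2 * x + (TRUE_JUMP if x <= 2.0 else 0.0) + rng.gauss(0, 0.5) for x in land]

estimate = rd_estimate(land, y)
print(f"Estimated jump at the cutoff: {estimate:.2f}")
```

The estimate applies only to farmers near the cutoff, which is one reason each design answers a different question. In practice the bandwidth choice and any bunching of reported landholdings around 2 hectares would need separate checks.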

Your Policy Scoping

Fill these for your policy/scheme. Answers flow into the capstone.

Policy or scheme name and context (e.g., "PM-KISAN national cash transfer to farmer families, evaluating effect on input use and yields")

Is this a programme evaluation or a policy evaluation?

Programme evaluation -- bounded, specific beneficiaries
Policy evaluation -- universal rule or incentive structure
Hybrid -- scheme with universal intent but variable implementation

Central evaluation question

Counterfactual strategy: who or what is the comparison, given universal coverage?

Main data source (administrative, secondary survey, or primary collection)

Self-check

The Right to Education Act (2009) mandated free education for all children aged 6-14. How would you design an evaluation of RTE's effect on learning outcomes?

RCT with treatment and control schools

Pre-post comparison using ASER data from 2006-2014, comparing trends across states with different implementation timing and intensity

Survey of teachers asking if RTE improved learning

Cannot be evaluated because it is universal

Answer: the pre-post comparison using ASER data. Universal policies like RTE require pre-post or cross-state comparison designs. ASER provides annual, comparable learning data from 2005 onwards -- a rare asset for policy evaluation. Variation in RTE implementation timing across states creates a natural experiment.

Module 2 . ~30 min

Political economy awareness in design

Policy evaluation is never politically neutral. The findings will be used, misused, or ignored depending on who commissioned the evaluation, when in the political cycle it is released, and which stakeholders benefit from the current policy.

The four political economy questions

Who gains and who loses from the current policy? These stakeholders will support or resist your evaluation findings. Map them explicitly.
Who commissioned the evaluation and why? A DMEO-commissioned evaluation of a flagship scheme has different dynamics than a CSO-commissioned evaluation of the same scheme. Both are valid; both are shaped by their commissioning context.
Where are we in the electoral cycle? Pre-election evaluations of government schemes face pressure to show results. Post-election evaluations of the previous government's schemes face pressure to show failures. Neither is objective; acknowledging this is honest practice.
What decisions can this evaluation actually influence? If the policy is politically untouchable (e.g., a PM's flagship scheme in its first term), an evaluation recommending discontinuation will be ignored. Focus instead on implementation improvements that are politically feasible.

The evaluator's independence

DMEO guidelines require evaluations to be conducted by "independent" agencies. In practice, agencies that produce unfavourable findings are less likely to receive future contracts. This structural incentive biases towards positive findings. Document your quality assurance process explicitly: peer review by named experts, pre-registered analysis plan, data deposited in a public repository.

Your Political Economy Mapping

Map the political economy. These flow into your capstone.

Key stakeholders and their interests

Who commissioned this evaluation and why?

What recommendations are politically feasible?

Your independence safeguards (pre-registration, peer review, data deposit, etc.)

Electoral/political timing considerations

Self-check

You are commissioned to evaluate a state government's flagship nutrition programme 6 months before state elections. What is the most important design decision?

Use the most rigorous method possible (RCT)

Pre-register the analysis plan and methodology before data collection, so findings cannot be selectively reported under political pressure

Delay the evaluation until after elections

Only report positive findings to maintain access

Answer: pre-register the analysis plan and methodology before data collection. Pre-registration protects both the evaluator and the funder. When the analysis plan is public before data collection, selective reporting becomes visible. This is the strongest independence safeguard available.

Module 3 . ~30 min

Implementation tracking + regulatory impact

Most policies fail not because the design is wrong but because implementation breaks down. Lant Pritchett's concept of "capability traps" explains why: India announces ambitious policies that exceed the implementation capacity of the state machinery. Implementation tracking is therefore the most useful form of policy evaluation for improving outcomes.

What to track

Coverage gap -- what proportion of eligible beneficiaries actually receive the benefit? PM-KISAN has ~8.5 crore beneficiaries against ~14.5 crore eligible (as of 2025). This gap of roughly 40% (about 6 crore farmers) is the implementation story.
Leakage and targeting errors -- inclusion errors (non-eligible receiving benefits) and exclusion errors (eligible not receiving). Both matter; they have different political implications.
Administrative bottlenecks -- where does implementation stall? At Aadhaar verification? At DBT transfer? At last-mile delivery? Process maps reveal this.
Regulatory burden -- for regulatory policies (e.g., labour law reform, environmental clearance changes), measure compliance cost and time.
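
The first two items reduce to simple ratios. The sketch below is one way to compute them, assuming you can count eligible households, recipients, and recipients who are actually eligible. The function names and the block-level numbers are hypothetical; only the 8.5 and 14.5 crore figures come from the text above.

```python
def coverage_gap(eligible, eligible_recipients):
    """Share of eligible units that do not receive the benefit."""
    return 1 - eligible_recipients / eligible


def targeting_errors(eligible, recipients, eligible_recipients):
    """Exclusion error: eligible units left out, as a share of all eligible.
    Inclusion error: non-eligible recipients, as a share of all recipients."""
    exclusion = (eligible - eligible_recipients) / eligible
    inclusion = (recipients - eligible_recipients) / recipients
    return {"exclusion_error": exclusion, "inclusion_error": inclusion}


# PM-KISAN figures quoted above (in crore), assuming every
# current beneficiary is genuinely eligible.
pm_kisan_gap = coverage_gap(eligible=14.5, eligible_recipients=8.5)
print(f"PM-KISAN coverage gap: {pm_kisan_gap:.1%}")

# Hypothetical block-level audit: 1,000 eligible households,
# 700 recipients, of whom 600 are eligible.
errors = targeting_errors(eligible=1000, recipients=700, eligible_recipients=600)
print(errors)
```

Note that the two error rates use different denominators, so they cannot be added or compared directly.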

Worked example

MGNREGA implementation tracking: The most informative evaluations of MGNREGA focus not on "did it reduce poverty?" (too distal, too many confounders) but on implementation quality: days provided vs days demanded (demand gap), wage payment delays (measured in days between muster roll closing and DBT credit), social audit coverage, and material-labour ratio. These process metrics directly diagnose implementation health and are actionable at the block level.
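
Two of these process metrics can be computed directly from administrative records. The sketch below assumes each payment record carries a muster roll closing date and a DBT credit date. The record layout, the sample dates, and the 15-day threshold are illustrative assumptions; check the threshold against the current MGNREGA rules before using it.

```python
from datetime import date
from statistics import median


def demand_gap(days_demanded, days_provided):
    """Share of demanded work days that were not provided."""
    return 1 - days_provided / days_demanded


def payment_delays(records):
    """Days between muster roll closing and DBT credit for each record.
    Each record is a (muster_closed, dbt_credited) pair of dates."""
    return [(credited - closed).days for closed, credited in records]


def share_delayed(delays, threshold_days=15):
    """Share of payments credited later than the threshold (assumed 15 days)."""
    return sum(1 for d in delays if d > threshold_days) / len(delays)


# Hypothetical records for one block.
records = [
    (date(2024, 4, 1), date(2024, 4, 10)),
    (date(2024, 4, 1), date(2024, 4, 25)),
    (date(2024, 5, 5), date(2024, 5, 20)),
    (date(2024, 5, 5), date(2024, 6, 14)),
]

delays = payment_delays(records)
print("Delays in days:", delays)
print("Median delay:", median(delays))
print("Share delayed beyond threshold:", share_delayed(delays))
print("Demand gap:", round(demand_gap(days_demanded=100, days_provided=80), 2))
```

Computing these by block and month is what makes them actionable: the same functions can be applied to each geography and compared.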

Your Implementation Tracking Design

Design the implementation analysis. These flow into your capstone.

Coverage analysis: eligible vs actual beneficiaries

Key implementation bottlenecks to investigate

Administrative data sources you will use (e.g., DBT data, PFMS, MGNREGA MIS, PM-KISAN portal, SECC)

Process indicators (3-5 measurable implementation metrics)

Implementation variation across geographies (states, districts)

Self-check

An evaluation of PM-KISAN finds: "The scheme has no significant effect on agricultural productivity." The government dismisses the finding. What might the evaluation have missed?

The sample size was too small

Implementation tracking -- if only 60% of eligible farmers actually receive payments, and payments are delayed by 3+ months, the "treatment" is not what the policy intended; the evaluation measured a diluted version

The comparison group was poorly selected

Agricultural productivity is not the right outcome

Answer: implementation tracking. Evaluating a policy's intent without measuring its implementation is evaluating a theory, not a reality. Implementation tracking (coverage, timeliness, amount) should always accompany impact estimation. A null effect may reflect implementation failure, not design failure.
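
The dilution can be made concrete with a little arithmetic. The sketch below is a minimal illustration, assuming the policy has no effect on eligible farmers who never receive payments. Under that assumption, the average effect across all eligible farmers (the intention-to-treat effect) equals the effect on recipients multiplied by the take-up rate. The numbers and function names are hypothetical.

```python
def diluted_effect(effect_on_recipients, take_up_rate):
    """Average effect across all eligible units when only a share receive the benefit."""
    return effect_on_recipients * take_up_rate


def effect_on_recipients(itt_effect, take_up_rate):
    """Scale an intention-to-treat estimate back up by the take-up rate.
    Valid only if non-recipients are unaffected (no spillovers)."""
    if not 0 < take_up_rate <= 1:
        raise ValueError("take_up_rate must be in (0, 1]")
    return itt_effect / take_up_rate


# Hypothetical: a 5% yield gain among farmers who are actually paid,
# with 60% of eligible farmers receiving payments.
itt = diluted_effect(effect_on_recipients=0.05, take_up_rate=0.6)
print(f"Effect averaged over all eligible farmers: {itt:.3f}")

recovered = effect_on_recipients(itt_effect=itt, take_up_rate=0.6)
print(f"Implied effect on recipients: {recovered:.3f}")
```

A smaller average effect is also harder to detect at a given sample size, so low coverage both shrinks the estimate and reduces statistical power. Payment delays dilute the treatment further in ways this simple scaling does not capture.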

Module 4 . ~25 min

Welfare-effect estimation approaches

When you need to estimate the causal effect of a policy on welfare outcomes, you need a credible identification strategy. India's policy landscape offers several natural experiment opportunities.

Identification strategies for Indian policies

Strategy	When it works	Indian example
Regression discontinuity	Eligibility has a sharp cutoff (BPL score, age, income)	PMAY-G eligibility based on SECC deprivation score
Difference-in-differences	Policy rolled out at different times across states/districts	MGNREGA phased rollout (2006-2008) across districts
Instrumental variables	An instrument affects policy exposure but not outcomes directly	Rainfall as instrument for MGNREGA take-up (demand rises after drought); the exclusion restriction is contestable because rainfall also affects farm incomes directly
Synthetic control	A single treated unit (one state) with many potential comparison units	Gujarat's industrial policy effect compared to synthetic Gujarat

Administrative data for welfare estimation

India's secondary data ecosystem, spanning national surveys and administrative systems, has expanded enormously: NSSO/PLFS (labour), NFHS (health/nutrition), AIDIS (debt/assets), ASI (industry), UDISE+ (education). These datasets are free, nationally representative, and increasingly available as microdata. Use them.

The general equilibrium problem

Large-scale policies create general equilibrium effects that programme evaluations miss. MGNREGA raises rural wages even for non-participants. GST changes relative prices across sectors. PM-KISAN may affect land prices. These spillover effects can be larger than direct effects and require macro-level analysis, not household surveys.
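
A stylised calculation shows why household comparisons miss these effects. The sketch below assumes effects are additive: participants gain a direct effect plus a market-wide spillover, and non-participants in the same labour market gain the spillover only. All numbers and function names are hypothetical.

```python
def participant_gap(direct_effect, spillover):
    """What a participant vs non-participant comparison recovers
    when both groups share the same spillover."""
    participant_gain = direct_effect + spillover
    non_participant_gain = spillover
    return participant_gain - non_participant_gain


def per_capita_gain(direct_effect, spillover, participant_share):
    """Population-average gain, counting the spillover for everyone."""
    return (participant_share * (direct_effect + spillover)
            + (1 - participant_share) * spillover)


# Hypothetical: direct wage gain of 10, spillover of 4, 30% participation.
direct, spill, share = 10.0, 4.0, 0.3

gap = participant_gap(direct, spill)
total = per_capita_gain(direct, spill, share)
survey_based = share * gap  # per-capita gain implied by the household comparison

print("Participant vs non-participant gap:", gap)
print("True per-capita gain:", round(total, 2))
print("Gain implied by household comparison:", round(survey_based, 2))
print("Missed by household comparison:", round(total - survey_based, 2))
```

The comparison recovers the direct effect but differences out the spillover entirely, because the comparison group is itself affected. Comparing treated and untreated markets (districts or states) rather than households within a market is one way to capture it.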

Your Welfare Estimation Plan

Design the estimation strategy. These flow into your capstone.

Primary identification strategy

Welfare outcome you will estimate

Dataset(s) you will use

General equilibrium / spillover effects to consider

Your honesty-test sentence

Self-check

MGNREGA was rolled out in three phases: Phase 1 (200 poorest districts, Feb 2006), Phase 2 (130 districts, 2007), Phase 3 (remaining districts, 2008). What identification strategy does this phased rollout enable?

Difference-in-differences -- Phase 1 districts as treatment, Phase 3 districts as control during 2006-2008, then Phase 3 becomes treated

Regression discontinuity at the poverty cutoff

Randomised controlled trial across phases

Synthetic control for each phase

Answer: difference-in-differences. MGNREGA's phased rollout is the most-used natural experiment in Indian policy evaluation. Phase 1 districts received the programme 2 years before Phase 3, creating a treatment-control window. Dozens of papers use this design (Imbert & Papp 2015, Zimmermann 2020, etc.).
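
The core calculation is a two-by-two comparison of means. The sketch below is a minimal version, assuming district-level outcomes are available before and after Phase 1 rollout. The outcome values, group labels, and function name are hypothetical, and a real analysis would use regression with district and time fixed effects and clustered standard errors.

```python
from statistics import mean


def did_estimate(records, treated="phase1", control="phase3"):
    """2x2 difference-in-differences from (group, period, outcome) records.
    period is 'pre' or 'post' relative to Phase 1 rollout."""
    cells = {}
    for group, period, outcome in records:
        cells.setdefault((group, period), []).append(outcome)
    means = {key: mean(values) for key, values in cells.items()}
    change_treated = means[(treated, "post")] - means[(treated, "pre")]
    change_control = means[(control, "post")] - means[(control, "pre")]
    return change_treated - change_control


# Hypothetical district-level outcomes (e.g., a daily wage index).
records = [
    ("phase1", "pre", 48.0), ("phase1", "pre", 52.0),
    ("phase1", "post", 57.0), ("phase1", "post", 59.0),
    ("phase3", "pre", 59.0), ("phase3", "pre", 61.0),
    ("phase3", "post", 62.0), ("phase3", "post", 64.0),
]

did = did_estimate(records)
print("Difference-in-differences estimate:", did)
```

The estimate is causal only if Phase 1 and Phase 3 districts would have followed parallel trends without the programme. Because Phase 1 targeted the poorest districts, that assumption needs to be tested against pre-2006 trends rather than taken for granted.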

Capstone

Your Policy Evaluation Design Brief

Compile your answers from the four modules into a single policy evaluation design brief.

Done?

Share this brief with a policy researcher before circulating. The most common blind spot is ignoring implementation quality when interpreting impact estimates.
