Institutional change measurement, citizen-experience metrics, before-after with structural break design, and the political economy of evaluation findings. Walk out with a governance evaluation design brief.
4 modules . ~3 hours . Interactive . India-context
Your Capstone
Governance Evaluation Design Brief
Walk in with a governance reform initiative. Walk out with an evaluation design brief covering institutional change measurement, citizen-experience metrics, research design, and political economy mapping.
Module 1 . ~25 min
Institutional change -- what can be measured
Governance reform aims to change how institutions function -- making them more transparent, accountable, responsive, or efficient. These are inherently hard to measure because institutions are complex systems, not bounded interventions.
Four dimensions of institutional change
| Dimension | What it captures | Example indicators | Data source |
|---|---|---|---|
| Rules | Formal rule changes (laws, orders, SOPs) | New SOP adopted, RTI compliance rate, file disposal timeline | Administrative records, gazette notifications |
| Practices | How rules are actually implemented | Average service delivery time, absenteeism rates, meeting frequency | Mystery client visits, direct observation, administrative data |
| Norms | Informal expectations and culture | Corruption perception, trust in institution, perceived responsiveness | Citizen surveys, staff surveys |
| Outcomes | End results for citizens | Service access, grievance resolution, satisfaction | Citizen surveys, service delivery data |
Most governance evaluations measure rules and claim outcomes. The gap between "the SOP was changed" (rules) and "citizens experienced faster service" (outcomes) is where governance reforms succeed or fail. Measuring practices is the critical middle layer.
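One way to catch the "measure rules, claim outcomes" problem early is to tag every planned indicator with its dimension and list the dimensions left uncovered. The sketch below is illustrative only: the `Indicator` class, the `coverage_gaps` helper, and the sample plan are hypothetical names and data, not part of any standard toolkit.

```python
from dataclasses import dataclass

# The four dimensions of institutional change, in framework order.
DIMENSIONS = ("rules", "practices", "norms", "outcomes")


@dataclass(frozen=True)
class Indicator:
    name: str
    dimension: str    # must be one of DIMENSIONS
    data_source: str


def coverage_gaps(indicators):
    """Return the dimensions that have no indicator, in framework order."""
    covered = {ind.dimension for ind in indicators}
    unknown = covered - set(DIMENSIONS)
    if unknown:
        raise ValueError(f"unknown dimension(s): {sorted(unknown)}")
    return [d for d in DIMENSIONS if d not in covered]


# Hypothetical evaluation plan that measures rules and outcomes only.
plan = [
    Indicator("New SOP adopted", "rules", "gazette notifications"),
    Indicator("Grievance resolution rate", "outcomes", "citizen survey"),
]

gaps = coverage_gaps(plan)
print(gaps)  # ['practices', 'norms']
```

A plan that returns "practices" in its gaps is exactly the case described above: it can show that the rule changed and that an outcome moved, but not how the two are connected.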
Worked example
Rajasthan's Bhamashah Yojana digitised social protection benefit delivery. The evaluation measured: (a) rules -- database created, Aadhaar-linked accounts opened (output), (b) practices -- time from application to benefit receipt, number of trips to office (measured via citizen survey), (c) norms -- corruption perception around benefit delivery (pre/post survey), (d) outcomes -- inclusion errors and exclusion errors in benefit targeting. The most impactful finding was that digitisation reduced the number of citizen-office trips from 4.2 to 1.8 on average, but exclusion errors increased for households without Aadhaar.
Your Institutional Change Framing
Map the four dimensions for your reform. Answers flow into the capstone.
e.g., "E-governance reform in land records (Bhoomi-type), 3 districts, Karnataka"
Self-check
A state government digitises its Public Distribution System (PDS). The evaluation reports: "100% of fair price shops now have electronic Point of Sale devices." Is this an outcome finding?
A. Yes -- full digitisation is the outcome
B. No -- device installation is a rules/infrastructure change (output); outcomes would be reduced diversion, fewer ghost beneficiaries, or faster service for citizens
C. Yes, if the devices are functional
D. Only if compared to a pre-digitisation baseline
Answer: B. Installing technology is infrastructure, not reform. The evaluation must measure what changed for citizens (service speed, accuracy, leakage reduction) and for the institution (transparency, accountability). Many e-governance evaluations stop at installation counts.
Module 2 . ~30 min
Citizen-experience metrics
Governance exists to serve citizens. The most credible governance evaluation evidence comes from measuring citizen experience directly -- not from administrative self-reports or expert scorecards.
Three approaches to citizen experience measurement
Citizen Report Cards (CRC) -- pioneered by the Public Affairs Centre, Bangalore. Large-sample surveys of citizens rating government services on accessibility, reliability, quality, responsiveness, and corruption. The Bangalore CRC has been running since 1994 and has demonstrably improved service delivery through public pressure.
Community Scorecards -- participatory tool where community members rate local service providers (PHC, school, panchayat office) and providers respond. More process-oriented; produces dialogue, not just data.
Mystery client / simulated citizen -- trained researchers visit government offices posing as citizens to measure actual service quality, wait times, bribe demands. The most rigorous measure of frontline practice but ethically and operationally complex.
Indian governance data sources
| Source | What it provides | Frequency |
|---|---|---|
| Service delivery surveys (World Bank) | Provider absenteeism, infrastructure, service quality | Periodic (2003, 2010, 2019) |
| Governance Performance Index (NITI Aayog) | State-level governance quality across sectors | Irregular |
| DISHA dashboards | District-level scheme implementation data | Real-time |
| RTI compliance data | Response rates, timelines, appeals | Annual (CIC annual report) |
| CPGRAMS | Grievance registration and disposal | Real-time |
The social desirability problem
Citizens in India often under-report negative experiences with government services, especially to unknown surveyors. They fear retaliation or believe it is futile. Use indirect questioning techniques: "Some people in this area say that obtaining a caste certificate takes 3 trips to the office and costs Rs 500 in unofficial fees. In your experience, is this more or less than what it actually takes?" This indirect framing yields more honest responses than "Did you pay a bribe?"
Your Citizen Experience Measurement
Design the citizen-side measurement. These flow into your capstone.
Self-check
You are evaluating a one-stop-centre reform in district offices. The government reports "average service time reduced from 7 days to 2 days." Your citizen survey finds "average service time is 5 days." Why might these differ?
A. The citizen survey sample is biased
B. The government measures from application receipt to file closure; citizens measure from first visit (including document preparation, re-visits for corrections) to actual receipt of service -- the citizen journey is longer than the administrative process
C. The government data is falsified
D. Citizens remember incorrectly
Answer: B. The citizen journey includes steps that administrative data does not capture: gathering documents, making initial inquiries, returning for corrections, waiting for notifications. Both measures are valid; they answer different questions. The evaluation should report both and explain the gap.
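The difference between the two clocks can be made concrete with a single service request. The sketch below uses a hypothetical event log whose dates were chosen to mirror the 2-day and 5-day figures in the self-check; the field names and helper functions are assumptions for illustration, not a real administrative schema.

```python
from datetime import date

# Hypothetical event log for one service request (all dates illustrative).
journey = {
    "first_visit": date(2024, 3, 1),           # citizen's first inquiry at the office
    "application_accepted": date(2024, 3, 3),  # administrative clock starts
    "file_closed": date(2024, 3, 5),           # administrative clock stops
    "service_received": date(2024, 3, 6),      # citizen actually receives the service
}


def administrative_days(j):
    """Days from application receipt to file closure (the government's measure)."""
    return (j["file_closed"] - j["application_accepted"]).days


def citizen_days(j):
    """Days from first visit to actual receipt (the citizen's measure)."""
    return (j["service_received"] - j["first_visit"]).days


admin = administrative_days(journey)
citizen = citizen_days(journey)
print(admin, citizen, citizen - admin)  # 2 5 3
```

Reporting both numbers, along with the steps that account for the 3-day difference, answers both the administrative and the citizen question.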
Module 3 . ~30 min
Before-after with structural break design
Most governance reforms are implemented universally within a jurisdiction. There is no control group. The most feasible design is before-after with structural break analysis -- testing whether the reform corresponds to a detectable change in the trend of governance outcomes.
How structural break designs work
Collect time-series data -- at least 8-10 pre-reform data points and 4-6 post-reform points. For governance, monthly or quarterly administrative data often provides this (e.g., RTI responses per month, grievance disposal rates, service delivery times from DISHA).
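Test for structural break -- use a Chow test when the break date is known in advance (the reform date), or a Bai-Perron test to detect one or more breaks at unknown dates, and check whether a statistically significant break in the trend coincides with the reform.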
Control for confounders -- other events that coincided with the reform (elections, budget changes, personnel transfers). Document these explicitly.
Supplement with qualitative evidence -- interviews with officials and citizens to explain the mechanism. The quant shows when things changed; the qual shows why.
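The break-test step above can be sketched with a Chow test on a simple linear trend. This is a minimal illustration under stated assumptions, not a full analysis: the monthly grievance disposal rates are invented, the helper names `fit_rss` and `chow_f` are hypothetical, the model has only an intercept and a slope (k = 2), and serial correlation in the series, which is common in monthly administrative data and weakens the test, is ignored.

```python
def fit_rss(y, t):
    """Residual sum of squares from an OLS fit of y on a linear trend in t."""
    n = len(y)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return sum((yi - (intercept + slope * ti)) ** 2 for ti, yi in zip(t, y))


def chow_f(y, break_index, k=2):
    """Chow F-statistic for a known break; break_index is the first post-reform period."""
    n = len(y)
    t = list(range(n))
    pre_y, post_y = y[:break_index], y[break_index:]
    if len(pre_y) <= k or len(post_y) <= k:
        raise ValueError("each segment needs more than k observations")
    rss_pooled = fit_rss(y, t)  # one trend line for the whole series
    rss_split = (fit_rss(pre_y, t[:break_index])
                 + fit_rss(post_y, t[break_index:]))  # separate pre and post lines
    return ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))


# Hypothetical monthly grievance disposal rates (%): 10 pre-reform, 6 post-reform.
pre = [61, 63, 62, 64, 63, 65, 64, 66, 65, 67]
post = [75, 76, 78, 77, 79, 80]
f_stat = chow_f(pre + post, break_index=len(pre))

# Compare with the 5% critical value of F(2, 12), which is about 3.89.
print(f"Chow F = {f_stat:.1f}; break detected: {f_stat > 3.89}")
```

A large F-statistic only says that two trend lines fit better than one at the reform date. It does not rule out a coinciding election, budget change, or personnel transfer, which is why the confounder and qualitative steps remain necessary.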
Worked example
Andhra Pradesh's Mee Seva (citizen service centres) reform was evaluated using monthly data on service transactions from 2010-2016. The structural break test showed a significant increase in transaction volume at the Mee Seva launch date (2011), with a secondary break when rural centres opened (2013). Qualitative interviews with citizens indicated that the primary driver was reduced travel cost (the service was available locally) rather than faster processing, where the improvement was marginal.
Your Research Design
Design the before-after analysis. These flow into your capstone.
Cross-jurisdiction comparison? Matched district design?
Self-check
You have 3 months of pre-reform data and 3 months of post-reform data on grievance disposal rates. Is this sufficient for a structural break analysis?
A. Yes -- 6 data points are enough
B. No -- structural break tests require at least 8-10 pre-reform points to establish the pre-existing trend; with only 3, you cannot distinguish reform effects from normal variation
C. Yes, if you use weekly data instead of monthly
D. Depends on the effect size
Answer: B. With only 3 pre-reform data points, you have no baseline trend to compare against. The post-reform data could simply reflect normal month-to-month variation. You need a longer time series or an alternative design (cross-jurisdiction comparison, for instance).
Module 4 . ~25 min
Political economy of evaluation findings
Governance evaluations are uniquely political. Unlike health or education evaluations, governance evaluations directly assess the performance of the political-administrative system. The people being evaluated are often the people who commissioned the evaluation.
Three political economy traps
The showcase trap -- the reform is implemented in one or two model districts where the collector is supportive. The evaluation covers these districts. Findings are positive. The government scales up. But model-district performance does not replicate because the enabling conditions (strong collector, extra budget, political attention) do not scale.
The attribution trap -- a new government claims credit for improvements that began under the previous government. Or the reverse: a new government discredits the previous government's reform. Evaluators must be clear about timelines and trends, not just snapshot comparisons.
The messenger trap -- the evaluator finds that the reform worsened outcomes for certain groups (e.g., digitisation excluded those without smartphones). Reporting this honestly risks losing future contracts or access. Pre-commit to transparency: pre-register, use independent peer review, share data publicly.
The "good enough" governance standard
Not all governance reforms need to reach ideal standards to be valuable. Merilee Grindle's concept of "good enough governance" suggests evaluating reforms against locally achievable benchmarks, not global best practice. If a district office in Jharkhand reduces service time from 30 days to 10 days, that is meaningful progress even if the Kerala benchmark is 3 days. Frame findings as progress relative to the starting point, not just distance from the ideal.
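The two framings can be reported side by side. The sketch below reuses the 30-day, 10-day, and 3-day figures from the example above; the function name `share_of_gap_closed` is a hypothetical helper, and the calculation assumes that lower values are better.

```python
def share_of_gap_closed(baseline, current, benchmark):
    """Fraction of the baseline-to-benchmark gap that the reform has closed."""
    if baseline == benchmark:
        raise ValueError("baseline already equals the benchmark")
    return (baseline - current) / (baseline - benchmark)


baseline_days, current_days, benchmark_days = 30, 10, 3

# Progress relative to the starting point.
relative_improvement = (baseline_days - current_days) / baseline_days

# Progress relative to the distance between the start and the benchmark.
gap_closed = share_of_gap_closed(baseline_days, current_days, benchmark_days)

# Distance still remaining to the benchmark.
remaining_days = current_days - benchmark_days

print(f"{relative_improvement:.0%} faster than baseline")   # 67% faster than baseline
print(f"{gap_closed:.0%} of the gap to benchmark closed")   # 74% of the gap to benchmark closed
print(f"{remaining_days} days above benchmark")             # 7 days above benchmark
```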
Your Political Economy Mapping
Map the political economy of your evaluation. These flow into your capstone.
Self-check
The state government asks you to evaluate their e-governance reform. They suggest you study the 3 best-performing districts where the reform was piloted with extra support. What is the risk?
A. No risk -- best-performing districts show the reform's potential
B. Showcase-district bias -- findings will not represent typical implementation; insist on including average and below-average districts in the sample to show the full distribution of implementation quality
C. The sample is too small (only 3 districts)
D. The government is trying to influence the findings
Answer: B. Evaluation of showcase districts produces evidence about what is possible, not what is typical. For policy decisions about scaling, the government needs to know how the reform performs under normal conditions -- average bureaucratic capacity, standard budgets, typical political attention levels.
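One way to avoid showcase-district bias is to draw districts from each performance tercile instead of accepting a hand-picked list. The sketch below is a minimal illustration: the district names and implementation scores are invented, and `stratified_district_sample` is a hypothetical helper, not an established sampling routine.

```python
import random


def stratified_district_sample(scores, per_stratum, seed=0):
    """Randomly pick up to per_stratum districts from each performance tercile."""
    ranked = sorted(scores, key=scores.get, reverse=True)  # best performers first
    n = len(ranked)
    cut1, cut2 = n // 3, 2 * n // 3
    strata = {
        "top": ranked[:cut1],
        "middle": ranked[cut1:cut2],
        "bottom": ranked[cut2:],
    }
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced and audited
    return {
        name: sorted(rng.sample(members, min(per_stratum, len(members))))
        for name, members in strata.items()
    }


# Hypothetical implementation scores (0-100) for nine districts.
scores = {
    "D01": 92, "D02": 88, "D03": 85,
    "D04": 70, "D05": 66, "D06": 61,
    "D07": 48, "D08": 40, "D09": 33,
}

sample = stratified_district_sample(scores, per_stratum=2)
for stratum, districts in sample.items():
    print(stratum, districts)
```

Fixing and documenting the random seed before data collection makes it harder for anyone, including the evaluator, to steer the sample after the fact.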
Capstone
Your Governance Evaluation Design Brief
Share this brief with a governance specialist or an IAS/IPS officer before circulating. The most common blind spot is assuming that technology deployment equals institutional reform.