From the original USAID logical framework through OECD-DAC evaluation criteria, Theory of Change, the J-PAL randomista revolution, Most Significant Change and Outcome Harvesting, adaptive learning, contribution analysis, and the recent AI-for-evaluation turn — 18 nodes tracing how the field of MEL has actually evolved as practice, not just as doctrine.
18 nodes · 6 eras · ~55 years · CC BY-NC-SA 4.0
Era 01
The Logframe Era
1969 – 1989
USAID’s 1969 commission to Practical Concepts Inc. produced the Logical Framework. It became the dominant planning and evaluation grammar of international development for decades. The OECD DAC was created in 1971 to coordinate donor practice.
1969
USAID Logical Framework (Logframe)
Leon J. Rosenberg for Practical Concepts Inc. · commissioned by USAID, 1969
Argued
A 4×4 matrix structuring a project’s logic: rows for Goal, Purpose, Outputs, Activities; columns for Narrative, Verifiable Indicators, Means of Verification, and Important Assumptions. Forces designers to articulate the causal logic, indicators, evidence sources, and assumptions on a single page.
Mattered
Became the universal grammar of development planning. USAID, World Bank, EU, FCDO, GIZ, JICA — all adopted variants. The vocabulary (outputs/outcomes/impact, indicators, MOVs) traces directly to Rosenberg’s 1969 work.
Critique
Critics (Gasper, Mosse, Eyben) argue the logframe imposes spurious linearity on complex change processes, hides political contestation behind technical language, and centralises authority with donors. The Theory of Change movement (1995–) emerged partly as response.
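The 4×4 structure described above can be sketched as a simple data structure. The row and column names follow the 1969 matrix; the example cell entries are invented purely for illustration.

```python
# Rows and columns of the 1969 logframe matrix.
ROWS = ["Goal", "Purpose", "Outputs", "Activities"]
COLUMNS = [
    "Narrative",
    "Verifiable Indicators",
    "Means of Verification",
    "Important Assumptions",
]

# An empty logframe: one cell per row/column pair.
logframe = {row: {col: "" for col in COLUMNS} for row in ROWS}

# Filling the Purpose row with a hypothetical example:
# the claimed change, its indicator, the evidence source, and the assumption.
logframe["Purpose"]["Narrative"] = "Smallholder incomes rise in the project area"
logframe["Purpose"]["Verifiable Indicators"] = "Median household income +20% by year 3"
logframe["Purpose"]["Means of Verification"] = "Annual household survey"
logframe["Purpose"]["Important Assumptions"] = "Crop prices remain stable"

# The causal logic reads bottom-up: Activities -> Outputs -> Purpose -> Goal,
# each step conditional on its assumptions holding.
for row in reversed(ROWS):
    print(f"{row}: {logframe[row]['Narrative'] or '(to be completed)'}")
```

The point of the single-page discipline is visible here: every causal claim must name its indicator, its evidence source, and the assumption on which the next step up depends.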
OECD Development Assistance Committee (DAC) Formed
OECD · Replaced the Development Assistance Group of 1960 · 1971 reorganisation
Argued
A coordinating forum for OECD member donor countries. Sets common definitions of Official Development Assistance (ODA), measures aid flows, and develops standards for evaluation, transparency, and effectiveness.
Mattered
DAC defined the architecture within which donor MEL evolved: the 1991 evaluation principles, the 5 (now 6) DAC criteria, peer reviews, and the Paris Declaration follow from this institutional spine. The 0.7% ODA/GNI target is a DAC norm.
Critique
DAC’s membership is exclusively rich-country donors; recipient and Southern voice is structurally limited. China, India, Brazil and other major South-South cooperation actors operate outside the DAC framework, with different norms.
The DAC codified evaluation criteria; ToC entered development from US foundation work; the MDGs cemented results-based management as the global frame. The 1990s pivoted MEL from input-output accountability to outcome thinking.
1991
OECD-DAC Evaluation Criteria — The Original Five
OECD-DAC Network on Development Evaluation · Principles for Evaluation of Development Assistance, 1991
Argued
Five criteria for evaluating any development intervention: Relevance (does it address the right need?), Effectiveness (does it achieve its objectives?), Efficiency (does it use resources well?), Impact (does it produce broader change?), and Sustainability (do the results last?).
Mattered
The most-cited evaluation framework in history. Anchored evaluation training, ToRs, donor reporting, and academic research for three decades. Coherence was added in 2019, making six criteria.
Critique
Critics (Picciotto, Carden, Patton) argue the criteria are donor-centric, frame interventions as discrete projects rather than systems, and treat values as if they were technical. Real-world evaluations often score everything “moderately satisfactory” without saying anything actionable.
Theory of Change (ToC)
Carol Weiss, Aspen Institute Roundtable on Comprehensive Community Initiatives · New Approaches to Evaluating Community Initiatives, 1995
Argued
Complex social change initiatives need to articulate not just inputs and outputs, but the underlying causal hypotheses about why activities should produce the desired change. ToC makes assumptions explicit and testable, helping evaluators and implementers work with complexity rather than reduce it.
Mattered
By the early 2010s ToC had become standard donor language (DFID, USAID, Hewlett, Gates). Versions like outcomes mapping (IDRC), pathway analysis (Rockefeller), and ToC-as-process (Vogel, James) extended the basic frame.
Critique
Critics argue ToC has become as ritualised as logframes — produced once, never revisited, often consultant-led. Patton and others advocate “developmental evaluation” as the actual practice ToC implies. The gap between ToC-as-document and ToC-as-practice persists.
The 2000s saw a strong push for evaluation quality, donor harmonisation, and the politics of aid effectiveness. The Paris Declaration set principles; UNEG codified quality standards; the methods debate intensified between rigour-as-RCTs and rigour-as-fitness-for-purpose.
2005
Paris Declaration on Aid Effectiveness
OECD Development Assistance Committee · Paris High Level Forum · March 2005
Argued
Five principles: Ownership (recipient-led), Alignment (donors align with country systems), Harmonisation (donors coordinate), Managing for Results, and Mutual Accountability. Twelve indicators with targets set for 2010 tracked progress on each.
Mattered
Most ambitious attempt to reform donor practice in a generation. Drove sector-wide approaches (SWAps), country-led joint performance frameworks, and the explicit obligation to use country M&E systems. Successor conferences (Accra 2008, Busan 2011) extended the framework to civil society and South-South cooperation.
Critique
2010 evaluations found mixed compliance: alignment improved but harmonisation lagged; donor proliferation continued; Busan opened the door for non-DAC actors but quality concerns remained. The aid effectiveness agenda largely faded after 2015 as climate finance and SDGs took centre stage.
United Nations Evaluation Group (UNEG) · Norms (2005), Standards (2005), Code of Conduct (2008)
Argued
Codified what professional, independent, ethical, and quality evaluation looks like across the UN system. Norms cover purpose, principles (independence, impartiality, credibility, utility), and roles. Standards cover institutional framework, evaluator competencies, design, conduct, and reporting.
Mattered
Set the floor for evaluation quality across UN agencies, with influence on national VOPEs (voluntary organisations for professional evaluation), UNEG-RBM partnerships, and the EvalPartners movement. Most contemporary evaluation policy documents (governments, INGOs) trace lineage to UNEG.
Critique
Standards are clearer on independence and rigour than on equity, gender, decolonial frames, or community-led evaluation. UNEG has updated several times (2016, 2020) to address gender-responsive evaluation, equity-focused evaluation, and Made in Africa evaluation principles.
J-PAL, IPA, 3ie. RCTs became the “gold standard” for impact evaluation. Cochrane methods extended to development. The systematic-review and replication agenda took root. The methods debate sharpened: rigour-as-design vs rigour-as-context.
2003
J-PAL Founded — The Randomista Movement
Abhijit Banerjee, Esther Duflo, Sendhil Mullainathan · MIT, June 2003
Argued
Many development questions can be answered using randomised controlled trials — the gold standard from medicine. Random assignment of the intervention controls for selection effects; impact is estimated as the mean outcome in the treatment group minus the mean outcome in the control group. Build evidence policy by policy.
Mattered
Within a decade J-PAL had run over 1,000 RCTs across 80+ countries; the 2019 Nobel Memorial Prize in Economics went to Banerjee, Duflo, and Michael Kremer for the experimental approach. Reshaped how donors and governments make programming decisions. Built the evidence base on deworming, conditional cash transfers, microfinance, and teacher absenteeism.
Critique
Critics (Deaton, Ravallion, Pritchett, Reddy) argue RCTs answer narrow what-works-here questions but say little about why or whether results travel; the movement depoliticises development by avoiding macro and structural questions; selection of researchable questions itself reflects donor priorities.
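The estimator behind the RCT logic above is just a difference in means under random assignment. A minimal simulated sketch (all numbers invented for illustration):

```python
import random
import statistics

random.seed(42)

# Simulated eligible units with a baseline outcome (e.g. test scores).
baseline = [random.gauss(50, 10) for _ in range(1000)]

# Random assignment: each unit has a 50% chance of receiving the intervention.
# Randomisation makes treatment and control comparable in expectation,
# which is what removes selection bias.
assignment = [random.random() < 0.5 for _ in baseline]

# Hypothetical true effect of +3 points for treated units, plus noise.
outcomes = [
    b + (3 if treated else 0) + random.gauss(0, 5)
    for b, treated in zip(baseline, assignment)
]

treat = [y for y, t in zip(outcomes, assignment) if t]
control = [y for y, t in zip(outcomes, assignment) if not t]

# The impact estimate: mean treatment outcome minus mean control outcome.
impact = statistics.mean(treat) - statistics.mean(control)
print(round(impact, 2))  # close to the simulated true effect of 3
```

The Deaton/Pritchett critique is visible in what the sketch does not contain: nothing here says why the effect arises or whether it would hold in a different population.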
3ie — International Initiative for Impact Evaluation
Howard White (founding ED) · Established with 17 founding members · February 2008
Argued
Funds high-quality impact evaluations and systematic reviews on development questions, with a methods-pluralist (not RCT-only) stance. Built infrastructure: the impact evaluation repository and evidence gap maps (EGMs).
Mattered
Funded ~250 impact evaluations and 75+ systematic reviews to date. Made evidence synthesis accessible to policymakers through gap maps; evidence-informed policy units in Bihar, Punjab, Andhra Pradesh, and South Africa drew heavily on 3ie’s outputs.
Critique
3ie’s influence has waned post-2018 with funding cuts; the focus on rigorous impact evidence sometimes excluded relevance to political and contextual decision-making. The wider evidence-informed policy movement has matured but faces “evidence fatigue” from over-supply of high-quality findings.
A counter-current to the RCT turn: methods designed for complex, emergent, contested change — PDIA, MSC, Outcome Harvesting, Contribution Analysis, Developmental Evaluation. The 2019 DAC criteria revision added Coherence — a quiet acknowledgement that interventions sit in systems.
2012
PDIA — Problem-Driven Iterative Adaptation
Matt Andrews, Lant Pritchett, Michael Woolcock · Harvard CID, working paper 2012; book 2017
Argued
Most failed development interventions fail because they import “best practice” solutions and force-fit them onto problems that are politically and contextually different. Better: define the problem locally, iterate rapidly with feedback loops, build broad agency for change, and accept that solutions emerge.
Mattered
Influenced USAID’s Collaborating, Learning, and Adapting (CLA) approach, DFID’s Adaptive Programming, the Doing Development Differently manifesto (2014), and the Building State Capability programme. Adaptive management is now mainstream in donor practice (at least rhetorically).
Critique
Critics ask whether adaptive practice is genuinely different from good consultancy; whether reporting requirements really allow iteration; and whether donors can resist the pull of pre-defined logframes when accountability pressures rise. Implementation often falls short of doctrine.
Most Significant Change (MSC) & Outcome Harvesting
Rick Davies (MSC) · Ricardo Wilson-Grau (Outcome Harvesting)
Argued
MSC: collect stories of change from beneficiaries, have stakeholders select the “most significant” ones through iterated review — surfacing what mattered to whom, not what donors expected. Outcome Harvesting: identify outcomes (changes in actor behaviour) that have already happened and trace back to what contributed.
Mattered
Both are now standard methods for advocacy, governance, and behaviour-change programmes where pre-set indicators miss the actual change. Adopted by Oxfam, ICCO, Hivos, Christian Aid, and many CSO networks; influential in policy advocacy MEL where attribution is structurally hard.
Critique
Both methods rely heavily on facilitation quality; results depend on whose voices are heard. Without rigorous sampling, “most significant” can collapse into “most articulate.” Outcome Harvesting’s contribution analysis can become post-hoc rationalisation if not held to evidentiary discipline.
DAC Criteria Revised — Coherence Added as Sixth Criterion
OECD-DAC Network on Development Evaluation · Better Criteria for Better Evaluation, December 2019
Argued
Coherence joined Relevance, Effectiveness, Efficiency, Impact, and Sustainability. Coherence asks: how well does the intervention fit with other interventions in country, sector, institution? Internal and external coherence both matter. Each criterion was also more clearly defined and aligned with the SDGs.
Mattered
First major revision in nearly 30 years. Formal recognition that interventions sit in systems, not vacuums; aligned with the rise of nexus thinking (humanitarian-development-peace nexus, water-energy-food nexus, climate-development).
Critique
Coherence remains the criterion evaluators find hardest to operationalise; many evaluations treat it perfunctorily. Climate justice, decolonial, and feminist evaluators argue the framework still privileges donor and intervention-centric evaluation, with values implicit and unexamined.
COVID forced remote MEL; mobile data, satellite imagery, and call-detail records became standard data sources; generative AI is reshaping report writing, qualitative analysis, and synthesis. The decolonial evaluation movement and the Made-in-Africa principles (2019–) push back on whose knowledge counts.
2020
COVID Forces Remote & Real-Time MEL
Global evaluation community · UNEG, EvalPartners, INTRAC, Better Evaluation hubs · March 2020 onwards
Argued
Field visits ceased overnight. Evaluators rapidly pivoted to phone surveys, mobile data collection, satellite imagery, call-detail records, online deliberative methods, and locally-led data collection. Real-time monitoring became a necessity.
Mattered
Permanently shifted MEL practice toward remote-first methods. The locally-led evaluation movement (Pact, Equal Access International) gained ground partly because international consultants couldn’t travel. Acceleration of the “evaluator at a distance” debate.
Critique
Phone surveys systematically under-sample women, rural respondents, people without phones; online methods exclude those without bandwidth. The “digital divide” became a measurement bias. Some pre-COVID gains in beneficiary engagement were reversed by remote-first defaults.
The AI-for-Evaluation Turn
Global evaluation community · Foundation models (GPT-4, Claude, Gemini) · 2023–
Argued
Large language models are now used in evaluation for: drafting and synthesising reports, qualitative coding (interview transcripts), evidence synthesis, indicator construction, and even evaluation design. Multiple INGOs, UN agencies, and consultancies have launched AI-assisted MEL pilots.
Mattered
Substantial productivity gains in report drafting and qualitative analysis. Made systematic review and large-N qualitative work feasible at smaller budgets. Opened access to evidence synthesis for under-resourced organisations and Southern researchers.
Critique
Hallucinations risk fabricated quotes; bias in training data is reproduced; the “efficiency” gain can substitute for relationship work and field judgment that AI can’t replicate. UNEG, ALNAP, and EvalPartners are developing guidance; the locally-led evaluation movement is wary that AI re-centralises authority with model-providers.
Decolonial Evaluation & Made-in-Africa Principles
African Evaluation Association (AfrEA) and allied movements · principles developed since 2007, formalised 2019–
Argued
Mainstream evaluation frameworks — logframes, DAC criteria, RCTs — embed Northern epistemologies that subordinate other ways of knowing. Decolonial evaluation centres community ownership, relational accountability, indigenous evidentiary norms, and reparative purposes. AfrEA’s Made-in-Africa Evaluation principles (since 2007, formalised 2019–) are an institutional anchor.
Mattered
Reshaping major donor practice (Hewlett, Ford, MacArthur on equitable evaluation; FCDO and USAID locally-led monitoring agendas). National VOPEs in the Global South are growing; Cuba, Brazil, India, Kenya, South Africa have active evaluation societies. Re-Imagining INGOs (2022) asks what evaluation looks like in a decentralised civil-society architecture.
Critique
Sceptics argue the principles risk becoming new orthodoxy; some equity claims are unfalsifiable; donor adoption can be performative. Practical implementation requires sustained investment in Southern evaluation capacity — a structural rather than methodological question.