ImpactMojo
AI for Impact: Data Monitoring & Evaluation in Development
When AI Helps, When It Doesn't, and How to Tell the Difference
A rigorous, evidence-based exploration of AI applications in development M&E—from computer vision and NLP to algorithmic targeting and real-time monitoring. With deep focus on South Asia and Africa, where context determines everything.
Why Study AI in Development M&E?
AI is reshaping how development organizations collect data, target beneficiaries, and monitor programs. But the gap between vendor promises and ground reality is vast. Organizations waste millions on tools that don't work in low-connectivity environments, or worse, deploy algorithms that systematically exclude the most vulnerable.
This course differs from typical hype-driven treatments of AI in crucial ways: we focus on what actually works in low-resource contexts, when simpler tools outperform AI, and how to assess whether your organization is ready for AI adoption—or whether the investment would be wasted.
Evidence-Based Assessment
Move beyond vendor demos to rigorous evaluation. Learn to assess tools using development research standards, not Silicon Valley metrics.
Context-Specific Application
What works in Accra may fail in Upper East Region. Deep focus on infrastructure constraints, data quality challenges, and organizational capacity.
Ethical Frameworks
Algorithmic bias, data sovereignty, consent in low-literacy contexts. The ethical dimensions that vendor pitches never mention.
"The question is not whether AI can help development -- it clearly can, in specific contexts. The question is whether your organization is ready to use it responsibly, and whether simpler solutions might work better." -- Adapted from J-PAL AI & Development Initiative
Who This Course Is For
M&E Professionals
Learn to evaluate AI tools critically, design AI-assisted monitoring systems, and communicate AI limitations to stakeholders. No coding required.
Program Managers
Understand when AI adds value versus when simpler tools work better. Learn to manage AI vendors, design pilots, and assess organizational readiness.
Researchers & Academics
Explore how ML complements causal inference methods. Understand heterogeneous treatment effects, synthetic controls, and ethical considerations in AI-driven research.
Policy Makers & Donors
Develop frameworks for evaluating AI proposals, assessing vendor claims, and ensuring responsible deployment in programs you fund or oversee.
The AI-M&E Landscape
What does "AI" actually mean in development practice? This module demystifies the taxonomy of tools—from simple automation to machine learning to large language models—and maps the current state of AI adoption in the sector.
What This Module Covers
Understanding the AI landscape in development requires cutting through marketing terminology to identify what each technology actually does, what data it needs, and when it outperforms simpler alternatives. This module provides the foundational taxonomy you'll use throughout the course.
Key insight: The term "AI" in development marketing covers everything from simple if-then rules to sophisticated neural networks. A tool that auto-fills survey fields is called "AI." A tool that predicts drought from satellite imagery is also called "AI." These are fundamentally different technologies with vastly different requirements, costs, and reliability levels.
Taxonomy of AI Technologies in Development
The term "AI" is used loosely in development contexts, often conflating fundamentally different technologies. Clear taxonomy is essential for appropriate tool selection.
| Technology | What It Does | M&E Applications | Data Requirements |
|---|---|---|---|
| Rule-Based Automation | Follows explicit if-then rules | Data validation, skip logic, alerts | Low—rules defined manually |
| Classical ML | Learns patterns from labeled data | Targeting, classification, prediction | Medium—thousands of labeled examples |
| Deep Learning | Neural networks for complex patterns | Image recognition, NLP, anomaly detection | High—millions of examples, GPUs |
| Computer Vision | Extracts information from images | Satellite imagery, infrastructure monitoring | High—labeled images, geospatial data |
| NLP | Processes human language | Qualitative coding, sentiment, translation | Medium-High—domain-specific corpora |
| LLMs (GPT, Claude) | General-purpose text generation | Report writing, data synthesis, chatbots | Low for use; high for fine-tuning |
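The first row of the taxonomy, rule-based automation, can be made concrete with a minimal sketch. The field names, rules, and thresholds below are hypothetical examples, not drawn from any real survey instrument:

```python
# Illustrative sketch of "rule-based automation": explicit if-then validation
# rules applied to a survey record. Field names and thresholds are invented.

def validate_record(record):
    """Apply hand-written if-then rules to one record; return a list of flags."""
    flags = []
    # Rule 1: age must be plausible
    if not (0 <= record.get("age_years", -1) <= 110):
        flags.append("implausible_age")
    # Rule 2: a child under 5 cannot be listed as the household head
    if record.get("age_years", 99) < 5 and record.get("is_household_head"):
        flags.append("child_household_head")
    # Rule 3: reported expenditure should not exceed reported income tenfold
    if record.get("monthly_expenditure", 0) > 10 * max(record.get("monthly_income", 0), 1):
        flags.append("expenditure_income_mismatch")
    return flags

record = {"age_years": 3, "is_household_head": True,
          "monthly_income": 100, "monthly_expenditure": 50}
print(validate_record(record))  # → ['child_household_head']
```

No learned parameters are involved: every rule is written by hand, which is why the table lists data requirements for this category as low.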
ML excels at prediction—identifying who is likely to be poor, which programs are at risk of failure. But prediction ≠ causation. Knowing that households with tin roofs are poor doesn't tell you whether providing tin roofs reduces poverty. Development requires both—ML for targeting and monitoring, RCTs for causal inference.
The "AI Hype Cycle" in Development
Development organizations tend to follow a predictable pattern with new AI technologies:
Phase 1: Hype (Months 1-6)
A conference presentation or donor initiative sparks excitement. "AI will revolutionize our M&E!" Vendor demos look impressive. Leadership is enthusiastic. Budget is allocated.
Phase 2: Reality Check (Months 6-18)
Data quality issues emerge. The tool doesn't work offline. Staff resistance grows. The vendor's demo doesn't match field conditions. Costs escalate beyond initial estimates.
Phase 3: Trough or Learning (18+ Months)
Organizations either abandon the initiative ("AI doesn't work for us") or -- more productively -- recalibrate expectations and find specific, bounded use cases where AI genuinely adds value.
Needs Assessment for AI Integration
Before adopting any AI tool, organizations must assess readiness across multiple dimensions: data infrastructure, technical capacity, organizational culture, and—critically—whether AI is actually the right solution.
AI for Data Collection
From voice-to-text transcription to intelligent chatbots, AI is transforming how development organizations collect data in the field. But implementation challenges—language diversity, connectivity, trust—determine success or failure.
Computer Vision & Geospatial Analysis
Satellite imagery combined with machine learning has revolutionized poverty mapping, agricultural monitoring, and infrastructure tracking. But the gap between research papers and operational use remains significant.
NLP for Qualitative Data
Natural Language Processing can analyze thousands of open-ended survey responses, interview transcripts, and social media posts. But automated coding is not a replacement for human interpretation—it's a complement.
The Language Technology Gap
NLP capabilities vary dramatically across languages. English NLP is mature and accurate. For the languages spoken by the world's poorest populations, NLP is often rudimentary or nonexistent.
| Language | Speakers (M) | NLP Resources | Sentiment Accuracy | ASR Availability |
|---|---|---|---|---|
| English | 1,500 | Extensive | 90%+ | Excellent |
| Hindi | 600 | Moderate | 75-80% | Good |
| Bengali | 270 | Growing | 65-75% | Moderate |
| Swahili | 100 | Limited | 60-70% | Basic |
| Hausa | 80 | Very limited | 50-60% | Minimal |
| Bhojpuri | 50 | Nearly none | N/A | None |
| Dagbani | 3 | None | N/A | None |
The digital language divide: Of the world's ~7,000 languages, fewer than 100 have meaningful NLP resources. Many development programs work with communities speaking languages that have zero digital text resources. In these contexts, NLP is not an option -- regardless of how powerful the underlying models are.
Algorithmic Targeting & Beneficiary Selection
Who gets the transfer? Who receives the scholarship? Algorithmic targeting promises efficiency and objectivity—but can also systematically exclude the most vulnerable.
Real-Time Monitoring & Anomaly Detection
Dashboard automation, data quality flags, and early warning systems. How AI enables faster response to program problems—and the human oversight that remains essential.
The Shift from Periodic to Continuous Monitoring
Traditional M&E operates on quarterly or annual cycles. A typical program collects baseline data, conducts a midterm review, and runs an endline survey. Problems are often discovered months or years after they begin. AI-enabled real-time monitoring changes this paradigm fundamentally.
Consider a nutrition program distributing supplements to children under 5. Under traditional M&E, if the supply chain breaks in a remote district, you might not know until the next quarterly report -- by which time months of malnutrition have occurred. AI-enabled monitoring can detect the supply break within days by analyzing distribution records, inventory data, and even satellite imagery of warehouse activity.
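One simple way such a system could detect the supply break is to compare each day's distribution count against a trailing average. This is a hedged sketch, not a description of any deployed system; the counts and thresholds are invented for illustration:

```python
# Hypothetical sketch: detecting a supply break from daily distribution counts.
# A real system would ingest inventory and distribution feeds; data is invented.

def detect_supply_break(daily_counts, window=7, drop_threshold=0.5):
    """Return the index of the first day whose count falls below
    drop_threshold times the trailing `window`-day average, or None."""
    for day in range(window, len(daily_counts)):
        baseline = sum(daily_counts[day - window:day]) / window
        if baseline > 0 and daily_counts[day] < drop_threshold * baseline:
            return day
    return None

# Twelve days of normal distribution (~100 supplements/day), then a collapse.
counts = [98, 102, 101, 97, 103, 99, 100, 101, 98, 102, 100, 99, 12, 8]
print(detect_supply_break(counts))  # → 12 (the first collapsed day)
```

The point is the cadence, not the sophistication: even this crude rule surfaces the break within a day, versus months under a quarterly reporting cycle.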
AI-Powered Monitoring Systems
Several organizations have pioneered AI-enabled monitoring at scale. The tools range from simple anomaly detection to complex predictive systems.
Anomaly Detection
Algorithms flag unusual patterns: sudden drops in attendance, unexpected expenditure spikes, geographic clustering of complaints. UNHCR uses this for fraud detection in cash programs.
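A minimal version of this kind of anomaly flagging can use the median absolute deviation (MAD), which is robust to the very outliers it is trying to find. The district spending figures below are invented for illustration:

```python
# Minimal anomaly-detection sketch using the median absolute deviation (MAD).
# Figures are invented; a real deployment would segment by comparable units.
import statistics

def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return []
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

# Monthly expenditure reports from eight districts; one implausible spike.
monthly_spend = [1050, 990, 1010, 1000, 980, 1020, 5400, 1005]
print(mad_outliers(monthly_spend))  # → [6]
```

A classical z-score can miss the spike in a small sample, because the outlier itself inflates the standard deviation; the MAD-based modified z-score avoids that masking effect.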
Predictive Early Warning
ML models predict which programs are at risk of failure based on early indicators. WFP's HungerMap combines satellite data, market prices, and conflict indicators for food security alerts.
Automated Data Quality
AI identifies suspicious survey responses: impossible combinations, pattern responses, outliers. Reduces reliance on manual data cleaning.
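One of the checks mentioned above, detecting pattern responses, is easy to sketch: flag "straight-lining", where a respondent (or enumerator) gives the identical answer to every Likert item. The question keys here are hypothetical:

```python
# Hedged sketch of one automated data-quality check: straight-lining detection.
# Question keys and responses are invented for illustration.

def is_straightliner(response, likert_keys, min_items=5):
    """True if all Likert answers are identical across at least min_items items."""
    answers = [response[k] for k in likert_keys if k in response]
    return len(answers) >= min_items and len(set(answers)) == 1

likert_keys = [f"q{i}_satisfaction" for i in range(1, 9)]
suspicious = {f"q{i}_satisfaction": 3 for i in range(1, 9)}  # all 3s
genuine = {f"q{i}_satisfaction": v
           for i, v in enumerate([4, 3, 5, 2, 4, 3, 4, 5], start=1)}
print(is_straightliner(suspicious, likert_keys))  # → True
print(is_straightliner(genuine, likert_keys))     # → False
```

Flags like this are triage, not verdicts: a flagged record warrants a call-back or spot check, not automatic deletion.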
AI for Adaptive Programming
Feedback loops, course correction, and predictive analytics for implementation. Moving from static program design to continuous learning.
From Linear to Iterative Program Design
The traditional program cycle is linear: design a logframe, secure funding, implement activities, collect data, write a final report. This model assumes that the program theory is correct from the start and that context remains stable throughout implementation. Both assumptions are usually wrong.
The evidence is clear: Programs that adapt based on data consistently outperform rigid implementations. DFID's adaptive programming portfolio showed 30% better outcomes compared to traditional programs in fragile states. The challenge is building the systems and culture that enable adaptation.
The Adaptive Management Framework
Adaptive management uses continuous data to adjust implementation in real-time. AI accelerates this by processing feedback faster than humans can, enabling shorter learning cycles.
AI makes adaptive management feasible at scale, but technology alone is not sufficient. Adaptation requires: (1) clear decision rules for when to adapt, (2) authority to make changes, (3) budget flexibility, and (4) an organizational culture that accepts iteration.
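The "clear decision rules" in point (1) can be as simple as codified thresholds agreed in advance. This is a hypothetical sketch; the indicators, thresholds, and actions are invented, and a real program would negotiate them with implementers and donors:

```python
# Hypothetical sketch: pre-agreed decision rules mapping monitored indicators
# to adaptation actions. All indicators, thresholds, and actions are invented.

DECISION_RULES = [
    # (indicator name, breach test, pre-agreed action)
    ("attendance_rate", lambda v: v < 0.60, "trigger field visit within 7 days"),
    ("stockout_days",   lambda v: v > 3,    "escalate to supply-chain team"),
    ("complaint_count", lambda v: v > 20,   "pause enrollment; review grievances"),
]

def decide(indicators):
    """Return the list of actions triggered by the current indicator values."""
    return [action for name, breached, action in DECISION_RULES
            if name in indicators and breached(indicators[name])]

this_week = {"attendance_rate": 0.55, "stockout_days": 1, "complaint_count": 25}
print(decide(this_week))
# → ['trigger field visit within 7 days', 'pause enrollment; review grievances']
```

Writing the rules down before the data arrives is what separates principled adaptation from ad hoc tinkering.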
The PDSA Cycle: AI-Enhanced
The Plan-Do-Study-Act (PDSA) cycle is a well-established framework for continuous improvement. AI enhances each phase:
| PDSA Phase | Traditional Approach | AI-Enhanced Approach |
|---|---|---|
| Plan | Design based on baseline data and theory | Use ML to identify optimal intervention parameters from historical data |
| Do | Implement as designed | Implement with embedded data collection; real-time process monitoring |
| Study | Quarterly data review; endline analysis | Continuous analysis with anomaly detection; automated reporting |
| Act | Annual program adjustments | Monthly or weekly micro-adjustments based on AI-flagged insights |
The Limits of AI in Causal Inference
Why ML ≠ RCT. Prediction vs. causation. Heterogeneity detection. Understanding what AI can and cannot tell us about program impact.
Prediction: ML excels at predicting outcomes—who is poor, which programs will fail, what areas need intervention. But prediction doesn't tell you why.
Causation: To know if a program causes outcomes, you need experimental or quasi-experimental methods. Correlation in ML predictions is not evidence of causal impact.
Implication: Use ML for targeting and monitoring; use RCTs and rigorous evaluation for impact assessment. They're complements, not substitutes.
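The tin-roof example from earlier in this module can be made concrete with a small simulation. All parameters are invented; the point is structural: wealth drives both roof type and food security, so roof type predicts the outcome perfectly in observational data even though changing it does nothing:

```python
# Simulated illustration of "prediction is not causation". A latent confounder
# (wealth) drives BOTH roof type and food insecurity; the roof itself does
# nothing. All parameters are invented for the demonstration.
import random

random.seed(0)

def insecurity_rates(force_tin_roof=False, n=10_000):
    """Return (insecurity rate among tin-roof HHs, rate among other HHs)."""
    insecure_tin = insecure_other = tin = other = 0
    for _ in range(n):
        wealth = random.random()                     # latent confounder
        tin_roof = force_tin_roof or (wealth > 0.5)  # wealth determines roof
        food_insecure = wealth < 0.3                 # wealth determines outcome
        if tin_roof:
            tin += 1
            insecure_tin += food_insecure
        else:
            other += 1
            insecure_other += food_insecure
    return insecure_tin / max(tin, 1), insecure_other / max(other, 1)

# Observational data: roof type "predicts" food insecurity extremely well...
rate_tin, rate_other = insecurity_rates()
print(f"observed insecurity: tin roof {rate_tin:.0%}, other roof {rate_other:.0%}")

# ...but an intervention giving every household a tin roof changes nothing:
# insecurity stays near the population rate of ~30%.
rate_all_tin, _ = insecurity_rates(force_tin_roof=True)
print(f"after giving everyone a tin roof: {rate_all_tin:.0%}")
```

Roof type is a useful proxy for targeting the poor and a useless lever for reducing poverty, which is exactly the complementarity the module argues for: ML for targeting, experimental methods for impact.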
Ethics, Bias & Accountability
Algorithmic fairness, data sovereignty, consent in low-literacy contexts. The ethical dimensions that every development practitioner must understand.
Why Ethics Is Not Optional
In commercial AI, ethical failures mean bad press and regulatory fines. In development AI, ethical failures mean people go hungry, are denied healthcare, or are wrongly excluded from social protection. The stakes are fundamentally different, and the ethical standards must be correspondingly higher.
The asymmetry of harm: When a recommendation algorithm suggests the wrong movie, the cost is mild annoyance. When a targeting algorithm denies a family emergency food aid, the cost can be starvation. Development AI operates in contexts where errors have life-and-death consequences.
Context Assessment: Case Studies
Deep dives into AI implementation in Ghana, India, Bangladesh, and Kenya. What worked, what failed, and why context determines everything.
Why Context Matters More Than Technology
This module examines four case studies where AI/digital systems were deployed for development purposes. In each case, the technology was similar but the outcomes varied dramatically -- determined by infrastructure, institutions, culture, and political economy rather than algorithmic sophistication.
Before reading each case study, consider these questions:
1. What digital infrastructure already existed?
2. What institutional capacity was in place to maintain the system?
3. What was the population's relationship with technology and government?
4. What political pressures influenced deployment decisions?
5. Were there existing alternatives that worked reasonably well?
Strategy Building: Build vs. Buy, Pilot Design, Sustainability
Practical guidance for organizations considering AI adoption. How to evaluate vendors, design pilots, build internal capacity, and plan for sustainability.