FLAGSHIP COURSE

Seeing Data: Visualization for Impact

Data Visualization Masterclass

Master the art and science of data visualization for development. From perceptual foundations to interactive dashboards—learn to communicate data that drives decisions and creates change.

Interactive Tools, AI Study Companion, Colab Notebooks, 58-Term Lexicon

12 Modules, 5+ Interactive Tools, 50+ Exercises
Module 01

Why Visualize? Purpose & Principles

Understanding when and why visualization matters for policy influence and accountability. From Anscombe's quartet to Hans Rosling's Gapminder—discover the power of seeing data.

The Case for Visualization

Data visualization is not decoration—it is a cognitive tool that extends human perception. Our visual system processes images in parallel, detecting patterns, outliers, and relationships that would take hours to discover in tables of numbers.

Anscombe's Quartet: The Classic Demonstration

In 1973, statistician Francis Anscombe published four datasets with nearly identical summary statistics (mean, variance, correlation, and regression line) that nonetheless look radically different when plotted. The quartet remains the classic argument for always visualizing your data before analysis.

[Interactive figure] Anscombe's Quartet: Same Statistics, Different Stories
Panels: Dataset I, Dataset II, Dataset III, Dataset IV
Shared statistics: Mean X: 9.0, Mean Y: 7.5, Variance X: 11.0, Correlation: 0.816. Plotting reveals four completely different relationships.
Francis Anscombe (1973). All four datasets have nearly identical summary statistics but tell entirely different stories.
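
The shared statistics quoted in the figure can be checked directly. The sketch below uses Anscombe's published values and only the Python standard library; the helper names (pearson_r, summarize) are illustrative choices, not part of the course materials.

```python
from math import sqrt
from statistics import mean, variance

# Anscombe's (1973) published values; datasets I-III share the same x column.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation, written out so no third-party library is needed."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))


def summarize(xs, ys):
    """Return the summary statistics quoted in the figure."""
    return {
        "mean_x": mean(xs),
        "mean_y": mean(ys),
        "var_x": variance(xs),  # sample variance (n - 1 denominator)
        "r": pearson_r(xs, ys),
    }


SUMMARY = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

for name, stats in SUMMARY.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

The four printed rows agree to about two decimal places, which is exactly why the summary table alone cannot distinguish the datasets; only a scatter plot of each pair does.
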
Module 02

Data Types & Quality: Structures & Challenges

Classify data correctly, recognize development sector data quality challenges, and understand how data collection choices constrain visualization options.

Data Type Classification

Every visualization decision begins with understanding your data type. The encoding that works for quantitative data fails for categorical; the chart perfect for time series misleads for cross-sectional comparisons.

Data Type | Definition | Examples | Suitable Encodings
Nominal | Categories without order | Country names, program types, gender categories | Position, color hue, shape
Ordinal | Categories with meaningful order | Education levels, satisfaction ratings, wealth quintiles | Position, color saturation, size
Interval | Numeric with arbitrary zero | Temperature (°C), dates, index scores | Position, length, area (with caution)
Ratio | Numeric with true zero | Income, population, coverage rates | Position, length, area, angle
Temporal | Time-based sequences | Years, months, project phases | Position on x-axis, animation
Geographic | Spatial coordinates or regions | Districts, GPS points, administrative boundaries | Map position, choropleth, symbols
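
One way to put the table to work is as a lookup that a charting script consults before choosing an encoding. This is a minimal sketch: the dictionary simply transcribes the table, and the function names and the example column names are hypothetical.

```python
# Lookup transcribed from the table above; keys are lower-cased data types.
SUITABLE_ENCODINGS = {
    "nominal": ["position", "color hue", "shape"],
    "ordinal": ["position", "color saturation", "size"],
    "interval": ["position", "length", "area (with caution)"],
    "ratio": ["position", "length", "area", "angle"],
    "temporal": ["position on x-axis", "animation"],
    "geographic": ["map position", "choropleth", "symbols"],
}


def encodings_for(data_type: str) -> list[str]:
    """Return the encodings the table lists for a data type."""
    key = data_type.strip().lower()
    if key not in SUITABLE_ENCODINGS:
        raise ValueError(f"Unknown data type: {data_type!r}")
    return SUITABLE_ENCODINGS[key]


def is_suitable(data_type: str, encoding: str) -> bool:
    """Check whether an encoding appears in the table row for a data type."""
    return encoding.strip().lower() in encodings_for(data_type)


# Hypothetical column typing for a survey extract (names are illustrative only).
columns = {"district": "geographic", "wealth_quintile": "ordinal", "coverage_rate": "ratio"}
for column, data_type in columns.items():
    print(f"{column} ({data_type}): {', '.join(encodings_for(data_type))}")
```

For example, is_suitable("nominal", "angle") returns False, which flags a mismatch before any chart is drawn.
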

Development Sector Data Challenges

Development data presents challenges rarely addressed in standard visualization courses. Acknowledging these constraints is essential for honest, effective communication.

Missing Values

Subnational data gaps, incomplete time series, non-response. Never interpolate without disclosure.

Methodology Changes

Redefined indicators, updated survey instruments, changed sampling. Breaks in time series must be marked.

Aggregation Masking

National averages hide regional inequality. Always ask: "Does this aggregate conceal important variation?"

Proxy Indicators

What we can measure ≠ what we want to measure. Document the gap between proxy and concept.

Small Samples

Disaggregated data often lacks statistical power. Confidence intervals are not optional.
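
To make the small-samples point concrete, the sketch below computes a confidence interval for a proportion such as a coverage rate. It uses the Wilson score interval, which is a choice made here for illustration (it behaves reasonably at small n), not a method prescribed by the course; the sample counts are hypothetical.

```python
from math import sqrt


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical counts: the same 40% estimate at national and district level.
for label, covered, sampled in [("National", 1200, 3000), ("District", 12, 30)]:
    low, high = wilson_interval(covered, sampled)
    print(f"{label}: {covered / sampled:.0%} (95% CI {low:.1%} to {high:.1%})")
```

Both rows report the same point estimate, but the district interval is far wider; drawing that interval as an error bar is what keeps the disaggregated chart honest.
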

Reporting Delays

"Latest data" may be 2-3 years old. Always show data vintage prominently.

Module 03

Visual Encoding: Graphical Perception

The scientific foundation for chart selection. Rank visual encodings by perceptual accuracy and predict which chart types will enable accurate comparisons.

Cleveland & McGill's Hierarchy

In 1984, William Cleveland and Robert McGill published experiments establishing a hierarchy of encoding accuracy. Their findings explain why bar charts generally outperform pie charts: people judge position along a common baseline more accurately than they judge angles.

Encoding Accuracy Hierarchy

From most to least accurate perception: Position on common scale → Position on non-aligned scales → Length → Angle/Slope → Area → Volume → Color saturation/density.

1. Position (common scale): Bar charts, dot plots, scatter plots
2. Position (non-aligned): Multiple separate charts
3. Length: Stacked bars, Gantt charts
4. Angle / Slope: Pie charts, line slopes
5. Area: Bubble charts, treemaps
6. Volume: 3D charts (avoid!)
7. Color saturation: Heatmaps, choropleths
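
The ranking above can be used to compare candidate chart types before committing to one. The following is a rough sketch: the ranks and the chart-to-encoding pairs are taken from the list above, while the identifiers and the idea of assigning each chart a single primary encoding are simplifying assumptions.

```python
# Ranks transcribed from the hierarchy above (1 = most accurately perceived).
ACCURACY_RANK = {
    "position_common_scale": 1,
    "position_non_aligned": 2,
    "length": 3,
    "angle_slope": 4,
    "area": 5,
    "volume": 6,
    "color_saturation": 7,
}

# Primary encoding per chart type, following the examples in the list above.
# Treating each chart as having one primary encoding is a simplification.
CHART_PRIMARY_ENCODING = {
    "bar chart": "position_common_scale",
    "dot plot": "position_common_scale",
    "scatter plot": "position_common_scale",
    "stacked bar": "length",
    "pie chart": "angle_slope",
    "bubble chart": "area",
    "treemap": "area",
    "heatmap": "color_saturation",
    "choropleth": "color_saturation",
}


def rank_charts(candidates: list[str]) -> list[str]:
    """Sort candidate chart types from most to least accurately read."""
    return sorted(candidates, key=lambda c: ACCURACY_RANK[CHART_PRIMARY_ENCODING[c]])


print(rank_charts(["pie chart", "heatmap", "bar chart", "bubble chart"]))
```

A ranking like this is a starting point rather than a verdict: a choropleth ranks low on accuracy but may still be the right choice when geographic pattern is the message.
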
Module 04

Color & Accessibility: Inclusive Design

Select appropriate palettes for sequential, diverging, and categorical data. Design for colorblind accessibility and adapt for cultural contexts.

Three Types of Color Palettes

Color palette selection is not aesthetic preference—it encodes information. Using the wrong palette type misleads viewers by implying relationships that don't exist.

Sequential

Light → Dark. For ordered data with no midpoint: population density, poverty rates, coverage percentages.

Diverging

Color ← Neutral → Color. For data with meaningful midpoint: change from baseline, deviation from target, above/below average.

Categorical

Distinct hues. For unordered groups: regions, program types, demographic categories. Maximum ~7-8 distinguishable colors.
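
The three palette types reduce to two questions: is the data ordered, and does it have a meaningful midpoint? The sketch below encodes that decision and shows how a diverging scale passes through a neutral color at the midpoint. The RGB endpoints are illustrative assumptions, and plain RGB interpolation is used for brevity even though it is not perceptually uniform.

```python
RGB = tuple[int, int, int]


def palette_type(ordered: bool, has_midpoint: bool = False) -> str:
    """Pick a palette type from the two properties described above."""
    if not ordered:
        return "categorical"
    return "diverging" if has_midpoint else "sequential"


def lerp_rgb(start: RGB, end: RGB, t: float) -> RGB:
    """Linearly interpolate between two RGB colors, with t in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(start, end))


def diverging_color(
    value: float,
    low: float,
    midpoint: float,
    high: float,
    low_rgb: RGB = (33, 102, 172),    # illustrative blue (assumption)
    mid_rgb: RGB = (247, 247, 247),   # near-white neutral (assumption)
    high_rgb: RGB = (178, 24, 43),    # illustrative red (assumption)
) -> RGB:
    """Map a value to a color that is neutral exactly at the midpoint."""
    if not low < midpoint < high:
        raise ValueError("expected low < midpoint < high")
    clamped = min(max(value, low), high)  # out-of-range values take the end colors
    if clamped <= midpoint:
        return lerp_rgb(low_rgb, mid_rgb, (clamped - low) / (midpoint - low))
    return lerp_rgb(mid_rgb, high_rgb, (clamped - midpoint) / (high - midpoint))


print(palette_type(ordered=True, has_midpoint=True))
print(diverging_color(0, low=-10, midpoint=0, high=10))
```

In practice a tested palette from an established source is preferable to hand-picked endpoints; the point here is only that the midpoint, not the data range, decides where the neutral color falls.
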

Module 05

Chart Selection: Frameworks & Common Mistakes

Use evidence-based frameworks to select chart types. Identify and correct the most common visualization errors found in development sector reports.

The Grammar of Graphics

Leland Wilkinson's Grammar of Graphics (1999) provides the theoretical foundation underlying ggplot2, Vega-Lite, and modern visualization thinking. Understanding this grammar enables creating novel chart types, not just selecting from menus.

Components of a Graphic

Any visualization decomposes into seven components: Data, Transformations, Coordinate System, Scales, Geometric Elements, Guides, and Facets. Master these building blocks and you can construct any chart.
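
To see the components side by side, the sketch below writes a small faceted bar chart as a Vega-Lite-style specification held in a Python dictionary. The mapping of each key to a grammar component is an interpretation offered for illustration, and the records are hypothetical values, not real program data.

```python
import json

# Hypothetical records, for illustration only (not real survey figures).
records = [
    {"region": "North", "program": "A", "year": 2020, "coverage": 62},
    {"region": "North", "program": "B", "year": 2020, "coverage": 48},
    {"region": "South", "program": "A", "year": 2020, "coverage": 55},
    {"region": "South", "program": "B", "year": 2019, "coverage": 41},
]

spec = {
    "data": {"values": records},                      # Data
    "transform": [{"filter": "datum.year == 2020"}],  # Transformations
    "mark": "bar",                                    # Geometric element
    "encoding": {
        # Coordinate system: Cartesian, implied by using the x and y channels.
        "x": {
            "field": "program",
            "type": "nominal",
            "axis": {"title": "Program"},             # Guide
        },
        "y": {
            "field": "coverage",
            "type": "quantitative",
            "scale": {"zero": True},                  # Scale
            "axis": {"title": "Coverage (%)"},        # Guide
        },
        "column": {"field": "region", "type": "nominal"},  # Facets
    },
}

as_json = json.dumps(spec, indent=2)
print(as_json)
```

Changing one component at a time (for example, swapping "bar" for "point", or moving "region" from the column channel to color) yields a different chart from the same data, which is the practical payoff of thinking in components rather than chart names.
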

Module 06

Design Process: Style Guides & Iteration

Apply iterative design methodology. Develop organizational style guide components that institutionalize visualization practice beyond individual skill.

The Design Sprint Methodology

Following Harvard CS171's approach, effective visualization emerges from structured iteration: paper sketching before digital tools, multiple alternatives before committing, structured critique at each stage.

1. Sketch

Paper prototypes first. No software until you've explored 3+ approaches by hand.

2. Prototype

Build functional draft in your tool of choice. Focus on data, not polish.

3. Critique

Structured feedback using specific criteria. Not "I like/don't like" but evidence-based review.

4. Refine

Iterate based on feedback. Polish only after structure and message are validated.

Module 07

Storytelling: Narrative & Audience

Apply Cole Nussbaumer Knaflic's six lessons systematically. Design for specific audiences (donors, communities, policymakers) with appropriate narrative structures.

The Six Lessons of Storytelling with Data

Cole Nussbaumer Knaflic's framework provides a systematic approach to data communication that transforms charts into stories.

1. Understand Context

Who is your audience? What do they need to know or do? What's the desired action?

2. Choose Display

Match visualization to message: simple text for 1-2 numbers, lines for trends, bars for comparisons.

3. Eliminate Clutter

Reduce cognitive load. Remove non-strategic elements that don't serve the message.

4. Draw Attention

Use preattentive attributes (color, size, position) strategically to guide the eye.

5. Think Like a Designer

Affordances, accessibility, visual hierarchy. Design for how people actually perceive.

6. Tell a Story

Beginning-middle-end structure. Narrative arc with tension and resolution.

Module 08

Tool Landscape: Selection & Evaluation

Survey the visualization tool ecosystem. Evaluate tools against specific use cases and understand when each is appropriate.

Tool Selection Matrix

Every tool has trade-offs. The right choice depends on your skill level, output format, reproducibility needs, and organizational context.

Tool | Best For | Limitations | Learning Curve
Excel / Google Sheets | Quick analysis, familiar interface, print reports | Limited chart types, manual formatting | Low
Tableau | Interactive dashboards, rapid exploration | Expensive licensing; free tier publishes all work publicly | Medium
Power BI | Microsoft ecosystem, DAX calculations | Steeper learning curve, Windows-centric | Medium-High
Datawrapper | News-style responsive charts, beginners | Limited customization, primarily static | Low
Flourish | Scrollytelling, animations, bar chart races | Templates can be constraining | Low-Medium
RAWGraphs | Unconventional chart types, SVG export | No hosting, static only, requires design skills | Low
R (ggplot2) | Publication-quality graphics, reproducibility | Programming required | High
Python (matplotlib, seaborn, plotly) | Data science workflows, dashboards | Multiple libraries to learn | High
D3.js | Custom interactive web visualization | JavaScript required, steep learning curve | Very High

The Scaffolding Approach

Stephanie Evergreen recommends a progression: start with what's familiar (Excel), build to more powerful tools (Tableau/Flourish), optionally extend to code-based approaches (R/Python) for reproducibility.

1. Excel/Google Sheets (Foundation): familiar, widely available
2. Datawrapper/Flourish (Web-ready): responsive, interactive basics
3. Tableau/Power BI (Enterprise): dashboards, exploration
4. R/Python (Reproducible): code-based pipelines
5. D3.js/Observable (Custom): full control, web interactivity
Module 09

Interactive Visualization: Web & Dynamic

Implement Shneiderman's visual information-seeking mantra. Create interactive web visualizations and understand when interactivity adds value versus complexity.

Shneiderman's Mantra

Ben Shneiderman's information-seeking mantra guides effective interactive design: "Overview first, zoom and filter, then details-on-demand."

1. Overview First

Start with the big picture. Let users see the entire dataset or summary before drilling down.

2. Zoom & Filter

Enable users to focus on subsets of interest. Region, time period, category—progressive refinement.

3. Details on Demand

Hover, click, or tap to reveal specifics. Don't clutter the view—reveal context when requested.
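
The three stages of the mantra map naturally onto three small functions over the same dataset. This is a minimal sketch with hypothetical records and function names; a real dashboard would wire these to chart interactions rather than print statements.

```python
from collections import defaultdict

# Hypothetical project records, for illustration only.
RECORDS = [
    {"id": 1, "region": "North", "year": 2021, "beneficiaries": 120},
    {"id": 2, "region": "North", "year": 2022, "beneficiaries": 150},
    {"id": 3, "region": "South", "year": 2021, "beneficiaries": 90},
    {"id": 4, "region": "South", "year": 2022, "beneficiaries": 60},
]


def overview(records):
    """Overview first: summarize the whole dataset (totals per region)."""
    totals = defaultdict(int)
    for record in records:
        totals[record["region"]] += record["beneficiaries"]
    return dict(totals)


def zoom_and_filter(records, **criteria):
    """Zoom and filter: keep records matching every given field value."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]


def details_on_demand(records, record_id):
    """Details on demand: return one full record, or None if it is absent."""
    return next((r for r in records if r["id"] == record_id), None)


print(overview(RECORDS))
print(zoom_and_filter(RECORDS, region="South", year=2022))
print(details_on_demand(RECORDS, 3))
```

Keeping the stages as separate functions mirrors the design advice: the overview never depends on the current filter, so users can always return to the big picture.
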

Module 10

M&E Dashboards: Development Applications

Design dashboards following Stephen Few's principles. Apply DHIS2 and WHO standards for M&E frameworks, theory of change communication, and annual reporting.

Dashboard Definition

Stephen Few defines a dashboard as: "A visual display of the most important information needed to achieve one or more objectives; consolidated and arranged on a single screen so information can be monitored at a glance."

Key Insight

A dashboard is not a report with charts. It's a monitoring interface for ongoing decision-making. If users need to scroll or click through pages, you've built a report, not a dashboard.

Module 11

Advanced Topics: Specialized Approaches

Create effective geographic visualizations, network diagrams, and apply Data Feminism principles to challenge power dynamics in visualization.

Geographic Visualization

Maps are powerful but dangerous. Choropleth maps (color-filled regions) create visual bias toward large regions regardless of population.

Choropleth Maps

Best for: rates, percentages, density. Caution: large areas dominate visually regardless of importance.

Proportional Symbols

Best for: counts, totals. Circles sized by value placed at geographic points.

Cartograms

Distort geography to size by variable (population, GDP). Corrects area bias but sacrifices familiarity.

Hex Bin Maps

Equal-area hexagons for fair comparison. Used for electoral maps (each hex = one district); this one-hexagon-per-region layout is also called a hex tile map.
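
Two calculations sit behind the map types above: converting counts to rates before shading a choropleth, and sizing proportional symbols so that circle area (not radius) tracks the value. The sketch below shows both; the district figures and the 30-pixel maximum radius are hypothetical.

```python
from math import sqrt


def rate_per_100k(count: float, population: float) -> float:
    """Normalize a count by population so regions are comparable on a choropleth."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


def symbol_radius(value: float, max_value: float, max_radius: float = 30.0) -> float:
    """Radius for a proportional symbol whose AREA is proportional to value."""
    if max_value <= 0 or value < 0:
        raise ValueError("expected value >= 0 and max_value > 0")
    return max_radius * sqrt(value / max_value)


# Hypothetical districts: (name, case count, population).
districts = [("District A", 50, 200_000), ("District B", 400, 4_000_000)]
largest_count = max(count for _, count, _ in districts)

for name, count, population in districts:
    print(
        f"{name}: {rate_per_100k(count, population):.1f} per 100k, "
        f"symbol radius {symbol_radius(count, largest_count):.1f}px"
    )
```

In this made-up example District B has eight times the cases but a lower rate, so a choropleth of raw counts and a choropleth of rates would tell opposite stories. Scaling radius linearly instead of by the square root would also exaggerate District B, since its circle would cover 64 times the area for 8 times the value.
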

Module 12

Capstone Project: Portfolio Development

Complete an end-to-end visualization project from data acquisition through published output. Develop a professional portfolio and give structured critique.

Capstone Timeline

The capstone project spans 3-4 weeks with milestone submissions at each stage.

Week 1: Proposal

Identify dataset, define audience and purpose, sketch 3+ visualization approaches.

Week 2: Prototype

Build functional draft visualizations, document design decisions, peer review exchange.

Week 3: Refinement

Incorporate feedback, polish design, prepare presentation materials.

Week 4: Presentation

Present to class and external reviewers; publish to portfolio.