Seeing Data: Visualization for Impact

Module 01

Why Visualize? Purpose & Principles

Understanding when and why visualization matters for policy influence and accountability. From Anscombe's quartet to Hans Rosling's Gapminder—discover the power of seeing data.

The Case for Visualization

Data visualization is not decoration—it is a cognitive tool that extends human perception. Our visual system processes images in parallel, detecting patterns, outliers, and relationships that would take hours to discover in tables of numbers.

Anscombe's Quartet: The Classic Demonstration

In 1973, statistician Francis Anscombe created four datasets with nearly identical summary statistics (mean, variance, correlation, and regression line) that nonetheless look radically different when plotted. The quartet remains the classic argument for plotting your data before analyzing it.

Figure: Anscombe's Quartet: Same Statistics, Different Stories (interactive chart)

Source: Francis Anscombe (1973). All four datasets have nearly identical summary statistics but tell entirely different stories.
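
The claim about the quartet can be checked numerically. The sketch below uses Anscombe's published values and only the Python standard library; the names `QUARTET`, `summarize`, and `SUMMARIES` are illustrative choices, not part of any course material.

```python
from statistics import mean, variance

# Anscombe's (1973) published data. Sets I-III share the same x values.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_SHARED,
          [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED,
           [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED,
            [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return the summary statistics the four datasets are built to share."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx  # ordinary least-squares slope of y on x
    return {
        "mean_x": mx,
        "mean_y": my,
        "var_x": variance(x),  # sample variance (n - 1 denominator)
        "var_y": variance(y),
        "corr": sxy / (sxx * syy) ** 0.5,  # Pearson correlation
        "slope": slope,
        "intercept": my - slope * mx,
    }


SUMMARIES = {name: summarize(x, y) for name, (x, y) in QUARTET.items()}

if __name__ == "__main__":
    for name, stats in SUMMARIES.items():
        print(name, {key: round(value, 2) for key, value in stats.items()})
```

The four printed rows agree to about two decimal places, which is exactly why a table of summary statistics cannot distinguish a linear trend, a curve, and two outlier-driven fits.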

Module 02

Data Types & Quality: Structures & Challenges

Classify data correctly, recognize development sector data quality challenges, and understand how data collection choices constrain visualization options.

Data Type Classification

Every visualization decision begins with understanding your data type. An encoding that works for quantitative data can fail for categorical data; a chart that suits a time series can mislead when used for a cross-sectional comparison.

Data Type	Definition	Examples	Suitable Encodings
Nominal	Categories without order	Country names, program types, gender categories	Position, color hue, shape
Ordinal	Categories with meaningful order	Education levels, satisfaction ratings, wealth quintiles	Position, color saturation, size
Interval	Numeric with arbitrary zero	Temperature (°C), dates, index scores	Position, length, area (with caution)
Ratio	Numeric with true zero	Income, population, coverage rates	Position, length, area, angle
Temporal	Time-based sequences	Years, months, project phases	Position on x-axis, animation
Geographic	Spatial coordinates or regions	Districts, GPS points, administrative boundaries	Map position, choropleth, symbols
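
One way to put the table to work is as a lookup that a review script could consult before a chart is built. This is a minimal sketch that simply transcribes the table; the names `SUITABLE_ENCODINGS`, `suitable_encodings`, and `is_suitable` are hypothetical.

```python
# Transcribed from the classification table above (lowercased for lookup).
SUITABLE_ENCODINGS = {
    "nominal": ["position", "color hue", "shape"],
    "ordinal": ["position", "color saturation", "size"],
    "interval": ["position", "length", "area (with caution)"],
    "ratio": ["position", "length", "area", "angle"],
    "temporal": ["position on x-axis", "animation"],
    "geographic": ["map position", "choropleth", "symbols"],
}


def suitable_encodings(data_type: str) -> list:
    """Return the encodings the table lists for a data type."""
    key = data_type.strip().lower()
    if key not in SUITABLE_ENCODINGS:
        raise ValueError(f"Unknown data type: {data_type!r}")
    return SUITABLE_ENCODINGS[key]


def is_suitable(data_type: str, encoding: str) -> bool:
    """Check whether the table lists this encoding for this data type."""
    return encoding.strip().lower() in suitable_encodings(data_type)


if __name__ == "__main__":
    print(suitable_encodings("Ordinal"))
    print(is_suitable("nominal", "length"))  # length is not listed for nominal
```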

Development Sector Data Challenges

Development data presents challenges rarely addressed in standard visualization courses. Acknowledging these constraints is essential for honest, effective communication.

Missing Values

Subnational data gaps, incomplete time series, non-response. Never interpolate without disclosure.

Methodology Changes

Redefined indicators, updated survey instruments, changed sampling. Breaks in time series must be marked.

Aggregation Masking

National averages hide regional inequality. Always ask: "Does this aggregate conceal important variation?"

Proxy Indicators

What we can measure ≠ what we want to measure. Document the gap between proxy and concept.

Small Samples

Disaggregated data often lacks statistical power. Confidence intervals are not optional.

Reporting Delays

"Latest data" may be 2-3 years old. Always show data vintage prominently.

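The small-samples point can be made concrete with an interval calculation. The sketch below uses the Wilson score interval, which is one common choice for proportions and is an assumption here, not something the course prescribes; the sample sizes and counts are invented for illustration.

```python
from math import sqrt


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a proportion; z = 1.96 gives roughly 95%."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


# Hypothetical survey: the same 60% coverage estimate at two sample sizes.
NATIONAL = wilson_interval(successes=1200, n=2000)
DISTRICT = wilson_interval(successes=15, n=25)

if __name__ == "__main__":
    for label, (low, high) in (("national", NATIONAL), ("district", DISTRICT)):
        print(f"{label}: {low:.3f} to {high:.3f} (width {high - low:.3f})")
```

Both estimates are 60%, but the district interval is several times wider than the national one, which is the reason a disaggregated chart without intervals overstates what the data can support.
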
Module 03

Visual Encoding: Graphical Perception

The scientific foundation for chart selection. Rank visual encodings by perceptual accuracy and predict which chart types will enable accurate comparisons.

Cleveland & McGill's Hierarchy

In 1984, William Cleveland and Robert McGill published experiments establishing a hierarchy of encoding accuracy. Their findings help explain why bar charts generally outperform pie charts: readers judge position along a common baseline more accurately than they judge angles.

Encoding Accuracy Hierarchy

From most to least accurate perception: Position on common scale → Position on non-aligned scales → Length → Angle/Slope → Area → Volume → Color saturation/density.

1. Position (common scale)

Bar charts, dot plots, scatter plots

2. Position (non-aligned)

Multiple separate charts

3. Length

Stacked bars, Gantt charts

4. Angle / Slope

Pie charts, line slopes

5. Area

Bubble charts, treemaps

6. Volume

3D charts (avoid!)

7. Color saturation

Heatmaps, choropleths
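
The ranking above can be encoded as an ordered list so that two candidate charts are compared by the encoding each relies on. This is a minimal sketch: the chart-to-encoding mapping is taken from the examples in the list, and the names `ACCURACY_RANKING`, `CHART_ENCODING`, `rank`, and `prefer` are hypothetical.

```python
# Most accurate first, as listed in the hierarchy above.
ACCURACY_RANKING = [
    "position (common scale)",
    "position (non-aligned)",
    "length",
    "angle / slope",
    "area",
    "volume",
    "color saturation",
]

# Primary encoding for a few chart types, taken from the examples above.
CHART_ENCODING = {
    "bar chart": "position (common scale)",
    "dot plot": "position (common scale)",
    "scatter plot": "position (common scale)",
    "stacked bar": "length",
    "pie chart": "angle / slope",
    "bubble chart": "area",
    "treemap": "area",
    "heatmap": "color saturation",
    "choropleth": "color saturation",
}


def rank(chart: str) -> int:
    """Return 1 (most accurate) to 7 (least accurate) for a chart type."""
    encoding = CHART_ENCODING[chart.strip().lower()]
    return ACCURACY_RANKING.index(encoding) + 1


def prefer(chart_a: str, chart_b: str) -> str:
    """Return whichever chart relies on the more accurately read encoding."""
    return min(chart_a, chart_b, key=rank)


if __name__ == "__main__":
    print(prefer("pie chart", "bar chart"))
    print(rank("treemap"))
```

The ranking covers perceptual accuracy only; a lower-ranked encoding can still be the right choice when the message is, for example, a geographic pattern.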

Module 04

Color & Accessibility: Inclusive Design

Select appropriate palettes for sequential, diverging, and categorical data. Design for colorblind accessibility and adapt for cultural contexts.

Three Types of Color Palettes

Color palette selection is not aesthetic preference—it encodes information. Using the wrong palette type misleads viewers by implying relationships that don't exist.

Sequential

Light → Dark. For ordered data with no midpoint: population density, poverty rates, coverage percentages.

Diverging

Color ← Neutral → Color. For data with meaningful midpoint: change from baseline, deviation from target, above/below average.

Categorical

Distinct hues. For unordered groups: regions, program types, demographic categories. Maximum ~7-8 distinguishable colors.
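
The three palette rules reduce to two questions: is the data ordered, and does it have a meaningful midpoint? The sketch below encodes that decision; the function names and the default limit of 8 categories (the upper end of the "~7-8" guideline above) are illustrative assumptions.

```python
def choose_palette(ordered: bool, has_midpoint: bool = False) -> str:
    """Pick a palette type from the two properties described above."""
    if not ordered:
        return "categorical"  # unordered groups: distinct hues
    if has_midpoint:
        return "diverging"  # two hues meeting at a neutral midpoint
    return "sequential"  # light-to-dark ramp


def too_many_categories(n_categories: int, limit: int = 8) -> bool:
    """Flag categorical palettes that exceed the distinguishable-color limit."""
    return n_categories > limit


if __name__ == "__main__":
    print(choose_palette(ordered=True))                     # poverty rates
    print(choose_palette(ordered=True, has_midpoint=True))  # change from baseline
    print(choose_palette(ordered=False))                    # regions
    print(too_many_categories(12))
```

When `too_many_categories` returns True, grouping minor categories into "Other" or switching to small multiples is usually safer than adding more hues.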

Module 05

Chart Selection: Frameworks & Common Mistakes

Use evidence-based frameworks to select chart types. Identify and correct the most common visualization errors found in development sector reports.

The Grammar of Graphics

Leland Wilkinson's Grammar of Graphics (1999) provides the theoretical foundation underlying ggplot2, Vega-Lite, and modern visualization thinking. Understanding this grammar enables creating novel chart types, not just selecting from menus.

Components of a Graphic

Any visualization decomposes into: Data → Transformations → Coordinate System → Scales → Geometric Elements → Guides → Facets. Master these building blocks and you can construct any chart.
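
To see what "decomposes into" means in practice, the sketch below models the seven components as fields of a plain data class and derives one chart from another by changing a single component. It is not the API of ggplot2 or Vega-Lite; `GraphicSpec`, its field values, and `changed_components` are hypothetical names used only for illustration. The example relies on the observation, associated with Wilkinson's grammar, that a pie chart is a stacked bar drawn in polar coordinates.

```python
from dataclasses import asdict, dataclass, field, replace


@dataclass(frozen=True)
class GraphicSpec:
    """One field per component named in the paragraph above."""
    data: str
    transformations: tuple = ()
    coordinate_system: str = "cartesian"
    scales: dict = field(default_factory=dict)
    geometry: str = "point"
    guides: tuple = ()
    facets: tuple = ()


def changed_components(a: GraphicSpec, b: GraphicSpec) -> list:
    """List the components in which two specifications differ."""
    a_dict, b_dict = asdict(a), asdict(b)
    return [name for name in a_dict if a_dict[name] != b_dict[name]]


# Hypothetical dataset name and settings.
STACKED_BAR = GraphicSpec(
    data="program_budget",
    transformations=("sum by program", "stack"),
    scales={"y": "linear", "fill": "categorical"},
    geometry="bar",
    guides=("legend",),
)

# Same data, transformations, scales, and geometry; only the coordinates change.
PIE = replace(STACKED_BAR, coordinate_system="polar")

if __name__ == "__main__":
    print(changed_components(STACKED_BAR, PIE))
```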

Module 06

Design Process: Style Guides & Iteration

Apply iterative design methodology. Develop organizational style guide components that institutionalize visualization practice beyond individual skill.

The Design Sprint Methodology

Following Harvard CS171's approach, effective visualization emerges from structured iteration: paper sketching before digital tools, multiple alternatives before committing, structured critique at each stage.

1. Sketch

Paper prototypes first. No software until you've explored 3+ approaches by hand.

2. Prototype

Build functional draft in your tool of choice. Focus on data, not polish.

3. Critique

Structured feedback using specific criteria. Not "I like/don't like" but evidence-based review.

4. Refine

Iterate based on feedback. Polish only after structure and message are validated.

Module 07

Storytelling: Narrative & Audience

Apply Cole Nussbaumer Knaflic's six lessons systematically. Design for specific audiences (donors, communities, policymakers) with appropriate narrative structures.

The Six Lessons of Storytelling with Data

Cole Nussbaumer Knaflic's framework provides a systematic approach to data communication that transforms charts into stories.

1. Understand Context

Who is your audience? What do they need to know or do? What's the desired action?

2. Choose Display

Match visualization to message: simple text for 1-2 numbers, lines for trends, bars for comparisons.

3. Eliminate Clutter

Reduce cognitive load. Remove non-strategic elements that don't serve the message.

4. Draw Attention

Use preattentive attributes (color, size, position) strategically to guide the eye.

5. Think Like a Designer

Affordances, accessibility, visual hierarchy. Design for how people actually perceive.

6. Tell a Story

Beginning-middle-end structure. Narrative arc with tension and resolution.

Module 08

Tool Landscape: Selection & Evaluation

Survey the visualization tool ecosystem. Evaluate tools against specific use cases and understand when each is appropriate.

Tool Selection Matrix

Every tool has trade-offs. The right choice depends on your skill level, output format, reproducibility needs, and organizational context.

Tool	Best For	Limitations	Learning Curve
Excel / Google Sheets	Quick analysis, familiar interface, print reports	Limited chart types, manual formatting	Low
Tableau	Interactive dashboards, rapid exploration	Expensive licensing; free tier publishes all work publicly	Medium
Power BI	Microsoft ecosystem, DAX calculations	Steeper learning curve, Windows-centric	Medium-High
Datawrapper	News-style responsive charts, beginners	Limited customization, primarily static	Low
Flourish	Scrollytelling, animations, bar chart races	Templates can be constraining	Low-Medium
RAWGraphs	Unconventional chart types, SVG export	No hosting, static only, requires design skills	Low
R (ggplot2)	Publication-quality graphics, reproducibility	Programming required	High
Python (matplotlib, seaborn, plotly)	Data science workflows, dashboards	Multiple libraries to learn	High
D3.js	Custom interactive web visualization	JavaScript required, steep learning curve	Very High

The Scaffolding Approach

Stephanie Evergreen recommends a progression: start with what's familiar (Excel), build to more powerful tools (Tableau/Flourish), optionally extend to code-based approaches (R/Python) for reproducibility.

1. Excel/Google Sheets

Foundation: familiar, widely available

2. Datawrapper/Flourish

Web-ready: responsive, interactive basics

3. Tableau/Power BI

Enterprise: dashboards, exploration

4. R/Python

Reproducible: code-based pipelines

5. D3.js/Observable

Custom: full control, web interactivity

Module 09

Interactive Visualization: Web & Dynamic

Implement Shneiderman's visual information-seeking mantra. Create interactive web visualizations and understand when interactivity adds value versus complexity.

Shneiderman's Mantra

Ben Shneiderman's information-seeking mantra guides effective interactive design: "Overview first, zoom and filter, then details-on-demand."

1. Overview First

Start with the big picture. Let users see the entire dataset or summary before drilling down.

2. Zoom & Filter

Enable users to focus on subsets of interest. Region, time period, category—progressive refinement.

3. Details on Demand

Hover, click, or tap to reveal specifics. Don't clutter the view—reveal context when requested.
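
The three steps map naturally onto three functions over the same dataset. The sketch below is a toy illustration with no user interface; the records, field names, and function names are all invented.

```python
from statistics import mean

# Hypothetical district-level coverage records.
RECORDS = [
    {"district": "A", "region": "North", "year": 2022, "coverage": 0.62},
    {"district": "B", "region": "North", "year": 2022, "coverage": 0.48},
    {"district": "C", "region": "South", "year": 2022, "coverage": 0.81},
    {"district": "D", "region": "South", "year": 2022, "coverage": 0.77},
]


def overview(records):
    """Step 1: summarize the whole dataset before any drilling down."""
    values = [r["coverage"] for r in records]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }


def zoom_and_filter(records, **criteria):
    """Step 2: keep only records matching every given field value."""
    return [r for r in records
            if all(r.get(key) == value for key, value in criteria.items())]


def details_on_demand(records, district):
    """Step 3: return the full record for one district, or None if absent."""
    return next((r for r in records if r["district"] == district), None)


if __name__ == "__main__":
    print(overview(RECORDS))
    print(zoom_and_filter(RECORDS, region="North"))
    print(details_on_demand(RECORDS, "C"))
```

In an interactive chart the same three functions would typically sit behind the initial view, the filter controls, and the tooltip respectively.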

Module 10

M&E Dashboards: Development Applications

Design dashboards following Stephen Few's principles. Apply DHIS2 and WHO standards for M&E frameworks, theory of change communication, and annual reporting.

Dashboard Definition

Stephen Few defines a dashboard as: "A visual display of the most important information needed to achieve one or more objectives; consolidated and arranged on a single screen so information can be monitored at a glance."

Key Insight

A dashboard is not a report with charts. It's a monitoring interface for ongoing decision-making. If users need to scroll or click through pages, you've built a report, not a dashboard.

Module 11

Advanced Topics: Specialized Approaches

Create effective geographic visualizations, network diagrams, and apply Data Feminism principles to challenge power dynamics in visualization.

Geographic Visualization

Maps are powerful but dangerous. Choropleth maps (color-filled regions) create visual bias toward large regions regardless of population.

Choropleth Maps

Best for: rates, percentages, density. Caution: large areas dominate visually regardless of importance.

Proportional Symbols

Best for: counts, totals. Circles sized by value placed at geographic points.

Cartograms

Distort geography to size by variable (population, GDP). Corrects area bias but sacrifices familiarity.

Hex Bin Maps

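Equal-area hexagons for fair comparison, so no region dominates by land area. Common for electoral maps, where each hexagon stands for one district (this variant is often called a hex tile map).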
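
Two small calculations sit behind the first two map types above. A choropleth should shade a rate, so raw counts are first normalized by population. A proportional symbol should be read by its area, so a common convention (an assumption here, not stated in the course text) is to scale the radius with the square root of the value. The function names and figures below are illustrative.

```python
from math import sqrt


def rate_per_1000(count: float, population: float) -> float:
    """Convert a raw count to a rate suitable for choropleth shading."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000 * count / population


def symbol_radius(value: float, max_value: float, max_radius: float = 30.0) -> float:
    """Radius for a proportional circle whose AREA is proportional to value."""
    if max_value <= 0 or value < 0:
        raise ValueError("value must be >= 0 and max_value must be positive")
    return max_radius * sqrt(value / max_value)


if __name__ == "__main__":
    # Hypothetical district: 50 cases in a population of 10,000.
    print(rate_per_1000(50, 10_000))
    # A value one quarter of the maximum gets half the maximum radius.
    print(symbol_radius(25, 100))
```

Scaling the radius linearly instead would make a value four times larger look sixteen times larger by area, exaggerating differences between places.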

Module 12

Capstone Project: Portfolio Development

Complete an end-to-end visualization project from data acquisition through published output. Develop a professional portfolio and give structured critique.

Capstone Timeline

The capstone project spans 3-4 weeks with milestone submissions at each stage.

Week 1: Proposal

Identify dataset, define audience and purpose, sketch 3+ visualization approaches.

Week 2: Prototype

Build functional draft visualizations, document design decisions, peer review exchange.

Week 3: Refinement

Incorporate feedback, polish design, prepare presentation materials.

Week 4: Presentation

Present to class and external reviewers; publish to portfolio.