Seeing Data: Visualization for Impact
Data Visualization Masterclass
Master the art and science of data visualization for development. From perceptual foundations to interactive dashboards—learn to communicate data that drives decisions and creates change.
Why Visualize? Purpose & Principles
Understanding when and why visualization matters for policy influence and accountability. From Anscombe's quartet to Hans Rosling's Gapminder—discover the power of seeing data.
The Case for Visualization
Data visualization is not decoration—it is a cognitive tool that extends human perception. Our visual system processes images in parallel, detecting patterns, outliers, and relationships that would take hours to discover in tables of numbers.
Anscombe's Quartet: The Classic Demonstration
In 1973, statistician Francis Anscombe published four datasets with nearly identical summary statistics (means, variances, correlation, and fitted regression line) that nonetheless look radically different when plotted. The quartet remains the classic argument for always visualizing your data before analysis.
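The quartet's point can be checked numerically before any plotting. The sketch below uses only the Python standard library and Anscombe's published values; the helper name `summarize` and the dictionary layout are our own choices for illustration.

```python
from math import sqrt

# Anscombe's quartet (1973). Datasets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return means, sample variances, Pearson r, and the least-squares line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return {
        "mean_x": mean_x, "mean_y": mean_y,
        "var_x": sxx / (n - 1), "var_y": syy / (n - 1),
        "r": sxy / sqrt(sxx * syy),
        "slope": slope, "intercept": mean_y - slope * mean_x,
    }


SUMMARIES = {name: summarize(x, y) for name, (x, y) in QUARTET.items()}

# Print one row per dataset; the rows agree to about two decimal places.
for name, stats in SUMMARIES.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

The four printed rows are nearly indistinguishable, which is exactly why the summary table alone cannot tell you that one dataset is curved and another is driven by a single outlier.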
Data Types & Quality: Structures & Challenges
Classify data correctly, recognize development sector data quality challenges, and understand how data collection choices constrain visualization options.
Data Type Classification
Every visualization decision begins with understanding your data type. An encoding that works for quantitative data can fail for categorical data, and a chart that suits a time series can mislead when used for cross-sectional comparisons.
| Data Type | Definition | Examples | Suitable Encodings |
|---|---|---|---|
| Nominal | Categories without order | Country names, program types, gender categories | Position, color hue, shape |
| Ordinal | Categories with meaningful order | Education levels, satisfaction ratings, wealth quintiles | Position, color saturation, size |
| Interval | Numeric with arbitrary zero | Temperature (°C), dates, index scores | Position; length and area with caution (no true zero) |
| Ratio | Numeric with true zero | Income, population, coverage rates | Position, length, area, angle |
| Temporal | Time-based sequences | Years, months, project phases | Position on x-axis, animation |
| Geographic | Spatial coordinates or regions | Districts, GPS points, administrative boundaries | Map position, choropleth, symbols |
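The table above can be treated as a lookup when reviewing a draft chart. The following is a minimal sketch that transcribes the table into a dictionary; the function name `check_encoding` is hypothetical, and the check is only as good as the table itself.

```python
# Suitable encodings per data type, transcribed from the table above.
SUITABLE_ENCODINGS = {
    "nominal": {"position", "color hue", "shape"},
    "ordinal": {"position", "color saturation", "size"},
    "interval": {"position", "length", "area"},
    "ratio": {"position", "length", "area", "angle"},
    "temporal": {"position on x-axis", "animation"},
    "geographic": {"map position", "choropleth", "symbols"},
}


def check_encoding(data_type: str, encoding: str) -> bool:
    """Return True if the table lists `encoding` as suitable for `data_type`."""
    key = data_type.lower()
    if key not in SUITABLE_ENCODINGS:
        raise ValueError(f"unknown data type: {data_type!r}")
    return encoding.lower() in SUITABLE_ENCODINGS[key]


print(check_encoding("Nominal", "shape"))   # listed in the table
print(check_encoding("Nominal", "angle"))   # not listed for nominal data
```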
Development Sector Data Challenges
Development data presents challenges rarely addressed in standard visualization courses. Acknowledging these constraints is essential for honest, effective communication.
Missing Values
Subnational data gaps, incomplete time series, non-response. Never interpolate without disclosure.
Methodology Changes
Redefined indicators, updated survey instruments, changed sampling. Breaks in time series must be marked.
Aggregation Masking
National averages hide regional inequality. Always ask: "Does this aggregate conceal important variation?"
Proxy Indicators
What we can measure ≠ what we want to measure. Document the gap between proxy and concept.
Small Samples
Disaggregated data often lacks statistical power. Confidence intervals are not optional.
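To make the small-sample point concrete, here is a hedged sketch of a Wilson score interval for a proportion such as a coverage rate. It assumes simple random sampling; clustered or weighted survey designs need design-effect adjustments that this sketch does not attempt. The function name and the example counts are illustrative, not drawn from any real survey.

```python
from math import sqrt


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a proportion; z=1.96 gives roughly 95% confidence."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# The same 50% point estimate from a district sample of 10 and a national sample of 1000.
small = wilson_interval(5, 10)
large = wilson_interval(500, 1000)
print("n=10:  ", [round(v, 3) for v in small])
print("n=1000:", [round(v, 3) for v in large])
```

Both samples report 50%, but the interval for the ten-respondent district is far wider than the national one, and a chart that omits it overstates what the district figure can support.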
Reporting Delays
"Latest data" may be 2-3 years old. Always show data vintage prominently.
Visual Encoding: Graphical Perception
The scientific foundation for chart selection. Rank visual encodings by perceptual accuracy and predict which chart types will enable accurate comparisons.
Cleveland & McGill's Hierarchy
In 1984, William Cleveland and Robert McGill published experiments establishing a hierarchy of encoding accuracy. Their findings help explain why bar charts usually outperform pie charts: people judge position along a common scale more accurately than they judge angles.
From most to least accurate perception: Position on common scale → Position on non-aligned scales → Length → Angle/Slope → Area → Volume → Color saturation/density.
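The ranking above can be used to compare candidate chart types. In the sketch below, the `HIERARCHY` list follows the order given in the text, while the mapping from chart type to primary encoding is our own simplifying assumption (real charts often combine several encodings).

```python
# Encodings ordered from most to least accurately perceived, as listed above.
HIERARCHY = [
    "position on common scale",
    "position on non-aligned scales",
    "length",
    "angle/slope",
    "area",
    "volume",
    "color saturation/density",
]

# Assumed primary encoding per chart type (illustrative, not from the study).
PRIMARY_ENCODING = {
    "dot plot": "position on common scale",
    "bar chart": "position on common scale",
    "stacked bar (inner segments)": "length",
    "pie chart": "angle/slope",
    "bubble chart": "area",
    "choropleth": "color saturation/density",
}


def rank_charts(chart_types):
    """Sort chart types so the most accurately decoded encoding comes first."""
    return sorted(chart_types, key=lambda c: HIERARCHY.index(PRIMARY_ENCODING[c]))


print(rank_charts(["pie chart", "bubble chart", "bar chart"]))
```

The ranking concerns accuracy of quantitative comparison only; a lower-ranked encoding such as color on a map may still be the right choice when the geographic pattern is the message.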
Color & Accessibility: Inclusive Design
Select appropriate palettes for sequential, diverging, and categorical data. Design for colorblind accessibility and adapt for cultural contexts.
Three Types of Color Palettes
Color palette selection is not aesthetic preference—it encodes information. Using the wrong palette type misleads viewers by implying relationships that don't exist.
Sequential
Light → Dark. For ordered data with no midpoint: population density, poverty rates, coverage percentages.
Diverging
Color ← Neutral → Color. For data with meaningful midpoint: change from baseline, deviation from target, above/below average.
Categorical
Distinct hues. For unordered groups: regions, program types, demographic categories. Maximum ~7-8 distinguishable colors.
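The structural difference between sequential and diverging palettes can be sketched in a few lines. This example blends hex colors linearly in RGB, which is a simplification: published palettes are tuned for perceptual uniformity and colorblind safety, so treat this as an illustration of palette shape, not a recommended palette generator. All function names and example colors are hypothetical.

```python
def hex_to_rgb(color: str):
    """Convert '#rrggbb' to a tuple of three integers in 0..255."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def lerp_color(start: str, end: str, t: float) -> str:
    """Linearly blend two hex colors in RGB; t=0 gives start, t=1 gives end."""
    a, b = hex_to_rgb(start), hex_to_rgb(end)
    return "#" + "".join(f"{round(x + (y - x) * t):02x}" for x, y in zip(a, b))


def sequential(light: str, dark: str, n: int):
    """n steps from light to dark, for ordered data with no midpoint."""
    if n < 2:
        raise ValueError("need at least two steps")
    return [lerp_color(light, dark, i / (n - 1)) for i in range(n)]


def diverging(low: str, neutral: str, high: str, n: int):
    """n steps (odd n) running low -> neutral -> high around a meaningful midpoint."""
    if n < 3 or n % 2 == 0:
        raise ValueError("use an odd n >= 3 so the midpoint gets its own color")
    half = n // 2
    left = [lerp_color(low, neutral, i / half) for i in range(half)]
    right = [lerp_color(neutral, high, i / half) for i in range(1, half + 1)]
    return left + [neutral.lower()] + right


print(sequential("#ffffff", "#08306b", 5))            # e.g. poverty rate
print(diverging("#2166ac", "#f7f7f7", "#b2182b", 5))  # e.g. change from baseline
```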
Chart Selection: Frameworks & Common Mistakes
Use evidence-based frameworks to select chart types. Identify and correct the most common visualization errors found in development sector reports.
The Grammar of Graphics
Leland Wilkinson's Grammar of Graphics (1999) provides the theoretical foundation underlying ggplot2, Vega-Lite, and modern visualization thinking. Understanding this grammar lets you construct novel chart types rather than only selecting from menus.
Any visualization decomposes into: Data → Transformations → Coordinate System → Scales → Geometric Elements → Guides → Facets. Master these building blocks and you can construct any chart.
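One way to see the decomposition above is to write a chart as a declarative specification. The sketch below builds a Vega-Lite style spec as a plain Python dictionary and serializes it to JSON; it does not render anything. The records, field names, and the helper `build_spec` are hypothetical, and the coordinate system is left at Vega-Lite's Cartesian default.

```python
import json
from typing import Optional

# Hypothetical illustrative records, not real survey data.
RECORDS = [
    {"region": "North", "year": 2020, "coverage": 62},
    {"region": "North", "year": 2021, "coverage": 68},
    {"region": "South", "year": 2020, "coverage": 48},
    {"region": "South", "year": 2021, "coverage": 55},
]


def build_spec(mark: str = "bar", facet_by: Optional[str] = None) -> dict:
    """Assemble a chart from grammar components instead of picking a chart type."""
    spec = {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {"values": RECORDS},                        # Data
        "mark": mark,                                       # Geometric element
        "encoding": {
            "x": {"field": "year", "type": "ordinal",
                  "axis": {"title": "Year"}},               # Guide
            "y": {"aggregate": "mean", "field": "coverage",  # Transformation
                  "type": "quantitative",
                  "scale": {"domain": [0, 100]},            # Scale
                  "axis": {"title": "Mean coverage (%)"}},
        },
    }
    if facet_by is not None:
        # Facets: one panel per value of the chosen field.
        spec["encoding"]["column"] = {"field": facet_by, "type": "nominal"}
    return spec


# Same data and scales, different geometry and faceting.
bar_spec = build_spec("bar")
line_spec = build_spec("line", facet_by="region")
print(json.dumps(line_spec, indent=2))
```

Changing only `mark` or `facet_by` yields a different chart from the same components, which is the practical payoff of thinking in terms of a grammar.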
Design Process: Style Guides & Iteration
Apply iterative design methodology. Develop organizational style guide components that institutionalize visualization practice beyond individual skill.
The Design Sprint Methodology
Following the approach of Harvard's CS171, this module treats effective visualization as the product of structured iteration: paper sketching before digital tools, multiple alternatives before committing, and structured critique at each stage.
1. Sketch
Paper prototypes first. No software until you've explored 3+ approaches by hand.
2. Prototype
Build functional draft in your tool of choice. Focus on data, not polish.
3. Critique
Structured feedback using specific criteria. Not "I like/don't like" but evidence-based review.
4. Refine
Iterate based on feedback. Polish only after structure and message are validated.
Storytelling: Narrative & Audience
Apply Cole Nussbaumer Knaflic's six lessons systematically. Design for specific audiences (donors, communities, policymakers) with appropriate narrative structures.
The Six Lessons of Storytelling with Data
Cole Nussbaumer Knaflic's framework provides a systematic approach to data communication that transforms charts into stories.
1. Understand Context
Who is your audience? What do they need to know or do? What's the desired action?
2. Choose Display
Match visualization to message: simple text for 1-2 numbers, lines for trends, bars for comparisons.
3. Eliminate Clutter
Reduce cognitive load. Remove non-strategic elements that don't serve the message.
4. Draw Attention
Use preattentive attributes (color, size, position) strategically to guide the eye.
5. Think Like a Designer
Affordances, accessibility, visual hierarchy. Design for how people actually perceive.
6. Tell a Story
Beginning-middle-end structure. Narrative arc with tension and resolution.
Tool Landscape: Selection & Evaluation
Survey the visualization tool ecosystem. Evaluate tools against specific use cases and understand when each is appropriate.
Tool Selection Matrix
Every tool has trade-offs. The right choice depends on your skill level, output format, reproducibility needs, and organizational context.
| Tool | Best For | Limitations | Learning Curve |
|---|---|---|---|
| Excel / Google Sheets | Quick analysis, familiar interface, print reports | Limited chart types, manual formatting | Low |
| Tableau | Interactive dashboards, rapid exploration | Expensive licensing; free tier publishes all work publicly | Medium |
| Power BI | Microsoft ecosystem, DAX calculations | Steeper learning curve, Windows-centric | Medium-High |
| Datawrapper | News-style responsive charts, beginners | Limited customization, primarily static | Low |
| Flourish | Scrollytelling, animations, bar chart races | Templates can be constraining | Low-Medium |
| RAWGraphs | Unconventional chart types, SVG export | No hosting, static only, requires design skills | Low |
| R (ggplot2) | Publication-quality graphics, reproducibility | Programming required | High |
| Python (matplotlib, seaborn, plotly) | Data science workflows, dashboards | Multiple libraries to learn | High |
| D3.js | Custom interactive web visualization | JavaScript required, steep learning curve | Very High |
The Scaffolding Approach
Stephanie Evergreen recommends a progression: start with what's familiar (Excel), build to more powerful tools (Tableau/Flourish), optionally extend to code-based approaches (R/Python) for reproducibility.
Interactive Visualization: Web & Dynamic
Implement Shneiderman's visual information-seeking mantra. Create interactive web visualizations and understand when interactivity adds value versus complexity.
Shneiderman's Mantra
Ben Shneiderman's information-seeking mantra guides effective interactive design: "Overview first, zoom and filter, then details-on-demand."
1. Overview First
Start with the big picture. Let users see the entire dataset or summary before drilling down.
2. Zoom & Filter
Enable users to focus on subsets of interest. Region, time period, category—progressive refinement.
3. Details on Demand
Hover, click, or tap to reveal specifics. Don't clutter the view—reveal context when requested.
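The three steps above map naturally onto three small functions over the same records. This is a sketch of the data flow behind an interactive view, not a user interface; the project records and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical project records, for illustration only.
PROJECTS = [
    {"id": "P1", "region": "East", "year": 2022, "beneficiaries": 1200},
    {"id": "P2", "region": "East", "year": 2023, "beneficiaries": 900},
    {"id": "P3", "region": "West", "year": 2023, "beneficiaries": 450},
]


def overview(records):
    """Step 1: summarize the whole dataset, here as totals per region."""
    totals = defaultdict(int)
    for record in records:
        totals[record["region"]] += record["beneficiaries"]
    return dict(totals)


def zoom_and_filter(records, region=None, year=None):
    """Step 2: narrow to a subset; None means 'do not filter on this field'."""
    return [
        record for record in records
        if (region is None or record["region"] == region)
        and (year is None or record["year"] == year)
    ]


def details(records, project_id):
    """Step 3: return the full record for one item, only when requested."""
    return next((r for r in records if r["id"] == project_id), None)


print(overview(PROJECTS))
print(zoom_and_filter(PROJECTS, region="East", year=2023))
print(details(PROJECTS, "P3"))
```

In a real dashboard the same three calls would sit behind the landing view, the filter controls, and the hover or click handler respectively.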
M&E Dashboards: Development Applications
Design dashboards following Stephen Few's principles. Apply DHIS2 and WHO standards for M&E frameworks, theory of change communication, and annual reporting.
Dashboard Definition
Stephen Few defines a dashboard as: "A visual display of the most important information needed to achieve one or more objectives; consolidated and arranged on a single screen so information can be monitored at a glance."
Key Insight
A dashboard is not a report with charts. It's a monitoring interface for ongoing decision-making. If users need to scroll or click through pages, you've built a report, not a dashboard.
Advanced Topics: Specialized Approaches
Create effective geographic visualizations, network diagrams, and apply Data Feminism principles to challenge power dynamics in visualization.
Geographic Visualization
Maps are powerful but dangerous. Choropleth maps (color-filled regions) create visual bias toward large regions regardless of population.
Choropleth Maps
Best for: rates, percentages, density. Caution: large areas dominate visually regardless of importance.
Proportional Symbols
Best for: counts, totals. Circles sized by value placed at geographic points.
Cartograms
Distort geography to size by variable (population, GDP). Corrects area bias but sacrifices familiarity.
Hex Tile Maps
Equal-area hexagons for fair comparison. Used for electoral maps (each hex = one district).
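Two of the map types above depend on a small calculation done before drawing: choropleths should shade a normalized rate, and proportional symbols should scale circle area (not radius) with the value. The sketch below shows both under simple assumptions; the function names, the per-1,000 denominator, and the 30-unit maximum radius are illustrative choices.

```python
from math import sqrt


def rate_per_1000(count: float, population: float) -> float:
    """Normalize a raw count so a choropleth shades by rate, not by region size."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000 * count / population


def symbol_radius(value: float, max_value: float, max_radius: float = 30.0) -> float:
    """Radius of a proportional circle whose AREA scales linearly with value."""
    if max_value <= 0 or value < 0:
        raise ValueError("need max_value > 0 and value >= 0")
    return max_radius * sqrt(value / max_value)


# A value one quarter of the maximum gets half the radius, hence one quarter of the area.
print(symbol_radius(100, 400))
print(rate_per_1000(50, 10_000))
```

Scaling the radius directly would make a value four times larger look sixteen times larger by area, which exaggerates differences between places.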
Capstone Project: Portfolio Development
Complete an end-to-end visualization project from data acquisition through published output. Develop a professional portfolio and give structured critique.
Capstone Timeline
The capstone project spans four weeks, with a milestone submission at each stage.
Week 1: Proposal
Identify dataset, define audience and purpose, sketch 3+ visualization approaches.
Week 2: Prototype
Build functional draft visualizations, document design decisions, peer review exchange.
Week 3: Refinement
Incorporate feedback, polish design, prepare presentation materials.
Week 4: Presentation
Present to class and external reviewers; publish to portfolio.