ImpactMojo
◇ Level 1 · Primer

Sampling Basics

The handful of ideas you need before you size a survey — in plain language, no maths, about ten minutes. When it clicks, you'll be ready for the Sampling Design Studio.

Free · in-browser · nothing to install · written for people meeting sampling for the first time

What you'll pick up
  1. Why we don't measure everyone
  2. The one idea that matters: representativeness
  3. Random beats big
  4. “How sure?” and “how precise?”
  5. When your sample comes in clumps (clusters)
  6. Describing vs comparing
  7. Your vocabulary cheat-sheet

01 Why we don't measure everyone

Say your programme reaches 8,000 women across a district and you want to know how many feel confident speaking up in their group meeting. You could interview all 8,000 — but it would take months and a budget you don't have. So you don't. You talk to a well-chosen few hundred and use their answers to say something trustworthy about all 8,000.

That's the whole game. The big group you care about is the population. The few you actually talk to are the sample. Measuring everyone is a census; measuring a slice is sampling.

The promise of sampling: a small slice, chosen well, can tell you something reliable about the whole — for a fraction of the time and money.

02 The one idea that matters: representativeness

A sample is only useful if it looks like the whole group. If a quarter of your population is landless, roughly a quarter of your sample should be too. When your method quietly favours some kinds of people over others, that's bias — and a biased sample lies to you confidently.

The classic trap is the convenience sample: you interview whoever is easy to reach — the women already at the centre, the households near the road. Easy, and usually wrong, because “easy to reach” is rarely “typical.”

See it: convenience vs random

A village of 300. The amber dots are the 40% who are landless — the people a market-day interview tends to miss. Draw 30 people two ways and compare what each “finds.”

[Interactive demo: the real landless share is 40%; the widget shows the share each method finds, one figure for the convenience sample and one for the random sample.]

Run it a few times. The convenience number keeps landing well below the truth; the random one hugs 40%. Same size — only the method differs.
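
The demo above can be approximated in a few lines of Python. This is only a sketch under a made-up assumption: a landless person is 30% as likely as anyone else to be around on market day (`reach_landless`). The function name and that figure are illustrative, not taken from the demo itself.

```python
import random

def simulate_village(seed, size=300, landless_share=0.40, n=30, reach_landless=0.3):
    """Return (convenience_share, random_share): the landless share each sample finds."""
    rng = random.Random(seed)
    n_landless = round(size * landless_share)
    village = [True] * n_landless + [False] * (size - n_landless)  # True = landless

    # Random sample: every person has the same chance of being picked.
    random_sample = rng.sample(village, n)

    # Convenience sample: interview whoever you meet first. Landless people
    # are harder to meet (assumption: only 30% as likely to be reachable).
    crowd = village[:]
    rng.shuffle(crowd)
    convenience = []
    for is_landless in crowd:
        chance = reach_landless if is_landless else 1.0
        if rng.random() < chance:
            convenience.append(is_landless)
        if len(convenience) == n:
            break

    return sum(convenience) / n, sum(random_sample) / n

# Repeat the draw many times and average what each method "finds".
runs = [simulate_village(seed) for seed in range(200)]
avg_conv = sum(c for c, _ in runs) / len(runs)
avg_rand = sum(r for _, r in runs) / len(runs)
print(f"convenience says {avg_conv:.0%}, random says {avg_rand:.0%}, truth is 40%")
```

Both samples have 30 people. Only the way they were chosen differs, and that alone is enough to pull the convenience figure away from the truth.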

03 Random beats big

Beginners assume a bigger sample is always better. It isn't. A small random sample routinely beats a huge biased one. The famous example: in 1936 the US magazine Literary Digest mailed about 10 million ballots, got roughly 2.4 million back, and predicted the wrong election winner, while the pollster George Gallup called the result correctly with a far smaller, more carefully chosen sample. Size didn't save the biased method.

Random selection — every person having a known, fair chance of being picked — is what earns you the right to generalise from the few to the many. Size only helps after you've got the method right.
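
In practice, "a known, fair chance" usually means letting a random number generator pick from your list. A minimal sketch, assuming a hypothetical list of 8,000 participant IDs and a target of 400 interviews (both are placeholders for illustration):

```python
import random

# Hypothetical sampling frame: one ID per woman in the programme.
frame = [f"W{i:04d}" for i in range(1, 8001)]

# A fixed seed makes the draw reproducible, so someone else can audit it.
rng = random.Random(2024)

# Draw 400 IDs without replacement; every ID is equally likely to be picked.
sample = rng.sample(frame, 400)

print(len(sample), "women selected, for example:", sample[:5])
```

Nobody's judgment about who is "typical" or easy to find enters the draw, which is the point.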

Order of operations: get it representative first (random), get it big enough second. Never the other way round.

04 “How sure?” and “how precise?”

Because you asked a sample and not everyone, your answer carries a little wobble. Two plain words describe it:

Margin of error — how wide the wobble is. “62%, give or take 5 points” means the true figure is very likely between 57% and 67%.

Confidence — how often that range would contain the truth if you repeated the whole exercise. “95% confidence” means if you did it 20 times, about 19 of your ranges would capture the real number.
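
The "19 out of 20" claim can be checked by simulation. The sketch below assumes a true share of 62% and surveys of 400 people drawn from a very large population, and it uses the common textbook approximation for the range (1.96 standard errors either side). All of these are illustrative choices rather than anything the primer prescribes.

```python
import math
import random

def coverage(true_share=0.62, n=400, repeats=2000, z=1.96, seed=1):
    """Fraction of repeated surveys whose range contains the true share."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        # One survey: ask n randomly chosen people a yes/no question.
        yes = sum(rng.random() < true_share for _ in range(n))
        p_hat = yes / n
        margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_share <= p_hat + margin:
            hits += 1
    return hits / repeats

share_captured = coverage()
print(f"{share_captured:.1%} of the ranges captured the true 62%")
```

Each individual survey gives a slightly different answer, but close to 95% of the ranges should contain the truth.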

See it: bigger sample, tighter answer

Drag to change the sample size and watch the margin of error (95% confidence) shrink.

At n = 400 people, the margin of error is about ±4.9 points.

Notice the diminishing returns: going from 400 to 1,600 people (four times the cost) only halves the wobble. Precision gets expensive fast — which is exactly why picking the right sample size is a real decision, not “grab as many as you can.”
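
For readers who want to see the arithmetic behind the slider, a common approximation for the margin of error of a share is z × √(p(1 − p) / n). The sketch below assumes a simple random sample, the worst-case share p = 0.5, and z = 1.96 for 95% confidence. The function name is just for illustration.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error (as a fraction) for a share p and sample size n."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (100, 400, 1600):
    print(f"n = {n:>5}: about ±{margin_of_error(n) * 100:.1f} points")
```

Because n sits under a square root, quadrupling the sample only halves the margin, which is the diminishing return described above.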

05 When your sample comes in clumps (clusters)

Often you simply can't get a list of all 8,000 women to draw randomly from. But you can list the 200 villages they live in. So you randomly pick, say, 30 villages, then interview women within those. That's cluster sampling.
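
The two steps can be sketched as follows. The village names, the 40 women per village, and the 12 interviews per village are all invented for illustration. Real villages differ in size, which a real design would have to account for.

```python
import random

rng = random.Random(7)  # fixed seed so the selection can be reproduced

# Hypothetical frame of clusters: 200 villages with 40 women each (8,000 in total).
villages = {
    f"V{v:03d}": [f"V{v:03d}-W{w:02d}" for w in range(1, 41)]
    for v in range(1, 201)
}

# Stage 1: randomly pick 30 of the 200 villages.
chosen_villages = rng.sample(sorted(villages), 30)

# Stage 2: randomly pick 12 women inside each chosen village.
respondents = [
    woman
    for village in chosen_villages
    for woman in rng.sample(villages[village], 12)
]

print(len(chosen_villages), "villages,", len(respondents), "interviews")
```

Only the list of villages is needed up front. The list of women is needed just for the 30 villages actually visited.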

It's practical — but there's a catch. People in the same village tend to be alike: same water source, same school, same local prices. So each extra woman you interview in a village you've already visited teaches you a little less than a brand-new independent person would. To get the same precision, you need a somewhat bigger total sample. Statisticians call that penalty the design effect, and the “how alike are people in a cluster” number behind it the intra-cluster correlation.

Plain version: clusters save you legwork but cost you a bit of precision, so clustered surveys need more people overall. The more alike a cluster's members, the bigger that cost.
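
A widely used rule of thumb puts a number on that cost: design effect ≈ 1 + (m − 1) × ICC, where m is the number of people interviewed per cluster and ICC is the intra-cluster correlation. The sketch below assumes equal-sized clusters, and the values 12 and 0.05 are purely illustrative. Realistic ICC values depend on the outcome and the setting.

```python
import math

def design_effect(cluster_size, icc):
    """Approximate design effect for equal-sized clusters."""
    return 1 + (cluster_size - 1) * icc

def clustered_sample_size(simple_random_n, cluster_size, icc):
    """Inflate a simple-random-sample size to offset the clustering penalty."""
    inflated = simple_random_n * design_effect(cluster_size, icc)
    # Round first so tiny floating-point noise cannot bump the result up by one.
    return math.ceil(round(inflated, 6))

deff = design_effect(cluster_size=12, icc=0.05)
needed = clustered_sample_size(simple_random_n=400, cluster_size=12, icc=0.05)
print(f"design effect ≈ {deff:.2f}; interviews needed: {needed}")
```

With these numbers the design effect is about 1.55, so a survey that would need 400 people under simple random sampling needs 620 when they are interviewed 12 to a village.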

06 Describing vs comparing

One last fork, because it changes everything about how big your sample must be:

Describing — “What share of children can read a Grade-2 text?” You're estimating one number for one group. This is a survey.
Comparing — “Did our programme raise reading?” Now you need two groups (those in the programme and a comparison) and you're hunting for a difference between them. Detecting a difference reliably — especially a small one — takes a much larger sample than simply describing.
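
To get a feel for the gap, the sketch below uses two standard textbook approximations for simple random samples: one for describing a share to within a chosen margin, one for detecting a difference between two shares. The second needs an extra input this primer does not cover, statistical power (the chance of spotting a real difference), set here to a conventional 80%. The function names and example figures are assumptions for illustration.

```python
import math
from statistics import NormalDist

def z_value(prob):
    """Standard normal quantile, e.g. z_value(0.975) is about 1.96."""
    return NormalDist().inv_cdf(prob)

def n_to_describe(margin, p=0.5, confidence=0.95):
    """People needed to estimate one share to within +/- margin."""
    z = z_value(1 - (1 - confidence) / 2)
    return math.ceil(z**2 * p * (1 - p) / margin**2)

def n_per_group_to_compare(p1, p2, confidence=0.95, power=0.80):
    """People needed in EACH group to detect a difference between shares p1 and p2."""
    z_alpha = z_value(1 - (1 - confidence) / 2)
    z_beta = z_value(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

describe_n = n_to_describe(margin=0.05)
compare_n = n_per_group_to_compare(p1=0.50, p2=0.55)
print(f"describe within ±5 points: {describe_n} people")
print(f"detect 50% vs 55%: {compare_n} per group, {2 * compare_n} in total")
```

Describing a share to within 5 points takes 385 people here, while detecting a 5-point difference takes roughly 1,560 in each of the two groups, before any clustering penalty is added.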

Knowing which one you're doing is the first question the Sampling Design Studio will ask you — and now you'll know why it matters.

07 Your vocabulary cheat-sheet

Every word you'll meet in the Studio, in one place:

Population — the whole group you want to learn about.
Sample — the slice you actually measure.
Sampling frame — the list you draw your sample from (people, or villages).
Census — measuring everyone, no sampling.
Bias — a method that quietly favours some people, skewing your answer.
Representative — a sample that mirrors the population's mix.
Random sampling — everyone has a fair, known chance of being picked.
Stratify — split the population into groups (e.g. by gender) and sample each, so every group is covered.
Cluster sampling — pick groups (villages/schools) first, then people within them.
Margin of error — how wide the “give or take” around your answer is.
Confidence level — how often that range would capture the truth on repetition (usually 95%).
Design effect — the extra sample clustering costs you.

Ready to design your own?

You've got the ideas. Now let the Studio turn your evaluation question into a real, defensible sampling plan — it does the maths and keeps explaining as you go.

Open the Sampling Design Studio (Level 2) →