Sampling Design Studio


Who this is for. You already grasp the basics — the difference between a population and a sample, roughly what a margin of error is — and now you need to design a real, defensible sample and size it properly. New to sampling? Start with the plain-language Sampling Basics primer (Level 1) first.

📐

Shows its working. Every number traces to a visible formula — margin of error, design effect, finite-population correction, power. Defensible to a donor or IRB, not a black box.

⚖️

Sizes impact, not just surveys. A full two-group / RCT mode with statistical power and minimum detectable effect, alongside proportion and mean estimates.

🎯

Removes the guesswork. A built-in ICC reference library and a two-level model that separates the statistical ideal from the field reality — so the inputs that most swing your sample size are defensible, not invented.

Facilitator mode (★ Premium): discussion prompts for training

What do you need to measure?

Your question decides the maths. Pick the closest match — you can change it later.

Discuss: Why does “how many should we survey?” have no answer until you fix the question? Ask the group to phrase one real, named question their programme must answer this quarter.

How far does your programme reach?

Scale drives your strategy. The wider and more scattered your population, the more you’ll lean on clusters rather than a simple random sample — because a clean list of every individual rarely exists.

Unit of analysis: the “thing” each row of data represents

We use the scale + unit to suggest a realistic population size in the next step. You can always overwrite it with your own frame.

Discuss: Do you actually have a list (a sampling frame) of every individual? If not, which clusters — villages, schools, SHGs — can you list? That list is what you’ll sample from.

Whose responses will you measure?

Tick each group you must report on. Every ticked group gets its own sample — and if you must report them separately, that pushes you toward stratified sampling.

Must you report results separately for sub-groups? (e.g. by gender, grade, or district)

“Yes” means each sub-group needs enough sample on its own, which increases the total — and makes stratified sampling the right call.
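As a rough illustration of how separate reporting multiplies the total, the sketch below assumes every reported sub-group needs the same stand-alone sample. The figure of 385 per group (95% confidence, ±5 points, p = 50%) and the district names are hypothetical inputs, not values taken from the tool.

```python
def total_sample(n_per_group: int, groups: list[str]) -> int:
    """Total when every reported sub-group needs its own full sample."""
    return n_per_group * len(groups)


# Hypothetical inputs: 385 respondents per group, four districts.
districts = ["District A", "District B", "District C", "District D"]
print(total_sample(385, districts))  # 385 * 4 = 1540
```

Dropping a nice-to-have breakdown removes a whole multiple of the per-group sample, which is why it pays to check what the donor actually requires.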

Discuss: Teams often forget that reporting “by district” or “by gender” multiplies the sample. Which breakdowns does your donor actually require — and which are nice-to-have?

How precise must the answer be?

Precision has two layers. First set the statistical rigor you want in a perfect world; then tell us the field reality — clustering and non-response — that inflates it. We show both, so you can see the gap.

1. Statistical rigor: the precision you’d get in a perfect world

Confidence level: how sure you want to be

Margin of error for a proportion (± percentage points)

±5.0

±5 points is the common default. ±3 is tight (large sample); ±10 is rough (small sample).

Expected value of the proportion: use 50% if unsure; it’s the safest choice

50%
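Why 50% is the safe choice: the sample size for a proportion scales with p(1−p), and that term peaks at p = 0.5. A minimal sketch that tabulates the term over a grid of proportions (the grid itself is an arbitrary illustration):

```python
def variance_term(p: float) -> float:
    """The p(1 - p) factor that drives the sample size for a proportion."""
    return p * (1 - p)


# Illustrative grid of expected proportions from 10% to 90%.
grid = [i / 10 for i in range(1, 10)]
worst_case = max(grid, key=variance_term)

for p in grid:
    print(f"p = {p:.1f}  p(1-p) = {variance_term(p):.2f}")
print(f"largest at p = {worst_case}")
```

Any other guess for p gives a smaller sample, so choosing 50% can only over-size the survey, never under-size it.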

2. Field reality: what the field does to that ideal

Do I need clustering? (design effect)

If you sample whole villages/schools and then survey people inside them, the people in one cluster resemble each other, so you learn less per person. We inflate the sample by a design effect.

Do you sample in clusters? (villages, schools, SHGs)

Expected non-response / attrition buffer: replace those you can’t reach or who drop out

10%

Discuss: Ask the group to justify their margin of error to an imaginary board. Is ±5 points good enough to decide whether to scale the programme? What would ±10 hide?

Your sampling strategy

A defensible plan you can drop into a proposal or protocol — with the working shown, so a reviewer can check it.

Show the statistics behind these numbers

Proportion: the sample size for a target margin of error e, where p is the expected proportion and z is the critical value for your confidence level (1.96 for 95%), is

n₀ = z² · p(1−p) / e²
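The proportion formula can be sketched in a few lines of Python. The function names here are hypothetical, and the z value is taken from the standard normal distribution for a two-sided interval, which is an assumption about how the confidence level is converted.

```python
import math
from statistics import NormalDist


def z_for_confidence(confidence: float) -> float:
    """Two-sided critical value, about 1.96 for 95% confidence."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def n0_proportion(p: float, e: float, confidence: float = 0.95) -> float:
    """n0 = z^2 * p(1-p) / e^2, with p and e as fractions (0.05 = 5 points)."""
    z = z_for_confidence(confidence)
    return z**2 * p * (1 - p) / e**2


# Compare the tight, default and rough margins at p = 50%.
for e in (0.03, 0.05, 0.10):
    n0 = n0_proportion(p=0.5, e=e)
    print(f"±{e * 100:.0f} points: n0 = {n0:.1f}, round up to {math.ceil(n0)}")
```

At 95% confidence this gives roughly 1,068 for ±3 points, 385 for ±5 and 97 for ±10. Because e is squared in the denominator, halving the margin roughly quadruples the sample.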

Mean: replace p(1−p) with the variance σ²:

n₀ = z² · σ² / e²
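A minimal sketch of the mean formula, where e is expressed in the outcome's own units. The standard deviation of 15 and the margin of ±2 are hypothetical inputs (say, points on a test score), chosen only to show the arithmetic.

```python
import math
from statistics import NormalDist


def n0_mean(sigma: float, e: float, confidence: float = 0.95) -> float:
    """n0 = z^2 * sigma^2 / e^2, with sigma and e in the same units."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z**2 * sigma**2 / e**2


# Hypothetical: score SD of 15 points, target margin of ±2 points.
n0 = n0_mean(sigma=15, e=2)
print(f"n0 = {n0:.1f}, round up to {math.ceil(n0)}")
```

With these inputs the result is about 216.1, so 217 respondents. The hard part in practice is finding a credible σ, which usually comes from a pilot or an earlier survey.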

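Two-group comparison (impact): per arm, where z_α is the critical value for the significance level (1.96 for a two-sided 5% test), z_β is the value for the target power (0.84 for 80% power), and MDE is the minimum detectable effect,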

n_arm = 2 · (z_α + z_β)² · σ² / MDE²
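The per-arm formula can be sketched as follows. This assumes a two-sided test, so z_α is taken at 1 − α/2, and equal-sized arms with a common σ. The effect size of 0.2 standard deviations is a hypothetical input.

```python
import math
from statistics import NormalDist


def n_per_arm(mde: float, sigma: float,
              alpha: float = 0.05, power: float = 0.80) -> float:
    """n_arm = 2 * (z_alpha + z_beta)^2 * sigma^2 / MDE^2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # assumption: two-sided test
    z_beta = NormalDist().inv_cdf(power)
    return 2 * (z_alpha + z_beta) ** 2 * sigma**2 / mde**2


# Hypothetical: detect a 0.2 SD effect at 5% significance and 80% power.
n = n_per_arm(mde=0.2, sigma=1.0)
print(f"{math.ceil(n)} per arm, {2 * math.ceil(n)} in total")
```

This gives 393 per arm, 786 in total, before any clustering or attrition adjustment. Halving the MDE to 0.1 SD would roughly quadruple both figures.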

Finite population correction (when your population N is small):

n = n₀ / (1 + (n₀ − 1)/N)
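A short sketch of the correction, using the unrounded n₀ of 384.16 (95% confidence, ±5 points, p = 50%) and two hypothetical population sizes:

```python
import math


def finite_population_correction(n0: float, population: int) -> float:
    """n = n0 / (1 + (n0 - 1) / N)."""
    return n0 / (1 + (n0 - 1) / population)


# Hypothetical populations: a block of 2,000 households and a school of 500.
for population in (2000, 500):
    n = finite_population_correction(384.16, population)
    print(f"N = {population}: n = {n:.1f}, round up to {math.ceil(n)}")
```

The sample drops to 323 for N = 2,000 and 218 for N = 500. For populations in the hundreds of thousands the correction barely moves the number.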

Design effect for clustering, with m per cluster and ICC ρ:

DEFF = 1 + (m − 1) · ρ , n_cluster = n · DEFF
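The design effect can be sketched like this. The cluster size of 20 and the ICC of 0.05 are hypothetical inputs; in practice ρ should come from a reference source or earlier data for your outcome.

```python
import math


def design_effect(m: int, icc: float) -> float:
    """DEFF = 1 + (m - 1) * rho, with m respondents per cluster."""
    return 1 + (m - 1) * icc


def clustered_sample(n: float, m: int, icc: float) -> int:
    """Inflate a simple-random-sample size n by the design effect."""
    return math.ceil(n * design_effect(m, icc))


# Hypothetical: 385 under simple random sampling, 20 people per village.
n_cluster = clustered_sample(385, m=20, icc=0.05)
clusters_needed = math.ceil(n_cluster / 20)
print(f"DEFF = {design_effect(20, 0.05):.2f}")
print(f"{n_cluster} respondents across {clusters_needed} clusters")
```

Here DEFF is 1.95, so 385 becomes 751 respondents in 38 villages. Surveying fewer people in each of more clusters lowers DEFF, at the price of more travel.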

Finally, we divide by (1 − non-response rate) to get the number of people you must approach.
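Putting the steps together for a proportion, in the order given above: n₀, then finite population correction, then design effect, then the non-response buffer. This is a sketch under stated assumptions rather than the tool's own code. The function name and every input value are hypothetical, and rounding up only once at the end is a choice made here.

```python
import math
from statistics import NormalDist


def sample_to_approach(p: float, e: float, confidence: float,
                       population: int, m: int, icc: float,
                       non_response: float) -> int:
    """Chain the four steps and round up once at the end."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z**2 * p * (1 - p) / e**2               # statistical ideal
    n = n0 / (1 + (n0 - 1) / population)         # finite population correction
    n_cluster = n * (1 + (m - 1) * icc)          # design effect
    return math.ceil(n_cluster / (1 - non_response))  # non-response buffer


# Hypothetical plan: 10,000 households, 20 per village, ICC 0.05, 10% buffer.
total = sample_to_approach(p=0.5, e=0.05, confidence=0.95,
                           population=10_000, m=20, icc=0.05,
                           non_response=0.10)
print(total)
```

With these inputs the ideal of about 385 grows to 802 people to approach, which is the gap between statistical rigor and field reality described earlier.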


⚠️ A studio, not an oracle. These figures assume simple random sampling within your strata and reasonable inputs. Real designs have to reckon with weighting, multiple outcomes, sub-group power and messy frames. Treat this as a defensible first draft, then have a statistician pressure-test it before you field it. ImpactMojo offers coaching and runs practitioner dojos where you can bring your plan for feedback.