Sample Size Calculator

Confidence Level

How confident you want to be that the true value lies within your margin of error of the estimate. 95% is standard for most surveys.

Expected Proportion (p)

Your best estimate of the proportion you are measuring. Use 50% if unknown (gives the largest, most conservative sample).

Margin of Error (E)

How precise your estimate needs to be. For example, +/-5 percentage points means that if the true value is 40%, your estimate should fall between 35% and 45% at the chosen confidence level.

Non-Response Rate

Expected percentage of selected respondents who will not participate. 10-20% is common in development surveys.

Apply finite population correction

Total Population (N)

Use when sampling a large fraction of a small, known population (e.g., all teachers in a district).

Required Sample Size

385

respondents needed


n = (Z² x p x (1 - p)) / E²

Where:
Z = Z-score for chosen confidence level
p = expected proportion
E = margin of error

With non-response adjustment:
n_final = n / (1 - non_response_rate)

With finite population correction:
n_adj = n / (1 + (n - 1) / N)
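As a rough illustration, the proportion formula above can be sketched in Python. The function and argument names are hypothetical, and applying the finite population correction before the non-response inflation is an assumption of this sketch; the calculator's own implementation is not shown on this page.

```python
import math
from statistics import NormalDist


def z_score(confidence_level):
    """Two-sided Z-score, e.g. 0.95 -> roughly 1.96."""
    return NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)


def sample_size_proportion(confidence_level, p, margin,
                           non_response=0.0, population=None):
    """Sample size for estimating a proportion (illustrative sketch)."""
    z = z_score(confidence_level)
    n = z**2 * p * (1 - p) / margin**2
    if population is not None:
        # Finite population correction (assumed to be applied first).
        n = n / (1 + (n - 1) / population)
    # Inflate for expected non-response, then round up to whole respondents.
    return math.ceil(n / (1 - non_response))


print(sample_size_proportion(0.95, 0.5, 0.05))                     # base case
print(sample_size_proportion(0.95, 0.5, 0.05, non_response=0.10))  # with 10% non-response
print(sample_size_proportion(0.95, 0.5, 0.05, population=1000))    # with N = 1,000
```

With 95% confidence, p = 50%, and a +/-5% margin, the base case matches the 385 respondents shown above.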

Confidence Level

95% is the standard for most development evaluations.

Expected Standard Deviation

Estimated variability in the outcome. Use pilot data or similar studies. Example: if measuring monthly income in thousands, SD might be 10-20.

Margin of Error (precision)

Acceptable difference from the true mean, in the same units as your outcome. Example: +/-3 thousand for income.

Non-Response Rate

Apply finite population correction

Total Population (N)

Required Sample Size

respondents needed


n = (Z² x sigma²) / E²

Where:
Z = Z-score for chosen confidence level
sigma = expected standard deviation
E = margin of error (precision)

With non-response adjustment:
n_final = n / (1 - non_response_rate)

With finite population correction:
n_adj = n / (1 + (n - 1) / N)
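The mean formula above can be sketched the same way. The function name and the example inputs (an SD of 15 thousand and a margin of +/-3 thousand) are illustrative assumptions, not values taken from the calculator.

```python
import math
from statistics import NormalDist


def sample_size_mean(sigma, margin, confidence_level=0.95, non_response=0.0):
    """Sample size for estimating a mean (illustrative sketch).

    sigma and margin must be in the same units as the outcome.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    n = (z**2 * sigma**2) / margin**2
    # Inflate for expected non-response, then round up.
    return math.ceil(n / (1 - non_response))


# Hypothetical example: income in thousands, SD = 15, margin = +/-3.
print(sample_size_mean(sigma=15, margin=3))
```

Because sigma and the margin enter only as a ratio, the result does not depend on the unit chosen, as long as both use the same one.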

Significance Level (alpha)

Probability of concluding there is an effect when none exists (a false positive, or Type I error). 0.05 is standard.

Statistical Power

Probability of detecting a real effect. 80% is common; 90% provides more assurance but requires a larger sample.

Expected Standard Deviation

Pooled standard deviation of the outcome variable across both groups.

Minimum Detectable Effect (MDE)

Smallest difference between treatment and control that you want to be able to detect, in the same units as your outcome.

Non-Response Rate

Apply finite population correction

Total Population (N)

Required Sample Size

196

per group (392 total)


n_per_group = 2 x (Z_alpha + Z_beta)² x sigma² / delta²

Where:
Z_alpha = Z-score for significance level (two-sided)
Z_beta = Z-score for statistical power
sigma = pooled standard deviation
delta = minimum detectable effect size

Total sample = 2 x n_per_group

With non-response adjustment:
n_final = n / (1 - non_response_rate)
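A minimal Python sketch of the two-group formula above follows. The function name and the example inputs are assumptions for illustration, and the sketch applies the non-response adjustment to each group separately.

```python
import math
from statistics import NormalDist


def n_per_group(sigma, delta, alpha=0.05, power=0.80, non_response=0.0):
    """Per-group sample size for comparing two means (illustrative sketch)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = std_normal.inv_cdf(power)
    n = 2 * (z_alpha + z_beta) ** 2 * sigma**2 / delta**2
    # Inflate for expected non-response, then round up.
    return math.ceil(n / (1 - non_response))


# Hypothetical example: detect an effect of 0.2 standard deviations.
per_group = n_per_group(sigma=1.0, delta=0.2)
print(per_group, "per group,", 2 * per_group, "total")
```

Only the ratio delta / sigma matters here, so expressing the MDE in standard deviation units (sigma = 1) gives the same answer as using the outcome's raw units.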

Start by computing a simple random sample size using the Proportion or Mean tab above, then apply the cluster adjustment here.

Base Sample Size (from SRS calculation)

The sample size you would need under simple random sampling. Compute this using the Proportion or Mean tab.

Average Cluster Size (m)

Average number of individuals sampled per cluster (village, school, health facility, etc.).

Intra-Cluster Correlation (ICC)

How similar individuals within the same cluster are. Typical values: 0.01-0.05 for health outcomes, 0.05-0.20 for education, 0.10-0.30 for community-level behaviors.

Non-Response Rate

Adjusted Sample Size

817

across ~41 clusters


Design Effect (DEFF):
DEFF = 1 + (m - 1) x ICC

Adjusted sample size:
n_adj = n_srs x DEFF

Number of clusters:
k = ceil(n_adj / m)

With non-response adjustment:
n_final = n_adj / (1 - non_response_rate)

Where:
m = average cluster size
ICC = intra-cluster correlation coefficient
n_srs = base sample size from simple random sampling
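The cluster adjustment above can be sketched in Python as follows. The function name, the returned dictionary keys, and the example inputs (base size 385, m = 20, ICC = 0.05, 10% non-response) are assumptions for illustration; they are not necessarily the inputs behind the figures displayed above. The sketch follows the formulas as written, computing the number of clusters from n_adj before the non-response inflation.

```python
import math


def cluster_adjustment(n_srs, m, icc, non_response=0.0):
    """Apply the design effect to a simple-random-sample size (sketch)."""
    deff = 1 + (m - 1) * icc
    n_adj = n_srs * deff
    return {
        "deff": deff,
        "n_adj": math.ceil(n_adj),
        "clusters": math.ceil(n_adj / m),
        "n_final": math.ceil(n_adj / (1 - non_response)),
    }


# Hypothetical inputs, chosen only to illustrate the calculation.
result = cluster_adjustment(n_srs=385, m=20, icc=0.05, non_response=0.10)
print(result)
```

Sampling fewer individuals per cluster (a smaller m) lowers the design effect, at the cost of visiting more clusters.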

Understanding Sample Size Parameters

Confidence Level

How certain you want to be that your results reflect the true population value. A 95% confidence level means that if you repeated the survey 100 times, about 95 of the resulting confidence intervals would contain the true value.

In a WASH baseline survey, 95% confidence ensures funders can trust that reported access rates are reliable.

Margin of Error

The range around your estimate within which the true value likely falls. A +/-5% margin of error on a finding of 60% means the true value is likely between 55% and 65%, at your chosen confidence level.

For national health surveys (like DHS/NFHS), margins of +/-3% are typical. For project-level monitoring, +/-5 to 10% is often acceptable.

Expected Proportion

Your best guess of the proportion before the survey. When unknown, use 50% -- this maximizes variance and gives the most conservative (largest) sample size.

If a previous survey found 35% of households practiced open defecation, use p = 0.35 for a follow-up survey on the same indicator.
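To see why 50% is the conservative choice, the short sketch below evaluates the proportion formula at several values of p, holding 95% confidence and a +/-5% margin fixed. The helper name is hypothetical.

```python
import math
from statistics import NormalDist


def n_for_proportion(p, margin=0.05, confidence_level=0.95):
    """n = Z^2 * p * (1 - p) / E^2, rounded up (illustrative sketch)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return math.ceil(z**2 * p * (1 - p) / margin**2)


for p in (0.10, 0.35, 0.50, 0.65, 0.90):
    print(f"p = {p:.2f} -> n = {n_for_proportion(p)}")
```

The required sample is largest at p = 0.50 and symmetric around it, so a prior estimate such as p = 0.35 yields a somewhat smaller sample than the default.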

Standard Deviation

A measure of how spread out your data values are. Higher variability requires larger samples to achieve the same precision.

Monthly household income might have SD of Rs. 15,000 in rural areas. If measuring test scores (0-100), SD might be around 15-20 points.

Statistical Power

The probability of detecting a real effect when it exists. 80% power means a 20% chance of missing a true effect (Type II error). Used in experimental designs (RCTs).

A livelihood RCT with 80% power and MDE of Rs. 2,000/month: if the program truly increases income by Rs. 2,000 or more, the study will detect it at least 80% of the time.
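The relationship between sample size, effect size, and power can be checked with a normal approximation. The sketch below is an approximation under stated assumptions: equal group sizes, a two-sided test, and a hypothetical pooled SD of Rs. 7,000 that is not given in the text.

```python
import math
from statistics import NormalDist


def approx_power(n_per_group, sigma, delta, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    standard_error = sigma * math.sqrt(2 / n_per_group)
    # Ignores the very small chance of rejecting in the wrong direction.
    return std_normal.cdf(delta / standard_error - z_alpha)


# Hypothetical design: 193 per group, SD of Rs. 7,000.
print(round(approx_power(193, sigma=7000, delta=2000), 3))  # true effect equals the MDE
print(round(approx_power(193, sigma=7000, delta=3000), 3))  # true effect exceeds the MDE
```

When the true effect is larger than the MDE, power rises above the design target; when it is smaller, power falls below it.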

Intra-Cluster Correlation (ICC)

Measures how similar individuals within the same cluster (village, school) are to each other. Higher ICC means individuals within clusters are more alike, requiring more clusters and larger total samples.

Vaccination rates tend to cluster by village (ICC ~ 0.10-0.20) because communities share health facilities and social norms. Income is often less clustered (ICC ~ 0.02-0.05).

When to Use Each Mode

Proportion

Use when your key indicator is a percentage or rate: access to services, adoption rates, prevalence, coverage.

What proportion of households have access to clean drinking water within 500 meters?

Mean

Use when measuring a continuous outcome: income, test scores, crop yield, distance to services.

What is the average monthly income of smallholder farmers in the project area?

Two-Group Comparison

Use when comparing treatment vs. control groups in RCTs, quasi-experiments, or impact evaluations.

Does the skills training program increase monthly earnings by at least Rs. 3,000 compared to the control group?

Cluster Sampling

Use when you cannot randomly sample individuals but must sample entire clusters (villages, schools, clinics) and then survey individuals within them.

A nutrition survey selects 30 villages randomly and surveys 20 households in each village -- cluster sampling with m=20.

Rules of Thumb

1. For simple proportion surveys at 95% confidence and +/-5% margin of error, you need approximately 385 respondents (assuming p=50%).
2. Doubling precision (halving the margin of error) quadruples the sample size. Going from +/-5% to +/-2.5% requires ~1,537 instead of ~385.
3. Always add a buffer for non-response. In rural South Asia, 10-20% non-response is common; in urban slums or conflict areas, 20-30% may be needed.
4. Cluster sampling typically increases sample size by 1.5x to 3x compared to simple random sampling, depending on ICC and cluster size.
5. For RCTs in development, an MDE of 0.2 standard deviations is considered a small effect. Many real-world programs produce effects of 0.1-0.3 SD.
6. Finite population correction matters most when sampling more than 5-10% of the total population. For large populations, it has negligible effect.
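Two of the rules above, the quadrupling effect of halving the margin of error and the diminishing role of the finite population correction, can be verified numerically. The helper names and the example population sizes are assumptions made for this sketch.

```python
import math
from statistics import NormalDist


def base_n(margin, p=0.5, confidence_level=0.95):
    """Unrounded sample size for a proportion under simple random sampling."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return z**2 * p * (1 - p) / margin**2


def with_fpc(n, population):
    """Finite population correction: n / (1 + (n - 1) / N)."""
    return n / (1 + (n - 1) / population)


# Halving the margin of error quadruples the required sample.
print(math.ceil(base_n(0.05)), math.ceil(base_n(0.025)))
print(base_n(0.025) / base_n(0.05))

# The correction matters for a small population, barely for a large one.
for population in (2_000, 1_000_000):
    print(population, math.ceil(with_fpc(base_n(0.05), population)))
```

For a population of 2,000 the base sample is more than 5-10% of the total, so the correction reduces it noticeably; for a population of one million the change is about one respondent.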
