Sampling and Power
Choose a sampling method, understand representativeness, and work out how large your sample needs to be.
Who you recruit, and how many, determines what you can conclude. This page covers the sampling choices behind a study and how to size it so you can actually detect the effect you care about.
Sampling Methods
Sampling methods fall into two families.
Probability Sampling
Every member of the population has a known, non-zero chance of selection. These methods support the strongest statistical generalization.
- Simple random: everyone has an equal chance.
- Stratified random: divide the population into strata (for example, age groups) and sample within each, guaranteeing representation.
- Systematic: select every nth member from an ordered list, starting from a randomly chosen position.
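The three methods above can be sketched in a few lines of Python. This is an illustration only, not part of any Terac API: the population records, the `age_group` field, and the function names are all hypothetical, and the stratified version assumes proportional allocation (the same fraction drawn from every stratum).

```python
import random
from collections import defaultdict

def simple_random_sample(population, k, rng):
    """Draw k members without replacement; each has an equal chance."""
    return rng.sample(population, k)

def stratified_sample(population, stratum_of, fraction, rng):
    """Group members by stratum, then draw the same fraction from each."""
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample

def systematic_sample(ordered, step, rng):
    """Take every step-th member, starting from a random offset."""
    start = rng.randrange(step)
    return ordered[start::step]

# Hypothetical population: 300 people, one third of them in the "35+" group.
rng = random.Random(42)
population = [
    {"id": i, "age_group": "35+" if i % 3 == 0 else "18-34"}
    for i in range(300)
]

srs = simple_random_sample(population, 30, rng)
stratified = stratified_sample(population, lambda p: p["age_group"], 0.10, rng)
systematic = systematic_sample(population, 10, rng)
```

With proportional allocation the stratified sample reproduces the population's 2:1 split between the age groups exactly, whereas the simple random sample only matches it on average.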
Non-Probability Sampling
Selection is based on availability or judgment. These methods are faster and cheaper, but any generalization rests on argument rather than on known selection probabilities.
- Convenience: whoever is readily available.
- Quota: fill predetermined counts for specific subgroups.
- Purposive: deliberately target people with specific characteristics.
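As a rough sketch of how quota sampling behaves, the function below accepts people in arrival order until each subgroup's predetermined count is filled. The `fill_quotas` name, the record layout, and the quota values are hypothetical and are not taken from any particular platform.

```python
def fill_quotas(arrivals, quotas, group_of):
    """Accept arrivals in order until every subgroup quota is filled.

    Returns the accepted people and the unfilled count per subgroup.
    """
    remaining = dict(quotas)  # copy so the caller's quotas are untouched
    accepted = []
    for person in arrivals:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            accepted.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas filled, stop recruiting
    return accepted, remaining

# Hypothetical arrival stream, dominated by the younger group.
arrivals = [
    {"id": 1, "age_group": "18-34"},
    {"id": 2, "age_group": "18-34"},
    {"id": 3, "age_group": "18-34"},
    {"id": 4, "age_group": "35+"},
    {"id": 5, "age_group": "18-34"},
    {"id": 6, "age_group": "35+"},
]
accepted, remaining = fill_quotas(
    arrivals, {"18-34": 2, "35+": 2}, lambda p: p["age_group"]
)
```

Note that within each subgroup the people accepted are still simply the first to arrive, which is why quota samples balance composition without becoming probability samples.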
Representativeness
A sample is representative when its composition on the characteristics that matter mirrors the population you want to generalize to. Representativeness is always relative to specific variables: a sample can be representative for age and sex while skewing on income or geography.
To approximate it on Terac:
- Identify the variables your conclusions depend on.
- Use filters to constrain the audience to your target population.
- Use quotas to balance subgroups so no single group dominates.
- Report the composition you achieved, and be explicit about variables you did not control.
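For the reporting step, one way to make the achieved composition explicit is to compare sample shares with population shares for each variable you care about. The sketch below assumes you can export responses as a list of records and that you have population benchmarks from an outside source. The field names and benchmark figures are invented for illustration.

```python
from collections import Counter

def composition(sample, variable):
    """Share of the sample at each level of one variable."""
    counts = Counter(person[variable] for person in sample)
    total = sum(counts.values())
    return {level: n / total for level, n in counts.items()}

def composition_gaps(sample, variable, population_shares):
    """Achieved share minus population share, per level.

    Positive values mean the level is over-represented in the sample.
    """
    achieved = composition(sample, variable)
    return {
        level: achieved.get(level, 0.0) - share
        for level, share in population_shares.items()
    }

# Hypothetical sample of 10 and hypothetical population benchmarks.
sample = [{"age_group": "18-34"}] * 6 + [{"age_group": "35+"}] * 4
gaps = composition_gaps(sample, "age_group", {"18-34": 0.5, "35+": 0.5})

for level, gap in gaps.items():
    print(f"{level}: {gap:+.0%} relative to the population")
```

Running the same comparison on variables you did not control (income or geography, for example) shows readers where the sample may still be skewed.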
Statistical Power
Statistical power is the probability of detecting an effect that is genuinely there. Underpowered studies waste effort and produce unreliable results, with positive findings that often fail to replicate.
Power depends on four interacting factors. For a given test, power, alpha, effect size, and sample size are linked: fix any three and the fourth is determined.
| Factor | Effect on power |
|---|---|
| Significance level (alpha) | A stricter alpha (for example .01 vs .05) lowers power |
| Effect size | Larger true effects are easier to detect |
| Sample size | More participants increase power |
| The statistical test | Some tests extract more information than others |
Aim for power between .80 and .95. Below .80 you have more than a 20% chance of missing a real effect; chasing much higher power can demand impractically large samples.
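To see how the factors in the table trade off, here is a minimal sketch that approximates power for one specific design: a two-sided comparison of two equal-sized independent groups, with the effect size expressed as a standardized mean difference (Cohen's d). It uses a normal approximation, so it runs slightly optimistic compared with exact t-based calculators at small sample sizes. The `approx_power` name is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    Normal approximation: assumes equal group sizes and a standardized
    effect size (Cohen's d). Other designs need other formulas.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability the test statistic lands in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

# Vary one factor at a time around d = 0.5, n = 64 per group, alpha = .05.
baseline = approx_power(0.5, 64)
stricter_alpha = approx_power(0.5, 64, alpha=0.01)
smaller_effect = approx_power(0.2, 64)
larger_sample = approx_power(0.5, 128)

print(f"baseline:       {baseline:.2f}")
print(f"alpha = .01:    {stricter_alpha:.2f}")
print(f"d = 0.2:        {smaller_effect:.2f}")
print(f"n = 128/group:  {larger_sample:.2f}")
```

The baseline lands just above .80; tightening alpha or shrinking the effect pulls power down, and doubling the sample pushes it up.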
Sizing Your Sample Before You Launch
Work it backwards:
- Decide the smallest effect size that would be meaningful for your question.
- Pick your alpha (commonly .05) and target power (commonly .80).
- Use a power calculator to get the required sample size. Free tools like G*Power handle the common designs.
- Set your participant cap to that number, plus a margin for the exclusions defined in your preregistration, so the sample that remains after exclusions still meets the target.
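The steps above can be sketched for one common case, a two-sided comparison of two equal-sized groups, using the standard normal-approximation formula n per group = 2((z_{1-alpha/2} + z_{power}) / d)^2. This is a rough check and does not replace a power calculator: exact t-based tools such as G*Power typically return a slightly larger number. The function names and the 10% exclusion rate are assumptions made for illustration.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group (two-sided, two groups).

    Normal approximation with a standardized effect size (Cohen's d).
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

def with_exclusion_margin(n, expected_exclusion_rate):
    """Inflate n so that about n participants remain after exclusions."""
    return ceil(n / (1 - expected_exclusion_rate))

# Smallest meaningful effect d = 0.5, alpha = .05, target power = .80.
needed = n_per_group(0.5)
# Assumed 10% exclusion rate; use the rate your own exclusion rules imply.
cap_per_group = with_exclusion_margin(needed, 0.10)

print(f"needed per group: {needed}")
print(f"cap per group with margin: {cap_per_group}")
```

Because the required n scales with 1/d^2, halving the smallest effect you care about roughly quadruples the sample, which is why the first step (choosing that effect) dominates the cost of the study.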
Decide your sample size in advance and commit to it. Peeking at results and stopping once they cross significance inflates false positives. If you need an adaptive stopping rule, plan it as a sequential design up front.
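A small simulation makes the cost of peeking concrete. The sketch below generates data with no true effect, then compares two rules: testing once at the planned sample size, and testing after every 10 participants per group while stopping at the first significant result. To keep it short it uses a z-test with known variance, which is a simplification of what an analysis would normally use. All parameter values are arbitrary choices for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

def false_positive_rates(n_sims=2000, n_max=100, look_every=10,
                         alpha=0.05, seed=1):
    """Return (peeking_rate, fixed_rate) when there is no true effect.

    n_max is the planned size per group and should be a multiple of
    look_every so the final look coincides with the planned analysis.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    peeking_hits = fixed_hits = 0
    for _ in range(n_sims):
        diff_sum = 0.0
        crossed_at_any_look = False
        for n in range(1, n_max + 1):
            # One new participant per group, both drawn from N(0, 1).
            diff_sum += rng.gauss(0, 1) - rng.gauss(0, 1)
            if n % look_every == 0:
                z = (diff_sum / n) / sqrt(2 / n)
                if abs(z) > z_crit:
                    crossed_at_any_look = True
        final_z = (diff_sum / n_max) / sqrt(2 / n_max)
        peeking_hits += crossed_at_any_look
        fixed_hits += abs(final_z) > z_crit
    return peeking_hits / n_sims, fixed_hits / n_sims

peeking_rate, fixed_rate = false_positive_rates()
print(f"fixed sample size: {fixed_rate:.3f}")
print(f"peek and stop:     {peeking_rate:.3f}")
```

The fixed-size rule stays close to the nominal alpha, while the peek-and-stop rule rejects far more often even though nothing is there. Sequential designs avoid this by using stricter thresholds at each planned look.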