Probability Sampling: How to Represent Large Populations - Atlan

What Is Probability Sampling?

Population sampling is the process of picking a representative subset of a population in order to draw conclusions about the entire population. Studying every member of the population would give the most accurate results, but doing so is rarely feasible or practical. This is exactly why the Census of India, which aims to cover the entire population of the country, is conducted only once every ten years.


How researchers select their sample largely determines the quality of a study's findings. Probability sampling leads to higher-quality findings because it provides an unbiased representation of the population. It uses randomization to ensure that every unit in the population has a known, non-zero probability of being included in the sample.

Check out our sampling blog to learn more about why good sampling is important.

Why Use Probability Sampling?

Provides a Representative Sample

Populations are usually diverse. Probability sampling allows researchers to create a sample that closely reflects the diversity of the population.

Say we want to know what country Indian students prefer for higher studies. Probability sampling allows for picking a sample that closely represents the diversity of students’ gender, socio-economic background, academic background, motivations, and ambitions among the population of students.

Helps Obtain Statistical Inferences

Probability sampling helps researchers create an accurate sample of their population. If the sample is accurate, researchers can use proven statistical methods to confidently draw conclusions about the larger population.

Imagine that in the previous example the survey found that 30% of surveyed students prefer the United Kingdom, 25% the United States, 15% Australia, 10% Canada, and 20% other countries for higher studies. Researchers could use this data to infer that between 25% and 35% of all Indian students prefer the United Kingdom, since the statistical margin of error for this survey was 5%.
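To make the arithmetic behind that interval concrete, here is a minimal sketch using the standard approximation for the margin of error of a proportion at 95% confidence. The article does not say how many students were surveyed, so the sample size of 323 is purely a hypothetical value chosen so that the margin comes out near 5%; the function name is also just illustrative.

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion p estimated from
    a simple random sample of size n (z = 1.96 for ~95% confidence)."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical sample size: not stated in the article.
n_students = 323
p_uk = 0.30  # share of surveyed students who prefer the United Kingdom

moe = margin_of_error(p_uk, n_students)
low, high = p_uk - moe, p_uk + moe

print(f"margin of error: {moe:.3f}")
print(f"estimated range: {low:.2f} to {high:.2f}")
```

Under these assumptions the margin works out to roughly five percentage points, which is where the 25% to 35% range comes from. A larger sample would shrink the margin; a smaller one would widen it.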

Check out our ebook to learn how to calculate the margin of error for a sample.

Reduces Sampling Bias

Sampling bias occurs when some units of the population are systematically more likely to be chosen than others in ways the researcher does not account for. This results in a sample that misrepresents the population. Probability sampling reduces this bias because units are selected at random, with each unit having a known, non-zero chance of being selected (an equal chance, in the simplest designs).

In the previous example, suppose that the sampling is done by picking the researcher’s friends and their friends. This sample would be biased, since the researcher’s friends are more likely to be chosen for the sample than the researcher’s enemies. Probability sampling would address this bias by selecting students at random, for example by giving every student an equal chance of being chosen for the sample.

Techniques for Probability Sampling

Simple Random Sampling

This method of sampling gives every unit of the population an equal probability of selection. Simple random sampling is ideal when the researcher does not need to account for the composition of the population.

For example, imagine that researchers want to study village infrastructure in a given state. To make the survey easier, the researchers can randomly select villages in that state and only conduct their survey in those villages. This sample would be statistically rigorous because every village had an equal chance of being selected for the sample.
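The village example could be sketched as follows. This is only an illustration: the list of 500 village names and the sample size of 50 are made-up values, and in practice the list would come from an official register of villages in the state.

```python
import random

# Hypothetical sampling frame: one entry per village in the state.
villages = [f"village_{i}" for i in range(1, 501)]

# A fixed seed is used only so that the sketch is reproducible.
rng = random.Random(42)

# Draw 50 distinct villages; every village has the same chance of selection.
sample = rng.sample(villages, k=50)

print(len(sample), "villages selected, for example:", sample[:3])
```

`random.sample` draws without replacement, so no village can appear in the sample twice.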

Stratified Random Sampling

Researchers can use this technique to create a sample that replicates the composition of various groups in the population. Stratified random sampling divides the population into groups, referred to as strata, with similar characteristics. Random sampling is then done on each group so that the proportion of each group in the sample is equal to the proportion of that group in the overall population.

For instance, for a study of the differences between male and female perceptions of a new apparel store, simple random sampling would not be a good idea, because it could by chance over-represent one group. Instead, splitting the population into two strata (male and female) and selecting proportional samples from each group would give us a more representative sample.
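Here is a minimal sketch of proportional stratified sampling for the apparel store example. The population of 600 female and 400 male shoppers, the sample size of 100, and the helper name `stratified_sample` are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, sample_size, seed=0):
    """Split units into strata, then draw a simple random sample from each
    stratum in proportion to that stratum's share of the population."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for members in strata.values():
        # Proportional allocation; rounding can shift the total by a unit
        # or two when the shares do not divide evenly.
        k = round(sample_size * len(members) / len(units))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population of 1,000 shoppers: 600 female, 400 male.
shoppers = [("female", i) for i in range(600)] + [("male", i) for i in range(400)]

sample = stratified_sample(shoppers, stratum_of=lambda s: s[0], sample_size=100)
counts = {g: sum(1 for s in sample if s[0] == g) for g in ("female", "male")}
print(counts)
```

With these numbers the sample contains 60 female and 40 male shoppers, mirroring the 60/40 split in the population.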

Systematic Random Sampling

In this method, the researcher first chooses a random starting item from the list of the population and then selects every kth item after it, for as long as the position falls within the size of the population. Here k is the sampling interval: the population size divided by the required sample size.

Say we have a population of size 100, and we wish to obtain a sample of 10. The sampling interval is 100 / 10 = 10, so we need to pick every 10th item. Suppose the randomly chosen starting point among the first 10 items is the 2nd item. We will then choose the 12th item, the 22nd item, and so on until we reach the 92nd item, after which we would exceed the population limit. It is imperative that every possible starting point has an equal probability of being selected.
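The example above can be sketched in a few lines. This sketch assumes that the step between picks is the population size divided by the sample size (10 here) and that the population size divides evenly; the function name `systematic_sample` is just illustrative.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Pick a random start within the first interval, then every
    interval-th item after it."""
    interval = len(population) // sample_size
    rng = random.Random(seed)
    start = rng.randrange(interval)  # 0-based index; each start equally likely
    return population[start::interval][:sample_size]

items = list(range(1, 101))  # items numbered 1 to 100

# Random start, as the method requires.
sample = systematic_sample(items, sample_size=10, seed=1)
print(sample)

# Fixing the start at the 2nd item (index 1) reproduces the example above.
print(items[1::10])
```

The second print shows the 2nd, 12th, 22nd, and later items up to the 92nd, matching the worked example.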

Systematic random sampling is reliable only if the order of the list is unrelated to what is being measured. For example, consider research on the scores of high school students. The researcher has a list of students’ scores arranged in descending order. The average score obtained from the sample will be higher if the researcher chooses the 1st student as her first sample unit instead of the 2nd, or the 2nd instead of the 3rd, and so on.
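A small sketch shows how much the starting point matters on a sorted list. The scores here are invented for illustration: 100 students scoring 100, 99, and so on down to 1, sampled with a step of 10.

```python
# Hypothetical scores, already sorted in descending order: 100, 99, ..., 1.
scores = list(range(100, 0, -1))
interval = 10

# Sample mean for each possible starting position (1st student, 2nd, ...).
means = {
    start + 1: sum(scores[start::interval]) / len(scores[start::interval])
    for start in range(interval)
}

true_mean = sum(scores) / len(scores)

for position, mean in means.items():
    print(f"start at student {position}: sample mean = {mean}")
print(f"population mean = {true_mean}")
```

In this made-up data the sample mean drops by one point for every position the start moves down the list, from 55.0 when starting at the 1st student to 46.0 when starting at the 10th, while the population mean is 50.5.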

Cluster Random Sampling

Cluster random sampling is mostly used in geographical sampling. It involves dividing an area into clusters, randomly choosing some of those clusters, and using the units in the chosen clusters to represent the entire area.

Consider a survey to find the number of pop music fans in India. The entire population of India can be divided into clusters based on area. A set of clusters can be chosen using either simple random sampling or systematic random sampling. Based on some background study, the researcher can then decide whether to keep all the units of each chosen cluster or to carry out further sampling within each cluster.

This method carries a higher risk of sampling error, because units within a cluster tend to resemble one another. Say that the researcher picked states and UTs as clusters, and Delhi was one of the chosen clusters. Since Delhi has widespread pop-culture influence, the results would be skewed in favor of pop music.
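The two-stage procedure described in this section might look like the sketch below. Everything in it is hypothetical: ten regions of 200 people each stand in for the real clusters, three clusters are chosen, and 20 people are sampled within each. Passing `units_per_cluster=None` keeps every unit of the chosen clusters instead.

```python
import random

def cluster_sample(clusters, n_clusters, units_per_cluster=None, seed=0):
    """Stage 1: randomly choose whole clusters.
    Stage 2: keep all units of each chosen cluster, or sub-sample them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)

    sample = {}
    for name in chosen:
        units = clusters[name]
        if units_per_cluster is None or units_per_cluster >= len(units):
            sample[name] = list(units)  # keep the entire cluster
        else:
            sample[name] = rng.sample(units, units_per_cluster)
    return sample

# Hypothetical frame: 10 regions, each listing 200 residents.
clusters = {
    f"region_{r}": [f"region_{r}_person_{i}" for i in range(200)]
    for r in range(1, 11)
}

sample = cluster_sample(clusters, n_clusters=3, units_per_cluster=20)
for name, people in sample.items():
    print(name, len(people))
```

Because only three of the ten regions contribute respondents, an unusual region among the three can pull the overall result in its direction, which is the risk illustrated by the Delhi example.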

All in all, probability sampling is not always feasible, since it requires the researcher to know the population well enough beforehand, for example by having a list of its units to sample from. Nonetheless, when it is feasible, it is an effective technique for obtaining a sample that is a close representation of the population.
