What Is Survey Sampling?
A survey is only as meaningful as its fit with the respondents it is aimed at. Good survey design keeps the target population at its core.
‘All the residents of the Dharavi slums in Mumbai’, ‘every NGO in Calcutta’ and ‘all students below the age of 16 in Manipur’ are examples of a population; they are countable, finite and well-defined.
When the population is small enough, researchers may have the resources to reach every member. This is the best-case scenario, since everybody who matters to the survey is represented accurately. A survey that covers the entire target population is called a census.
However, most surveys cannot cover the entire population. This is when sampling techniques become crucial to your survey.
Why Is It Important?
If the target population is not small enough, or if the resources at your disposal don’t give you the bandwidth to cover the entire population, it is important to identify a subset of the population to work with – a carefully identified group that is representative of the population. This process is called survey sampling, and it is one of the most important aspects of survey design.
Whatever the sample size, there are fixed costs associated with any survey. Once the survey has begun, the additional cost of gathering more information from more people grows roughly in proportion to the size of the sample.
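The cost structure described above can be sketched as a fixed cost plus a per-response cost. This is a minimal illustration, assuming a strictly linear cost model; the function names and all figures are hypothetical and are not taken from any real survey budget.

```python
def total_cost(fixed_cost, cost_per_response, sample_size):
    """Total survey cost under an assumed linear model: fixed + per-response * n."""
    return fixed_cost + cost_per_response * sample_size


def max_sample_size(budget, fixed_cost, cost_per_response):
    """Largest sample the budget can pay for once fixed costs are covered."""
    if budget < fixed_cost:
        return 0
    return (budget - fixed_cost) // cost_per_response


# Hypothetical figures, in any single currency unit.
budget = 500_000
fixed = 200_000
per_response = 150

n = max_sample_size(budget, fixed, per_response)
print(n)                                    # 2000
print(total_cost(fixed, per_response, n))   # 500000
```

Under these assumed numbers, the budget left after fixed costs is what determines how large a sample is affordable.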
Drawing Inferences About the Population
Researchers are not interested in the sample itself, but in the understanding that they can potentially infer from the sample and then apply across the entire population.
A sample survey usually offers greater scope than a census. Working within a given resource constraint, sampling may make it possible to study the population of a larger geographical area or to find out more about the same population by examining an area in greater depth through a smaller sample.
Before we dive into the survey sampling methods at our disposal, we need a clear picture of what an effective sample should look like.
3 Features to Keep in Mind While Constructing a Sample
Consistency
It is important that researchers understand the population on a case-by-case basis and test the sample for consistency before going ahead with the survey. This is especially critical for surveys that track changes across time and space, where we need to be confident that any change we see in our data reflects real change across consistent and comparable samples.
Diversity
Ensuring diversity of the sample is a tall order, as reaching some portions of the population and convincing them to participate in the survey can be difficult. But to be truly representative of the population, a sample must be as diverse as the population itself and sensitive to the local differences that are unavoidable as we move across the population.
Transparency
Several constraints dictate the size and structure of the sample. It is imperative that researchers discuss these limitations and remain transparent about the procedures followed while selecting the sample, so that the results of the survey are seen in the right perspective.
Now that we understand the necessity of choosing the right sample and have a vision of what an effective sample for your survey should be like, let’s explore the various methods of constructing a sample and understand the relative pros and cons of each of these approaches.
Sampling methods can broadly be classified as probability and non-probability.
3 Probability Sampling Techniques
When each entity of the population has a known, non-zero probability of being included in the sample, the sample is known as a probability sample.
Probability samples are selected in such a way as to be representative of the population. They tend to provide the most valid or credible results because they reflect the characteristics of the population from which they are selected.
Probability sampling techniques include simple random sampling, systematic sampling, and stratified sampling.
Simple Random Sampling
When: You have, or can build, a list of the whole population, so that every member can be given an equal chance of selection.
How: The entire process of sampling is done in a single step, with each subject selected independently of the other members of the population. ‘Random’ has a very precise meaning here: collecting responses from whoever you meet on the street does not give you a random sample.
Pros: In this technique, each member of the population has an equal chance of being selected as subject.
Cons: With very large populations, it is often difficult to identify every member of the population, and the pool from which subjects are drawn becomes biased. Dialing numbers picked from a phone book, for instance, may not give a truly random sample: the numbers may be picked at random, but they all correspond to a localized region. A sample created this way might leave out many sections of the population that are significant to the study.
Use Case: Want to study and understand the rice consumption pattern across rural India? While it might not be possible to cover every household, you could draw meaningful insights by randomly selecting households from different districts or villages (depending on the scope).
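The selection step described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a complete list of the population is available; the household IDs and the function name `simple_random_sample` are hypothetical.

```python
import random


def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size distinct members, each with an equal chance of selection."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    # random.sample selects without replacement, so no member is picked twice.
    return rng.sample(population, sample_size)


# Hypothetical list of 1,000 household IDs standing in for the population.
households = [f"HH-{i:04d}" for i in range(1, 1001)]

sample = simple_random_sample(households, 50, seed=42)
print(len(sample), len(set(sample)))  # 50 50
```

The list passed in is what makes the sample random: if it leaves out part of the population, those members have no chance of selection, however the draw is done.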
Systematic Sampling
When: Your given population is logically homogeneous.
How: In a systematic sample, after you decide the sample size, arrange the elements of the population in some order and, starting from a randomly chosen point, select elements at regular intervals from the list.
Pros: The main advantage of systematic sampling over simple random sampling is its simplicity. It also ensures that the population is sampled evenly: simple random sampling can, by chance, select a cluster of subjects that sit close together in the list, whereas selecting at regular intervals avoids this.
Cons: The main weakness of the method is that any periodic pattern in the list can compromise the randomness of the sample, for example when the sampling interval coincides with a cycle in the ordering. This can be avoided by shuffling the list of population entities, as you would shuffle a deck of cards, before you proceed with systematic sampling.
Use Case: Suppose a supermarket wants to study the buying habits of its customers. Using systematic sampling, it can choose every 10th or 15th customer entering the supermarket and conduct the study on this sample.
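A minimal sketch of the procedure, assuming the population is available as an ordered list and that the interval is the list length divided by the sample size (rounded down). The function name and the customer numbering are hypothetical.

```python
import random


def systematic_sample(frame, sample_size, seed=None):
    """Select every k-th element after a random start, with k = len(frame) // sample_size."""
    if not 0 < sample_size <= len(frame):
        raise ValueError("sample_size must be between 1 and len(frame)")
    rng = random.Random(seed)
    k = len(frame) // sample_size   # sampling interval
    start = rng.randrange(k)        # random starting point within the first interval
    return [frame[start + i * k] for i in range(sample_size)]


# Hypothetical example: 200 customers numbered in order of arrival.
customers = list(range(1, 201))
picked = systematic_sample(customers, 20, seed=1)

print(len(picked))             # 20
print(picked[1] - picked[0])   # 10
```

Shuffling `frame` before calling the function is one way to guard against the periodicity problem mentioned under Cons.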
Stratified Sampling
When: You can divide your population into groups based on characteristics that are important for the research.
How: A stratified sample, in essence, tries to recreate the statistical features of the population on a smaller scale. Before sampling, the population is divided into groups based on characteristics of importance for the research, such as gender, social class, education level or religion. Then the population is randomly sampled within each category or stratum. If 38% of the population is college-educated, then 38% of the sample is randomly selected from the college-educated subset of the population.
Pros: This method attempts to overcome the shortcomings of random sampling by splitting the population into various distinct segments and selecting entities from each of them. This ensures that every category of the population is represented in the sample. Stratified sampling is often used when one or more of the sections in the population have a low incidence relative to the other sections.
Cons: Stratified sampling is the most complex of the methods described here. It lays down criteria that may be difficult to fulfill and can place a heavy strain on your available resources.
Use Case: If 38% of the population is college-educated and 62% of the population has not been to college, then 38% of the sample is randomly selected from the college-educated subset of the population and 62% of the sample is randomly selected from the non-college-going population. Maintaining the ratios while selecting a randomized sample is key to stratified sampling.
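The 38%/62% example can be sketched as follows. This is a minimal illustration of proportional allocation, assuming each member's stratum is already known; the function name, the stratum labels and the member IDs are hypothetical.

```python
import random


def stratified_sample(strata, sample_size, seed=None):
    """Randomly sample within each stratum, in proportion to the stratum's size.

    strata maps a stratum name to the list of its members.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation; rounding can shift the overall total by a unit or so.
        n = round(sample_size * len(members) / total)
        sample[name] = rng.sample(members, n)
    return sample


# Hypothetical population of 1,000: 38% college-educated, 62% not.
strata = {
    "college": [f"C-{i}" for i in range(380)],
    "no_college": [f"N-{i}" for i in range(620)],
}

sample = stratified_sample(strata, 100, seed=7)
print({name: len(members) for name, members in sample.items()})
# {'college': 38, 'no_college': 62}
```

When the proportions do not divide evenly, the rounded stratum sizes may not add up to exactly the requested sample size, so a real design would need a rule for allocating the remainder.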
3 Non-Probability Sampling Techniques
Non-probability sampling techniques include convenience sampling, snowball sampling and quota sampling.
In these techniques, the units that make up the sample are collected with no specific probability structure in mind. The selection is not completely randomized, and hence the resultant sample cannot be assumed to be truly representative of the population.
Convenience Sampling
When: During preliminary research efforts.
How: As the name suggests, the elements of such a sample are picked only on the basis of convenience in terms of availability, reach and accessibility.
Pros: The sample is created quickly without adding any additional burden on the available resources.
Cons: The likelihood of this approach leading to a sample that is truly representative of the population is very low.
Use Case: This method is often used during preliminary research efforts to get a rough estimate of the results, without incurring the cost or time required to select a random sample.
Snowball Sampling
When: You can rely on your initial respondents to refer you to the next respondents.
How: Just as the snowball rolls and gathers mass, the sample constructed in this way will grow in size as you move through the process of conducting a survey. In this technique, you rely on your initial respondents to refer you to the next respondents whom you may connect with for the purpose of your survey.
Pros: The costs associated with this method are significantly lower, and you are likely to end up with a sample that is very relevant to your study.
Cons: The clear downside of this approach is that you may restrict yourself to only a small, largely homogeneous section of the population.
Use Case: Snowball sampling can be useful when you need the sample to reflect certain features that are difficult to find. To conduct a survey of people who go jogging in a certain park every morning, for example, snowball sampling would be a quick, practical way to create the sample.
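The referral process can be sketched as a walk over a referral network. This is a minimal illustration only: in a real survey the referrals come from asking each respondent, whereas here a hypothetical `referrals` dictionary stands in for those answers, and the names are invented.

```python
from collections import deque


def snowball_sample(referrals, seeds, target_size):
    """Grow a sample by following referrals from the initial respondents (seeds)."""
    sample = []
    seen = set()
    queue = deque(seeds)
    while queue and len(sample) < target_size:
        person = queue.popleft()
        if person in seen:
            continue  # already surveyed, skip repeat referrals
        seen.add(person)
        sample.append(person)
        # Each surveyed respondent refers us to further potential respondents.
        queue.extend(referrals.get(person, []))
    return sample


# Hypothetical referral answers from joggers in a park.
referrals = {
    "Asha": ["Bala", "Chitra"],
    "Bala": ["Asha", "Dev"],
    "Chitra": ["Dev", "Esha"],
    "Dev": [],
    "Esha": ["Farid"],
}

print(snowball_sample(referrals, ["Asha"], 4))
# ['Asha', 'Bala', 'Chitra', 'Dev']
```

Everyone in the result is reachable from the first respondent, which is exactly why such a sample can end up confined to one closely connected section of the population.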
Quota Sampling
When: You can characterize the population based on certain desired features.
How: Quota sampling is the non-probability equivalent of stratified sampling that we discussed earlier. It starts with characterizing the population based on certain desired features and assigns a quota to each subset of the population.
Pros: This process can be extended to cover several characteristics and varying degrees of complexity.
Cons: Though the method is superior to convenience and snowball sampling, it does not support the kind of statistical inference that the probability methods allow.
Use Case: If a survey requires a sample of fifty men and fifty women, a quota sample will survey respondents until the right number of each type has been surveyed. Unlike stratified sampling, the sample isn’t necessarily randomized.
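The fifty-men, fifty-women example can be sketched as follows. This is a minimal illustration, assuming respondents are taken in whatever order they become available; the function name, the respondent records and the arrival pattern are hypothetical.

```python
def quota_sample(respondents, quotas, key):
    """Accept respondents in arrival order until every group's quota is filled."""
    remaining = dict(quotas)
    sample = []
    for person in respondents:
        group = key(person)
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas filled, stop surveying
    return sample


# Hypothetical stream of 300 respondents; every third one is a woman.
respondents = [
    {"id": i, "gender": "female" if i % 3 == 0 else "male"}
    for i in range(1, 301)
]

sample = quota_sample(respondents, {"male": 50, "female": 50}, key=lambda p: p["gender"])
print(len(sample))  # 100
```

Nothing here is randomized: the first fifty men and the first fifty women to turn up are the ones surveyed, which is the difference from stratified sampling noted above.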
Probability sampling techniques are clearly superior for drawing inferences about the population, but the costs can be prohibitive. For the initial stages of a study, non-probability sampling techniques might be sufficient to give you a sense of what you’re dealing with. For detailed insights and results that you can rely on, move on to the more sophisticated techniques as the study gathers pace and takes a more concrete structure.
Once you have created your sample, optimize your survey quality by choosing the right survey question types.
Note: This article was originally published on 27 April 2015, then refreshed and updated on 25 July 2017.