What is the first thing that comes to mind when we see data? The first instinct is to find patterns, connections, and relationships. We look at the data to find meaning in it.
Similarly, in research, once data is collected, the next step is to get insights from it. For example, if a clothing brand is trying to identify the latest trends among young women, the brand will first reach out to young women and ask them questions relevant to the research objective. After collecting this information, the brand will analyze that data to identify patterns — for example, it may discover that most young women would like to see more variety of jeans.
Data analysis is how researchers go from a mass of data to meaningful insights. There are many different data analysis methods, depending on the type of research. Here are a few methods you can use to analyze quantitative and qualitative data.
Analyzing Quantitative Data
The first stage of analyzing data is data preparation, where the aim is to convert raw data into something meaningful and readable. It includes three steps:
Step 1: Data Validation
The purpose of data validation is to find out, as far as possible, whether the data was collected according to the pre-set standards and without bias. It is a four-step process:
- Fraud, to infer whether each respondent was actually interviewed or not.
- Screening, to make sure that respondents were chosen as per the research criteria.
- Procedure, to check whether the data collection procedure was duly followed.
- Completeness, to ensure that the interviewer asked the respondent all the questions, rather than just a few required ones.
To do this, researchers would need to pick a random sample of completed surveys and validate the collected data. (Note that this can be time-consuming for surveys with lots of responses.) For example, imagine a survey with 200 respondents split into 2 cities. The researcher can pick a sample of 20 random respondents from each city. After this, the researcher can reach out to them through email or phone and check their responses to a certain set of questions.
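The sampling step described above can be sketched in a few lines of Python. This is only an illustration: the record layout (`id` and `city` keys), the even 100/100 split between cities, and the function name `sample_for_validation` are assumptions made for the example, not part of any particular survey tool.

```python
import random
from collections import defaultdict


def sample_for_validation(respondents, per_city, seed=None):
    """Pick `per_city` random respondents from each city for follow-up checks."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    by_city = defaultdict(list)
    for respondent in respondents:
        by_city[respondent["city"]].append(respondent)
    # Sample without replacement; never ask for more than a city contains.
    return {
        city: rng.sample(group, min(per_city, len(group)))
        for city, group in by_city.items()
    }


# Hypothetical data: 200 respondents, assumed to be 100 in each of two cities.
respondents = [
    {"id": i, "city": "City A" if i < 100 else "City B"} for i in range(200)
]
validation_sample = sample_for_validation(respondents, per_city=20, seed=42)
for city, group in validation_sample.items():
    print(city, len(group))
```

Sampling within each city separately, rather than from the pooled list, guarantees that both cities are represented in the follow-up calls.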
Step 2: Data Editing
Typically, large data sets include errors. For example, respondents may fill in fields incorrectly or skip them accidentally. To catch such errors, the researcher should conduct basic data checks, check for outliers, and edit the raw research data to identify and clear out any data points that may hamper the accuracy of the results.
For example, respondents may have left some fields empty. While editing the data, it is important to either remove or fill in all the empty fields.
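As a rough sketch of these checks, the snippet below flags empty fields and then flags outliers. The records and ages are invented for illustration, and the interquartile-range rule (Tukey's rule) used here is just one common convention for spotting outliers, not the only option.

```python
import statistics


def find_missing(records, fields):
    """Return (record id, field) pairs where a required field is empty."""
    return [
        (record["id"], field)
        for record in records
        for field in fields
        if record.get(field) in (None, "")
    ]


def find_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # needs Python 3.8+
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical survey records; None or "" marks a skipped field.
records = [
    {"id": 1, "age": 24, "income": 32000},
    {"id": 2, "age": None, "income": 41000},
    {"id": 3, "age": 31, "income": ""},
    {"id": 4, "age": 27, "income": 38000},
]
missing = find_missing(records, ["age", "income"])
print(missing)  # [(2, 'age'), (3, 'income')]

# Hypothetical ages; 240 is probably a typo for 24.
ages = [24, 31, 27, 29, 26, 33, 28, 30, 25, 240]
outliers = find_outliers(ages)
print(outliers)  # [240]
```

A flagged value is a prompt to go back to the source, not an automatic deletion: whether to correct, drop, or keep it is a judgment call for the researcher.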
Step 3: Data Coding
This is one of the most important steps in data preparation. It refers to grouping and assigning values to responses from the survey.
For example, if a researcher has interviewed 1,000 people and now wants to summarize the ages of the respondents, the researcher can create age brackets and categorize each respondent’s age according to these codes. (For example, respondents aged 13-15 would have their age coded as 0, 16-18 as 1, 19-21 as 2, etc.)
Then during analysis, the researcher can deal with simplified age brackets, rather than a massive range of individual ages.
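A minimal sketch of this kind of coding is shown below. It assumes non-overlapping three-year brackets starting at age 13 (13-15, 16-18, 19-21, and so on); the function name `code_age` and the sample ages are hypothetical.

```python
def code_age(age, start=13, width=3):
    """Map an age to a bracket code: 13-15 -> 0, 16-18 -> 1, 19-21 -> 2, ..."""
    if age < start:
        raise ValueError(f"age {age} is below the first bracket ({start})")
    return (age - start) // width


ages = [13, 15, 16, 18, 19, 21, 22]  # hypothetical responses
codes = [code_age(a) for a in ages]
print(codes)  # [0, 0, 1, 1, 2, 2, 3]
```

Brackets must not overlap, otherwise a respondent on a boundary could be assigned two different codes.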
Quantitative Data Analysis Methods
After these steps, the data is ready for analysis. The two most commonly used quantitative data analysis methods are descriptive statistics and inferential statistics.
Typically descriptive statistics (also known as descriptive analysis) is the first level of analysis. It helps researchers summarize the data and find patterns. A few commonly used descriptive statistics are:
- Mean: numerical average of a set of values.
- Median: midpoint of a set of numerical values.
- Mode: most common value among a set of values.
- Percentage: used to express how a value or group of respondents within the data relates to a larger group of respondents.
- Frequency: the number of times a value is found.
- Range: the difference between the highest and lowest value in a set of values.
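The measures listed above can all be computed with Python's standard library. The ages below are made up for illustration, and the "30 or older" percentage is just one example of a group share.

```python
import statistics
from collections import Counter

ages = [22, 25, 25, 28, 30, 30, 30, 34, 46]  # hypothetical sample

mean = statistics.mean(ages)            # numerical average
median = statistics.median(ages)        # midpoint of the sorted values
mode = statistics.mode(ages)            # most common value
frequency = Counter(ages)               # how often each value occurs
value_range = max(ages) - min(ages)     # highest minus lowest value
# Percentage: share of respondents aged 30 or older.
pct_30_plus = 100 * sum(1 for a in ages if a >= 30) / len(ages)

print(mean, median, mode, value_range)
print(frequency[25])
print(f"{pct_30_plus:.1f}%")
```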
Descriptive statistics provide absolute numbers. However, they do not explain the rationale or reasoning behind those numbers. Before applying descriptive statistics, it’s important to think about which one is best suited for your research question and what you want to show. For example, a percentage is a good way to show the gender distribution of respondents.
Descriptive statistics are most helpful when the research is limited to the sample and does not need to be generalized to a larger population. For example, if you are comparing the percentage of children vaccinated in two different villages, then descriptive statistics is enough.
Since descriptive analysis is mostly used for analyzing a single variable, it is often called univariate analysis.
Often, researchers collect data on a sample of their population and then generalize the results to the entire population or target group. Inferential statistics are used to generalize results and make predictions about a larger population.
These are complex analyses that show the relationship between several different variables, rather than describing a single variable. They are used when the researcher needs to go beyond absolute values and understand the relations between variables.
A few types of inferential analysis are:
- Correlation: This describes the strength and direction of the relationship between two variables. If a correlation is found, it means that the variables tend to change together. For example, taller people tend to have a higher weight. Hence, height and weight are correlated with each other. However, this doesn’t necessarily mean that one variable causes the other (e.g. gaining weight doesn’t cause people to grow taller).
- Regression: This models how one variable (the dependent variable) changes as one or more other variables (the independent variables) change, which makes it useful for prediction. For example, regression can help us estimate someone’s weight based on their height.
- Analysis of variance: This is a statistical procedure used to test whether the means of two or more groups differ. It compares the variance between the groups with the variance within each group; when the between-group variance is large relative to the within-group variance, the difference between the groups is unlikely to be due to chance alone. For example, to understand the relationship between the number of children in a family and the socio-economic status, a researcher may recruit a sample of families from each socio-economic status and ask them about their ideal number of children. Analysis of variance will be used to check if the difference between the groups’ answers is statistically significant or due to random chance.
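To make the height-and-weight example concrete, here is a small sketch that computes a Pearson correlation coefficient and fits a simple least-squares regression line by hand. The measurements are invented, and the helper names `pearson_r` and `fit_line` are hypothetical; statistical packages offer equivalent, better-tested routines.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


# Hypothetical measurements for six respondents.
heights_cm = [150, 160, 165, 170, 180, 185]
weights_kg = [52, 58, 63, 66, 75, 80]

r = pearson_r(heights_cm, weights_kg)
slope, intercept = fit_line(heights_cm, weights_kg)
predicted = slope * 175 + intercept  # estimated weight at 175 cm

print(f"r = {r:.3f}")
print(f"predicted weight at 175 cm: {predicted:.1f} kg")
```

The correlation coefficient says how tightly the two variables move together; the fitted line turns that relationship into a prediction.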
The choice of inferential statistic depends entirely on the research objective. As with descriptive statistics, it is best to identify the appropriate inferential statistic for your research questions before you begin.
Since inferential statistics are used to determine the relationship between two or more variables, they are called bivariate analysis (when limited to two variables) or multivariate analysis (when there are more than two variables).
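For the analysis of variance example described earlier, the core calculation is the F statistic: the variance between group means divided by the variance within groups. The sketch below computes it by hand for a one-way design. The three groups and their answers are invented, and `one_way_anova_f` is a hypothetical helper name.

```python
def one_way_anova_f(groups):
    """Return the F statistic for a one-way analysis of variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual answers around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical "ideal number of children" answers by socio-economic group.
low = [3, 4, 3, 5, 4]
middle = [2, 3, 2, 3, 3]
high = [2, 1, 2, 2, 3]

f_stat = one_way_anova_f([low, middle, high])
print(f"F = {f_stat:.2f}")
```

A larger F means the groups differ more than their internal spread would suggest. Turning F into a p-value requires the F distribution with (k - 1, N - k) degrees of freedom, which statistical libraries provide.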
The above-stated methods are the most commonly used methods for data analysis. However, other data analysis methods and metrics, such as standard deviation and variance, are also available.
Analyzing Qualitative Data
Qualitative data analysis works a little differently from quantitative data, primarily because qualitative data is made up of words, observations, images, and even symbols. Deriving absolute meaning from such data is nearly impossible; hence, it is mostly used for exploratory research. While in quantitative research there is a clear distinction between the data preparation and data analysis stage, analysis for qualitative research often begins as soon as the data is available.
Data Preparation and Basic Data Analysis
Analysis and preparation happen in parallel and include the following steps:
- Getting familiar with the data: Since most qualitative data is just words, the researcher should start by reading the data several times to get familiar with it and start looking for basic observations or patterns. This also includes transcribing the data.
- Revisiting research objectives: Here, the researcher revisits the research objective and identifies the questions that can be answered through the collected data.
- Developing a framework: Also known as coding or indexing, here the researcher identifies broad ideas, concepts, behaviors, or phrases and assigns codes to them. For example, the researcher might code age, gender, socio-economic status, and even concepts such as a positive or negative response to a question. Coding is helpful in structuring and labeling the data.
- Identifying patterns and connections: Once the data is coded, the researcher can start identifying themes, looking for the most common responses to questions, identifying data or patterns that can answer research questions, and finding areas that can be explored further.
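As a rough illustration of the coding step, the sketch below assigns codes to free-text responses using a small keyword codebook. The codebook, the responses, and the function name `assign_codes` are all hypothetical; in practice a researcher usually builds and refines the codebook while reading the data, and simple keyword matching is only a first pass.

```python
# Hypothetical codebook: each code is triggered by any of its keywords.
CODEBOOK = {
    "cost": ["price", "expensive", "afford"],
    "quality": ["quality", "durable", "fabric"],
    "variety": ["variety", "choice", "options"],
}


def assign_codes(response, codebook=CODEBOOK):
    """Return the sorted list of codes whose keywords appear in the response."""
    text = response.lower()
    return sorted(
        code
        for code, keywords in codebook.items()
        if any(keyword in text for keyword in keywords)
    )


responses = [
    "The jeans are too expensive for the quality.",
    "I would like more variety in colors.",
    "No opinion.",
]
coded = [assign_codes(r) for r in responses]
print(coded)  # [['cost', 'quality'], ['variety'], []]
```

Note that this matches substrings, so "afford" also matches "affordable"; responses that receive no code are worth reading again by hand.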
How to Find Patterns in Qualitative Data
There are several ways to identify qualitative data patterns (also known as “themes”). One way is to use word-based methods, such as word repetitions. In this method, the researcher simply reads the text and identifies the words used most often. For example, in a study on what India’s youth think about politics, the researcher may find that the most commonly used words are “greed” or “corruption” and use them for analysis.
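The word-repetition method can be approximated with a simple word count. The responses and the short stop-word list below are invented for illustration; a real study would use a fuller stop-word list and still read the text in context.

```python
import re
from collections import Counter

# A deliberately short, hypothetical stop-word list.
STOP_WORDS = {"a", "and", "are", "from", "in", "is", "of", "the", "to"}


def top_words(responses, n=3):
    """Count non-stop-words across all responses and return the n most common."""
    counts = Counter()
    for text in responses:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)


responses = [
    "Politics is full of corruption and greed.",
    "Corruption stops young people from trusting politics.",
    "Greed and corruption decide every election.",
]
top = top_words(responses)
print(top)  # [('corruption', 3), ('politics', 2), ('greed', 2)]
```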
Another word-based technique is key words in context. Here, the researcher tries to understand a concept by looking at the context in which it is used. For example, if researchers are trying to analyze the concept of depression among respondents, they can analyze the context of when the respondent has referred to depression. (This could be while discussing mental health, family-related issues, etc.)
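A key-words-in-context listing can be sketched as follows: for every occurrence of the keyword, show a few words on either side. The transcript text, the window size, and the function name `keyword_in_context` are assumptions made for this example.

```python
import re


def keyword_in_context(text, keyword, window=3):
    """Return (left context, keyword, right context) for each occurrence."""
    words = re.findall(r"[\w']+", text)
    hits = []
    for i, word in enumerate(words):
        if word.lower() == keyword.lower():
            left = " ".join(words[max(0, i - window):i])
            right = " ".join(words[i + 1:i + 1 + window])
            hits.append((left, word, right))
    return hits


# Hypothetical interview excerpt.
transcript = (
    "I first noticed my depression after losing my job. "
    "My family never talked about depression at home."
)
hits = keyword_in_context(transcript, "depression")
for left, word, right in hits:
    print(f"{left} [{word}] {right}")
```

Lining up the occurrences this way makes it easier to see whether the concept comes up alongside work, family, health, or other topics.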
Another way to identify patterns is to use scrutiny-based techniques. One such method is the compare and contrast method, where a theme represents a way in which texts are similar to or different from each other. For example, one theme may be “importance of counselor in a school”, and the collected text data may be divided into those who think there should be a counselor and those who think it is unnecessary. These texts can be further analyzed to learn the why behind each group’s beliefs.
Other scrutiny-based methods to identify patterns or themes include looking for metaphors and analogies in the text, or looking for connectors in the form of words or phrases that indicate a relationship between different ideas or things.
Qualitative Data Analysis Methods
Several methods are available to analyze qualitative data. The most commonly used data analysis methods are:
- Content analysis: This is one of the most common methods to analyze qualitative data. It is used to analyze documented information in the form of texts, media, or even physical items. When to use this method depends on the research questions. Content analysis is usually used to analyze responses from interviewees.
- Narrative analysis: This method is used to analyze content from various sources, such as interviews of respondents, observations from the field, or surveys. It focuses on using the stories and experiences shared by people to answer the research questions.
- Discourse analysis: Like narrative analysis, discourse analysis is used to analyze interactions with people. However, it focuses on analyzing the social context in which the communication between the researcher and the respondent occurred. Discourse analysis also looks at the respondent’s day-to-day environment and uses that information during analysis.
- Grounded theory: This refers to using qualitative data to explain why a certain phenomenon happened. It does this by studying a variety of similar cases in different settings and using the data to derive causal explanations. Researchers may alter the explanations or create new ones as they study more cases until they arrive at an explanation that fits all cases.
These methods are the ones used most commonly. However, other data analysis methods, such as conversation analysis, are also available.
Data analysis is perhaps the most important component of research. Weak analysis produces inaccurate results that not only hamper the authenticity of the research but also make the findings unusable. It’s imperative to choose your data analysis methods carefully to ensure that your findings are insightful and actionable.