Statistical analysis entails using quantitative data to investigate trends, patterns, and relationships. It is used by scientists, governments, businesses, and other organizations to conduct research.

Statistical analysis requires careful planning from the very start of the research process if you want to draw meaningful conclusions. You need to specify your hypotheses and decide on your study design, sample size, and sampling procedure.

After you’ve collected data from your sample, you can organize and summarize it using descriptive statistics. Then, you can use inferential statistics to formally test hypotheses and make estimates about the population. Finally, you can interpret and generalize your findings.

This article is a practical introduction to statistical analysis for students and researchers. We’ll walk you through the steps using two research examples. The first investigates a potential cause-and-effect relationship, while the second investigates a potential correlation between variables.

Example: Causal research question

Can meditation help teenagers perform better on exams?

Example: Correlational research question

Is there a link between parental wealth and a student’s college grade point average (GPA)?

Step 1: Develop your hypotheses and research design

To collect reliable data for statistical analysis, you must first define your hypothesis and design your study.

Writing statistical hypotheses

A common goal of research is to investigate a relationship between variables within a population. You start with a prediction and then test it through statistical analysis.

A statistical hypothesis is a formal way of writing a prediction about a population. Every research prediction is restated as null and alternative hypotheses that can be tested using sample data.

The null hypothesis always predicts no effect or no relationship between variables, while the alternative hypothesis states your research prediction of an effect or relationship.

Example: Statistical hypotheses to test an effect

• Null hypothesis: A 5-minute meditation exercise has no effect on teenagers’ math test scores.

• Alternative hypothesis: A 5-minute meditation exercise will improve teenagers’ math test scores.

Example: Statistical hypotheses to test a correlation

• Null hypothesis: Parental income and GPA are not related in college students.

• Alternative hypothesis: Parental income and GPA are positively correlated in college students.

Creating a research design

Your research design is your overall strategy for data collection and analysis. It determines the statistical tests you can later use to test your hypotheses.

First, decide whether your study will be descriptive, correlational, or experimental. Experiments directly manipulate variables, whereas descriptive and correlational studies only measure them.

• In an experimental design, you can assess a cause-and-effect relationship (e.g., the effect of meditation on test scores) using statistical tests of comparison or regression.

• In a correlational design, you can explore relationships between variables (e.g., parental income and GPA) without assuming causality, using correlation coefficients and significance tests.

• In a descriptive design, you can study the characteristics of a population or phenomenon (e.g., the prevalence of anxiety in US college students) using statistical tests to draw inferences from sample data.

Your study’s design also determines whether you’ll compare participants on a group or individual basis, or both.

• A between-subjects design compares the group-level results of participants who were exposed to different treatments (for example, those who conducted a meditation exercise vs. those who did not).

• A within-subjects design compares repeated measures from participants who have completed all of the study’s treatments (e.g., scores from before and after performing a meditation exercise).

• In a mixed (factorial) design, one variable is manipulated between subjects and another is manipulated within subjects (e.g., pretest and posttest scores from participants who either did or did not do a meditation exercise).

Example: Experimental research design

You design a within-subjects experiment to test whether a 5-minute meditation exercise can improve students’ math test scores. Your study takes repeated measures from one group of participants.

First, you’ll take baseline test scores from participants. Then, your participants will do a 5-minute meditation exercise. Finally, you’ll record their scores on a second math test.

In this study, the independent variable is the 5-minute meditation exercise, and the dependent variable is the math test score from before and after the intervention.

Measuring variables

When planning a research design, you should operationalize your variables and decide exactly how you will measure them.

For statistical analysis, it’s important to consider the level of measurement of your variables, because it tells you what kind of data they contain:

• Categorical data represents groupings. These can be nominal (e.g., gender) or ordinal (e.g., level of language ability).

• Quantitative data represents amounts. These can be on an interval scale (e.g., test score) or a ratio scale (e.g., age).

Many variables can be measured at different levels of precision. For example, age data can be quantitative (8 years old) or categorical (young). If a variable is coded numerically (e.g., level of agreement from 1 to 5), it doesn’t automatically mean it’s quantitative instead of categorical.

Identifying the measurement level is important for choosing appropriate statistics and hypothesis tests. For example, you can calculate a mean score with quantitative data, but not with categorical data.

In a research study, along with measures of your variables of interest, you’ll often collect data on relevant participant characteristics.

Example: Variables (experiment)

You can perform many calculations with quantitative age or test score data, whereas categorical variables can be used to decide groupings for comparison tests.

Variable               Type of data
Age                    Quantitative (ratio)
Gender                 Categorical (nominal)
Race or ethnicity      Categorical (nominal)
Baseline test scores   Quantitative (interval)
Final test scores      Quantitative (interval)

Step 2: Collect data from a representative sample

In most cases, it’s too difficult or expensive to collect data from every member of the population you’re interested in studying. Instead, you’ll collect data from a sample.

Statistical analysis allows you to apply your findings beyond your own sample, as long as you use appropriate sampling procedures. You should aim for a sample that’s representative of the population.

Sampling for statistical analysis

There are two major methods for choosing a sample.

• Probability sampling: every member of the population has a chance of being chosen at random for the study.

• Non-probability sampling: some people are more likely to be chosen for the study than others based on factors like convenience or voluntary self-selection.

In theory, you should use a probability sampling method for highly generalizable findings. Random selection reduces sampling bias and helps ensure that the data from your sample is actually representative of the population. Parametric tests can be used to make strong statistical inferences when data is collected using probability sampling.

In practice, however, it’s rarely possible to gather the ideal sample. While non-probability samples are more likely to be biased, they are much easier to recruit and collect data from. Non-parametric tests are more appropriate for non-probability samples, but they result in weaker inferences about the population.

If you want to use parametric tests with non-probability samples, you have to make the case that:

• your sample is representative of the population you’re generalizing your findings to.

• there is no systematic bias in your sample.

Keep in mind that external validity means you can only generalize your conclusions to people who share the characteristics of your sample. Results from Western, Educated, Industrialized, Rich, and Democratic (WEIRD) samples (e.g., college students in the US) aren’t automatically transferable to non-WEIRD populations.

If you use parametric tests on data from non-probability samples, make sure to explain in your discussion section how far your results can be generalized.

Create an appropriate sampling procedure

Decide how you’ll recruit participants based on the resources available for your study.

• Will you have the resources to advertise your study widely, including outside of your university?

• Will you be able to get a varied sample that represents the entire population?

• Do you have time to reach out to members of hard-to-reach groups and follow up with them?

• Experimental

• Correlational

Example: Sampling (experiment)

The population you’re interested in is high school students in your city. You contact three private schools and seven public schools across the city to see if you can administer your experiment to students in the 11th grade.

Participants are selected by their respective schools. Although you’re using a non-probability sample, you aim for a diverse and representative sample.

Calculate an appropriate sample size

Before recruiting participants, decide on your sample size either by looking at other studies in your field or by using statistics. A sample that’s too small may be unrepresentative of the population, while a sample that’s too large will be more costly than necessary.

There are many sample size calculators available online. Different formulas are used depending on whether you have subgroups or how rigorous your study should be (e.g., in clinical research). As a rule of thumb, a minimum of 30 units per subgroup is necessary.

To use these calculators, you need to understand and input these key components (a code sketch follows the list below):

• Alpha (significance level): the risk of rejecting a true null hypothesis that you are willing to accept, usually set at 5%.

• Statistical power: the probability of your study detecting an effect of a certain size if there is one, usually 80% or higher.

• Expected effect size: a standardized estimate of the size of your study’s expected result, usually based on similar studies.

• Population standard deviation: a population parameter estimate based on a previous study or your own pilot study.
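For illustration, here’s a minimal sketch of how these inputs combine in a power analysis, assuming Python with the statsmodels library; the expected effect size of 0.5 is a hypothetical value, not one taken from the example studies.

```python
# Minimal power-analysis sketch (assumes the statsmodels package is installed).
from statsmodels.stats.power import TTestPower

analysis = TTestPower()  # power analysis for a one-sample or paired t test
n = analysis.solve_power(
    effect_size=0.5,       # hypothetical expected effect size (Cohen's d)
    alpha=0.05,            # significance level: 5% risk of a Type I error
    power=0.8,             # 80% chance of detecting the effect if it exists
    alternative="larger",  # one-sided test: we predict an increase
)
print(f"Minimum sample size: {n:.1f} participants")
```

Lowering alpha or raising power increases the required sample size, which is the trade-off these calculators make explicit.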

Step 3: Use descriptive statistics to summarize your data

Once you’ve collected all of your data, you can inspect it and calculate descriptive statistics that summarize it.

Inspect your data

There are several methods for inspecting your data, including:

• Using frequency distribution tables to organize data from each variable.

• Using a bar chart to visualize the distribution of answers from a key variable.

• Using a scatter plot to visualize the relationship between two variables.

By visualizing your data in tables and graphs, you can assess whether your data follows a skewed or normal distribution and whether there are any outliers or missing data.
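Here’s a minimal sketch of this kind of inspection, assuming Python with pandas and matplotlib; the file name survey_data.csv and the column names score, income, and gpa are hypothetical.

```python
# Minimal data-inspection sketch (assumes pandas and matplotlib are installed).
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("survey_data.csv")  # hypothetical data file

# Frequency distribution table for one variable
print(df["score"].value_counts().sort_index())

# Bar chart of the distribution of a key variable
df["score"].value_counts().sort_index().plot(kind="bar")
plt.show()

# Scatter plot of the relationship between two variables
df.plot(kind="scatter", x="income", y="gpa")
plt.show()

# Quick checks for outliers and missing data
print(df.describe())    # summary statistics reveal extreme values
print(df.isna().sum())  # count of missing values per column
```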

In a normal distribution, data is symmetrically distributed around a central point where most of the values lie, with values tapering off toward the two ends.

In contrast, a skewed distribution is asymmetric and has more values on one end than the other. The shape of the distribution is important to note because only some descriptive statistics should be used with skewed distributions.

Extreme outliers can also produce misleading statistics, so you may need a systematic approach to dealing with these values.

Calculate measures of central tendency

Measures of central tendency describe where most of the values in a data set lie. Three main measures of central tendency are often reported:

• Mode: the most frequent response or value in the data set.

• Median: the value in the data set that is exactly in the middle when ordered from low to high.

• Mean: the sum of all values divided by the number of values.

Depending on the shape of the distribution and the level of measurement, only one or two of these measures may be appropriate. For example, many demographic characteristics can only be described using the mode or proportions, while a variable like reaction time may not have a mode at all.

Calculate measures of variability

Measures of variability tell you how spread out the values in a data set are. Four main measures of variability are often reported:

• Range: the data set’s highest value minus its lowest value.

• Interquartile range: the range of the data set’s middle half.

• Standard deviation: the average distance between the mean and each value in your data set.

• Variance: the standard deviation squared.

Once again, the shape of the distribution and the level of measurement should guide your choice of variability statistics. The interquartile range is the best measure for skewed distributions, while standard deviation and variance provide the most information for normal distributions.
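Here’s a minimal sketch of calculating all of these measures, assuming Python with numpy; the scores array is hypothetical data.

```python
# Minimal descriptive-statistics sketch (assumes numpy is installed).
import numpy as np

scores = np.array([71, 75, 75, 78, 80, 82, 84, 85, 89, 91])  # hypothetical data

# Central tendency
values, counts = np.unique(scores, return_counts=True)
print("Mode:", values[counts.argmax()])  # most frequent value
print("Median:", np.median(scores))      # middle value when sorted
print("Mean:", scores.mean())            # sum of values / number of values

# Variability
print("Range:", np.ptp(scores))                   # highest minus lowest value
q1, q3 = np.percentile(scores, [25, 75])
print("Interquartile range:", q3 - q1)            # spread of the middle half
print("Standard deviation:", scores.std(ddof=1))  # sample standard deviation
print("Variance:", scores.var(ddof=1))            # standard deviation squared
```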

Example: Descriptive statistics (experiment)

After collecting pretest and posttest data from 30 students across the city, you calculate descriptive statistics. Because you have normally distributed data on an interval scale, you tabulate the mean, standard deviation, variance, and range.

Using your table, you should check whether the units of the descriptive statistics are comparable for pretest and posttest scores. For example, are the variance levels similar across the groups? Are there any extreme values? If there are, you may need to identify and remove extreme outliers in your data set or transform your data before performing a statistical test.

Suppose your table shows that the mean score increased after the meditation exercise and that the variances of the two scores are comparable. The next step is to perform a statistical test to find out whether this increase in test scores is statistically significant in the population.

Step 4: Use inferential statistics to test hypotheses or make estimates

A statistic is a number that describes a sample, whereas a parameter is a number that characterizes a population. You can draw conclusions about population parameters using inferential statistics based on sample statistics.

Researchers often use two main methods (simultaneously) to make statistical inferences.

• Estimation: using sample statistics to calculate population parameters.

• Hypothesis testing: formally testing research predictions about the population using sample data.

Estimation

You can make two types of estimates of population parameters from sample statistics:

• A point estimate is a number that provides your best guess of a parameter’s exact value.

• An interval estimate is a range of values that represents your best guess of where the parameter lies.

If your aim is to infer and report population characteristics from sample data, it’s best to use both point and interval estimates in your paper.

When you have a representative sample, you can consider a sample statistic to be a point estimate for the population parameter (e.g., in a wide public opinion poll, the proportion of a sample that supports the current government is taken as the population proportion of government supporters).

Because there is always some degree of error involved in estimation, you should also provide a confidence interval as an interval estimate to show the variability around a point estimate.

A confidence interval uses the standard error and the z score from the standard normal distribution to convey where you’d generally expect to find the population parameter most of the time.
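As an illustration, here’s a minimal sketch of a point estimate with a 95% confidence interval, assuming Python with scipy; the sample mean, standard deviation, and sample size are hypothetical values.

```python
# Minimal confidence-interval sketch (assumes scipy is installed).
import math
from scipy import stats

mean, sd, n = 104.0, 9.3, 300  # hypothetical sample statistics
se = sd / math.sqrt(n)         # standard error of the mean
z = stats.norm.ppf(0.975)      # z score for 95% confidence (about 1.96)

lower, upper = mean - z * se, mean + z * se
print(f"Point estimate: {mean}")
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```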

Testing hypotheses

You can test hypotheses regarding links between variables in the population using data from a sample. Hypothesis testing begins with the premise that the null hypothesis is true in the population, and statistical tests are used to determine whether or not the null hypothesis can be rejected.

Statistical tests determine where your sample data would lie on an expected distribution of sample data if the null hypothesis were true. These tests give two main outputs:

• A test statistic indicates how far your data deviates from the test’s null hypothesis.

• A p value indicates how likely it is that your results will be obtained if the null hypothesis is true in the population.

There are three types of statistical tests:

• Comparison tests assess group differences in outcomes.

• Regression tests assess cause-and-effect relationships between variables.

• Correlation tests look at how variables are related without implying causation.

The statistical test you choose is determined by your research questions, research strategy, sampling method, and data characteristics.

Statistical tests

Parametric tests can make strong inferences about the population based on sample data. But to use them, some assumptions must be met, and only some types of variables can be used. If your data violate these assumptions, you can perform appropriate data transformations or use alternative non-parametric tests instead.

A regression model estimates how changes in a predictor variable affect an outcome variable (or variables); see the sketch after this list.

• A simple linear regression includes one predictor variable and one outcome variable.

• A multiple linear regression includes two or more predictor variables and one outcome variable.
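Here’s a minimal sketch of a simple linear regression, assuming Python with scipy; the parental income and GPA values are invented for illustration.

```python
# Minimal simple-linear-regression sketch (assumes scipy is installed).
from scipy import stats

income = [20, 35, 50, 65, 80, 95, 110, 125]     # hypothetical, in $1,000s
gpa = [2.8, 3.0, 3.1, 3.3, 3.2, 3.6, 3.5, 3.7]  # hypothetical GPAs

result = stats.linregress(income, gpa)
print(f"Slope: {result.slope:.4f}")          # change in GPA per $1,000 of income
print(f"Intercept: {result.intercept:.2f}")  # predicted GPA when income is zero
print(f"p value: {result.pvalue:.4f}")       # significance test of the slope
```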

In most comparison tests, the means of groups are compared. These can be the means of various groups within a sample (for example, a treatment and control group), the means of one sample group collected at different times (for example, pretest and posttest scores), or the difference between a sample mean and a population mean.

• A t test is used when comparing exactly 1 or 2 groups and the sample size is small (30 or less).

• A z test is used when comparing exactly 1 or 2 groups and the sample size is large.

• An ANOVA is used when there are three or more groups.

The subtypes of z and t tests depend on the number and types of samples and on your hypotheses:

• Use a one-sample test if you only have one sample to compare to the population mean.

• Use a dependent (paired) samples test if you have paired measurements (within-subjects design).

• Use an independent (unpaired) samples test if you have fully different measurements from two unmatched groups (between-subjects design).

• Use a one-tailed test if you expect a significant difference between groups in one direction.

• Use a two-tailed test if you have no assumptions about the direction of a difference between groups.

The only parametric correlation test is Pearson’s r. The correlation coefficient (r) measures the strength of a linear relationship between two quantitative variables.

However, to test whether the correlation in the sample is strong enough to be important in the population, you also need to perform a significance test of the correlation coefficient, usually a t test, to obtain a p value. This test uses your sample size to assess how far the correlation coefficient differs from zero in the population.
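Here’s a minimal sketch of computing Pearson’s r along with its p value, assuming Python with scipy; the data arrays are hypothetical. Note that scipy’s pearsonr performs the significance test for you.

```python
# Minimal Pearson correlation sketch (assumes scipy is installed).
from scipy import stats

income = [20, 35, 50, 65, 80, 95, 110, 125]     # hypothetical, in $1,000s
gpa = [2.8, 3.0, 3.1, 3.3, 3.2, 3.6, 3.5, 3.7]  # hypothetical GPAs

r, p = stats.pearsonr(income, gpa)  # correlation coefficient and its p value
print(f"Pearson's r: {r:.2f}, p value: {p:.4f}")
```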

Example: Paired t test (experiment)

Because your study is a within-subjects experiment, both pretest and posttest measurements come from the same group, so you need a dependent (paired) t test. Because you predict a change in a specific direction (an increase in test scores), you need a one-tailed test.

You perform a dependent-samples, one-tailed t test to assess whether the meditation exercise significantly improved math test scores. The test gives you the following results (a code sketch follows them):

• a t value (test statistic) of 3.00

• a p value of 0.0027
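Here’s a minimal sketch of this test, assuming Python with scipy (version 1.6+ for the alternative argument); the score arrays are hypothetical stand-ins for the 30 students’ data, so the output won’t reproduce the exact numbers above.

```python
# Minimal paired, one-tailed t test sketch (assumes scipy >= 1.6).
from scipy import stats

pretest = [65, 70, 72, 68, 75, 80, 71, 69, 74, 77]    # hypothetical scores
posttest = [70, 73, 75, 71, 78, 83, 74, 72, 76, 80]   # hypothetical scores

# alternative="greater" makes the test one-tailed: posttest > pretest
t, p = stats.ttest_rel(posttest, pretest, alternative="greater")
print(f"t value: {t:.2f}, p value: {p:.4f}")
```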

Step 5: Analyze your findings

Interpreting your findings is the final step in statistical analysis.

Statistical significance

The main criterion for drawing conclusions in hypothesis testing is statistical significance. To determine if your results are statistically significant or not, you compare your p value to a predetermined significance level (typically 0.05).

Statistically significant results are considered unlikely to have arisen solely due to chance. There is only a very low probability of such a result occurring if the null hypothesis is true in the population.

Example: Interpreting results (experiment)

You compare your p value of 0.0027 to your significance threshold of 0.05. Since your p value is lower, you reject the null hypothesis and consider your results statistically significant.

This means that you believe the meditation intervention, rather than random factors, caused the increase in test scores.

Effect size

A statistically significant result does not always imply that the research has meaningful real-world applications or therapeutic implications.

Instead, the effect size indicates the practical significance of your results. For a complete picture of your findings, include effect sizes alongside your inferential statistics. If you’re writing an APA-style paper, you should also report interval estimates of effect sizes.

Example: Effect size (experiment)

You calculate Cohen’s d to quantify the size of the difference between pretest and posttest scores.

With a Cohen’s d of 0.72, there is medium to high practical significance to your finding that the meditation exercise improved test scores.
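Here’s a minimal sketch of computing Cohen’s d, assuming Python with numpy; scipy has no built-in for it, and the pooled-standard-deviation formula used here is one common choice among several for paired designs. The score arrays are hypothetical.

```python
# Minimal Cohen's d sketch (assumes numpy is installed).
import numpy as np

pretest = np.array([65, 70, 72, 68, 75, 80, 71, 69, 74, 77])   # hypothetical
posttest = np.array([70, 73, 75, 71, 78, 83, 74, 72, 76, 80])  # hypothetical

# Pool the sample variances of the two score sets, then standardize the
# mean difference by the pooled standard deviation.
pooled_sd = np.sqrt((np.var(pretest, ddof=1) + np.var(posttest, ddof=1)) / 2)
d = (posttest.mean() - pretest.mean()) / pooled_sd
print(f"Cohen's d: {d:.2f}")  # rough guide: ~0.2 small, ~0.5 medium, ~0.8 large
```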

Decision errors

Type I and Type II errors are mistakes made in research conclusions. A Type I error means rejecting the null hypothesis when it’s actually true, while a Type II error means failing to reject the null hypothesis when it’s actually false.

You can aim to minimize the risk of these errors by selecting an optimal significance level and ensuring high power. However, there’s a trade-off between the two types of errors, so a fine balance is necessary.