
How to Calculate Statistical Significance

Written by MasterClass

Last updated: Aug 16, 2021 • 5 min read

Various social sciences and businesses use statistical significance to gauge the potential success rate for new medications, marketing strategies, or products.


What Is Statistical Significance?

Statistical significance is a form of statistical analysis that determines whether the relationship between two variables in a statistical test is likely to be genuine or merely the product of chance. Analysts use a statistical hypothesis test to weigh two possible outcomes: the null hypothesis and the alternative hypothesis. The null hypothesis states that no real difference or relationship exists between the two variables and that any observed difference is due to chance. The alternative hypothesis states that the observed difference reflects a real connection rather than a coincidence.

The probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true, is expressed as a percentage called the probability value, or p-value. The smaller the p-value, the stronger the evidence that the observed difference between the variables is statistically significant. Generally, analysts consider a p-value of five percent or less the cutoff for statistical significance.
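To make the idea concrete, here is a minimal sketch (not from the original article, using hypothetical coin-toss numbers) that computes a two-sided p-value for a simple experiment: how surprising is it to see 60 heads in 100 tosses if the coin is actually fair?

```python
from math import comb

def binomial_p_value(heads: int, tosses: int, p_fair: float = 0.5) -> float:
    """Two-sided p-value: the probability of a result at least as far from
    the expected count (tosses * p_fair) as the observed count, assuming
    the null hypothesis (a fair coin) is true."""
    expected = tosses * p_fair
    observed_gap = abs(heads - expected)
    total = 0.0
    for k in range(tosses + 1):
        if abs(k - expected) >= observed_gap:
            total += comb(tosses, k) * p_fair**k * (1 - p_fair)**(tosses - k)
    return total

# 60 heads in 100 tosses of a supposedly fair coin
p = binomial_p_value(60, 100)
print(f"p-value = {p:.4f}")  # about 0.057, just above the five percent cutoff
```

In this hypothetical case the p-value lands just above five percent, so the result would not be called statistically significant at the usual cutoff.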

The goal of a test for statistical significance is to reject the null hypothesis, because doing so indicates that the results are unlikely to be explained by chance alone. There are many real-world applications for hypothesis testing for statistical significance, and many of the social sciences, including economics, psychology, and sociology, use these tests. Companies often use this type of testing to demonstrate success rates with new products for investors, while pharmaceutical companies use them in clinical trials for new vaccines and medications.

Factors That Determine Statistical Significance

The two main factors in determining statistical significance are sample size and effect size. Sample size refers to the number of observations or participants included in the experiment; larger samples produce more reliable results, while smaller samples are more prone to sampling error.

Effect size measures the magnitude of the difference in results between two sample sets, which indicates practical significance. If the effect size is small, the evidence for either significance or randomness will be weaker than it would be with a larger effect size. The range of values within which the true result is likely to fall is called the confidence interval, reported at a chosen confidence level (often 95 percent), while the spread of individual results around the average is measured by the standard deviation.
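For readers who want to see these quantities computed, the sketch below uses made-up measurements from two hypothetical sample groups to calculate each group's standard deviation, the effect size as Cohen's d, and an approximate 95 percent confidence interval for the difference in means (a normal approximation is assumed for the interval).

```python
from statistics import mean, stdev
from math import sqrt

# Hypothetical measurements from two sample groups
group_a = [12.1, 13.4, 11.8, 12.9, 13.1, 12.5, 12.7, 13.0]
group_b = [13.6, 14.1, 13.2, 14.4, 13.9, 13.7, 14.0, 13.5]

mean_a, mean_b = mean(group_a), mean(group_b)
sd_a, sd_b = stdev(group_a), stdev(group_b)    # sample standard deviations
n_a, n_b = len(group_a), len(group_b)

# Effect size (Cohen's d): difference in means scaled by the pooled standard deviation
pooled_sd = sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2))
cohens_d = (mean_b - mean_a) / pooled_sd

# Approximate 95% confidence interval for the difference in means
# (normal approximation; a t-based interval would be more precise for small samples)
diff = mean_b - mean_a
se_diff = sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
ci_low, ci_high = diff - 1.96 * se_diff, diff + 1.96 * se_diff

print(f"effect size (Cohen's d) = {cohens_d:.2f}")
print(f"95% confidence interval for the difference: ({ci_low:.2f}, {ci_high:.2f})")
```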

3 Examples of Statistical Significance

There are numerous examples of statistical significance, including:

  1. Conversion rates: Statistical significance can help businesses measure their conversion rates, which reflect how effectively decisions translate into customer behavior. For example, a company may want to determine whether changing the font size on a sign attracts more customers. The claim that the original font performs just as well is the null hypothesis, while the claim that the larger font attracts more customers is the alternate hypothesis. From there, you would compute a p-value for the observed difference in customer counts and a confidence interval (the plausible range for how many additional customers the change attracts), as shown in the sketch after this list.
  2. Marketing strategy: Businesses can use statistical significance to determine the success of a new marketing strategy. For example, a company gathers data from a sample group of customers who make purchases under either the previous or the new marketing approach to see whether the new strategy affects sales. The current strategy is the null hypothesis, while the proposed new strategy is the alternate hypothesis. The test measures the difference in how much customers spent under the new strategy versus the previous one, and the p-value indicates how likely a difference that large would be if the new strategy actually had no effect.
  3. Medical/science approach: To test the efficacy of a new drug, a pharmaceutical company would create two test groups: one that received the drug (the alternate hypothesis being that the drug helps) and one that did not (the null hypothesis being that it makes no difference). If, say, four members of the former group and only one member of the latter group reported positive effects, the company would run a significance test to determine whether that difference clears the five percent threshold or could plausibly have occurred by chance.
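To make the conversion-rate example concrete, here is a minimal sketch (with hypothetical visitor counts) of a two-proportion z-test, which compares the rate at which customers respond to the original sign against the rate under the larger font. The normal approximation used here is reasonable for large samples.

```python
from math import sqrt, erf

def two_proportion_p_value(conversions_a, n_a, conversions_b, n_b):
    """Two-sided p-value for the difference between two conversion rates,
    using the normal approximation to the sampling distribution."""
    p_a, p_b = conversions_a / n_a, conversions_b / n_b
    p_pooled = (conversions_a + conversions_b) / (n_a + n_b)  # rate under the null hypothesis
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))        # two-sided tail probability

# Hypothetical data: 120 of 2,000 visitors responded to the original font,
# 165 of 2,000 responded to the larger font.
p = two_proportion_p_value(120, 2000, 165, 2000)
print(f"p-value = {p:.4f}")  # well below 0.05 here, so the difference is unlikely to be chance
```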

How to Calculate Statistical Significance

Here is a step-by-step guide to calculating statistical significance:

  1. Determine your research. Choose a subject that will allow you to test two variables. It can be as basic as the number of coin tosses that end on tails or as specific as conversion rates on call-to-action buttons on the landing pages of a website.
  2. Choose your hypothesis. State both your null hypothesis, which asserts that any connection between your variables is due to chance, and your alternate hypothesis. Then set your significance threshold (the p-value cutoff), which is typically five percent, though you can also choose a lower percentage.
  3. Gather your data. Next, determine how many times you will run the test (sample size) and the length of the testing time. Larger samples will typically yield more accurate results.
  4. Calculate your results. You can use several different statistical tests to determine statistical significance. A chi-squared test, which compares the observed counts for two categorical variables against the counts expected under the null hypothesis, is a common choice because it reduces the data to a single statistic that can be compared against a significance threshold (see the worked sketch after this list).
  5. Observe differences. The differences you observe will give you the parameters for determining whether the test hewed toward the null hypothesis or the alternate hypothesis. The statistical test you use will determine how you arrive at your final answer.
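As an illustration of step four, the sketch below works through a chi-squared test of independence on a hypothetical two-by-two table of outcomes (for example, converted versus not converted under two versions of a page) and compares the statistic against the usual five percent critical value for one degree of freedom.

```python
# Chi-squared test of independence for a 2x2 table, using only the standard library.
# Rows: group A and group B; columns: outcome occurred / outcome did not occur.
observed = [
    [90, 910],   # group A: 90 conversions out of 1,000 (hypothetical numbers)
    [130, 870],  # group B: 130 conversions out of 1,000
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Compare each observed count with the count expected under the null hypothesis
# of no relationship between group and outcome.
chi_squared = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_squared += (obs - expected) ** 2 / expected

# Critical value for 1 degree of freedom at the five percent significance level
CRITICAL_VALUE = 3.841
print(f"chi-squared = {chi_squared:.2f}")
print("reject the null hypothesis" if chi_squared > CRITICAL_VALUE
      else "fail to reject the null hypothesis")
```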

Limitations of Statistical Significance

There are notable limitations to using statistical significance, including:

  • False positives: Though five percent is the widely accepted cutoff for statistical significance, it also comes with a high rate of false-positive errors, known as type I errors, ranging from 25 to 50 percent. A false-positive error happens when a true null hypothesis is accidentally rejected. British statistician Ronald Fisher, who developed the p-value, noted that it is just one method for determining statistical significance, and there is little way to tell whether a particular p-value reflects a false positive. The safest response is to choose as low a significance threshold as possible, around one to two percent.
  • Reproduction rates: False positives are easy to obtain in significance testing, and the chance of at least one occurring increases each time you run an additional test in an attempt to reproduce your results (see the calculation after this list). A 50 percent chance of achieving a false positive across repeated tests would undermine the usefulness of the test and call the original results into question.
  • Sample size: If the p-value is five percent or less in a test that includes only 50 people, the result may not have a practical application in the real world. Simply put, a small sample may not accurately reflect the broader population you are studying. Testing an entire population would produce more accurate results but poses practical challenges in administering tests and gathering data.
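The growth of the false-positive risk across repeated tests can be seen with a short calculation (a minimal sketch under the usual assumption that the tests are independent): if each test uses a five percent threshold, the chance of at least one false positive across n tests is 1 - (1 - 0.05)^n.

```python
# Chance of at least one false positive across several independent tests,
# each run at the same significance threshold (alpha).
alpha = 0.05
for n_tests in (1, 5, 10, 20):
    family_wise_rate = 1 - (1 - alpha) ** n_tests
    print(f"{n_tests:>2} tests -> {family_wise_rate:.0%} chance of at least one false positive")
```

With these assumed numbers, ten independent tests already carry roughly a 40 percent chance of producing at least one false positive, which is why repeated testing without adjustment erodes confidence in the results.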
