
Correlation vs. Causation: Causal and Noncausal Relationships

Written by MasterClass

Last updated: Jun 17, 2022 • 4 min read

Charting out specific cause-and-effect relationships can prove elusive at times. Occasionally, what looks like a cause might merely be a circumstantial relationship (or correlation). Learn more about correlation vs. causation both in real-life circumstances and in scientific research design.


Correlation vs. Causation

Correlation refers to a statistical association between two or more variables, which may or may not be causal, whereas causation means a change in one variable is directly responsible for a change in another.

As an example, consider a person who smokes cigarettes every day and lives to be a hundred years old. Someone could claim this correlation proves cigarettes do not cause lung cancer after all and were in fact responsible for this person’s longevity. In reality, the causal claim is false: the fact that a smoker lived a long time does not show the cigarettes had anything to do with their longevity. For that matter, it took decades of studies and analysis to prove smoking had a negative causative impact on people’s health in the first place.

In mathematical and data analysis, there are precise statistical measures of how strongly two variables move together (including the correlation coefficient and covariance, for example). These quantify correlation, but on their own they do not establish causation. Telling the difference in a real-world scenario can prove even more difficult due to the lack of similarly concrete measurements.
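To make the two measures mentioned above concrete, here is a minimal sketch in Python. The data (hours of study and exam scores) and the function names are invented for illustration. Note that a high coefficient only describes how closely the two lists move together; it says nothing about which one, if either, causes the other.

```python
from math import sqrt

def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def pearson_r(xs, ys):
    """Pearson correlation coefficient: covariance rescaled to lie in [-1, 1]."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical data, made up for this example.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]

cov = covariance(hours, scores)
r = pearson_r(hours, scores)
print(f"covariance = {cov:.2f}, correlation coefficient = {r:.3f}")
```

Covariance depends on the units of the data, while the correlation coefficient is unitless, which is why the coefficient is usually the easier number to compare across data sets.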

What Are the Dangers of Conflating Correlation and Causation?

When you mistake a correlation for a causal relationship, you miss what is really going on. This might cause only minor annoyance in your everyday life, but it can have a much more dire impact in other scenarios.

For instance, suppose a doctor looked at a data set about your health and took various correlations to be the true cause of an illness, all while dismissing the real problem. Precision is key to correctly and logically evaluating the direct cause of any circumstance.

3 Examples of Correlation vs. Causation Fallacies

It’s easier to learn about correlation vs. causation through direct examples. Think about how you would assess the variable causes in these three scenarios:

  1. Potential reversals: One common logical fallacy is to reason incorrectly about the direction of a causal relationship. Imagine someone looks at solar panels on various houses in their neighborhood and then proclaims the panels are what project the sun’s energy into the sky, rather than vice versa. Of course, their assessment goes in the opposite direction of the true causality here.
  2. Unknown factors: Causality is elusive and difficult to pin down in the first place because so many potential causes exist for any given effect. Suppose someone attributes their positive state of mental health exclusively to how often they go for a run. In reality, there could be a third variable (also known as a confounding variable) more responsible for their happiness than this. For that matter, there might be multiple correlations and causes worth establishing in this scenario.
  3. Total coincidences: Some correlations are completely coincidental and reflect no causal effect whatsoever. Suppose ice cream sales markedly go up every time you notice an increase in car accidents around your community. While there might be some abstract connection between the two variables, at least hypothetically, it’s far more likely ice cream sales have no impact on the driving patterns of your neighbors. Spurious correlations like these offer no evidence of causation.
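The “unknown factors” scenario above can be illustrated with a small simulation. This is only a sketch under invented assumptions: a hypothetical third variable (labeled sleep quality here) drives both how much someone runs and how good their mood is, while running has no direct effect on mood at all. The variable names and numbers are made up. The partial correlation formula used to hold the third variable constant is the standard one for three variables.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def partial_r(xs, ys, zs):
    """Correlation between xs and ys after accounting for zs."""
    r_xy = pearson_r(xs, ys)
    r_xz = pearson_r(xs, zs)
    r_yz = pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical confounder: it influences both of the other variables.
sleep_quality = [rng.gauss(0, 1) for _ in range(5000)]
# Neither variable below depends on the other, only on sleep_quality plus noise.
running = [s + rng.gauss(0, 0.5) for s in sleep_quality]
mood = [s + rng.gauss(0, 0.5) for s in sleep_quality]

raw = pearson_r(running, mood)
controlled = partial_r(running, mood, sleep_quality)
print(f"running vs. mood: {raw:.2f}")
print(f"running vs. mood, holding sleep quality fixed: {controlled:.2f}")
```

In this simulated world the raw correlation is strong, yet it largely vanishes once the third variable is accounted for, which is the signature of a confounder rather than a direct cause.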

How to Identify Correlation vs. Causation

Identifying the difference between correlation and causation is essential in both professional scientific research and in everyday critical thinking. Keep these tips in mind as you try to parse out the difference between these two concepts:

  • Be precise. Whether in a data science experiment or a real-world logical problem, try to be as precise as possible when you formulate all the variables involved in a scenario. Sometimes all it takes is to home in on your exact verbiage to discern whether something serves as a strong correlation or the true cause of an event.
  • Chart your question out logically. Try to write out your problem in a straight line as a logical syllogism. For instance, if A then B, if B then C, therefore, if A then C. This allows you to chart a linear relationship between a cause and an effect. See if you can truly say each variable causes the appearance of the next one. If it does, you’ve identified the cause; if it doesn’t, you’ve likely placed a mere correlation in the syllogism as one of your variables.
  • Consider all alternatives. Try to account for as many possible variables as you can. A strong correlation between one variable and another does not necessarily indicate that one is the cause of the other. Multiple different factors can influence most scenarios, so do your best to consider them all.
  • Give correlation its due. Even though correlation doesn’t mean causation, identifying strong correlations can still help you identify causation. For example, there’s a high incidence of depression among alcoholics, but it’s difficult to tell whether depression leads to alcoholism or vice versa. Still, you can then reason from the effects alcohol has on the nervous system and brain chemistry to assess that it might cause both a temporary sense of relief from depression symptoms and a long-term worsening of them on the whole, leading you closer to the truth about its potential causative effect.
  • Test your hypothesis. A controlled experiment can help you figure out the difference between correlation and causation in a specific scenario. This sort of hypothesis testing relies on comparing a control group, in which the independent variable stays the same, with an experimental group, in which the independent variable changes, and then measuring the dependent variable in both. Experimentation is one of the most reliable ways to identify actual causes.
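The value of a controlled experiment can be sketched with a simulation. Everything below is an assumption made for illustration: the size of the true effect, the hidden “baseline health” of each participant, and the rule by which people choose a treatment on their own. The sketch compares a purely observational comparison with one where a coin flip assigns people to the experimental or control group.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable
TRUE_EFFECT = 2.0        # assumed causal effect of the treatment in this simulation

def mean(values):
    return sum(values) / len(values)

def outcome(baseline, treated):
    """Dependent variable: hidden baseline, plus the treatment effect, plus noise."""
    return baseline + (TRUE_EFFECT if treated else 0.0) + rng.gauss(0, 1)

# Hidden baseline health for each hypothetical participant.
baselines = [rng.gauss(10, 3) for _ in range(4000)]

# Observational data: healthier people choose the treatment themselves,
# so baseline health is tangled up with the treatment.
obs_treated, obs_control = [], []
for b in baselines:
    chose_treatment = b > 10
    (obs_treated if chose_treatment else obs_control).append(outcome(b, chose_treatment))

# Controlled experiment: a coin flip decides who gets the treatment,
# so baseline health is balanced across the two groups on average.
exp_treated, exp_control = [], []
for b in baselines:
    assigned = rng.random() < 0.5
    (exp_treated if assigned else exp_control).append(outcome(b, assigned))

observational_gap = mean(obs_treated) - mean(obs_control)
experimental_gap = mean(exp_treated) - mean(exp_control)
print(f"observational gap: {observational_gap:.2f}")
print(f"experimental gap:  {experimental_gap:.2f} (true effect is {TRUE_EFFECT})")
```

In this setup the observational gap overstates the effect because it mixes in the difference in baseline health, while the randomized comparison lands close to the effect that was built into the simulation.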
