Factor Analysis Explained: What Is Factor Analysis?
Written by MasterClass
Last updated: Mar 18, 2022 • 3 min read
When statisticians want to study how unobserved variables affect the outcomes in a data set, they can construct a factor analysis model. In doing so, they attempt to isolate the underlying factors linked to specific results.
What Is Factor Analysis?
Factor analysis is a statistical method for studying unobserved variables, also known as latent variables or latent factors, that may explain patterns among observed variables. Statisticians examine whether a small number of these unobserved variables could be common factors behind the correlations among observed variables in a data set. In layman’s terms, statisticians want to see whether a particular factor is producing common outcomes throughout a population.
Factor analysis plays a key role in multivariate statistics and the social sciences. It touches fields such as business marketing, product management, psychometrics, machine learning, and finance.
2 Types of Factor Analysis
There are two principal types of factor analyses that contribute to the broader realm of statistical analysis and data analysis.
- 1. Exploratory factor analysis: In an exploratory factor analysis (EFA), a researcher approaches a data set with no preconceived notions about its factor structure. By identifying latent factors and examining how much of the variance among observed variables each one accounts for, the researcher hopes to isolate the factors that shape the observed data. Note that EFA models each observed variable as a linear combination of latent factors plus a unique error term. This makes it related to, but distinct from, principal component analysis (PCA), which builds components as linear combinations of the observed variables and has no separate error term.
- 2. Confirmatory factor analysis: Confirmatory factor analysis (CFA) tests a hypothesized factor structure against observed data, typically within the framework of structural equation modeling. The researcher specifies in advance which observed variables load on which factors, checks how well that model fits, and can then revise it to better reflect real-world data. Unlike an ordinary least-squares regression, which treats its predictors as measured without error, a CFA explicitly models measurement error in each observed variable.
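The linear model behind both types can be written compactly. The following is a sketch in common textbook notation; the symbols are introduced here for illustration and do not come from the article. It assumes p centered observed variables x_i, m latent factors f_j that are uncorrelated with unit variance, loadings λ_ij, and unique errors ε_i that are uncorrelated with the factors and with each other.

```latex
x_i = \lambda_{i1} f_1 + \lambda_{i2} f_2 + \dots + \lambda_{im} f_m + \varepsilon_i,
\qquad i = 1, \dots, p

\Sigma = \Lambda \Lambda^{\top} + \Psi
```

Here Σ is the covariance matrix of the observed variables implied by the model, Λ is the p-by-m matrix of loadings, and Ψ is the diagonal matrix of error variances. Roughly speaking, an EFA estimates Λ without restrictions on which variables load on which factors, while a CFA fixes some entries of Λ in advance (often to zero) and tests whether the resulting Σ fits the data.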
4 Factor Analysis Terms
Explore the most common terms related to factor analysis.
- 1. Variance: Variance measures how widely values spread around their mean, or average. Formally, it is the average squared deviation from the mean. In factor analysis, statisticians ask how much of the variance in each observed variable is shared with other variables, since shared variance may point to a common underlying factor.
- 2. Eigenvalue: In factor analysis, each factor has an eigenvalue that measures how much of the total variance in the observed variables that factor accounts for. The key number to pay attention to is 1. When the variables are standardized, each observed variable contributes a variance of 1, so a factor with an eigenvalue greater than 1 accounts for more variance than any single observed variable. A common rule of thumb, the Kaiser criterion, retains such factors as candidate latent variables.
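- 3. Factor score and factor loading: These two terms are related but not interchangeable. A factor loading measures how strongly a particular observed variable relates to a given factor; with standardized variables and uncorrelated factors, it can be read as the correlation between the two. A high loading suggests a notably strong connection between the factor and that variable. A factor score is the estimated value of a factor for an individual observation, such as one survey respondent.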
- 4. Correlation coefficient: A correlation coefficient is a number between -1 and 1 that measures the strength and direction of the linear relationship between two variables. Factor analysis starts from the correlations among all pairs of observed variables. If statisticians suspect a strong correlation, they may expand their sample size to estimate it more precisely. Maximum likelihood, a term that often appears alongside factor analysis, is a method for estimating a model’s parameters rather than a measure of correlation.
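The four terms above can be illustrated with a short Python sketch that uses only the standard library. Everything in it is an assumption made for illustration: the data are simulated from one hypothetical latent factor with made-up loadings (0.9, 0.8, 0.7), and the loadings are recovered with a simplified principal-component-style calculation (square root of the largest eigenvalue times its eigenvector), not with a full EFA estimation routine.

```python
import math
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def largest_eigenpair(matrix, iterations=200):
    """Approximate the largest eigenvalue and its unit eigenvector by power iteration."""
    size = len(matrix)
    vec = [1.0 / math.sqrt(size)] * size
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(size)) for i in range(size)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    # The Rayleigh quotient of a unit vector gives the eigenvalue estimate.
    product = [sum(matrix[i][j] * vec[j] for j in range(size)) for i in range(size)]
    value = sum(v * p for v, p in zip(vec, product))
    return value, vec


# Simulated data (hypothetical): one latent factor drives three observed variables.
rng = random.Random(42)
n_samples = 2000
true_loadings = [0.9, 0.8, 0.7]  # made-up values for illustration
factor = [rng.gauss(0, 1) for _ in range(n_samples)]
observed = [
    [load * f + math.sqrt(1 - load ** 2) * rng.gauss(0, 1) for f in factor]
    for load in true_loadings
]

# Variance: each simulated variable is built to have a variance close to 1.
variances = [statistics.pvariance(column) for column in observed]

# Correlation coefficients: one value for every pair of observed variables.
corr = [[pearson(a, b) for b in observed] for a in observed]

# Eigenvalue: variance accounted for by the strongest factor (compare against 1).
eigenvalue, vector = largest_eigenpair(corr)

# Loadings (principal-component style): how strongly each variable relates to the factor.
loadings = [math.sqrt(eigenvalue) * v for v in vector]

print("variances:", [round(v, 2) for v in variances])
print("largest eigenvalue:", round(eigenvalue, 2))
print("share of total variance:", round(eigenvalue / len(corr), 2))
print("approximate loadings:", [round(l, 2) for l in loadings])
```

Because all three variables share one factor, the largest eigenvalue should come out well above 1, which is the pattern the Kaiser criterion looks for. Principal-component loadings tend to run somewhat higher than the loadings a true factor model would estimate, so the printed values are only a rough guide to the simulated ones.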
3 Factor Analysis Examples
You do not need to be a statistician to see how isolating and studying a number of factors can provide a greater understanding of real-world phenomena.
- 1. Microbial gut studies: Scientists have taken an increased interest in the way a person’s microbiome—a colony of microorganisms living in their intestines—might influence health outcomes, body mass, and even emotions. Research biologists have conducted factor analyses to attempt to link the presence of certain microbes to certain health conditions.
- 2. Recommendation algorithms: Machine learning has given rise to an industry of recommendation algorithms that suggest news items to read, movies to stream, and products to purchase. Sometimes the science behind these algorithms involves isolating specific factors that might predispose a person to like a certain product or entertainment offering.
- 3. Surveys: Pollsters and businesses may conduct opt-in surveys and summarize respondents’ answers in a correlation matrix. Survey architects then analyze that matrix for underlying factors that may help predict additional preferences and behaviors.
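To make the recommendation example above more concrete, here is a minimal Python sketch of matrix factorization, a latent-factor technique that is related to classical factor analysis but not the same thing. It is not a description of how any particular service works. The ratings table, the function names `factorize` and `predict`, and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def predict(users, items, u, i):
    """Predicted rating: dot product of a user's and an item's latent factors."""
    return sum(a * b for a, b in zip(users[u], items[i]))


def factorize(ratings, n_factors=2, epochs=2000, lr=0.01, reg=0.02, seed=0):
    """Learn latent factors for users and items by stochastic gradient descent."""
    rng = random.Random(seed)
    users = [[rng.uniform(0.1, 1.0) for _ in range(n_factors)] for _ in ratings]
    items = [[rng.uniform(0.1, 1.0) for _ in range(n_factors)] for _ in ratings[0]]
    known = [
        (u, i, r)
        for u, row in enumerate(ratings)
        for i, r in enumerate(row)
        if r is not None
    ]
    for _ in range(epochs):
        for u, i, r in known:
            err = r - predict(users, items, u, i)
            for k in range(n_factors):
                pu, qi = users[u][k], items[i][k]
                # Move both factors to shrink the error, with light regularization.
                users[u][k] += lr * (err * qi - reg * pu)
                items[i][k] += lr * (err * pu - reg * qi)
    return users, items


# Hypothetical ratings on a 1-5 scale; None marks an item the user has not rated.
ratings = [
    [5, 4, 1, None],
    [4, None, 1, 1],
    [1, 1, None, 5],
    [None, 1, 5, 4],
]

users, items = factorize(ratings)

# Average gap between known ratings and the model's reconstruction of them.
known_cells = [
    (u, i, r)
    for u, row in enumerate(ratings)
    for i, r in enumerate(row)
    if r is not None
]
mean_abs_error = sum(
    abs(r - predict(users, items, u, i)) for u, i, r in known_cells
) / len(known_cells)

print("mean absolute error on known ratings:", round(mean_abs_error, 3))
print("predicted rating for user 0, item 3:", round(predict(users, items, 0, 3), 2))
```

The two learned numbers per user and per item play the role of latent factors: they are never observed directly, yet together they reproduce the known ratings and yield a guess for the missing ones.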