Correlation Measures to Know for Data, Inference, and Decisions

Correlation measures quantify the strength and direction of relationships between variables, which guides data analysis and decision-making. The ten measures below range from simple linear association to rank-based, multivariate, and non-linear dependence; which one fits depends on the data type and on the assumptions you can justify.

  1. Pearson correlation coefficient

    • Measures the linear relationship between two continuous variables.
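    • Ranges from -1 to 1, where -1 indicates a perfect negative linear relationship, 1 indicates a perfect positive linear relationship, and 0 indicates no linear relationship (a non-linear relationship may still exist).
    • Summarizes the data well only when the relationship is roughly linear; significance tests and confidence intervals for it conventionally assume bivariate normality, although the coefficient itself can be computed for any pair of numeric variables.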
    • Sensitive to outliers, which can significantly affect the correlation value.
    • Commonly used in various fields, including psychology, finance, and social sciences.
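
As a minimal sketch of the definition above, the coefficient can be computed directly from deviations about the means. The function name `pearson_r` and the small data set are illustrative assumptions, not taken from any particular library; in practice a routine such as `scipy.stats.pearsonr` would normally be used. The second call adds one extreme point to show the outlier sensitivity mentioned in the list.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours studied and quiz scores
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = pearson_r(hours, scores)

# The same data with a single extreme observation appended
r_with_outlier = pearson_r(hours + [6], scores + [-10])
```

With these made-up numbers the first coefficient is positive, while the single added point is enough to make the second one negative.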
  2. Spearman rank correlation

    • Assesses the strength and direction of the monotonic association between two variables by applying Pearson correlation to their ranks.
    • Ranges from -1 to 1, like Pearson, but requires only that the relationship be monotonic, not linear.
    • Suitable for ordinal data or non-normally distributed continuous data.
    • Less sensitive to outliers compared to Pearson correlation.
    • Useful in situations where data does not meet the assumptions of parametric tests.
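
The rank-then-correlate idea can be sketched as follows, assuming ties receive the average of the ranks they span (a common convention). The helper names `average_ranks`, `pearson_r`, and `spearman_rho` are illustrative, and the cubic data is made up to show a relationship that is perfectly monotonic but not linear.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]          # monotonic but non-linear
rho = spearman_rho(x, y)         # ranks agree exactly
r = pearson_r(x, y)              # below 1 because the curve is not a line
```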
  3. Kendall's tau

    • Measures the ordinal association between two variables by considering the ranks of the data.
    • Ranges from -1 to 1, with values indicating the strength and direction of the relationship.
    • Handles tied values explicitly through the tau-b and tau-c variants, which is one reason it is often preferred over Spearman's rank correlation when ties are common.
    • Often used in smaller sample sizes or when data is not normally distributed.
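    • Typically yields values smaller in magnitude than Spearman's for the same data; it can be read as the difference between the proportions of concordant and discordant pairs.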
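
A minimal sketch of the pair-counting idea, assuming the tau-b variant (which adjusts the denominator for ties). The function name `kendall_tau_b` and the data are illustrative; this direct version compares every pair, so it is only practical for small samples.

```python
import math
from itertools import combinations

def kendall_tau_b(x, y):
    """Kendall's tau-b from concordant, discordant, and tied pairs."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue                 # tied on both variables: ignored
        elif dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1          # pair ordered the same way on x and y
        else:
            discordant += 1          # pair ordered in opposite ways
    untied = concordant + discordant
    denom = math.sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denom

# Hypothetical rankings of five items by two judges
judge_a = [1, 2, 3, 4, 5]
judge_b = [3, 1, 2, 5, 4]
tau = kendall_tau_b(judge_a, judge_b)
```

Here 7 of the 10 pairs are concordant and 3 are discordant, so tau is (7 - 3) / 10 = 0.4.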
  4. Point-biserial correlation

    • A special case of Pearson correlation used when one variable is continuous and the other is binary.
    • Ranges from -1 to 1, indicating the strength and direction of the relationship.
    • Assumes that the continuous variable is normally distributed within each group defined by the binary variable.
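    • Useful for comparing the means of the two groups defined by the binary variable; testing whether the coefficient is zero is equivalent to a pooled-variance two-sample t-test.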
    • Commonly applied in social sciences and medical research.
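
To illustrate the "special case of Pearson" point, the sketch below computes the coefficient from the two group means and checks it against Pearson correlation on the 0/1 coding. The names `point_biserial` and `pearson_r` and the data are assumptions made for the example.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def point_biserial(binary, values):
    """Point-biserial r from group means; `binary` must hold only 0s and 1s."""
    group1 = [v for b, v in zip(binary, values) if b == 1]
    group0 = [v for b, v in zip(binary, values) if b == 0]
    n1, n0, n = len(group1), len(group0), len(values)
    mean1 = sum(group1) / n1
    mean0 = sum(group0) / n0
    mean_all = sum(values) / n
    # Population standard deviation (divide by n) of the continuous variable
    sd_pop = math.sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    return (mean1 - mean0) / sd_pop * math.sqrt(n1 * n0 / n ** 2)

# Hypothetical data: 0 = control, 1 = treatment, with an outcome score
group = [0, 0, 0, 1, 1, 1]
score = [2, 3, 4, 6, 7, 8]
r_pb = point_biserial(group, score)
r_pearson = pearson_r(group, score)   # same value, up to rounding
```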
  5. Phi coefficient

    • Measures the association between two binary variables.
    • Ranges from -1 to 1, where values indicate the strength and direction of the relationship.
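    • Calculated from the cell counts of a 2x2 contingency table; equivalent to Pearson correlation applied to the two variables coded as 0 and 1.
    • Its square equals the chi-square statistic divided by the sample size, which ties it to the chi-square test of independence.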
    • Often used in categorical data analysis and epidemiological studies.
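
A short sketch of the 2x2 calculation, assuming the table cells are labeled a, b (first row) and c, d (second row). The function name `phi_coefficient` and the counts are made up for illustration.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for the 2x2 table [[a, b], [c, d]] of counts."""
    numerator = a * d - b * c
    # Product of the two row totals and the two column totals
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

# Hypothetical counts: rows = exposed / not exposed, columns = case / non-case
phi = phi_coefficient(30, 10, 10, 30)
```

For these counts the numerator is 900 - 100 = 800 and the denominator is 1600, giving phi = 0.5.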
  6. Intraclass correlation

    • Assesses the reliability or agreement between multiple raters or measurements.
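    • Typically falls between 0 and 1, with higher values indicating greater agreement among raters; sample estimates can be negative when ratings of the same subject vary more than ratings across subjects.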
    • Useful in studies involving repeated measures or ratings by different observers.
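    • Designed for continuous (or approximately interval-scale) measurements; for categorical ratings, agreement statistics such as Cohen's kappa are used instead.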
    • Commonly used in psychology, education, and medical research to evaluate measurement consistency.
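
There are several forms of the intraclass correlation, and the right one depends on the study design. As a hedged illustration, the sketch below assumes the simplest form, the one-way random-effects ICC for a single rating, computed from between-subject and within-subject mean squares. The function name `icc_oneway` and the ratings are illustrative.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC for single ratings.

    `ratings` holds one row per subject, each with the same number of ratings.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # ratings per subject
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    subject_means = [sum(row) / k for row in ratings]
    # Mean square between subjects
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Mean square within subjects
    ms_within = sum(
        (v - m) ** 2 for row, m in zip(ratings, subject_means) for v in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical scores from two raters on three subjects
consistent = [[1, 2], [3, 4], [5, 6]]
inconsistent = [[1, 3], [3, 1], [2, 2]]
icc_high = icc_oneway(consistent)      # raters differ little relative to subjects
icc_low = icc_oneway(inconsistent)     # negative: subjects are indistinguishable
```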
  7. Partial correlation

    • Measures the relationship between two variables while controlling for the effect of one or more additional variables.
    • Helps to identify the direct association between the two variables of interest.
    • Ranges from -1 to 1, similar to Pearson correlation.
    • Useful in multivariate analysis to understand the unique contribution of each variable.
    • Commonly applied in regression analysis and causal inference studies.
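
For a single control variable, the partial correlation can be written in terms of the three pairwise Pearson correlations. The sketch below assumes that first-order case; the names `partial_r` and `pearson_r` are illustrative, and the data is constructed so that x and y are related only through z.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Constructed data: x and y each equal z plus a disturbance that is
# uncorrelated with z and with the other disturbance
z = [1, 2, 3, 4, 5, 6, 7, 8]
x = [2, 1, 2, 5, 6, 5, 6, 9]
y = [2, 3, 2, 3, 4, 5, 8, 9]

r_xy = pearson_r(x, y)
r_xy_given_z = partial_r(r_xy, pearson_r(x, z), pearson_r(y, z))
```

The raw correlation between x and y is 0.84, but it vanishes once z is controlled for, because z accounts for all of their shared variation in this constructed example.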
  8. Multiple correlation

    • Assesses the relationship between one dependent variable and two or more independent variables.
    • Ranges from 0 to 1; its square, R², is the proportion of variance in the dependent variable explained by the independent variables together.
    • Useful in regression analysis to evaluate how well the independent variables predict the dependent variable.
    • Provides insight into the combined effect of multiple predictors on an outcome.
    • Commonly used in social sciences, economics, and health research.
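
With exactly two predictors, R can be computed from the three pairwise correlations; with more predictors it is usually obtained as the correlation between the observed and fitted values of a least-squares regression. The sketch below assumes the two-predictor case only. The function names and the data are illustrative.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def multiple_r_two_predictors(y, x1, x2):
    """Multiple correlation of y with x1 and x2 (two-predictor formula)."""
    r_y1 = pearson_r(y, x1)
    r_y2 = pearson_r(y, x2)
    r_12 = pearson_r(x1, x2)   # must not be exactly +1 or -1
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)

# Constructed data: two uncorrelated predictors and an outcome
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [1, -1, -1, 1, 1, -1, -1, 1]
y = [3, 2, 1, 4, 5, 4, 7, 10]

R = multiple_r_two_predictors(y, x1, x2)
R_squared = R ** 2   # share of variance in y explained by x1 and x2 together
```

Because the two predictors are uncorrelated here, R² is simply the sum of the two squared pairwise correlations with y.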
  9. Canonical correlation

    • Examines the relationship between two sets of variables, identifying the linear combinations that maximize correlation.
    • Useful for understanding the interrelationships between multiple dependent and independent variables.
    • Each canonical correlation ranges from 0 to 1; they are reported in decreasing order, one for each pair of linear combinations.
    • Helps to explore complex data structures and multivariate relationships.
    • Commonly applied in psychology, ecology, and marketing research.
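
One way to compute canonical correlations, sketched here with NumPy, is to build an orthonormal basis for each centered variable set with a QR decomposition and take the singular values of the product of the two bases. This is one standard numerical route, not the only one; the function name `canonical_correlations` is illustrative, the data is synthetic, and the sketch assumes each set has full column rank and fewer columns than rows.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between the columns of X (n x p) and Y (n x q)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Center each column
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the two column spaces
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx^T Qy, returned in decreasing order
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)   # guard against rounding just outside [0, 1]

rng = np.random.default_rng(0)    # fixed seed; synthetic data for illustration
X = rng.normal(size=(50, 2))
Y_linear = X @ np.array([[1.0, 2.0], [3.0, 4.0]])   # exact linear function of X
Y_noise = rng.normal(size=(50, 2))                   # generated independently of X

cc_linear = canonical_correlations(X, Y_linear)      # both values close to 1
cc_noise = canonical_correlations(X, Y_noise)
```

When each set contains a single variable, the one canonical correlation equals the absolute value of the Pearson correlation.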
  10. Distance correlation

    • Measures the association between two random variables, capturing both linear and non-linear relationships.
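    • Ranges from 0 to 1; in the population it equals 0 if and only if the variables are independent (assuming finite first moments), whereas a Pearson correlation of 0 does not rule out dependence.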
    • Does not assume any specific distribution of the data, making it versatile for various data types.
    • Useful in high-dimensional data analysis and machine learning applications.
    • Detects dependence that Pearson, Spearman, and Kendall can miss, such as a symmetric U-shaped relationship, at the cost of more computation (the direct calculation compares every pair of observations).
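
A minimal sketch of the sample distance correlation for two one-dimensional variables, assuming the usual construction from double-centered pairwise distance matrices. The function names are illustrative, and the quadratic example is made up to show a case where Pearson correlation is exactly 0 but the variables are clearly dependent.

```python
import math

def _centered_distances(v):
    """Pairwise |vi - vj| matrix with row, column, and grand means removed."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # The matrix is symmetric, so column means equal row means
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]

def distance_correlation(x, y):
    """Sample distance correlation of two equal-length numeric sequences."""
    n = len(x)
    A = _centered_distances(x)
    B = _centered_distances(y)
    dcov2 = sum(A[i][j] * B[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(A[i][j] ** 2 for i in range(n) for j in range(n)) / n ** 2
    dvar_y = sum(B[i][j] ** 2 for i in range(n) for j in range(n)) / n ** 2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0:
        return 0.0   # at least one variable is constant
    return math.sqrt(max(dcov2, 0.0) / denom)

x = [-2, -1, 0, 1, 2]
y = [v ** 2 for v in x]             # y depends on x, but not linearly
dcor = distance_correlation(x, y)   # clearly above 0
```

For this data the Pearson correlation is 0 because the positive and negative halves of the U shape cancel, while the distance correlation is well above 0.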


© 2025 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.