


About

Correlation does not imply causation, but it suggests where to look. The Pearson correlation coefficient (r) quantifies the strength and direction of the linear relationship between two continuous variables, on a scale from -1 to +1. Blind reliance on r is dangerous, however: Anscombe's Quartet consists of four datasets with nearly identical coefficients (r ≈ 0.816) that look completely different when plotted.

This tool combines the numerical coefficient calculation with an automatic scatter plot. The plot lets analysts spot non-linear relationships (such as parabolic curves) or heteroscedasticity that the coefficient alone would miss. The tool also compares r against a table of critical values to determine whether the result is statistically significant or plausibly just random noise.


Formulas

Pearson's r is calculated using the covariance divided by the product of the standard deviations:

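$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$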
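
The formula can be sketched directly in code. The function below is an illustrative implementation using only the standard library, not the tool's own source; the name pearson_r and the sample data are assumptions made for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r: sum of co-deviations over the root of the product of squared deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]  # deviations of x from its mean
    dy = [y - mean_y for y in ys]  # deviations of y from its mean
    sxy = sum(a * b for a, b in zip(dx, dy))  # numerator
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


print(round(pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]), 4))  # 0.7746
```

For this sample, the numerator is 6 and the two sums of squares are 10 and 6, so r = 6 / sqrt(60) ≈ 0.7746.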

Reference Data

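| df (N-2) | Critical Value (α=0.05) | Critical Value (α=0.01) |
|----------|-------------------------|-------------------------|
| 1        | 0.997                   | 0.999                   |
| 5        | 0.754                   | 0.874                   |
| 10       | 0.576                   | 0.708                   |
| 20       | 0.423                   | 0.537                   |
| 50       | 0.273                   | 0.354                   |
| 100      | 0.195                   | 0.254                   |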
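
One way to apply the table is sketched below. The values are copied from the reference data above; the function names and the rule for df values that fall between rows (use the nearest smaller tabulated df, which gives a stricter threshold) are assumptions of this example, not a description of how the tool itself interpolates.

```python
# Critical values from the reference table, keyed by df = N - 2.
# Each entry is (alpha = 0.05, alpha = 0.01).
CRITICAL_VALUES = {
    1: (0.997, 0.999),
    5: (0.754, 0.874),
    10: (0.576, 0.708),
    20: (0.423, 0.537),
    50: (0.273, 0.354),
    100: (0.195, 0.254),
}


def critical_value(n_pairs, alpha=0.05):
    """Look up the critical |r| for a sample of n_pairs observations."""
    if alpha not in (0.05, 0.01):
        raise ValueError("only alpha = 0.05 and alpha = 0.01 are tabulated")
    df = n_pairs - 2
    usable_rows = [row for row in CRITICAL_VALUES if row <= df]
    if not usable_rows:
        raise ValueError("need at least 3 data pairs (df >= 1)")
    # Conservative choice: a smaller df has a larger critical value.
    row = max(usable_rows)
    return CRITICAL_VALUES[row][0 if alpha == 0.05 else 1]


def is_significant(r, n_pairs, alpha=0.05):
    """True when |r| exceeds the tabulated critical value."""
    return abs(r) > critical_value(n_pairs, alpha)


print(is_significant(0.60, 12))              # True: 0.60 > 0.576 at df = 10
print(is_significant(0.60, 12, alpha=0.01))  # False: 0.60 < 0.708 at df = 10
```

Taking the absolute value of r means a strong negative correlation is tested the same way as a strong positive one.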

Frequently Asked Questions

r (Pearson's coefficient) indicates the direction and strength of a linear relationship (-1 to +1). R-squared (the coefficient of determination) is r squared (0 to 1) and, in a simple linear regression, represents the proportion of variance in the dependent variable explained by the independent variable.
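
The link between r and R-squared can be checked numerically. The sketch below fits a least-squares line by hand and compares r squared with the explained share of variance, 1 - SS_res / SS_tot; the function name and sample data are illustrative assumptions.

```python
def r_and_explained_variance(xs, ys):
    """Return (r, share of variance in y explained by a least-squares line on x)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)  # total sum of squares
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    r = sxy / (sxx * syy) ** 0.5

    # Ordinary least-squares line y = intercept + slope * x.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

    return r, 1 - ss_res / syy


r, explained = r_and_explained_variance([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"r = {r:.3f}, r^2 = {r * r:.3f}, explained = {explained:.3f}")
# r = 0.775, r^2 = 0.600, explained = 0.600
```

Note that r keeps its sign while R-squared does not, so r = -0.775 would give the same 0.600.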
Although r can be calculated from as few as 3 points, statistical significance usually requires more: with only 3 points (df = 1), |r| must exceed 0.997 to be significant at the 0.05 level. For a correlation of 0.5 to be significant at the 0.05 level (two-tailed), you need roughly 16 data pairs.
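
The sample-size figure can be checked with the usual t-test for a correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2). In the sketch below, the function name is an assumption, and the two critical t values are taken from a standard two-tailed t table at the 0.05 level rather than computed.

```python
from math import sqrt


def t_statistic(r, n_pairs):
    """t statistic for testing whether a correlation differs from zero (df = n - 2)."""
    df = n_pairs - 2
    return r * sqrt(df) / sqrt(1 - r * r)


# Two-tailed critical t at alpha = 0.05, keyed by df (standard t-table values).
T_CRITICAL_05 = {13: 2.160, 14: 2.145}

for n in (15, 16):
    t = t_statistic(0.5, n)
    print(n, round(t, 3), t > T_CRITICAL_05[n - 2])
# 15 2.082 False
# 16 2.16 True
```

With 15 pairs, r = 0.5 falls just short of significance; with 16 it just clears the threshold.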
Pearson's r should not be used for non-linear data; it measures linear association only. If your data follows a curve (like compound interest), Pearson will understate the strength of the relationship. Use Spearman's rank correlation for monotonic non-linear data.
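
A small sketch illustrates the difference on compound growth. Spearman's rho is computed here as Pearson's r on the ranks, with a simplified ranking helper that assumes no tied values; the 25% growth rate, the 20-year horizon, and all names are made up for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Rank from 1 (smallest) to n (largest). Assumes no ties."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(ranks(xs), ranks(ys))


# Hypothetical balance growing 25% per year for 20 years.
years = list(range(1, 21))
balance = [1000 * 1.25 ** t for t in years]

pearson = pearson_r(years, balance)
spearman = spearman_rho(years, balance)
print(f"Pearson r = {pearson:.3f}, Spearman rho = {spearman:.3f}")
```

Because the balance rises every year, the ranks line up perfectly and Spearman's rho is 1, while Pearson's r is noticeably lower because the points bend away from a straight line.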