Coefficient of Determination Calculator (R-squared)
Calculate R-squared (R²), adjusted R², regression coefficients, Pearson correlation, SST, SSE, SSR, and F-statistic from your data points.
About
The coefficient of determination R² quantifies the proportion of variance in a dependent variable y that is predictable from an independent variable x. A value of 0.85 means that 85% of the observed variation in y is explained by the linear model. Misinterpreting R² leads to overfitting, false confidence in weak models, or rejection of adequate ones. This calculator performs ordinary least squares (OLS) regression, computes R² as 1 − SSE/SST, and reports adjusted R², which penalizes model complexity. It assumes a linear relationship and homoscedastic residuals.
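The computation described above can be sketched in a few lines of Python. The observed and fitted values below are hypothetical placeholders; only the R² and adjusted-R² arithmetic is the point:

```python
# Sketch of the R^2 computation this calculator performs, assuming the
# regression predictions have already been made (values are illustrative).
y_obs  = [2.1, 4.3, 5.9, 8.2, 10.1]   # observed y values (hypothetical)
y_pred = [2.0, 4.0, 6.0, 8.0, 10.0]   # fitted values from an OLS line

n, k = len(y_obs), 1                   # k = 1 predictor in simple regression
y_mean = sum(y_obs) / n

sst = sum((y - y_mean) ** 2 for y in y_obs)             # total variation
sse = sum((y - f) ** 2 for y, f in zip(y_obs, y_pred))  # residual variation
r2 = 1 - sse / sst
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)           # penalizes extra predictors
```

Note that adjusted R² is always at most R², and the gap widens as predictors are added without improving fit.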
Note: R² alone does not confirm causation and can be misleading with nonlinear data. Always inspect residual patterns. For datasets with fewer than 5 observations, adjusted R² becomes unreliable due to small-sample bias. The F-statistic reported here tests the null hypothesis that the slope equals zero. A high R² with a non-significant F-statistic signals insufficient data.
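The nonlinearity caveat is easy to demonstrate: fitting a straight line to a perfectly quadratic relationship still produces a high R², while the residuals show an obvious U-shaped pattern. The data below are synthetic, constructed solely for the illustration:

```python
# A clearly nonlinear relationship (y = x^2) still yields R^2 near 0.95
# under a straight-line fit -- but the residuals reveal the curvature.
x = list(range(1, 11))
y = [xi ** 2 for xi in x]          # perfectly quadratic, not linear

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n
b1 = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / \
     sum((xi - x_mean) ** 2 for xi in x)
b0 = y_mean - b1 * x_mean

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sse = sum(r ** 2 for r in residuals)
sst = sum((yi - y_mean) ** 2 for yi in y)
r2 = 1 - sse / sst                 # high despite the wrong model form
# residuals are positive at both ends and negative in the middle:
# a systematic pattern that R^2 alone does not reveal
```

This is why the residual inspection advised above matters: the same R² value can describe a well-specified model or a badly misspecified one.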
Formulas
The coefficient of determination is defined as the complement of the ratio of residual variance to total variance:

$$R^2 = 1 - \frac{SSE}{SST}$$

Where the component sums of squares are computed as:

$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2 \qquad SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \qquad SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$$

The OLS regression line ŷ = b₀ + b₁x has coefficients:

$$b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \qquad b_0 = \bar{y} - b_1 \bar{x}$$

Adjusted R-squared corrects for the number of predictors k:

$$R^2_{adj} = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$$

The Pearson correlation coefficient r satisfies R² = r² for simple linear regression. The F-statistic is:

$$F = \frac{SSR / k}{SSE / (n - k - 1)}$$

Where n = number of observations, k = number of predictors (1 for simple linear regression), yᵢ = observed value, ŷᵢ = predicted value, ȳ = mean of observed values, b₀ = y-intercept, b₁ = slope.
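The formulas above can be implemented end to end in plain Python. The five (x, y) pairs below are hypothetical sample data; every statistic the calculator reports follows from them:

```python
# Worked implementation of the formulas above for simple linear
# regression (k = 1), using hypothetical sample data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

n, k = len(x), 1
x_mean = sum(x) / n
y_mean = sum(y) / n

# OLS coefficients: b1 = Sxy / Sxx, b0 = y_mean - b1 * x_mean
sxx = sum((xi - x_mean) ** 2 for xi in x)
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_mean - b1 * x_mean

y_hat = [b0 + b1 * xi for xi in x]                      # predicted values
sst = sum((yi - y_mean) ** 2 for yi in y)               # total sum of squares
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))   # residual sum of squares
ssr = sst - sse                                         # explained sum of squares
r2 = 1 - sse / sst
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
f_stat = (ssr / k) / (sse / (n - k - 1))                # tests H0: slope = 0
```

The decomposition SST = SSR + SSE holds exactly for OLS fits, which is why R² can equivalently be computed as SSR/SST.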
Reference Data
| R² Range | Interpretation | Typical Domain | Action Guidance |
|---|---|---|---|
| 0.95 - 1.00 | Excellent fit | Physics, engineering calibration | Verify not overfitting; check for data leakage |
| 0.85 - 0.95 | Strong fit | Chemistry, controlled experiments | Model reliable for prediction within range |
| 0.70 - 0.85 | Moderate fit | Biology, agriculture | Consider additional predictors or transformations |
| 0.50 - 0.70 | Weak - moderate fit | Social sciences, psychology | Model captures trend but high residual noise |
| 0.30 - 0.50 | Weak fit | Economics, marketing | Useful for directional insight only |
| 0.00 - 0.30 | Poor fit | Behavioral data, stock returns | Linear model inadequate; try nonlinear or add variables |
| < 0.00 | Worse than mean model | Misspecified models | Model is harmful; discard and re-specify |
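The final row of the table, R² below zero, deserves a concrete illustration. It cannot happen for an OLS line evaluated on its own training data, but it does happen when a fixed or misspecified model is scored against new data. The values below are synthetic:

```python
# Demonstration of R^2 < 0: a misspecified model (here, a trend of the
# wrong sign) fits worse than simply predicting the mean, so SSE > SST.
y_obs  = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [5.0, 4.0, 3.0, 2.0, 1.0]   # hypothetical model with inverted slope

y_mean = sum(y_obs) / len(y_obs)
sst = sum((y - y_mean) ** 2 for y in y_obs)             # = 10.0
sse = sum((y - f) ** 2 for y, f in zip(y_obs, y_pred))  # = 40.0
r2 = 1 - sse / sst                                      # = -3.0
```

A negative value is not a rounding quirk: it is a direct signal that the mean of y would outpredict the model, matching the "discard and re-specify" guidance above.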
Key Statistics Reference

| Symbol | Meaning |
|---|---|
| SST | Total Sum of Squares - total variation of y around its mean |
| SSR | Regression Sum of Squares - variation explained by the regression line |
| SSE | Error (Residual) Sum of Squares - unexplained variation; SST = SSR + SSE |
| r | Pearson correlation coefficient; R² = r² for simple linear regression |
| SEE | Standard Error of Estimate - typical distance of data from the regression line, in y units |
| F | F-statistic - ratio of explained to unexplained variance per degree of freedom |
| Adj. R² | Adjusted R-squared - penalizes adding predictors that do not improve fit |
| n | Sample size - minimum 3 for regression, 10+ recommended |
| k | Number of independent predictors - 1 in simple linear regression |