Category Statistics & Probability

Number of categories 2 – 30

Significance level (α)

Category	Observed (O)	Expected (E)

Rows 2 – 10

Columns 2 – 10

Significance level (α)

Yates' correction (2×2 only)

About

Incorrect application of the chi-square test can lead to false conclusions in clinical trials, market research, and quality control. A miscalculated p-value can approve a defective product batch or reject a viable drug candidate. This calculator computes the Pearson chi-square statistic $\chi^2 = \sum (O - E)^2 / E$ and derives the p-value from the regularized incomplete gamma function rather than from a table lookup. It supports both goodness-of-fit tests (a one-dimensional frequency comparison against a theoretical distribution) and tests of independence on r × c contingency tables up to 10 × 10. Effect size is reported via Cramér’s V.

The test assumes that observations are independent, that categories are mutually exclusive, and that all expected frequencies are at least 5. When expected counts fall below this threshold, the chi-square approximation becomes unreliable and Fisher’s exact test should be considered instead. Degrees of freedom are computed automatically: (r − 1)(c − 1) for independence, (k − 1) for goodness-of-fit. Pro tip: always verify that your sampling method produces genuinely independent observations before trusting the output.

Formulas

The Pearson chi-square statistic measures the discrepancy between observed and expected frequencies across k categories:

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

Where O_i = observed frequency in category i, and E_i = expected frequency in category i.
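As a rough illustration of the sum above, here is a minimal Python sketch. The function name `chi_square_gof` and the die-roll counts are hypothetical, chosen only for this example; they are not taken from the calculator.

```python
def chi_square_gof(observed, expected):
    """Return (chi-square statistic, degrees of freedom) for a goodness-of-fit test."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected frequencies must be positive")
    # Sum of (O_i - E_i)^2 / E_i over all k categories.
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1  # df = k - 1


# Hypothetical data: 60 rolls of a die; a fair die expects 10 per face.
observed = [8, 12, 9, 11, 6, 14]
expected = [10] * 6
stat, df = chi_square_gof(observed, expected)
print(f"chi-square = {stat:.3f}, df = {df}")
```

For these made-up counts the squared deviations sum to 42, so the statistic is 42 / 10 = 4.2 with df = 5.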

For a test of independence on an r × c contingency table, expected frequencies are computed as:

$$E_{ij} = \frac{R_i \cdot C_j}{N}$$

Where R_i = total of row i, C_j = total of column j, N = grand total of all observations.
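The expected-frequency rule can be sketched in a few lines of Python. The helper names `expected_counts` and `chi_square_independence` and the 2 × 2 table are assumptions made for illustration, and the sketch assumes every row and column total is positive.

```python
def expected_counts(table):
    """Expected frequency E_ij = R_i * C_j / N for each cell of an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if n <= 0:
        raise ValueError("table must contain at least one observation")
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return (statistic, df, expected) for a test of independence."""
    expected = expected_counts(table)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, expected


# Hypothetical 2 x 2 table: rows are treatments, columns are outcomes.
table = [[30, 10], [20, 40]]
stat, df, expected = chi_square_independence(table)
print(expected, round(stat, 3), df)
```

Here the row totals are 40 and 60, the column totals are 50 and 50, and N = 100, so the expected counts are 20, 20, 30, and 30.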

Degrees of freedom:

df = (r − 1)(c − 1) (independence)

df = k − 1 (goodness-of-fit)

The p-value is obtained from the upper tail of the chi-square distribution using the regularized upper incomplete gamma function:

$$p = Q\!\left(\frac{\mathrm{df}}{2}, \frac{\chi^2}{2}\right) = 1 - P\!\left(\frac{\mathrm{df}}{2}, \frac{\chi^2}{2}\right)$$
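A quick sanity check on this formula is the special case df = 2, where the first argument equals 1 and the regularized upper incomplete gamma function reduces to a plain exponential. The worked value below uses the df = 2, α = 0.05 critical value from the reference table:

$$Q(1, x) = e^{-x} \quad\Longrightarrow\quad p = e^{-\chi^2/2}$$

$$\chi^2 = 5.991 \quad\Longrightarrow\quad p = e^{-2.9955} \approx 0.0500$$

Inverting the same relation gives the df = 2 critical value directly: $\chi^2_{\text{crit}} = -2 \ln \alpha$, which is about 5.991 for α = 0.05.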

Cramér’s V measures effect size for independence tests:

$$V = \sqrt{\frac{\chi^2}{N \cdot \min(r - 1,\, c - 1)}}$$

Where V ranges from 0 (no association) to 1 (perfect association). As a rule of thumb, values below 0.1 indicate a negligible effect, 0.1 to 0.3 a small effect, 0.3 to 0.5 a medium effect, and values above 0.5 a large effect.
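A minimal Python sketch of the effect-size formula follows. The function name `cramers_v` and the input values (a statistic of 16.667 from 100 observations in a 2 × 2 table) are hypothetical.

```python
import math


def cramers_v(chi2, n, rows, cols):
    """Cramér's V = sqrt(chi2 / (N * min(r - 1, c - 1)))."""
    k = min(rows - 1, cols - 1)
    if n <= 0 or k <= 0:
        raise ValueError("need N > 0 and at least a 2 x 2 table")
    return math.sqrt(chi2 / (n * k))


# Hypothetical inputs: chi-square of 16.667 on N = 100 in a 2 x 2 table.
v = cramers_v(16.667, 100, 2, 2)
print(f"V = {v:.3f}")
```

With these inputs V is roughly 0.41, which falls in the medium band of the thresholds quoted above.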

Reference Data

Degrees of Freedom (df)	α = 0.10	α = 0.05	α = 0.025	α = 0.01	α = 0.005	α = 0.001
1	2.706	3.841	5.024	6.635	7.879	10.828
2	4.605	5.991	7.378	9.210	10.597	13.816
3	6.251	7.815	9.348	11.345	12.838	16.266
4	7.779	9.488	11.143	13.277	14.860	18.467
5	9.236	11.070	12.833	15.086	16.750	20.515
6	10.645	12.592	14.449	16.812	18.548	22.458
7	12.017	14.067	16.013	18.475	20.278	24.322
8	13.362	15.507	17.535	20.090	21.955	26.124
9	14.684	16.919	19.023	21.666	23.589	27.877
10	15.987	18.307	20.483	23.209	25.188	29.588
12	18.549	21.026	23.337	26.217	28.300	32.909
15	22.307	24.996	27.488	30.578	32.801	37.697
20	28.412	31.410	34.170	37.566	39.997	45.315
25	34.382	37.652	40.646	44.314	46.928	52.620
30	40.256	43.773	46.979	50.892	53.672	59.703
40	51.805	55.758	59.342	63.691	66.766	73.402
50	63.167	67.505	71.420	76.154	79.490	86.661
60	74.397	79.082	83.298	88.379	91.952	99.607
80	96.578	101.879	106.629	112.329	116.321	124.839
100	118.498	124.342	129.561	135.807	140.169	149.449

Frequently Asked Questions

Use the goodness-of-fit test when you have a single categorical variable and want to compare observed frequencies against a known or hypothesized distribution (e.g., testing whether a die is fair). Use the test of independence when you have two categorical variables arranged in a contingency table and want to determine whether they are statistically associated (e.g., whether treatment type affects recovery outcome).

The chi-square approximation becomes unreliable when any expected cell count falls below 5. The calculator flags such cells with a warning. In these cases, consider combining categories to increase expected counts, or use Fisher's exact test for 2×2 tables. For larger sparse tables, a Monte Carlo simulation of the exact distribution is preferable.
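The check described here could look something like the sketch below. The function name `low_expected_cells`, the default threshold argument, and the sample expected counts are assumptions for illustration; the calculator's own warning logic is not shown in this page.

```python
def low_expected_cells(expected, threshold=5.0):
    """List (row, column, value) for every expected count below the threshold."""
    return [
        (i, j, e)
        for i, row in enumerate(expected)
        for j, e in enumerate(row)
        if e < threshold
    ]


# Hypothetical expected counts for a sparse 2 x 2 table.
expected = [[12.0, 3.0], [8.0, 2.0]]
flagged = low_expected_cells(expected)
for i, j, e in flagged:
    print(f"warning: expected count {e} in row {i}, column {j} is below 5")
```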

The p-value is computed algorithmically using the regularized incomplete gamma function. Specifically, p = Q(a, x) where a = df ÷ 2 and x = χ² ÷ 2. The implementation uses a series expansion for small x and Lentz's continued fraction algorithm for large x, achieving precision to approximately 10 significant digits.
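The two-branch scheme described above can be sketched in pure Python as follows. This is an illustrative version, not the calculator's actual code: the function names, the tolerance, the iteration cap, and the switch point x < a + 1 (a common textbook convention) are all assumptions.

```python
import math


def _gamma_p_series(a, x, eps=1e-14, max_iter=1000):
    """Lower regularized gamma P(a, x) via its power series (suited to small x)."""
    term = 1.0 / a
    total = term
    ap = a
    for _ in range(max_iter):
        ap += 1.0
        term *= x / ap
        total += term
        if abs(term) < abs(total) * eps:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def _gamma_q_cf(a, x, eps=1e-14, max_iter=1000):
    """Upper regularized gamma Q(a, x) via a continued fraction (modified Lentz)."""
    tiny = 1e-300  # guards against division by zero inside the recurrence
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter + 1):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * math.exp(-x + a * math.log(x) - math.lgamma(a))


def chi2_p_value(stat, df):
    """Upper-tail p-value: Q(df / 2, stat / 2)."""
    if df <= 0:
        raise ValueError("df must be positive")
    if stat <= 0:
        return 1.0
    a, x = df / 2.0, stat / 2.0
    if x < a + 1.0:  # assumed switch point between the two methods
        return 1.0 - _gamma_p_series(a, x)
    return _gamma_q_cf(a, x)


# Critical values from the reference table should give p close to 0.05.
print(chi2_p_value(3.841, 1), chi2_p_value(18.307, 10))
```

Feeding critical values from the reference table back through `chi2_p_value` is a convenient way to check an implementation like this, since each should return a p-value close to its column's α.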

Cramér's V cannot be negative or exceed 1: it is mathematically bounded between 0 and 1. A value of 0 indicates complete independence between variables. A value of 1 indicates a perfect deterministic relationship. It cannot be negative because it is the square root of a ratio of non-negative quantities.

Yates' correction subtracts 0.5 from each absolute observed-minus-expected difference before squaring, reducing the chi-square value for 2×2 tables. This compensates for the discrete-to-continuous approximation. It is most relevant when sample sizes are small (total N < 40). For large samples, the correction has negligible effect. This calculator applies Yates' correction automatically for 2×2 independence tables and displays both corrected and uncorrected values.
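To make the effect of the correction concrete, here is a small Python sketch that computes both values for a 2 × 2 table. The function name `chi_square_2x2`, the sample counts, and the clamp that stops the corrected difference from going below zero are assumptions for illustration, not a description of the calculator's internals.

```python
def chi_square_2x2(table, yates=True):
    """Pearson chi-square for a 2 x 2 table, optionally with Yates' correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i, r in enumerate(row_totals):
        for j, col in enumerate(col_totals):
            e = r * col / n
            diff = abs(table[i][j] - e)
            if yates:
                # Subtract 0.5 from |O - E|; clamping at 0 is an assumed safeguard.
                diff = max(diff - 0.5, 0.0)
            stat += diff ** 2 / e
    return stat


# Hypothetical small-sample table (N = 34).
table = [[12, 5], [7, 10]]
uncorrected = chi_square_2x2(table, yates=False)
corrected = chi_square_2x2(table, yates=True)
print(f"uncorrected = {uncorrected:.3f}, corrected = {corrected:.3f}")
```

For this table every cell differs from its expected count by 2.5, so the correction shrinks each difference to 2.0 and the statistic drops from about 2.98 to about 1.91.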

Higher degrees of freedom shift the chi-square distribution to the right, increasing the critical value needed to reject the null hypothesis at a given significance level. For example, at α = 0.05, the critical value is 3.841 for df = 1 but 18.307 for df = 10. The reference table above provides critical values for common df and α combinations.