Allele Frequency Calculator
Calculate allele frequencies, Hardy-Weinberg equilibrium genotype ratios, chi-square test, heterozygosity, and fixation index from observed genotype counts.
About
Incorrect allele frequency estimation propagates errors into linkage disequilibrium analysis, association studies, and forensic match probability calculations. A miscounted genotype class or a rounding shortcut can shift the allele frequency $p$ by enough to invert a chi-square verdict on Hardy-Weinberg equilibrium (HWE). This calculator applies the direct counting method, $p = (2N_{AA} + N_{Aa}) / 2N$, and returns expected genotype counts under HWE (from the proportions $p^2$, $2pq$, $q^2$), a Pearson chi-square goodness-of-fit statistic with its associated p-value, observed and expected heterozygosity, and Wright's fixation index $F$. It supports two-allele and three-allele loci. Results assume a diploid, randomly mating population with no selection, migration, mutation, or drift. Small sample sizes (N < 30) yield unreliable chi-square approximations; exact tests are preferable in that range.
Formulas
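The core relationship governing a randomly mating diploid population at a bi-allelic locus:

$$p^2 + 2pq + q^2 = 1$$

where $p + q = 1$. Allele frequencies from observed genotype counts:

$$p = \frac{2N_{AA} + N_{Aa}}{2N}, \qquad q = \frac{2N_{aa} + N_{Aa}}{2N} = 1 - p$$

Expected genotype counts under Hardy-Weinberg equilibrium:

$$E_{AA} = p^2 N, \qquad E_{Aa} = 2pqN, \qquad E_{aa} = q^2 N$$

Chi-square goodness-of-fit test:

$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$

Expected heterozygosity and fixation index:

$$H_e = 1 - \sum_i p_i^2, \qquad H_o = \frac{\text{number of heterozygotes}}{N}, \qquad F = \frac{H_e - H_o}{H_e}$$

For two alleles, $H_e$ reduces to $2pq$.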
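The two-allele quantities in this section can be sketched in a few lines of Python. This is a minimal illustration under the stated HWE assumptions, not the calculator's actual implementation; the function name `hwe_two_allele` and the dictionary keys are hypothetical. The p-value uses the closed form for a chi-square variate with df = 1, so only the standard library is needed.

```python
import math


def hwe_two_allele(n_AA, n_Aa, n_aa):
    """Summarise a bi-allelic locus from observed genotype counts.

    Function and key names are illustrative, not the calculator's own.
    """
    n = n_AA + n_Aa + n_aa
    if n <= 0:
        raise ValueError("at least one genotyped individual is required")

    # Direct counting: every AA carries two A alleles, every Aa carries one.
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1.0 - p

    observed = {"AA": n_AA, "Aa": n_Aa, "aa": n_aa}
    expected = {"AA": p * p * n, "Aa": 2 * p * q * n, "aa": q * q * n}

    # Pearson chi-square; classes with zero expectation (fixed locus) are skipped.
    chi2 = sum((observed[g] - e) ** 2 / e for g, e in expected.items() if e > 0)

    # Upper-tail probability for df = 1: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))

    h_exp = 2 * p * q
    h_obs = n_Aa / n
    # F is undefined when the locus is fixed (h_exp == 0).
    f = (h_exp - h_obs) / h_exp if h_exp > 0 else float("nan")

    return {"p": p, "q": q, "expected": expected, "chi2": chi2,
            "p_value": p_value, "He": h_exp, "Ho": h_obs, "F": f}


# Made-up counts, chosen only to exercise the function.
result = hwe_two_allele(298, 489, 213)
print(result)
```

A p-value above the chosen significance threshold means the counts give no evidence against HWE; it does not demonstrate that the population is in equilibrium.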
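For a three-allele system ($p + q + r = 1$), six genotype classes exist, with expected frequencies $p^2$, $q^2$, $r^2$, $2pq$, $2pr$, $2qr$. The chi-square test uses df = 3: 6 genotype classes, minus 1 for the fixed total, minus 2 independently estimated allele frequencies. This equals the number of genotype classes minus the number of alleles (6 − 3).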
Where: $p$, $q$, $r$ = allele frequencies; $N$ = total individuals; $O_i$ = observed count for genotype $i$; $E_i$ = expected count for genotype $i$; $H_e$ = expected heterozygosity; $H_o$ = observed heterozygosity; $F$ = Wright's fixation index (positive = heterozygote deficit, negative = heterozygote excess).
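The three-allele case follows the same pattern with six genotype classes and df = 3. The sketch below is illustrative only: `hwe_three_allele` and the allele labels are hypothetical names, and it assumes every genotype is observed directly (which is not true of ABO phenotypes, where dominance hides some genotypes). The p-value again uses a closed form, this time for df = 3.

```python
import math
from itertools import combinations_with_replacement


def hwe_three_allele(counts):
    """counts maps genotype pairs such as ("A", "B") to observed counts.

    Names are illustrative; allele order within a pair does not matter.
    """
    # Normalise keys so ("B", "A") and ("A", "B") land in the same class.
    obs = {}
    for pair, c in counts.items():
        key = tuple(sorted(pair))
        obs[key] = obs.get(key, 0) + c

    alleles = sorted({a for pair in obs for a in pair})
    if len(alleles) != 3:
        raise ValueError("expected exactly three distinct alleles")
    n = sum(obs.values())
    if n <= 0:
        raise ValueError("at least one genotyped individual is required")

    # Direct counting: a homozygote contributes two copies of its allele.
    freq = {a: 0.0 for a in alleles}
    for (a, b), c in obs.items():
        freq[a] += c
        freq[b] += c
    freq = {a: v / (2 * n) for a, v in freq.items()}

    chi2 = 0.0
    expected = {}
    for a, b in combinations_with_replacement(alleles, 2):
        prob = freq[a] ** 2 if a == b else 2 * freq[a] * freq[b]
        expected[(a, b)] = prob * n
        if expected[(a, b)] > 0:
            chi2 += (obs.get((a, b), 0) - expected[(a, b)]) ** 2 / expected[(a, b)]

    # Upper tail for df = 3: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    p_value = (math.erfc(math.sqrt(chi2 / 2))
               + math.sqrt(2 * chi2 / math.pi) * math.exp(-chi2 / 2))

    h_exp = 1.0 - sum(f * f for f in freq.values())
    h_obs = sum(c for (a, b), c in obs.items() if a != b) / n
    f_index = (h_exp - h_obs) / h_exp if h_exp > 0 else float("nan")

    return {"freq": freq, "expected": expected, "chi2": chi2,
            "p_value": p_value, "He": h_exp, "Ho": h_obs, "F": f_index}


# Made-up counts constructed to sit at HWE for frequencies 0.5, 0.3, 0.2.
demo = {("A", "A"): 25, ("B", "B"): 9, ("C", "C"): 4,
        ("A", "B"): 30, ("A", "C"): 20, ("B", "C"): 12}
print(hwe_three_allele(demo))
```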
Reference Data
| Parameter | Symbol | Formula / Definition | Typical Range |
|---|---|---|---|
| Dominant allele frequency | $p$ | $(2N_{AA} + N_{Aa}) / 2N$ | 0 - 1 |
| Recessive allele frequency | $q$ | $1 - p$ | 0 - 1 |
| Expected AA frequency | $p^2$ | Homozygous dominant | 0 - 1 |
| Expected Aa frequency | $2pq$ | Heterozygote | 0 - 0.5 |
| Expected aa frequency | $q^2$ | Homozygous recessive | 0 - 1 |
| Chi-square statistic | $\chi^2$ | $\sum (O - E)^2 / E$ | 0 - ∞ |
| Degrees of freedom (2-allele) | df | Genotype classes − alleles (3 − 2) | 1 |
| Degrees of freedom (3-allele) | df | 6 − 3 = 3 | 3 |
| Expected heterozygosity | $H_e$ | $1 - \sum p_i^2$ | 0 - 1 |
| Observed heterozygosity | $H_o$ | Heterozygotes ÷ $N$ | 0 - 1 |
| Fixation index | $F$ | $(H_e - H_o) / H_e$ | −1 to 1 |
| HWE significance (α) | α | Conventional threshold | 0.05 |
| Carrier frequency (2-allele) | $2pq$ | Proportion carrying one recessive allele | 0 - 0.5 |
| Cystic fibrosis (q) | - | European Caucasians | 0.022 |
| Sickle cell (q) | - | West African populations | 0.10 - 0.20 |
| PKU (q) | - | European Caucasians | 0.010 |
| Tay-Sachs (q) | - | Ashkenazi Jewish | 0.018 |
| CCR5-Δ32 (q) | - | Northern European | 0.10 |
| ABO locus alleles | p, q, r | IA, IB, i frequencies | Varies by population |
| MN blood group | - | Co-dominant, classic HWE example | p ≈ 0.54 |
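As a small worked use of the table, the carrier frequency $2pq$ can be computed from the recessive allele frequencies listed above. This sketch assumes each locus is bi-allelic and in Hardy-Weinberg equilibrium; `carrier_frequency` is a hypothetical helper name, and the $q$ values are copied from the table rather than independently sourced.

```python
def carrier_frequency(q):
    """Heterozygote (carrier) proportion 2pq for a recessive allele frequency q."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie between 0 and 1")
    return 2 * (1.0 - q) * q


# Recessive allele frequencies taken from the reference table above.
reference_q = {
    "Cystic fibrosis": 0.022,
    "PKU": 0.010,
    "Tay-Sachs": 0.018,
    "CCR5-delta32": 0.10,
}

for name, q in reference_q.items():
    carriers = carrier_frequency(q)
    print(f"{name}: 2pq = {carriers:.4f}")
```

Because $2pq$ peaks at $p = q = 0.5$, the result never exceeds 0.5, which matches the range given for the heterozygote rows.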