About

Incorrect allele frequency estimation propagates errors into linkage disequilibrium analysis, association studies, and forensic match probability calculations. A miscounted genotype class or a rounding shortcut can shift $p$ by enough to invert a chi-square verdict on Hardy-Weinberg equilibrium (HWE). This calculator applies the direct counting method, $p = (2N_{AA} + N_{Aa}) / (2N)$, and returns expected genotype counts under HWE ($p^2$, $2pq$, $q^2$), a Pearson chi-square goodness-of-fit statistic with associated p-value, observed and expected heterozygosity, and Wright's fixation index $F$. It supports two-allele and three-allele loci. Results assume a diploid, randomly mating population with no selection, migration, mutation, or drift. Small sample sizes ($N < 30$) yield unreliable chi-square approximations; exact tests are preferable in that range.


Formulas

The core relationship governing a randomly mating diploid population at a bi-allelic locus:

$$p^2 + 2pq + q^2 = 1$$

where $p + q = 1$. Allele frequencies from observed genotype counts:

$$p = \frac{2N_{AA} + N_{Aa}}{2N}, \qquad q = 1 - p$$
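
As a minimal sketch of the direct counting method above, the following Python function turns three genotype counts into allele frequencies. The function name and the example counts are illustrative assumptions, not part of the calculator itself.

```python
def allele_frequencies(n_AA: int, n_Aa: int, n_aa: int) -> tuple[float, float]:
    """Return (p, q) for a bi-allelic locus by direct allele counting."""
    n = n_AA + n_Aa + n_aa  # total individuals N
    if n <= 0:
        raise ValueError("at least one individual is required")
    # Each AA individual carries two A alleles, each Aa individual carries one.
    p = (2 * n_AA + n_Aa) / (2 * n)
    return p, 1.0 - p


# Hypothetical sample of 1000 individuals.
p, q = allele_frequencies(n_AA=360, n_Aa=480, n_aa=160)
print(f"p = {p:.3f}, q = {q:.3f}")
```

Counting alleles directly avoids the common shortcut of taking the square root of the recessive homozygote frequency, which is only valid if the population is already in equilibrium.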

Expected genotype counts under Hardy-Weinberg equilibrium:

$$E_{AA} = p^2 N, \qquad E_{Aa} = 2pq\,N, \qquad E_{aa} = q^2 N$$
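
The expected counts follow mechanically from $p$ and $N$. A short sketch, with a hypothetical function name and example values chosen only for illustration:

```python
def expected_counts(p: float, n: int) -> dict[str, float]:
    """Expected genotype counts under Hardy-Weinberg equilibrium."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    q = 1.0 - p
    return {
        "AA": p * p * n,      # homozygous for A
        "Aa": 2 * p * q * n,  # heterozygous
        "aa": q * q * n,      # homozygous for a
    }


# Hypothetical example: p = 0.6 in a sample of 1000 individuals.
for genotype, count in expected_counts(0.6, 1000).items():
    print(genotype, round(count, 1))
```

The three expected counts always sum to $N$, which is one of the constraints that reduces the degrees of freedom of the test.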

Chi-square goodness-of-fit test:

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
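
The statistic and its p-value can be sketched with the Python standard library alone. For one degree of freedom (the two-allele case) the chi-square upper tail has the closed form $\operatorname{erfc}(\sqrt{\chi^2/2})$; the function names and the observed counts below are assumptions made for illustration.

```python
import math


def chi_square_statistic(observed, expected):
    """Pearson goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df1(chi2: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical observed counts (AA, Aa, aa) with N = 1000, giving p = 0.6.
observed = (380, 440, 180)
expected = (360.0, 480.0, 160.0)  # p^2 N, 2pq N, q^2 N for p = 0.6
chi2 = chi_square_statistic(observed, expected)
print(f"chi2 = {chi2:.3f}, p-value = {p_value_df1(chi2):.4f}")
```

Here the statistic exceeds the conventional 3.84 cutoff for one degree of freedom, so this hypothetical sample would be flagged as deviating from HWE at the 0.05 level.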

Expected heterozygosity and fixation index:

$$H_e = 1 - \sum_{i=1}^{k} p_i^2, \qquad F = \frac{H_e - H_o}{H_e}$$
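
A minimal sketch of these two quantities for a bi-allelic locus follows. The function name and the counts are hypothetical, and observed heterozygosity is taken as heterozygotes divided by $N$.

```python
def heterozygosity_and_f(n_AA: int, n_Aa: int, n_aa: int):
    """Return (H_e, H_o, F) for a bi-allelic locus."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1.0 - p
    h_exp = 1.0 - (p * p + q * q)  # expected heterozygosity, equal to 2pq here
    h_obs = n_Aa / n               # observed heterozygosity
    # F is undefined when the locus is monomorphic (H_e = 0).
    f = (h_exp - h_obs) / h_exp if h_exp > 0 else float("nan")
    return h_exp, h_obs, f


# Hypothetical sample with fewer heterozygotes than expected.
h_exp, h_obs, f = heterozygosity_and_f(380, 440, 180)
print(f"He = {h_exp:.3f}, Ho = {h_obs:.3f}, F = {f:.4f}")
```

A positive result, as in this example, corresponds to a heterozygote deficit.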

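For a three-allele system ($p + q + r = 1$), six genotype classes exist, with expected frequencies $p^2$, $q^2$, $r^2$, $2pq$, $2pr$, $2qr$. The chi-square test uses $df = 3$: 6 genotype classes, minus 1 because the expected counts must sum to $N$, minus 2 independently estimated allele frequencies (equivalently, 6 genotype classes minus 3 alleles).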

Where: $p$, $q$, $r$ = allele frequencies; $N$ = total individuals; $O_i$ = observed count for genotype $i$; $E_i$ = expected count; $H_e$ = expected heterozygosity; $H_o$ = observed heterozygosity; $F$ = Wright's fixation index (positive = heterozygote deficit, negative = excess).
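
The three-allele case can be sketched the same way. The code below is an illustrative assumption about how such a test might be implemented, not the calculator's actual source: genotypes are keyed by unordered allele pairs, and the p-value uses the closed-form upper tail of a chi-square distribution with 3 degrees of freedom, $\operatorname{erfc}(\sqrt{x/2}) + \sqrt{2x/\pi}\,e^{-x/2}$.

```python
import math
from itertools import combinations_with_replacement


def p_value_df3(chi2: float) -> float:
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0)) + math.sqrt(2.0 * chi2 / math.pi) * math.exp(-chi2 / 2.0)


def hwe_multi_allele(counts: dict) -> tuple[dict, float]:
    """counts maps allele pairs such as ("A", "B") to observed genotype counts.

    Returns (allele frequencies, chi-square statistic). Assumes every allele
    is observed at least once, so that no expected count is zero.
    """
    n = sum(counts.values())
    alleles = sorted({a for pair in counts for a in pair})
    allele_counts = {a: 0 for a in alleles}
    for (a, b), c in counts.items():
        allele_counts[a] += c  # a homozygote adds 2c to the same allele
        allele_counts[b] += c
    freq = {a: v / (2 * n) for a, v in allele_counts.items()}

    chi2 = 0.0
    for a, b in combinations_with_replacement(alleles, 2):
        exp = (freq[a] ** 2 if a == b else 2 * freq[a] * freq[b]) * n
        obs = counts.get((a, b), 0) + (counts.get((b, a), 0) if a != b else 0)
        chi2 += (obs - exp) ** 2 / exp
    return freq, chi2


# Hypothetical counts for alleles A, B, C in a sample of 1000 individuals.
sample = {("A", "A"): 270, ("B", "B"): 100, ("C", "C"): 50,
          ("A", "B"): 280, ("A", "C"): 180, ("B", "C"): 120}
freq, chi2 = hwe_multi_allele(sample)
print(freq, round(chi2, 3), round(p_value_df3(chi2), 4))
```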

Reference Data

| Parameter | Symbol | Formula / Definition | Typical Range |
| --- | --- | --- | --- |
| Dominant allele frequency | $p$ | $(2N_{AA} + N_{Aa}) / (2N)$ | 0 to 1 |
| Recessive allele frequency | $q$ | $1 - p$ | 0 to 1 |
| Expected AA frequency | $p^2$ | Homozygous dominant | 0 to 1 |
| Expected Aa frequency | $2pq$ | Heterozygote | 0 to 0.5 |
| Expected aa frequency | $q^2$ | Homozygous recessive | 0 to 1 |
| Chi-square statistic | $\chi^2$ | $\sum (O - E)^2 / E$ | 0 to $\infty$ |
| Degrees of freedom (2-allele) | $df$ | Genotypes − alleles | 1 |
| Degrees of freedom (3-allele) | $df$ | 6 − 3 = 3 | 3 |
| Expected heterozygosity | $H_e$ | $1 - \sum p_i^2$ | 0 to 1 |
| Observed heterozygosity | $H_o$ | Heterozygotes / $N$ | 0 to 1 |
| Fixation index | $F$ | $(H_e - H_o) / H_e$ | −1 to 1 |
| HWE significance | $\alpha$ | Conventional threshold | 0.05 |
| Carrier frequency (2-allele) | $2pq$ | Proportion carrying one recessive allele | 0 to 0.5 |
| Cystic fibrosis ($q$) | - | European Caucasians | 0.022 |
| Sickle cell ($q$) | - | West African populations | 0.10 to 0.20 |
| PKU ($q$) | - | European Caucasians | 0.010 |
| Tay-Sachs ($q$) | - | Ashkenazi Jewish | 0.018 |
| CCR5-Δ32 ($q$) | - | Northern European | 0.10 |
| ABO locus alleles | $p$, $q$, $r$ | $I^A$, $I^B$, $i$ frequencies | Varies by population |
| MN blood group | - | Co-dominant, classic HWE example | $p \approx 0.54$ |
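
To connect the reference allele frequencies to the carrier-frequency row, the sketch below computes $2pq$ and $q^2$ for three of the recessive conditions listed. It assumes the populations are in Hardy-Weinberg equilibrium, which is a simplification; the dictionary and function names are hypothetical.

```python
# Recessive allele frequencies q taken from the reference data above.
REFERENCE_Q = {
    "Cystic fibrosis (European Caucasians)": 0.022,
    "PKU (European Caucasians)": 0.010,
    "Tay-Sachs (Ashkenazi Jewish)": 0.018,
}


def carrier_frequency(q: float) -> float:
    """Expected heterozygote (carrier) frequency 2pq, assuming HWE."""
    return 2.0 * (1.0 - q) * q


def affected_frequency(q: float) -> float:
    """Expected recessive homozygote frequency q^2, assuming HWE."""
    return q * q


for name, q in REFERENCE_Q.items():
    carriers = carrier_frequency(q)
    affected = affected_frequency(q)
    print(f"{name}: carriers about 1 in {1 / carriers:.0f}, "
          f"affected about 1 in {1 / affected:.0f}")
```

The output illustrates why carriers vastly outnumber affected individuals when $q$ is small: $2pq$ shrinks linearly with $q$ while $q^2$ shrinks quadratically.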

Frequently Asked Questions

A positive F signals a deficit of heterozygotes relative to Hardy-Weinberg expectation. Common causes include inbreeding (consanguineous mating), population substructure (Wahlund effect), or positive assortative mating. An F of 0.05 is often considered a threshold for mild inbreeding in human populations. Values approaching 1.0 indicate near-complete homozygosity.
The chi-square statistic scales linearly with sample size N. A genotype frequency that deviates from expectation by 0.5% may be statistically significant at N > 10000 even though the biological significance is negligible. Always report effect size (the fixation index F) alongside the p-value. Statistical significance is not biological significance.
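
The linear scaling is easy to see numerically. The sketch below holds hypothetical genotype proportions fixed (homozygote classes 0.5 percentage points above expectation, which corresponds to $F = 0.02$) and varies only $N$; the function name and proportions are assumptions for illustration.

```python
import math


def chi_square_from_proportions(props, n: int) -> float:
    """Chi-square HWE statistic for fixed genotype proportions (AA, Aa, aa) at sample size n."""
    f_AA, f_Aa, f_aa = props
    p = f_AA + f_Aa / 2.0
    q = 1.0 - p
    expected = (p * p, 2 * p * q, q * q)
    # With proportions held fixed, the statistic is n times a constant.
    return n * sum((o - e) ** 2 / e for o, e in zip(props, expected))


proportions = (0.255, 0.49, 0.255)  # hypothetical; expected under HWE is (0.25, 0.5, 0.25)
for n in (100, 1000, 10000, 100000):
    chi2 = chi_square_from_proportions(proportions, n)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail, 1 degree of freedom
    print(f"N = {n:>6}: chi2 = {chi2:.2f}, p-value = {p_value:.4f}")
```

The same deviation is far from significant at $N = 1000$ but crosses the 3.84 cutoff at $N = 10000$, even though $F$ is unchanged.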
Null alleles produce no detectable product (e.g., failed PCR amplification at a microsatellite locus). Null/null homozygotes appear as missing data, and null/visible heterozygotes are misclassified as visible homozygotes. This inflates observed homozygosity, biases F upward, and makes the population appear to violate HWE. Software such as Micro-Checker or ML-NullFreq can estimate null allele frequency. If null allele frequency exceeds 0.05, consider dropping the locus.
This tool assumes autosomal diploid loci and should not be used for X-linked loci. At an X-linked locus, males are hemizygous and genotype expectations differ by sex: females follow standard HWE ($p^2$, $2pq$, $q^2$) while males have genotype frequencies $p$ and $q$ directly. Pooling sexes without correction produces spurious HWE deviations.
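
For readers who need the X-linked expectations, a minimal sketch follows. It is not something this calculator does; it assumes the allele frequency is the same in both sexes and estimates it by counting two X chromosomes per female and one per male. All names and counts are hypothetical.

```python
def x_linked_allele_frequency(f_AA: int, f_Aa: int, f_aa: int, m_A: int, m_a: int) -> float:
    """Frequency of allele A counted over all X chromosomes (2 per female, 1 per male)."""
    n_females = f_AA + f_Aa + f_aa
    n_males = m_A + m_a
    return (2 * f_AA + f_Aa + m_A) / (2 * n_females + n_males)


def x_linked_expected(p: float, n_females: int, n_males: int):
    """Expected genotype counts by sex under random mating at an X-linked locus."""
    q = 1.0 - p
    females = {"AA": p * p * n_females, "Aa": 2 * p * q * n_females, "aa": q * q * n_females}
    males = {"A": p * n_males, "a": q * n_males}  # hemizygous: frequencies are p and q
    return females, males


# Hypothetical sample: 1000 females and 500 males.
p = x_linked_allele_frequency(360, 480, 160, 300, 200)
females, males = x_linked_expected(p, 1000, 500)
print(p, females, males)
```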
The chi-square approximation requires all expected cell counts $E_i \geq 5$. For a rare allele with $q = 0.05$, the expected homozygote count is $q^2 \times N = 0.0025N$, so you need $N \geq 2000$ for that cell alone. Below this threshold, use an exact test (Fisher or permutation). This calculator flags when any expected count falls below 5.
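
The sample-size arithmetic above generalizes to any allele frequency. The helper below is a hypothetical sketch that finds the smallest $N$ at which the rarest genotype class of a bi-allelic locus reaches the minimum expected count.

```python
import math


def min_sample_size(q: float, min_expected: float = 5.0) -> int:
    """Smallest N for which every expected genotype count is at least min_expected."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    p = 1.0 - q
    smallest_class = min(p * p, 2 * p * q, q * q)  # rarest expected genotype frequency
    return math.ceil(min_expected / smallest_class)


for q in (0.5, 0.2, 0.05, 0.01):
    print(f"q = {q}: N of roughly {min_sample_size(q)} or more")
```

Because the requirement grows with $1/q^2$, rare alleles push the chi-square test out of reach quickly, which is where exact tests become the practical choice.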
HWE assumes no fitness differences among genotypes. If heterozygotes have higher fitness (heterozygote advantage, as with sickle cell trait in malaria-endemic regions), the population will maintain higher heterozygosity than HWE predicts, producing a negative $F$. Directional selection against homozygous recessives reduces $q$ slowly because most recessive alleles hide in heterozygous carriers. The rate of change in $q$ per generation under complete selection against aa is $\Delta q = -q^2 / (1 + q)$.
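
To illustrate how slowly $q$ declines, the sketch below iterates the per-generation change $\Delta q = -q^2 / (1 + q)$. The function names and starting values are assumptions; the recursion is equivalent to the closed form $q_t = q_0 / (1 + t\,q_0)$, which the usage example prints alongside for comparison.

```python
def delta_q(q: float) -> float:
    """Change in q over one generation under complete selection against aa."""
    return -q * q / (1.0 + q)


def q_after(q0: float, generations: int) -> float:
    """Recessive allele frequency after the given number of generations."""
    q = q0
    for _ in range(generations):
        q += delta_q(q)
    return q


# Hypothetical starting frequencies.
for q0 in (0.10, 0.022):
    for t in (10, 100):
        closed_form = q0 / (1.0 + t * q0)
        print(f"q0 = {q0}, t = {t}: iterated {q_after(q0, t):.5f}, closed form {closed_form:.5f}")
```

Halving $q$ takes about $1/q_0$ generations, so the rarer the allele, the longer selection needs to reduce it further.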