Benford's Law Calculator
Analyze datasets for Benford's Law conformity. Compute leading digit frequencies, chi-squared, MAD, and KS tests with visual charts.
About
Benford's Law (also called the Newcomb-Benford Law) predicts that in many naturally occurring datasets, the leading digit d appears with probability P(d) = log₁₀(1 + 1/d). Digit 1 appears roughly 30.1% of the time, not the 11.1% (1/9) that a uniform spread over nine digits would give. Datasets spanning multiple orders of magnitude (population counts, financial statements, physical constants, election returns) tend to follow this distribution. Failure to conform can indicate data fabrication, rounding bias, or truncation artifacts. Forensic accountants and auditors use Benford analysis as a first-pass anomaly screen on general ledger entries, tax returns, and expense reports.
This calculator extracts the leading significant digit of every value in your dataset, computes observed vs. expected frequencies, and runs three conformity tests: the χ² goodness-of-fit test (degrees of freedom = 8, critical value 15.507 at α = 0.05), Nigrini's Mean Absolute Deviation (MAD), and the Kolmogorov-Smirnov (KS) statistic. The tests assume that observations are independent and that the dataset contains at least 100 values; smaller samples yield unreliable results. Entries with no leading significant digit (zeros and non-numeric text) are silently excluded from analysis.
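The extraction step described above could look roughly like the following Python sketch. The helper names `leading_digit` and `digit_counts` are illustrative assumptions, not the calculator's actual code, and parsing through `Decimal` is only one possible choice; inputs containing thousands separators or currency symbols would be treated as non-numeric here.

```python
from collections import Counter
from decimal import Decimal, InvalidOperation


def leading_digit(value):
    """Return the first significant digit (1-9) of value, or None if it has none."""
    try:
        number = Decimal(str(value).strip())
    except InvalidOperation:
        return None  # non-numeric text
    if not number.is_finite() or number == 0:
        return None  # zero, NaN, and infinities have no leading digit
    # as_tuple().digits holds the coefficient digits without sign or exponent,
    # so negative numbers and small decimals such as 0.0045 are handled alike.
    for digit in number.as_tuple().digits:
        if digit != 0:
            return digit
    return None


def digit_counts(values):
    """Count leading digits 1-9, silently skipping values that have none."""
    digits = (leading_digit(v) for v in values)
    counts = Counter(d for d in digits if d is not None)
    return {d: counts.get(d, 0) for d in range(1, 10)}
```

Calling `digit_counts([123, "0.045", "n/a", 0, 9.9, 150])` counts only the four usable entries; `"n/a"` and `0` are dropped.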
Formulas
The probability that the first significant digit equals d (d ∈ {1, 2, …, 9}) under Benford's Law:
$$P(d) = \log_{10}\left(1 + \frac{1}{d}\right)$$
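As a quick check of the probability formula, the nine Benford probabilities can be computed directly. This is a minimal sketch; the function name `benford_probability` is an assumption made for illustration.

```python
import math


def benford_probability(d):
    """Benford probability that the first significant digit equals d."""
    if d not in range(1, 10):
        raise ValueError("leading digit must be an integer from 1 to 9")
    return math.log10(1 + 1 / d)


# Expected proportion for each leading digit 1..9.
expected = {d: benford_probability(d) for d in range(1, 10)}
```

The nine values sum to 1 because the product of the terms (d + 1)/d for d = 1 to 9 telescopes to 10, and log₁₀(10) = 1.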
The chi-squared goodness-of-fit statistic with k − 1 = 8 degrees of freedom (k = 9 digit categories):
$$\chi^2 = \sum_{d=1}^{9} \frac{(O_d - E_d)^2}{E_d}$$
where $O_d$ is the observed count of digit d, and $E_d = N \cdot P(d)$ is the expected count given total sample size N. Reject the null hypothesis (the data follow Benford's Law) when $\chi^2 > 15.507$ at significance level α = 0.05.
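The chi-squared statistic can be sketched in a few lines of Python. The function name `chi_squared` and the dictionary input (digit mapped to observed count) are assumptions for illustration; the critical value is the one quoted in the text.

```python
import math

# Critical value for 8 degrees of freedom at alpha = 0.05, as quoted in the text.
CHI2_CRITICAL = 15.507


def chi_squared(counts):
    """Chi-squared statistic of observed leading-digit counts against Benford."""
    n = sum(counts.get(d, 0) for d in range(1, 10))
    if n == 0:
        raise ValueError("no usable observations")
    statistic = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)  # expected count for digit d
        statistic += (counts.get(d, 0) - expected) ** 2 / expected
    return statistic


def rejects_benford(counts):
    """True when the statistic exceeds the 5% critical value."""
    return chi_squared(counts) > CHI2_CRITICAL
```

Because the statistic grows with N for a fixed proportional deviation, very large datasets can exceed the critical value even when the digit proportions look close to Benford's.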
Nigrini's Mean Absolute Deviation:
$$\mathrm{MAD} = \frac{1}{9} \sum_{d=1}^{9} \left| O'_d - P(d) \right|$$
where $O'_d = O_d / N$ is the observed proportion (relative frequency) of digit d.
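A minimal Python sketch of the MAD calculation follows. The name `mean_absolute_deviation` and the digit-to-count dictionary input are illustrative assumptions.

```python
import math


def mean_absolute_deviation(counts):
    """Average absolute gap between observed and Benford proportions over 9 digits."""
    n = sum(counts.get(d, 0) for d in range(1, 10))
    if n == 0:
        raise ValueError("no usable observations")
    total = 0.0
    for d in range(1, 10):
        observed_proportion = counts.get(d, 0) / n
        total += abs(observed_proportion - math.log10(1 + 1 / d))
    return total / 9
```

Unlike the chi-squared statistic, MAD works on proportions only, so it does not grow with sample size.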
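Kolmogorov-Smirnov statistic:
$$D = \max_{d \in \{1, \dots, 9\}} \left| F_{\mathrm{obs}}(d) - F_{\mathrm{exp}}(d) \right|$$
where $F_{\mathrm{obs}}$ and $F_{\mathrm{exp}}$ are the cumulative distribution functions of the observed and expected proportions. The critical value at α = 0.05 is approximated as $1.36 / \sqrt{N}$; conformity is rejected when D exceeds it.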
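The KS statistic compares running totals of the observed and expected proportions. The sketch below assumes the usual large-sample approximation 1.36/√N for the 5% critical value; the function names are hypothetical.

```python
import math
from itertools import accumulate


def ks_statistic(counts):
    """Largest gap between the observed and Benford cumulative distributions."""
    n = sum(counts.get(d, 0) for d in range(1, 10))
    if n == 0:
        raise ValueError("no usable observations")
    observed = [counts.get(d, 0) / n for d in range(1, 10)]
    expected = [math.log10(1 + 1 / d) for d in range(1, 10)]
    # accumulate() turns per-digit proportions into cumulative distributions.
    return max(abs(o - e) for o, e in zip(accumulate(observed), accumulate(expected)))


def ks_critical_value(n):
    """Approximate 5% critical value for a sample of n observations."""
    return 1.36 / math.sqrt(n)
```

A dataset would be flagged when `ks_statistic(counts)` exceeds `ks_critical_value(n)` for its sample size `n`.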
Reference Data
| Leading Digit d | Benford Probability P(d) | Percentage | Cumulative % |
|---|---|---|---|
| 1 | 0.30103 | 30.103% | 30.103% |
| 2 | 0.17609 | 17.609% | 47.712% |
| 3 | 0.12494 | 12.494% | 60.206% |
| 4 | 0.09691 | 9.691% | 69.897% |
| 5 | 0.07918 | 7.918% | 77.815% |
| 6 | 0.06695 | 6.695% | 84.510% |
| 7 | 0.05799 | 5.799% | 90.309% |
| 8 | 0.05115 | 5.115% | 95.424% |
| 9 | 0.04576 | 4.576% | 100.000% |
| MAD Range | Conformity Level | Interpretation |
|---|---|---|
| 0.000 - 0.006 | Close Conformity | Strong adherence to Benford's Law. Typical of large, unmanipulated datasets. |
| 0.006 - 0.012 | Acceptable Conformity | Minor deviations within normal variance. Generally passes audit screens. |
| 0.012 - 0.015 | Marginal Conformity | Warrants further investigation. Could indicate partial data manipulation or natural boundary effects. |
| ≥ 0.015 | Non-Conformity | Significant deviation. Data may be fabricated, truncated, or drawn from a non-Benford process. |
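The MAD table above can be expressed as a small lookup. The ranges in the table share their endpoints, so this sketch assumes each band includes its lower bound and excludes its upper bound; that boundary handling, like the function name `classify_mad`, is an assumption rather than something the table specifies.

```python
def classify_mad(mad):
    """Map a first-digit MAD value to the conformity level from the table."""
    if mad < 0:
        raise ValueError("MAD cannot be negative")
    if mad < 0.006:
        return "Close Conformity"
    if mad < 0.012:
        return "Acceptable Conformity"
    if mad < 0.015:
        return "Marginal Conformity"
    return "Non-Conformity"
```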
| Dataset Type | Typically Conforms? | Notes |
|---|---|---|
| Population of cities/countries | Yes | Spans many orders of magnitude |
| Financial statement line items | Yes | Standard forensic accounting application |
| Stock prices | Yes | Over long periods with sufficient range |
| Physical constants | Yes | Mixed units amplify multi-order effect |
| Fibonacci sequence | Yes | Exact conformity in the limit |
| Telephone numbers | No | Assigned, not naturally generated |
| Human heights (cm) | No | Narrow range, single order of magnitude |
| Lottery results | No | Uniform distribution by design |
| Zip/postal codes | No | Assigned sequentially by geography |
| Invoice amounts (fabricated) | No | Fraudsters tend to over-represent digits 5-9 |
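The Fibonacci row in the table above is easy to check empirically. The following sketch takes the leading digits of the first 1,000 Fibonacci numbers and computes their MAD against Benford's Law; the function names and the choice of 1,000 terms are illustrative assumptions.

```python
import math
from collections import Counter


def fibonacci_leading_digits(count):
    """Yield the leading digit of each of the first `count` Fibonacci numbers."""
    a, b = 1, 1
    for _ in range(count):
        yield int(str(a)[0])  # Python integers are exact, so str() is safe here
        a, b = b, a + b


def benford_mad(digits):
    """Mean absolute deviation of leading-digit proportions from Benford."""
    counts = Counter(digits)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("no digits supplied")
    return sum(
        abs(counts.get(d, 0) / n - math.log10(1 + 1 / d)) for d in range(1, 10)
    ) / 9


fib_mad = benford_mad(fibonacci_leading_digits(1000))
```

With 1,000 terms, `fib_mad` should land in the close-conformity band of the MAD table, consistent with the claim that the sequence conforms in the limit.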