Accepts integers or decimals. Minimum 2 values required.

About

Choosing the wrong class width collapses distinct data patterns into noise or fragments them into meaningless spikes. A histogram with too few bins hides bimodality; too many bins create random jaggedness that misleads interpretation. This calculator determines optimal class width h and number of classes k from raw ungrouped data using five established rules: Sturges' formula, Scott's normal reference rule, the Freedman-Diaconis estimator, the Square Root choice, and Rice's rule. Each method makes different assumptions about the underlying distribution. Sturges assumes approximate normality. Freedman-Diaconis is robust against outliers because it relies on the interquartile range IQR rather than standard deviation σ. The tool parses your raw observations, computes all five widths simultaneously, builds a grouped frequency table with absolute, relative, and cumulative frequencies, and renders a histogram. Note: all rules are asymptotic approximations. For sample sizes below 30, results should be treated as rough guides, not definitive answers.


Formulas

The calculator determines the number of classes $k$ and class width $h$ from $n$ observations spanning a range $R = x_{\max} - x_{\min}$. Five rules are evaluated simultaneously.

Sturges' Rule

$$k = \lceil 1 + 3.322 \log_{10}(n) \rceil, \qquad h = \frac{R}{k}$$

Scott's Rule

$$h = \frac{3.49\, s}{n^{1/3}}, \qquad k = \left\lceil \frac{R}{h} \right\rceil$$

Freedman-Diaconis Rule

$$h = \frac{2\,\mathrm{IQR}}{n^{1/3}}, \qquad k = \left\lceil \frac{R}{h} \right\rceil$$

Square Root Rule

$$k = \lceil \sqrt{n} \rceil, \qquad h = \frac{R}{k}$$

Rice Rule

$$k = \lceil 2\, n^{1/3} \rceil, \qquad h = \frac{R}{k}$$

Where $n$ = number of observations, $R$ = range ($x_{\max} - x_{\min}$), $s$ = sample standard deviation (Bessel-corrected), IQR = interquartile range ($Q_3 - Q_1$), $h$ = class width, $k$ = number of classes, and $\lceil \cdot \rceil$ denotes the ceiling function.
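
The five rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the calculator's actual source: the function name `class_width_rules` and the sample data are hypothetical, the quartiles come from the standard library's `statistics.quantiles` with `method="exclusive"` (which interpolates at rank p(n + 1)), and the sketch assumes at least two observations with a non-zero range, standard deviation, and IQR.

```python
import math
import statistics


def class_width_rules(data):
    """Return {rule: (k, h)} for the five rules.

    Assumes len(data) >= 2 and that R, s, and IQR are all non-zero.
    """
    n = len(data)
    r = max(data) - min(data)        # range R
    s = statistics.stdev(data)       # Bessel-corrected standard deviation
    q1, _, q3 = statistics.quantiles(data, n=4, method="exclusive")
    iqr = q3 - q1

    results = {}

    # Rules that fix k first, then derive h = R / k.
    k_first = {
        "sturges": 1 + 3.322 * math.log10(n),
        "sqrt": math.sqrt(n),
        "rice": 2 * n ** (1 / 3),
    }
    for name, k_raw in k_first.items():
        k = math.ceil(k_raw)
        results[name] = (k, r / k)

    # Rules that fix h first, then derive k = ceil(R / h).
    h_first = {
        "scott": 3.49 * s / n ** (1 / 3),
        "freedman_diaconis": 2 * iqr / n ** (1 / 3),
    }
    for name, h in h_first.items():
        results[name] = (math.ceil(r / h), h)

    return results


# Hypothetical sample, for illustration only.
sample = [4.2, 5.1, 5.8, 6.0, 6.3, 6.9, 7.4, 7.7, 8.5, 9.6, 10.2, 12.8]
for rule, (k, h) in class_width_rules(sample).items():
    print(f"{rule:>18}: k = {k}, h = {h:.3f}")
```

Splitting the rules into "k first" and "h first" groups mirrors the formulas: Sturges, Square Root, and Rice depend only on n, while Scott and Freedman-Diaconis depend on the spread of the data.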

Reference Data

| Rule | Formula for Classes / Width | Assumption | Best For | Weakness |
| --- | --- | --- | --- | --- |
| Sturges (1926) | $k = \lceil 1 + 3.322 \log_{10}(n) \rceil$ | Normal distribution | Small to moderate $n$ (< 200) | Under-bins for large or skewed data |
| Scott (1979) | $h = 3.49\, s\, n^{-1/3}$ | Normal distribution | Continuous, roughly symmetric data | Sensitive to outliers via $s$ |
| Freedman-Diaconis (1981) | $h = 2\,\mathrm{IQR}\, n^{-1/3}$ | None (nonparametric) | Skewed data, outlier-heavy sets | May over-bin if IQR is very small |
| Square Root | $k = \lceil \sqrt{n} \rceil$ | None | Quick estimation, Excel default | No theoretical optimality |
| Rice | $k = \lceil 2\, n^{1/3} \rceil$ | None | Large datasets | Tends to over-bin for small $n$ |
| Manual | User-defined $k$ | Domain knowledge | Regulatory or publication standards | Requires expertise |

Common sample size benchmarks

| n | Sturges | Square Root | Rice | Context |
| --- | --- | --- | --- | --- |
| 30 | 6 | 6 | 7 | Common rule-of-thumb minimum for the CLT approximation |
| 100 | 8 | 10 | 10 | Typical classroom dataset |
| 500 | 10 | 23 | 16 | Survey-scale data |
| 1000 | 11 | 32 | 20 | Large-sample analytics |
| 10000 | 15 | 100 | 44 | Big data; Sturges notably under-bins |

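Descriptive statistics used internally

| Statistic | Definition | Meaning |
| --- | --- | --- |
| Range | $R = x_{\max} - x_{\min}$ | Spread of data |
| Mean | $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ | Arithmetic average |
| Std Dev ($s$) | $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$ | Sample standard deviation (Bessel-corrected) |
| IQR | $\mathrm{IQR} = Q_3 - Q_1$ | Middle 50% spread, outlier-resistant |
| Q1 (25th percentile) | Linear interpolation at rank $0.25(n + 1)$ | Lower quartile boundary |
| Q3 (75th percentile) | Linear interpolation at rank $0.75(n + 1)$ | Upper quartile boundary |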
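
The quartile rows above describe interpolation at a fractional rank. A minimal sketch of that step follows. The helper name `quartile_at_rank` and the sample values are hypothetical, and clamping the rank to the range 1 to n for very small samples is an assumption of this sketch, not documented behavior of the calculator.

```python
def quartile_at_rank(data, p):
    """Linear interpolation at the 1-based rank p * (n + 1).

    The rank is clamped to [1, n]; that clamping is an assumption of this sketch.
    """
    xs = sorted(data)
    n = len(xs)
    rank = min(max(p * (n + 1), 1), n)
    lower = int(rank)            # 1-based index of the value at or below the rank
    frac = rank - lower          # fractional distance toward the next value
    if lower >= n:
        return xs[-1]
    return xs[lower - 1] + frac * (xs[lower] - xs[lower - 1])


# Hypothetical sample with n = 8: Q1 sits at rank 2.25, Q3 at rank 6.75.
data = [2, 4, 4, 5, 7, 9, 10, 12]
q1 = quartile_at_rank(data, 0.25)
q3 = quartile_at_rank(data, 0.75)
print(q1, q3, q3 - q1)
```

For this sample the result agrees with Python's `statistics.quantiles(data, n=4, method="exclusive")`, which uses the same p(n + 1) rank.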

Frequently Asked Questions

Use the Freedman-Diaconis rule. It replaces standard deviation with the interquartile range (IQR), which is resistant to extreme values. For example, income data with a few very high earners would inflate the standard deviation used by Scott's rule, producing bins that are too wide and masking the shape of the lower-income majority. Freedman-Diaconis avoids this by anchoring to the middle 50% of data.
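
The effect of a few extreme values can be checked numerically. The sketch below uses made-up income figures (in thousands) and hypothetical helper names `scott_width` and `fd_width`. It only illustrates the argument above and is not taken from the calculator.

```python
import statistics


def scott_width(data):
    """Scott's rule: h = 3.49 * s / n^(1/3)."""
    return 3.49 * statistics.stdev(data) / len(data) ** (1 / 3)


def fd_width(data):
    """Freedman-Diaconis rule: h = 2 * IQR / n^(1/3), quartiles at rank p(n + 1)."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return 2 * (q3 - q1) / len(data) ** (1 / 3)


# Made-up data: 40 ordinary incomes plus two very high earners.
incomes = list(range(20, 60))
with_outliers = incomes + [400, 900]

print("Scott:", scott_width(incomes), "->", scott_width(with_outliers))
print("F-D:  ", fd_width(incomes), "->", fd_width(with_outliers))
```

With this data the two outliers inflate Scott's width by roughly an order of magnitude, while the Freedman-Diaconis width barely moves.
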
Sturges' formula grows logarithmically: k = 1 + 3.322 × log₁₀(n). At n = 10,000 you get only about 15 classes, while the Square Root rule yields 100. Sturges derived his formula assuming a binomial distribution converging to normal. For large n with non-normal data, it systematically under-bins, hiding multimodality and local structure. For n > 200, prefer Scott's or Freedman-Diaconis.
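
To see the difference in growth rates, the n-only rules can be tabulated side by side. This is a small sketch with hypothetical function names. It applies the ceiling convention from the Formulas section to each rule.

```python
import math


def k_sturges(n):
    return math.ceil(1 + 3.322 * math.log10(n))   # logarithmic growth


def k_sqrt(n):
    return math.ceil(math.sqrt(n))                # grows like n^(1/2)


def k_rice(n):
    return math.ceil(2 * n ** (1 / 3))            # grows like n^(1/3)


for n in (30, 100, 500, 1000, 10000):
    print(f"n = {n:>5}: Sturges {k_sturges(n):>3}, sqrt {k_sqrt(n):>3}, Rice {k_rice(n):>3}")
```
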
When all values are equal, the range R = 0. Division by zero is avoided: the calculator reports a single class containing all observations with width 0. Scott's and Freedman-Diaconis rules also yield h = 0 because both σ and IQR equal zero. A toast notification warns that the data has no variability.
Yes. Select the "Manual" method and enter your desired number of classes k. The calculator then derives h = R / k. This is useful when a regulatory standard, house style, or publication guideline prescribes a specific bin count.
Class width h is a single number: the span of each bin. Class interval boundaries are the actual edges. The first lower boundary is typically x_min (or slightly below it). Each subsequent boundary adds h. So for x_min = 10 and h = 5, boundaries are [10, 15), [15, 20), [20, 25), etc. The calculator uses left-closed, right-open intervals except for the last class, which is closed on both sides to include x_max.
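
The interval convention described above (left-closed, right-open, with a closed last class) can be sketched as follows. The function name `frequency_table` and the sample data are hypothetical. The sketch assumes h > 0 and that every observation lies between x_min and x_min + k·h.

```python
def frequency_table(data, x_min, h, k):
    """Build rows of (lower, upper, count, relative, cumulative) for k classes.

    Classes are [lower, upper), except the last, which also includes its upper edge.
    Assumes h > 0 and x_min <= x <= x_min + k * h for every observation.
    """
    counts = [0] * k
    for x in data:
        idx = int((x - x_min) // h)
        idx = min(idx, k - 1)        # a value on the final upper edge joins the last class
        counts[idx] += 1

    n = len(data)
    rows, cumulative = [], 0
    for i, count in enumerate(counts):
        cumulative += count
        rows.append((x_min + i * h, x_min + (i + 1) * h, count, count / n, cumulative))
    return rows


# Hypothetical data matching the x_min = 10, h = 5 example above.
table = frequency_table([10, 12, 15, 19, 20, 24, 25], x_min=10, h=5, k=3)
for lower, upper, count, rel, cum in table:
    print(f"{lower}-{upper}: count={count}, relative={rel:.3f}, cumulative={cum}")
```

Here 15 lands in the second class, not the first, and 25 lands in the last class because that class is closed on the right.
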
Yes. The calculator rounds h upward to a "nice" number (multiples of 1, 2, 5, 10, etc.) for readability. This may slightly widen the total span beyond the original range, meaning the last class might extend past x_max. Frequency counts remain accurate because every observation still falls into exactly one bin. However, the visual histogram may show a small empty tail on the right.
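
One common way to round a width upward to a "nice" value is the 1-2-5 sequence scaled by a power of ten. The sketch below assumes that sequence and uses a hypothetical function name `nice_ceil`. The calculator's exact rounding rule is not specified here and may differ.

```python
import math


def nice_ceil(h):
    """Round h up to the nearest 1, 2, or 5 times a power of ten (assumed sequence)."""
    if h <= 0:
        raise ValueError("h must be positive")
    base = 10 ** math.floor(math.log10(h))   # largest power of ten not exceeding h
    for multiplier in (1, 2, 5, 10):
        candidate = multiplier * base
        if candidate >= h:
            return candidate
    return 10 * base                         # guard against floating-point edge cases


for raw in (12.375, 9.9, 5, 0.3):
    print(raw, "->", nice_ceil(raw))
```

Because the rounded width is never smaller than the raw width, k classes of the rounded width still cover the full range. That is why the last class can extend past x_max.
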
The calculator sorts the data and uses linear interpolation. For Q1, the rank position is 0.25 × (n + 1). If this is not an integer, interpolation between the two adjacent sorted values is applied. The same process applies for Q3 at rank 0.75 × (n + 1). Note that this rank formula is the "exclusive" convention of Excel's QUARTILE.EXC, not QUARTILE.INC, which interpolates at rank 1 + p × (n − 1). Other software may default to a different convention, so quartiles can differ slightly, especially on small samples.