About

A box plot (box-and-whisker diagram) encodes five order statistics in a single glyph: the minimum, first quartile Q1, median Q2, third quartile Q3, and maximum. Points below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR are flagged as outliers. Misidentified outliers can distort regression models, inflate variance estimates, and produce misleading confidence intervals. This tool computes the full five-number summary, detects outliers using Tukey's fence method, and renders a scaled box plot on a canvas. It computes quartiles by linear interpolation (the inclusive method, a common default in statistical software). Note: for datasets with fewer than 4 observations, quartile definitions become ambiguous.

Formulas

The interquartile range and outlier fences are computed as follows:

IQR = Q3 − Q1
LF = Q1 − 1.5 × IQR
UF = Q3 + 1.5 × IQR
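
As a minimal sketch of the fence formulas above, the following Python function takes already-computed quartiles and returns the two fences. The function name `tukey_fences` and the parameter `k` are illustrative choices, not part of the tool.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower_fence, upper_fence) for the given quartiles.

    k is the fence multiplier; 1.5 is the value used in the formulas above.
    """
    iqr = q3 - q1              # IQR = Q3 - Q1
    lower = q1 - k * iqr       # LF = Q1 - 1.5 * IQR
    upper = q3 + k * iqr       # UF = Q3 + 1.5 * IQR
    return lower, upper


# Worked example: Q1 = 4, Q3 = 10, so IQR = 6 and the fences sit 9 units out.
lf, uf = tukey_fences(4.0, 10.0)
print(lf, uf)  # -5.0 19.0
```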

Quartiles are calculated using linear interpolation. For a sorted dataset of n values indexed from 0 to n − 1, the fractional index of the quantile at proportion p (for example, p = 0.25 for Q1) is:

i = p × (n − 1)

If i is not an integer, interpolate between x_{floor(i)} and x_{ceil(i)}:

Q = x_{floor(i)} + (i − floor(i)) × (x_{ceil(i)} − x_{floor(i)})
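
The interpolation rule can be sketched in a few lines of Python. This is an illustration under the assumption of 0-based indexing into an ascending list; the name `percentile_inclusive` is hypothetical and is not taken from the tool's source.

```python
import math


def percentile_inclusive(sorted_values, p):
    """Quantile at proportion p (0 <= p <= 1) of an ascending, non-empty list.

    Uses the fractional index i = p * (n - 1) and linear interpolation
    between the neighbouring sorted values.
    """
    n = len(sorted_values)
    if n == 0:
        raise ValueError("need at least one value")
    i = p * (n - 1)
    lo = math.floor(i)
    hi = math.ceil(i)
    # When i is an integer, lo == hi and the interpolation term is zero.
    return sorted_values[lo] + (i - lo) * (sorted_values[hi] - sorted_values[lo])


data = sorted([7, 1, 3, 9, 5, 11])        # [1, 3, 5, 7, 9, 11]
q1 = percentile_inclusive(data, 0.25)     # i = 1.25 -> 3 + 0.25 * (5 - 3) = 3.5
q2 = percentile_inclusive(data, 0.50)     # i = 2.5  -> 5 + 0.5  * (7 - 5) = 6.0
q3 = percentile_inclusive(data, 0.75)     # i = 3.75 -> 7 + 0.75 * (9 - 7) = 8.5
print(q1, q2, q3)
```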

Sample standard deviation uses Bessel's correction:

$$ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} $$

Where Q1 = 25th percentile, Q3 = 75th percentile, LF = lower fence, UF = upper fence, n = sample size, x̄ = sample mean.
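
The sample standard deviation with Bessel's correction can be sketched as follows. The function name `sample_std` is illustrative; the only assumption beyond the formula is that at least two observations are required, since the divisor is n − 1.

```python
import math


def sample_std(values):
    """Sample standard deviation: sqrt(sum((x - mean)^2) / (n - 1))."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values for the n - 1 divisor")
    mean = sum(values) / n
    squared_dev = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_dev / (n - 1))


# Mean is 5 and the squared deviations sum to 32, so s = sqrt(32 / 7).
s = sample_std([2, 4, 4, 4, 5, 5, 7, 9])
print(round(s, 4))
```

Dividing by n − 1 rather than n makes the result slightly larger than the population formula, which matters most for small samples.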

Reference Data

| Statistic | Symbol | Description | Sensitivity to Outliers |
| --- | --- | --- | --- |
| Minimum | x_min | Smallest observed value | Extremely sensitive |
| First Quartile | Q1 | 25th percentile; splits lowest 25% of data | Resistant |
| Median | Q2 | 50th percentile; central value | Resistant |
| Third Quartile | Q3 | 75th percentile; splits lowest 75% of data | Resistant |
| Maximum | x_max | Largest observed value | Extremely sensitive |
| Interquartile Range | IQR | Q3 − Q1 | Resistant |
| Lower Fence | LF | Q1 − 1.5 × IQR | Resistant |
| Upper Fence | UF | Q3 + 1.5 × IQR | Resistant |
| Mean | x̄ | Arithmetic average of all values | Highly sensitive |
| Standard Deviation | s | Square root of sample variance | Highly sensitive |
| Skewness | g1 | Measure of distribution asymmetry | Sensitive |
| Kurtosis | g2 | Measure of tail heaviness (excess) | Sensitive |
| Range | R | x_max − x_min | Extremely sensitive |
| Lower Whisker | - | Smallest datum ≥ LF | Resistant |
| Upper Whisker | - | Largest datum ≤ UF | Resistant |
| Mild Outlier | - | Beyond 1.5 × IQR from box | - |
| Extreme Outlier | - | Beyond 3.0 × IQR from box | - |

Frequently Asked Questions

This tool uses the inclusive (interpolation) method, which computes the index as i = p × (n − 1) and linearly interpolates between adjacent sorted values. This matches Excel's PERCENTILE.INC function and Hyndman-Fan Type 7, the default in R and NumPy. Results may differ slightly from tools that use the exclusive method (Excel's PERCENTILE.EXC, Hyndman-Fan Type 6).
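
To see how much the choice of method can matter, the sketch below compares the two conventions using Python's standard `statistics.quantiles`, whose `method="inclusive"` option uses the same i = p × (n − 1) index. The sample data is made up for illustration.

```python
import statistics

data = [1, 3, 5, 7, 9, 11]

# Inclusive method: index i = p * (n - 1), as described above.
inclusive = statistics.quantiles(data, n=4, method="inclusive")

# Exclusive method: positions based on p * (n + 1) instead.
exclusive = statistics.quantiles(data, n=4, method="exclusive")

print(inclusive)  # [3.5, 6.0, 8.5]
print(exclusive)  # [2.5, 6.0, 9.5]
```

The medians agree, but the exclusive quartiles sit further from the center, so the IQR and the fences widen. A point flagged as an outlier under one method may fall inside the fences under the other.
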
A mild outlier falls between 1.5 × IQR and 3.0 × IQR beyond the quartile boundary. An extreme outlier exceeds 3.0 × IQR. On the plot, mild outliers appear as open circles and extreme outliers as filled red circles. Extreme outliers often indicate measurement errors or genuinely rare events and warrant investigation before removal.
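
The mild/extreme split can be sketched as a small classifier. The name `classify_outliers` is hypothetical, and the treatment of a point lying exactly on the 3.0 × IQR boundary (counted as mild here) is an assumption, since the text only says extreme outliers "exceed" it.

```python
def classify_outliers(values, q1, q3):
    """Split values into (mild, extreme) outliers using 1.5x and 3.0x IQR fences."""
    iqr = q3 - q1
    inner_lo, inner_hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_lo, outer_hi = q1 - 3.0 * iqr, q3 + 3.0 * iqr

    mild, extreme = [], []
    for x in values:
        if x < outer_lo or x > outer_hi:
            extreme.append(x)      # more than 3.0 * IQR beyond the box
        elif x < inner_lo or x > inner_hi:
            mild.append(x)         # beyond 1.5 * IQR but not beyond 3.0 * IQR
    return mild, extreme


# With Q1 = 4 and Q3 = 10: inner fences are -5 and 19, outer fences are -14 and 28.
mild, extreme = classify_outliers([3, 8, 20, 30, -6, -20], 4.0, 10.0)
print(mild)     # [20, -6]
print(extreme)  # [30, -20]
```
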
At least 4 data points are recommended. With fewer, Q1, Q2, and Q3 are interpolated from only two or three values, and the box plot loses interpretive value. For reliable inference, most statisticians recommend at least 20 to 30 observations.
If the median line sits closer to Q1 with a longer upper whisker, the distribution is right-skewed (positive skewness). If the median sits closer to Q3 with a longer lower whisker, it is left-skewed. A symmetric distribution shows roughly equal whisker lengths and the median centered in the box. The computed skewness coefficient g1 quantifies this: values near 0 indicate symmetry, positive values indicate right skew, and negative values indicate left skew.
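
The page does not give the formula behind g1. The sketch below assumes the conventional moment coefficient of skewness, g1 = m3 / m2^(3/2), where m2 and m3 are the second and third central moments; the tool may use a different (for example, bias-adjusted) variant.

```python
def skewness_g1(values):
    """Moment coefficient of skewness, m3 / m2**1.5 (assumed definition of g1)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n   # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5


print(skewness_g1([1, 2, 3, 4, 5]))        # symmetric data: 0.0
print(skewness_g1([1, 2, 3, 4, 20]) > 0)   # long right tail: True
```
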
Multiple datasets are supported: you can enter up to 5, each in its own input area. The tool renders all box plots on a shared scale so you can visually compare medians, spreads, and outlier patterns across groups. Side-by-side comparison is a primary use of box plots, for example in ANOVA-style group comparisons and quality control (e.g., comparing batch measurements).
Whiskers extend only to the most extreme data point that falls within the fences (Q1 − 1.5 × IQR and Q3 + 1.5 × IQR). Any data points beyond these fences are plotted individually as outliers. This is Tukey's convention and prevents outliers from compressing the box into an unreadable sliver.
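
A minimal sketch of the whisker rule, assuming the quartiles have already been computed: the whisker ends are the smallest and largest data points that lie inside the fences. The function name `whisker_ends` is illustrative.

```python
def whisker_ends(values, q1, q3):
    """Return (lower_whisker, upper_whisker): the extreme data points within the fences."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [x for x in values if lower_fence <= x <= upper_fence]
    return min(inside), max(inside)


# Q1 = 4 and Q3 = 10 give fences at -5 and 19, so 40 is drawn as a separate
# outlier and the upper whisker stops at 11 rather than stretching to 40.
data = [1, 3, 5, 7, 9, 11, 40]
print(whisker_ends(data, 4.0, 10.0))  # (1, 11)
```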