
About

In information theory, Shannon Entropy measures the unpredictability or "information content" of a message source. It quantifies the theoretical minimum average number of bits per symbol required to encode a string without losing data. A string with identical characters (e.g., "AAAAA") has an entropy of 0, while a truly random string approaches maximum entropy.

This tool is widely used by computer scientists to analyze password strength (randomness), test random number generators, and estimate the compressibility of a dataset. It provides a breakdown of character frequency (p(x)) and compares the actual byte size against the theoretical ideal size.
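
For instance (a minimal sketch, not this tool's actual code; the analyze function and its output format are assumptions for illustration), the frequency breakdown and size comparison could be computed like this in Python:

```python
import math
from collections import Counter

def analyze(text: str) -> None:
    """Print character frequencies p(x), Shannon entropy, and a size comparison."""
    counts = Counter(text)
    n = len(text)

    # p(x): frequency count of each character divided by the string length
    probabilities = {ch: c / n for ch, c in counts.items()}

    # H(X) = -sum of p(x) * log2(p(x)) over all distinct characters, in bits/symbol
    entropy = -sum(p * math.log2(p) for p in probabilities.values())

    actual_bytes = len(text.encode("utf-8"))   # on-disk size, assuming UTF-8
    ideal_bytes = entropy * n / 8              # theoretical minimum, bits -> bytes

    for ch, p in sorted(probabilities.items()):
        print(f"p({ch!r}) = {p:.3f}")
    print(f"Entropy: {entropy:.3f} bits/symbol")
    print(f"Actual: {actual_bytes} bytes | Theoretical ideal: {ideal_bytes:.1f} bytes")

analyze("Hello, World!")
```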

Tags: information-theory, cryptography, compression, entropy, data-science

Formulas

Shannon Entropy H is calculated as the negative sum, over each distinct character, of its probability multiplied by the base-2 logarithm of that probability:

H(X) = −Σᵢ₌₁ⁿ p(xᵢ) · log₂ p(xᵢ)

Where p(xᵢ) is the frequency count of character xᵢ divided by the string length, and n is the number of distinct characters.
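
For example, for the string "AAB": p(A) = 2/3 and p(B) = 1/3, so H = −(2/3 · log₂(2/3) + 1/3 · log₂(1/3)) ≈ 0.918 bits per symbol.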

Reference Data

Scenario                | Entropy (bits/symbol) | Description
Constant String         | 0                     | "aaaaa". Complete certainty.
English Text            | 2.3 - 4.5             | Natural language is redundant.
Base64 / Random         | 5.8 - 6.0             | High density, used in encryption.
Binary Random           | 1.0                   | If symbols are only 0 and 1 with equal probability.
Max Entropy (N symbols) | log2(N)               | Uniform distribution of all unique chars.
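
These reference values can be sanity-checked in a few lines of Python (a standalone sketch; the entropy helper below is illustrative and not part of this tool):

```python
import math
import random
from collections import Counter

def entropy(s: str) -> float:
    """Frequency-based Shannon entropy of a string, in bits per symbol."""
    n = len(s)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(s).values())

print(entropy("aaaaa"))                                              # 0.0  (constant string)
print(entropy("".join(random.choice("01") for _ in range(10_000))))  # ~1.0 (fair binary source)
print(entropy("abcdefgh"), math.log2(8))                             # 3.0 3.0 -> max entropy = log2(N)
```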

Frequently Asked Questions

What does a "bit" mean in this context?
Here, a bit represents a single binary decision. An entropy of 4.5 bits means that, on average, you need about 4.5 "yes/no" questions to guess the next character in the sequence.

Why does English text have relatively low entropy?
English has high redundancy. The letter "e" is very common, "z" is rare, and "q" is almost always followed by "u". This predictability lowers the information content (entropy), which is why text files compress so well.
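
To see that redundancy in action, here is a quick, illustrative comparison using Python's zlib (the byte counts in the comments are approximate): repetitive text shrinks dramatically, while near-random bytes barely compress at all.

```python
import os
import zlib

# Illustrative comparison: redundant data compresses well, random data does not.
redundant = b"the quick brown fox jumps over the lazy dog " * 25   # repetitive "text"
random_bytes = os.urandom(len(redundant))                          # high-entropy data

print(len(redundant), "->", len(zlib.compress(redundant)))         # 1100 -> roughly 60-80 bytes
print(len(random_bytes), "->", len(zlib.compress(random_bytes)))   # roughly unchanged, sometimes slightly larger
```
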
How does entropy relate to password strength?
High entropy implies high unpredictability. A password like "Password123" has low entropy (predictable patterns), while a password like "9#xK!2" has higher entropy per character, making it harder to brute-force.
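
As a rough, illustrative back-of-the-envelope estimate (this uses the uniform-pool model, log2 of the symbol-pool size per character, which is a different measure from the frequency-based entropy this tool reports; the pool size and length below are assumptions):

```python
import math

# Uniform-pool estimate (illustrative assumption, not this tool's calculation):
# a password drawn uniformly at random from a pool of symbols carries
# log2(pool_size) bits per character.
POOL_SIZE = 94          # printable ASCII characters (assumed pool)
LENGTH = 6              # password length (assumed)

bits = LENGTH * math.log2(POOL_SIZE)
print(f"{bits:.1f} bits")                      # ~39.3 bits
print(f"~2**{round(bits)} candidate guesses")  # brute force must cover ~2**39 possibilities
```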