About

Comparing multiple lists manually invites errors. A missed duplicate or overlooked common entry in procurement, inventory reconciliation, or data migration can cascade into costly rework. This tool performs exact set operations - intersection, difference, and symmetric difference - across three input lists simultaneously. It computes the Jaccard similarity index |A ∩ B||A ∪ B| for each pair to quantify overlap. Results assume exact string matching after optional case normalization and whitespace trimming. Items containing only whitespace are discarded.

Formulas

The core comparison relies on standard set algebra. Given three finite sets A, B, and C, seven distinct Venn regions exist.

J(A, B) = |A ∩ B||A ∪ B|

Where J = 0 indicates disjoint sets and J = 1 indicates identical sets. The "Only in A" region is computed as A − (B ∪ C). Pairwise-exclusive overlaps use A ∩ B − C. The triple intersection is A ∩ B ∩ C. All operations run in O(n) time via hash-set lookups.

Reference Data

Operation	Symbol	Description	Example (A={1,2,3}, B={2,3,4}, C={3,4,5})	Result
Union	A ∪ B ∪ C	All items from all lists	{1,2,3} ∪ {2,3,4} ∪ {3,4,5}	{1,2,3,4,5}
Common to All	A ∩ B ∩ C	Items in every list	{1,2,3} ∩ {2,3,4} ∩ {3,4,5}	{3}
A ∩ B only	(A ∩ B) − C	Shared by A and B but not C	({1,2,3} ∩ {2,3,4}) − {3,4,5}	{2}
A ∩ C only	(A ∩ C) − B	Shared by A and C but not B	({1,2,3} ∩ {3,4,5}) − {2,3,4}	∅
B ∩ C only	(B ∩ C) − A	Shared by B and C but not A	({2,3,4} ∩ {3,4,5}) − {1,2,3}	{4}
Only in A	A − (B ∪ C)	Exclusive to A	{1,2,3} − {2,3,4,5}	{1}
Only in B	B − (A ∪ C)	Exclusive to B	{2,3,4} − {1,2,3,4,5}	∅
Only in C	C − (A ∪ B)	Exclusive to C	{3,4,5} − {1,2,3,4}	{5}
Jaccard(A,B)	\|A ∩ B\|\|A ∪ B\|	Similarity ratio 0-1	2 ÷ 4	0.50
Jaccard(A,C)	\|A ∩ C\|\|A ∪ C\|	Similarity ratio 0-1	1 ÷ 5	0.20
Jaccard(B,C)	\|B ∩ C\|\|B ∪ C\|	Similarity ratio 0-1	2 ÷ 4	0.50
Symmetric Diff	A ⊕ B ⊕ C	Items in odd number of lists	XOR across all three	{1,3,5}

Frequently Asked Questions

When case-insensitive mode is enabled, all items are normalized to lowercase before comparison. This means "Apple", "apple", and "APPLE" are treated as the same entry. The original casing of the first occurrence found is preserved in the output. For data like email addresses or SKUs where case is irrelevant, enable this option to avoid false negatives.

Each list accepts items separated by newlines (one item per line), commas, semicolons, or tab characters. You can select the delimiter in the settings. Newline is the default and most reliable for items that themselves contain commas. Mixed delimiters within a single list are not supported - pick one consistently.

By default, intra-list duplicates are removed before comparison. If List A contains "Cat" three times, it counts as one entry. The duplicate count per list is reported in the statistics panel. If you need to preserve duplicates (e.g., frequency analysis), disable the "Remove duplicates" toggle - items will then be compared as multisets where count matters.

The Jaccard index J(A,B) equals the size of the intersection divided by the size of the union of two sets. A value of 0 means no overlap; 1 means identical sets. It is useful for quantifying how similar two datasets are - for example, comparing keyword lists across competitor pages or measuring inventory overlap between warehouses. Values above 0.6 generally indicate high similarity.

Yes. All set operations use hash-based lookups with O(n) average time complexity. Lists of 100,000+ items process in under a second on modern hardware. The Venn diagram canvas may simplify label placement for very large result sets, but all numerical counts and exportable item lists remain complete.

The three circles represent Lists A, B, and C. The number in a non-overlapping segment shows items exclusive to that list. Numbers in two-circle overlaps show items shared by exactly those two lists (excluding the third). The center number shows items common to all three. Clicking any region in the results panel highlights and lists the corresponding items.

Leading and trailing whitespace is always trimmed. Internal whitespace is preserved by default, so "New York" and "NewYork" are different items. If your data has inconsistent internal spacing, pre-process it or ensure entries match exactly. Empty lines and whitespace-only entries are silently discarded.