User Rating 0.0
Total Usage 0 times
Stream A
📂 Drop file or click to browse
Stream B
📂 Drop file or click to browse
0%
Page 1
Match Mismatch Overflow
Is this tool helpful?

Your feedback helps us improve.

About

Binary comparison errors propagate silently. A single flipped bit in firmware, a corrupted byte in a cryptographic key, or a transmission error in a serialized protocol buffer can cause catastrophic downstream failures without any human-readable symptom. This tool performs a strict byte-level comparison of two binary streams, reporting every offset where stream A diverges from stream B. It computes the Hamming distance dH, percentage similarity, and renders a visual heatmap so spatial patterns in corruption become immediately obvious. Input accepts raw files (up to 50 MB), hex-encoded strings, or plain UTF-8 text encoded to bytes. The tool approximates diff analysis assuming both streams share a common alignment at offset 0. It does not perform sequence alignment or fuzzy matching. For streams of unequal length, trailing bytes in the longer stream are flagged as overflow differences.

binary compare hex diff byte comparison binary diff file compare hex dump stream compare

Formulas

The primary metric is the Hamming distance computed over the overlapping region of both streams:

dH = min(nA,nB)1i=0 δ(A[i] B[i])

where δ is the Kronecker indicator returning 1 on mismatch. The total difference count includes overflow bytes:

D = dH + |nA nB|

Similarity as a percentage over the maximum stream length:

S = max(nA, nB) Dmax(nA, nB) × 100 %

where A[i] is the byte at offset i in stream A, nA is byte length of stream A, nB is byte length of stream B, D is total differences including overflow, and S is similarity percentage.

Reference Data

MetricSymbolDescriptionRange
Hamming DistancedHCount of byte positions where streams differ0 - min(nA, nB)
SimilaritySPercentage of matching bytes over max length0 - 100 %
First Diff OffsetfByte offset of the earliest mismatch0 - n1
Last Diff OffsetlByte offset of the latest mismatch0 - n1
Stream A SizenATotal bytes in stream A0 - 52428800 B
Stream B SizenBTotal bytes in stream B0 - 52428800 B
Overflow BytesΔnAbsolute size difference between streams0 - 52428800 B
Diff DensityρRatio of differing bytes to compared range0.0 - 1.0
Match CountmNumber of identical byte positions0 - n
Compared Lengthmin(nA, nB)Number of byte positions actually compared0 - 52428800
Hex Byte - Two-character representation, base-1600 - FF
ASCII Printable Range - Characters rendered in hex dump sidebar0x20 - 0x7E

Frequently Asked Questions

The comparison runs over the overlapping region from offset 0 to min(nA, nB) 1. Bytes beyond the shorter stream's length are counted as overflow differences and shown in gray on the visual map. The similarity formula uses max(nA, nB) as the denominator, so unequal lengths always reduce similarity.
Comparisons exceeding 1 MB are offloaded to a Web Worker to prevent UI freezing. A progress indicator shows completion percentage. The maximum accepted size is 50 MB per stream. Beyond that threshold the tool rejects the input to prevent browser memory exhaustion. The hex dump view is paginated at 4096 bytes per page regardless of file size.
Yes. The parser strips all whitespace, commas, and optional 0x prefixes. It then validates that only characters 0 - 9 and A - F (case-insensitive) remain. If the resulting string has odd length, a leading zero is prepended. Invalid characters trigger an error toast identifying the first illegal character and its position.
The map is drawn on an HTML Canvas element. Each pixel corresponds to one byte. Green (#6BCB77) indicates a matching byte. Red (#F87C8C) indicates a mismatch within the overlapping region. Gray (#C4C7D4) marks overflow bytes present in only one stream. The canvas width is fixed at 256 pixels, so rows wrap every 256 bytes. This layout reveals spatial corruption patterns such as block-aligned errors or periodic bit flips.
No. This tool performs fixed-offset comparison only. It does not implement sequence alignment algorithms like Needleman-Wunsch or Myers diff. A single inserted byte at offset 0 will cause every subsequent byte to appear different. For insertion/deletion detection, a dedicated binary diff tool with alignment capability is required.
Text input is encoded to bytes using the TextEncoder API, which produces UTF-8. A single ASCII character maps to 1 byte. Multi-byte characters (e.g., emoji, CJK) expand to 2 - 4 bytes. This means string length and byte length can differ significantly for non-ASCII text.