Compare Two TSV Files
Compare two TSV files side by side. Detect added, removed, and modified rows with cell-level highlighting. Export diff results instantly.
About
Comparing tab-separated value files by hand risks missing critical data discrepancies. A single mismatched cell in a dataset of thousands of rows can cascade into flawed analytics, broken database imports, or silent data corruption. This tool parses both TSV inputs, builds a lookup index keyed on a user-defined key column (or on full-row identity), and performs the diff in O(n + m) time, where n and m are the row counts of File A and File B. It reports rows that exist only in File A (removed), only in File B (added), or in both but with cell-level differences (modified). The comparison is deterministic and handles edge cases including empty cells, trailing delimiters, and mixed line endings (CRLF / LF).
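The index-then-compare approach described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the tool's actual source: the names `index_rows`, `cell_at`, and `diff_rows` are hypothetical, and the handling of duplicate keys (last row wins) is an assumption the page does not specify.

```python
from typing import Dict, Hashable, List, Optional

Row = List[str]


def index_rows(rows: List[Row], key_col: Optional[int]) -> Dict[Hashable, Row]:
    """Index trimmed rows by one cell (key column) or by the whole row."""
    index: Dict[Hashable, Row] = {}
    for row in rows:
        cells = [cell.strip() for cell in row]
        key = cells[key_col] if key_col is not None else tuple(cells)
        index[key] = cells  # assumption: on duplicate keys the last row wins
    return index


def cell_at(row: Row, i: int) -> Optional[str]:
    """Return the cell at position i, or None when the row is shorter."""
    return row[i] if i < len(row) else None


def diff_rows(a: List[Row], b: List[Row], key_col: Optional[int] = 0) -> dict:
    """Classify rows as removed, added, modified, or unchanged."""
    index_a = index_rows(a, key_col)
    index_b = index_rows(b, key_col)
    result = {"removed": [], "added": [], "modified": [], "unchanged": []}
    for key, row_a in index_a.items():  # one pass over File A
        row_b = index_b.get(key)
        if row_b is None:
            result["removed"].append(row_a)
        elif row_a == row_b:
            result["unchanged"].append(row_a)
        else:
            width = max(len(row_a), len(row_b))
            changed = [i for i in range(width)
                       if cell_at(row_a, i) != cell_at(row_b, i)]
            result["modified"].append((row_a, row_b, changed))
    for key, row_b in index_b.items():  # one pass over File B
        if key not in index_a:
            result["added"].append(row_b)
    return result
```

Each file is walked once and every lookup is a dictionary access, which is where the linear O(n + m) cost comes from. Passing `key_col=None` switches the sketch to full-row identity, in which case a row can only be added, removed, or unchanged.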
The diff engine does not use fuzzy matching. Equality is strict: cell values are compared as trimmed strings. If your TSV contains numeric fields where 1.0 and 1 must be treated as equal, normalize your data beforehand. All processing runs client-side. No data leaves your browser.
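If numeric fields need to compare equal across formats, one possible pre-processing step is to canonicalize each cell before pasting the data into the tool. The sketch below is an assumption about what such normalization might look like, using Python's standard `decimal` module; `normalize_cell` is a hypothetical helper and is not part of the tool.

```python
from decimal import Decimal, InvalidOperation


def normalize_cell(value: str) -> str:
    """Return a canonical form for numeric cells; leave other text trimmed."""
    text = value.strip()
    try:
        number = Decimal(text)
    except InvalidOperation:
        return text  # not a number: keep the trimmed string as-is
    if not number.is_finite():
        return text  # leave NaN / Infinity spellings untouched
    # normalize() drops trailing zeros; "f" avoids exponent notation
    return format(number.normalize(), "f")


print(normalize_cell("1.0") == normalize_cell("1"))
```

Applying the same function to every cell of both files before comparison keeps the strict string equality of the diff while treating 1.0 and 1 as the same value.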
Formulas
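The diff algorithm operates in two phases. First, each row is indexed by its key:
key = row[k] (Key Column mode), or key = hash(s) (Full Row mode)
Second, the two indexes are compared key by key to classify each row as added, removed, modified, or unchanged.
The hash function used is djb2:
h = 5381 initially; then h = h × 33 + c for each character of s
Where k = key column index, row = array of cell strings, s = concatenated row string, h = running hash value, c = character code of the current character. Comparison complexity is O(n + m), where n and m are the row counts of File A and File B respectively.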
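As a rough illustration of the indexing step, the following Python sketch computes a row key using the classic djb2 recurrence (start at 5381, multiply by 33, add each character code). The 32-bit mask, the tab used to join cells into s, and the helper name `row_key` are assumptions made for this example; the page does not state how the tool concatenates rows or bounds the hash.

```python
from typing import List, Optional, Union


def djb2(s: str) -> int:
    """Classic djb2 string hash: h = h * 33 + c, starting from 5381."""
    h = 5381
    for ch in s:
        # assumption: keep the running value within unsigned 32 bits
        h = (h * 33 + ord(ch)) & 0xFFFFFFFF
    return h


def row_key(row: List[str], k: Optional[int] = None) -> Union[str, int]:
    """Key for a row: the cell at column k, or a hash of the whole row."""
    cells = [cell.strip() for cell in row]
    if k is not None:
        return cells[k]
    s = "\t".join(cells)  # assumption: cells are joined with a tab
    return djb2(s)


print(row_key(["42", "Ann"], k=0))
print(row_key(["42", "Ann"]))
```

Because any hash can collide, an implementation along these lines would probably also compare the underlying cells when two rows share a hashed key, rather than trusting the hash alone.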
Reference Data
| Diff Status | Symbol | Meaning | Color Code |
|---|---|---|---|
| Added | + | Row exists only in File B | Green (#82B366) |
| Removed | − | Row exists only in File A | Coral (#E07A6B) |
| Modified | Δ | Row key matches but cell values differ | Amber (#F0C05A) |
| Unchanged | = | Row is identical in both files | None (default) |

Common TSV Edge Cases
| Edge Case | Sequence | Behavior |
|---|---|---|
| Empty cell | \t\t | Two consecutive tabs produce an empty string cell |
| Trailing tab | data\t | Produces an extra empty cell at row end |
| CRLF line ending | \r\n | Windows-style; tool normalizes to \n |
| LF line ending | \n | Unix/macOS-style |
| Quoted field | "val\tval" | Not standard TSV; tool treats tab as delimiter |

Comparison Modes
| Mode | Behavior |
|---|---|
| Key Column | Uses a specific column as row identifier for matching |
| Full Row | Hashes entire row; identical rows match regardless of order |
| Row Index | Compares row-by-row by position (line 1 vs line 1) |
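To make the edge cases and the Row Index mode above concrete, here is a minimal Python sketch of a parser that follows the listed behaviors, paired with a positional comparison. The function names `parse_tsv` and `compare_by_index` are hypothetical, and dropping the empty entry left by a final newline is an assumption rather than documented behavior.

```python
from itertools import zip_longest
from typing import List, Tuple


def parse_tsv(text: str) -> List[List[str]]:
    """Split TSV text into rows of cells, treating every tab as a delimiter."""
    # Normalize CRLF (and bare CR) line endings to LF
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    lines = normalized.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # assumption: a final newline does not create a row
    # str.split keeps empty strings, so "\t\t" and trailing tabs yield "" cells
    return [line.split("\t") for line in lines]


def compare_by_index(a: List[List[str]], b: List[List[str]]) -> List[Tuple[int, str]]:
    """Row Index mode: compare line 1 with line 1, line 2 with line 2, etc."""
    statuses = []
    for line_no, (row_a, row_b) in enumerate(zip_longest(a, b), start=1):
        if row_a is None:
            status = "added"      # line exists only in File B
        elif row_b is None:
            status = "removed"    # line exists only in File A
        elif [c.strip() for c in row_a] == [c.strip() for c in row_b]:
            status = "unchanged"
        else:
            status = "modified"
        statuses.append((line_no, status))
    return statuses


file_a = parse_tsv("id\tname\r\n1\tAnn\r\n2\tBob\r\n")
file_b = parse_tsv("id\tname\n1\tAnn \n2\tRob\n3\tCy\n")
print(compare_by_index(file_a, file_b))
```

Note that quotes receive no special treatment here: a cell such as "val\tval" is split at the tab, matching the Quoted field entry in the table.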