About

Comparing tab-separated value files manually risks missing critical data discrepancies. A single mismatched cell in a dataset of thousands can cascade into flawed analytics, broken database imports, or silent data corruption. This tool parses both TSV inputs, builds an indexed lookup by a user-defined key column (or by full-row identity), and performs a diff in O(n + m) time, where n and m are the row counts of the two files. It reports rows that exist only in File A (removed), only in File B (added), or in both but with cell-level differences (modified). The comparison is deterministic and handles edge cases including empty cells, trailing delimiters, and mixed line endings (CRLF / LF).
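
The indexed lookup described above can be sketched in a few lines. This is a minimal illustration and not the tool's actual source: the function name diff_by_key and the sample rows are hypothetical, and rows are assumed to be already parsed into lists of cell strings.

```python
def diff_by_key(rows_a, rows_b, key_col=0):
    """Classify rows as added, removed, or modified using a key column."""
    # One pass over each file builds the lookups: O(n + m) overall.
    index_a = {row[key_col]: row for row in rows_a}
    index_b = {row[key_col]: row for row in rows_b}

    removed = [k for k in index_a if k not in index_b]
    added = [k for k in index_b if k not in index_a]
    # Cells are compared as trimmed strings, so "Ada " equals "Ada".
    modified = [
        k for k in index_a
        if k in index_b
        and [c.strip() for c in index_a[k]] != [c.strip() for c in index_b[k]]
    ]
    return {"added": added, "removed": removed, "modified": modified}


# Hypothetical sample data: File B is reordered, edits row 2,
# drops row 3, and adds row 4.
rows_a = [["id", "name"], ["1", "Ada"], ["2", "Bob"], ["3", "Cy"]]
rows_b = [["id", "name"], ["2", "Bobby"], ["1", "Ada "], ["4", "Dee"]]

result = diff_by_key(rows_a, rows_b)
print(result)
```

Because rows are paired by key rather than by position, the reordering in the second file does not produce false differences.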

The diff engine does not use fuzzy matching. Equality is strict: cell values are compared as trimmed strings. If your TSV contains numeric fields where 1.0 and 1 must be treated as equal, normalize your data beforehand. All processing runs client-side. No data leaves your browser.
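
One way to normalize numeric fields before uploading is sketched below. It is a pre-processing idea under stated assumptions, not part of the tool: the helper names normalize_cell and normalize_row are hypothetical, and you are assumed to know which columns hold numbers.

```python
from decimal import Decimal, InvalidOperation


def normalize_cell(cell):
    """Return a canonical string for numeric cells; otherwise the trimmed text."""
    text = cell.strip()
    try:
        number = Decimal(text)
    except InvalidOperation:
        return text  # not a number, keep as-is
    if not number.is_finite():
        return text  # leave values such as "NaN" or "Infinity" untouched
    # normalize() drops trailing zeros; the "f" format avoids exponent notation.
    return format(number.normalize(), "f")


def normalize_row(row, numeric_cols):
    """Normalize only the columns listed in numeric_cols (a set of indexes)."""
    return [
        normalize_cell(cell) if i in numeric_cols else cell.strip()
        for i, cell in enumerate(row)
    ]


print(normalize_row(["007", "2.50", "1.0"], {1, 2}))
```

Restricting normalization to known numeric columns matters: an identifier such as "007" would otherwise lose its leading zeros and become "7".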

Formulas

The diff algorithm operates in two phases. First, each row is indexed:

key(row) = row[k] (key column mode)
key(row) = hash(row[0] + "\t" + row[1] + …) (full row mode)

The hash function used is djb2:

h_0 = 5381
h_{i+1} = ((h_i << 5) + h_i) + charCode(s[i])

Here k is the key column index, row is the array of cell strings, s is the concatenated row string, and h is the running hash value. In the second phase, the two indexes are compared. Comparison complexity is O(n + m), where n and m are the row counts of File A and File B respectively.
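
The indexing step can be written out as follows. This is a sketch under assumptions: the source does not state the hash width, so truncating to 32 bits is a guess, and Python's ord stands in for charCode (the two agree for characters in the Basic Multilingual Plane). The name row_key is hypothetical.

```python
def djb2(s):
    """djb2 hash of a string, truncated to 32 bits (the width is an assumption)."""
    h = 5381
    for ch in s:
        # (h << 5) + h is h * 33
        h = (((h << 5) + h) + ord(ch)) & 0xFFFFFFFF
    return h


def row_key(row, key_col=None):
    """Key column mode if key_col is given, otherwise full row mode."""
    if key_col is not None:
        return row[key_col]
    return djb2("\t".join(row))


print(djb2("a"))                              # 5381 * 33 + 97 = 177670
print(row_key(["USR-042", "Bob"], key_col=0))  # USR-042
print(row_key(["USR-042", "Bob"]))             # hash of the joined row
```

A fixed-width hash can in principle map two different rows to the same value, so an implementation that matches on hashes alone may want to confirm equality on the row text as well.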

Reference Data

Diff Status | Symbol | Meaning | Color Code
Added | + | Row exists only in File B | Green (#82B366)
Removed | | Row exists only in File A | Coral (#E07A6B)
Modified | Δ | Row key matches but cell values differ | Amber (#F0C05A)
Unchanged | = | Row is identical in both files | None (default)

Common TSV Edge Cases

Edge Case | Input | Behavior
Empty cell | \t\t | Two consecutive tabs produce an empty string cell
Trailing tab | data\t | Produces an extra empty cell at row end
CRLF line ending | \r\n | Windows-style; tool normalizes to \n
LF line ending | \n | Unix/macOS-style
Quoted field | "val\tval" | Not standard TSV; tool treats tab as delimiter

Comparison Modes

Mode | Behavior
Key Column | Uses a specific column as row identifier for matching
Full Row | Hashes entire row; identical rows match regardless of order
Row Index | Compares row-by-row by position (line 1 vs line 1)
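
The edge cases listed above can be reproduced with a small parser. This is an illustrative sketch, not the tool's parser: parse_tsv is a hypothetical name, and dropping the empty string left by a final newline is an assumption.

```python
def parse_tsv(text):
    """Split TSV text into rows of cells.

    CRLF is normalized to LF, every tab is a delimiter (quotes are not
    special), and consecutive or trailing tabs yield empty-string cells.
    """
    lines = text.replace("\r\n", "\n").split("\n")
    # A final newline leaves one empty string at the end; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return [line.split("\t") for line in lines]


print(parse_tsv("a\t\tb\n"))        # empty cell between two tabs
print(parse_tsv("data\t\r\n"))      # trailing tab, CRLF ending
print(parse_tsv('"val\tval"\n'))    # quotes do not protect the tab
```

Mixed line endings parse to the same rows, which is what keeps the comparison independent of the platform a file was saved on.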

Frequently Asked Questions

Key column mode uses a single column (e.g., an ID field) to pair rows between files. If row 5 in File A has key "USR-042" and row 12 in File B has the same key, they are compared cell-by-cell. Full-row mode hashes the entire row content and matches identical hashes. Use key column mode when rows may be reordered between files. Use full-row mode when there is no unique identifier.
If multiple rows share the same key value within a single file, only the last occurrence is indexed. Earlier duplicates are effectively invisible to the diff. If your data has non-unique keys, consider using Row Index mode or deduplicating beforehand.
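
To check for non-unique keys before comparing, something like the following may help. It is a hypothetical pre-check you would run yourself, not a feature of the tool, and the sample rows are invented for illustration.

```python
from collections import Counter


def duplicate_keys(rows, key_col=0):
    """Return key values that appear more than once, with their counts."""
    counts = Counter(row[key_col] for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


rows = [["USR-001", "Ada"], ["USR-042", "Bob"], ["USR-042", "Robert"]]
dupes = duplicate_keys(rows)
print(dupes)

# Building a dict index keeps only the last row per key, which mirrors
# the "last occurrence is indexed" behavior described above.
index = {row[0]: row for row in rows}
print(index["USR-042"])
```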
Column order matters. Columns are compared positionally. If File A has columns [Name, Age] and File B has [Age, Name], every cell in those columns will appear modified. Ensure both files share the same column structure. The tool displays detected headers to help verify alignment.
Rows with different column counts can still be compared. If a row in File A has 5 columns and the matching row in File B has 7 columns, the extra columns in File B are flagged as modified (added cells). Missing columns are treated as empty strings for comparison purposes.
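
Positional comparison with padding can be sketched as below. The function name changed_columns is hypothetical, and the sketch takes "missing columns are treated as empty strings" literally, so an extra cell that is itself empty would not be reported here; the tool may behave differently in that case.

```python
from itertools import zip_longest


def changed_columns(row_a, row_b):
    """Return indexes of columns whose trimmed values differ."""
    # The shorter row is padded with empty strings.
    pairs = zip_longest(row_a, row_b, fillvalue="")
    return [i for i, (a, b) in enumerate(pairs) if a.strip() != b.strip()]


print(changed_columns(["1", "Ada", "x"], ["1", "Ada", "x", "extra"]))
print(changed_columns(["Ada", "36"], ["36", "Ada"]))  # swapped column order
```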
Processing is entirely in-browser. Practical limits depend on available RAM. Files under 50,000 rows (~10 MB) process near-instantly. Larger files trigger chunked processing with a progress indicator. Above 200,000 rows, expect several seconds of processing time.
All comparisons are string-based after trimming whitespace. The values "1.00" and "1" are considered different. If you need numeric equivalence, pre-process your TSV to normalize number formatting before comparing.