About

TSV (Tab-Separated Values) files encode tabular data using the horizontal tab character (U+0009) as a field delimiter and a newline (U+000A) as a record terminator. Miscounting entries in a TSV file leads to silent data truncation during database imports, broken ETL pipelines, and corrupted analytics reports. A stray trailing newline inflates the row count by 1; an inconsistent column count across rows signals corrupted records that most naive parsers ignore. This tool parses raw TSV input, counts total rows, non-empty rows, columns, and individual cells, and flags structural anomalies such as ragged rows where the column count deviates from the header. It assumes the first non-empty row defines the column schema. No data leaves your browser.

Formulas

TSV entry counting relies on deterministic string splitting. The total line count is derived from splitting the input on the normalized newline character:

totalLines = split(input, \n).length

Non-empty rows are filtered by a non-whitespace test:

nonEmpty = totalLines - emptyRows

The column count is extracted from the header (first non-empty line):

cols = split(headerRow, \t).length

Total cell count across all non-empty rows:

totalCells = Σ_{i=1}^{N} split(row_i, \t).length

A row is classified as ragged when its field count deviates from the header column count:

isRagged(row) = split(row, \t).length ≠ cols

Where N = number of non-empty rows, cols = header-derived column count, \t = horizontal tab character (U+0009).
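
The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual source (which runs in the browser): the function name `tsv_stats` and the dictionary keys are hypothetical, and the input is assumed to be already normalized to LF line endings.

```python
def tsv_stats(text: str) -> dict:
    """Apply the counting formulas to LF-normalized TSV text (illustrative sketch)."""
    lines = text.split("\n")                        # totalLines source
    non_empty = [ln for ln in lines if ln.strip()]  # non-whitespace test
    # Header = first non-empty line; 0 columns if the input has no content.
    cols = len(non_empty[0].split("\t")) if non_empty else 0
    field_counts = [len(ln.split("\t")) for ln in non_empty]
    return {
        "total_lines": len(lines),
        "empty_rows": len(lines) - len(non_empty),
        "non_empty": len(non_empty),
        "cols": cols,
        "total_cells": sum(field_counts),
        "ragged_rows": sum(1 for n in field_counts if n != cols),
    }


# Three content lines plus a trailing newline; the last content line is ragged.
print(tsv_stats("id\tname\n1\tAda\n2\n"))
```

Because the total cell count is summed per row rather than multiplied out, it stays accurate even when some rows are ragged.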

Reference Data

| Metric | Description | Typical Range |
| --- | --- | --- |
| Total Lines | All lines including empty trailing lines | 1 - 10^6 |
| Non-Empty Rows | Lines with at least one non-whitespace character | 1 - 10^6 |
| Empty Rows | Lines containing only whitespace or nothing | 0 - 100 |
| Data Rows | Non-empty rows excluding the header row | 0 - 10^6 |
| Columns (from header) | Tab-delimited fields in the first non-empty row | 1 - 500 |
| Total Cells | Sum of field counts over all non-empty rows (equals columns × non-empty rows when no row is ragged) | 1 - 10^8 |
| Filled Cells | Cells containing at least one non-whitespace character | Varies |
| Empty Cells | Cells that are blank or whitespace-only | Varies |
| Ragged Rows | Rows whose column count ≠ header column count | 0 (ideal) |
| Duplicate Rows | Rows with identical content to a previous row | Varies |
| Max Row Length | Highest number of fields in any single row | 1 - 1000 |
| Min Row Length | Lowest number of fields in any non-empty row | 1 - 1000 |
| Delimiter | TSV uses horizontal tab U+0009 | Fixed |
| Line Ending | LF (\n), CR+LF (\r\n), or CR (\r) | Platform-dependent |
| File Size Limit (browser) | Practical limit for in-memory text processing | < 100 MB |
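
As a rough illustration of the cell-level metrics in the table, the sketch below derives data rows, filled and empty cells, and the minimum and maximum row lengths from a list of non-empty rows. The function name `cell_metrics` and its return keys are hypothetical, chosen only for this example; the tool's own implementation may differ.

```python
def cell_metrics(rows: list[str]) -> dict:
    """rows: the non-empty lines of a TSV file, header first (illustrative sketch)."""
    cells = [cell for row in rows for cell in row.split("\t")]
    # A cell is "filled" if it has at least one non-whitespace character.
    filled = sum(1 for cell in cells if cell.strip())
    lengths = [len(row.split("\t")) for row in rows]
    return {
        "data_rows": max(len(rows) - 1, 0),  # non-empty rows minus the header
        "filled_cells": filled,
        "empty_cells": len(cells) - filled,
        "max_row_length": max(lengths, default=0),
        "min_row_length": min(lengths, default=0),
    }


rows = ["id\tname\tcity", "1\tAda\t", "2\t \tParis\textra"]
print(cell_metrics(rows))
```

In this sample the second row ends with an empty cell and the third row has a whitespace-only cell plus one extra field, so the minimum and maximum row lengths differ.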

Frequently Asked Questions

The parser normalizes line endings (CR, LF, CR+LF) to LF, then splits on LF. Trailing empty strings produced by a final newline are counted as empty rows and excluded from the non-empty row metric. The "Total Lines" metric includes them for transparency, so you can spot the discrepancy.
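
A minimal Python sketch of this normalization step, assuming a single regular expression is used to fold CR+LF and lone CR into LF (the helper name `normalize_and_split` is hypothetical, and the tool's own code may do this differently):

```python
import re


def normalize_and_split(text: str) -> list[str]:
    """Convert CR+LF and lone CR to LF, then split on LF."""
    return re.sub(r"\r\n?", "\n", text).split("\n")


# Mixed line endings plus a final newline.
lines = normalize_and_split("a\tb\r\n1\t2\r3\t4\n")
non_empty = [ln for ln in lines if ln.strip()]

print(len(lines))      # total lines, including the trailing empty string
print(len(non_empty))  # non-empty rows only
```

Here the final newline produces one trailing empty string, so the total line count is one higher than the non-empty row count.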
The tool reports the number of ragged rows - rows where the tab-delimited field count does not equal the header row's field count. It also shows the minimum and maximum row lengths across all non-empty rows so you can identify structural inconsistencies before importing into a database.
The first non-empty row is always treated as the header. The "Data Rows" metric equals non-empty rows minus 1 (the header). If your TSV has no header, the reported "Data Rows" value is one less than the true record count, so add 1 to it manually.
TSV, unlike CSV, does not define a quoting mechanism in its specification (IANA text/tab-separated-values). Fields containing literal tabs or newlines violate the TSV format. This tool splits strictly on tab and newline characters. If your data uses CSV-style quoting, convert it to properly escaped TSV first.
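
To see why CSV-style quoting matters, the following Python sketch compares a strict tab split with a quote-aware parse using the standard library's `csv` module. The sample row is made up for illustration; the strict split mirrors the behavior described above, while the `csv.reader` call only shows one way to detect that a file relies on quoting.

```python
import csv

# A row whose second field is CSV-quoted and contains a literal tab.
row = 'id\t"note with\ttab"\tcity'

# Strict TSV splitting: every tab is a delimiter, quotes are ordinary characters.
strict_fields = row.split("\t")

# Quote-aware parsing: the quoted tab stays inside the field.
quoted_fields = next(csv.reader([row], delimiter="\t"))

print(len(strict_fields))  # 4
print(len(quoted_fields))  # 3
```

If the two counts differ for any row, the file depends on quoting and a strict tab split will report that row as ragged.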
Processing occurs entirely in your browser's memory. Files under 50 MB parse in under 2 seconds on modern hardware. Files between 50 and 100 MB may cause brief UI pauses. Files larger than 100 MB risk exceeding browser memory limits. For very large datasets, consider command-line tools like wc or awk.
Each non-empty row's raw string (untrimmed) is added to a Set. If a row string has been seen before, it is counted as a duplicate. Whitespace differences (e.g., trailing spaces within a cell) make rows distinct. The duplicate count excludes the first occurrence: it counts only the extra copies.
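
The duplicate rule can be sketched in Python with a built-in `set`. The function name `count_duplicates` is hypothetical, and the sketch assumes the rows passed in are already the non-empty lines of the file.

```python
def count_duplicates(rows: list[str]) -> int:
    """Count rows whose exact, untrimmed string has been seen before."""
    seen: set[str] = set()
    duplicates = 0
    for row in rows:
        if row in seen:
            duplicates += 1  # an extra copy; the first occurrence is not counted
        else:
            seen.add(row)
    return duplicates


rows = ["a\tb", "1\t2", "1\t2", "1\t2 ", "1\t2"]
print(count_duplicates(rows))  # 2
```

The fourth row differs from the others only by a trailing space, so it counts as a distinct row rather than a duplicate.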