About

Miscounting rows in a TSV file before a database import can lead to silent data loss. A trailing newline adds a phantom row. An embedded line break inside a field splits one record into two, because TSV has no quoting to protect it. This tool parses raw TSV text and reports the exact count of total lines, non-empty data rows, and columns detected in the first record. It distinguishes blank lines (matching ^\s*$) from actual data, so you know precisely what your import script will consume. The count assumes standard TSV conventions: no quoting, no escaped tabs, one record per line.

Formulas

The row count is computed by splitting the input on newline characters and applying optional filters:

R_total = split(input, \n).length

R_data = R_total − R_empty − R_header

Where R_empty counts lines matching the regular expression ^\s*$ (empty or whitespace-only lines). R_header is 1 if the user opts to exclude a header row, 0 otherwise. Column count C is derived from the first non-empty line: C = split(line₀, \t).length. A trailing newline produces one additional empty string after the final split. The tool detects and discards this phantom entry before computing R_total, which avoids off-by-one errors.
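
The formulas above can be sketched in JavaScript as follows. This is a minimal illustration, not the tool's actual source: the function name countTsvRows and the hasHeader option are hypothetical, and clamping R_data at zero is an added assumption for empty input.

```javascript
// Hypothetical helper; names and option keys are illustrative only.
function countTsvRows(input, { hasHeader = false } = {}) {
  // Normalize CRLF and lone CR so that splitting on "\n" is reliable.
  const lines = input.replace(/\r\n?/g, "\n").split("\n");

  // A trailing newline leaves one empty string at the end; drop that phantom entry.
  if (lines[lines.length - 1] === "") {
    lines.pop();
  }

  const isBlank = (line) => /^\s*$/.test(line);

  const total = lines.length;                   // R_total
  const empty = lines.filter(isBlank).length;   // R_empty
  const header = hasHeader ? 1 : 0;             // R_header
  // Assumption: clamp at zero so an empty input with hasHeader never goes negative.
  const data = Math.max(0, total - empty - header); // R_data

  // C is taken from the first non-empty line.
  const first = lines.find((line) => !isBlank(line));
  const columns = first === undefined ? 0 : first.split("\t").length;

  return { total, empty, data, columns };
}

const sample = "id\tname\n1\tAda\n\n2\tLin\n";
console.log(countTsvRows(sample, { hasHeader: true }));
// { total: 4, empty: 1, data: 2, columns: 2 }
```

In the sample, the trailing newline does not count as a fifth line, the blank third line is counted in R_empty, and the header is subtracted, leaving two data rows.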

Reference Data

Format	Delimiter	Typical Extension	Quoting Convention	Max Columns (Practical)	Line Ending	Encoding	Header Row	Empty Field	Common Use
TSV	Tab (\t, U+0009)	.tsv, .txt	None (fields must not contain tabs)	Unlimited	LF or CRLF	UTF-8	Optional	Adjacent tabs	Bioinformatics, spreadsheets
CSV	Comma (,)	.csv	Double-quote (RFC 4180)	Unlimited	CRLF (RFC)	UTF-8 / Latin-1	Optional	Adjacent commas	General data exchange
SSV	Semicolon (;)	.csv (locale)	Double-quote	Unlimited	CRLF	UTF-8	Optional	Adjacent semicolons	European Excel exports
PSV	Pipe (|)	.txt, .psv	Rare	Unlimited	LF or CRLF	UTF-8	Optional	Adjacent pipes	Legacy mainframe data
Fixed-Width	Column positions	.txt, .dat	None	Defined by spec	LF or CRLF	ASCII / EBCDIC	Optional	Spaces	Government filings
JSON Lines	Newline per object	.jsonl	JSON strings	N/A (key-value)	LF	UTF-8	N/A	null	Log streaming
Parquet	Binary columnar	.parquet	N/A	Unlimited	N/A	Binary	Schema	Null bitmap	Big data / analytics
Excel XLSX	XML cells	.xlsx	N/A	16384	N/A	UTF-8 XML	Optional	Empty cell element	Business reporting
IANA TSV	Tab (IANA registered)	.tsv	None (strict)	Unlimited	LF (IANA rec.)	UTF-8	Required (IANA)	Adjacent tabs	Standards-compliant exchange
W3C WebVTT	Tab (cue fields)	.vtt	None	3	LF or CRLF	UTF-8	Signature line	Empty cue	Video subtitles
BED Format	Tab	.bed	None	12 (standard)	LF	ASCII	None (track lines)	Period (.)	Genomic intervals
VCF	Tab	.vcf	None	8 + samples	LF	UTF-8	Meta + header	Period (.)	Variant calling
GFF3	Tab	.gff3	URL-encoded	9	LF	UTF-8	Directive lines	Period (.)	Gene annotation
SAM	Tab	.sam	None	11 + optional	LF	ASCII	@-prefixed headers	Asterisk (*)	Sequence alignment

Frequently Asked Questions

A file ending with a newline character (\n) produces an empty string as the last element when split. This tool detects trailing newlines and excludes the resulting phantom empty row from the total count, preventing the common off-by-one error that plagues naive line-counting approaches.

Yes. Before counting, all carriage-return/line-feed sequences (\r\n) are normalized to \n. Standalone \r characters (old Mac format) are also converted. This ensures consistent counts regardless of the originating operating system.
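
The normalization step described above might look like the following sketch. The function name normalizeLineEndings is hypothetical, and the single regular expression is one possible way to cover both cases, not necessarily how the tool does it.

```javascript
// Hypothetical helper: convert CRLF (Windows) and lone CR (old Mac) to LF.
function normalizeLineEndings(text) {
  // "\r\n?" matches a CR followed by an optional LF, so a CRLF pair
  // is replaced as one unit rather than producing two newlines.
  return text.replace(/\r\n?/g, "\n");
}

const mixed = "a\tb\r\nc\td\re\tf\n";
console.log(JSON.stringify(normalizeLineEndings(mixed)));
// "a\tb\nc\td\ne\tf\n"
```

Matching the pair as a unit matters: replacing \r and \n independently would turn each Windows line ending into two line breaks and double the apparent row count.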

Standard TSV does not support embedded newlines within fields, unlike CSV with RFC 4180 quoting. If your data contains embedded newlines, each fragment will be counted as a separate row. Pre-process such data by removing or escaping embedded newlines before counting.
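
One way to spot likely fragments before counting is to flag lines whose field count differs from the first non-empty line. The sketch below is a heuristic under that assumption, not a feature of this tool, and the name findRaggedLines is hypothetical. It cannot catch a fragment that happens to have the expected number of fields.

```javascript
// Hypothetical heuristic: report 1-based line numbers whose tab-separated
// field count differs from that of the first non-empty line.
function findRaggedLines(text) {
  const lines = text.replace(/\r\n?/g, "\n").split("\n");

  // Keep original line numbers while dropping blank lines.
  const nonBlank = lines
    .map((line, index) => ({ line, number: index + 1 }))
    .filter(({ line }) => !/^\s*$/.test(line));

  if (nonBlank.length === 0) return [];

  const expected = nonBlank[0].line.split("\t").length;
  return nonBlank
    .filter(({ line }) => line.split("\t").length !== expected)
    .map(({ number }) => number);
}

// The record "1<TAB>first half / second half" was split by an embedded newline.
const broken = "id\tnote\n1\tfirst half\nsecond half\n2\tok\n";
console.log(findRaggedLines(broken)); // [ 3 ]
```

Here only the second fragment is flagged, because the first fragment still contains two fields.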

Both are classified as empty. The filter uses the pattern ^\s*$, which matches lines containing zero or more whitespace characters (such as spaces and tabs). When "Skip empty lines" is enabled, all such lines are excluded from the data row count.
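
The classification can be illustrated with a short sketch. The helper name isBlank is hypothetical, and the pattern is the one quoted above.

```javascript
// Hypothetical helper applying the ^\s*$ pattern to a single line.
const isBlank = (line) => /^\s*$/.test(line);

const sampleLines = ["a\tb", "", "   ", "\t\t", "c\td"];
const kept = sampleLines.filter((line) => !isBlank(line));

console.log(kept.length); // 2
```

One consequence of this pattern is that a line consisting only of tabs, which could be read as a record of empty fields, is also treated as blank.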

The tool processes text in-browser using JavaScript string operations. Files up to approximately 50 MB work reliably in modern browsers. For larger files, consider command-line tools such as wc -l on Unix systems. Note that wc -l counts newline characters, so a file whose last line lacks a trailing newline reports one line fewer. The tool provides a file size indicator after upload so you can gauge feasibility.

Column count is derived from the first non-empty line by splitting on tab characters (U+0009). If your file uses spaces, commas, or other delimiters instead of tabs, the entire line registers as a single column. Verify your delimiter is a genuine tab character, not multiple spaces.
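
The effect of a wrong delimiter can be seen in a small sketch. The function name countColumns is hypothetical, and the code only mirrors the description above.

```javascript
// Hypothetical helper: column count of the first non-empty line,
// splitting strictly on the tab character (U+0009).
function countColumns(text) {
  const first = text
    .split(/\r\n|\r|\n/)
    .find((line) => !/^\s*$/.test(line));
  return first === undefined ? 0 : first.split("\t").length;
}

console.log(countColumns("a\tb\tc\n1\t2\t3\n")); // 3
console.log(countColumns("a,b,c\n1,2,3\n"));     // 1
console.log(countColumns("a  b  c\n"));          // 1
```

A result of 1 for a file that should have several columns is a hint that the delimiter is not a genuine tab.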