CSV to TSV Converter
Convert CSV files to TSV format online. RFC 4180-compliant parser handles quoted fields, escaped characters, and large files with drag-and-drop support.
About
CSV (Comma-Separated Values) and TSV (Tab-Separated Values) are the two dominant flat-file interchange formats. The critical difference is the delimiter: d = , for CSV versus d = \t for TSV. A naive find-and-replace conversion corrupts any dataset where commas appear inside quoted fields, because it ignores RFC 4180, Section 2, Rule 6, which allows fields enclosed in double quotes to contain commas, line breaks, and double quotes. This tool implements a finite-state-machine parser that correctly distinguishes structural delimiters from literal characters within double-quoted fields. It handles embedded newlines, escaped double quotes ("" → "), and empty fields without data loss.
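The difference between a naive replacement and a quote-aware conversion can be illustrated with a short Python sketch. This is not the tool's own code (the tool runs in the browser); the function names `naive_csv_to_tsv` and `parsed_csv_to_tsv` are hypothetical, and Python's standard `csv` module stands in for the parser. The sketch does not yet deal with tabs or newlines inside fields.

```python
import csv
import io


def naive_csv_to_tsv(text: str) -> str:
    """Replace every comma with a tab, ignoring quoting (the broken approach)."""
    return text.replace(",", "\t")


def parsed_csv_to_tsv(text: str) -> str:
    """Parse with a quote-aware CSV reader, then rejoin each row with tabs."""
    # newline="" leaves line endings untouched so the reader can handle them.
    rows = csv.reader(io.StringIO(text, newline=""))
    return "\n".join("\t".join(row) for row in rows)


sample = 'name,city\n"Doe, Jane",Paris\n'

naive = naive_csv_to_tsv(sample)    # the quoted comma becomes a tab
parsed = parsed_csv_to_tsv(sample)  # the quoted comma stays inside its field
```

On this sample, the second line of `naive` splits into three tab-separated pieces, while the second line of `parsed` keeps the two original fields.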
Incorrect delimiter conversion is a common cause of silent data corruption in ETL pipelines and database imports. Spreadsheet applications sometimes auto-detect delimiters incorrectly, compounding the error. This converter processes files entirely in your browser: nothing is uploaded to a server, so your data is never exposed. Files larger than 1 MB are processed in a Web Worker to prevent UI blocking. The parser assumes UTF-8 encoding. Note: if source fields contain literal tab characters, they are escaped to preserve TSV structural integrity.
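The text above does not say which escaping scheme the tool applies to literal tabs. As a sketch under that caveat, the following Python snippet uses backslash escapes (`\t`, `\n`, `\r`, `\\`), one common convention that is also used by the PostgreSQL COPY text format. The helper names `escape_tsv_field` and `to_tsv_line` are hypothetical.

```python
from typing import Iterable


def escape_tsv_field(field: str) -> str:
    """Backslash-escape characters that would break TSV structure.

    Assumption: backslash escaping is used; the tool's real scheme may differ.
    The backslash itself is escaped first so the result stays unambiguous.
    """
    return (
        field.replace("\\", "\\\\")
        .replace("\t", "\\t")
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_tsv_line(fields: Iterable[str]) -> str:
    """Join already-parsed fields into one TSV line with escaped contents."""
    return "\t".join(escape_tsv_field(f) for f in fields)


line = to_tsv_line(["a\tb", "c"])  # one structural tab, one escaped tab
```

With this scheme, every tab character left in the output is a structural delimiter, so a downstream reader can split each line on tabs without inspecting field contents.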
Formulas
The conversion operates on a parsed two-dimensional array $M$ of $m$ rows and $n$ columns. Each cell $M_{i,j}$ is extracted by the CSV state machine, then rejoined with a tab delimiter.
The CSV parser uses three states: $S_0$ = FIELD_START, $S_1$ = UNQUOTED, $S_2$ = QUOTED. Transitions occur on encountering a double quote ("), a comma (,), or a newline (\n). Inside $S_2$, a comma is treated as a literal character. A doubled double quote ("") within $S_2$ produces a single literal quote, and the parser remains in $S_2$.
Each output row is $\operatorname{join}(M_i)$ for $i = 1, \dots, m$, where $M$ = parsed matrix, $m$ = total rows, $n$ = columns per row, $M_i$ = the $i$-th row array, and join = concatenation with a tab delimiter.
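The three-state parser and the tab join described above can be sketched in Python as follows. This is an illustrative reconstruction, not the tool's source: the names `State`, `parse_csv`, and `to_tsv` are hypothetical, the handling of stray quotes in unquoted fields is a lenient assumption, and the join step does not escape tabs or newlines inside fields.

```python
from enum import Enum, auto
from typing import List


class State(Enum):
    FIELD_START = auto()  # S0
    UNQUOTED = auto()     # S1
    QUOTED = auto()       # S2


def parse_csv(text: str) -> List[List[str]]:
    """Parse CSV text into a matrix M of rows and cells."""
    rows: List[List[str]] = []
    row: List[str] = []
    field: List[str] = []
    state = State.FIELD_START
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if state is State.QUOTED:
            if ch == '"':
                if i + 1 < n and text[i + 1] == '"':
                    field.append('"')       # "" -> one literal quote, stay in S2
                    i += 1                  # skip the second quote
                else:
                    state = State.UNQUOTED  # closing quote ends the quoted part
            else:
                field.append(ch)            # commas and newlines are literal here
        elif ch == '"' and state is State.FIELD_START:
            state = State.QUOTED            # opening quote
        elif ch == ",":
            row.append("".join(field))      # structural delimiter: end of field
            field = []
            state = State.FIELD_START
        elif ch == "\n":
            row.append("".join(field))      # end of field and end of record
            field = []
            rows.append(row)
            row = []
            state = State.FIELD_START
        elif ch == "\r":
            pass                            # drop the CR of a CRLF outside quotes
        else:
            field.append(ch)
            state = State.UNQUOTED
        i += 1
    # Flush the last record when the input has no trailing newline.
    if field or row or state is not State.FIELD_START:
        row.append("".join(field))
        rows.append(row)
    return rows


def to_tsv(matrix: List[List[str]]) -> str:
    """Apply join(M_i) to every row: cells joined by tabs, rows by newlines."""
    return "\n".join("\t".join(cells) for cells in matrix)


sample = 'id,note\n1,"He said ""hi"", then left"\n2,"line one\nline two"\n3,\n'
matrix = parse_csv(sample)
```

Keeping parsing and joining as separate steps means the delimiter decision is made once per character in the state machine, and the output stage only ever sees clean cell values.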
Reference Data
| Feature | CSV (RFC 4180) | TSV |
|---|---|---|
| Delimiter | Comma (,) | Tab (\t, U+0009) |
| MIME Type | text/csv | text/tab-separated-values |
| File Extension | .csv | .tsv or .tab |
| Quoting | Double quotes for fields containing delimiters, newlines, or quotes | Rarely used; tabs within fields must be escaped or stripped |
| Quote Escaping | "" (doubled) | Not standardized |
| Newlines in Fields | Allowed inside quoted fields | Generally not supported; may break parsers |
| Header Row | Optional (RFC 4180 §2.3) | Optional |
| Encoding | UTF-8 recommended; varies | UTF-8 typical |
| Excel Import | Auto-detected on most locales | Requires manual delimiter selection |
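| Database Import | Widely supported (MySQL LOAD DATA with FIELDS TERMINATED BY ',', PostgreSQL COPY with FORMAT csv) | Supported (tab is the default delimiter for MySQL LOAD DATA and for the PostgreSQL COPY text format) |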
| Max Columns (practical) | No standard limit; Excel caps at 16,384 | Same practical limits |
| Common Pitfall | Locale-dependent: some regions use semicolons (;) | Embedded tabs corrupt structure |
| RFC Standard | RFC 4180 (October 2005) | IANA registered; no formal RFC |
| Used By | Excel, Google Sheets, most databases | Bioinformatics (BLAST, BED), Unix tools |
| Advantages | Universal support; well-defined quoting rules | Simpler parsing when fields lack tabs; no quoting ambiguity |