CSV to 0SV Converter
Convert CSV files to any-separator-value (0SV) format. Supports custom delimiters, RFC 4180 parsing, auto-detection, and instant download.
About
Delimiter-separated value files appear simple until a quoted field contains your target delimiter, a literal newline, or an escaped double-quote. A naive split on d (where d is your delimiter character) cuts through such quoted fields and corrupts structured data. This tool implements a full RFC 4180 state-machine parser that handles these edge cases - escaped quotes (""), multiline fields, and BOM markers - before re-serializing to your chosen output separator. Auto-detection analyzes character frequency across the first 10 rows to infer the source delimiter with high confidence. All processing runs client-side; no data leaves your browser.
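The auto-detection idea can be sketched roughly as follows. This is only an illustration of frequency-based detection over the first 10 rows, not the tool's actual code: the candidate list, the scoring formula, and the names `CANDIDATES`, `count_outside_quotes`, and `detect_delimiter` are all assumptions. The sketch is also line-based, so it only approximates files whose quoted fields span several lines.

```python
from collections import Counter
from typing import Optional

# Assumed candidate set; the tool may consider other characters.
CANDIDATES = [",", "\t", ";", "|", ":"]


def count_outside_quotes(line: str, delim: str, quote: str = '"') -> int:
    """Count occurrences of delim in line, ignoring text inside quotes."""
    count, in_quotes = 0, False
    for ch in line:
        if ch == quote:
            # A doubled quote ("") toggles twice, so the state is unchanged.
            in_quotes = not in_quotes
        elif ch == delim and not in_quotes:
            count += 1
    return count


def detect_delimiter(text: str, sample_rows: int = 10) -> Optional[str]:
    """Guess the delimiter from the first sample_rows non-empty lines."""
    rows = [r for r in text.splitlines()[:sample_rows] if r]
    best, best_score = None, 0.0
    for delim in CANDIDATES:
        counts = [count_outside_quotes(r, delim) for r in rows]
        if not counts or max(counts) == 0:
            continue
        # Most common per-row count, and how many rows share it.
        mode, agree = Counter(counts).most_common(1)[0]
        if mode == 0:
            continue
        # Hypothetical score: favor consistent counts and more columns.
        score = (agree / len(counts)) * mode
        if score > best_score:
            best, best_score = delim, score
    return best
```

A candidate that appears the same non-zero number of times in every sampled row scores highest, which is why a comma inside a quoted field does not outvote a consistent semicolon.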
The term 0SV refers to "zero-assumption separated values" - the output delimiter is not fixed to commas or tabs but is any single character or string you specify. This matters when piping data between systems that disagree on format: PostgreSQL's COPY command defaults to tab-separated text, Excel may default to semicolons depending on regional settings, and Unix tools like awk split on whitespace by default. Mismatched delimiters cause silent column shifts that propagate errors downstream. Converting to the exact separator the receiving system expects avoids that class of error.
Formulas
The parser operates as a finite state machine with three states. Given input string $S$, source delimiter $d_{in}$, and quote character $q$ (default "), each character $c_i$ triggers a state transition.
Escaped quotes are detected when $c_i = q \land c_{i+1} = q$ inside the IN_QUOTED state - the pair is collapsed to a single $q$ in output.
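A minimal sketch of such a three-state parser is shown below. Only IN_QUOTED is named in the text; the state names FIELD_START and IN_UNQUOTED, the function name `parse`, and the handling of stray quotes in unquoted fields are assumptions made for illustration. The sketch also assumes a single-character source delimiter.

```python
from enum import Enum, auto
from typing import List


class State(Enum):
    FIELD_START = auto()  # hypothetical name: nothing read for this field yet
    IN_UNQUOTED = auto()  # hypothetical name: reading a field literally
    IN_QUOTED = auto()    # named in the text: inside a quoted field


def parse(s: str, d_in: str = ",", q: str = '"') -> List[List[str]]:
    """Parse s into rows of fields (d_in assumed to be one character)."""
    rows: List[List[str]] = []
    row: List[str] = []
    field: List[str] = []
    state = State.FIELD_START
    n = len(s)
    i = 1 if s.startswith("\ufeff") else 0  # skip a leading BOM
    while i < n:
        c = s[i]
        if state is State.IN_QUOTED:
            if c == q:
                if i + 1 < n and s[i + 1] == q:
                    field.append(q)  # escaped pair collapses to one quote
                    i += 2
                    continue
                state = State.IN_UNQUOTED  # closing quote
            else:
                field.append(c)  # delimiters and newlines are literal here
        elif c == q and state is State.FIELD_START:
            state = State.IN_QUOTED
        elif c == d_in:
            row.append("".join(field))
            field = []
            state = State.FIELD_START
        elif c in "\r\n":
            if c == "\r" and i + 1 < n and s[i + 1] == "\n":
                i += 1  # treat CRLF as a single line break
            row.append("".join(field))
            rows.append(row)
            row, field = [], []
            state = State.FIELD_START
        else:
            field.append(c)
            state = State.IN_UNQUOTED
        i += 1
    # Flush the last record when the input lacks a trailing line break.
    if field or row or state is not State.FIELD_START:
        row.append("".join(field))
        rows.append(row)
    return rows
```

For example, `parse('a,"x, ""y""",c')` yields one row whose second field is `x, "y"`, with the embedded comma preserved and the doubled quotes collapsed.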
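For re-serialization to output delimiter $d_{out}$, each field $f$ is wrapped in quotes if and only if:

$$f \text{ contains } d_{out} \;\lor\; f \text{ contains } q \;\lor\; f \text{ contains a line break (CR or LF)}$$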
Where $d_{in}$ = source delimiter, $d_{out}$ = target delimiter, $q$ = quote character, $c_i$ = character at position $i$, and $f$ = a single parsed field value.
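The re-serialization step might look like the sketch below. It assumes the quoting condition is the RFC 4180 rule applied to the target delimiter: a field is quoted when it contains the output delimiter, the quote character, or a line break. The function names `needs_quoting`, `serialize_field`, and `serialize`, and the `newline` parameter, are hypothetical, and the output delimiter is assumed to be non-empty.

```python
from typing import List


def needs_quoting(f: str, d_out: str, q: str = '"') -> bool:
    """Assumed rule: quote if f contains d_out, q, CR, or LF."""
    return d_out in f or q in f or "\n" in f or "\r" in f


def serialize_field(f: str, d_out: str, q: str = '"') -> str:
    """Wrap f in quotes when needed, doubling any embedded quote."""
    if needs_quoting(f, d_out, q):
        return q + f.replace(q, q + q) + q
    return f


def serialize(rows: List[List[str]], d_out: str, q: str = '"',
              newline: str = "\r\n") -> str:
    """Join fields with d_out and records with newline."""
    return newline.join(
        d_out.join(serialize_field(f, d_out, q) for f in row)
        for row in rows
    )
```

Because the test uses the output delimiter rather than the source one, a field such as `a,b` is left unquoted when the target separator is a tab, while `x;y` is quoted when the target is a semicolon. Multi-character delimiters work with the same substring check.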
Reference Data
| Format Name | Delimiter | Common Extension | Typical Use Case | RFC / Standard | Quoting Rule |
|---|---|---|---|---|---|
| CSV (Comma) | , | .csv | Spreadsheets, databases, general interchange | RFC 4180 | Double-quote fields containing delimiter or newline |
| TSV (Tab) | \t | .tsv | Bioinformatics, PostgreSQL COPY, Unix tools | IANA text/tab-separated-values | Rarely quoted; tabs in data are escaped |
| SSV (Semicolon) | ; | .csv | European Excel exports (comma is decimal separator) | De facto (ISO locale-dependent) | Same as CSV but with semicolon |
| PSV (Pipe) | \| | .psv / .txt | EDI, HL7 health data, legacy mainframes | HL7 v2.x, X12 EDI | Rarely quoted; pipe uncommon in text |
| Colon-SV | : | .txt | /etc/passwd, Unix config files | POSIX convention | No quoting; fields must not contain colons |
| Space-SV | ␣ | .txt | Fixed-width simulation, simple logs | None | Problematic - spaces in data cause misalignment |
| Tilde-SV | ~ | .txt | Legacy banking, NACHA ACH files | NACHA specification | No standard quoting |
| Caret-SV | ^ | .txt | Custom ETL pipelines where other delimiters conflict | None | Application-specific |
| NULL-SV | \0 | Binary | xargs -0, find -print0 (filenames with spaces) | POSIX | No quoting needed - the NUL byte does not occur in ordinary text |
| SOH-SV | \x01 | Binary | Hive default, internal database interchange | Apache Hive convention | No quoting needed - SOH is non-printable |
| RS/GS-SV | \x1E / \x1D | Binary | ASCII record/group separation (ISO 646) | ISO 646, ASCII control chars | Purpose-built; no conflicts |
| Multi-char | :: or \|-\| | .txt | Custom formats where single-char delimiters conflict | None | Application-specific escaping |