CSV to TXT Converter
Convert CSV files to TXT format online. Supports tab-delimited, fixed-width, pipe-separated output with RFC 4180 parsing. Free, instant, private.
About
CSV (Comma-Separated Values) parsing appears trivial until a quoted field contains an embedded comma, a newline, or a literal double-quote escaped as "". RFC 4180 defines the grammar, but real-world exports from Excel, Google Sheets, and legacy ERP systems routinely deviate with byte order marks (BOMs), mixed line endings (CRLF vs LF), and semicolon delimiters dictated by locale. Incorrect parsing silently shifts columns, corrupting downstream analysis. This converter implements a character-level state-machine parser that handles all RFC 4180 edge cases, auto-detects the input delimiter, and outputs clean TXT in your choice of tab-delimited, fixed-width, pipe-separated, or space-separated format. Processing runs entirely in your browser. No data leaves your machine.
The tool enforces strict quoting rules: a field that begins with a double-quote must end with one, and internal quotes must be doubled ("" → "). Malformed rows are flagged, not silently dropped. Fixed-width output pads each column to its maximum observed width plus 2 characters, aligning data for monospaced display or legacy mainframe ingest. Files up to 50 MB are supported. For files exceeding 1 MB, parsing offloads to a Web Worker to keep the UI responsive. Note: this tool assumes UTF-8 encoding. Non-UTF-8 files may produce garbled characters in multibyte sequences.
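The UTF-8 assumption and the BOM issue mentioned above can be illustrated with a short Python sketch. It is not the converter's browser code. The function name `decode_input` and the choice to replace undecodable bytes rather than raise an error are assumptions made for illustration.

```python
def decode_input(raw: bytes) -> str:
    """Decode uploaded bytes as UTF-8 (hypothetical helper, not the tool's API).

    "utf-8-sig" drops a leading byte order mark if one is present.
    errors="replace" turns undecodable bytes into U+FFFD, which is the
    "garbled character" effect seen when a non-UTF-8 file is treated as UTF-8.
    """
    return raw.decode("utf-8-sig", errors="replace")


if __name__ == "__main__":
    # A BOM-prefixed export: the BOM is stripped, line endings are untouched.
    print(repr(decode_input(b"\xef\xbb\xbfid,name\r\n")))
    # A Latin-1 file read as UTF-8: the accented byte cannot be decoded.
    print(repr(decode_input("café".encode("latin-1"))))
```

Stripping the BOM before parsing matters because a leftover U+FEFF would otherwise become part of the first header cell.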
Formulas
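The CSV parser operates as a finite state machine with three states: FIELD_START, IN_QUOTED, and IN_UNQUOTED. Transitions are determined character by character. From FIELD_START, a double-quote enters IN_QUOTED and any other non-delimiter character enters IN_UNQUOTED. Inside IN_QUOTED, a doubled quote ("") emits one literal quote, a single quote closes the field, and delimiters and line breaks are kept as data. Outside quotes, a delimiter ends the field and a line break ends the row.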
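A three-state parser of this kind can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the converter's actual source: the names `parse_csv`, `State`, `flagged`, and the `closed_quote` flag are hypothetical, the delimiter is assumed to be a single character, and "flagging" a malformed row is modeled as recording its 1-based row number.

```python
from enum import Enum, auto


class State(Enum):
    FIELD_START = auto()
    IN_QUOTED = auto()
    IN_UNQUOTED = auto()


def parse_csv(text, delimiter=","):
    """Return (rows, flagged); flagged holds 1-based numbers of malformed rows."""
    rows, flagged = [], []
    row, field = [], []
    state = State.FIELD_START
    closed_quote = False  # True once a quoted field has seen its closing quote
    malformed = False
    i, n = 0, len(text)

    def end_field():
        nonlocal state, closed_quote
        row.append("".join(field))
        field.clear()
        state = State.FIELD_START
        closed_quote = False

    def end_row():
        nonlocal row, malformed
        end_field()
        rows.append(row)
        if malformed:
            flagged.append(len(rows))
        row, malformed = [], False

    while i < n:
        ch = text[i]
        if state is State.IN_QUOTED:
            if ch != '"':
                field.append(ch)  # delimiters and newlines are data here
            elif i + 1 < n and text[i + 1] == '"':
                field.append('"')  # doubled quote becomes one literal quote
                i += 1
            else:
                state = State.IN_UNQUOTED  # closing quote
                closed_quote = True
        elif ch == delimiter:
            end_field()
        elif ch in "\r\n":
            if ch == "\r" and i + 1 < n and text[i + 1] == "\n":
                i += 1  # treat CRLF as a single line break
            end_row()
        elif ch == '"' and state is State.FIELD_START:
            state = State.IN_QUOTED
        else:
            if ch == '"' or closed_quote:
                malformed = True  # stray quote, or text after a closing quote
            field.append(ch)
            state = State.IN_UNQUOTED
        i += 1

    if state is State.IN_QUOTED:
        malformed = True  # unterminated quoted field
    if state is not State.FIELD_START or row:
        end_row()  # final row without a trailing line break
    return rows, flagged
```

For example, `parse_csv('a,"b"x,c\nok,1\n')` keeps both rows and reports row 1 as flagged, because text follows a closing quote. Keeping the row while flagging it mirrors the "flagged, not silently dropped" behavior described earlier.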
Delimiter auto-detection counts occurrences of each candidate delimiter (, ; \t |) across the first 5 lines. The delimiter with the lowest coefficient of variation in per-line counts is selected:
$$CV = \frac{\sigma}{\mu}$$
where σ is the standard deviation of per-line counts and μ is the mean. The candidate with the lowest score (most consistent count per line) wins. Ties are broken by priority order: comma > semicolon > tab > pipe.
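The selection rule (lowest σ/μ over the first 5 lines, ties broken by priority) might look like this in Python. Several details are assumptions that the description does not settle: the population standard deviation is used, candidates that never occur (μ = 0) are skipped, a comma is returned when nothing matches, and occurrences inside quoted fields are counted like any other. The name `detect_delimiter` is hypothetical.

```python
from statistics import mean, pstdev

# Listed in tie-break priority order: comma > semicolon > tab > pipe.
CANDIDATES = [",", ";", "\t", "|"]


def detect_delimiter(text, sample_lines=5):
    """Pick the candidate whose per-line count has the lowest sigma / mu."""
    lines = text.splitlines()[:sample_lines]
    best, best_cv = ",", float("inf")  # assumption: comma is the fallback
    for cand in CANDIDATES:
        counts = [line.count(cand) for line in lines]
        mu = mean(counts) if counts else 0
        if mu == 0:
            continue  # assumption: a candidate that never appears is skipped
        cv = pstdev(counts) / mu
        if cv < best_cv:  # strict "<" keeps the higher-priority candidate on ties
            best, best_cv = cand, cv
    return best


if __name__ == "__main__":
    sample = "a;b;c\n1;2;3\n4,5;6;7\n"
    print(repr(detect_delimiter(sample)))  # semicolon count is constant per line
```

In the sample, the semicolon appears exactly twice on every line (σ = 0), while the comma appears on only one line, so its count varies and its score is higher.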
Fixed-width output computes column width as:
$$W_j = \max_i \operatorname{len}(c_{i,j}) + 2$$
where W_j is the padded width for column j, c_{i,j} is the cell in row i of column j, and each cell is right-padded with spaces to W_j characters.
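The width rule (maximum observed cell length plus 2, with every cell right-padded to that width) can be sketched in Python. The function name `to_fixed_width` is hypothetical, short rows are assumed to be filled with empty cells, and length is measured in code points, which may differ from display width for wide or combining characters.

```python
def to_fixed_width(rows, padding=2):
    """Render rows of strings as fixed-width text, one line per row."""
    n_cols = max((len(r) for r in rows), default=0)
    # Column width = longest cell in that column + padding.
    widths = [
        max((len(r[j]) for r in rows if j < len(r)), default=0) + padding
        for j in range(n_cols)
    ]
    lines = []
    for r in rows:
        cells = [
            (r[j] if j < len(r) else "").ljust(widths[j])  # pad on the right
            for j in range(n_cols)
        ]
        lines.append("".join(cells))
    return "\n".join(lines)


if __name__ == "__main__":
    table = [["id", "name"], ["1", "Alice"], ["22", "Bob"]]
    print(to_fixed_width(table))
```

With this input the column widths are 4 and 7, so every output line is 11 characters long, including the trailing padding on the last column.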
Reference Data
| Output Format | Separator Character | Best For | Column Alignment | Readability | Import Compatibility |
|---|---|---|---|---|---|
| Tab-Delimited | \t (U+0009) | Spreadsheets, databases | Variable | Medium | Excel, SQL loaders, R, Python pandas |
| Fixed-Width | Space padding | Mainframes, COBOL, reports | Exact column alignment | High | FORTRAN, SAS, legacy ETL |
| Pipe-Delimited | \| (U+007C) | Data pipelines, logs | Variable | Medium | Unix tools, awk, sed |
| Space-Delimited | Single space | Simple text, CLI tools | Variable | Low (if data has spaces) | cut, tr, shell scripts |
| Custom Delimiter | User-defined character | Proprietary formats | Variable | Varies | Application-specific |

Common CSV Input Delimiters (Auto-Detected)

| Input Delimiter | Character | Typical Use | Compatibility |
|---|---|---|---|
| Comma | , (U+002C) | Default RFC 4180 | Universal |
| Semicolon | ; (U+003B) | European locale Excel exports | German, French, Italian Excel |
| Tab (TSV) | \t (U+0009) | Tab-separated values | Widely supported |
| Pipe | \| (U+007C) | Medical (HL7), financial feeds | Domain-specific |

RFC 4180 Quoting Rules

| Case | Rule and Example |
|---|---|
| Plain field | No quoting required: hello |
| Field with comma | Must be quoted: "hello, world" |
| Field with newline | Must be quoted: "line1\nline2" |
| Field with quote | Quote doubled inside quotes: "say ""hello""" |
| Empty field | Two consecutive delimiters: a,,c |
| Quoted empty | Explicit empty: a,"",c |

File Size & Performance

| File Size | Handling |
|---|---|
| < 100 KB | Instant parsing on main thread (< 50 ms) |
| 100 KB - 1 MB | Main thread, 50 - 500 ms |
| 1 MB - 50 MB | Web Worker parsing, progress indicator shown |
| > 50 MB | Rejected with error (browser memory limits) |
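
The RFC 4180 quoting rules listed in this section can be checked against Python's standard `csv` module, whose default dialect uses the same doubled-quote convention. The helpers `parse_line` and `quote_field` are hypothetical names introduced for this sketch and are not part of the converter.

```python
import csv
import io


def parse_line(text):
    """Parse one CSV record (which may span lines) with the default dialect."""
    # newline="" stops StringIO from translating line endings inside fields.
    return next(csv.reader(io.StringIO(text, newline="")))


def quote_field(value, delimiter=","):
    """Quote a field only when the rules require it, doubling inner quotes."""
    if any(c in value for c in (delimiter, '"', "\r", "\n")):
        return '"' + value.replace('"', '""') + '"'
    return value


if __name__ == "__main__":
    for sample in ['hello', '"hello, world"', '"say ""hello"""', 'a,,c', 'a,"",c']:
        print(sample, "->", parse_line(sample))
    print(quote_field('say "hello"'))
```

Note that an empty field and a quoted empty field parse to the same value, an empty string, so the distinction is lost after parsing.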