CSV Escape-Unescape
Escape and unescape CSV data per RFC 4180. Handle special characters, double quotes, commas, and newlines in CSV fields correctly.
About
Malformed CSV data causes silent failures in data pipelines. A single unescaped comma inside a field shifts every subsequent column in that row, and when the same mistake repeats across an export it can corrupt thousands of rows without raising an error. This tool applies RFC 4180 escaping rules: fields containing commas, double quotes ("), or line breaks are enclosed in double quotes, and each internal double quote is doubled (""). The unescape operation reverses this process exactly. The parser uses a finite state machine rather than naive regex splitting, so it correctly handles quoted fields that themselves contain delimiters and newlines.
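To illustrate the column-shift failure described above, the following sketch compares naive comma splitting with a quote-aware parser. It uses Python's standard `csv` module as a stand-in for this tool's parser (an assumption; the tool's own implementation is not shown here), and the sample row is made up for the example.

```python
import csv
import io

# Hypothetical record: one field has a comma, one has quotes, one has a newline.
row = ["Smith, John", 'He said "hi"', "line1\nline2"]

# Encode the record with RFC 4180 style quoting and CRLF line endings.
buf = io.StringIO(newline="")
csv.writer(buf, lineterminator="\r\n").writerow(row)
encoded = buf.getvalue()

# Naive splitting ignores quotes, so the embedded comma creates an extra column.
naive = encoded.rstrip("\r\n").split(",")

# A quote-aware parser recovers the original three fields.
parsed = next(csv.reader(io.StringIO(encoded, newline="")))

print(len(naive), "columns from naive split")
print(len(parsed), "columns from csv.reader")
```

The naive split yields four pieces for a three-field record, and the pieces still carry their quote characters, which is exactly the kind of shift that goes unnoticed downstream.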
Limitation: this tool assumes a single consistent delimiter per dataset. It does not auto-detect mixed delimiters (e.g., some rows using semicolons and others using commas). Pro tip: many European CSV exports use semicolons as delimiters because the comma serves as a decimal separator in those locales. Verify your delimiter before processing.
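As a small illustration of the delimiter point, the sketch below parses a semicolon-delimited sample with the correct and the wrong delimiter. It relies on Python's standard `csv` module rather than this tool, and the sample data is invented for the example.

```python
import csv
import io

# Hypothetical European-style export: semicolon delimiter, comma as decimal separator.
data = 'name;price\r\n"Widget; large";3,50\r\n'

# Parsed with the correct delimiter: two columns per row, decimal comma intact.
rows = list(csv.reader(io.StringIO(data, newline=""), delimiter=";"))

# Parsed with the default comma delimiter: the header collapses into one column.
wrong = list(csv.reader(io.StringIO(data, newline="")))

print(rows)
print(wrong[0])
```

With the wrong delimiter the parser raises no error; it simply returns differently shaped rows, which is why checking the delimiter first matters.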
Formulas
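The CSV escape operation follows a deterministic rule per field F:

$$
\mathrm{escape}(F) =
\begin{cases}
\texttt{"} + \mathrm{replace}(F,\ \texttt{"},\ \texttt{""}) + \texttt{"} & \text{if } F \text{ contains } d,\ \texttt{"},\ \text{CR, or LF} \\
F & \text{otherwise}
\end{cases}
$$

Where d = the chosen delimiter character (default ,). The unescape operation is the inverse:

$$
\mathrm{unescape}(F') =
\begin{cases}
\mathrm{replace}(\mathrm{inner}(F'),\ \texttt{""},\ \texttt{"}) & \text{if } F' \text{ starts and ends with } \texttt{"} \\
F' & \text{otherwise}
\end{cases}
$$

Here inner(F') denotes F' with its enclosing pair of double quotes removed.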
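The per-field rule can be sketched in Python as follows. The function names `escape_field` and `unescape_field` are hypothetical, chosen for this example, and the code follows the RFC 4180 rule stated in the About section rather than this tool's actual source.

```python
def escape_field(field: str, delimiter: str = ",") -> str:
    """Quote a field only when it contains the delimiter, a quote, CR, or LF."""
    needs_quotes = any(ch in field for ch in (delimiter, '"', "\r", "\n"))
    if not needs_quotes:
        return field
    # Double every internal quote, then wrap the whole field in quotes.
    return '"' + field.replace('"', '""') + '"'


def unescape_field(field: str) -> str:
    """Reverse escape_field: strip enclosing quotes and collapse doubled quotes."""
    if len(field) >= 2 and field[0] == '"' and field[-1] == '"':
        return field[1:-1].replace('""', '"')
    return field


samples = ["plain", "a,b", 'say "hi"', "two\nlines", ""]
for s in samples:
    escaped = escape_field(s)
    print(repr(s), "->", repr(escaped))
```

Each sample round-trips: applying `unescape_field` to the output of `escape_field` returns the original string.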
The parser uses a finite state machine with 4 states: S ∈ {FIELD_START, UNQUOTED, QUOTED, QUOTE_IN_QUOTED}. Transition on each character c is deterministic, guaranteeing O(n) parsing with no backtracking.
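A minimal sketch of such a four-state machine is shown below. The state names come from the text. The transition details are assumptions made for this example: a carriage return outside quotes is dropped so that CRLF and LF both end a row, stray quotes in unquoted fields are kept literally, and an unterminated quoted field raises an error. The tool's real transition table may differ.

```python
from enum import Enum, auto


class State(Enum):
    FIELD_START = auto()
    UNQUOTED = auto()
    QUOTED = auto()
    QUOTE_IN_QUOTED = auto()


def parse_csv(text: str, delimiter: str = ",") -> list[list[str]]:
    rows: list[list[str]] = []
    row: list[str] = []
    buf: list[str] = []
    state = State.FIELD_START

    def end_field() -> None:
        row.append("".join(buf))
        buf.clear()

    def end_row() -> None:
        end_field()
        rows.append(row.copy())
        row.clear()

    # Single pass: each character triggers exactly one transition.
    for c in text:
        if state is State.QUOTED:
            if c == '"':
                state = State.QUOTE_IN_QUOTED  # closing quote or first half of ""
            else:
                buf.append(c)  # delimiters and newlines are literal here
        elif state is State.QUOTE_IN_QUOTED and c == '"':
            buf.append('"')  # "" inside quotes is one literal quote
            state = State.QUOTED
        elif c == delimiter:
            end_field()
            state = State.FIELD_START
        elif c == "\n":
            end_row()
            state = State.FIELD_START
        elif c == "\r":
            pass  # assumption: drop CR outside quotes; the LF ends the row
        elif c == '"' and state is State.FIELD_START:
            state = State.QUOTED
        else:
            buf.append(c)
            state = State.UNQUOTED

    if state is State.QUOTED:
        raise ValueError("unterminated quoted field")
    if state is not State.FIELD_START or row:
        end_row()  # flush a final record that lacks a trailing newline
    return rows


text = 'id,name,note\r\n1,"Smith, John","said ""hi"""\r\n2,plain,"two\nlines"\r\n'
for record in parse_csv(text):
    print(record)
```

Because the loop never revisits a character, the running time is linear in the input length, matching the O(n) claim above.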
Reference Data
| Character | Name | ASCII Code | Escape Behavior (RFC 4180) | Common Issue |
|---|---|---|---|---|
| " | Double Quote | 34 | Doubled: "", field wrapped in quotes | Unescaped quotes break field boundaries |
| , | Comma | 44 | Field wrapped in double quotes | Splits one field into two columns |
| \n | Line Feed (LF) | 10 | Field wrapped in double quotes | Creates phantom extra rows |
| \r | Carriage Return (CR) | 13 | Field wrapped in double quotes | OS-dependent line ending issues |
| \r\n | CRLF | 13+10 | Field wrapped in double quotes | Windows vs Unix line ending mismatch |
| ; | Semicolon | 59 | No action (unless used as delimiter) | European CSV default delimiter confusion |
| \t | Tab | 9 | No action (unless TSV mode) | Invisible whitespace corruption |
| ' | Single Quote | 39 | No action per RFC 4180 | Some parsers treat as text qualifier |
| \ | Backslash | 92 | No action per RFC 4180 | Some tools use as escape char (non-standard) |
| = | Equals | 61 | No action per RFC 4180 | Excel formula injection risk (=CMD()) |
| + | Plus | 43 | No action per RFC 4180 | Excel formula injection risk |
| @ | At Sign | 64 | No action per RFC 4180 | Excel formula injection risk |
| - | Hyphen | 45 | No action per RFC 4180 | Excel may interpret as formula prefix |
| (empty) | Empty Field | - | Represented as ,, or ,"", | NULL vs empty string ambiguity |
| (space) | Leading/Trailing Space | 32 | Wrap in quotes to preserve | Trimmed silently by many parsers |
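The last two rows of the table can be checked with a short experiment. This sketch uses Python's standard `csv` module, not this tool, and the helper name `parse_line` is hypothetical. Other parsers may behave differently, which is the ambiguity the table warns about.

```python
import csv
import io


def parse_line(line: str, **kwargs) -> list[str]:
    """Parse a single CSV record with the standard library reader."""
    return next(csv.reader(io.StringIO(line, newline=""), **kwargs))


# Empty field: ,, and ,"", parse to the same empty string, so NULL vs "" is lost.
bare = parse_line("a,,b")
quoted = parse_line('a,"",b')

# Leading space: kept by default, dropped when skipinitialspace is enabled.
kept = parse_line("a, b ,c")
skipped = parse_line("a, b ,c", skipinitialspace=True)

print(bare, quoted)
print(kept, skipped)
```

Note that `skipinitialspace` removes only the whitespace that immediately follows a delimiter; trailing spaces survive in both cases.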