About

Malformed CSV data causes silent failures in data pipelines. A single unescaped comma inside a field shifts every subsequent column in that row, and the same defect repeated across a file can corrupt thousands of rows without raising an error. This tool applies RFC 4180 escaping rules: fields containing commas, double quotes ("), or line breaks are enclosed in double quotes, and any internal double quote is doubled (""). The unescape operation reverses this process exactly. The parser uses a finite state machine rather than naive regex splitting, so it correctly handles quoted fields that themselves contain delimiters and newlines.

Limitation: this tool assumes a single consistent delimiter per dataset. It does not auto-detect mixed delimiters (e.g., some rows using semicolons and others using commas). Pro tip: many European CSV exports use semicolons as delimiters because the comma serves as a decimal separator in those locales. Verify your delimiter before processing.
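
One way to verify a delimiter before processing is to check which candidate splits every row into the same number of fields. The sketch below is only an illustration built on Python's standard csv module; the helper name consistent_delimiters and the candidate list are assumptions, not part of this tool.

```python
import csv
import io

def consistent_delimiters(text, candidates=",;\t|"):
    """Return the candidates that split every non-empty row into the
    same number of fields (more than one). Hypothetical helper."""
    matches = []
    for delim in candidates:
        rows = csv.reader(io.StringIO(text, newline=""), delimiter=delim)
        widths = {len(row) for row in rows if row}
        # Exactly one width, and that width is not 1 (1 means "never split").
        if len(widths) == 1 and widths != {1}:
            matches.append(delim)
    return matches

# Semicolon-delimited export where the comma is a decimal separator.
sample = "name;price\nwidget;1,50\ngadget;2,75\n"
print(consistent_delimiters(sample))  # [';']
```

If the result is empty or contains more than one candidate, the file probably needs manual inspection.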


Formulas

The CSV escape operation follows a deterministic rule per field F:

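$$
\operatorname{escape}(F) =
\begin{cases}
\texttt{"} + \operatorname{replace}(F,\ \texttt{"},\ \texttt{""}) + \texttt{"} & \text{if } F \text{ contains } d \lor \texttt{"} \lor \texttt{\textbackslash n} \lor \texttt{\textbackslash r} \\
F & \text{otherwise}
\end{cases}
$$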

Where d = the chosen delimiter character (default ,). The unescape operation is the inverse:

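$$
\operatorname{unescape}(F) =
\begin{cases}
\operatorname{replace}(\operatorname{strip\_quotes}(F),\ \texttt{""},\ \texttt{"}) & \text{if } F \text{ starts and ends with } \texttt{"} \\
F & \text{otherwise}
\end{cases}
$$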
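
The two rules above translate almost directly into code. The following is a minimal sketch, not this tool's actual source; the function names escape_field and unescape_field are assumptions chosen for illustration.

```python
def escape_field(field: str, delimiter: str = ",") -> str:
    """Quote the field only if it contains the delimiter, a double quote,
    LF, or CR; double every internal quote when quoting."""
    needs_quotes = any(ch in field for ch in (delimiter, '"', "\n", "\r"))
    if not needs_quotes:
        return field
    return '"' + field.replace('"', '""') + '"'

def unescape_field(field: str) -> str:
    """Inverse of escape_field: strip the outer quotes and collapse
    doubled quotes; leave unquoted fields untouched."""
    if len(field) >= 2 and field.startswith('"') and field.endswith('"'):
        return field[1:-1].replace('""', '"')
    return field

original = 'He said, "hello"'
escaped = escape_field(original)
print(escaped)  # "He said, ""hello"""
assert unescape_field(escaped) == original
```

Passing delimiter=";" changes only which character triggers quoting; the quote-doubling rule stays the same.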

The parser uses a finite state machine with 4 states: S ∈ {FIELD_START, UNQUOTED, QUOTED, QUOTE_IN_QUOTED}. Transition on each character c is deterministic, guaranteeing O(n) parsing with no backtracking.
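
To make the four states concrete, here is a hedged sketch of such a parser. It reuses the state names from the text, but everything else is an assumption: the function name parse_csv, the lenient handling of stray characters after a closing quote, and the one-character peek that treats CRLF as a single row terminator.

```python
from enum import Enum, auto

class State(Enum):
    FIELD_START = auto()
    UNQUOTED = auto()
    QUOTED = auto()
    QUOTE_IN_QUOTED = auto()

def parse_csv(text, delimiter=","):
    """Single left-to-right pass; returns a list of rows (lists of strings)."""
    rows, row, field = [], [], []
    state = State.FIELD_START

    def end_field():
        row.append("".join(field))
        field.clear()

    def end_row():
        end_field()
        rows.append(row.copy())
        row.clear()

    i, n = 0, len(text)
    while i < n:
        c = text[i]
        if state is State.QUOTED:
            if c == '"':
                state = State.QUOTE_IN_QUOTED   # closing quote or first half of ""
            else:
                field.append(c)                 # delimiters and newlines are literal here
        elif state is State.QUOTE_IN_QUOTED and c == '"':
            field.append('"')                   # doubled quote -> one literal quote
            state = State.QUOTED
        elif c == delimiter:
            end_field()
            state = State.FIELD_START
        elif c in "\r\n":
            if c == "\r" and i + 1 < n and text[i + 1] == "\n":
                i += 1                          # consume the LF of a CRLF pair
            end_row()
            state = State.FIELD_START
        elif state is State.FIELD_START and c == '"':
            state = State.QUOTED
        else:
            field.append(c)                     # ordinary character (lenient on stray quotes)
            state = State.UNQUOTED
        i += 1

    # Flush a final row that was not terminated by a newline.
    if state is not State.FIELD_START or field or row:
        end_row()
    return rows

print(parse_csv('id,note\r\n1,"He said, ""hi""\nbye"\r\n'))
```

Each character is examined once and the next state depends only on the current state and character, which is what gives the linear running time described above.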

Reference Data

| Character | Name | ASCII Code | Escape Behavior (RFC 4180) | Common Issue |
| --- | --- | --- | --- | --- |
| " | Double Quote | 34 | Doubled: "", field wrapped in quotes | Unescaped quotes break field boundaries |
| , | Comma | 44 | Field wrapped in double quotes | Splits one field into two columns |
| \n | Line Feed (LF) | 10 | Field wrapped in double quotes | Creates phantom extra rows |
| \r | Carriage Return (CR) | 13 | Field wrapped in double quotes | OS-dependent line ending issues |
| \r\n | CRLF | 13+10 | Field wrapped in double quotes | Windows vs Unix line ending mismatch |
| ; | Semicolon | 59 | No action (unless used as delimiter) | European CSV default delimiter confusion |
| \t | Tab | 9 | No action (unless TSV mode) | Invisible whitespace corruption |
| ' | Single Quote | 39 | No action per RFC 4180 | Some parsers treat as text qualifier |
| \ | Backslash | 92 | No action per RFC 4180 | Some tools use as escape char (non-standard) |
| = | Equals | 61 | No action per RFC 4180 | Excel formula injection risk (=CMD()) |
| + | Plus | 43 | No action per RFC 4180 | Excel formula injection risk |
| @ | At Sign | 64 | No action per RFC 4180 | Excel formula injection risk |
| - | Hyphen | 45 | No action per RFC 4180 | Excel may interpret as formula prefix |
| (empty) | Empty Field | - | Represented as ,, or ,"", | NULL vs empty string ambiguity |
| (space) | Leading/Trailing Space | 32 | Wrap in quotes to preserve | Trimmed silently by many parsers |

Frequently Asked Questions

How are double quotes inside a field escaped?
Per RFC 4180, the field is first enclosed in double quotes. Then every double quote character inside the field is doubled. For example, a field value of He said, "hello" becomes "He said, ""hello""". The outer quotes delimit the field, and the doubled inner quotes represent literal quote characters. The parser state machine handles this via the QUOTE_IN_QUOTED state: when it encounters a quote inside a quoted field, it checks whether the next character is also a quote (escaped literal) or a delimiter/newline (end of field).
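
As a cross-check of the example above, Python's standard csv module produces the same escaping by default. This is only an independent illustration and says nothing about how this tool is implemented.

```python
import csv
import io

buf = io.StringIO()
# Default dialect: quote only when needed, double any internal quotes.
csv.writer(buf, lineterminator="\n").writerow(['He said, "hello"', "plain"])
line = buf.getvalue()
print(line, end="")  # "He said, ""hello""",plain

# Reading the line back restores the original values.
parsed = next(csv.reader([line]))
print(parsed)  # ['He said, "hello"', 'plain']
```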
Why does Excel still misinterpret correctly escaped fields?
Excel applies its own parsing heuristics beyond RFC 4180. Fields starting with =, +, -, or @ are interpreted as formulas, not text. This is a known CSV injection vector. Additionally, Excel may auto-detect delimiters differently based on your system locale; German and French Windows installations default to semicolons. Wrapping such fields in double quotes does not prevent formula execution in Excel. As a workaround, you must prefix them with a single quote or tab character.
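
The single-quote workaround can be applied before escaping. The sketch below is a hypothetical helper (neutralize_formula is not a function of this tool), and it assumes every field starting with one of the four characters should be treated as text, which also affects legitimate values such as negative numbers.

```python
FORMULA_PREFIXES = ("=", "+", "-", "@")

def neutralize_formula(field: str) -> str:
    """Prefix a single quote so spreadsheet software treats the value as
    text. Assumption: applied to every field, including numbers like -5."""
    if field.startswith(FORMULA_PREFIXES):
        return "'" + field
    return field

print(neutralize_formula("=CMD()"))  # '=CMD()
print(neutralize_formula("total"))   # total
```

Apply this step first and the RFC 4180 quoting afterwards, since the two address different problems.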
Does CSV support backslash escaping?
RFC 4180 specifies only double-quote escaping: special characters trigger field quoting, and internal quotes are doubled (""). Backslash escaping (\, or \") is not part of the CSV standard. Some tools like MySQL's LOAD DATA INFILE use backslash escaping, but this is non-standard. Mixing the two conventions in one file is a reliable source of parse failures, because a reader configured for one convention misreads fields written in the other. This tool uses strictly RFC 4180 double-quote escaping.
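
The mismatch is easy to reproduce with Python's csv module, used here purely as an illustration: a backslash-escaped line is recovered only by a reader that has been told about the backslash convention.

```python
import csv

# A field written with backslash escaping (non-standard).
line = r'"say \"hi\""'

# Reader configured for backslash escaping recovers the intended value.
backslash_aware = next(csv.reader([line], escapechar="\\", doublequote=False))
print(backslash_aware)  # ['say "hi"']

# Default reader (double-quote convention) does not recover the same value.
default_reader = next(csv.reader([line]))
print(default_reader == backslash_aware)  # False
```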
How are empty fields and NULL values handled?
An empty field between two delimiters (,,) represents an empty string. There is no native NULL representation in RFC 4180. Some systems use ,"", for empty strings and ,, for NULL, but this convention is not standardized. PostgreSQL's COPY command uses \N for NULL. When unescaping, this tool treats "" (quoted empty) and an empty span between delimiters identically as empty strings. Distinguish NULL semantics in your application layer, not in the CSV.
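
Python's csv reader shows the same ambiguity: quoted and unquoted empty fields both come back as empty strings. The second half of the sketch shows one possible application-layer convention; the NULL_TOKEN marker is an assumption borrowed from the PostgreSQL example above, not something this tool does.

```python
import csv

# Unquoted empty (,,) and quoted empty (,"",) are indistinguishable after parsing.
row = next(csv.reader(['a,,"",b']))
print(row)  # ['a', '', '', 'b']

# Assumed convention: a dedicated marker is mapped to None after parsing.
NULL_TOKEN = r"\N"
raw = next(csv.reader([r'a,\N,""']))
decoded = [None if value == NULL_TOKEN else value for value in raw]
print(decoded)  # ['a', None, '']
```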
Can a quoted field contain line breaks?
Yes. The finite state machine parser tracks whether the current position is inside a quoted field. A newline character (\n or \r\n) encountered in the QUOTED state is treated as part of the field value, not as a row terminator. This is the primary reason a state machine is required instead of a simple line-by-line split. A naive split("\n") approach will break any CSV containing multiline text fields such as addresses or descriptions.
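
The difference between the two approaches shows up on even a two-row file. This sketch uses Python's csv module as a stand-in for a quote-aware parser; the sample data is made up for illustration.

```python
import csv
import io

text = 'id,address\r\n1,"12 Main St\nSpringfield"\r\n'

# Naive approach: every newline is treated as a row boundary.
naive_rows = [line for line in text.split("\n") if line]
print(len(naive_rows))  # 3

# Quote-aware parsing keeps the embedded newline inside the field.
parsed_rows = list(csv.reader(io.StringIO(text, newline="")))
print(len(parsed_rows))  # 2
print(parsed_rows[1])    # ['1', '12 Main St\nSpringfield']
```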
Can I use a delimiter other than a comma?
Yes. The delimiter selector lets you choose comma (,), semicolon (;), tab (\t), or pipe (|). The escape rules remain identical regardless of delimiter: any field containing the chosen delimiter, a double quote, or a newline gets quoted. European datasets frequently use semicolons. Tab-separated values (TSV) rarely need escaping since tabs are uncommon in natural text, but the tool handles them correctly when present.