About

CSV files encoded per RFC 4180 wrap fields in double-quote characters (") whenever the field contains a delimiter, a newline, or the quote character itself. Many downstream systems - flat-file importers, legacy databases, fixed-width loaders - choke on these quotes or interpret them as literal data. Manually stripping them in a text editor risks destroying fields that legitimately contain the delimiter, producing column-shift errors that cascade silently through an entire dataset. This tool implements a character-level state-machine parser that distinguishes structural quotes from literal content, letting you remove only the wrapping quotes while preserving escaped interior quotes and field integrity.

Three removal modes are provided. "Smart" mode removes quotes only from fields that do not require them (no embedded delimiter, newline, or quote character). "All Surrounding" mode strips every field's outer quotes regardless of content. "Global" mode deletes every " character, which is useful only when you are certain no field contains intentional quote characters. The tool auto-detects the delimiter by frequency analysis of the first 5 lines, supporting comma, semicolon, tab, and pipe. Limitations: the tool assumes well-formed CSV. A malformed file with unbalanced quotes triggers a diagnostic warning that reports the offending line number.


Formulas

The parser operates as a deterministic finite automaton (DFA) with three states per field:

S0 = FIELD_START: if char = Q, transition to S1
S1 = INSIDE_QUOTED: if char = Q, transition to S2; any other char is field content
S2 = QUOTE_END_OR_ESCAPE: if next char = Q, emit a literal quote and return to S1; otherwise the quoted field has ended

Where Q is the configured quote character (default "). A field is classified as "quote-necessary" when its content satisfies:

$$\text{needsQuote}(\text{field}) = (\text{field contains } D) \lor (\text{field contains } Q) \lor (\text{field contains } \texttt{\textbackslash n})$$

Where D is the detected delimiter character. In "Smart" mode, quotes are preserved when needsQuote returns TRUE. In "All Surrounding" mode, outer quotes are always stripped and interior escaped quotes ("") are reduced to single quotes ("). In "Global" mode, every instance of Q is deleted without field-boundary awareness.
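
The states and modes described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's actual source: the names `parse_record`, `needs_quote`, and `remove_quotes` are hypothetical, the sketch handles a single record at a time, and it adds a fourth `UNQUOTED` state for plain field content, which the three-state description leaves implicit.

```python
# Hypothetical sketch of the three documented states, plus an assumed
# UNQUOTED state for content that is not wrapped in quotes.
FIELD_START, INSIDE_QUOTED, QUOTE_END_OR_ESCAPE, UNQUOTED = range(4)

def parse_record(text, delim=",", quote='"'):
    """Split one record into (content, was_quoted) pairs."""
    fields, buf, quoted, state = [], [], False, FIELD_START
    for ch in text:
        if state == INSIDE_QUOTED:
            if ch == quote:
                state = QUOTE_END_OR_ESCAPE   # closing quote or first half of ""
            else:
                buf.append(ch)                # delimiters and newlines are content here
        elif state == QUOTE_END_OR_ESCAPE and ch == quote:
            buf.append(quote)                 # "" pair: emit one literal quote
            state = INSIDE_QUOTED
        elif ch == delim:
            fields.append(("".join(buf), quoted))
            buf, quoted, state = [], False, FIELD_START
        elif state == FIELD_START and ch == quote:
            quoted, state = True, INSIDE_QUOTED
        else:
            buf.append(ch)
            state = UNQUOTED
    if state == INSIDE_QUOTED:
        raise ValueError("unbalanced quote")
    fields.append(("".join(buf), quoted))
    return fields

def needs_quote(content, delim=",", quote='"'):
    """True when the content contains D, Q, or a newline."""
    return delim in content or quote in content or "\n" in content

def remove_quotes(text, mode, delim=",", quote='"'):
    """Apply 'smart', 'all', or 'global' removal to one record."""
    if mode == "global":
        return text.replace(quote, "")        # no field-boundary awareness
    out = []
    for content, was_quoted in parse_record(text, delim, quote):
        if mode == "smart" and was_quoted and needs_quote(content, delim, quote):
            # Keep the wrapping quotes and re-escape interior quotes.
            out.append(quote + content.replace(quote, quote * 2) + quote)
        else:
            out.append(content)
    return delim.join(out)

print(remove_quotes('"A",B,"C,D"', "smart"))
print(remove_quotes('"She said ""hi"""', "all"))
```

Under these assumptions, the first call keeps the quotes around `C,D` only, and the second returns the field with its outer quotes removed and the `""` pairs reduced to single quotes.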

Delimiter auto-detection scores each candidate by counting its occurrences on each of the first 5 lines. The score rewards a high mean per-line count and penalizes line-to-line variation, and a candidate with a zero mean is never selected:

$$\text{score}(D) = \frac{\overline{\text{count}}}{\sigma(\text{count}) + 1}$$

Where $\overline{\text{count}}$ is the mean occurrence per line and $\sigma$ is the standard deviation of the per-line counts. The delimiter with the highest score wins.
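
A minimal Python sketch of this scoring follows. It makes assumptions the description does not settle: the population standard deviation is used (`statistics.pstdev`), occurrences are counted naively without skipping delimiters inside quoted fields, and the function name `detect_delimiter` is hypothetical.

```python
from statistics import mean, pstdev

CANDIDATES = [",", ";", "\t", "|"]

def detect_delimiter(text, sample_lines=5):
    """Return the candidate with the highest mean / (stdev + 1), or None."""
    lines = text.splitlines()[:sample_lines]
    best, best_score = None, 0.0
    for d in CANDIDATES:
        counts = [line.count(d) for line in lines]
        if not counts or mean(counts) == 0:
            continue  # a candidate that never appears cannot win
        score = mean(counts) / (pstdev(counts) + 1)
        if score > best_score:
            best, best_score = d, score
    return best

sample = "a;b;c\n1,5;2;3\n4;5;6\n"
print(repr(detect_delimiter(sample)))
```

In the sample, the semicolon appears exactly twice on every line (mean 2, deviation 0, score 2), while the comma appears on only one line, so its score is well below 1 and the semicolon is chosen.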

Reference Data

| Scenario | Original Field | Smart Mode Output | All Surrounding Output | Global Output |
| --- | --- | --- | --- | --- |
| Simple text, no special chars | "Hello" | Hello | Hello | Hello |
| Field contains comma | "New York, NY" | "New York, NY" (kept) | New York, NY | New York, NY |
| Field contains escaped quote | "She said ""hi""" | "She said ""hi""" (kept) | She said "hi" | She said hi |
| Numeric field quoted | "12345" | 12345 | 12345 | 12345 |
| Empty quoted field | "" | (empty) | (empty) | (empty) |
| Field with newline | "Line1\nLine2" | "Line1\nLine2" (kept) | Line1\nLine2 | Line1\nLine2 |
| Field with delimiter & quote | "Price is $5, ""final""" | "Price is $5, ""final""" (kept) | Price is $5, "final" | Price is $5, final |
| Unquoted field (no change) | Hello | Hello | Hello | Hello |
| Tab-delimited quoted | "Data" (tab sep) | Data | Data | Data |
| Single-quote (not affected) | 'Value' | 'Value' | 'Value' | 'Value' |
| Pipe-delimited with quotes | "A\|B" (pipe sep) | "A\|B" (kept) | A\|B | A\|B |
| Semicolon-delimited | "München;Berlin" | "München;Berlin" (kept) | München;Berlin | München;Berlin |
| Mixed: some fields quoted | "A",B,"C,D" | A,B,"C,D" | A,B,C,D | A,B,C,D |
| Whitespace around quotes | "Data" (surrounded by spaces) | Data | Data | Data |
| Custom quote char (') | 'Hello' | Hello (if configured) | Hello (if configured) | Hello (if configured) |

Frequently Asked Questions

Per RFC 4180, a literal double-quote inside a quoted field is represented as two consecutive double-quotes (""). When the tool strips a field's surrounding quotes in "All Surrounding" mode, it also un-escapes these pairs, converting "" back to a single ". "Smart" mode leaves such a field untouched, because a field that contains the quote character still needs its wrapping quotes. In "Global" mode, every quote character is simply deleted, so interior quotes vanish entirely. Choose your mode based on whether downstream systems expect escaped or literal quotes.
The tool samples the first 5 lines and counts occurrences of four candidate delimiters: comma, semicolon, tab, and pipe. It calculates a consistency score for each - the mean count divided by (standard deviation + 1). The candidate with the highest score (most consistent across lines) is selected. You can also override the auto-detected delimiter manually in the settings panel.
In "Smart" mode, quotes around a field that contains the delimiter are not removed. The parser checks each field's content: if it contains the active delimiter, a newline, or the quote character, the surrounding quotes are preserved. In "All Surrounding" mode, those quotes are stripped regardless, which means re-importing the output into a CSV parser would cause column misalignment. Use "All Surrounding" only when you are exporting to a non-CSV target (plain text, fixed-width, or display).
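
The misalignment risk is easy to reproduce with Python's standard `csv` module. The snippet below is only an illustration: the `stripped` string is written by hand to represent what "All Surrounding" output would look like for this row, and it is not produced by the tool.

```python
import csv
import io

original = '"New York, NY",10001\n'
stripped = 'New York, NY,10001\n'  # hand-written stand-in for "All Surrounding" output

rows_before = list(csv.reader(io.StringIO(original)))
rows_after = list(csv.reader(io.StringIO(stripped)))

print(len(rows_before[0]), rows_before[0])  # two columns
print(len(rows_after[0]), rows_after[0])    # three columns: the city field was split
```

The quoted row parses into two columns, while the stripped row parses into three, shifting every later column one position to the right.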
A custom quote character is supported. The settings panel includes a "Quote Character" option, which you can set to a single quote ('), backtick (`), or any other single character. The parser then treats that character as the field enclosure. This is useful for non-standard CSV exports from systems like MySQL's SELECT INTO OUTFILE, which can use arbitrary enclosure characters.
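
For comparison, Python's standard `csv` module exposes the same idea through its `quotechar` parameter. The sketch below is not the tool's implementation. It shows only how a single-quote enclosure is read and then written back with minimal quoting, which is roughly comparable to "Smart" mode.

```python
import csv
import io

data = "'Hello','It''s fine, really'\n"

# Read with the single quote as the enclosure character.
rows = list(csv.reader(io.StringIO(data), quotechar="'"))

# Write back with the default QUOTE_MINIMAL policy: only fields that
# contain the delimiter or the quote character are wrapped again.
out = io.StringIO()
csv.writer(out, quotechar="'", lineterminator="\n").writerows(rows)

print(rows)
print(out.getvalue())
```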
The tool processes data entirely in the browser. For files under 5 MB (roughly 50,000-100,000 rows), processing is near-instantaneous. Files up to 50 MB are supported with a progress indicator and chunked processing via requestAnimationFrame to prevent browser freezing. Beyond 50 MB, browser memory limits may apply depending on your device. The tool will warn you if the input exceeds the recommended threshold.
The state-machine parser correctly distinguishes between newlines that are part of a field's content (inside quotes) and newlines that terminate a row (outside quotes). When surrounding quotes are removed in "All Surrounding" mode, embedded newlines remain in the field content. This means the output row count may appear different from the input row count if you view it in a plain text editor, but structurally the data is correct.
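
The difference between physical lines and logical rows can be checked with Python's standard `csv` module. This is an independent illustration using made-up sample data, not output from the tool.

```python
import csv
import io

text = 'id,note\n1,"Line1\nLine2"\n2,plain\n'

physical_lines = text.count("\n")            # what a text editor shows
rows = list(csv.reader(io.StringIO(text)))   # what a CSV parser sees

print(physical_lines, len(rows))
print(rows[1])
```

A line-based view reports four lines, while the parser yields three rows, with the newline preserved inside the second field of the middle row.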