About

CSV has no universally followed convention for field quoting. Some systems export with double quotes ("), others use single quotes ('), and some omit quoting entirely, which causes parse failures when fields contain the delimiter character or line breaks. Importing a file quoted with ' into a parser expecting " produces corrupted columns, lost rows, or silent data truncation. This tool performs deterministic quote character replacement on CSV data using a finite-state tokenizer compliant with RFC 4180. It correctly handles escaped quotes (doubled characters), fields containing embedded delimiters, and multiline field values.

The tool does not naively find-and-replace characters. It parses the full CSV structure, then re-serializes with the target quote character and chosen quoting strategy. This matters because a raw replacement of " with ' will break any field that already contains a literal apostrophe. The tokenizer resolves this by applying proper escape sequences during re-serialization. Limitations: the tool assumes consistent quoting within the source file. Mixed-quoting files (some fields single-quoted, others double-quoted) require manual inspection first.
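The difference between raw replacement and parse-then-re-serialize can be illustrated with a short sketch. It uses Python's standard `csv` module as a stand-in for the tool's own tokenizer (an assumption; the tool's internal code is not shown here), and the sample data is invented for illustration.

```python
import csv
import io

# Sample input (hypothetical): one field holds an apostrophe, one holds a comma.
source = '"it\'s here","New York, NY"\n'

# Naive approach: swap every double quote for a single quote.
naive = source.replace('"', "'")

# Structure-aware approach: parse with the source quote character,
# then re-serialize with the target quote character.
rows = list(csv.reader(io.StringIO(source), quotechar='"'))
out = io.StringIO()
writer = csv.writer(out, quotechar="'", quoting=csv.QUOTE_MINIMAL,
                    lineterminator="\n")
writer.writerows(rows)
converted = out.getvalue()

# Read both results back with a single-quote parser and compare.
naive_rows = list(csv.reader(io.StringIO(naive), quotechar="'"))
converted_rows = list(csv.reader(io.StringIO(converted), quotechar="'"))

print(converted)               # 'it''s here','New York, NY'
print(converted_rows == rows)  # True: the data survives the conversion
print(naive_rows == rows)      # False: the apostrophe was read as a quote
```

The naive output no longer round-trips because the apostrophe in the first field is taken for a closing quote, while the re-serialized output doubles it.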


Formulas

The CSV tokenizer operates as a finite-state machine with four states. Each input character triggers a transition that determines whether it belongs to the current field, ends the field, or modifies the quoting context.

S ∈ { FIELD_START, UNQUOTED, QUOTED, QUOTE_ESCAPE }

Transition rules govern parsing behavior:

FIELD_START + Q → QUOTED
QUOTED + Q → QUOTE_ESCAPE
QUOTE_ESCAPE + Q → QUOTED (literal quote appended)
QUOTE_ESCAPE + D → FIELD_START (field ends)

Where Q = current quote character, D = delimiter character. During re-serialization, the quoting strategy determines which fields receive the target quote character Q_target:

needsQuote(field) = field contains D ∨ field contains Q_target ∨ field contains newline

Escape within re-serialized fields uses doubling: every occurrence of Q_target inside a field value is replaced with Q_target Q_target.
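The state machine and the re-serialization rule above can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's actual source: the names `tokenize`, `serialize`, and `needs_quote` are hypothetical, `\r\n` is normalized to `\n` up front for brevity, and a stray character after a closing quote is simply appended to the field.

```python
from enum import Enum, auto


class State(Enum):
    FIELD_START = auto()
    UNQUOTED = auto()
    QUOTED = auto()
    QUOTE_ESCAPE = auto()


def tokenize(text, quote='"', delim=","):
    """Parse CSV text into a list of rows using the four-state machine."""
    rows, row, field = [], [], []
    state = State.FIELD_START

    def end_field():
        row.append("".join(field))
        field.clear()

    def end_row():
        end_field()
        rows.append(list(row))
        row.clear()

    # Simplification (assumption): normalize line endings before parsing.
    for ch in text.replace("\r\n", "\n"):
        if state is State.QUOTED:
            if ch == quote:
                state = State.QUOTE_ESCAPE      # QUOTED + Q
            else:
                field.append(ch)                # delimiters/newlines are literal
        elif state is State.QUOTE_ESCAPE and ch == quote:
            field.append(quote)                 # doubled quote -> literal quote
            state = State.QUOTED
        elif ch == delim:
            end_field()                         # field ends
            state = State.FIELD_START
        elif ch == "\n":
            end_row()                           # row ends
            state = State.FIELD_START
        elif state is State.FIELD_START and ch == quote:
            state = State.QUOTED                # FIELD_START + Q
        else:
            field.append(ch)
            state = State.UNQUOTED
    # Flush a final row that is not terminated by a newline.
    if field or row or state is not State.FIELD_START:
        end_row()
    return rows


def needs_quote(field, q_target, delim):
    """needsQuote(field) as defined above."""
    return delim in field or q_target in field or "\n" in field


def serialize(rows, q_target="'", delim=","):
    """Re-serialize rows, quoting only where needed and doubling Q_target."""
    lines = []
    for row in rows:
        cells = []
        for value in row:
            if needs_quote(value, q_target, delim):
                value = q_target + value.replace(q_target, q_target * 2) + q_target
            cells.append(value)
        lines.append(delim.join(cells))
    return "\n".join(lines) + "\n"
```

For example, `serialize(tokenize('"it\'s here","New York, NY",plain\n'))` yields `'it''s here','New York, NY',plain` followed by a newline, and tokenizing that output again with `quote="'"` returns the same rows.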

Reference Data

| Quote Style | Character | Unicode | Common Usage | Escape Method | RFC 4180 |
| --- | --- | --- | --- | --- | --- |
| Double Quote | " | U+0022 | RFC 4180 standard, Excel, Google Sheets | Doubled: "" | Yes |
| Single Quote | ' | U+0027 | MySQL exports, some Unix tools | Doubled: '' | No |
| Backtick | ` | U+0060 | MySQL identifiers, Markdown | Doubled: `` | No |
| No Quotes | - | - | Simple numeric CSVs, TSV files | N/A (fields cannot contain delimiter) | Partial |
| Left Double Curly | “ | U+201C | Word processors, copy-paste errors | Doubled | No |
| Right Double Curly | ” | U+201D | Word processors, copy-paste errors | Doubled | No |
| Left Single Curly | ‘ | U+2018 | macOS auto-correct, rich text | Doubled | No |
| Right Single Curly | ’ | U+2019 | macOS auto-correct, rich text | Doubled | No |
| Guillemet Double | «» | U+00AB / U+00BB | European locales, French text | Rare | No |
| Comma Delimiter | , | U+002C | Default CSV separator | - | Yes |
| Semicolon Delimiter | ; | U+003B | European locales (decimal comma conflict) | - | No |
| Tab Delimiter | \t | U+0009 | TSV files, database exports | - | No |
| Pipe Delimiter | \| | U+007C | Legacy systems, HL7 medical data | - | No |
| Caret Delimiter | ^ | U+005E | Mainframe exports | - | No |
| Tilde Delimiter | ~ | U+007E | EDI/X12 transactions | - | No |

Frequently Asked Questions

The tool escapes them using the RFC 4180 doubling convention. If you convert to single quotes and a field contains the text it's here, the output becomes 'it''s here'. The embedded single quote is doubled so parsers correctly interpret it as a literal character rather than a field terminator.
The finite-state tokenizer tracks whether the parser is inside a quoted field. While in the QUOTED state, newline characters (both \n and \r\n) are treated as literal content within the field, not as row terminators. This preserves multiline address fields, notes, and descriptions without splitting them into separate rows.
If any field contains the delimiter character (e.g., a comma inside "New York, NY"), removing quotes makes the parser interpret that comma as a field separator, splitting one field into two and misaligning all subsequent columns. The tool warns you when stripping quotes would cause this. Use the Quote When Necessary strategy instead to quote only fields that require it.
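A check of this kind could look like the sketch below. The function name `unsafe_to_strip` is hypothetical, and treating embedded line breaks as unsafe in addition to the delimiter is an assumption that goes slightly beyond the warning described above.

```python
def unsafe_to_strip(rows, delim=","):
    """Return (row_index, column_index) for every field that would be
    corrupted if quotes were removed from the output."""
    return [
        (r, c)
        for r, row in enumerate(rows)
        for c, value in enumerate(row)
        # Assumption: line breaks are flagged as well as the delimiter.
        if delim in value or "\n" in value or "\r" in value
    ]


rows = [
    ["id", "city"],
    ["1", "New York, NY"],
    ["2", "Boston"],
]
print(unsafe_to_strip(rows))  # [(1, 1)]
```

An empty result means every field can be written without quotes; any other result is a reason to fall back to the Quote When Necessary strategy.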
Yes. The auto-detect algorithm scans the first 5000 characters and checks for fields beginning with common quote characters (", ', `). It counts occurrences of each candidate appearing at field boundaries (after a delimiter or at line start) and selects the character with the highest boundary-adjacent frequency. If no clear winner is found, it defaults to double quote per RFC 4180.
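The boundary-counting heuristic described above can be approximated as follows. This is a sketch, not the tool's implementation: the name `detect_quote` is hypothetical, the delimiter is assumed to be known already, and "no clear winner" is interpreted as either no matches at all or a tie for first place.

```python
CANDIDATES = ('"', "'", "`")


def detect_quote(text, delim=",", sample_size=5000):
    """Guess the quote character from the first `sample_size` characters."""
    counts = {q: 0 for q in CANDIDATES}
    prev = "\n"  # treat the start of input as a line start
    for ch in text[:sample_size]:
        # Count a candidate only at a field boundary:
        # right after a delimiter or at the start of a line.
        if ch in counts and prev in (delim, "\n"):
            counts[ch] += 1
        prev = ch
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    (best, best_n), (_, second_n) = ranked[0], ranked[1]
    # Assumption: zero matches or a tie counts as "no clear winner".
    if best_n == 0 or best_n == second_n:
        return '"'
    return best


print(detect_quote("'a','b'\n'c','d'\n"))  # '
print(detect_quote("a,b\n1,2\n"))          # "
```

Because only boundary positions are counted, an apostrophe inside a word such as `it's` does not contribute to the single-quote tally.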
Yes. You can change the field delimiter independently of the quote character. For example, converting a comma-delimited file with double quotes to a semicolon-delimited file with single quotes. The tool re-parses with the source delimiter and re-serializes with the target delimiter, applying proper quoting to any field that contains the new delimiter character.
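A combined delimiter and quote conversion can be sketched with Python's standard `csv` module, used here as a stand-in for the tool's own parser. The function name `convert` and its parameter names are hypothetical.

```python
import csv
import io


def convert(text, src_delim=",", src_quote='"', dst_delim=";", dst_quote="'"):
    """Re-parse with the source settings, re-serialize with the target ones."""
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter=src_delim, quotechar=src_quote)
    out = io.StringIO(newline="")
    writer = csv.writer(out, delimiter=dst_delim, quotechar=dst_quote,
                        quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


print(convert('name,note\n"Doe, Jane","a;b"\nBob,plain\n'))
# name;note
# Doe, Jane;'a;b'
# Bob;plain
```

Note that `Doe, Jane` no longer needs quoting because the comma is not the target delimiter, while `a;b` now does because it contains the new one.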
The tool processes files up to 50 MB in the browser. Files under 5 MB are parsed synchronously for instant feedback. Larger files are processed in chunks to prevent the browser from becoming unresponsive. For files exceeding 50 MB, consider splitting them with a command-line tool first.
Word processors often replace straight quotes with typographic curly quotes (“ ” or ‘ ’). The auto-detect recognizes these as quote characters. When converting away from them, the tool treats the opening and closing variants as equivalent, stripping or replacing both with the selected target character.
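One way to handle paired opening and closing quotes is sketched below for a single line of input. It is a simplified illustration under stated assumptions: the name `tokenize_paired` is hypothetical, escaped (doubled) curly quotes and multiline fields are not handled, and a curly quote that appears mid-field is kept as literal text.

```python
def tokenize_paired(line, open_q="\u201c", close_q="\u201d", delim=","):
    """Split one line into fields, treating open_q/close_q as a quote pair."""
    fields, buf = [], []
    in_quotes = False
    at_start = True  # True only directly after a delimiter or at line start
    for ch in line:
        if in_quotes:
            if ch == close_q:
                in_quotes = False
            else:
                buf.append(ch)          # delimiters inside quotes are literal
        elif ch == delim:
            fields.append("".join(buf))
            buf = []
            at_start = True
            continue
        elif at_start and ch == open_q:
            in_quotes = True
        else:
            buf.append(ch)              # includes curly quotes mid-field
        at_start = False
    fields.append("".join(buf))
    return fields


line = "\u201cNew York, NY\u201d,it\u2019s,\u201cx\u201d"
print(tokenize_paired(line))  # ['New York, NY', 'it’s', 'x']
```

The resulting fields can then be written out with any target quote character, so the curly quotes at field boundaries disappear while the apostrophe inside `it’s` is preserved.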