CSV to JSON Array Converter
Convert CSV files to JSON arrays online. Paste CSV or upload a file, configure delimiters and headers, get clean JSON output instantly.
About
CSV parsing often fails silently. A misplaced quote, a delimiter inside a field value, a newline embedded in a cell - any of these corrupts your output if the parser is not RFC 4180 compliant. Most naive implementations split on commas or use a simple regex. That approach breaks the moment a quoted field contains the delimiter character. This tool implements a finite-state machine parser that handles quoted fields, escaped double-quotes (""), and embedded newlines correctly. It auto-detects the delimiter by frequency analysis across the first 5 rows, supporting comma, semicolon, tab, and pipe. Limitations: the parser assumes UTF-8 encoding. Binary (non-text) input and files exceeding 10 MB are rejected to prevent browser memory issues.
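To make the failure mode concrete, here is a small Python illustration. It compares a plain `split(",")` with a quote-aware parser; Python's standard `csv` module is used only as a stand-in for an RFC 4180 style parser and is not this tool's implementation. The sample line is made up for the example.

```python
import csv
import io

# Hypothetical sample row: the second field contains a comma,
# the third contains escaped double-quotes.
line = '1,"Smith, John","said ""hi"""'

# Naive approach: split on every comma, ignoring quotes.
naive = line.split(",")

# Quote-aware approach: the csv module tracks quoted fields.
parsed = next(csv.reader(io.StringIO(line)))

print(len(naive), naive)    # 4 pieces, with stray quote characters
print(len(parsed), parsed)  # 3 clean fields
```

The naive split produces four fragments instead of three fields and leaves the quote characters in place, which is the kind of corruption that goes unnoticed until the JSON is consumed downstream.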
Output is a standard JSON array of objects (keyed by header names) or a nested array of arrays if no headers are specified. Indentation is configurable. Pro tip: if your CSV originates from Excel on European locales, the delimiter is almost certainly a semicolon, not a comma. Validate your delimiter choice before processing large datasets. This tool does not round, coerce, or interpret field values - every value remains a string, preserving data fidelity.
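The two output shapes can be sketched in a few lines of Python. The function name `rows_to_json` and its parameters are hypothetical, and the handling of rows that are shorter or longer than the header (extra cells dropped, missing keys omitted) is an assumption for the sketch, not documented behavior of this tool.

```python
import json


def rows_to_json(rows, has_header=True, indent=2):
    """Serialize parsed rows (lists of strings) as a JSON array.

    With a header, each row becomes an object keyed by the header names.
    Without one, the rows are emitted as an array of arrays.
    """
    if has_header and rows:
        header, *data = rows
        # Assumption: zip() stops at the shorter of header and row,
        # so surplus cells are dropped and missing cells are omitted.
        result = [dict(zip(header, row)) for row in data]
    else:
        result = rows
    # Values are never converted: "007" and "1.50" stay strings.
    return json.dumps(result, indent=indent, ensure_ascii=False)


print(rows_to_json([["id", "price"], ["007", "1.50"]]))
print(rows_to_json([["007", "1.50"]], has_header=False, indent=None))
```

Keeping every value as a string means leading zeros and trailing decimal zeros survive the conversion; any numeric coercion is left to the consumer of the JSON.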
Formulas
The CSV parser operates as a finite-state machine with 4 states. For each character c in the input stream, the machine transitions between states to correctly segment fields.
Transition rules:
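A minimal Python sketch of such a machine follows. The four state names (`FIELD_START`, `UNQUOTED`, `QUOTED`, `QUOTE_IN_QUOTED`) and the function `parse_csv` are assumptions chosen for illustration; the tool's actual states and edge-case handling may differ. The sketch is lenient: characters that follow a closing quote are appended to the field rather than rejected.

```python
from enum import Enum, auto


class State(Enum):
    FIELD_START = auto()      # at the beginning of a field
    UNQUOTED = auto()         # inside an unquoted field
    QUOTED = auto()           # inside a quoted field
    QUOTE_IN_QUOTED = auto()  # saw a quote while inside a quoted field


def parse_csv(text, delimiter=","):
    """Parse CSV text into a list of rows, each a list of strings."""
    rows, row, field = [], [], []
    state = State.FIELD_START
    # Simplification: normalize CRLF and lone CR to LF up front.
    # This also rewrites line endings inside quoted fields.
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    def end_field():
        row.append("".join(field))
        field.clear()

    def end_row():
        end_field()
        rows.append(list(row))
        row.clear()

    for c in text:
        if state is State.QUOTED:
            if c == '"':
                state = State.QUOTE_IN_QUOTED  # closing quote or escape?
            else:
                field.append(c)  # delimiters and newlines are literal here
        elif state is State.QUOTE_IN_QUOTED and c == '"':
            field.append('"')  # "" inside quotes is one literal quote
            state = State.QUOTED
        elif c == '"' and state is State.FIELD_START:
            state = State.QUOTED  # opening quote
        elif c == delimiter:
            end_field()
            state = State.FIELD_START
        elif c == "\n":
            end_row()
            state = State.FIELD_START
        else:
            field.append(c)
            state = State.UNQUOTED

    # Flush the last record when the input lacks a trailing newline.
    if state is not State.FIELD_START or field or row:
        end_row()
    return rows


sample = 'name,note\r\n"Smith, J.","said ""hi""\nbye"\r\n'
print(parse_csv(sample))
```

Because the `QUOTED` state treats delimiters and newlines as ordinary characters, the comma in `Smith, J.` and the line break in the note stay inside their fields.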
Delimiter auto-detection counts the occurrences of each candidate delimiter on each of the first n = 5 lines. The candidate with the lowest coefficient of variation (the most consistent count per line) is selected:
$$\mathrm{CV} = \frac{\sigma}{d}$$
Where σ is the standard deviation of the delimiter counts across the sampled lines, and d is the mean count. A delimiter with zero variance (identical count on every line) is strongly preferred. Ties are broken by priority order: comma > semicolon > tab > pipe.
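The selection rule can be sketched in Python as follows. The function name `detect_delimiter` is hypothetical, and three details are assumptions not stated above: the population standard deviation is used, candidates that never appear (mean count 0, where the ratio is undefined) are skipped, and comma is returned when no candidate appears at all. The sketch also counts raw characters per physical line, so it ignores quoting.

```python
from statistics import mean, pstdev

# Listed in tie-break priority order: comma > semicolon > tab > pipe.
CANDIDATES = [",", ";", "\t", "|"]


def detect_delimiter(text, sample_lines=5):
    """Pick the candidate whose per-line count is most consistent."""
    lines = [ln for ln in text.splitlines() if ln][:sample_lines]
    best, best_cv = CANDIDATES[0], float("inf")
    for cand in CANDIDATES:
        counts = [ln.count(cand) for ln in lines]
        if not counts or mean(counts) == 0:
            continue  # assumption: a candidate that never appears is skipped
        cv = pstdev(counts) / mean(counts)  # coefficient of variation
        if cv < best_cv:  # strict "<" keeps the higher-priority candidate on ties
            best, best_cv = cand, cv
    return best


# Semicolon-delimited data with a decimal comma in one cell.
print(repr(detect_delimiter("a;b;c\n1,5;2;3\n4;5;6\n")))
```

In the sample, the semicolon appears exactly twice on every line (zero variance), while the comma appears on only one line, so the semicolon is selected even though a comma is present.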
Reference Data
| Delimiter | Symbol | Common Source | Auto-Detect Pattern | RFC 4180 | Notes |
|---|---|---|---|---|---|
| Comma | , | US/UK Excel, Google Sheets export | Most consistent comma count per line | Yes (default) | Fails if decimal commas used |
| Semicolon | ; | European Excel (DE, FR, IT locales) | Most consistent semicolon count per line | Extension | Common in SAP exports |
| Tab | \t | TSV files, database dumps | Most consistent tab count per line | Extension | Rarely appears inside fields |
| Pipe | \| | Legacy mainframe systems, HL7 | Most consistent pipe count per line | Extension | Used in medical data (HL7v2) |
| Quoted Field | "..." | Any source with special chars | - | Yes | Encloses fields with delimiters/newlines |
| Escaped Quote | "" | Fields containing literal quotes | - | Yes | Two consecutive double-quotes = one literal |
| CRLF Line End | \r\n | Windows systems | - | Yes (required) | LF-only also accepted by most parsers |
| LF Line End | \n | Unix/macOS systems | - | Extension | De facto standard in web contexts |
| CR Line End | \r | Classic Mac OS (pre-X) | - | Extension | Extremely rare today |
| BOM Marker | \uFEFF | UTF-8 files from Windows Notepad | Leading bytes check (EF BB BF) | Not specified | Stripped automatically by this tool |
| Empty Field | ,, | Sparse datasets | - | Yes | Produces empty string, not null |
| Newline in Field | "a\nb" | Address fields, descriptions | - | Yes | Must be inside quotes per RFC 4180 |
| Trailing Delimiter | a,b,c, | Some DB exports | - | Ambiguous | Creates extra empty field per row |
| Header Row | - | Most structured exports | First row analysis | Optional | Used as JSON object keys |
| No Header | - | Raw sensor data, logs | - | Optional | Output becomes array of arrays |