CSV Columns to an Array Converter
Convert CSV columns into arrays for JavaScript, Python, PHP, Ruby, Java, C#, Go, Swift, Rust & TypeScript. RFC 4180 compliant parsing.
About
Manual extraction of columnar data from CSV files into programming-language arrays introduces transcription errors, mismatched quoting, and delimiter confusion. A single unescaped comma inside a quoted field breaks naive split(delimiter) logic. This tool implements an RFC 4180-compliant parser that correctly handles quoted fields containing delimiters, escaped double-quotes (""), and embedded newlines. It auto-detects the delimiter by scoring consistency of , ; \t | across sample rows, then transposes the row-major parsed matrix into column-major arrays. Output is generated with proper escaping for 10 target languages. The tool approximates type inference (numeric vs. string) but does not guarantee type safety for ambiguous values like 007 or locale-specific decimals (3,14 vs 3.14).
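The quoting rules described above can be illustrated with a small state-machine parser. This is a minimal sketch, not the tool's actual code: the function name `parse_csv` and its signature are assumptions made for illustration, and it covers only the cases named in the text (quoted delimiters, doubled quotes, embedded newlines).

```python
def parse_csv(text: str, delimiter: str = ",") -> list[list[str]]:
    """Parse CSV text into rows of fields, honoring RFC 4180 quoting.

    Hypothetical helper for illustration; blank lines yield a row with
    one empty field.
    """
    rows, row, field = [], [], []
    in_quotes = False
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if in_quotes:
            if ch == '"':
                if i + 1 < n and text[i + 1] == '"':
                    field.append('"')  # "" inside quotes is a literal quote
                    i += 1
                else:
                    in_quotes = False  # closing quote
            else:
                field.append(ch)  # delimiters and newlines are literal here
        elif ch == '"':
            in_quotes = True
        elif ch == delimiter:
            row.append("".join(field))
            field = []
        elif ch in "\r\n":
            if ch == "\r" and i + 1 < n and text[i + 1] == "\n":
                i += 1  # treat CRLF as a single line break
            row.append("".join(field))
            rows.append(row)
            row, field = [], []
        else:
            field.append(ch)
        i += 1
    # Flush the last record when the input lacks a trailing line break.
    if field or row:
        row.append("".join(field))
        rows.append(row)
    return rows
```

By contrast, a plain `'"New York, NY",10001'.split(",")` returns three pieces, which is the failure mode the parser avoids.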
Formulas
The delimiter auto-detection algorithm scores each candidate delimiter d using the mean and variance of the field counts it produces across sample rows. The delimiter with the highest score wins.
$$\text{score}(d) = \frac{\mu(\text{counts}_d)}{\sigma^2(\text{counts}_d) + 1}$$
Where σ²(counts_d) is the variance of the number of fields per row when split by delimiter d, and μ(counts_d) is the mean field count. The score is highest when every row produces the same number of fields (variance = 0) and the mean field count is maximized. The + 1 term prevents division by zero.
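The scoring idea can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the tool's implementation: it takes the score to be the mean field count divided by the variance plus one, counts fields with a naive split (ignoring quoting), and uses the hypothetical names `score_delimiter`, `detect_delimiter`, and `sample_size`.

```python
from statistics import mean, pvariance

# Candidate delimiters named in the About section.
CANDIDATES = [",", ";", "\t", "|"]


def score_delimiter(lines: list[str], d: str) -> float:
    """Assumed score: mean field count / (variance of field counts + 1)."""
    counts = [len(line.split(d)) for line in lines]  # naive split, no quoting
    return mean(counts) / (pvariance(counts) + 1)


def detect_delimiter(text: str, sample_size: int = 10) -> str:
    """Return the best-scoring candidate over the first non-blank lines."""
    lines = [ln for ln in text.splitlines() if ln.strip()][:sample_size]
    if not lines:
        return ","  # assumption: fall back to comma on empty input
    # max() keeps the first candidate on ties, so comma wins a draw.
    return max(CANDIDATES, key=lambda d: score_delimiter(lines, d))
```

A delimiter that never occurs yields one field per row and a score of 1, so any delimiter that splits rows consistently into two or more fields outranks it.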
Column transposition converts the row-major matrix M of dimensions r × c into c arrays of length r:
$$C_j = [M_{1,j}, M_{2,j}, \ldots, M_{r,j}], \quad j = 1, \ldots, c$$
Where r = total data rows (excluding the header if selected), c = maximum column count across all rows, and missing cells in ragged rows are filled with empty strings.
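As a rough sketch of the transposition step, the function below pads ragged rows with empty strings while building the columns. The name `transpose` and the `has_header` flag are hypothetical, and computing the width over every row including the header is an assumption.

```python
def transpose(rows: list[list[str]], has_header: bool = False) -> list[list[str]]:
    """Turn row-major parsed rows into column-major arrays.

    Short rows are padded with empty strings so every column has the
    same length.
    """
    data = rows[1:] if has_header else rows
    # Assumption: the column count c is taken over all rows, header included.
    width = max((len(r) for r in rows), default=0)
    return [[r[j] if j < len(r) else "" for r in data] for j in range(width)]
```

For example, `transpose([["a", "b", "c"], ["1", "2"], ["3"]])` yields three columns of length three, with the gaps in the second and third columns filled by `""`.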
Reference Data
| Language | Array Syntax | String Quote | Numeric Handling | Trailing Comma |
|---|---|---|---|---|
| JavaScript | const arr = […] | Single or Double | Unquoted | Optional |
| TypeScript | const arr: string[] = […] | Single or Double | Unquoted | Optional |
| Python | arr = […] | Single or Double | Unquoted | Optional |
| PHP | $arr = […]; | Single or Double | Unquoted | Allowed |
| Ruby | arr = […] | Single or Double | Unquoted | Optional |
| Java | String[] arr = {…}; | Double only | Unquoted | Allowed |
| C# | string[] arr = {…}; | Double only | Unquoted | Allowed |
| Go | arr := []string{…} | Double only | Unquoted | Required in multi-line literals |
| Swift | let arr: [String] = […] | Double only | Unquoted | Optional |
| Rust | let arr: Vec<&str> = vec![…]; | Double only | Unquoted | Optional |

Delimiter Detection Scoring
| Delimiter | Notes |
|---|---|
| Comma (,) | RFC 4180 standard. Most common CSV delimiter worldwide. |
| Semicolon (;) | Common in European locales where comma is the decimal separator. |
| Tab (\t) | TSV format. Rarely appears inside field values. |
| Pipe (\|) | Used in legacy systems and database exports. |
| Colon (:) | Uncommon. Found in /etc/passwd and some log formats. |

RFC 4180 Edge Cases
| Edge Case | Behavior |
|---|---|
| Quoted comma | "New York, NY" → single field: New York, NY |
| Escaped quote | "She said ""hi""" → She said "hi" |
| Embedded newline | "Line1\nLine2" → single field with newline |
| Empty field | a,,c → three fields, middle is empty string |
| Ragged rows | Rows with fewer columns padded with empty strings |
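
To make the output step concrete, here is a minimal sketch of rendering one column as an array literal with string escaping and simple numeric inference. It covers only JavaScript and Python; the names `render`, `to_literal`, and `quote`, the templates, and the numeric pattern are all assumptions for illustration, not the tool's actual rules.

```python
import re

# Assumed numeric pattern: plain integers and dot decimals, no leading
# zeros, so ambiguous values such as 007 or 3,14 stay quoted as strings.
NUMERIC = re.compile(r"-?(0|[1-9]\d*)(\.\d+)?")

# Hypothetical templates following the array syntax in the table above.
TEMPLATES = {
    "javascript": "const {name} = [{items}];",
    "python": "{name} = [{items}]",
}


def quote(value: str) -> str:
    """Wrap a value in double quotes, escaping special characters."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
        .replace("\t", "\\t")
    )
    return f'"{escaped}"'


def to_literal(value: str) -> str:
    """Emit numbers unquoted and everything else as a quoted string."""
    return value if NUMERIC.fullmatch(value) else quote(value)


def render(name: str, column: list[str], language: str = "javascript") -> str:
    """Format one column as an array literal in the chosen language."""
    items = ", ".join(to_literal(v) for v in column)
    return TEMPLATES[language].format(name=name, items=items)
```

Keeping values with leading zeros quoted is a deliberately conservative choice here, since identifiers such as postal codes would otherwise lose digits.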