CSV to Matrix Converter
Convert CSV data to matrix format instantly. Export as JSON, NumPy, MATLAB, LaTeX, C array, R matrix, Markdown table, and Wolfram Mathematica formats.
About
Incorrect delimiter detection or mishandled quote escaping during CSV-to-matrix conversion silently corrupts data: a single misaligned column shifts every downstream calculation. This tool implements an RFC 4180-compliant state-machine parser that correctly handles quoted fields containing delimiters, newlines, and escaped double quotes. It auto-detects the delimiter among comma, semicolon, tab, and pipe by frequency analysis of the first 10 rows. Output is validated as a rectangular m × n matrix before export. Non-rectangular inputs are flagged and padded or truncated per user preference.
The converter exports to 8 target formats: JSON 2D array, Python NumPy, MATLAB, LaTeX bmatrix, Markdown table, C/C++ array initializer, R matrix(), and Wolfram Mathematica. Note: this tool treats all cell values as-is. Numeric validation is optional. Floating-point values using comma decimals (European notation) require semicolon or tab delimiters to avoid ambiguity.
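As a rough illustration of what a few of these exports look like, here is a minimal Python sketch. The function names (`to_json`, `to_numpy`, `to_matlab`, `to_latex_bmatrix`) are hypothetical and not the tool's own code, and the sketch assumes the cells have already been converted to numbers.

```python
import json

def to_json(matrix):
    # JSON 2D array, e.g. [[1, 2], [3, 4]]
    return json.dumps(matrix)

def to_numpy(matrix):
    # Python source text for a NumPy literal (NumPy itself is not needed here)
    rows = ", ".join("[" + ", ".join(str(v) for v in row) + "]" for row in matrix)
    return f"np.array([{rows}])"

def to_matlab(matrix):
    # MATLAB / Octave: spaces separate columns, semicolons separate rows
    return "[" + "; ".join(" ".join(str(v) for v in row) for row in matrix) + "]"

def to_latex_bmatrix(matrix):
    # LaTeX: & separates columns, \\ separates rows
    body = " \\\\ ".join(" & ".join(str(v) for v in row) for row in matrix)
    return "\\begin{bmatrix} " + body + " \\end{bmatrix}"

if __name__ == "__main__":
    m = [[1, 2], [3, 4]]
    for fmt in (to_json, to_numpy, to_matlab, to_latex_bmatrix):
        print(fmt(m))
```

String cells would need per-format quoting and escaping, which this sketch leaves out.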
Formulas
The delimiter auto-detection algorithm scores each candidate delimiter d by how consistently it occurs across the first k rows (default k = 10). The score depends on σ_d, the standard deviation of the delimiter count per row, and μ_d, the mean count. A perfect CSV has σ_d = 0, yielding the maximum score. The delimiter with the highest score wins.
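The exact scoring formula is not reproduced in the text, so the Python sketch below uses an assumed one: mean / (1 + σ). It is consistent with the description, since it rewards frequent delimiters and is largest when σ = 0, but it may differ from what the tool computes. The names `score_delimiter` and `detect_delimiter` are hypothetical.

```python
import statistics

# Candidate delimiters, in the tie-break order comma, semicolon, tab, pipe
CANDIDATES = [",", ";", "\t", "|"]

def score_delimiter(lines, delimiter):
    """Assumed score: mean count per row divided by (1 + population std dev)."""
    counts = [line.count(delimiter) for line in lines]
    if not counts:
        return 0.0
    mean = statistics.fmean(counts)
    sigma = statistics.pstdev(counts)
    return mean / (1.0 + sigma)

def detect_delimiter(text, k=10):
    """Return the best-scoring candidate over the first k non-empty lines."""
    lines = [ln for ln in text.splitlines() if ln][:k]
    # max() keeps the first maximal item, so ties resolve in CANDIDATES order
    return max(CANDIDATES, key=lambda d: score_delimiter(lines, d))

if __name__ == "__main__":
    sample = "a;b;c\n1,5;2,5;3\n4;5;6"
    print(repr(detect_delimiter(sample)))
```

In the sample, the semicolon appears exactly twice in every row (σ = 0), while the comma appears only in the row with European decimals, so the semicolon scores higher. This sketch counts characters naively and ignores quoting, so a delimiter inside a quoted field would be counted too.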
Matrix dimensions are reported as m × n, where m is the row count and n is the column count. For rectangular validation, the tool checks that len(row_i) = n for all i ∈ [0, m). Padding mode fills short rows with empty strings. Truncate mode clips every row to the minimum column count.
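The rectangular check and the two repair modes can be sketched in Python as follows. The names `is_rectangular` and `rectangularize` are hypothetical, and padding up to the longest row's length is an assumption, since the text only says that short rows are filled with empty strings.

```python
def is_rectangular(rows):
    """True when every row has the same length (or there are no rows)."""
    return len({len(r) for r in rows}) <= 1

def rectangularize(rows, mode="pad"):
    """Return (matrix, (m, n)) after padding or truncating ragged rows."""
    if not rows:
        return [], (0, 0)
    lengths = [len(r) for r in rows]
    if mode == "pad":
        n = max(lengths)  # assumption: pad up to the longest row
        out = [list(r) + [""] * (n - len(r)) for r in rows]
    elif mode == "truncate":
        n = min(lengths)  # clip every row to the shortest row
        out = [list(r[:n]) for r in rows]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return out, (len(out), n)

if __name__ == "__main__":
    ragged = [["1", "2", "3"], ["4"]]
    print(is_rectangular(ragged))
    print(rectangularize(ragged, "pad"))
    print(rectangularize(ragged, "truncate"))
```

Padding keeps every value but introduces empty cells. Truncating keeps the matrix dense but discards data from longer rows.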
The CSV parser uses a finite-state machine with 3 states: FIELD_START, UNQUOTED, and QUOTED. Transitions occur on delimiter character, quote character, or newline. The QUOTED state handles escaped quotes ("") by checking the next character before transitioning.
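A three-state parser of this kind could look roughly like the Python sketch below. It is an illustration of the technique, not the tool's source. The function name `parse_csv` is hypothetical, and the lenient handling of stray quotes and of text after a closing quote is an assumption.

```python
from enum import Enum, auto

class State(Enum):
    FIELD_START = auto()
    UNQUOTED = auto()
    QUOTED = auto()

def parse_csv(text, delimiter=","):
    rows, row, field = [], [], []
    state = State.FIELD_START
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if state is State.QUOTED:
            if ch == '"':
                if i + 1 < n and text[i + 1] == '"':
                    field.append('"')       # "" is an escaped quote
                    i += 1
                else:
                    state = State.UNQUOTED  # closing quote
            else:
                field.append(ch)            # delimiters and newlines are literal here
        elif ch == '"' and state is State.FIELD_START:
            state = State.QUOTED            # opening quote
        elif ch == delimiter:
            row.append("".join(field))
            field = []
            state = State.FIELD_START
        elif ch in "\r\n":
            if ch == "\r" and i + 1 < n and text[i + 1] == "\n":
                i += 1                      # treat CRLF as one line break
            row.append("".join(field))
            rows.append(row)
            row, field = [], []
            state = State.FIELD_START
        else:
            field.append(ch)
            state = State.UNQUOTED
        i += 1
    # Flush the last record when the input has no trailing line break
    if field or row or state is not State.FIELD_START:
        row.append("".join(field))
        rows.append(row)
    return rows

if __name__ == "__main__":
    print(parse_csv('a,"b,c"\n"he said ""hi""","x\ny"\n'))
```

In this sketch a blank line produces a row containing one empty field, so a caller that wants to skip blank lines would need to filter those rows out.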
Reference Data
| Output Format | Language / System | Syntax Pattern | Numeric Only | Supports Strings | Max Practical Size |
|---|---|---|---|---|---|
| JSON 2D Array | JavaScript / Universal | [[1,2],[3,4]] | No | Yes | ~50 MB |
| NumPy | Python | np.array([[1,2],[3,4]]) | Recommended | dtype=object | ~100k×100k |
| MATLAB | MATLAB / Octave | [1 2; 3 4] | Yes | No (cell array) | ~10k×10k |
| LaTeX bmatrix | LaTeX / TeX | \begin{bmatrix}...\end{bmatrix} | Recommended | Yes | ~50×50 (display) |
| Markdown Table | Markdown / GitHub | \| a \| b \| | No | Yes | ~1000 rows |
| C/C++ Array | C / C++ | int m[2][2] = {{1,2},{3,4}}; | Yes | No | Stack: ~1k×1k |
| R matrix() | R | matrix(c(1,2,3,4), nrow=2, byrow=TRUE) | Recommended | Yes | ~50k×50k |
| Wolfram | Mathematica | {{1,2},{3,4}} | No | Yes | ~10k×10k |
| CSV (Comma) | Universal | 1,2\n3,4 | No | Yes | ~500 MB |
| TSV (Tab) | Spreadsheet | 1\t2\n3\t4 | No | Yes | ~500 MB |

| Parsing Rule | Description |
|---|---|
| Delimiter Frequency | Auto-detection ranks: comma > semicolon > tab > pipe by occurrence in first 10 rows |
| RFC 4180 Rule 1 | Each record is on a separate line, delimited by a line break (CRLF) |
| RFC 4180 Rule 2 | Last record may or may not have an ending line break |
| RFC 4180 Rule 3 | First record may be a header (optional flag in this tool) |
| RFC 4180 Rule 4 | Fields may be enclosed in double quotes; fields containing delimiters must be quoted |
| RFC 4180 Rule 5 | Double quotes inside quoted fields are escaped by doubling: "" |