About

CSV parsing often fails silently. A stray comma inside an unquoted field shifts every subsequent column in that row one position to the right, corrupting downstream data pipelines and reports. This converter implements an RFC 4180-compliant finite state machine parser that handles quoted fields containing commas, embedded line breaks (CRLF inside double quotes), and escaped quote characters ("" sequences) rather than splitting naively on commas. The parser moves through four discrete states (FIELD_START, UNQUOTED, QUOTED, and QUOTE_IN_QUOTED) to produce a correct two-dimensional array regardless of field content.

Output formats include fixed-width aligned tables (column width w computed as the maximum cell length in each column), Markdown-compatible tables with pipe delimiters, tab-separated values for spreadsheet pasting, and key-value pair layouts for configuration files. The tool assumes UTF-8 input and treats every character as one column wide, so alignment is exact only for monospaced, single-width text. CJK and emoji characters will misalign in fixed-width modes because their display width is typically 2 columns, not 1.

Formulas

The CSV parser operates as a finite state machine with four states. Each character c at position i triggers a transition:

FIELD_START → QUOTED if c = '"'
FIELD_START → UNQUOTED if c ≠ delimiter
QUOTED → QUOTE_IN_QUOTED if c = '"'
QUOTE_IN_QUOTED → QUOTED if c = '"' (escaped quote)
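The transitions above can be sketched as a small character-by-character parser. This is a minimal illustration, not the tool's actual source: the function name `parseCsv`, the `State` object, and the lenient handling of stray characters after a closing quote are all assumptions made for the example.

```javascript
// Hypothetical sketch of a four-state CSV parser; names are illustrative.
const State = Object.freeze({
  FIELD_START: "FIELD_START",
  UNQUOTED: "UNQUOTED",
  QUOTED: "QUOTED",
  QUOTE_IN_QUOTED: "QUOTE_IN_QUOTED",
});

function parseCsv(text, delimiter = ",") {
  const rows = [];
  let row = [];
  let field = "";
  let state = State.FIELD_START;

  const endField = () => {
    row.push(field);
    field = "";
    state = State.FIELD_START;
  };
  const endRow = () => {
    endField();
    rows.push(row);
    row = [];
  };

  for (let i = 0; i < text.length; i++) {
    const c = text[i];
    const isNewline = c === "\n" || c === "\r";

    if (state === State.QUOTED) {
      // Inside quotes, delimiters and line breaks are literal content.
      if (c === '"') state = State.QUOTE_IN_QUOTED;
      else field += c;
      continue;
    }

    if (state === State.QUOTE_IN_QUOTED && c === '"') {
      // "" inside a quoted field collapses to one literal quote.
      field += '"';
      state = State.QUOTED;
    } else if (state === State.FIELD_START && c === '"') {
      state = State.QUOTED;
    } else if (c === delimiter) {
      endField();
    } else if (isNewline) {
      // Consume CRLF as a single row terminator.
      if (c === "\r" && text[i + 1] === "\n") i++;
      endRow();
    } else {
      // Assumption: stray characters after a closing quote are kept leniently.
      field += c;
      state = State.UNQUOTED;
    }
  }

  // Flush the last record when the input does not end with a line break.
  if (state !== State.FIELD_START || row.length > 0) endRow();
  return rows;
}
```

For example, `parseCsv('a,"b,c"\n')` yields a single row with the two fields `a` and `b,c`, because the comma inside the quotes never reaches the delimiter branch.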

For aligned table output, each column width w_j is computed across all m rows:

$$w_j = \max_{0 \le i < m} \operatorname{len}(\text{cell}_{i,j}) + \text{padding}$$

Where w_j is the display width of column j, len returns the string length of the cell content, and padding is the user-configured extra spacing (default 2). Each cell is then padded using padEnd(w_j) for left alignment or padStart(w_j) for right alignment.
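As a rough sketch of the width calculation and padding step described above, the following computes each column's width and pads cells with `padEnd` or `padStart`. The function names `columnWidths` and `formatAligned` and the options object are hypothetical, chosen for this example only.

```javascript
// Hypothetical helpers; only padEnd/padStart and the default padding of 2
// come from the text above.
function columnWidths(rows, padding = 2) {
  const widths = [];
  for (const row of rows) {
    row.forEach((cell, j) => {
      // cell.length counts UTF-16 code units, not display columns,
      // so wide characters (CJK, emoji) are not measured correctly.
      widths[j] = Math.max(widths[j] ?? 0, cell.length);
    });
  }
  return widths.map((w) => w + padding);
}

function formatAligned(rows, { padding = 2, align = "left" } = {}) {
  const widths = columnWidths(rows, padding);
  return rows
    .map((row) =>
      row
        .map((cell, j) =>
          align === "right" ? cell.padStart(widths[j]) : cell.padEnd(widths[j])
        )
        .join("")
    )
    .join("\n");
}

const sample = [
  ["id", "name"],
  ["1", "Alice"],
  ["22", "Bo"],
];
console.log(formatAligned(sample));
```

With the sample data, the longest cells are 2 and 5 characters, so the column widths come out as 4 and 7 once the default padding is added.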

Reference Data

Output Format	Separator	Alignment	Use Case	Header Row	Border Characters
Plain Columns	Space(s)	Left	Quick visual inspection	Optional	None
Aligned Table	Pipe |	Left-padded	Terminal / log output	Yes, with separator line	+ - |
Markdown Table	Pipe |	Left	Documentation, GitHub, wikis	Required (first row)	| ---
Tab-Separated (TSV)	Tab \t	None	Spreadsheet paste, data exchange	Preserved	None
Key-Value Pairs	Colon :	Right-aligned keys	Config files, record display	Used as keys	None
Custom Template	User-defined	User-defined	Custom report generation	Optional	User-defined
JSON Lines	N/A	N/A	Streaming data, log ingestion	Used as keys	{ }
HTML Table	N/A	N/A	Web embedding, emails	Uses <th>	HTML tags
Fixed Width	None (padding)	Left or Right	Mainframe, legacy systems	Optional	None
SQL INSERT	Comma	N/A	Database seeding	Used as column names	( )
XML Records	N/A	N/A	Enterprise data exchange	Used as tag names	< >
YAML	Colon :	Indented	Configuration, DevOps	Used as keys	-

Frequently Asked Questions

The parser uses a finite state machine compliant with RFC 4180. When it encounters an opening double quote at the start of a field, it enters the QUOTED state. In this state, commas and newline characters (both \n and \r\n) are treated as literal field content, not as delimiters or row terminators. The field only ends when an unescaped closing double quote is found followed by a delimiter, newline, or end of input. Escaped quotes (two consecutive double quotes "") are collapsed into a single literal quote character.

The converter normalizes all rows to the maximum column count found in the dataset. Short rows are padded with empty strings. For example, if row 1 has 5 columns and row 3 has only 3 columns, the output will treat row 3 as having 2 trailing empty cells. This prevents misalignment in fixed-width and table outputs. A warning indicator appears when inconsistent column counts are detected.
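The normalization step described above could look roughly like this. The function name `normalizeRows` and the shape of the returned object are assumptions for illustration; the `inconsistent` flag stands in for whatever drives the tool's warning indicator.

```javascript
// Hypothetical sketch: pad short rows with empty strings up to the widest row.
function normalizeRows(rows) {
  const maxCols = rows.reduce((max, row) => Math.max(max, row.length), 0);
  const inconsistent = rows.some((row) => row.length !== maxCols);
  const normalized = rows.map((row) =>
    row.concat(Array(maxCols - row.length).fill(""))
  );
  return { rows: normalized, maxCols, inconsistent };
}

const result = normalizeRows([
  ["a", "b", "c", "d", "e"],
  ["1", "2", "3", "4", "5"],
  ["x", "y", "z"],
]);
console.log(result.rows[2], result.inconsistent);
```

Here the third row gains two trailing empty cells, matching the 5-column versus 3-column example in the text.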

Many European locales use semicolons as delimiters because the comma serves as the decimal separator (e.g., 3,14 for pi). Select "Semicolon" from the delimiter dropdown. The parser also supports auto-detection: it counts occurrences of common delimiters (comma, semicolon, tab, pipe) in the first 5 lines and selects the most frequent one. Auto-detection accuracy exceeds 95% for well-formed files.
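A frequency-based detector along the lines described above might be sketched as follows. The function name `detectDelimiter`, its parameters, and the fallback to the first candidate when nothing is found are assumptions; this naive version also counts delimiters inside quoted fields, which a production detector would likely want to exclude.

```javascript
// Hypothetical sketch: pick the candidate that occurs most often
// in the first few lines of the input.
function detectDelimiter(text, candidates = [",", ";", "\t", "|"], sampleLines = 5) {
  const sample = text.split(/\r?\n/).slice(0, sampleLines).join("\n");
  let best = candidates[0]; // assumed fallback when no candidate appears
  let bestCount = 0;
  for (const d of candidates) {
    const count = sample.split(d).length - 1;
    if (count > bestCount) {
      best = d;
      bestCount = count;
    }
  }
  return best;
}

console.log(JSON.stringify(detectDelimiter("a;b;c\n1,5;2,5;3\n")));
```

In the sample input, semicolons outnumber the decimal commas four to two, so the semicolon is selected.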

The output follows GitHub Flavored Markdown (GFM) table syntax. The first row becomes the header, followed by a separator row using --- per column. All columns are left-aligned by default, and you can switch alignment per column if needed. The GFM specification only requires at least one dash per column in the separator row; this tool always produces exactly three, the conventional form. Pipes on the outer edges are included for maximum compatibility across parsers.
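To make the table layout concrete, here is a minimal sketch of a GFM table writer with outer pipes and a three-dash separator. The function name `toMarkdownTable` is hypothetical, and two behaviors are assumptions not stated in the text: literal pipes in cells are escaped as `\|`, and embedded line breaks are replaced with a space.

```javascript
// Hypothetical sketch: first row is the header, default (left) alignment only.
function toMarkdownTable(rows) {
  if (rows.length === 0) return "";
  // Assumption: escape pipes and flatten line breaks so each row stays on one line.
  const escapeCell = (cell) => cell.replace(/\|/g, "\\|").replace(/\r?\n/g, " ");
  const line = (cells) => "| " + cells.map(escapeCell).join(" | ") + " |";
  const [header, ...body] = rows;
  const separator = "| " + header.map(() => "---").join(" | ") + " |";
  return [line(header), separator, ...body.map(line)].join("\n");
}

console.log(toMarkdownTable([
  ["name", "role"],
  ["Ada", "dev|ops"],
]));
```

Per-column alignment would only change the separator cells (for example `:---`, `---:`, or `:---:`), which this sketch leaves out for brevity.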

The tool accepts files up to 10 MB. Parsing a 1 MB CSV (approximately 20,000 rows with 10 columns) completes in under 200 ms on modern hardware. Files between 5 and 10 MB trigger a progress indicator. The parser is single-pass with O(n) time complexity where n is the character count. Memory usage is approximately 2× the input size due to storing both the raw text and the parsed array structure.

The FileReader API reads files as UTF-8 by default. If your file uses a different encoding (e.g., Windows-1252 or ISO-8859-1), special characters like accented letters may display as replacement characters (U+FFFD). For best results, convert your file to UTF-8 before uploading. Most text editors (Notepad++, VS Code) can re-save with UTF-8 encoding. The tool displays a warning if replacement characters are detected in the parsed output.
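The replacement-character check described above can be approximated with the standard `TextDecoder` API. The function name `decodeUtf8` and its return shape are assumptions for this sketch, and the check cannot distinguish a decoding failure from a file that legitimately contains U+FFFD.

```javascript
// Hypothetical sketch: decode bytes as UTF-8 and flag U+FFFD in the result.
function decodeUtf8(bytes) {
  // The default (non-fatal) decoder substitutes U+FFFD for invalid sequences.
  const text = new TextDecoder("utf-8").decode(bytes);
  return { text, hasReplacementChars: text.includes("\uFFFD") };
}

// "café" encoded as Windows-1252: the single byte 0xE9 is not valid UTF-8.
const legacyBytes = new Uint8Array([0x63, 0x61, 0x66, 0xe9]);
console.log(decodeUtf8(legacyBytes).hasReplacementChars);

// The same word encoded as UTF-8 decodes cleanly.
const utf8Bytes = new Uint8Array([0x63, 0x61, 0x66, 0xc3, 0xa9]);
console.log(decodeUtf8(utf8Bytes).text);
```

In a browser, `FileReader.readAsText()` also accepts an optional encoding label as its second argument, which is one possible way to read a legacy-encoded file without converting it first.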