About

Raw CSV is hard to read without tooling. Misaligned columns invite misreading, especially in logs, reports, and terminal output, where monospaced formatting is the only visual structure available. This converter parses CSV input with an RFC 4180-compliant state machine that handles quoted fields containing commas, escaped double-quotes (""), and embedded newlines. It then computes the maximum character width w_j of each column j and pads every cell to that width, producing aligned fixed-width text output.

The tool auto-detects the input delimiter by analyzing the first 20 rows and scoring each candidate (comma, tab, semicolon, pipe) by how consistent a field count it produces. Output alignment is configurable per column: left, right, or center. Note: alignment assumes a monospaced font; proportional fonts break visual alignment regardless of padding. For datasets exceeding 10,000 rows, consider splitting the file first, because browser memory constraints apply.

Formulas

Column width calculation determines the padding required for alignment. For each column j in a dataset of m columns and n rows:

$$w_j = \max_{i = 1, \dots, n} \operatorname{len}(\text{cell}_{i,j})$$

The padded output width for each cell includes a gutter g (user-configurable, default 2 spaces):

$$\text{output}_{i,j} = \operatorname{pad}(\text{cell}_{i,j},\; w_j + g,\; \text{align})$$

Where pad applies left, right, or center spacing. For center alignment, left padding = floor((w_j − len(cell)) ÷ 2) and right padding absorbs the remainder.
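The width and padding rules can be sketched in a few lines of Python. This is an illustration only, not the tool's own code (the tool runs in the browser); the function names `column_widths`, `pad`, and `format_table` are hypothetical. The sketch assumes the gutter g is appended after each cell has been aligned within w_j, which keeps the center-alignment rule above intact, and it measures width with `len`, so wide characters (CJK, emoji) count as one column.

```python
def column_widths(rows):
    """Return w_j, the maximum cell length found in each column j."""
    n_cols = max(len(row) for row in rows)
    widths = [0] * n_cols
    for row in rows:
        for j, cell in enumerate(row):
            widths[j] = max(widths[j], len(cell))
    return widths


def pad(cell, width, align="left"):
    """Pad cell with spaces to exactly width characters (if it fits)."""
    gap = max(width - len(cell), 0)
    if align == "right":
        return " " * gap + cell
    if align == "center":
        left = gap // 2  # floor((w_j - len(cell)) / 2)
        return " " * left + cell + " " * (gap - left)  # right side takes the remainder
    return cell + " " * gap  # default: left-align


def format_table(rows, gutter=2, aligns=None):
    """Align every cell within w_j, then append the gutter (assumed placement)."""
    widths = column_widths(rows)
    aligns = aligns or ["left"] * len(widths)
    lines = []
    for row in rows:
        cells = [pad(cell, widths[j], aligns[j]) + " " * gutter
                 for j, cell in enumerate(row)]
        lines.append("".join(cells))
    return "\n".join(lines)
```

For example, `format_table([["id", "name"], ["1", "Alice"], ["22", "Bo"]])` yields three lines of equal length, with each column occupying w_j + g characters.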

Delimiter auto-detection scores each candidate d by computing the standard deviation σ of field counts per row. The delimiter with σ = 0 (perfectly consistent column count) and the highest median field count wins:

$$\text{score}(d) = \frac{\text{median\_fields}(d)}{1 + \sigma(d)}$$

Where median_fields(d) is the median number of fields per row when split by delimiter d, and σ(d) is the standard deviation of field counts across sampled rows.
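A rough Python sketch of this scoring follows. It is an illustration under stated assumptions, not the tool's implementation: Python's standard `csv` module stands in for the tool's own parser, σ is taken to be the population standard deviation, ties go to the earlier candidate in the list, and the names `field_counts`, `score`, and `detect_delimiter` are hypothetical.

```python
import csv
import io
import statistics

CANDIDATES = [",", "\t", ";", "|"]  # order doubles as tie-break priority


def field_counts(text, delimiter, sample_rows=20):
    """Count fields per row for the first sample_rows non-empty rows."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    counts = []
    for row in reader:
        if row:
            counts.append(len(row))
        if len(counts) >= sample_rows:
            break
    return counts


def score(counts):
    """score(d) = median_fields(d) / (1 + sigma(d))."""
    return statistics.median(counts) / (1 + statistics.pstdev(counts))


def detect_delimiter(text, candidates=CANDIDATES):
    """Return the candidate with the highest score; comma if input is empty."""
    best, best_score = ",", 0.0
    for d in candidates:
        counts = field_counts(text, d)
        if counts and score(counts) > best_score:
            best, best_score = d, score(counts)
    return best
```

A delimiter that never occurs in the input gives one field per row and therefore a score of 1, so any candidate that splits rows consistently into two or more fields outranks it.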

Reference Data

Delimiter	Character	Common Use	Auto-Detect Priority	RFC Standard
Comma	,	General CSV (spreadsheets, exports)	1	RFC 4180
Tab	\t	TSV files, database exports	2	IANA TSV
Semicolon	;	European CSV (locale uses comma as decimal)	3	De facto
Pipe	|	Unix utilities, log files	4	De facto
Space	\s	Fixed-width legacy formats	5	None
Colon	:	/etc/passwd, config files	6	None

Output Padding Modes

Left-align	Cell padded with trailing spaces: value···
Right-align	Cell padded with leading spaces: ···value
Center-align	Cell padded on both sides, with any odd space going to the right: ·value··

RFC 4180 Quoting Rules

Rule 1	Fields containing delimiters, quotes, or newlines must be enclosed in double-quotes
Rule 2	Double-quote inside a quoted field is escaped as ""
Rule 3	Leading/trailing whitespace inside quotes is preserved
Rule 4	CRLF (\r\n) is the standard line ending; LF (\n) is accepted

Common CSV Encoding Issues

BOM	UTF-8 BOM (0xEF 0xBB 0xBF) at file start causes phantom characters if not stripped
Encoding	Non-UTF-8 files (Windows-1252, ISO-8859-1) may produce garbled output
Trailing delimiter	Some exports append a trailing comma, creating an empty ghost column

Frequently Asked Questions

The parser samples the first 20 rows and tests each candidate delimiter (comma, tab, semicolon, pipe). For each candidate, it counts fields per row and calculates the standard deviation. A delimiter that produces a consistent field count (standard deviation of 0) with the highest median field count is selected. If all candidates produce inconsistent splits, comma is used as the RFC 4180 default.

Fixed-width column alignment depends on a monospaced font (e.g., Courier New, Consolas, Fira Code) where every character occupies equal horizontal space. Proportional fonts like Arial or Times New Roman render characters at varying widths, destroying alignment. Always paste output into a monospaced context: terminal, code editor, or a preformatted text block.

The parser implements RFC 4180 quoting rules. A field wrapped in double-quotes preserves internal commas, newlines (CR, LF, CRLF), and literal double-quotes (escaped as ""). The outer quotes are stripped during parsing. If a field contains an odd number of unescaped quotes, the parser treats the remainder as a continuation until a closing quote is found or the input ends.
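The quoting behaviour described here can be sketched as a small character-by-character state machine. The Python below is a minimal illustration, not the converter's actual parser; the name `parse_csv` is hypothetical, and the handling of an unclosed quote (read to end of input) follows the description above.

```python
def parse_csv(text, delimiter=","):
    """Parse CSV text into a list of rows, each a list of field strings."""
    rows, row, field = [], [], []
    in_quotes = False
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if in_quotes:
            if ch == '"':
                if i + 1 < n and text[i + 1] == '"':
                    field.append('"')  # "" inside quotes is a literal quote
                    i += 1
                else:
                    in_quotes = False  # closing quote, stripped from output
            else:
                field.append(ch)  # delimiters and newlines are literal here
        elif ch == '"':
            in_quotes = True  # opening quote, stripped from output
        elif ch == delimiter:
            row.append("".join(field))
            field = []
        elif ch in "\r\n":
            if ch == "\r" and i + 1 < n and text[i + 1] == "\n":
                i += 1  # treat CRLF as a single line ending
            row.append("".join(field))
            field = []
            rows.append(row)
            row = []
        else:
            field.append(ch)
        i += 1
    # Flush a final record that has no trailing newline (or an unclosed quote).
    if field or row or (text and text[-1] not in "\r\n"):
        row.append("".join(field))
        rows.append(row)
    return rows
```

With this sketch, the input `a,"b,c` (an unclosed quote) parses to a single row `['a', 'b,c']`, since the quoted field simply runs to the end of the input.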

The converter normalizes every row to a reference column count. The reference is taken from the header row (first row) if "First row is header" is enabled, otherwise from the maximum column count across all rows. Rows shorter than the reference are padded with empty cells, which prevents a missing trailing field from shifting subsequent columns.
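As a small illustration, the padding step might look like the Python sketch below. The function name `normalize_rows` and the `first_row_is_header` flag are hypothetical stand-ins for the option described above. How rows longer than the header are treated is not stated here, so the sketch leaves them unchanged.

```python
def normalize_rows(rows, first_row_is_header=False):
    """Pad short rows with empty cells up to the reference column count."""
    if not rows:
        return []
    if first_row_is_header:
        target = len(rows[0])  # header row defines the column count
    else:
        target = max(len(row) for row in rows)
    # A negative difference yields no padding, so longer rows stay as they are
    # (an assumption; the tool's handling of such rows is not documented here).
    return [row + [""] * (target - len(row)) for row in rows]
```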

The parser processes input as a single string in memory. For files under 5 MB, performance is near-instant. Between 5 and 20 MB, expect a brief processing delay shown via a progress indicator. Files exceeding 20 MB may cause browser memory pressure. For very large datasets, split the file into chunks externally before converting. The tool displays a warning if input exceeds 10 MB.

Yes. A UTF-8 Byte Order Mark (bytes 0xEF 0xBB 0xBF) prepended to a file creates an invisible character before the first field, which corrupts header detection and alignment of the first column. This converter automatically strips the BOM during parsing. If your source file uses a different encoding (Windows-1252, ISO-8859-1), non-ASCII characters may display incorrectly since the FileReader defaults to UTF-8.
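For readers preprocessing files outside the browser, BOM removal can be sketched in Python as follows. This is not the converter's own code; the helper names `strip_bom` and `decode_csv_bytes` are hypothetical. Once decoded, the three BOM bytes appear as the single character U+FEFF.

```python
def strip_bom(text):
    """Remove a leading U+FEFF (the decoded UTF-8 BOM) from already-decoded text."""
    return text[1:] if text.startswith("\ufeff") else text


def decode_csv_bytes(data):
    """Decode UTF-8 bytes; the 'utf-8-sig' codec drops a leading BOM if present."""
    return data.decode("utf-8-sig")
```

Either helper leaves BOM-free input unchanged, so it is safe to apply unconditionally before parsing.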