About

Inserting a column at position zero in a CSV file sounds trivial until your data contains quoted fields with embedded commas, newlines inside cells, or escaped double quotes. A naive find-and-replace, or a plain split on commas, corrupts such a file. This tool implements a full RFC 4180 parser that tokenizes each record correctly before inserting the new column value at the specified index. It handles the edge cases that spreadsheet exports commonly produce: fields wrapped in DQUOTE characters, CRLF sequences inside quoted text, and delimiter characters embedded in quoted fields. All processing runs client-side in your browser. No data leaves your machine.

The prepend value supports three modes: a static string applied uniformly, an auto-incrementing row index (starting from 0 or 1), or a pattern with a {n} placeholder for sequential numbering. The tool auto-detects your delimiter by analyzing character frequency across the first 5 lines. Accuracy depends on consistent formatting. If your source file uses mixed delimiters, results will be unreliable. Fix the source first.
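The three modes can be pictured as one small resolver. The Python sketch below is illustrative only and is not the tool's own code: the function name new_value, its parameters, and the choice to let the {n} pattern share the same start value as the row index are all assumptions.

```python
def new_value(mode: str, i: int, *, text: str = "", start: int = 1,
              pattern: str = "") -> str:
    """Resolve the prepend value for the i-th data row (i counts from 0).

    Hypothetical helper: mode names and parameters are illustrative.
    """
    if mode == "static":
        # Same string for every row.
        return text
    if mode == "index":
        # Auto-incrementing row index, starting from `start` (0 or 1).
        return str(start + i)
    if mode == "pattern":
        # Assumption: {n} is numbered from the same start value.
        return pattern.replace("{n}", str(start + i))
    raise ValueError(f"unknown mode: {mode!r}")
```

For example, new_value("pattern", 2, pattern="ROW-{n}") yields "ROW-3" with the default start of 1.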

Formulas

The delimiter auto-detection algorithm scores each candidate delimiter d by computing a consistency metric across the first N lines:

score(d) = (1 / (1 + σ_d)) × c_d

Where σ_d is the standard deviation of the per-line count of delimiter d (lower is better, since it indicates a consistent column count), and c_d is the mean per-line occurrence count. The delimiter with the highest score wins. A score of 0 means the character never appeared in the sampled lines.
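A minimal Python sketch of this scoring, reading the score as c_d divided by (1 + σ_d), which is the form consistent with "lower σ_d is better" and "a score of 0 means the character never appeared". The names score, detect_delimiter and CANDIDATES are hypothetical, the use of the population standard deviation is an assumption, and the raw character count here ignores quoting, which the real tool may treat differently.

```python
from statistics import mean, pstdev

# Candidates marked "Auto-Detected: Yes" in the reference table.
CANDIDATES = [",", ";", "\t", "|"]

def score(delimiter: str, lines: list[str]) -> float:
    """Consistency score: mean count per line divided by (1 + std dev)."""
    counts = [line.count(delimiter) for line in lines]
    if not counts:
        return 0.0
    # Assumption: population standard deviation over the sampled lines.
    return mean(counts) / (1 + pstdev(counts))

def detect_delimiter(text: str, sample_size: int = 5) -> str:
    """Pick the highest-scoring candidate over the first non-empty lines."""
    lines = [ln for ln in text.splitlines() if ln][:sample_size]
    return max(CANDIDATES, key=lambda d: score(d, lines))
```

With the sample "name;note\nx;a, b\ny;c", the semicolon appears exactly once per line (σ = 0, score 1), while the comma appears on only one line, so its higher variance and lower mean leave it well behind.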

Column prepend operation per record R_i:

R_i′ = [newValue(i)] + R_i

Where newValue(i) resolves to the static string, row index, or pattern result depending on mode. If the value itself contains the active delimiter or quotes, it is automatically wrapped in DQUOTE per RFC 4180 before serialization.
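The prepend and the quoting step can be sketched in a few lines of Python. This is an illustration under assumptions, not the tool's implementation: quote_field, prepend and serialize are hypothetical names, and quoting on embedded CR or LF is added here because RFC 4180 requires it for fields containing line breaks.

```python
def quote_field(value: str, delimiter: str = ",") -> str:
    """Wrap a field in double quotes when RFC 4180 requires it."""
    if any(ch in value for ch in (delimiter, '"', "\r", "\n")):
        # Embedded quotes are escaped by doubling them.
        return '"' + value.replace('"', '""') + '"'
    return value

def prepend(record: list[str], value: str) -> list[str]:
    """R_i' = [newValue(i)] + R_i"""
    return [value] + record

def serialize(record: list[str], delimiter: str = ",") -> str:
    """Join already-parsed fields back into one CSV line."""
    return delimiter.join(quote_field(f, delimiter) for f in record)
```

Prepending the value hello, world to the record a,b and serializing gives "hello, world",a,b, so the new column's comma cannot be mistaken for a delimiter.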

Reference Data

Delimiter	Common Name	Character Code	Typical Source	Auto-Detected
,	Comma	U+002C	Excel (EN), Google Sheets	Yes
;	Semicolon	U+003B	Excel (EU locales)	Yes
\t	Tab (TSV)	U+0009	Database exports, Unix tools	Yes
|	Pipe	U+007C	Legacy mainframe systems	Yes
:	Colon	U+003A	/etc/passwd, config files	No
^	Caret	U+005E	SAS transport files	No
~	Tilde	U+007E	EDI / X12 formats	No

RFC 4180 Quoting Rules

Field contains delimiter	Wrap the entire field in double quotes: "hello, world"
Field contains double quote	Wrap the field in double quotes and double each embedded quote: "say ""hi"""
Field contains newline	Wrap in double quotes; the parser must track open/close quote state
Leading/trailing whitespace	Not trimmed per RFC 4180; preserved as-is
Empty field	Two consecutive delimiters: a,,c → 3 fields
BOM marker (U+FEFF)	Stripped automatically by this tool if present at the start of the input

Frequently Asked Questions

The parser implements a finite-state machine per RFC 4180. It tracks whether the current position is inside a quoted field by monitoring DQUOTE open/close transitions. A comma encountered inside an open quote state is treated as literal text, not a delimiter. Likewise, CRLF sequences inside quotes do not trigger a new record. Escaped quotes (two consecutive DQUOTEs) are collapsed to a single quote character in the parsed output.
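The state machine described above can be sketched in Python as follows. It is a simplified illustration, not the tool's parser: parse_csv is a hypothetical name, a lone CR or LF is also accepted as a record terminator, and malformed input (such as a quote in the middle of an unquoted field) is not validated.

```python
def parse_csv(text: str, delimiter: str = ",") -> list[list[str]]:
    """Tokenize CSV text with a two-state (quoted / unquoted) machine."""
    records: list[list[str]] = []
    record: list[str] = []
    field: list[str] = []
    in_quotes = False
    started = False  # True once the current record has consumed any input
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if in_quotes:
            if ch == '"':
                if i + 1 < n and text[i + 1] == '"':
                    field.append('"')  # two DQUOTEs collapse to one
                    i += 1
                else:
                    in_quotes = False  # closing quote
            else:
                field.append(ch)  # delimiters and CRLF are literal here
        elif ch in "\r\n":
            if ch == "\r" and i + 1 < n and text[i + 1] == "\n":
                i += 1  # treat CRLF as a single terminator
            record.append("".join(field))
            records.append(record)
            record, field, started = [], [], False
            i += 1
            continue
        elif ch == '"':
            in_quotes = True  # opening quote
        elif ch == delimiter:
            record.append("".join(field))
            field = []
        else:
            field.append(ch)
        started = True
        i += 1
    if started:  # final record without a trailing line break
        record.append("".join(field))
        records.append(record)
    return records
```

The in_quotes flag is the whole trick: while it is set, commas and line breaks are appended to the current field instead of ending it.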

The auto-detection algorithm scores consistency across lines. If your data fields contain semicolons, the count-per-line variance (σ) rises, lowering the score for semicolon as a delimiter. In ambiguous cases, override the auto-detected delimiter manually using the delimiter selector. Always verify the preview table shows correct column alignment before downloading.

Yes. The insertion index field accepts any zero-based column position from 0 (first) up to the current column count (append as last). If the index exceeds the number of existing columns, the value is appended at the end. The header row and data rows use the same insertion index.
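A short Python sketch of the index behavior described above. The name insert_column is hypothetical, and clamping negative indices to 0 is an assumption, since the text only specifies what happens when the index is too large.

```python
def insert_column(record: list[str], value: str, index: int) -> list[str]:
    """Insert value at a zero-based position, appending if index is too large."""
    # Assumption: negative indices are clamped to 0 rather than rejected.
    position = max(0, min(index, len(record)))
    return record[:position] + [value] + record[position:]
```

Applying the same index to the header row and to every data row keeps the columns aligned.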

Yes. When row numbering is active, the header row receives the header name you specify (e.g., "Row" or "ID"). Data rows begin numbering from your configured start value (default 1). If you set the start value to 0, the first data row is labeled 0.
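As a rough Python illustration of this behavior, assuming the first record is the header and using the hypothetical name number_rows:

```python
def number_rows(records: list[list[str]], header_name: str = "Row",
                start: int = 1) -> list[list[str]]:
    """Prepend a header label to the first record and a counter to the rest."""
    if not records:
        return []
    header, *data = records
    numbered = [[header_name] + header]
    for offset, row in enumerate(data):
        # The first data row receives `start`, the next start + 1, and so on.
        numbered.append([str(start + offset)] + row)
    return numbered
```

With start=0, the first data row is labeled 0 while the header row still carries the header name.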

Files under 5 MB process near-instantly. Files between 5 MB and 50 MB are processed in batches using asynchronous chunking to keep the UI responsive. Above 50 MB, browser memory constraints may cause slowdowns or failures, depending on your device. The tool shows a progress indicator for large files.
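The chunking idea can be illustrated with Python's asyncio as a stand-in for the browser's event loop. This is only an analogy, not the tool's code: process_in_batches, the batch size of 1000, and the on_progress callback are all hypothetical.

```python
import asyncio
from typing import Callable, Optional

async def process_in_batches(
    records: list[list[str]],
    transform: Callable[[list[str]], list[str]],
    batch_size: int = 1000,  # assumed value, not taken from the tool
    on_progress: Optional[Callable[[int, int], None]] = None,
) -> list[list[str]]:
    """Transform records batch by batch, yielding control between batches."""
    out: list[list[str]] = []
    total = len(records)
    for begin in range(0, total, batch_size):
        batch = records[begin:begin + batch_size]
        out.extend(transform(r) for r in batch)
        if on_progress is not None:
            on_progress(len(out), total)  # e.g. update a progress bar
        # Hand control back to the event loop so other tasks can run.
        await asyncio.sleep(0)
    return out
```

Yielding after each batch is what keeps an interface responsive: no single stretch of work blocks the loop for longer than one batch takes.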

The tool detects and strips the UTF-8 BOM character (U+FEFF) if present at the start of the input. This prevents the BOM from being treated as part of the first field name, which commonly causes invisible bugs in downstream processing.
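In Python, the same clean-up can be sketched in two ways, depending on whether the input is already decoded text or raw bytes. The function names strip_bom and decode_input are hypothetical; the utf-8-sig codec is part of Python's standard library.

```python
BOM = "\ufeff"

def strip_bom(text: str) -> str:
    """Drop a leading U+FEFF so it cannot leak into the first field name."""
    return text[1:] if text.startswith(BOM) else text

def decode_input(raw: bytes) -> str:
    """Decode UTF-8 bytes; the utf-8-sig codec skips a leading BOM if present."""
    return raw.decode("utf-8-sig")
```

Without this step, a header such as id would be read as "\ufeffid", which looks identical on screen but fails an equality check against "id".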