About

Transferring tabular data from spreadsheets into web pages introduces encoding errors, broken markup, and inconsistent delimiters more often than most developers expect. Excel alone uses three different clipboard formats depending on OS version, and CSV files from European locales swap commas for semicolons - a single misidentified delimiter corrupts every row. This converter parses raw clipboard or file input using RFC 4180-compliant logic, auto-detects the delimiter via frequency analysis, escapes all HTML entities to prevent XSS injection, and outputs clean semantic <table> markup ready for production. It handles quoted fields, embedded newlines, and cells containing special characters like &, <, and > without corruption.

Limitations: this tool processes plain-text representations of spreadsheet data. It does not parse binary .xlsx or .ods formats directly - paste from your spreadsheet application or export to CSV/TSV first. The maximum practical file size is around 50 MB, depending on browser memory. Pro tip: when pasting from Excel, the clipboard already contains tab-separated values, so no export step is needed.

Formulas

Delimiter auto-detection uses frequency analysis. For each candidate delimiter d among tab (\t), comma (,), semicolon (;), and pipe (|), the algorithm counts occurrences per row and computes a consistency score:

score(d) = (rows with identical count of d) / (total rows)

The delimiter with the highest score and a count ≥ 1 per row wins. Ties are broken by priority order: tab > comma > semicolon > pipe.
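
The scoring and tie-break rules above can be sketched roughly as follows. This is an illustrative Python sketch, not the tool's own source: the names CANDIDATES, consistency_score, and detect_delimiter are hypothetical, "identical count" is assumed to mean "equal to the most common per-row count", and the count is a plain character count that does not account for delimiters inside quoted fields.

```python
from collections import Counter

# Candidates listed in tie-break priority order: tab > comma > semicolon > pipe.
CANDIDATES = ["\t", ",", ";", "|"]


def consistency_score(lines, delimiter):
    """Fraction of rows whose delimiter count equals the most common count.

    Assumption: a candidate whose most common count is zero is disqualified,
    which is one reading of the "count >= 1 per row" condition.
    """
    counts = [line.count(delimiter) for line in lines]
    if not counts:
        return 0.0
    modal_count, rows_with_modal = Counter(counts).most_common(1)[0]
    if modal_count < 1:
        return 0.0
    return rows_with_modal / len(counts)


def detect_delimiter(text):
    """Return the best-scoring candidate, or None if nothing qualifies."""
    lines = [line for line in text.split("\n") if line.strip()]
    best, best_score = None, 0.0
    for delimiter in CANDIDATES:
        score = consistency_score(lines, delimiter)
        # Strict ">" keeps the earlier, higher-priority candidate on ties.
        if score > best_score:
            best, best_score = delimiter, score
    return best
```

Iterating the candidates in priority order and comparing with a strict ">" means a tie never displaces a higher-priority delimiter, so no separate tie-break step is needed.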

CSV field parsing follows RFC 4180 as a finite state machine with three states: FIELD_START, QUOTED, UNQUOTED. Transitions occur on characters: delimiter d, double-quote ", newline \n, and any other character. Within QUOTED state, two consecutive quotes "" emit a literal quote. This handles embedded delimiters, newlines inside cells, and escaped quotes correctly.
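
A minimal Python sketch of such a three-state parser is shown below. It assumes line endings have already been normalized to \n, and the function name parse_delimited is hypothetical; the state names follow the description above, but the exact transition table in the tool may differ.

```python
FIELD_START, QUOTED, UNQUOTED = "FIELD_START", "QUOTED", "UNQUOTED"


def parse_delimited(text, delimiter=","):
    """Parse delimited text into a list of rows (lists of strings)."""
    rows, row, field = [], [], []
    state = FIELD_START
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if state == QUOTED:
            if ch == '"':
                if i + 1 < n and text[i + 1] == '"':
                    field.append('"')  # "" inside quotes is a literal quote
                    i += 1
                else:
                    state = UNQUOTED  # closing quote ends the quoted part
            else:
                field.append(ch)  # delimiters and newlines are literal here
        elif ch == '"' and state == FIELD_START:
            state = QUOTED
        elif ch == delimiter:
            row.append("".join(field))
            field = []
            state = FIELD_START
        elif ch == "\n":
            row.append("".join(field))
            rows.append(row)
            row, field = [], []
            state = FIELD_START
        else:
            field.append(ch)
            state = UNQUOTED
        i += 1
    # Flush the last field unless the input ended right after a newline.
    if field or row or state != FIELD_START:
        row.append("".join(field))
        rows.append(row)
    return rows
```

For example, parse_delimited('"San Francisco, CA",94103') yields a single row with two cells, because the comma inside the quotes never reaches the delimiter branch.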

HTML entity escaping applies the replacement chain: & → &amp;, then < → &lt;, then > → &gt;, then " → &quot;, then ' → &#39;. Order matters: ampersand must be first to avoid double-escaping.
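
The ordering constraint is easiest to see in code. The sketch below is illustrative only: REPLACEMENTS and escape_html are hypothetical names, and writing the single quote as the numeric reference &#39; is an assumption.

```python
# Replacement chain in the order described above; "&" must come first so the
# ampersands introduced by later replacements are not escaped a second time.
# Assumption: the single quote is emitted as the numeric reference &#39;.
REPLACEMENTS = [
    ("&", "&amp;"),
    ("<", "&lt;"),
    (">", "&gt;"),
    ('"', "&quot;"),
    ("'", "&#39;"),
]


def escape_html(value):
    """Apply each replacement in order and return the escaped string."""
    for char, entity in REPLACEMENTS:
        value = value.replace(char, entity)
    return value
```

If the ampersand step ran last instead, a "<" would first become &lt; and then &amp;lt;, which is the double-escaping the ordering is meant to prevent. Python's standard library offers html.escape for the same purpose; it emits &#x27; for the single quote.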

Reference Data

Delimiter	Symbol	Common Source	Auto-Detected	RFC Standard	Notes
Tab	\t	Excel, Google Sheets (paste)	Yes	IANA TSV	Default clipboard format for spreadsheets
Comma	,	CSV exports (US/UK locale)	Yes	RFC 4180	Fields with commas must be quoted
Semicolon	;	CSV exports (EU locale)	Yes	-	Excel uses this in German, French, etc.
Pipe	|	Database exports, logs	Yes	-	Rare in user data, low false-positive rate
Double Quote	"	Field enclosure	N/A	RFC 4180	Escaped as "" inside fields
Newline (CRLF)	\r\n	Windows systems	Normalized	RFC 4180	Converted to \n internally
Newline (LF)	\n	Unix/macOS systems	Normalized	-	Primary line terminator
HTML Entity: &	&amp;	User cell data	Escaped	HTML5	Prevents broken markup
HTML Entity: <	&lt;	User cell data	Escaped	HTML5	Prevents XSS / tag injection
HTML Entity: >	&gt;	User cell data	Escaped	HTML5	Prevents broken markup
HTML Entity: "	&quot;	User cell data	Escaped	HTML5	Safe attribute values
HTML Entity: '	&#39;	User cell data	Escaped	HTML5	Safe attribute values
UTF-8 BOM	\uFEFF	Excel CSV export	Stripped	Unicode	Invisible character at file start
Empty Row	-	Trailing newlines	Trimmed	-	Trailing empty rows removed from output
thead Generation	-	First row option	Toggle	HTML5	First row wrapped in <thead>
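
The normalization rows in the table (BOM stripping, line-ending conversion, and trailing-row trimming) might look like this as a pre-parse step. This is a sketch under assumptions: normalize_input is a hypothetical name, the handling of a lone \r is an addition the table does not mention, and trimming at the text level presumes the input does not end inside a quoted field.

```python
def normalize_input(text):
    """Strip a leading BOM, unify line endings, and drop trailing empty rows."""
    if text.startswith("\ufeff"):
        text = text[1:]
    # Convert CRLF first, then any stray CR, so "\r\n" never becomes two newlines.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    lines = text.split("\n")
    # Treat whitespace-only trailing lines as empty rows and remove them.
    while lines and lines[-1].strip() == "":
        lines.pop()
    return "\n".join(lines)
```

Running this before delimiter detection keeps the invisible BOM out of the first cell and prevents blank trailing lines from lowering the consistency score.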

Frequently Asked Questions

The parser implements RFC 4180 as a finite state machine. When a field begins with a double-quote, all characters - including commas, tabs, and newlines - are treated as literal cell content until a closing unescaped double-quote is encountered. Two consecutive double-quotes inside a quoted field emit a single literal quote character. This means a CSV cell like "San Francisco, CA" correctly becomes one table cell, not two.

Auto-detection uses frequency consistency analysis and is accurate for well-formed data. However, if your data has irregular delimiters or mixed formats, you can override the auto-detection by manually selecting the delimiter from the dropdown. The tool supports tab, comma, semicolon, and pipe as explicit choices. After changing the delimiter, the data is re-parsed and the preview updates immediately.

Yes. When you copy cells from Excel, LibreOffice Calc, or Google Sheets, the clipboard contains tab-separated plain text. The converter auto-detects the tab delimiter and parses it correctly. This is the fastest workflow: select cells in your spreadsheet, press Ctrl+C, click the paste area in this tool, and press Ctrl+V. The preview renders instantly.

By default the output is clean semantic HTML that uses only standard table elements such as <table>, <thead>, <tbody>, <tr>, <th>, and <td>, with no inline styles, classes, or data attributes. You can optionally enable an "Add ID" option that places an id attribute on the <table> element. The default output is the most portable markup and works with any CSS framework or custom stylesheet.
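
As a rough illustration of that output shape, the Python sketch below builds the markup from already-parsed rows. The function name render_table and its header and table_id parameters are hypothetical stand-ins for the first-row and "Add ID" options, and using th cells inside the header row is an assumption.

```python
from html import escape


def render_table(rows, header=True, table_id=None):
    """Build table markup from parsed rows (a list of lists of strings)."""

    def render_row(cells, tag):
        # Every cell is escaped before it is placed inside the markup.
        inner = "".join(f"<{tag}>{escape(cell, quote=True)}</{tag}>" for cell in cells)
        return f"  <tr>{inner}</tr>"

    if table_id:
        out = [f'<table id="{escape(table_id, quote=True)}">']
    else:
        out = ["<table>"]
    body = rows
    if header and rows:
        # Optional header: the first row goes into <thead> with <th> cells.
        out += ["<thead>", render_row(rows[0], "th"), "</thead>"]
        body = rows[1:]
    out.append("<tbody>")
    out += [render_row(cells, "td") for cells in body]
    out.append("</tbody>")
    out.append("</table>")
    return "\n".join(out)
```

Calling render_table([["Name", "Qty"], ["A & B", "2"]]) produces a thead with two th cells and one body row whose first cell reads A &amp; B. Nothing but element names and the optional id reaches the markup, which is what keeps the result independent of any stylesheet.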

All cell content is HTML-entity-escaped before output. Ampersands become &amp;, angle brackets become &lt; and &gt;, double quotes become &quot;, and single quotes become &#39;. This prevents XSS injection and ensures the generated HTML is valid. The escaping order is critical - ampersands are escaped first to avoid double-encoding.

There is no hard-coded limit. The tool uses chunked rendering with requestAnimationFrame to handle large datasets without freezing the browser. Practically, files up to 50 MB or around 100,000 rows perform well in modern browsers. Beyond that, browser memory becomes the bottleneck. For extremely large files, consider splitting them or processing server-side.