
About

Command-line tools, database clients, and documentation systems routinely output tabular data as ASCII art - bordered with pipes (|), dashes (-), and plus signs (+). Importing that data into spreadsheets, databases, or ETL pipelines requires stripping the decorative characters and converting to a machine-readable delimited format. Manual cleanup with find-and-replace is error-prone: misaligned columns, phantom whitespace, and dropped rows corrupt datasets silently. This tool parses the structural geometry of the ASCII table - detecting column boundaries from pipe positions, classifying separator rows via pattern matching against /^[+\-=|:\s]+$/, and extracting trimmed cell values into clean TSV output.

Supported input formats include MySQL CLI output, PostgreSQL psql formatting, Markdown pipe tables, and reStructuredText grid tables. The parser handles irregular spacing, mixed separator styles, and tables with or without explicit header separators. Limitation: space-aligned tables without any pipe delimiters require at least two consistent whitespace gaps between columns for reliable detection. Pro tip: common Unicode box-drawing characters (─, │, ┌) are normalized to their ASCII equivalents automatically, so most box-drawn tables can be pasted as-is.


Formulas

The parser operates in three stages. First, line classification assigns each input line a type:

classify(line) =
    SEPARATOR   if line matches /^[+\-=|:\s]+$/
    EMPTY       if trim(line) = ""
    DATA        otherwise
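As a Python sketch (illustrative names, not the tool's actual source), the classification rule might look like:

```python
import re

SEPARATOR_RE = re.compile(r"^[+\-=|:\s]+$")

def classify(line: str) -> str:
    # Whitespace-only lines are checked first: they would otherwise also
    # match the separator pattern, since \s is in the character class.
    if not line.strip():
        return "EMPTY"
    if SEPARATOR_RE.match(line):
        return "SEPARATOR"
    return "DATA"
```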

Second, column boundary detection identifies the character indices of all pipe delimiters across data rows:

B = { i : line[i] = "|" , line ∈ DATA }

The intersection of boundary sets across all data rows yields stable column positions. Third, cell extraction uses boundaries to slice each data line:

cell_j = trim(line.substring(B_j + 1, B_(j+1)))

where B_j is the j-th boundary index and B_(j+1) is the next one. Output encoding joins cells with a horizontal tab character (U+0009) and rows with a newline (U+000A). Cells containing literal tabs are escaped to preserve TSV structural integrity.
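The boundary-intersection and cell-extraction stages can be combined into a minimal Python sketch (function names are illustrative; the real parser handles mixed formats and escaping beyond this):

```python
def pipe_positions(line: str) -> set[int]:
    """Character indices of every pipe in a line (the set B above)."""
    return {i for i, ch in enumerate(line) if ch == "|"}

def to_tsv(data_lines: list[str]) -> str:
    # Stable boundaries: pipe positions shared by every DATA row.
    boundaries = sorted(set.intersection(*(pipe_positions(l) for l in data_lines)))
    rows = []
    for line in data_lines:
        cells = [
            line[b + 1:e].strip().replace("\t", "\\t")  # escape literal tabs
            for b, e in zip(boundaries, boundaries[1:])
        ]
        rows.append("\t".join(cells))
    return "\n".join(rows)
```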

Reference Data

| Format | Separator Row Pattern | Column Delimiter | Header Indicator | Example Source |
| --- | --- | --- | --- | --- |
| MySQL CLI | `+---+---+` | `\|` | First separator after header row | `mysql -e "SELECT..."` |
| PostgreSQL psql | `----+----` | `\|` | Separator below header | `psql -c "SELECT..."` |
| Markdown Pipe | `\| --- \| --- \|` | `\|` | Second row is always separator | GitHub, GitLab, Reddit |
| reStructuredText Grid | `+===+===+` | `\|` | `=` line vs `-` line | Sphinx documentation |
| reStructuredText Simple | `=== ===` | Whitespace (≥2 spaces) | Double `=` lines | Sphinx documentation |
| Org-mode | `\|---+---\|` | `\|` | Separator row below header | Emacs Org tables |
| JIRA/Confluence | None (double `\|\|`) | `\|\|` for header, `\|` for data | Double-pipe cells | Atlassian products |
| ASCII Box Drawing | `┌───┬───┐` | `│` | Top/bottom borders | Various CLI tools |
| Space-aligned | None | ≥2 consecutive spaces | First row assumed header | `df -h`, `ps aux` |
| TSV (passthrough) | None | Tab character (`\t`) | First row | Already converted |
| CSV (common confusion) | None | Comma (`,`) | First row | Not supported - use CSV tool |
| Tab Width Standard | N/A | U+0009 (1 char) | N/A | TSV specification (IANA) |
| TSV MIME Type | N/A | N/A | N/A | `text/tab-separated-values` |
| Max Safe Columns | N/A | N/A | N/A | Excel: 16384, Sheets: 18278 |
| Max Safe Rows | N/A | N/A | N/A | Excel: 1048576 |

Frequently Asked Questions

How is the header row detected?
The parser identifies the first DATA-type line as the header row. If a SEPARATOR row immediately follows the first DATA row, it confirms the header boundary. All subsequent DATA rows become body rows. In Markdown format, the second line is always treated as a separator per the CommonMark specification, regardless of its content pattern.
What happens when rows have inconsistent column counts?
The parser uses the row with the maximum number of detected columns as the canonical column count. Rows with fewer columns are padded with empty cells on the right. Rows with more columns (rare, usually from malformed input) have excess columns merged into the last cell. A warning toast notifies you of the inconsistency.
Is whitespace inside cells preserved?
No. All cell values are trimmed of leading and trailing whitespace. ASCII tables use padding spaces for visual alignment - these are decorative, not semantic. If your data genuinely contains significant whitespace, you should use a format that supports quoting (like CSV with double-quotes) rather than ASCII tables.
Are merged or multi-line cells supported?
Merged cells (spanning multiple columns) are not supported because ASCII table formats have no standardized merge syntax. Multi-line cells (where a single logical cell spans two physical lines) are partially supported: consecutive DATA lines between separators that share the same pipe positions are treated as continuation lines and their content is concatenated with a space.
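The continuation-line rule amounts to a column-wise concatenation, sketched here as a hypothetical helper:

```python
def merge_continuation(prev_cells: list[str], cont_cells: list[str]) -> list[str]:
    """Fold a continuation line's cells into the previous row, joined by a space."""
    return [
        (a + " " + b).strip() if b else a  # skip empty continuation cells
        for a, b in zip(prev_cells, cont_cells)
    ]
```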
Why TSV output instead of CSV?
TSV avoids the quoting ambiguity inherent in CSV. In CSV, fields containing commas, quotes, or newlines must be wrapped in double-quotes, and embedded quotes must be escaped by doubling them. TSV uses the tab character (U+0009) as the delimiter, which rarely appears in natural data. This makes TSV simpler to parse, less error-prone, and directly pasteable into spreadsheet applications.
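The quoting difference is easy to demonstrate with Python's standard csv module (the example cells are hypothetical):

```python
import csv
import io

# Two cells containing CSV-hostile characters: a comma and embedded quotes.
row = ['Smith, John', 'He said "hi"']

# CSV must quote both fields and double the embedded quotes.
buf = io.StringIO()
csv.writer(buf).writerow(row)
csv_line = buf.getvalue()  # '"Smith, John","He said ""hi"""\r\n'

# TSV needs no quoting at all, as long as no cell contains a literal tab.
tsv_line = "\t".join(row)  # 'Smith, John\tHe said "hi"'
```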
Do I need to replace Unicode box-drawing characters manually?
No. The converter automatically normalizes common Unicode box-drawing characters to their ASCII equivalents before parsing. Specifically, │ (U+2502) maps to | (U+007C), ─ (U+2500) maps to - (U+002D), and corner/junction characters (┌┐└┘├┤┬┴┼) map to + (U+002B). This normalization happens transparently - no manual replacement is needed.
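This character mapping can be expressed with Python's `str.translate` (a sketch of the stated mapping, not the tool's actual source):

```python
# Box-drawing -> ASCII mapping described above.
BOX_TO_ASCII = str.maketrans({
    "│": "|",   # U+2502 -> U+007C
    "─": "-",   # U+2500 -> U+002D
    **{c: "+" for c in "┌┐└┘├┤┬┴┼"},  # corners/junctions -> U+002B
})

def normalize_box_drawing(text: str) -> str:
    return text.translate(BOX_TO_ASCII)
```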