About

Terminal output often contains ANSI escape sequences - byte patterns like ESC[31m that encode color, cursor position, and text style for display in compatible emulators. These sequences are invisible in a terminal but corrupt output when pasted into logs, documentation, or non-terminal applications. Manually removing them is error-prone because sequences vary in length from 2 bytes (e.g., ESC 7) to 20 or more and span dozens of categories defined in ECMA-48 (ISO/IEC 6429). This tool parses all CSI (Control Sequence Introducer), OSC (Operating System Command), and single-character escape patterns using an ECMA-48-compliant finite-state recognizer, producing clean text with no residual escape sequences or control bytes other than tab, newline, and carriage return.

Incorrect stripping - such as a naive regex that only targets SGR codes - leaves cursor-movement artifacts or partial sequences that break downstream parsers. This converter handles the ECMA-48 control functions, including CUU, CUD, ED, and EL, as well as DEC private-mode sequences (origin mode, bracketed paste) that fall outside the standard. It can optionally approximate terminal rendering by interpreting erase and cursor commands rather than simply deleting them. Limitation: it does not reconstruct full terminal screen state for applications that use alternate screen buffers (e.g., vim, less).

Formulas

The converter operates as a lexical scanner over the input byte stream. Every character is classified into one of three categories: printable ASCII (retained), escape sequence initiator (triggers sequence consumption), or standalone control character (removed or interpreted).
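The three-way classification can be sketched as below. This is a minimal illustration, not the tool's actual code: `classify` is a hypothetical helper, and treating code points above 0x9F as printable text is an assumption made here for simplicity.

```javascript
// Hypothetical helper: classify a single character in isolation.
// A real scanner would consume the bytes that follow an initiator
// as part of the sequence instead of classifying them one by one.
function classify(ch) {
  const code = ch.codePointAt(0);
  if (code === 0x1b || code === 0x9b) return "initiator"; // ESC or 8-bit CSI
  if (code < 0x20 || (code >= 0x7f && code <= 0x9f)) return "control"; // C0, DEL, C1
  return "printable"; // assumption: anything else is ordinary text
}

console.log([..."\x1b[1mHi\n"].map(classify));
```

Note that `[`, `1`, and `m` are printable on their own; they only become part of an escape sequence because an initiator precedes them.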

Primary strip pattern (CSI sequences):

ESC [ P₁ ; P₂ ; ... ; P_n F

Where ESC = 0x1B (escape byte), P₁...P_n are numeric parameters built from parameter bytes in the range 0x30 - 0x3F and separated by ; (0x3B), any intermediate bytes in the range 0x20 - 0x2F follow the last parameter, and F is the final byte in the range 0x40 - 0x7E that determines the command.
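The byte ranges above translate fairly directly into a recognizer. The following is a sketch under the assumption that the input is a JavaScript string with one code unit per byte; `csiLength` is a hypothetical name, not part of the tool.

```javascript
// Return the length of the CSI sequence starting at index i, or 0 if
// no complete CSI sequence starts there.
function csiLength(s, i) {
  let j = i;
  if (s.charCodeAt(j) === 0x1b && s[j + 1] === "[") j += 2; // 7-bit form: ESC [
  else if (s.charCodeAt(j) === 0x9b) j += 1;                 // 8-bit CSI
  else return 0;

  // True when the code unit at j lies in [lo, hi]; past the end of the
  // string charCodeAt yields NaN, so the comparison is simply false.
  const inRange = (lo, hi) => {
    const c = s.charCodeAt(j);
    return c >= lo && c <= hi;
  };

  while (inRange(0x30, 0x3f)) j++; // parameter bytes (digits, ';', '?', ...)
  while (inRange(0x20, 0x2f)) j++; // intermediate bytes
  if (inRange(0x40, 0x7e)) return j + 1 - i; // final byte ends the sequence
  return 0; // truncated or malformed
}

console.log(csiLength("\x1b[31mred", 0)); // 5
console.log(csiLength("\x1b[?25h", 0));   // 6
```

Returning 0 for a truncated sequence lets a caller decide whether to drop the partial bytes or wait for more input.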

Regex (JavaScript):

pattern = /[\x1B\x9B][[\]()#;?]*(?:(?:\d{1,4}(?:;\d{0,4})*)?[0-9A-PRZcf-nqry=>])/g
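A usage sketch for a pattern of this shape, with the digit runs written as quantified `\d{1,4}` and `\d{0,4}` groups. The names `ANSI_PATTERN` and `stripAnsi` are illustrative, not taken from the tool.

```javascript
// CSI-oriented pattern: introducer, optional prefix characters,
// optional semicolon-separated numeric parameters, then a final byte.
const ANSI_PATTERN =
  /[\x1B\x9B][[\]()#;?]*(?:(?:\d{1,4}(?:;\d{0,4})*)?[0-9A-PRZcf-nqry=>])/g;

// Remove every match and leave the remaining characters in order.
function stripAnsi(text) {
  return text.replace(ANSI_PATTERN, "");
}

console.log(stripAnsi("\x1b[31mred\x1b[0m"));        // red
console.log(stripAnsi("\x1b[2Kdone"));               // done
console.log(stripAnsi("\x1b[38;2;255;128;0mamber")); // amber
```

Because `String.prototype.replace` resets `lastIndex` for global patterns, the same `ANSI_PATTERN` object can be reused across calls.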

The full implementation extends this to also capture OSC sequences (ESC] through ST or BEL), single-character escapes (ESC followed by one byte in 0x40 - 0x5F), character set designations (ESC(, ESC)), and C0/C1 control characters (0x00 - 0x08, 0x0E - 0x1A, 0x7F, 0x80 - 0x9F). Tab (0x09), newline (0x0A), and carriage return (0x0D) are preserved as meaningful whitespace.
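As a rough sketch of two of these extensions, the snippet below removes OSC sequences and the listed control characters. The pattern names and `stripOscAndControls` are hypothetical, and the OSC pattern assumes the payload itself contains no BEL or ESC bytes.

```javascript
// OSC: ESC ] payload, terminated by BEL (0x07) or ST (ESC \).
const OSC = /\x1B\][^\x07\x1B]*(?:\x07|\x1B\\)/g;

// The C0/C1 ranges listed above. Tab (0x09), newline (0x0A), and
// carriage return (0x0D) fall outside these ranges and are kept.
const CONTROLS = /[\x00-\x08\x0E-\x1A\x7F\x80-\x9F]/g;

// Strip OSC sequences first so their BEL terminator is consumed with
// them, then drop any remaining standalone control characters.
function stripOscAndControls(text) {
  return text.replace(OSC, "").replace(CONTROLS, "");
}

console.log(JSON.stringify(stripOscAndControls("\x1b]0;title\x07ok\tgo\r\n\x00")));
// "ok\tgo\r\n"
```

Order matters here: running the control-character pass first would delete the BEL terminator and leave the OSC payload behind as visible text.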

Reference Data

Sequence Type	Pattern	Example	Function	ECMA-48 Section
SGR (Color/Style)	ESC[nm	ESC[31m	Set foreground red	§8.3.117
SGR Reset	ESC[0m	ESC[0m	Reset all attributes	§8.3.117
SGR Bold	ESC[1m	ESC[1m	Bold / increased intensity	§8.3.117
SGR 256-Color	ESC[38;5;nm	ESC[38;5;82m	Set FG to palette color 82	Extended (xterm)
SGR 24-bit Color	ESC[38;2;r;g;bm	ESC[38;2;255;128;0m	Set FG to RGB	Extended (ISO/IEC 8613-6)
Cursor Up (CUU)	ESC[nA	ESC[3A	Move cursor up 3 rows	§8.3.22
Cursor Down (CUD)	ESC[nB	ESC[1B	Move cursor down 1 row	§8.3.19
Cursor Forward (CUF)	ESC[nC	ESC[5C	Move cursor right 5 cols	§8.3.20
Cursor Back (CUB)	ESC[nD	ESC[2D	Move cursor left 2 cols	§8.3.18
Cursor Position (CUP)	ESC[r;cH	ESC[10;20H	Move cursor to row 10, col 20	§8.3.21
Erase in Display (ED)	ESC[nJ	ESC[2J	Clear entire screen	§8.3.39
Erase in Line (EL)	ESC[nK	ESC[0K	Clear from cursor to end of line	§8.3.41
Scroll Up (SU)	ESC[nS	ESC[1S	Scroll up 1 line	§8.3.147
Scroll Down (SD)	ESC[nT	ESC[1T	Scroll down 1 line	§8.3.113
Save Cursor (DECSC)	ESC 7	ESC7	Save cursor position	DEC Private
Restore Cursor (DECRC)	ESC 8	ESC8	Restore cursor position	DEC Private
OSC (Title Set)	ESC]n;...ST	ESC]0;titleBEL	Set terminal window title	§8.3.89
Charset Select (G0)	ESC(C	ESC(B	Set G0 charset to ASCII	ECMA-35 (ISO/IEC 2022)
Private Mode Set	ESC[?nh	ESC[?25h	Show cursor (DECTCEM)	DEC Private
Private Mode Reset	ESC[?nl	ESC[?25l	Hide cursor	DEC Private
Hyperlink (OSC 8)	ESC]8;;urlST	ESC]8;;https://...ST	Terminal hyperlink	Extended (iTerm2)
C0 Control: BEL	0x07	\a	Terminal bell / string terminator	§8.3.3
C0 Control: BS	0x08	\b	Backspace	§8.3.5
Control: DEL	0x7F	-	Delete	ECMA-6 (not in ECMA-48)
C1: CSI (8-bit)	0x9B	-	8-bit CSI introducer	§8.3.16

Frequently Asked Questions

Stripping removes all escape sequences from the byte stream, leaving only printable characters in their original order. Interpreting attempts to reconstruct what the terminal would have displayed by processing cursor movement (CUU, CUD, CUF, CUB), erase commands (ED, EL), and carriage returns. For simple colored text, both produce identical output. For progress bars or overwritten lines (common in npm install or wget output), interpretation produces cleaner results because it resolves overlapping writes.

This typically occurs when the input has been double-encoded or when the source file uses 8-bit C1 control codes (bytes 0x80-0x9F) that your text editor interpreted as UTF-8 multi-byte characters. The converter handles both 7-bit (ESC + [) and 8-bit (0x9B) CSI introducers. If you still see artifacts, the input may contain non-ANSI escape sequences from a proprietary terminal emulator. Toggle the "Aggressive mode" option to also strip any non-ASCII bytes.

Yes. SGR sequences with parameters 38;2;r;g;b (foreground) and 48;2;r;g;b (background) are fully matched and stripped. These sequences can be up to 19 bytes long (e.g., ESC[38;2;255;255;255m), and the parser's parameter-matching logic handles arbitrary-length semicolon-separated numeric parameters without a fixed upper bound.
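To illustrate the length figure and the unbounded parameter list, here is a small sketch. `sgrTrueColor` and the `SGR` pattern are hypothetical names, and the pattern deliberately covers only SGR sequences, so it is not a substitute for the full parser.

```javascript
// Hypothetical helper: build a 24-bit foreground SGR sequence.
function sgrTrueColor(r, g, b) {
  return `\x1b[38;2;${r};${g};${b}m`;
}

// SGR-only matcher: any run of digits and semicolons, then "m".
// There is no fixed limit on the number of parameters.
const SGR = /\x1b\[[0-9;]*m/g;

const colored = sgrTrueColor(255, 255, 255) + "text" + "\x1b[0m";

console.log(sgrTrueColor(255, 255, 255).length); // 19
console.log(colored.replace(SGR, ""));           // text
```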

In strip mode, CR (0x0D) is preserved as meaningful whitespace, while BS (0x08) falls within the stripped C0 range (0x00 - 0x08) and is removed. In interpret mode, CR resets the write position to column 0 of the current line (enabling overwrite simulation), and BS moves the write position back by one column. This correctly handles terminal spinners and progress indicators that use CR to redraw lines.
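The interpret-mode behavior for CR and BS can be sketched as a small overwrite simulator. This is an illustration under simplifying assumptions, not the tool's implementation: `interpretOverwrites` is a hypothetical name, escape sequences are assumed to have been removed already, and a newline is assumed to start the next line at column 0.

```javascript
// Simulate overwrites caused by CR and BS on plain text.
function interpretOverwrites(text) {
  const lines = [];
  let line = []; // characters of the current line
  let col = 0;   // write position within the current line
  for (const ch of text) {
    if (ch === "\r") {
      col = 0;                      // back to column 0, keep the line
    } else if (ch === "\b") {
      col = Math.max(0, col - 1);   // one column left, never below 0
    } else if (ch === "\n") {
      lines.push(line.join(""));    // finish the line
      line = [];
      col = 0;
    } else {
      line[col] = ch;               // overwrite or append
      col++;
    }
  }
  lines.push(line.join(""));
  return lines.join("\n");
}

console.log(interpretOverwrites("10%\r50%\r100%")); // 100%
console.log(interpretOverwrites("abc\b\bXY"));      // aXY
```

Without an erase command, a shorter rewrite leaves the tail of the old text in place: `"Loading...\rDone"` yields `Doneing...`, which is also what a terminal would show.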

The converter processes text entirely in the browser. For files up to approximately 50 MB, it operates without issue on modern hardware. Beyond that, memory pressure in the JavaScript heap may cause slowdowns. The tool processes in chunks and provides a progress indicator for files exceeding 1 MB. For binary files, enable "Strip non-printable" to remove all bytes outside the range 0x20-0x7E (plus tab, newline, CR), which effectively extracts readable strings.
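A sketch of the "Strip non-printable" filter combined with chunked processing follows. The function names, the default chunk size, and the progress callback are assumptions for illustration; the tool's actual chunking strategy is not described here.

```javascript
// Keep 0x20-0x7E plus tab, newline, and CR; drop everything else.
function stripNonPrintable(text) {
  return text.replace(/[^\x20-\x7E\t\n\r]/g, "");
}

// Filter the input chunk by chunk and report progress as a 0..1 fraction.
// A per-character filter is safe to chunk because no match can span a
// chunk boundary; multi-byte escape sequences would need carry-over state.
function stripInChunks(text, chunkSize = 65536, onProgress = () => {}) {
  const out = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    out.push(stripNonPrintable(text.slice(i, i + chunkSize)));
    onProgress(Math.min(1, (i + chunkSize) / text.length));
  }
  return out.join("");
}

console.log(JSON.stringify(stripInChunks("a\x00b\x1bc\td\n", 2)));
// "abc\td\n"
```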

The parser follows ECMA-48 (5th Edition, 1991), which is technically identical to ISO/IEC 6429. It covers all C0 control codes (0x00-0x1F), C1 control codes (0x80-0x9F and their 7-bit ESC-based equivalents), CSI sequences with parameters and intermediates, and independent control functions. It additionally handles widely adopted extensions not in ECMA-48: xterm 256-color and true-color SGR, OSC 8 hyperlinks, and bracketed paste mode (DEC private mode 2004).