ASCII from Regex Creator
Convert regular expressions into ASCII art railroad diagrams. Visualize regex structure with box-drawing characters for documentation and debugging.
About
Regular expressions compress powerful pattern logic into terse syntax. A production regex like ^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?$ (a Base64 validator) is functionally correct but visually opaque. Misreading a single quantifier or group boundary can cause silent match failures in validation pipelines, data extraction, and routing logic. This tool parses a regex string into an Abstract Syntax Tree and renders it as an ASCII railroad diagram using box-drawing characters (┌─┐│└─┘). The output is plain text, ready to paste into code comments, README files, terminal output, or any monospace environment where images are impractical.
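To make the structure of the Base64 pattern quoted above easier to follow, here is a small sketch that restates it with Python's `re.VERBOSE` flag so each part carries a comment. The helper name `is_base64` is an illustrative assumption, not part of this tool.

```python
import re

# Same pattern as in the text, split across lines with re.VERBOSE.
# Whitespace and "#" comments outside character classes are ignored.
BASE64 = re.compile(
    r"""
    ^
    (?:[A-Za-z0-9+/]{4})*        # zero or more complete 4-character blocks
    (?:
        [A-Za-z0-9+/]{2}==       # final block padded with two "=" characters
      | [A-Za-z0-9+/]{3}=        # final block padded with one "=" character
    )?                           # the padded final block is optional
    $
    """,
    re.VERBOSE,
)


def is_base64(text: str) -> bool:
    """Hypothetical helper: True if text has the shape of padded Base64."""
    return BASE64.match(text) is not None


if __name__ == "__main__":
    for sample in ("TWFu", "TWE=", "TQ==", "TQ=", "TWFu!"):
        print(sample, is_base64(sample))
```

The three groups visible here (repeated blocks, the two-way padded tail, and the anchors) are the same units a railroad diagram would draw as separate boxes and branches.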
The parser handles capturing and non-capturing groups, character classes with ranges, nested quantifiers, alternation branches, escape sequences, and anchors. The diagram is an approximation intended for readability: lookaheads, lookbehinds, and Unicode property escapes (\p{}) are displayed as labeled nodes but not expanded internally. The diagram flows left to right, following the usual railroad (syntax) diagram convention, the graphical counterpart of textual grammar notations such as ISO/IEC 14977 EBNF.
Formulas
The regex-to-ASCII conversion follows a two-phase pipeline: parsing and rendering.
The parser implements a recursive descent strategy with operator precedence. At the top level, an expression is a sequence of alternatives separated by |. Each alternative is a sequence of terms. Each term is an atom optionally followed by a quantifier.
Expression → Alternative ("|" Alternative)*
Alternative → Term*
Term → Atom Quantifier?
Atom → Literal | Dot | CharClass | Group | Anchor | Escape
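The grammar above maps naturally onto one method per rule. The following is a minimal sketch of such a recursive descent parser, not this tool's actual implementation: the `Node` and `Parser` classes, the `parse` function, and the node kind names are all assumptions made for illustration. Named groups and a `]` placed first in a character class are not handled.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                     # alt, seq, literal, dot, class, group, anchor, escape
    text: str = ""                # source text, or a group prefix such as "?:"
    children: list = field(default_factory=list)
    quant: str = ""               # quantifier suffix, e.g. "*", "+?", "{2,5}"


class Parser:
    def __init__(self, src: str):
        self.src, self.pos = src, 0

    def peek(self) -> str:
        return self.src[self.pos] if self.pos < len(self.src) else ""

    def take(self) -> str:
        ch = self.peek()
        self.pos += 1
        return ch

    def expression(self) -> Node:
        # Top level: alternatives separated by "|"
        branches = [self.alternative()]
        while self.peek() == "|":
            self.take()
            branches.append(self.alternative())
        return branches[0] if len(branches) == 1 else Node("alt", children=branches)

    def alternative(self) -> Node:
        # Alternative -> Term*
        terms = []
        while self.peek() not in ("", "|", ")"):
            terms.append(self.term())
        return Node("seq", children=terms)

    def term(self) -> Node:
        # Term -> Atom Quantifier?
        node = self.atom()
        node.quant = self.quantifier()
        return node

    def quantifier(self) -> str:
        quant = ""
        if self.peek() in ("*", "+", "?"):
            quant = self.take()
        elif self.peek() == "{":
            end = self.src.find("}", self.pos)
            body = self.src[self.pos + 1:end] if end != -1 else ""
            if body.replace(",", "", 1).isdigit():   # "{3}", "{2,5}", "{2,}"
                quant, self.pos = self.src[self.pos:end + 1], end + 1
        if quant and self.peek() == "?":             # lazy variant such as "*?"
            quant += self.take()
        return quant

    def atom(self) -> Node:
        ch = self.take()
        if ch == "(":
            prefix = ""
            for p in ("?:", "?=", "?!", "?<=", "?<!"):
                if self.src.startswith(p, self.pos):
                    prefix, self.pos = p, self.pos + len(p)
                    break
            inner = self.expression()
            if self.take() != ")":
                raise ValueError("unbalanced parenthesis")
            return Node("group", text=prefix, children=[inner])
        if ch == "[":
            body = ""
            while self.peek() and self.peek() != "]":
                if self.peek() == "\\":              # keep escaped characters together
                    body += self.take()
                body += self.take()
            if self.take() != "]":
                raise ValueError("unterminated character class")
            return Node("class", text=body)
        if ch == "\\":
            return Node("escape", text="\\" + self.take())
        if ch == ".":
            return Node("dot", text=".")
        if ch in ("^", "$"):
            return Node("anchor", text=ch)
        return Node("literal", text=ch)


def parse(pattern: str) -> Node:
    parser = Parser(pattern)
    tree = parser.expression()
    if parser.pos < len(pattern):                    # e.g. a stray ")"
        raise ValueError(f"unexpected {pattern[parser.pos]!r} at {parser.pos}")
    return tree
```

Calling `parse("ab|c")` yields an `alt` node with two `seq` branches, which is the shape the rendering phase would then walk.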
The rendering phase walks the AST and assigns each node a bounding box measured in character cells. Width w of a sequence node equals the sum of child widths plus connector characters. For alternation, height h equals the sum of branch heights plus separator lines.
Here w_i is the rendered width of the i-th child node, and each additional 1 accounts for a connecting ─ character between boxes. Quantifier suffixes add 2 to 6 characters depending on notation length (e.g., {2,5} adds 5 characters).
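The bounding-box rules described above can be sketched as a small recursive function. This is an illustration under stated assumptions rather than the tool's own layout code: the tuple-based node shape, the `measure` name, one connector cell between adjacent children, one separator line between branches, and a 4-cell frame around each label are all assumed.

```python
CONNECTOR_CELLS = 1        # assumed: one "─" between adjacent boxes in a sequence
ALT_SEPARATOR_LINES = 1    # assumed: one separator line between alternation branches
BOX_FRAME_CELLS = 4        # assumed: "[ " and " ]" around a label


def measure(node):
    """Return (width, height) in character cells.

    node is one of (hypothetical shapes):
      ("box", label, quant)   a single labeled box with an optional quantifier
      ("seq", [children])     children laid out left to right
      ("alt", [branches])     branches stacked top to bottom
    """
    kind = node[0]
    if kind == "box":
        _, label, quant = node
        # The quantifier suffix contributes its own length, e.g. "{2,5}" adds 5.
        return len(label) + BOX_FRAME_CELLS + len(quant), 1

    sizes = [measure(child) for child in node[1]]
    widths = [w for w, _ in sizes]
    heights = [h for _, h in sizes]
    gaps = max(len(sizes) - 1, 0)

    if kind == "seq":
        # Width: sum of child widths plus one connector per gap.
        return sum(widths) + gaps * CONNECTOR_CELLS, max(heights, default=1)
    if kind == "alt":
        # Height: sum of branch heights plus separator lines.
        return max(widths, default=0), sum(heights) + gaps * ALT_SEPARATOR_LINES
    raise ValueError(f"unknown node kind: {kind!r}")


if __name__ == "__main__":
    a, b = ("box", "a", ""), ("box", "b", "")
    print(measure(("seq", [a, b])))
    print(measure(("alt", [a, b])))
```

Keeping the spacing constants at the top makes it easy to adjust the sketch if the real renderer uses different connector or separator sizes.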
Reference Data
| Regex Token | Symbol | ASCII Representation | Description |
|---|---|---|---|
| Literal | a | ─[ a ]─ | Matches exact character |
| Dot | . | ─[ . ANY ]─ | Matches any character except newline |
| Character Class | [a-z] | ─[ a-z ]─ | Matches one character from set |
| Negated Class | [^0-9] | ─[ ^0-9 ]─ | Matches any character NOT in set |
| Capturing Group | (abc) | ─┤ Group #1 ├─ | Captures matched substring |
| Non-capturing Group | (?:abc) | ─┤ Group ├─ | Groups without capturing |
| Alternation | a\|b | ┬─[ a ]─┬<br>└─[ b ]─┘ | Matches either branch |
| Zero or More | a* | ─[ a ]─⟲* | Matches 0 or more times |
| One or More | a+ | ─[ a ]─⟲+ | Matches 1 or more times |
| Optional | a? | ─[ a ]─? | Matches 0 or 1 time |
| Exact Count | a{3} | ─[ a ]─{3} | Matches exactly n times (here n = 3) |
| Range Count | a{2,5} | ─[ a ]─{2,5} | Matches n to m times (here 2 to 5) |
| Start Anchor | ^ | ─[ ^ START ]─ | Asserts start of string/line |
| End Anchor | $ | ─[ $ END ]─ | Asserts end of string/line |
| Word Boundary | \b | ─[ \b BOUNDARY ]─ | Asserts word boundary position |
| Digit | \d | ─[ \d 0-9 ]─ | Shorthand for [0-9] |
| Word Char | \w | ─[ \w a-zA-Z0-9_ ]─ | Shorthand for [a-zA-Z0-9_] |
| Whitespace | \s | ─[ \s SPACE ]─ | Matches whitespace characters |
| Lookahead | (?=abc) | ─┤ ?= LOOK ├─ | Positive lookahead assertion |
| Neg. Lookahead | (?!abc) | ─┤ ?! LOOK ├─ | Negative lookahead assertion |
| Lookbehind | (?<=abc) | ─┤ ?<= LOOK ├─ | Positive lookbehind assertion |
| Lazy Quantifier | a*? | ─[ a ]─⟲*? | Matches as few times as possible |
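The single-token rows of the table can be reproduced with a short lookup-based function. This is a hedged sketch of how such labels might be generated; `render_atom` and the `LABELS` mapping are hypothetical names, and groups, lookarounds, and alternation are left out because they need multi-node layout.

```python
# Label text for tokens that the table annotates with a description.
LABELS = {
    ".": ". ANY",
    "^": "^ START",
    "$": "$ END",
    r"\b": r"\b BOUNDARY",
    r"\d": r"\d 0-9",
    r"\w": r"\w a-zA-Z0-9_",
    r"\s": r"\s SPACE",
}


def render_atom(token: str, quant: str = "") -> str:
    """Render one token plus an optional quantifier as a one-line box."""
    if token in LABELS:
        label = LABELS[token]
    elif len(token) >= 2 and token.startswith("[") and token.endswith("]"):
        label = token[1:-1]            # character class: show the set without brackets
    else:
        label = token                  # plain literal
    box = f"─[ {label} ]─"
    # In the table, "*" and "+" (and their lazy forms) carry a loop marker,
    # while "?" and "{n,m}" are appended directly.
    if quant[:1] in ("*", "+"):
        return box + "⟲" + quant
    return box + quant


if __name__ == "__main__":
    for token, quant in (("a", ""), (".", ""), ("[^0-9]", ""), ("a", "*?"), ("a", "{2,5}")):
        print(render_atom(token, quant))
```

Each printed line matches the corresponding ASCII Representation cell in the table above.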