Rich Text to Markdown Converter
Convert rich text or HTML to clean Markdown instantly. Paste formatted content from any source and get GitHub Flavored Markdown output.
About
Markdown remains the de facto standard for technical documentation, README files, and content management systems. Yet most content originates in rich-text editors, email clients, or web pages that output HTML. Manual conversion introduces errors: broken link references, lost heading hierarchy, mangled nested lists. This tool walks the parsed HTML DOM tree recursively, mapping each node to its Markdown equivalent using GFM (GitHub Flavored Markdown) rules. It handles edge cases such as nested blockquotes, mixed list types, inline code within headings, and HTML tables with alignment. The converter strips unsafe content (script and style elements, inline event-handler attributes) before processing. Paste rich text directly from Google Docs, Confluence, Notion, or any browser source. Conversion accuracy depends on the semantic quality of the source HTML. Purely visual formatting (e.g., font-size without heading tags) cannot be inferred.
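The stripping step mentioned above can be pictured with a small sketch. It is not the tool's actual code: the node shape (`tagName`, `attributes`, `childNodes`, with text as plain strings) and the names `UNSAFE_TAGS` and `sanitize` are assumptions chosen so the example runs without a browser DOM.

```javascript
// Hypothetical node shape: { tagName, attributes: { name: value }, childNodes: [...] }.
// Text nodes are represented as plain strings.
const UNSAFE_TAGS = new Set(["SCRIPT", "STYLE"]);

// Return a cleaned copy of the tree, or null if the node itself must be dropped.
function sanitize(node) {
  if (typeof node === "string") return node; // text is kept as-is
  if (UNSAFE_TAGS.has(node.tagName)) return null; // drop <script> and <style> subtrees

  // Keep every attribute except inline event handlers (onclick, onload, ...).
  const attributes = {};
  for (const [name, value] of Object.entries(node.attributes || {})) {
    if (!name.toLowerCase().startsWith("on")) attributes[name] = value;
  }

  // Recurse into children and discard the ones that were dropped.
  const childNodes = (node.childNodes || [])
    .map(sanitize)
    .filter((child) => child !== null);

  return { tagName: node.tagName, attributes, childNodes };
}
```

In this sketch the `style` attribute is deliberately left in place, since table alignment is read from it later; only `<style>` elements are removed.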
Formulas
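The converter operates as a recursive DOM tree walker. Each node N is evaluated by type and tag name, producing a Markdown string M:

$$
M = \mathrm{convert}(N) =
\begin{cases}
\mathrm{text}(N) & \text{if } N \text{ is a text node} \\
\mathrm{rule}(\mathrm{tag}(N),\ \mathrm{children}(N)) & \text{if } N \text{ is an element}
\end{cases}
$$

where

$$
\mathrm{children}(N) = \sum_{i=0}^{n-1} \mathrm{convert}(N.\mathrm{childNodes}[i])
$$

concatenates all child conversions. Here the sum denotes string concatenation and n is the number of child nodes. The rule function maps tag names to Markdown wrappers. For example, rule("STRONG", c) = "**" + c + "**". List depth d determines indentation: prefix = "    ".repeat(d), i.e. four spaces per nesting level. Table alignment is read from style.textAlign or the align attribute and mapped to separator patterns: :--- (left), :---: (center), ---: (right).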
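The recursion described above can be sketched in a few lines of JavaScript. This is a minimal illustration under stated assumptions, not the tool's implementation: the helpers `text` and `el` build plain objects that stand in for DOM nodes, the `RULES` table covers only a handful of tags, and only unordered lists are handled.

```javascript
// Minimal stand-ins for DOM nodes so the sketch runs without a browser.
// nodeType 3 = text node, nodeType 1 = element node (same values as the DOM).
const text = (value) => ({ nodeType: 3, nodeValue: value, childNodes: [] });
const el = (tagName, ...childNodes) => ({ nodeType: 1, tagName, childNodes });

// Hypothetical rule table: tag name -> function wrapping the converted children.
const RULES = {
  STRONG: (c) => "**" + c + "**",
  B: (c) => "**" + c + "**",
  EM: (c) => "*" + c + "*",
  I: (c) => "*" + c + "*",
  DEL: (c) => "~~" + c + "~~",
  CODE: (c) => "`" + c + "`",
  H2: (c) => "## " + c + "\n\n",
  P: (c) => c + "\n\n",
};

// rule(tag, c): apply the wrapper for a tag, or pass the children through unchanged.
function rule(tagName, c) {
  const wrap = RULES[tagName];
  return wrap ? wrap(c) : c;
}

// children(N): concatenate the conversions of all child nodes, in order.
function children(node, depth) {
  return node.childNodes.map((child) => convert(child, depth)).join("");
}

// convert(N): text nodes yield their text, lists get depth-based indentation,
// and every other element is handled by its rule.
function convert(node, depth = 0) {
  if (node.nodeType === 3) return node.nodeValue;
  if (node.tagName === "UL") {
    const prefix = "    ".repeat(depth); // four spaces per nesting level
    const items = node.childNodes
      .filter((child) => child.tagName === "LI")
      .map((li) => prefix + "- " + children(li, depth + 1).trimEnd() + "\n")
      .join("");
    // A nested list starts on its own line below the parent item's text.
    return (depth > 0 ? "\n" : "") + items;
  }
  return rule(node.tagName, children(node, depth));
}

const paragraph = el("P", text("Hello "), el("STRONG", text("bold")));
const nested = el("UL", el("LI", text("a"), el("UL", el("LI", text("b")))), el("LI", text("c")));
console.log(convert(paragraph));
console.log(convert(nested));
```

Passing `depth` down the recursion keeps indentation a pure function of nesting level, so the nested item `b` comes out indented four spaces under `a`.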
Reference Data
| HTML Element | Markdown Output | GFM Extension | Notes |
|---|---|---|---|
| <h1> - <h6> | # to ###### | No | ATX-style headings |
| <strong> / <b> | **text** | No | Bold wrapping |
| <em> / <i> | *text* | No | Italic wrapping |
| <del> / <s> | ~~text~~ | Yes | Strikethrough |
| <a href> | [text](url) | No | Title attr preserved |
| <img> | ![alt](src) | No | Alt text required |
| <ul> / <li> | - item | No | Nested with 4-space indent |
| <ol> / <li> | 1. item | No | Sequential numbering |
| <input type=checkbox> | - [x] / - [ ] | Yes | Task lists inside <li> |
| <blockquote> | > text | No | Nested with >> |
| <code> | `code` | No | Inline code |
| <pre><code> | ```lang ... ``` | Yes | Fenced code blocks; lang from class |
| <table> | Pipe table | Yes | Alignment via :--- syntax |
| <hr> | --- | No | Thematic break |
| <br> | Two trailing spaces or <br> | No | Configurable |
| <p> | Double newline | No | Paragraph separation |
| <sub> | <sub>text</sub> | No | Passed through as HTML |
| <sup> | <sup>text</sup> | No | Passed through as HTML |
| <abbr> | <abbr>text</abbr> | No | No Markdown equivalent |
| <details> | <details>...</details> | No | HTML passthrough |
| <mark> | <mark>text</mark> | No | No native Markdown |
| <kbd> | <kbd>text</kbd> | No | Semantic HTML preserved |
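
As an illustration of the `<table>` row above, the following sketch builds a GFM pipe table with an alignment-aware separator row. The function names `separatorCell` and `pipeTable` and the input shape (arrays of plain strings) are assumptions made for this example; the real converter would read these values from the DOM.

```javascript
// Map an alignment value (from style.textAlign or the align attribute)
// to the matching GFM separator cell.
function separatorCell(align) {
  switch ((align || "").toLowerCase()) {
    case "left":
      return ":---";
    case "center":
      return ":---:";
    case "right":
      return "---:";
    default:
      return "---"; // no alignment specified
  }
}

// Build a pipe table from a header row, per-column alignments, and body rows.
// Literal pipes inside cells are escaped so they do not split the column.
function pipeTable(header, aligns, rows) {
  const escapeCell = (cell) => String(cell).replace(/\|/g, "\\|");
  const line = (cells) => "| " + cells.map(escapeCell).join(" | ") + " |";
  const separator =
    "| " + header.map((_, i) => separatorCell(aligns[i])).join(" | ") + " |";
  return [line(header), separator, ...rows.map(line)].join("\n");
}

console.log(pipeTable(["Name", "Qty"], ["left", "right"], [["a|b", 2]]));
```

Columns without a recognized alignment fall back to a plain `---` separator, which leaves alignment to the renderer's default.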