DOM Tree to JSON Converter
Convert HTML DOM tree structures to clean JSON format. Parse any HTML markup into nested JSON objects with tags, attributes, and children.
About
HTML documents form a tree of nodes. Each element carries a tag name, a map of attributes, and zero or more children. Manually reconstructing this hierarchy into a data-interchange format like JSON is error-prone: missed closing tags, misread nesting depth, or dropped attribute values. This converter uses the browser's native DOMParser API to build a spec-compliant parse tree, then performs a depth-first recursive walk to emit a clean JSON object. The output preserves node type (ELEMENT, TEXT, COMMENT), all attribute key-value pairs, and the full child hierarchy. Note: the parser follows HTML5 error-recovery rules, so malformed markup will be silently corrected rather than rejected.
Typical use cases include automated testing fixtures, CMS migration scripts, and accessibility audits where you need a machine-readable snapshot of a page fragment. The tool handles documents up to roughly 500 KB of markup in the main thread; larger inputs are offloaded to a Web Worker. Pro tip: inline style attributes appear verbatim in the JSON output as raw strings, while character entities such as &amp; are resolved to their text equivalents by the parser. Filter or normalize them downstream if your pipeline requires clean data.
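As an illustration of downstream filtering, here is a minimal TypeScript sketch that removes chosen attributes (for example `style`) from a converted tree. The `JsonNode` type and the `stripAttributes` helper are hypothetical names introduced for this example; they assume the `{ tag, attributes, children }` output shape described on this page and are not part of the tool itself.

```typescript
// Assumed shape of the converter output (hypothetical type name).
type JsonNode =
  | { tag: string; attributes: Record<string, string>; children: JsonNode[] }
  | { type: string; content: string };

// Return a copy of the tree with the listed attributes removed from every
// element. The input tree is left unmodified.
function stripAttributes(node: JsonNode, drop: Set<string>): JsonNode {
  // Text, comment, and similar leaf nodes carry no attributes.
  if (!("tag" in node)) return node;

  const attributes: Record<string, string> = {};
  for (const key of Object.keys(node.attributes)) {
    if (!drop.has(key)) attributes[key] = node.attributes[key];
  }

  return {
    tag: node.tag,
    attributes,
    children: node.children.map((child) => stripAttributes(child, drop)),
  };
}
```

Calling `stripAttributes(tree, new Set(["style"]))` yields a tree without inline styles; other attributes and all text content are kept as they were.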
Formulas
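The conversion algorithm performs a recursive depth-first traversal. For each element node N in the DOM tree, the mapping function f produces a JSON object:
$$f(N) = \{\, \text{tag}: \text{name}(N),\ \text{attributes}: A(N),\ \text{children}: [\, f(c_0), \dots, f(c_{k-1}) \,] \,\}$$
Where c_0 … c_{k-1} are the k child nodes of N. The attribute mapping function A iterates the NamedNodeMap:
$$A(N) = \{\, a.\text{name} : a.\text{value} \mid a \in N.\text{attributes} \,\}$$
Text node handling applies a whitespace filter predicate P:
$$P(t) = \begin{cases} \text{true} & \text{if } \operatorname{trim}(t) \neq \varepsilon \\ \text{false} & \text{otherwise} \end{cases}$$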
Where t is the text content of the node and ε is the empty string. Only text nodes satisfying P are included unless the "include whitespace" option is enabled. The total node count in the output is bounded by the recurrence $T(N) = 1 + \sum_{i=0}^{k-1} T(c_i)$, where k is the number of children of N and c_i is its i-th child. Time complexity is O(n), where n is the total node count.
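The mapping function, the attribute mapping, and the whitespace predicate can be sketched in TypeScript as follows. This is a minimal illustration rather than the tool's actual source: `NodeLike`, `AttrLike`, `toJson`, `mapAttributes`, `hasContent`, and `countNodes` are hypothetical names, only element and text nodes are handled, and lowercasing the tag name is an assumption of this sketch. The code works on a small structural interface so that it runs on plain objects as well as on browser DOM nodes.

```typescript
// Minimal structural view of a DOM node (hypothetical interface names).
interface AttrLike { name: string; value: string; }
interface NodeLike {
  nodeType: number;                  // 1 = element, 3 = text
  nodeName: string;
  nodeValue: string | null;
  attributes?: ArrayLike<AttrLike>;  // present on elements only
  childNodes: ArrayLike<NodeLike>;
}

type JsonNode =
  | { tag: string; attributes: Record<string, string>; children: JsonNode[] }
  | { type: "text"; content: string };

// A(N): copy the attribute list into a plain key-value object.
function mapAttributes(attrs: ArrayLike<AttrLike> | undefined): Record<string, string> {
  const out: Record<string, string> = {};
  if (!attrs) return out;
  for (const attr of Array.from(attrs)) {
    out[attr.name] = attr.value;
  }
  return out;
}

// P(t): true when the text contains something other than whitespace.
function hasContent(t: string): boolean {
  return t.trim().length > 0;
}

// f(N): depth-first recursive mapping. Returns null for nodes that are dropped.
function toJson(node: NodeLike, includeWhitespace = false): JsonNode | null {
  if (node.nodeType === 3) {
    const t = node.nodeValue ?? "";
    return includeWhitespace || hasContent(t) ? { type: "text", content: t } : null;
  }
  if (node.nodeType !== 1) return null; // other node types are out of scope here

  const children: JsonNode[] = [];
  for (const child of Array.from(node.childNodes)) {
    const mapped = toJson(child, includeWhitespace);
    if (mapped !== null) children.push(mapped);
  }
  return {
    tag: node.nodeName.toLowerCase(), // assumption: tags are emitted in lowercase
    attributes: mapAttributes(node.attributes),
    children,
  };
}

// T(N): one for the node itself plus the counts of its children.
function countNodes(j: JsonNode): number {
  if ("tag" in j) {
    return 1 + j.children.reduce((sum, c) => sum + countNodes(c), 0);
  }
  return 1;
}
```

In a browser, the root passed to `toJson` could come from `new DOMParser().parseFromString(html, "text/html").body`. Each node is visited exactly once, which matches the O(n) bound stated above.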
Reference Data
| Node Type | nodeType Value | JSON Representation | Included by Default |
|---|---|---|---|
| Element | 1 | { tag, attributes, children } | Yes |
| Text | 3 | { type: "text", content } | Yes (non-empty) |
| Comment | 8 | { type: "comment", content } | Optional |
| CDATA Section | 4 | { type: "cdata", content } | Optional |
| Document | 9 | Root wrapper | Skipped (children used) |
| DocumentType | 10 | { type: "doctype", name } | Optional |
| DocumentFragment | 11 | Root wrapper | Skipped (children used) |
| Attribute | 2 | Merged into parent attributes | Always |
| Processing Instruction | 7 | { type: "pi", target, data } | Optional |
| Entity Reference | 5 | Resolved to text | Automatic |
| Void Elements (br, img, hr, input) | 1 | { tag, attributes, children: [] } | Yes |
| SVG Elements | 1 | Namespace preserved in tag | Yes |
| Custom Elements (web components) | 1 | Hyphenated tag preserved | Yes |
| Template Content | 11 | Fragment children extracted | Optional |
| Whitespace-only Text | 3 | Filtered out | No (configurable) |
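The non-element rows of the table can be read as a dispatch on `nodeType`. The sketch below shows one way to express that in TypeScript. The `Options` field names and the `mapLeaf` helper are hypothetical, chosen for this example; the tool's real option names are not documented here. Elements, documents, and fragments are left to the recursive walk and return `null` from this helper.

```typescript
// Minimal structural view of a non-element DOM node (hypothetical name).
interface LeafNodeLike {
  nodeType: number;
  nodeName: string;          // PI target or doctype name for those node types
  nodeValue: string | null;  // character data for text, comment, CDATA, PI
}

// Hypothetical toggles for the rows marked "Optional" or "configurable".
interface Options {
  comments: boolean;
  cdata: boolean;
  doctype: boolean;
  processingInstructions: boolean;
  whitespace: boolean;
}

type Leaf =
  | { type: "text" | "comment" | "cdata"; content: string }
  | { type: "doctype"; name: string }
  | { type: "pi"; target: string; data: string };

// Map a non-element node to its JSON form, or null when it is excluded.
function mapLeaf(node: LeafNodeLike, opts: Options): Leaf | null {
  const value = node.nodeValue ?? "";
  switch (node.nodeType) {
    case 3: // Text: kept when non-empty, or when whitespace is enabled
      return opts.whitespace || value.trim() !== ""
        ? { type: "text", content: value }
        : null;
    case 4: // CDATA section
      return opts.cdata ? { type: "cdata", content: value } : null;
    case 7: // Processing instruction: nodeName holds the target
      return opts.processingInstructions
        ? { type: "pi", target: node.nodeName, data: value }
        : null;
    case 8: // Comment
      return opts.comments ? { type: "comment", content: value } : null;
    case 10: // DocumentType: nodeName holds the doctype name
      return opts.doctype ? { type: "doctype", name: node.nodeName } : null;
    default: // elements and root wrappers are handled by the tree walk
      return null;
  }
}
```

With every option set to `false`, only non-empty text nodes survive, which corresponds to the "Included by Default" column for the non-element rows.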