HTML to XML Converter
Convert loose HTML to strict, well-formed XML. Fixes unclosed tags, quotes attributes, and handles void elements. Free online developer tool.
About
This HTML to XML Converter transforms loose, messy HTML into strict, well-formed XML (or XHTML). Browsers forgive errors such as missing closing tags or unquoted attributes; XML parsers do not, and reject any document that is not well-formed. This tool parses your HTML using the browser's native engine and re-serializes it into a structure that strict XML parsers and other automated tools can consume.
It automatically handles tasks such as expanding boolean attributes (e.g., changing checked to checked="checked"), closing void elements like img and br, and converting HTML named entities into their numeric XML equivalents so that the output parses without an external DTD.
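The attribute and void-element handling described above can be sketched in Python. This is only an illustration: the tool itself relies on the browser's parser, whereas this sketch substitutes the standard-library html.parser module, and the names XmlEmitter and html_fragment_to_xml are hypothetical. Unlike a browser engine, it does not repair unclosed tags.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# HTML void elements: they never take a closing tag.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "source", "track", "wbr",
}


class XmlEmitter(HTMLParser):
    """Hypothetical helper that re-emits HTML parse events as XML markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def _attrs(self, attrs):
        out = []
        for name, value in attrs:
            if value is None:
                # Boolean attribute such as `checked`: repeat the name as value.
                value = name
            # quoteattr adds the surrounding quotes and escapes &, < and >.
            out.append(f" {name}={quoteattr(value)}")
        return "".join(out)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_ELEMENTS:
            self.parts.append(f"<{tag}{self._attrs(attrs)} />")
        else:
            self.parts.append(f"<{tag}{self._attrs(attrs)}>")

    def handle_startendtag(self, tag, attrs):
        # Source already used the <tag /> form.
        self.parts.append(f"<{tag}{self._attrs(attrs)} />")

    def handle_endtag(self, tag):
        # Stray closing tags for void elements (e.g. </br>) are dropped.
        if tag not in VOID_ELEMENTS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Character references were already decoded; re-escape &, < and >.
        self.parts.append(escape(data))


def html_fragment_to_xml(html):
    emitter = XmlEmitter()
    emitter.feed(html)
    emitter.close()
    return "".join(emitter.parts)


print(html_fragment_to_xml("<p class=box>Hi<br><input type=checkbox checked></p>"))
# <p class="box">Hi<br /><input type="checkbox" checked="checked" /></p>
```

Because html.parser lowercases tag and attribute names, the output also ends up with consistent casing, which XML requires for matching open and close tags.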
Formulas
The converter enforces the well-formedness rules defined by the W3C XML specification. The core transformation logic for an element E can be summarized as:
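$$
\mathrm{XML}(E) =
\begin{cases}
\texttt{<tag attrs />} & \text{if } E \text{ is a void element} \\
\texttt{<tag attrs>} \; \sum_{i=0}^{n} \mathrm{XML}(\mathrm{child}_i) \; \texttt{</tag>} & \text{otherwise}
\end{cases}
$$
Where the sum denotes concatenation of the serialized children in document order, and attrs represents the normalization of every attribute k=v such that each value v is enclosed in double quotes " and special characters are escaped.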
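The recursive definition can be sketched in Python, reading the sum as string concatenation over the children and assuming void elements take the self-closing form listed in the reference table. The Element class and the function names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field
from xml.sax.saxutils import escape, quoteattr

VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "source", "track", "wbr",
}


@dataclass
class Element:
    """Hypothetical minimal tree node: children are Elements or text strings."""
    tag: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def serialize_attrs(attrs):
    # Normalize k=v: quote every value; a None value marks a boolean attribute.
    return "".join(
        f" {k}={quoteattr(k if v is None else v)}" for k, v in attrs.items()
    )


def to_xml(node):
    if isinstance(node, str):
        return escape(node)  # text node: escape &, < and >
    attrs = serialize_attrs(node.attrs)
    if node.tag in VOID_ELEMENTS:
        return f"<{node.tag}{attrs} />"  # void branch
    # "otherwise" branch: concatenate XML(child_i) between the tag pair
    inner = "".join(to_xml(child) for child in node.children)
    return f"<{node.tag}{attrs}>{inner}</{node.tag}>"


tree = Element("div", {"class": "box"}, [
    "a < b",
    Element("br"),
    Element("input", {"disabled": None}),
])
print(to_xml(tree))
# <div class="box">a &lt; b<br /><input disabled="disabled" /></div>
```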
Reference Data
| Feature | HTML (Loose) | XML (Strict) |
|---|---|---|
| Void Tags | <br>, <hr>, <img src=...> | <br />, <hr />, <img src=... /> |
| Attributes | <div class=box> (Unquoted) | <div class="box"> (Always Quoted) |
| Boolean Attrs | <input disabled> | <input disabled="disabled" /> |
| Case Sensitivity | <DIV> = <div> (case-insensitive) | <DIV> ≠ <div> (case-sensitive; open and close tags must match exactly) |
| Root Element | Optional (implicit body) | Required (Single root node) |
| Entities | &nbsp;, &copy; | &#160;, &#169; (or declared) |
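The entity row can be illustrated with a short Python sketch that rewrites HTML named entities as numeric character references and leaves XML's five predefined entities alone. The function name named_to_numeric and the sample snippet are hypothetical, and the lookup table is the standard-library html.entities.name2codepoint, which covers the HTML 4 entity names only.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# Entities every XML parser understands without a DTD.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(text):
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave predefined or unknown names as-is
        return f"&#{name2codepoint[name]};"
    return ENTITY_RE.sub(replace, text)


snippet = "<p>&copy;&nbsp;Example &amp; Co.</p>"
converted = named_to_numeric(snippet)
print(converted)
# <p>&#169;&#160;Example &amp; Co.</p>

# The converted string parses as XML; the original would raise
# xml.etree.ElementTree.ParseError because &copy; is undeclared.
root = ET.fromstring(converted)
```

Parsing the result with a strict XML parser, as in the last line, is a quick way to confirm that no undeclared entity survived the conversion.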