CSV to XML Converter
Convert CSV data to structured XML with custom element mapping, attribute support, XPath-like paths, and primary key grouping. Free online tool.
About
Structural mismatch between tabular CSV and hierarchical XML is a persistent data-engineering problem. A naive row-to-element conversion produces flat, unusable XML that fails schema validation in systems expecting nested structures, attributes, and grouped records. This converter implements a column-to-XPath mapping engine: each CSV header maps to a target path such as root/parent/child/text() or root/element/@attr. The parser follows RFC 4180 for CSV, handling quoted fields, embedded delimiters, and escaped double-quotes. Optional primary-key grouping collapses N rows sharing the same key into a single parent element with repeated child nodes. The tool approximates the behavior of XSLT-based pipelines but runs entirely in the browser with zero server round-trips.
Limitations: XML namespace declarations are not auto-generated. If your target schema requires xmlns prefixes, add them manually to the root element name field. Very large files (above 5 MB) may cause browser memory pressure. For production ETL pipelines processing millions of rows, a server-side streaming solution remains appropriate.
Formulas
The conversion pipeline operates in three discrete stages. First, the CSV parser tokenizes input using a finite-state machine with states S ∈ {FIELD_START, UNQUOTED, QUOTED, QUOTE_IN_QUOTED}. Transitions depend on the current character c and configured delimiter d.
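The four-state tokenizer described above can be sketched in Python as follows. This is an illustrative sketch, not the tool's own browser code: the function name `parse_csv`, the helper names, and the lenient handling of stray characters after a closing quote are assumptions made for the example.

```python
from enum import Enum, auto


class State(Enum):
    FIELD_START = auto()
    UNQUOTED = auto()
    QUOTED = auto()
    QUOTE_IN_QUOTED = auto()


def parse_csv(text: str, delimiter: str = ",") -> list[list[str]]:
    """Tokenize CSV text into rows of fields with a four-state machine."""
    rows: list[list[str]] = []
    row: list[str] = []
    field: list[str] = []
    state = State.FIELD_START

    def end_field() -> None:
        row.append("".join(field))
        field.clear()

    def end_row() -> None:
        nonlocal row
        end_field()
        rows.append(row)
        row = []

    # Simplification (assumption): CRLF is normalized everywhere,
    # including inside quoted fields.
    for c in text.replace("\r\n", "\n"):
        if state is State.QUOTED:
            if c == '"':
                state = State.QUOTE_IN_QUOTED  # closing quote or first half of ""
            else:
                field.append(c)  # delimiters and newlines are literal here
        elif state is State.QUOTE_IN_QUOTED and c == '"':
            field.append('"')  # "" inside quotes is one literal quote
            state = State.QUOTED
        elif c == delimiter:
            end_field()
            state = State.FIELD_START
        elif c == "\n":
            end_row()
            state = State.FIELD_START
        elif state is State.FIELD_START and c == '"':
            state = State.QUOTED
        else:
            field.append(c)
            state = State.UNQUOTED

    # Flush the last record when the input has no trailing newline.
    if state is not State.FIELD_START or field or row:
        end_row()
    return rows


rows = parse_csv('id,note\n1,"said ""hi"", then left"\n')
```

Here `rows` holds a header row and one data row whose second field contains both a comma and a literal double quote, which is the case a plain split on the delimiter would get wrong.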
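Second, the mapping engine processes each row. For a mapping target path P = p1/p2/.../pn, the algorithm walks from the row's root element, creating intermediate elements as needed. The terminal segment pn determines the action: if pn is text(), the cell value becomes the text content of the parent element; if pn is @attr, the value is set as attribute attr on the parent element; otherwise pn creates or selects a child element.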
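A minimal Python sketch of the path walk, using the standard library's `xml.etree.ElementTree`. The function name `apply_mapping` is hypothetical, and two behaviors are assumptions rather than documented rules of the tool: the first path segment is taken to name the row element itself, and a path that ends on a plain element name is treated as if it ended in text().

```python
import xml.etree.ElementTree as ET


def apply_mapping(row_elem: ET.Element, path: str, value: str) -> None:
    """Write one cell value into row_elem at an XPath-like target path."""
    segments = [s for s in path.split("/") if s]
    # Assumption: the first segment names the row element, so skip it.
    if segments and segments[0] == row_elem.tag:
        segments = segments[1:]

    node = row_elem
    for seg in segments:
        if seg.startswith("@"):
            node.set(seg[1:], value)  # attribute on the current element
            return
        if seg == "text()":
            node.text = value  # text content of the current element
            return
        child = node.find(seg)  # reuse an existing intermediate element
        if child is None:
            child = ET.SubElement(node, seg)  # or create it on demand
        node = child

    # Assumption: a path ending on an element name sets that element's text.
    node.text = value


book = ET.Element("book")
apply_mapping(book, "book/@isbn", "978-0")
apply_mapping(book, "book/title/text()", "Dune")
apply_mapping(book, "book/author/name/text()", "Herbert")
xml_out = ET.tostring(book, encoding="unicode")
```

ElementTree escapes special characters during serialization, so the cell values can be passed in unescaped.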
Third, when primary-key grouping is enabled, rows sharing key k are merged. The number of parent elements E in the output relates to the input rows R and the set of unique keys K:
E = |K| ≤ |R|
where K = {ri[primaryKey] | ri ∈ R}. Each unique key produces one parent element. Repeated-key rows append child elements within that parent.
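The grouping step can be illustrated with a short Python sketch. The function name `group_rows`, the element names `root`, `record`, and `item`, and the `key` attribute are all hypothetical choices for the example; the sketch also assumes every column header is a valid XML element name.

```python
import xml.etree.ElementTree as ET


def group_rows(rows: list[dict[str, str]], primary_key: str,
               parent_tag: str = "record", child_tag: str = "item") -> ET.Element:
    """Collapse rows that share a primary key into one parent element each."""
    root = ET.Element("root")
    parents: dict[str, ET.Element] = {}  # key value -> parent element

    for row in rows:
        key = row[primary_key]
        parent = parents.get(key)
        if parent is None:
            # First row with this key: create the single parent element.
            parent = ET.SubElement(root, parent_tag, {"key": key})
            parents[key] = parent
        # Every row, first or repeated, contributes one child element.
        child = ET.SubElement(parent, child_tag)
        for column, value in row.items():
            if column != primary_key:
                ET.SubElement(child, column).text = value
    return root


orders = [
    {"order": "A1", "sku": "pen"},
    {"order": "A1", "sku": "ink"},
    {"order": "B7", "sku": "pad"},
]
grouped = group_rows(orders, "order")
```

Three input rows with two distinct keys yield two parent elements, the first holding two children and the second holding one.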
Reference Data
| XPath-like Token | Meaning | Example Path | Resulting XML |
|---|---|---|---|
| element | Creates/selects child element | book/title | <book><title>...</title></book> |
| text() | Sets text content of parent element | book/title/text() | <title>Value</title> |
| @attr | Sets attribute on parent element | book/@isbn | <book isbn="Value"> |
| parent/child | Nested elements via slash separator | a/b/c/text() | <a><b><c>Val</c></b></a> |
| @code on intermediate | Attribute on any nesting level | item/type/@code | <item><type code="Val"/></item> |

CSV Delimiter Reference

| Delimiter | Character | Typical Use | Notes |
|---|---|---|---|
| Comma | , | Default RFC 4180 | Most common format |
| Semicolon | ; | European locale CSVs | Used when decimal is comma |
| Tab | \t | TSV files | Database exports |
| Pipe | \| | Legacy mainframe | Fixed-width alternatives |

XML Special Character Escaping

| Character | Escaped Form | Name | Where Escaped |
|---|---|---|---|
| & | &amp; | Ampersand | Always escaped in text/attrs |
| < | &lt; | Less than | Escaped in text content |
| > | &gt; | Greater than | Escaped in text content |
| " | &quot; | Double quote | Escaped in attribute values |
| ' | &apos; | Apostrophe | Escaped in attribute values |

Common XML Encoding Declarations

| Encoding | Declaration | Typical Use | Notes |
|---|---|---|---|
| UTF-8 | <?xml version="1.0" encoding="UTF-8"?> | Default | Most web systems |
| UTF-16 | encoding="UTF-16" | Windows legacy | BOM required |
| ISO-8859-1 | encoding="ISO-8859-1" | Latin-1 | Legacy European |
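
The special-character escaping rules in the reference data can be sketched as two small Python helpers. The names `escape_text` and `escape_attr` are hypothetical, and this is an illustration rather than the tool's own code. The ampersand must be replaced first, otherwise the ampersands introduced by the other replacements would be escaped a second time.

```python
def escape_text(value: str) -> str:
    """Escape a value for use as XML text content."""
    # Replace "&" first so the entities added below are left intact.
    return (value.replace("&", "&amp;")
                 .replace("<", "&lt;")
                 .replace(">", "&gt;"))


def escape_attr(value: str) -> str:
    """Escape a value for use inside a quoted XML attribute."""
    return (escape_text(value)
            .replace('"', "&quot;")
            .replace("'", "&apos;"))


text_out = escape_text("a < b & c")
attr_out = escape_attr('5" x 3\'')
```

In Python, `xml.sax.saxutils.escape` from the standard library covers the text-content case; the explicit version here just makes the replacement order visible.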