Compare Two XML Files
Compare two XML files side by side. Detect added, removed, and modified elements, attributes, and text nodes with precise XPath locations.
About
XML comparison is a structural problem, not a text problem. Running a plain text diff on XML produces noise: whitespace changes, attribute reordering, and namespace declarations all trigger false positives that obscure real modifications. This tool parses both documents into DOM trees and performs recursive node-by-node comparison. It detects added elements, removed elements, modified attributes, changed text content, and reordered children. Each difference reports a computed XPath so you can locate it in your source file without guessing. The algorithm handles attribute order normalization and ignores insignificant whitespace between tags.
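The recursive comparison described above can be sketched in Python with the standard-library `xml.etree.ElementTree` parser. This is a minimal illustration, not the tool's actual implementation: the function names `child_paths`, `diff_elements`, and `diff_xml` are hypothetical, children are paired by index only (no reorder detection), and comments and processing instructions are not compared because the default parser discards them.

```python
import xml.etree.ElementTree as ET


def child_paths(parent, path):
    """Return one XPath per child, using a 1-based index among same-tag siblings."""
    seen = {}
    paths = []
    for child in parent:
        seen[child.tag] = seen.get(child.tag, 0) + 1
        paths.append(f"{path}/{child.tag}[{seen[child.tag]}]")
    return paths


def diff_elements(left, right, path):
    """Yield (kind, xpath, detail) for two elements at the same tree position."""
    if left.tag != right.tag:
        yield ("tag renamed", path, f"{left.tag} -> {right.tag}")
        return
    # attrib is a dict, so attribute order never matters here.
    for key in sorted(set(left.attrib) | set(right.attrib)):
        old, new = left.attrib.get(key), right.attrib.get(key)
        if new is None:
            yield ("attribute removed", f"{path}/@{key}", old)
        elif old is None:
            yield ("attribute added", f"{path}/@{key}", new)
        elif old != new:
            yield ("attribute modified", f"{path}/@{key}", f"{old} -> {new}")
    # Stripping drops indentation-only text between tags.
    old_text, new_text = (left.text or "").strip(), (right.text or "").strip()
    if old_text != new_text:
        yield ("text modified", path, f"{old_text!r} -> {new_text!r}")
    # Assumption: children are paired by index; a real tool would align them first.
    left_kids, right_kids = list(left), list(right)
    left_paths, right_paths = child_paths(left, path), child_paths(right, path)
    for i in range(max(len(left_kids), len(right_kids))):
        if i >= len(right_kids):
            yield ("removed", left_paths[i], left_kids[i].tag)
        elif i >= len(left_kids):
            yield ("added", right_paths[i], right_kids[i].tag)
        else:
            yield from diff_elements(left_kids[i], right_kids[i], left_paths[i])


def diff_xml(left_source, right_source):
    """Parse two XML strings and return the list of differences."""
    left, right = ET.fromstring(left_source), ET.fromstring(right_source)
    return list(diff_elements(left, right, f"/{left.tag}"))


old = '<config timeout="30" mode="a"><name>app</name><version>1.0</version></config>'
new = """<config mode="a" timeout="60">
  <name>app</name>
  <version>2.0</version>
  <debug/>
</config>"""
for change in diff_xml(old, new):
    print(change)
```

For this sample pair, the sketch reports the modified attribute at `/config/@timeout`, the changed text at `/config/version[1]`, and the added element at `/config/debug[1]`. The reordered attributes and the indentation in the second document produce no differences.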
Incorrect XML merges cause silent data loss in configuration files, API contracts, and build pipelines. A single missing attribute in a Maven pom.xml or a Spring applicationContext.xml can break deployments with no compile-time warning. This tool assumes well-formed XML input and does not validate against XSD or DTD schemas. Comparison is namespace-aware, so ns:element and element are treated as distinct nodes. For documents exceeding 10,000 nodes, expect processing times of roughly 2 to 5 seconds, depending on browser and hardware.
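To see why a prefixed and an unprefixed element count as different nodes, it helps to look at how a namespace-aware parser stores tag names. The sketch below uses Python's `xml.etree.ElementTree`, which expands every qualified name to `{uri}local` form. The helper `split_tag` is hypothetical, and whether this tool compares by URI or by prefix is not stated in this description, so treat the last comparison as a property of the parser rather than of the tool.

```python
import xml.etree.ElementTree as ET


def split_tag(tag):
    """Split ElementTree's '{uri}local' tag into (namespace URI, local name)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


plain = ET.fromstring("<element/>")
old_ns = ET.fromstring('<ns:element xmlns:ns="http://old.ns"/>')
new_ns = ET.fromstring('<ns:element xmlns:ns="http://new.ns"/>')
alias = ET.fromstring('<x:element xmlns:x="http://old.ns"/>')

print(split_tag(plain.tag))    # no namespace
print(split_tag(old_ns.tag))   # namespace URI plus local name
print(plain.tag == old_ns.tag)   # unprefixed vs. prefixed
print(old_ns.tag == new_ns.tag)  # same prefix, different URI
print(old_ns.tag == alias.tag)   # different prefix, same URI
```

All four elements share the local name `element`, but only the last pair compares equal, because the parser keys on the namespace URI rather than on the prefix text.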
Formulas
The comparison algorithm computes structural differences using recursive tree traversal. For each pair of nodes at equivalent positions, the diff function compares node type, tag name, attributes, and text content, then recurses into the aligned children.
Child node alignment uses a simplified LCS (Longest Common Subsequence) approach. The match score between two element nodes $a$ and $b$ is:

$$\text{score}(a, b) = w_{\text{tag}} \cdot \delta(\text{tag}_a, \text{tag}_b) + w_{\text{attr}} \cdot J(A, B)$$

Where $w_{\text{tag}} = 1.0$ is the tag name weight, $w_{\text{attr}} = 0.5$ is the attribute similarity weight, $\delta$ is the Kronecker delta (1 if equal, 0 otherwise), and $J$ is the Jaccard index of the attribute key sets $A$ and $B$ of the two nodes: $J(A, B) = \frac{|A \cap B|}{|A \cup B|}$. XPath generation concatenates ancestor tag names with positional predicates: path = /root/child[n]/node[m].
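The scoring and alignment steps can be sketched in Python as follows. This assumes the score is the weighted sum of the tag-name delta and the Jaccard index of attribute keys, using the weights given above. Several details are assumptions made for illustration only: the function names, the `threshold` parameter that decides when two nodes count as a match, and the convention that two empty attribute sets have a Jaccard index of 1.0 (the ratio is otherwise undefined).

```python
import xml.etree.ElementTree as ET

W_TAG = 1.0   # tag name weight
W_ATTR = 0.5  # attribute similarity weight


def jaccard(a, b):
    """Jaccard index of two key collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumption: two empty sets are treated as identical
    return len(a & b) / len(a | b)


def match_score(left, right):
    """Weighted sum of tag equality (Kronecker delta) and attribute-key overlap."""
    delta = 1.0 if left.tag == right.tag else 0.0
    return W_TAG * delta + W_ATTR * jaccard(left.attrib, right.attrib)


def align_children(left_kids, right_kids, threshold=1.0):
    """Return index pairs (i, j) of an LCS-style alignment of two child lists.

    Two nodes count as a match when their score reaches the (hypothetical)
    threshold; with the weights above, 1.0 means "same tag name".
    """
    n, m = len(left_kids), len(right_kids)
    # table[i][j] = length of the best alignment of left_kids[i:] and right_kids[j:]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if match_score(left_kids[i], right_kids[j]) >= threshold:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the start to recover the matched pairs.
    pairs, i, j = [], 0, 0
    while i < n and j < m:
        if match_score(left_kids[i], right_kids[j]) >= threshold:
            pairs.append((i, j))
            i, j = i + 1, j + 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1  # left child is unmatched (removed)
        else:
            j += 1  # right child is unmatched (added)
    return pairs


left = ET.fromstring("<r><a/><b/><c/></r>")
right = ET.fromstring("<r><a/><x/><b/><c/></r>")
print(align_children(list(left), list(right)))
```

Indices missing from the returned pairs are the candidates for "added" and "removed" entries: here the right-hand `<x/>` at index 1 has no partner, so it would be reported as added rather than shifting every later sibling out of alignment.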
Reference Data
| Diff Type | Symbol | Meaning | Example Scenario |
|---|---|---|---|
| Added | + | Node exists only in right (new) XML | New <dependency> block added to pom.xml |
| Removed | − | Node exists only in left (old) XML | Deprecated <filter> removed from web.xml |
| Modified (Text) | Δ | Same element path, different text content | Version number changed from 1.0 to 2.0 |
| Modified (Attribute) | Δ attr | Same element, attribute value differs | timeout="30" changed to timeout="60" |
| Attribute Added | + attr | Attribute exists only in right XML | New enabled="true" attribute on <feature> |
| Attribute Removed | − attr | Attribute exists only in left XML | Removed deprecated class="old" attribute |
| Type Mismatch | ≠ | Node types differ at same position | Element replaced with comment node |
| Tag Renamed | ≠ tag | Different tag name at same tree position | <user> renamed to <account> |
| Child Count | ± | Different number of child elements | List grew from 3 to 5 items |
| Order Changed | ↔ | Same children but in different sequence | <first> and <last> swapped positions |
| Namespace Diff | ns | Same local name, different namespace URI | Migrated from http://old.ns to http://new.ns |
| CDATA vs Text | Δ | CDATA section replaced with plain text | <![CDATA[...]]> unwrapped to text node |
| Comment Diff | Δ ! | Comment content changed | TODO note updated in config file |
| Processing Instruction | Δ ? | PI target or data changed | <?xml-stylesheet?> href modified |