About

XML comparison is a structural problem, not a text problem. Running a plain text diff on XML produces noise: whitespace changes, attribute reordering, and namespace declarations all trigger false positives that obscure real modifications. This tool parses both documents into DOM trees and performs recursive node-by-node comparison. It detects added elements, removed elements, modified attributes, changed text content, and reordered children. Each difference reports a computed XPath so you can locate it in your source file without guessing. The algorithm handles attribute order normalization and ignores insignificant whitespace between tags.

Incorrect XML merges cause silent data loss in configuration files, API contracts, and build pipelines. A single missing attribute in a Maven pom.xml or a Spring applicationContext.xml can break deployments with no compile-time warning. This tool assumes well-formed XML input. It does not validate against XSD or DTD schemas. Namespace-aware comparison treats ns:element and element as distinct nodes. For documents exceeding 10,000 nodes, expect processing times of roughly 2 to 5 seconds, depending on browser and hardware.

Formulas

The comparison algorithm computes structural difference using recursive tree traversal. For each pair of nodes at equivalent positions, the diff function evaluates:

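\[
\operatorname{diff}(L, R) =
\begin{cases}
\text{REMOVED} & \text{if } R = \text{NULL} \\
\text{ADDED} & \text{if } L = \text{NULL} \\
\text{MODIFIED} & \text{if } \operatorname{type}(L) \neq \operatorname{type}(R) \lor \operatorname{name}(L) \neq \operatorname{name}(R) \\
\operatorname{diff}(\operatorname{children}(L), \operatorname{children}(R)) & \text{otherwise}
\end{cases}
\]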
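The case analysis above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's actual code: it uses the standard library's xml.etree.ElementTree instead of a browser DOM, pairs children strictly by position (no LCS alignment), and only sees element nodes, so the type check collapses into the tag-name check. The function name diff and the /*[i] path format are choices made for this example.

```python
import xml.etree.ElementTree as ET
from itertools import zip_longest


def diff(left, right, path):
    """Return a list of (kind, path) tuples following the four cases above."""
    if right is None:
        return [("REMOVED", path)]
    if left is None:
        return [("ADDED", path)]
    # ElementTree exposes only element nodes here, so a type mismatch
    # cannot occur and comparing tags covers the MODIFIED case.
    if left.tag != right.tag:
        return [("MODIFIED", path)]
    results = []
    # Pair children by position; zip_longest pads the shorter side with None.
    pairs = zip_longest(list(left), list(right))
    for i, (l_child, r_child) in enumerate(pairs, start=1):
        # /*[i] is the XPath step for "the i-th element child".
        results += diff(l_child, r_child, f"{path}/*[{i}]")
    return results


old = ET.fromstring("<config><a/><b/></config>")
new = ET.fromstring("<config><a/><c/><d/></config>")
for kind, where in diff(old, new, "/config"):
    print(kind, where)
```

Pairing by position is the simplest possible alignment. A single insertion near the start of a long child list would shift every later pair, which is the problem that an LCS-style alignment is meant to avoid.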

Child node alignment uses a simplified LCS (Longest Common Subsequence) approach. The match score between two element nodes is:

score(a, b) = w_tag ⋅ δ(tag(a), tag(b)) + w_attr ⋅ J(attrs(a), attrs(b))

Where w_tag = 1.0 is the tag name weight, w_attr = 0.5 is the attribute similarity weight, δ is the Kronecker delta (1 if equal, 0 otherwise), and J is the Jaccard index of attribute key sets: J(A, B) = |A ∩ B| / |A ∪ B|. XPath generation concatenates ancestor tag names with positional predicates: path = /root/child[n]/node[m].
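As a worked example of the match score, the following Python sketch evaluates the formula with the weights stated above. The names W_TAG, W_ATTR, jaccard, and score are illustrative, and treating two empty attribute sets as fully similar (J = 1) is an assumption made here, since the formula leaves that case undefined.

```python
import xml.etree.ElementTree as ET

W_TAG = 1.0   # tag name weight, as stated in the text
W_ATTR = 0.5  # attribute similarity weight, as stated in the text


def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two sets."""
    union = a | b
    if not union:
        # Assumption: two elements without attributes count as identical.
        return 1.0
    return len(a & b) / len(union)


def score(a, b):
    """Match score between two elements: tag equality plus attribute-key overlap."""
    delta = 1.0 if a.tag == b.tag else 0.0
    return W_TAG * delta + W_ATTR * jaccard(set(a.attrib), set(b.attrib))


a = ET.fromstring('<item id="1" name="x"/>')
b = ET.fromstring('<item id="2" size="3"/>')
# Same tag (delta = 1); the key sets {id, name} and {id, size} share 1 of 3 keys.
print(score(a, b))  # 1.0 + 0.5 * (1/3)
```

Note that only attribute names enter the score, not values, so two elements with the same keys but different values still score as a strong match and are then reported as attribute modifications.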

Reference Data

Diff Type	Symbol	Meaning	Example Scenario
Added	+	Node exists only in right (new) XML	New <dependency> block added to pom.xml
Removed	−	Node exists only in left (old) XML	Deprecated <filter> removed from web.xml
Modified (Text)	Δ	Same element path, different text content	Version number changed from 1.0 to 2.0
Modified (Attribute)	Δ attr	Same element, attribute value differs	timeout="30" changed to timeout="60"
Attribute Added	+ attr	Attribute exists only in right XML	New enabled="true" attribute on <feature>
Attribute Removed	− attr	Attribute exists only in left XML	Removed deprecated class="old" attribute
Type Mismatch	≠	Node types differ at same position	Element replaced with comment node
Tag Renamed	≠ tag	Different tag name at same tree position	<user> renamed to <account>
Child Count	±	Different number of child elements	List grew from 3 to 5 items
Order Changed	↔	Same children but in different sequence	<first> and <last> swapped positions
Namespace Diff	ns	Same local name, different namespace URI	Migrated from http://old.ns to http://new.ns
CDATA vs Text	Δ	CDATA section replaced with plain text	<![CDATA[...]]> unwrapped to text node
Comment Diff	Δ !	Comment content changed	TODO note updated in config file
Processing Instruction	Δ ?	PI target or data changed	<?xml-stylesheet?> href modified

Frequently Asked Questions

The tool compares nodes using their fully qualified name, including the namespace prefix. If two elements have the same local name but different prefixes bound to different namespace URIs, they are reported as different. Prefix-only changes (same URI, different alias) will appear as tag name differences. For namespace-agnostic comparison, strip namespace prefixes from both inputs before comparing.
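One way to prepare inputs for a namespace-agnostic comparison is sketched below in Python. It is a preprocessing idea, not part of the tool. Note one difference from the behaviour described above: xml.etree.ElementTree resolves prefixes to URIs and stores tags as {uri}local, so the hypothetical helper strip_namespaces removes that {uri} part rather than a literal prefix.

```python
import xml.etree.ElementTree as ET


def local_name(name):
    """Drop the '{uri}' part that ElementTree puts in front of namespaced names."""
    return name.rsplit("}", 1)[-1]


def strip_namespaces(root):
    """Rewrite every tag and attribute name in the tree to its local name."""
    for el in root.iter():
        el.tag = local_name(el.tag)
        renamed = {local_name(key): value for key, value in el.attrib.items()}
        el.attrib.clear()
        el.attrib.update(renamed)
    return root


namespaced = ET.fromstring(
    '<p:user xmlns:p="http://old.ns"><p:name>x</p:name></p:user>'
)
strip_namespaces(namespaced)
print([el.tag for el in namespaced.iter()])
```

Stripping namespaces can merge elements that were deliberately distinct, so this is only appropriate when the namespace itself is not the change you are looking for.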

Attribute order does not affect the result. Attributes within a single element are sorted alphabetically by name before comparison, so <tag a="1" b="2"> and <tag b="2" a="1"> are treated as identical. The XML specification does not define attribute order as significant, and this tool follows that convention.
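The effect of sorting attributes by name can be illustrated with a short Python sketch. The helper attr_diff and its labels are hypothetical; they mirror the attribute rows of the reference table rather than the tool's internal output format.

```python
import xml.etree.ElementTree as ET


def attr_diff(left, right):
    """Compare two elements' attributes by name, ignoring source order."""
    changes = []
    # Visiting names in sorted order makes the result independent of
    # the order in which attributes were written in either document.
    for name in sorted(set(left.attrib) | set(right.attrib)):
        if name not in right.attrib:
            changes.append(("attribute removed", name))
        elif name not in left.attrib:
            changes.append(("attribute added", name))
        elif left.attrib[name] != right.attrib[name]:
            changes.append(("attribute modified", name))
    return changes


same_a = ET.fromstring('<tag a="1" b="2"/>')
same_b = ET.fromstring('<tag b="2" a="1"/>')
print(attr_diff(same_a, same_b))  # [] because only the order differs

old = ET.fromstring('<feature timeout="30" class="old"/>')
new = ET.fromstring('<feature timeout="60" enabled="true"/>')
print(attr_diff(old, new))
```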

Mixed content is compared node by node in document order. Text nodes between elements are treated as distinct children. If one document has whitespace between two tags where the other has none, the whitespace text node will appear as an added or removed diff. Use the "Ignore Whitespace" option to suppress insignificant whitespace-only text node differences.
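What ignoring insignificant whitespace amounts to can be sketched in Python as follows. This is an assumption about the general technique, not the tool's "Ignore Whitespace" implementation. ElementTree stores inter-element text in the text and tail fields, and the hypothetical helper drop_insignificant_whitespace clears those fields when they hold only whitespace.

```python
import xml.etree.ElementTree as ET


def drop_insignificant_whitespace(root):
    """Remove text and tail values that consist only of whitespace."""
    for el in root.iter():
        if el.text is not None and not el.text.strip():
            el.text = None
        if el.tail is not None and not el.tail.strip():
            el.tail = None
    return root


pretty = ET.fromstring("<a>\n  <b>x</b>\n  <c/>\n</a>")
compact = ET.fromstring("<a><b>x</b><c/></a>")
drop_insignificant_whitespace(pretty)
print(ET.tostring(pretty) == ET.tostring(compact))  # True
```

Text that contains any non-whitespace character is left untouched, so mixed content such as a sentence with inline elements keeps its spacing.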

The tool processes XML entirely in-browser memory. Files up to approximately 5 MB (roughly 50,000-100,000 nodes) perform well in modern browsers. Beyond that, the DOMParser and recursive diff may cause noticeable delays or memory pressure. For very large files, consider splitting them into smaller logical sections before comparing. The tool shows a progress indicator during processing to prevent the appearance of freezing.

CDATA sections are compared as text nodes since most DOM parsers convert them to text automatically. XML comments (<!-- ... -->) are included in the comparison by default. Processing instructions (<?...?>) are also compared. If your diff contains noise from comment changes, remove comments from both inputs before comparing.
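The CDATA behaviour is easy to observe outside the browser. The Python sketch below uses xml.etree.ElementTree, whose default parser folds CDATA into ordinary text and skips comments. It illustrates the general parser behaviour and is not a statement about the tool's internals.

```python
import xml.etree.ElementTree as ET

# A CDATA section and the equivalent escaped text parse to the same string.
with_cdata = ET.fromstring("<msg><![CDATA[x < y]]></msg>")
with_text = ET.fromstring("<msg>x &lt; y</msg>")
print(with_cdata.text == with_text.text)  # True

# ElementTree's default parser does not keep comments, so re-serializing
# through it is one way to drop comments before comparing.
commented = ET.fromstring("<cfg><!-- TODO --><item/></cfg>")
print([child.tag for child in commented])  # ['item']
```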

Each diff entry includes a computed XPath that identifies the exact location in the XML tree. A positional predicate such as /root/item[3] means the third <item> child of <root>. This XPath can be used directly in tools like xmllint, Saxon, or browser DevTools to locate the node. Note that the index is 1-based, following XPath convention, not 0-based.
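A minimal Python sketch of how such a path can be computed is shown below. It assumes the /root/child[n] format given in the Formulas section and counts position among same-named siblings, as XPath does. The function name xpath_of is hypothetical, and the parent map is needed only because ElementTree elements carry no parent pointer.

```python
import xml.etree.ElementTree as ET


def xpath_of(root, target):
    """Build /root/child[n]/... for target, with 1-based per-tag positions."""
    # Map every element to its parent so we can walk upward from the target.
    parents = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while node in parents:
        parent = parents[node]
        same_tag = [c for c in parent if c.tag == node.tag]
        # XPath positions start at 1, hence the + 1.
        steps.append(f"{node.tag}[{same_tag.index(node) + 1}]")
        node = parent
    steps.append(root.tag)
    return "/" + "/".join(reversed(steps))


root = ET.fromstring("<root><item/><other/><item/><item><name/></item></root>")
third_item = root.findall("item")[2]
print(xpath_of(root, third_item))     # /root/item[3]
print(xpath_of(root, third_item[0]))  # /root/item[3]/name[1]
```

Because the predicate counts only siblings with the same tag, the <other> element between the items does not affect the index of the third <item>.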