BSON to XML Converter
Convert BSON binary data to well-formed XML online. Supports hex, base64, and file upload input with full BSON type support.
About
BSON (Binary JSON) is the binary serialization format used by MongoDB and other systems to store documents. Its wire format is compact but opaque, and extracting structured data from raw BSON without a proper parser risks silent data corruption. For example, type 0x01 is an IEEE 754 double and type 0x12 is a signed 64-bit integer stored little-endian; misreading byte order or length prefixes produces garbage. This tool parses every element type in the BSON specification and emits well-formed XML with proper character escaping. It handles nested documents, arrays, ObjectIds, UTC datetimes, regex patterns, and binary subtypes. Limitation: JavaScript numbers are IEEE 754 doubles and cannot represent every 64-bit integer exactly. Int64 values whose magnitude exceeds 2^53 are represented as strings to avoid rounding.
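The byte-order and precision points can be illustrated with a short Python sketch. It is not the tool's own code: the names `read_int64`, `int64_needs_string`, and `DOUBLE_EXACT_LIMIT` are hypothetical, and the 2^53 cutoff is applied here as a simple magnitude check.

```python
import struct

# Beyond this magnitude an IEEE 754 double can no longer hold every integer.
DOUBLE_EXACT_LIMIT = 2 ** 53


def read_int64(buf: bytes, offset: int = 0) -> int:
    """Decode a BSON Int64 (type 0x12): 8 bytes, signed, little-endian."""
    (value,) = struct.unpack_from("<q", buf, offset)
    return value


def int64_needs_string(value: int) -> bool:
    """True when a double-based number type could not hold the value exactly."""
    return abs(value) > DOUBLE_EXACT_LIMIT


raw = (2 ** 53 + 1).to_bytes(8, "little", signed=True)
value = read_int64(raw)                      # correct little-endian read
misread = struct.unpack(">q", raw)[0]        # same bytes read as big-endian
print(value, misread, int64_needs_string(value))
print(float(value) == float(2 ** 53))        # the double silently rounds
```

Reading the same eight bytes with the wrong byte order yields an unrelated number without raising any error, which is the silent corruption described above.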
Input is accepted as a hexadecimal string, a base64-encoded string, or a raw .bson file. The converter validates the BSON document length prefix against actual byte length before parsing. Malformed documents produce descriptive error messages referencing the byte offset of failure. Output XML uses a configurable root element name and indentation depth. Array elements are emitted as repeated <item> nodes with an index attribute. Special BSON types carry a bsonType attribute so no information is lost in the conversion.
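As a rough illustration of the input step, the following Python sketch turns a pasted hex or base64 string into raw bytes. The function name `decode_input` and the whitespace handling are assumptions made for this example, not a description of the tool's internals.

```python
import base64
import binascii


def decode_input(text: str, encoding: str) -> bytes:
    """Turn a pasted hex or base64 string into raw BSON bytes."""
    compact = "".join(text.split())  # drop spaces and line breaks
    if encoding == "hex":
        try:
            return bytes.fromhex(compact)
        except ValueError as exc:
            raise ValueError(f"invalid hex input: {exc}") from exc
    if encoding == "base64":
        try:
            # validate=True rejects characters outside the base64 alphabet
            return base64.b64decode(compact, validate=True)
        except binascii.Error as exc:
            raise ValueError(f"invalid base64 input: {exc}") from exc
    raise ValueError(f"unsupported encoding: {encoding}")


# The empty BSON document is 5 bytes: int32 size (5) plus the 0x00 terminator.
empty_doc = decode_input("05 00 00 00 00", "hex")
same_doc = decode_input("BQAAAAA=", "base64")
print(empty_doc == same_doc)
```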
Formulas
A BSON document begins with a 4-byte little-endian int32 declaring the total document size in bytes, followed by a sequence of typed elements (e_list), and terminates with a 0x00 byte. In the grammar notation of the BSON specification: document ::= int32 e_list 0x00.
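Each element in e_list is structured as: element ::= type e_name value. Here type is a single byte identifying the BSON type (see Reference Data) and value is the type-specific payload.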
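A minimal Python sketch of reading one element header (the type byte followed by the null-terminated name) might look like this. The helper name `read_element_header` is hypothetical, and the sample document is built by hand for illustration.

```python
import struct


def read_element_header(buf: bytes, offset: int):
    """Return (type_byte, name, value_offset) for the element at offset."""
    type_byte = buf[offset]
    # The name is a cstring: bytes up to, but not including, the next 0x00.
    name_end = buf.index(b"\x00", offset + 1)  # ValueError if unterminated
    name = buf[offset + 1:name_end].decode("utf-8")
    return type_byte, name, name_end + 1


# Hand-built document equivalent to {"a": 1} with an Int32 value:
# size (12) | type 0x10 | "a" 0x00 | int32 1 | terminator 0x00
doc = (
    struct.pack("<i", 12)
    + b"\x10"
    + b"a\x00"
    + struct.pack("<i", 1)
    + b"\x00"
)

type_byte, name, value_offset = read_element_header(doc, 4)
(number,) = struct.unpack_from("<i", doc, value_offset)
print(hex(type_byte), name, number)
```

The first element always starts at offset 4, immediately after the size prefix.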
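The total byte length is validated against the declared size: size = length(input), where size is the int32 read from bytes 0 through 3 and length(input) is the number of bytes supplied. Any mismatch is reported as an error before parsing begins.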
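The size check can be sketched in a few lines of Python. The function name `validate_document` and the exact error wording are assumptions for this example; only the comparison of the declared size with the actual byte length comes from the description above.

```python
import struct


def validate_document(buf: bytes) -> int:
    """Check the size prefix and terminator; return the declared size."""
    if len(buf) < 5:
        raise ValueError(
            f"offset 0: a document needs at least 5 bytes, got {len(buf)}"
        )
    (declared,) = struct.unpack_from("<i", buf, 0)  # little-endian int32
    if declared != len(buf):
        raise ValueError(
            f"offset 0: declared size {declared} does not match "
            f"actual length {len(buf)}"
        )
    if buf[-1] != 0x00:
        raise ValueError(f"offset {len(buf) - 1}: missing 0x00 terminator")
    return declared


print(validate_document(b"\x05\x00\x00\x00\x00"))
```

Five bytes is the smallest possible document: the 4-byte size prefix plus the terminator, with an empty element list.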
Strings are length-prefixed: a 4-byte int32 byte count (including the trailing 0x00), followed by UTF-8 encoded bytes. XML output escapes five reserved characters: & becomes &amp;, < becomes &lt;, > becomes &gt;, " becomes &quot;, and ' becomes &apos;.
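The following Python sketch reads a length-prefixed BSON string and escapes it for XML text or attribute content. The helper names `read_string` and `escape_xml` are hypothetical. The escaping relies on the standard library's `xml.sax.saxutils.escape`, which covers `&`, `<`, and `>` by default and accepts extra mappings for the two quote characters.

```python
import struct
from xml.sax.saxutils import escape


def read_string(buf: bytes, offset: int):
    """Read a BSON string at offset; return (text, next_offset)."""
    (count,) = struct.unpack_from("<i", buf, offset)  # includes trailing 0x00
    start = offset + 4
    end = start + count - 1                           # index of the 0x00
    if count < 1 or end >= len(buf) or buf[end] != 0x00:
        raise ValueError(f"offset {offset}: bad string length {count}")
    return buf[start:end].decode("utf-8"), end + 1


def escape_xml(text: str) -> str:
    """Escape the five predefined XML entities."""
    return escape(text, {'"': "&quot;", "'": "&apos;"})


payload = struct.pack("<i", 6) + b"a<b&c\x00"
text, next_offset = read_string(payload, 0)
print(text, next_offset, escape_xml(text))
```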
In each element, e_name is a cstring (a null-terminated sequence of non-zero bytes) and is used as the XML element tag. Characters that are not valid in an XML name are replaced with underscores.
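One way to map a BSON key to a safe XML tag is sketched below in Python. This is an illustration under stated assumptions, not the tool's actual rule set: it keeps only a conservative ASCII subset of the characters XML allows in names, and it prefixes an underscore when the first character cannot start a name. The function name `to_xml_name` is hypothetical.

```python
import re

# Conservative ASCII subset of valid XML name characters (an assumption;
# the XML specification also permits many non-ASCII letters).
_INVALID_NAME_CHAR = re.compile(r"[^A-Za-z0-9_.\-]")


def to_xml_name(e_name: str) -> str:
    """Replace invalid characters with underscores and fix the first character."""
    name = _INVALID_NAME_CHAR.sub("_", e_name)
    # An XML name must start with a letter or underscore in this subset.
    if not name or not (name[0].isalpha() or name[0] == "_"):
        name = "_" + name
    return name


for key in ("user name", "$set", "0", "a.b"):
    print(repr(key), "->", to_xml_name(key))
```

Keys such as `$set` or purely numeric array-style keys are common in MongoDB documents, so the leading-character rule matters in practice.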
Reference Data
| BSON Type | ID (Hex) | Description | XML Representation | Size (bytes) |
|---|---|---|---|---|
| Double | 0x01 | IEEE 754 floating point | Text content | 8 |
| String | 0x02 | UTF-8 string | Text content (escaped) | 4 + len + 1 |
| Document | 0x03 | Embedded BSON document | Nested child elements | Variable |
| Array | 0x04 | BSON array | Repeated <item> elements | Variable |
| Binary | 0x05 | Binary data with subtype | Base64 text, subtype attr | 5 + len |
| Undefined | 0x06 | Deprecated | Empty element, bsonType attr | 0 |
| ObjectId | 0x07 | 12-byte unique ID | Hex string text | 12 |
| Boolean | 0x08 | true / false | "true" or "false" text | 1 |
| UTC Datetime | 0x09 | Milliseconds since epoch | ISO 8601 string | 8 |
| Null | 0x0A | Null value | Empty element, bsonType="null" | 0 |
| Regex | 0x0B | Regular expression | pattern & options attrs | Variable |
| DBPointer | 0x0C | Deprecated DB reference | Namespace + ObjectId text | 4 + len + 13 |
| JavaScript | 0x0D | JS code string | CDATA text content | 4 + len + 1 |
| Symbol | 0x0E | Deprecated symbol | Text content | 4 + len + 1 |
| Code w/ Scope | 0x0F | JS code + scope doc | Nested code + scope elements | Variable |
| Int32 | 0x10 | 32-bit signed integer | Integer text | 4 |
| Timestamp | 0x11 | MongoDB internal timestamp | increment + timestamp attrs | 8 |
| Int64 | 0x12 | 64-bit signed integer | String text (precision safe) | 8 |
| Decimal128 | 0x13 | 128-bit decimal float | Hex string text | 16 |
| MinKey | 0xFF | Internal lowest value | Empty, bsonType="minKey" | 0 |
| MaxKey | 0x7F | Internal highest value | Empty, bsonType="maxKey" | 0 |
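The Size column can be turned into a small lookup that tells a parser how many bytes each value occupies. The Python sketch below is illustrative only, with hypothetical names `FIXED_SIZES` and `value_size`. Note that the int32 prefix of a string already counts the trailing 0x00, so the table's "4 + len + 1" equals 4 plus the prefix value.

```python
import struct

# Fixed-size values, taken from the Size column of the table above.
FIXED_SIZES = {
    0x01: 8, 0x06: 0, 0x07: 12, 0x08: 1, 0x09: 8, 0x0A: 0,
    0x10: 4, 0x11: 8, 0x12: 8, 0x13: 16, 0xFF: 0, 0x7F: 0,
}


def value_size(type_byte: int, buf: bytes, offset: int) -> int:
    """Return the number of bytes occupied by the value starting at offset."""
    if type_byte in FIXED_SIZES:
        return FIXED_SIZES[type_byte]
    if type_byte == 0x0B:  # regex: two consecutive cstrings
        pattern_end = buf.index(b"\x00", offset)
        options_end = buf.index(b"\x00", pattern_end + 1)
        return options_end + 1 - offset
    if type_byte not in (0x02, 0x03, 0x04, 0x05, 0x0C, 0x0D, 0x0E, 0x0F):
        raise ValueError(f"offset {offset}: unknown type 0x{type_byte:02X}")
    (n,) = struct.unpack_from("<i", buf, offset)
    if type_byte in (0x02, 0x0D, 0x0E):   # string-like: prefix counts the 0x00
        return 4 + n
    if type_byte in (0x03, 0x04, 0x0F):   # the size prefix includes itself
        return n
    if type_byte == 0x05:                 # int32 + subtype byte + payload
        return 5 + n
    return 4 + n + 12                     # 0x0C DBPointer: string + 12-byte id


string_value = struct.pack("<i", 6) + b"a<b&c\x00"
print(value_size(0x02, string_value, 0))
print(value_size(0x0B, b"ab\x00i\x00", 0))
```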