User Rating 0.0
Total Usage 0 times
Category JSON Tools
Drag & drop a .bson file here or click to browse
Is this tool helpful?

Your feedback helps us improve.

About

BSON (Binary JSON) is a binary-encoded serialization format used primarily by MongoDB to store documents and for remote procedure calls. Its binary nature makes it unreadable without proper decoding. A single misinterpreted byte offset corrupts the entire downstream parse, producing silent data errors that propagate through ETL pipelines. This tool implements the full BSON specification (v1.1) as defined at bsonspec.org, decoding all 19 element types including ObjectId, Int64 via BigInt, Decimal128, UTC datetime, regex, and nested subdocuments. It parses raw binary data byte-by-byte using little-endian reads with strict bounds checking.

Limitation: Decimal128 values are rendered as hex strings because IEEE 754 128-bit float precision cannot be fully represented in JavaScript's native 64-bit doubles. Documents exceeding 16 MB (MongoDB's default max) are accepted but may cause slower parsing in the browser. The parser assumes well-formed BSON. Malformed or truncated documents will produce a descriptive error at the exact byte offset of failure.

bson json converter binary mongodb bson parser data conversion

Formulas

A BSON document is a linear sequence of bytes. Parsing begins by reading the total document size as a 32-bit little-endian integer at offset 0. Each element follows the pattern:

element typebyte + namecstring + payloadtype-dependent

where type is a single byte identifying the data type, name is a null-terminated UTF-8 string (the field key), and payload varies per type. Strings use length-prefixed encoding:

string lenint32 + bytesUTF-8 + 0x00terminator

where len includes the null terminator. Int64 values are read as two 32-bit halves and combined:

int64 = lo + hi × 232

where lo is the unsigned lower 32 bits and hi is the signed upper 32 bits. This uses JavaScript BigInt for lossless representation. The document terminates with a 0x00 byte. The parser validates that the consumed byte count matches the declared document size.

Reference Data

Type IDBSON TypeJSON RepresentationSize (bytes)Notes
0x01DoubleNumber8IEEE 754 64-bit float
0x02StringString4 + n + 1UTF-8 encoded, length-prefixed
0x03DocumentObjectVariableEmbedded BSON document
0x04ArrayArrayVariableKeys are "0", "1", "2"…
0x05BinaryBase64 string5 + nSubtype byte included
0x06Undefinednull0Deprecated in BSON spec
0x07ObjectIdHex string (24 chars)12Timestamp + machine + counter
0x08Booleantrue / false10x00 = false, 0x01 = true
0x09UTC DatetimeISO 8601 string8Milliseconds since Unix epoch
0x0ANullnull0No payload
0x0BRegex{"pattern": ..., "flags": ...}VariableTwo cstrings: pattern + options
0x0DJavaScript CodeString4 + n + 1UTF-8 JS source code
0x0ESymbolString4 + n + 1Deprecated, treated as string
0x0FCode w/ Scope{"code": ..., "scope": ...}VariableDeprecated in MongoDB 4.4+
0x10Int32Number4Signed 32-bit integer, little-endian
0x11Timestamp{"t": ..., "i": ...}8Internal MongoDB replication timestamp
0x12Int64String (BigInt)8Signed 64-bit, exceeds Number.MAX_SAFE_INTEGER
0x13Decimal128Hex string16IEEE 754 128-bit decimal
0xFFMin Key{"$minKey": 1}0Internal type, compares less than all
0x7FMax Key{"$maxKey": 1}0Internal type, compares greater than all

Frequently Asked Questions

BSON Int64 (type 0x12) stores signed 64-bit integers. JavaScript's Number type uses IEEE 754 doubles which safely represent integers only up to 2531 (9,007,199,254,740,991). This parser uses native BigInt to read both 32-bit halves and combine them losslessly. In the JSON output, Int64 values are serialized as strings with a "$numberLong" wrapper to preserve precision, following MongoDB Extended JSON v2 conventions.
The parser performs bounds checking before every read operation. If a read would exceed the buffer length, it throws a descriptive error including the byte offset and expected vs. available bytes. Unknown type IDs (not in the 0x01 - 0x13 range plus 0xFF and 0x7F) cause an immediate parse error with the hex value of the unrecognized type. The error toast will display the exact offset for debugging with a hex editor.
Yes. MongoDB dump files and wire protocol streams often contain sequential BSON documents without delimiters. The parser reads the first 4-byte size header, parses that document, then checks if additional bytes remain. If so, it continues parsing subsequent documents and outputs them as a JSON array. Each document's declared size is validated against actual consumed bytes to prevent misalignment.
Binary fields contain a subtype byte (0x00 generic, 0x04 UUID, 0x05 MD5, etc.) followed by the raw bytes. The parser outputs them as an object with "$binary" containing a base64-encoded string and "$type" containing the subtype as a two-character hex string. This follows the MongoDB Extended JSON v2 canonical format and preserves all binary data without loss.
Decimal128 (type 0x13) uses the IEEE 754 128-bit decimal floating-point format with 34 significant digits. JavaScript has no native 128-bit float type, and BigInt only handles integers. Full Decimal128 decoding requires parsing the combination field, exponent, and coefficient according to the IEEE 754-2008 specification. This parser outputs the raw 16 bytes as a hex string in a "$numberDecimal" wrapper. For production use requiring exact decimal values, decode this hex on the server side with a library like libbson.
The hex input accepts any string of hexadecimal characters (0 - 9, a - f, A - F). Whitespace, newlines, commas, and "0x" prefixes are automatically stripped before parsing. The total character count (after stripping) must be even, as each byte requires exactly two hex digits. Common sources include MongoDB shell's BinData output, Wireshark packet captures, or hex editor exports.