Supports raw Bencode text or .torrent file contents

About

Bencode (pronounced "bee-encode") is the encoding format BitTorrent uses to store and transmit loosely structured data. It supports four data types: byte strings (length-prefixed), integers (signed, base 10), lists (ordered sequences), and dictionaries (key-value maps whose keys are byte strings in sorted order). Unlike JSON or XML, Bencode has no native floating-point numbers, null values, or booleans. This simplicity keeps parsing unambiguous but complicates conversion to richer formats. The converter implements a full recursive descent parser following the BitTorrent Enhancement Proposal (BEP) specifications and handles edge cases such as structures nested more than 100 levels deep and binary data (piece hashes, peer IDs), which must be Base64-encoded to produce valid XML.

Conversion errors typically stem from malformed length prefixes (a string declares 500 bytes but contains only 50) or unclosed containers (a missing e terminator). The parser also validates dictionary key ordering strictly: dictionaries with out-of-order keys are flagged as non-canonical Bencode. Binary strings containing byte sequences that are not valid UTF-8 are automatically Base64-encoded and marked with an encoding="base64" attribute to preserve round-trip fidelity.
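The key-ordering check described above can be sketched in a few lines of Python. This is an illustration only, not the converter's actual code: the function name first_out_of_order is hypothetical, and treating duplicate keys as a violation is an assumption.

```python
from typing import List, Optional


def first_out_of_order(keys: List[bytes]) -> Optional[int]:
    """Return the index of the first key that breaks canonical order, or None.

    Canonical Bencode sorts dictionary keys by raw byte value. Python compares
    bytes objects lexicographically by byte value, so "<" is the right test.
    Assumption: a duplicate key also counts as a violation (strict ordering).
    """
    for index in range(1, len(keys)):
        if not keys[index - 1] < keys[index]:
            return index
    return None
```

A caller could use the returned index to attach a warning to the offending key while still converting the dictionary.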


Formulas

Bencode parsing follows a deterministic grammar in which the leading byte identifies each type. The parser uses a recursive descent strategy with one byte of lookahead.

$$\text{Bencode} \rightarrow \text{Integer} \mid \text{String} \mid \text{List} \mid \text{Dictionary}$$

Type detection occurs via the first byte:

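$$\text{type} = \begin{cases} \text{Integer} & \text{if byte} = \texttt{"i"} \\ \text{List} & \text{if byte} = \texttt{"l"} \\ \text{Dictionary} & \text{if byte} = \texttt{"d"} \\ \text{String} & \text{if byte} \in \{\texttt{0}, \dots, \texttt{9}\} \end{cases}$$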

String length prefix parsing extracts n bytes after the colon delimiter:

$$\text{String} \rightarrow \text{digits} \;\texttt{":"}\; \text{readBytes}(n), \qquad n = \text{parseInt}(\text{digits})$$
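The leading-byte dispatch and the length-prefix rule can be combined into a small recursive descent decoder. What follows is a minimal sketch under stated assumptions, not the converter's implementation: the names BencodeError, decode, and MAX_DEPTH are hypothetical, the depth guard value is arbitrary, and decoded strings are kept as raw bytes.

```python
class BencodeError(ValueError):
    """Raised when the input is not well-formed Bencode."""


MAX_DEPTH = 200  # illustrative guard only, not the converter's actual limit


def decode(data: bytes):
    """Decode exactly one Bencode value that spans all of data."""
    value, pos = _parse(data, 0, 0)
    if pos != len(data):
        raise BencodeError(f"unexpected trailing data at offset {pos}")
    return value


def _parse_string(data: bytes, pos: int):
    """Read <len>:<bytes> at pos and return (bytes, next_pos)."""
    colon = data.find(b":", pos)
    if colon == -1 or not data[pos:colon].isdigit():
        raise BencodeError(f"invalid string length prefix at offset {pos}")
    n = int(data[pos:colon])
    start, end = colon + 1, colon + 1 + n
    if end > len(data):
        raise BencodeError(
            f"string declares {n} bytes but only {len(data) - start} remain"
        )
    return data[start:end], end


def _parse_integer(data: bytes, pos: int):
    """Read i<number>e at pos and return (int, next_pos)."""
    end = data.find(b"e", pos)
    if end == -1:
        raise BencodeError("unterminated integer")
    body = data[pos + 1:end]
    digits = body[1:] if body.startswith(b"-") else body
    has_leading_zero = len(digits) > 1 and digits.startswith(b"0")
    if not digits.isdigit() or body == b"-0" or has_leading_zero:
        raise BencodeError(f"invalid integer at offset {pos}")
    return int(body), end + 1


def _parse(data: bytes, pos: int, depth: int):
    """Dispatch on the leading byte and return (value, next_pos)."""
    if depth > MAX_DEPTH:
        raise BencodeError("nesting too deep")
    lead = data[pos:pos + 1]  # one byte of lookahead; empty at end of input
    if lead == b"i":
        return _parse_integer(data, pos)
    if lead.isdigit():
        return _parse_string(data, pos)
    if lead not in (b"l", b"d"):
        raise BencodeError(f"unexpected byte or end of input at offset {pos}")
    container = [] if lead == b"l" else {}
    pos += 1
    while data[pos:pos + 1] != b"e":
        if pos >= len(data):
            raise BencodeError("unclosed container (missing e terminator)")
        if lead == b"l":
            item, pos = _parse(data, pos, depth + 1)
            container.append(item)
        else:
            if not data[pos:pos + 1].isdigit():
                raise BencodeError("dictionary key must be a byte string")
            key, pos = _parse_string(data, pos)
            container[key], pos = _parse(data, pos, depth + 1)
    return container, pos + 1  # skip the closing "e"
```

Python dictionaries keep insertion order, so the decoded result retains the key order found in the input. Key-order validation is deliberately left out of this sketch.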

XML escaping applies five mandatory character substitutions to ensure well-formed output:

< → &lt;, > → &gt;, & → &amp;, " → &quot;, ' → &apos;
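As a rough illustration, the five XML entity substitutions can be applied in a single pass over the text. The function name escape_xml is hypothetical. Working one character at a time avoids re-escaping the ampersands introduced by earlier replacements.

```python
# The five predefined XML entities.
_XML_ESCAPES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&apos;",
}


def escape_xml(text: str) -> str:
    """Replace each XML special character with its predefined entity."""
    return "".join(_XML_ESCAPES.get(ch, ch) for ch in text)
```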

Binary detection uses UTF-8 validation. If a byte sequence fails UTF-8 decoding, the string is Base64-encoded:

$$\text{output} = \begin{cases} \text{escape}(\text{str}) & \text{if valid UTF-8} \\ \text{base64}(\text{bytes}) & \text{otherwise} \end{cases}$$
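One way to express this rule in Python is to attempt a strict UTF-8 decode and fall back to Base64 when it fails. The function name encode_string_value and its (text, is_base64) return shape are assumptions made for illustration. Escaping is delegated here to the standard library's xml.sax.saxutils.escape, with the two quote characters added as extra entities.

```python
import base64
from typing import Tuple
from xml.sax.saxutils import escape

# escape() handles &, < and > by default; quotes are passed as extra entities.
_QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}


def encode_string_value(raw: bytes) -> Tuple[str, bool]:
    """Return (xml_text, is_base64) for one Bencode byte string."""
    try:
        text = raw.decode("utf-8")  # strict decode doubles as validation
    except UnicodeDecodeError:
        return base64.b64encode(raw).decode("ascii"), True
    return escape(text, _QUOTE_ENTITIES), False
```

The boolean tells the caller whether to add the encoding="base64" attribute to the string element.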

Reference Data

| Bencode Type | Syntax Pattern | XML Mapping | Example Bencode | Example XML Output |
| --- | --- | --- | --- | --- |
| Integer | i<number>e | <integer> | i42e | <integer>42</integer> |
| Negative Integer | i-<number>e | <integer> | i-17e | <integer>-17</integer> |
| Zero | i0e | <integer> | i0e | <integer>0</integer> |
| String (ASCII) | <len>:<data> | <string> | 4:spam | <string>spam</string> |
| String (UTF-8) | <len>:<data> | <string> | 9:日本語 | <string>日本語</string> |
| String (Binary) | <len>:<bytes> | <string encoding="base64"> | 20:<binary SHA1> | <string encoding="base64">...</string> |
| Empty String | 0: | <string> | 0: | <string></string> |
| List (Empty) | le | <list> | le | <list></list> |
| List (Items) | l<items>e | <list><item>...</item></list> | li1ei2ee | <list><item><integer>1</integer></item><item><integer>2</integer></item></list> |
| Dictionary (Empty) | de | <dict> | de | <dict></dict> |
| Dictionary (Entries) | d<key><value>e | <dict><key name="...">...</key></dict> | d3:foo3:bare | <dict><key name="foo"><string>bar</string></key></dict> |
| Nested Structure | Any combination | Recursive mapping | d4:infod4:name4:testee | <dict><key name="info"><dict><key name="name"><string>test</string></key></dict></key></dict> |
| Torrent Announce | URL string | <string> | d8:announce40:http://tracker.example.com:6969/announcee | <dict><key name="announce"><string>http://...</string></key></dict> |
| Piece Length | Power of 2 integer | <integer> | d12:piece lengthi262144ee | <key name="piece length"><integer>262144</integer></key> |
| Info Hash Data | 20-byte binary | Base64 encoded | 6:pieces20:<SHA1 bytes> | <key name="pieces"><string encoding="base64">...</string></key> |
| Creation Date | Unix timestamp | <integer> | d13:creation datei1234567890ee | <key name="creation date"><integer>1234567890</integer></key> |
| File List | List of dicts | Nested structure | d5:filesld6:lengthi1024e4:pathl8:file.txteeee | <key name="files"><list><item><dict>...</dict></item></list></key> |
| Private Flag | Integer 0 or 1 | <integer> | d7:privatei1ee | <key name="private"><integer>1</integer></key> |
| Comment | UTF-8 string | <string> | d7:comment12:Hello World!e | <key name="comment"><string>Hello World!</string></key> |
| URL List | String or List | Depends on type | d8:url-listl19:http://mirror1.com/19:http://mirror2.com/ee | <key name="url-list"><list>...</list></key> |
| DHT Nodes | List of lists | Nested lists | d5:nodesll9:127.0.0.1i6881eeee | <key name="nodes"><list><item><list>...</list></item></list></key> |
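The mapping in the table can be sketched as a recursive function over already-decoded values (int, bytes, list, dict with bytes keys). This is a hedged illustration rather than the converter's code. The name to_xml is hypothetical. Escaping is left to the standard library's xml.etree.ElementTree, which does not escape quotes inside element text. Decoding keys with errors="replace" is a simplification chosen for this example.

```python
import base64
import xml.etree.ElementTree as ET


def _to_element(value) -> ET.Element:
    """Map one decoded Bencode value to an element, following the table."""
    if isinstance(value, int):
        elem = ET.Element("integer")
        elem.text = str(value)
    elif isinstance(value, bytes):
        elem = ET.Element("string")
        try:
            elem.text = value.decode("utf-8")
        except UnicodeDecodeError:
            # Not valid UTF-8: keep the bytes recoverable via Base64.
            elem.set("encoding", "base64")
            elem.text = base64.b64encode(value).decode("ascii")
    elif isinstance(value, list):
        elem = ET.Element("list")
        for item in value:
            ET.SubElement(elem, "item").append(_to_element(item))
    elif isinstance(value, dict):
        elem = ET.Element("dict")
        for key, child in value.items():  # input order is preserved
            # Simplification: undecodable key bytes become U+FFFD.
            name = key.decode("utf-8", errors="replace")
            ET.SubElement(elem, "key", name=name).append(_to_element(child))
    else:
        raise TypeError(f"not a Bencode value: {type(value).__name__}")
    return elem


def to_xml(value) -> str:
    """Serialize a decoded Bencode value to an XML string."""
    # short_empty_elements=False yields <list></list> rather than <list />.
    return ET.tostring(
        _to_element(value), encoding="unicode", short_empty_elements=False
    )
```

For example, to_xml({b"foo": b"bar"}) produces the same markup as the Dictionary (Entries) row. Valid UTF-8 strings that contain control characters not allowed in XML are not handled by this sketch.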

Frequently Asked Questions

Binary data that fails UTF-8 validation is automatically Base64-encoded, and the output XML element carries an encoding="base64" attribute to indicate the transformation. A standard torrent's pieces field contains concatenated 20-byte SHA-1 hashes, which in practice always trigger Base64 encoding because raw hash bytes almost never form valid UTF-8 sequences.
The BitTorrent specification requires dictionary keys to be sorted lexicographically by raw byte value. This converter flags non-canonical ordering with a warning but still processes the data. The output XML preserves the original key order while noting the violation, since some legacy implementations produce unsorted dictionaries that remain functionally valid.
Deeply nested structures and multi-file torrents are supported. The parser handles nesting up to 1000 levels deep, which exceeds any practical torrent structure. Multi-file torrents store paths as lists of path components (e.g., ["folder", "subfolder", "file.txt"]), and each component becomes a separate item element within the path list in the XML output.
The Bencode format itself places no size limit on integers, but this converter treats them as signed 64-bit values. A file size above 2^63-1 bytes (approximately 9.2 exabytes) falls outside that range. If you see unexpected negative values, the source data is likely corrupt or was generated by a non-compliant encoder.
Empty strings (Bencode: 0:) convert to <string></string> with no content between the tags. Empty lists (le) become <list></list> and empty dictionaries (de) become <dict></dict>. These preserve the type information while indicating zero elements.
This converter performs syntactic transformation only and does not validate torrent semantics. It checks Bencode structure (proper length prefixes, balanced delimiters, sorted keys) but does not verify that announce URLs are reachable, piece lengths are powers of two, or file paths are valid. Semantic validation requires domain-specific logic beyond format conversion.