
About

Software that only handles perfect input is fragile software. The primary failure mode of JSON parsers occurs when they encounter malformed data: missing delimiters, truncated payloads, type mismatches, or illegal characters embedded within otherwise valid structures. This tool applies controlled corruption to valid JSON, producing damaged output that mimics real-world failure scenarios such as network truncation, encoding errors, or upstream API regressions. Corruption intensity is governed by a probability parameter p ∈ [0.1, 1.0], where higher values increase the density of injected faults per token.

Note: this tool operates on the string representation of JSON. It does not guarantee that every damage mode produces parseable output. That is the point. If your parser survives the output of this tool at intensity 8+, your error handling is likely production-grade. Limitation: extremely large inputs (>1 MB) may slow the browser since all mutations run on the main thread.


Formulas

Each damage operation is applied independently to tokens in the JSON string. The probability that any single token is affected is controlled by the intensity parameter:

P_damage = intensity / 10

where intensity ∈ [1, 10]. At intensity 1, roughly 10% of eligible tokens are corrupted; at intensity 10, every eligible token is damaged. For each token t_i, a uniform random value r ∈ [0, 1] is generated:

corrupt(t_i) = apply_mutation(t_i)   if r < P_damage
               t_i                   otherwise

When multiple damage types are enabled simultaneously, they are applied in sequence (pipeline): structural mutations first, then key mutations, then value mutations, then character-level mutations, and finally truncation (applied last since it discards everything after the cut point).
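The per-token decision and the pipeline order above can be sketched as follows. The names (`damageProbability`, `damageToken`, `PIPELINE`) and the injectable `rng` parameter are illustrative assumptions, not the tool's actual internals:

```javascript
// P_damage = intensity / 10, with intensity clamped to [1, 10].
function damageProbability(intensity) {
  return Math.min(Math.max(intensity, 1), 10) / 10;
}

// A token is mutated when a uniform draw r falls below P_damage.
// rng is injectable here only to make the sketch testable.
function damageToken(token, intensity, mutate, rng = Math.random) {
  return rng() < damageProbability(intensity) ? mutate(token) : token;
}

// Enabled damage types run in a fixed order; truncation comes last
// because it discards everything after the cut point.
const PIPELINE = ["structural", "key", "value", "character", "truncation"];
```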

Reference Data

| Damage Type | Description | Real-World Cause | Typical Parser Error |
| --- | --- | --- | --- |
| Remove Brackets | Deletes random { } [ ] | Network packet loss | Unexpected end of input |
| Remove Commas | Strips comma separators between elements | Manual editing error | Expected comma or closing bracket |
| Add Trailing Commas | Inserts commas before closing } or ] | Code generation bugs | Unexpected token |
| Unquote Keys | Removes quotes from object keys | JS object literal confusion | Expected property name |
| Duplicate Keys | Copies existing keys with different values | Merge conflicts | Ambiguous value (last wins) |
| Type Coercion | Changes value types: string → number, bool → string | Schema migration errors | Type mismatch / validation failure |
| Inject Null | Replaces random values with null | Database NULL propagation | NullPointerException |
| Inject NaN/Undefined | Inserts NaN or undefined literals | JavaScript serialization bugs | Invalid JSON value |
| Unicode Injection | Inserts zero-width spaces, BOM, RTL marks | Copy-paste from web / encoding bugs | Invisible parse failure |
| Swap Delimiters | Exchanges : with , and vice versa | Regex-based generation | Expected colon after property name |
| Truncation | Cuts JSON string at a random position | Timeout / stream interruption | Unexpected end of JSON input |
| String Escape Breakage | Removes backslashes from escape sequences | Double-encoding / decoding errors | Bad control character in string |
| Control Characters | Inserts ASCII 0x00-0x1F in strings | Binary data contamination | Invalid character |
| Number Corruption | Adds extra dots, leading zeros, or letters to numbers | Locale-specific formatting (comma vs dot) | Invalid number |
| Key Truncation | Shortens object keys to 1-2 characters | Minification bugs | Missing expected field |
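Two of the table's entries can be sketched in a few lines. `removeBrackets` and `addTrailingCommas` are hypothetical names, not the tool's actual internals, and the real tool would apply these per-token subject to the intensity probability:

```javascript
// "Remove Brackets": delete each { } [ ] with probability p.
// rng is injectable only so the sketch is testable.
function removeBrackets(json, p, rng = Math.random) {
  return json.replace(/[{}\[\]]/g, ch => (rng() < p ? "" : ch));
}

// "Add Trailing Commas": insert a comma before every closing } or ].
function addTrailingCommas(json) {
  return json.replace(/([}\]])/g, ",$1");
}
```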

Frequently Asked Questions

Does the tool check whether my input is valid JSON first?

Yes. The tool first attempts to parse the input with JSON.parse(). If the input is already invalid JSON, it will notify you and offer to damage the raw string anyway. Valid JSON allows the tool to apply semantically meaningful damage (e.g., targeting keys vs. values), while raw string mode only applies character-level corruption.
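The validate-then-fall-back flow described above can be sketched as follows (the `classifyInput` helper is an assumed name, not the tool's API):

```javascript
// Decide whether structured (token-aware) damage is possible, or only
// raw character-level corruption.
function classifyInput(text) {
  try {
    JSON.parse(text);
    return "structured"; // valid JSON: keys vs. values can be targeted
  } catch {
    return "raw"; // invalid JSON: character-level corruption only
  }
}
```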
Is the damage deterministic?

Not exactly. Each damage run uses Math.random() for token selection, so results vary between runs even with identical settings. For reproducible fuzzing, copy the damaged output immediately. A future enhancement could add a seed-based PRNG, but the browser's Math.random() is not seedable without a custom implementation.
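A small seedable PRNG such as mulberry32 would make runs reproducible. This is a sketch of the possible enhancement mentioned above, not a current feature of the tool:

```javascript
// mulberry32: a compact seedable PRNG returning floats in [0, 1).
// Substituting it for Math.random would make damage runs reproducible.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}
```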
What intensity should I start with?

Start at intensity 3-4 for realistic corruption that mimics real-world failures (partial packet loss, encoding hiccups). Intensity 7-8 produces heavily damaged output suitable for testing catastrophic failure paths. Intensity 10 corrupts virtually every eligible token and is useful for verifying that your parser fails gracefully rather than crashing or hanging.
How does truncation interact with the other damage types?

Truncation is always applied last in the pipeline. First, all other enabled mutations (structural, key, value, character) are applied to the full JSON string. Then, if truncation is enabled, the already-damaged string is cut at a random position between 20% and 90% of its length. This means truncation compounds the damage from other operations.
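The final truncation step can be sketched as follows, with the cut point drawn uniformly from 20-90% of the already-damaged string's length; the injectable `rng` parameter is an assumption added for testability:

```javascript
// Cut the damaged string at a random position in [20%, 90%) of its length.
function truncate(damaged, rng = Math.random) {
  const fraction = 0.2 + rng() * 0.7; // maps [0, 1) onto [0.2, 0.9)
  return damaged.slice(0, Math.floor(damaged.length * fraction));
}
```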
Why inject NaN and undefined if they are not valid JSON?

Neither NaN nor undefined is a valid JSON value per RFC 8259. However, JavaScript's JSON.stringify() can silently drop undefined values or produce unexpected results with NaN. Injecting these literals as raw text tests whether your parser or deserializer correctly rejects them rather than coercing them into null or 0, which is a common source of data integrity bugs.
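The behaviors described above are easy to verify directly; this snippet only demonstrates standard JSON.parse / JSON.stringify semantics (the `parsesAsJson` helper is illustrative):

```javascript
// A strict JSON parser must reject bare NaN and undefined literals.
function parsesAsJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

// JSON.stringify silently drops undefined properties and turns NaN
// into null, which is where data-integrity bugs creep in.
const dropped = JSON.stringify({ a: undefined }); // "{}"
const coerced = JSON.stringify({ a: NaN });       // '{"a":null}'
```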
Can invisible Unicode characters really break parsers?

Yes. Zero-width spaces (U+200B), byte order marks (U+FEFF), and RTL override characters (U+202E) are multi-byte in UTF-8 encoding. A string that appears identical in a text editor may have a different byte length, which can break Content-Length headers, fixed-width parsers, or checksum validations. This is one of the most insidious real-world corruption modes.
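The byte-length mismatch can be demonstrated with the standard TextEncoder API:

```javascript
// UTF-8 byte length vs. JavaScript string length (UTF-16 code units).
function utf8ByteLength(s) {
  return new TextEncoder().encode(s).length;
}

const clean = "abc";
const poisoned = "a\u200Bbc"; // zero-width space after "a": looks identical
```

Here `clean` and `poisoned` render the same, but the zero-width space occupies three bytes in UTF-8, so any byte-based length check diverges.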