Add Fuzziness to Text
Add fuzziness to text with homoglyphs, typos, Zalgo, or leetspeak. Adjustable intensity slider for realistic text distortion and obfuscation.
About
Text fuzziness is the deliberate introduction of controlled noise into a string. Applications range from adversarial testing of NLP pipelines and analysis of spam-filter evasion to data augmentation for training robust OCR and text-classification models. A misapplied fuzziness level can render test data useless or produce artifacts that don't reflect real-world corruption patterns. This tool applies four distinct distortion algorithms: homoglyph substitution using Unicode confusables from the Latin, Cyrillic, and Greek blocks; stochastic typo injection based on a QWERTY adjacency matrix; Zalgo diacritical stacking via combining characters in the range U+0300 - U+036F; and weighted leetspeak mapping. Each is governed by a probability parameter p ∈ [0, 1]. Note: homoglyph output may appear identical to the original in certain fonts but differs at the code point level. Results depend on the rendering engine and typeface.
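The code point difference mentioned in the note can be checked directly. The sketch below is illustrative only: the five-entry `HOMOGLYPHS` table and the helper names `substitute_all` and `describe` are assumptions made for this example, not the tool's actual mapping or API, and the substitution is applied to every mapped letter rather than probabilistically.

```python
import unicodedata

# Illustrative subset of confusable pairs (Latin -> lookalike); the tool's
# real table is not shown in this page, so this mapping is an assumption.
HOMOGLYPHS = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u03bf",  # GREEK SMALL LETTER OMICRON
    "p": "\u0440",  # CYRILLIC SMALL LETTER ER
    "c": "\u0441",  # CYRILLIC SMALL LETTER ES
}


def substitute_all(text):
    """Replace every mapped Latin letter with its lookalike (no randomness)."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)


def describe(text):
    """Return one 'U+XXXX NAME' string per character in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}" for ch in text]


original = "peace"
fuzzed = substitute_all(original)

# The two strings may render the same, yet they compare unequal.
print(original == fuzzed)
for before, after in zip(describe(original), describe(fuzzed)):
    print(f"{before:35} -> {after}")
```

Comparing code points or Unicode names, as `describe` does, is a more reliable way to spot a substitution than visual inspection.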
Formulas
Each character in the input string is independently subjected to a distortion decision. The probability of any character being altered is governed by the fuzziness parameter:

$$p = f \cdot w_{\text{mode}}$$
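where $f \in [0, 1]$ is the fuzziness intensity from the slider, and $w_{\text{mode}}$ is the mode-specific weight (homoglyph: 0.9, typo: 0.4, Zalgo: 0.7, leet: 0.8). For each character $c_i$, a uniform random value $r \in [0, 1)$ is drawn, and the character is distorted only when $r$ falls below the alteration probability $p$:

$$c_i' = \begin{cases} \text{distort}(c_i) & \text{if } r < p \\ c_i & \text{otherwise} \end{cases}$$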
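A minimal sketch of this per-character decision follows. It assumes the alteration probability is the product of the slider value and the mode weight, which is the reading the definitions above suggest. The function names `alteration_probability` and `fuzz`, and the `distort` callback, are hypothetical and exist only for illustration.

```python
import random

# Mode weights as listed in the text.
MODE_WEIGHTS = {"homoglyph": 0.9, "typo": 0.4, "zalgo": 0.7, "leet": 0.8}


def alteration_probability(f, mode):
    """Assumed form: slider intensity f in [0, 1] times the mode weight."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must lie in [0, 1]")
    return f * MODE_WEIGHTS[mode]


def fuzz(text, f, mode, distort, rng=None):
    """Apply distort(ch) to each character independently with probability p."""
    if rng is None:
        rng = random.Random()
    p = alteration_probability(f, mode)
    out = []
    for ch in text:
        r = rng.random()  # uniform draw in [0, 1)
        out.append(distort(ch) if r < p else ch)
    return "".join(out)


# Usage: str.upper stands in for a real distortion function.
rng = random.Random(42)  # seeded only to make the run repeatable
print(fuzz("hello world", 0.5, "leet", str.upper, rng))
```

Because `r` is drawn from [0, 1), a slider value of 0 never alters a character, while a value of 1 alters each character with probability equal to the mode weight rather than with certainty.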
For Zalgo mode, the number of combining marks $n$ stacked on each affected character scales linearly with intensity:

$$n = \lceil 16 \cdot f \rceil$$
At maximum intensity (f = 1), this yields up to 16 combining characters per base glyph. For typo mode, the error type is selected uniformly from the set {swap, duplicate, omit, neighbor}. Neighbor-key substitution uses a precomputed QWERTY adjacency lookup covering the 26 letter keys, with about 4.5 neighbors per key on average.
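The typo mode described above can be sketched as follows. The adjacency map here is derived from a simplified staggered three-row layout and is an assumption, so its neighbor counts need not match the tool's precomputed lookup. The names `build_adjacency` and `inject_typos` are hypothetical, and `p` is the per-character alteration probability.

```python
import random

QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
ERROR_TYPES = ("swap", "duplicate", "omit", "neighbor")


def build_adjacency(rows):
    """Approximate neighbors on a staggered grid (letters only)."""
    adjacency = {}
    for r, row in enumerate(rows):
        for c, key in enumerate(row):
            candidates = [(r, c - 1), (r, c + 1),      # same row
                          (r - 1, c), (r - 1, c + 1),  # row above
                          (r + 1, c - 1), (r + 1, c)]  # row below
            adjacency[key] = sorted(
                rows[rr][cc] for rr, cc in candidates
                if 0 <= rr < len(rows) and 0 <= cc < len(rows[rr])
            )
    return adjacency


ADJACENCY = build_adjacency(QWERTY_ROWS)


def inject_typos(text, p, rng):
    """Alter each character with probability p using a uniformly chosen error."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if rng.random() >= p:
            out.append(ch)
            i += 1
            continue
        op = rng.choice(ERROR_TYPES)
        if op == "swap" and i + 1 < len(text):
            out.extend([text[i + 1], ch])  # transpose with the next character
            i += 2
            continue
        if op == "duplicate":
            out.extend([ch, ch])
        elif op == "omit":
            pass  # drop the character
        elif op == "neighbor" and ch.lower() in ADJACENCY:
            near = rng.choice(ADJACENCY[ch.lower()])
            out.append(near.upper() if ch.isupper() else near)
        else:
            out.append(ch)  # chosen error not applicable here; keep as-is
        i += 1
    return "".join(out)


print(ADJACENCY["g"])
print(inject_typos("The quick brown fox", 0.3, random.Random(7)))
```

In this sketch a swap consumes two input characters, so the second character of a swapped pair does not get its own distortion decision. That is one possible design choice, not necessarily the tool's.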
Reference Data
| Mode | Technique | Unicode Range / Mechanism | Detectability | Use Case |
|---|---|---|---|---|
| Homoglyph | Visual lookalike substitution | Cyrillic (U+0400 - U+04FF), Greek (U+0370 - U+03FF) | Low (visually identical) | Phishing research, filter bypass testing |
| Typo | Adjacent-key swap, duplication, omission | QWERTY keyboard adjacency map | Medium (human-readable errors) | NLP robustness testing, data augmentation |
| Zalgo | Stacking of combining diacritical marks | U+0300 - U+036F (Combining Diacritical Marks) | High (visually chaotic) | Artistic text, stress testing renderers |
| Leetspeak | Alpha → numeric/symbol replacement | ASCII substitution dictionary | Medium (recognizable pattern) | Gaming culture, basic obfuscation |
| Mixed | Weighted blend of all four modes | All of the above | Variable | Comprehensive fuzzing |

Common Homoglyph Pairs (Latin → Cyrillic/Greek)

| Latin | Lookalike | Code Point | Similarity | Technique |
|---|---|---|---|---|
| a | а (Cyrillic Small A) | U+0430 | Visually identical | Confusable substitution |
| e | е (Cyrillic Small Ie) | U+0435 | Visually identical | Confusable substitution |
| o | ο (Greek Small Omicron) | U+03BF | Visually identical | Confusable substitution |
| p | р (Cyrillic Small Er) | U+0440 | Visually identical | Confusable substitution |
| c | с (Cyrillic Small Es) | U+0441 | Visually identical | Confusable substitution |
| x | х (Cyrillic Small Kha) | U+0445 | Visually identical | Confusable substitution |
| y | у (Cyrillic Small U) | U+0443 | Visually identical | Confusable substitution |
| s | ѕ (Cyrillic Small Dze) | U+0455 | Visually identical | Confusable substitution |
| i | і (Cyrillic Small Byelorussian-Ukrainian I) | U+0456 | Visually identical | Confusable substitution |
| H | Н (Cyrillic Capital En) | U+041D | Visually identical | Confusable substitution |
| T | Т (Cyrillic Capital Te) | U+0422 | Visually identical | Confusable substitution |
| B | В (Cyrillic Capital Ve) | U+0412 | Visually identical | Confusable substitution |

Combining Diacritical Marks (Zalgo)

| Position | Range | Example Marks | Count | Effect |
|---|---|---|---|---|
| Above | U+0300 - U+0315 | Grave, Acute, Circumflex, Tilde, etc. | 22 marks | Stack above glyphs |
| Below | U+0316 - U+0333 | Grave below, Cedilla, etc. | 30 marks | Stack below glyphs |
| Overlay | U+0334 - U+0338 | Tilde overlay, Stroke, etc. | 5 marks | Strike-through effect |
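
The three mark ranges in the table are enough to sketch Zalgo stacking. This example makes several assumptions: the mark count is rounded up from 16 × f, marks are drawn uniformly from all three ranges, and whitespace is left untouched. The names `marks_for_intensity`, `zalgo`, and `strip_marks` are hypothetical.

```python
import math
import random

# Ranges taken from the table; Python's range() excludes the upper bound.
ABOVE = [chr(cp) for cp in range(0x0300, 0x0316)]    # 22 marks
BELOW = [chr(cp) for cp in range(0x0316, 0x0334)]    # 30 marks
OVERLAY = [chr(cp) for cp in range(0x0334, 0x0339)]  # 5 marks
ALL_MARKS = ABOVE + BELOW + OVERLAY
MARK_SET = set(ALL_MARKS)
MAX_MARKS = 16  # per-glyph ceiling stated in the Formulas section


def marks_for_intensity(f):
    """Assumed rounding: marks per affected glyph grow linearly with f."""
    return math.ceil(MAX_MARKS * f)


def zalgo(text, f, p, rng):
    """Append n combining marks to each non-space character with probability p."""
    n = marks_for_intensity(f)
    out = []
    for ch in text:
        out.append(ch)
        if not ch.isspace() and rng.random() < p:
            out.extend(rng.choice(ALL_MARKS) for _ in range(n))
    return "".join(out)


def strip_marks(text):
    """Undo zalgo() by dropping every mark from the three ranges above."""
    return "".join(ch for ch in text if ch not in MARK_SET)


noisy = zalgo("fuzzy text", 0.5, 0.7, random.Random(0))
print(noisy)
print(len("fuzzy text"), len(noisy))
print(strip_marks(noisy))
```

Because the marks are appended after the base character, the original string can be recovered by filtering them out, which is useful when the distorted text is only needed for renderer or pipeline stress tests.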