About

Zero-width characters are Unicode code points that occupy no visible space in rendered text. They exist in the Unicode standard for legitimate typographic purposes - U+200B (Zero-Width Space) marks line-break opportunities, U+200C (Zero-Width Non-Joiner) prevents ligature formation in scripts like Arabic and Devanagari, and U+200D (Zero-Width Joiner) forces ligatures or combines emoji sequences. This tool exploits these characters for text steganography: encoding arbitrary plaintext into a sequence of invisible code points that can be embedded within normal visible text. The encoding writes each character's binary representation using two zero-width characters (U+200D for 1, U+200C for 0) and separates consecutive characters with U+200B delimiters. The result is a string that appears completely empty to the human eye but carries a full payload when decoded.

A failure to detect zero-width characters in user-submitted content creates real security risks. Invisible strings can bypass naive input validation, smuggle data through chat filters, fingerprint leaked documents by embedding unique invisible watermarks, or break string comparison logic in code. Copy-pasting text from untrusted sources without stripping zero-width characters has caused bugs in production systems. This tool provides three operations: encoding visible text into invisible characters, decoding invisible sequences back to readable text, and detecting the presence and quantity of hidden zero-width characters in any pasted content. The encoding is lossless for all BMP characters (code points ≤ 65535). Note: some platforms strip zero-width characters on paste - test your target platform before relying on this for message delivery.

Formulas

The encoding algorithm converts each character to its binary representation, then maps each bit to a zero-width character:

encode(c) = map(bin(charCodeAt(c)), bit → (bit = 1 ? U+200D : U+200C))

For a full message of n characters, the output is the concatenation of each encoded character separated by the delimiter U+200B:

output = encode(c1) + U+200B + encode(c2) + U+200B + ... + encode(cn)
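The two formulas above can be sketched in JavaScript as follows. This is a minimal illustration rather than the tool's actual source: the names `encodeInvisible` and `decodeInvisible` are hypothetical, and the decoder assumes that the surrounding cover text contains no U+200B, U+200C, or U+200D of its own.

```javascript
const ZWSP = "\u200B"; // delimiter between encoded characters
const ZWNJ = "\u200C"; // binary 0
const ZWJ = "\u200D";  // binary 1

// Hypothetical helper: encode each UTF-16 code unit as its unpadded
// binary form, one zero-width character per bit.
function encodeInvisible(text) {
  const groups = [];
  for (let i = 0; i < text.length; i++) {
    const bits = text.charCodeAt(i).toString(2); // "H" -> "1001000"
    groups.push([...bits].map((bit) => (bit === "1" ? ZWJ : ZWNJ)).join(""));
  }
  return groups.join(ZWSP);
}

// Hypothetical helper: reverse the mapping. Anything that is not one of
// the three scheme characters is treated as cover text and dropped first.
function decodeInvisible(text) {
  return text
    .replace(/[^\u200B\u200C\u200D]/g, "")
    .split(ZWSP)
    .filter((group) => group.length > 0)
    .map((group) => {
      const bits = [...group].map((ch) => (ch === ZWJ ? "1" : "0")).join("");
      return String.fromCharCode(parseInt(bits, 2));
    })
    .join("");
}

const hidden = encodeInvisible("Hi");
console.log(hidden.length);                                   // 15
console.log(decodeInvisible("Nothing" + hidden + " to see")); // Hi
```

Because the bit groups are unpadded, the U+200B delimiter is what lets the decoder tell where one character ends and the next begins.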

The output length in code points for a single character with code point value v is:

L(v) = floor(log2(v)) + 1

Total invisible string length for n characters:

T = Σ_{i=1}^{n} L(v_i) + (n − 1)

Where the + (n − 1) term accounts for the delimiter characters between encoded character groups.

c = input character, v = Unicode code point value, n = number of characters, L = bit-length of a code point, T = total invisible code points generated.
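As a quick check of the length formulas, the sketch below computes L(v) and T for a string. The function names `bitLength` and `totalInvisibleLength` are made up for illustration. It uses `Math.clz32` rather than a floating-point logarithm so the result stays an exact integer; for v ≥ 1 this equals floor(log2(v)) + 1, and v = 0 is treated as one bit because its binary form is "0".

```javascript
// L(v): bit-length of code point v, computed with integer operations.
// 32 - clz32(v) equals floor(log2(v)) + 1 for every v >= 1.
function bitLength(v) {
  return Math.max(1, 32 - Math.clz32(v));
}

// T: sum of L(v_i) over all characters, plus (n - 1) delimiters.
function totalInvisibleLength(text) {
  const n = text.length;
  if (n === 0) return 0;
  let sum = 0;
  for (let i = 0; i < n; i++) {
    sum += bitLength(text.charCodeAt(i));
  }
  return sum + (n - 1);
}

console.log(bitLength(72));               // 7  ("H" is 1001000 in binary)
console.log(totalInvisibleLength("Hi"));  // 15 (7 + 7 bits, 1 delimiter)
console.log(totalInvisibleLength("A 1")); // 21 (7 + 6 + 6 bits, 2 delimiters)
```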

Reference Data

| Unicode | Name | Width | Purpose | Used in this tool |
| --- | --- | --- | --- | --- |
| U+200B | Zero-Width Space | 0px | Line-break opportunity | Character delimiter |
| U+200C | Zero-Width Non-Joiner | 0px | Prevent ligatures | Binary 0 |
| U+200D | Zero-Width Joiner | 0px | Force ligatures / emoji glue | Binary 1 |
| U+2060 | Word Joiner | 0px | Prevent line-break | Detection only |
| U+200E | Left-to-Right Mark | 0px | BiDi control | Detection only |
| U+200F | Right-to-Left Mark | 0px | BiDi control | Detection only |
| U+202A | LR Embedding | 0px | BiDi embedding | Detection only |
| U+202B | RL Embedding | 0px | BiDi embedding | Detection only |
| U+202C | Pop Directional Formatting | 0px | BiDi terminator | Detection only |
| U+FEFF | BOM / Zero-Width No-Break Space | 0px | Byte order mark | Detection only |
| U+2063 | Invisible Separator | 0px | Math separator | Detection only |
| U+2062 | Invisible Times | 0px | Implied multiplication | Detection only |
| U+2061 | Function Application | 0px | Math notation | Detection only |
| U+180E | Mongolian Vowel Separator | 0px | Mongolian script | Detection only |
| U+3164 | Hangul Filler | Variable | Korean filler | Detection only |
| U+115F | Hangul Choseong Filler | Variable | Korean initial filler | Detection only |

Frequently Asked Questions

Most platforms preserve zero-width characters: Gmail, Outlook, Slack, Discord, WhatsApp, Telegram, and standard text editors. Twitter/X strips some zero-width characters from tweets. Facebook may strip them in certain contexts. Google Docs preserves them. Always test by encoding a short message, pasting it into your target platform, copying it back out, and running it through the Decode tab to verify integrity.
The tool encodes characters in the Basic Multilingual Plane (code points U+0000 to U+FFFF), which covers virtually all modern languages, punctuation, and symbols. Under the unpadded encoding given in the Formulas section, each character produces up to 16 invisible code points depending on its value: 7 for ASCII letters, 6 for digits, spaces, and most common punctuation, and 15-16 for CJK ideographs, plus one delimiter between characters. A 1,000-character message therefore produces roughly 8,000 (plain English text) to 17,000 (CJK text) zero-width characters. The tool limits input to 10,000 characters to prevent browser performance issues, yielding up to ~170,000 invisible code points. Most text fields have no issue storing this volume.
Zero-width encoding is detectable. Any system performing Unicode category analysis can detect zero-width characters. The regex pattern [\u200B-\u200F\u2028-\u202F\u2060\uFEFF] catches the most common invisible characters. Enterprise DLP (Data Loss Prevention) tools increasingly flag zero-width character sequences as potential steganographic payloads. This tool's Detect mode performs exactly this analysis. Do not rely on zero-width encoding for security-critical secrecy - it is obfuscation, not encryption.
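A detection pass built on that regex pattern might look like the sketch below. The function name `detectInvisible` and the shape of its result are assumptions for illustration, not the tool's Detect mode. Note that the pattern as written does not match several entries in the Reference Data table (U+2061 to U+2063, U+180E, U+3164, U+115F), so a fuller detector would need a wider character class.

```javascript
// The pattern quoted above, with the global flag so every match is found.
const INVISIBLE = /[\u200B-\u200F\u2028-\u202F\u2060\uFEFF]/g;

// Hypothetical helper: count each matched invisible character,
// keyed by its "U+XXXX" label.
function detectInvisible(text) {
  const counts = new Map();
  for (const match of text.matchAll(INVISIBLE)) {
    const hex = match[0].charCodeAt(0).toString(16).toUpperCase();
    const label = "U+" + hex.padStart(4, "0");
    counts.set(label, (counts.get(label) || 0) + 1);
  }
  return counts;
}

const sample = "h\u200Bel\u200Dlo\u200B";
console.log([...detectInvisible(sample)]); // [ [ 'U+200B', 2 ], [ 'U+200D', 1 ] ]
```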
Base64 converts binary data to visible ASCII characters (A-Z, a-z, 0-9, +, /). The output is clearly visible and recognizable as encoded text. Zero-width character encoding converts data to Unicode characters that render with zero pixel width - the output is literally invisible when embedded in normal text. The trade-off is efficiency: Base64 expands data by ~33%, while zero-width encoding expands each character to as many as 16 invisible code points plus a delimiter. Zero-width encoding is for steganography (hiding the existence of a message), not efficient data transport.
If a message fails to decode after being copied and pasted, there are three common causes: (1) The target application strips or normalizes zero-width characters on paste. (2) The operating system clipboard performs Unicode normalization (NFC/NFD), which can affect surrounding text but typically preserves zero-width characters. (3) Rich text editors may interpret U+200D as an emoji joiner and combine adjacent characters unexpectedly. For reliable transfer, use plain-text paste (Ctrl+Shift+V) and avoid rich text editors. The Detect tab will show you exactly which zero-width characters survived the transfer.
Zero-width characters can break code. Two strings that appear visually identical can fail strict equality checks if one contains zero-width characters. The strings "hello" and "h\u200Bello" render identically, but "hello" === "h\u200Bello" returns false. This has caused production bugs in authentication systems, database lookups, and URL routing. Always sanitize user input by stripping zero-width characters with: str.replace(/[\u200B-\u200F\u2028-\u202F\u2060\uFEFF]/g, '') before comparison or storage.
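The comparison failure and the sanitizing step can be seen in a few lines. The helper name `stripZeroWidth` is hypothetical; the pattern is the one quoted above. One caveat worth keeping in mind: because the pattern also removes U+200C and U+200D, it will alter text that uses them legitimately (joiner-dependent scripts and emoji sequences, as noted in the About section), so this sketch suits identifiers and lookup keys better than free-form prose.

```javascript
// Hypothetical helper wrapping the replace() call quoted above.
function stripZeroWidth(str) {
  return str.replace(/[\u200B-\u200F\u2028-\u202F\u2060\uFEFF]/g, "");
}

const typed = "hello";
const pasted = "h\u200Bello"; // looks identical when rendered

console.log(typed === pasted);                 // false
console.log(pasted.length);                    // 6
console.log(typed === stripZeroWidth(pasted)); // true
```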