Characters to Hex HTML Entities Converter
Convert any text characters to hexadecimal HTML entities (&#x...;) and decode hex entities back to readable text. Supports full Unicode.
About
Incorrect character encoding in HTML documents causes rendering failures across browsers and locales. A single unescaped character outside the ASCII range can break XML parsers, corrupt RSS feeds, or display as ๏ฟฝ (the replacement character) on systems with mismatched encodings. This tool converts every character in your input to its hexadecimal HTML entity form HHHH;, using the Unicode code point value. It handles the full Unicode range from U+0000 to U+10FFFF, including astral plane characters (emoji, CJK extensions, mathematical symbols) that require surrogate pairs in UTF-16 but map to single code points.
The reverse decoder parses &#xHHHH; patterns using strict regex validation and reconstructs characters via String.fromCodePoint. Note: this tool does not handle named entities like &amp;. It operates exclusively on hexadecimal numeric references. For documents served as UTF-8, hex entities are technically redundant for most characters. They remain useful when embedding content in attribute values, working within ASCII-only transport layers (email headers, legacy databases), or reducing XSS risk by encoding user-generated content before insertion into HTML.
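As an illustration of the ASCII-only and user-content cases above, here is a minimal sketch that encodes only the five markup-significant characters (& < > " ') and anything outside ASCII, leaving ordinary text readable. The function name escapeForAsciiHtml is hypothetical and not part of this tool, and the choice of which characters to encode is an assumption. This kind of encoding is one layer of output escaping, not a complete XSS defense.

```javascript
// Hypothetical helper (not part of the tool): encode only what must be encoded.
// Assumption: the "must-escape" set is & < > " ' plus every non-ASCII code point.
function escapeForAsciiHtml(text) {
  const mustEscape = "&<>\"'";
  let out = "";
  // for...of walks the string by code point, so surrogate pairs stay together.
  for (const ch of text) {
    const codePoint = ch.codePointAt(0);
    if (mustEscape.includes(ch) || codePoint > 0x7f) {
      out += "&#x" + codePoint.toString(16).toUpperCase() + ";";
    } else {
      out += ch; // plain ASCII passes through unchanged
    }
  }
  return out;
}

// "\u00E9" is the letter e with an acute accent (U+00E9).
console.log(escapeForAsciiHtml('<a title="caf\u00E9">'));
// &#x3C;a title=&#x22;caf&#xE9;&#x22;&#x3E;
```

The output is pure ASCII, so it should survive transports and storage layers that cannot carry UTF-8.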
Formulas
Each character in the input string is converted to its Unicode code point, then expressed as a hexadecimal HTML numeric character reference.
codePoint = codePointAt(c)
hexValue = toString(codePoint, 16)
entity = "&#x" + toUpperCase(hexValue) + ";"
Where codePoint is the Unicode scalar value ranging from 0 to 10FFFF₁₆ (0 to 1,114,111₁₀). Characters in the Basic Multilingual Plane (BMP) have code points ≤ FFFF₁₆ and produce entities with 1 to 4 hex digits. Supplementary plane characters (emoji, rare CJK) have code points > FFFF₁₆ and produce entities with 5 or 6 hex digits.
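The three encoding steps can be sketched in JavaScript as follows. This is a minimal illustration under the assumption that every character is encoded and that hex digits are not zero-padded. The function name toHexEntities is hypothetical and may not match the tool's actual source.

```javascript
// Hypothetical helper mirroring the formula: codePoint -> hexValue -> entity.
function toHexEntities(text) {
  let out = "";
  // for...of iterates by code point, so a UTF-16 surrogate pair is visited once.
  for (const ch of text) {
    const codePoint = ch.codePointAt(0);
    const hexValue = codePoint.toString(16).toUpperCase();
    out += "&#x" + hexValue + ";";
  }
  return out;
}

// "A", the euro sign (U+20AC), and the grinning face emoji (U+1F600).
console.log(toHexEntities("A\u20AC\u{1F600}"));
// &#x41;&#x20AC;&#x1F600;
```

Iterating with for...of rather than by index matters here: an index-based loop would see the emoji as two separate surrogate code units and emit two entities instead of one.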
match(/&#x([0-9a-fA-F]+);/g)
character = String.fromCodePoint(parseInt(hexDigits, 16))
Where hexDigits is the captured group from the regex. The parseInt function converts the hex string to an integer, and String.fromCodePoint reconstructs the original character. Unlike String.fromCharCode, which accepts only 16-bit code units, String.fromCodePoint correctly handles supplementary plane code points above U+FFFF.
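The decoding steps can be sketched in the same way. The function name fromHexEntities is hypothetical. Leaving out-of-range references untouched is an assumption made for this sketch, since String.fromCodePoint throws a RangeError for values above U+10FFFF and the source does not say how the tool handles them.

```javascript
// Hypothetical helper: replace each &#xHHHH; reference with its character.
function fromHexEntities(text) {
  return text.replace(/&#x([0-9a-fA-F]+);/g, (match, hexDigits) => {
    const codePoint = parseInt(hexDigits, 16);
    // Assumption: references beyond U+10FFFF are returned unchanged
    // instead of letting String.fromCodePoint throw a RangeError.
    if (codePoint > 0x10ffff) {
      return match;
    }
    return String.fromCodePoint(codePoint);
  });
}

console.log(fromHexEntities("&#x48;&#x69;")); // Hi
console.log(fromHexEntities("&amp; stays as-is")); // &amp; stays as-is
```

Named entities such as &amp; and decimal references such as &#38; do not match the pattern, so they pass through unchanged.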
Reference Data
| Character | Name | Code Point | Hex Entity | Category |
|---|---|---|---|---|
| & | Ampersand | U+0026 | &#x26; | Must-Escape in HTML |
| < | Less-Than Sign | U+003C | &#x3C; | Must-Escape in HTML |
| > | Greater-Than Sign | U+003E | &#x3E; | Must-Escape in HTML |
| " | Quotation Mark | U+0022 | &#x22; | Must-Escape in Attributes |
| ' | Apostrophe | U+0027 | &#x27; | Must-Escape in Attributes |
| (invisible) | Non-Breaking Space | U+00A0 | &#xA0; | Whitespace |
| © | Copyright Sign | U+00A9 | &#xA9; | Special Symbol |
| ® | Registered Sign | U+00AE | &#xAE; | Special Symbol |
| ™ | Trade Mark Sign | U+2122 | &#x2122; | Special Symbol |
| € | Euro Sign | U+20AC | &#x20AC; | Currency |
| £ | Pound Sign | U+00A3 | &#xA3; | Currency |
| ¥ | Yen Sign | U+00A5 | &#xA5; | Currency |
| ¢ | Cent Sign | U+00A2 | &#xA2; | Currency |
| — | Em Dash | U+2014 | &#x2014; | Punctuation |
| – | En Dash | U+2013 | &#x2013; | Punctuation |
| … | Horizontal Ellipsis | U+2026 | &#x2026; | Punctuation |
| • | Bullet | U+2022 | &#x2022; | Punctuation |
| ° | Degree Sign | U+00B0 | &#xB0; | Math/Science |
| ± | Plus-Minus Sign | U+00B1 | &#xB1; | Math/Science |
| × | Multiplication Sign | U+00D7 | &#xD7; | Math/Science |
| ÷ | Division Sign | U+00F7 | &#xF7; | Math/Science |
| ∞ | Infinity | U+221E | &#x221E; | Math/Science |
| π | Greek Small Pi | U+03C0 | &#x3C0; | Greek Letter |
| Ω | Greek Capital Omega | U+03A9 | &#x3A9; | Greek Letter |
| α | Greek Small Alpha | U+03B1 | &#x3B1; | Greek Letter |
| ← | Leftwards Arrow | U+2190 | &#x2190; | Arrow |
| → | Rightwards Arrow | U+2192 | &#x2192; | Arrow |
| ↑ | Upwards Arrow | U+2191 | &#x2191; | Arrow |
| ↓ | Downwards Arrow | U+2193 | &#x2193; | Arrow |
| ♠ | Black Spade Suit | U+2660 | &#x2660; | Miscellaneous |