Fullwidth to Halfwidth Character Converter
Convert fullwidth (zenkaku) characters to halfwidth (hankaku) and vice versa. Supports ASCII, numbers, katakana, and symbols via Unicode mapping.
About
Fullwidth characters (zenkaku, 全角) occupy a double-width cell in monospaced grids. They span Unicode codepoints U+FF01 through U+FF5E for ASCII equivalents, plus U+3000 for the ideographic space. Mixing fullwidth and halfwidth (hankaku, 半角) characters in form submissions, CSV exports, or database records causes silent validation failures, broken regex matches, and inflated string lengths. This tool converts between the two representations using codepoint offset arithmetic, $c_{\text{half}} = c_{\text{full}} - \mathtt{0xFEE0}$ for printable ASCII, with a dedicated lookup dictionary that maps halfwidth Katakana (U+FF65 - U+FF9F) to fullwidth Katakana (U+30A1 - U+30F6), including the dakuten and handakuten marks. For input that uses a single width consistently, the conversion is lossless and reversible for all supported character classes.
Limitation: CJK Unified Ideographs (kanji/hanzi) do not have a halfwidth variant in Unicode and pass through unchanged. Halfwidth Katakana combining sequences (e.g., halfwidth ka + dakuten → ga) are normalized into single fullwidth codepoints during half-to-full conversion, and the reverse decomposition is handled during full-to-half conversion. Pro tip: always normalize your data to one width before running string comparisons or calculating column widths in terminal output.
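To illustrate the pro tip, here is a minimal Python sketch of why width matters for comparisons and column widths. It is not this tool's own algorithm: it uses the standard library's `unicodedata` module, and the helper names `display_width` and `fold_width` are made up for this example. NFKC normalization is used as a stand-in because it happens to fold fullwidth ASCII to halfwidth; note that it also rewrites other compatibility characters, so it is broader than a pure width conversion.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate terminal cells: East Asian Fullwidth/Wide characters take two.

    Simplification: zero-width combining marks are counted as one cell.
    """
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("F", "W") else 1
        for ch in text
    )


def fold_width(text: str) -> str:
    """Fold compatibility forms with NFKC (an assumption, not the tool's method)."""
    return unicodedata.normalize("NFKC", text)


full = "\uFF21\uFF22\uFF23"  # ＡＢＣ (fullwidth)
half = "ABC"                 # halfwidth

print(full == half)                # the raw strings differ
print(fold_width(full) == half)    # they match once folded to one width
print(display_width(full), display_width(half))
```

The two strings have the same number of codepoints, yet they compare unequal and occupy different numbers of terminal cells until they are normalized to one width.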
Formulas
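The core conversion for ASCII-range fullwidth characters uses a constant offset in the Unicode codepoint space:
$$c_{\text{half}} = c_{\text{full}} - \mathtt{0xFEE0}$$
where $c_{\text{full}}$ ∈ [U+FF01, U+FF5E] and the result $c_{\text{half}}$ ∈ [U+0021, U+007E]. The reverse operation adds the same offset:
$$c_{\text{full}} = c_{\text{half}} + \mathtt{0xFEE0}$$
where $c_{\text{half}}$ ∈ [U+0021, U+007E].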
The ideographic space is a special case: U+3000 → U+0020.
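The offset rule and the space special case can be sketched in a few lines of Python. This is an illustrative sketch rather than the tool's actual source; the function names `to_halfwidth_ascii` and `to_fullwidth_ascii` are made up for this example, and mapping U+0020 back to U+3000 in the reverse direction is an assumption based on the conversion being described as reversible.

```python
OFFSET = 0xFEE0  # 65248, distance between the fullwidth and halfwidth ASCII blocks


def to_halfwidth_ascii(text: str) -> str:
    """Map U+FF01..U+FF5E down by OFFSET and U+3000 to a plain space."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0xFF01 <= cp <= 0xFF5E:
            out.append(chr(cp - OFFSET))
        elif cp == 0x3000:          # ideographic space: special case
            out.append(" ")
        else:                       # kanji, kana, etc. pass through unchanged
            out.append(ch)
    return "".join(out)


def to_fullwidth_ascii(text: str) -> str:
    """Map U+0021..U+007E up by OFFSET and U+0020 to U+3000 (assumed)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0x21 <= cp <= 0x7E:
            out.append(chr(cp + OFFSET))
        elif cp == 0x20:
            out.append("\u3000")
        else:
            out.append(ch)
    return "".join(out)


print(to_halfwidth_ascii("\uFF21\uFF22\uFF23\u3000\uFF11\uFF12\uFF13"))  # ＡＢＣ　１２３
```

Characters outside the handled ranges are copied as-is, which matches the pass-through behavior described for ideographs.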
For Katakana, no linear offset exists. A lookup dictionary maps each halfwidth Katakana codepoint (U+FF65 - U+FF9F) to its fullwidth equivalent (U+30A1 - U+30F6 range). The voiced sound mark (dakuten, U+FF9E) and the semi-voiced sound mark (handakuten, U+FF9F) combine with the preceding base character to form a single fullwidth codepoint. The algorithm scans left-to-right: if a base halfwidth katakana is followed by U+FF9E or U+FF9F, the pair is consumed and mapped to one fullwidth character. In the reverse direction (full → half), voiced fullwidth katakana decompose into base + combining mark.
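The left-to-right scan with pair consumption might look like the following Python sketch. The tables here are deliberately partial (a handful of entries for illustration, not the full U+FF65 - U+FF9F mapping), and the names `HALF_TO_FULL`, `COMPOSED`, `half_to_full_katakana`, and `full_to_half_katakana` are hypothetical, not taken from the tool.

```python
# Partial, illustrative lookup: halfwidth katakana -> fullwidth katakana.
HALF_TO_FULL = {
    "\uFF71": "\u30A2",  # ｱ -> ア
    "\uFF76": "\u30AB",  # ｶ -> カ
    "\uFF8A": "\u30CF",  # ﾊ -> ハ
    "\uFF65": "\u30FB",  # ･ -> ・
    "\uFF70": "\u30FC",  # ｰ -> ー
}

# (base, mark) pairs that collapse into one fullwidth codepoint.
COMPOSED = {
    ("\uFF76", "\uFF9E"): "\u30AC",  # ｶ + ﾞ -> ガ
    ("\uFF8A", "\uFF9E"): "\u30D0",  # ﾊ + ﾞ -> バ
    ("\uFF8A", "\uFF9F"): "\u30D1",  # ﾊ + ﾟ -> パ
}

# Reverse table: plain characters map 1:1, voiced ones decompose to base + mark.
FULL_TO_HALF = {full: half for half, full in HALF_TO_FULL.items()}
FULL_TO_HALF.update({full: base + mark for (base, mark), full in COMPOSED.items()})


def half_to_full_katakana(text: str) -> str:
    """Scan left to right, consuming base + (han)dakuten pairs when present."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if (ch, nxt) in COMPOSED:
            out.append(COMPOSED[(ch, nxt)])
            i += 2  # both the base and the mark are consumed
        else:
            out.append(HALF_TO_FULL.get(ch, ch))  # unknown characters pass through
            i += 1
    return "".join(out)


def full_to_half_katakana(text: str) -> str:
    """Map each fullwidth character; voiced ones expand to two codepoints."""
    return "".join(FULL_TO_HALF.get(ch, ch) for ch in text)
```

Checking the pair table before the single-character table is what keeps a base followed by a mark from being emitted as two separate fullwidth characters.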
Variable legend: $c_{\text{full}}$ = fullwidth Unicode codepoint. $c_{\text{half}}$ = halfwidth Unicode codepoint. 0xFEE0 = 65248 in decimal, the constant offset between the fullwidth and halfwidth ASCII blocks in Unicode.
Reference Data
| Character Class | Fullwidth Range | Halfwidth Range | Offset / Method | Example Full → Half |
|---|---|---|---|---|
| Digits 0-9 | U+FF10 - U+FF19 | U+0030 - U+0039 | −0xFEE0 | ３ → 3 |
| Uppercase A - Z | U+FF21 - U+FF3A | U+0041 - U+005A | −0xFEE0 | Ａ → A |
| Lowercase a - z | U+FF41 - U+FF5A | U+0061 - U+007A | −0xFEE0 | ａ → a |
| Symbols ! - ~ | U+FF01 - U+FF5E | U+0021 - U+007E | −0xFEE0 | ！ → ! |
| Ideographic Space | U+3000 | U+0020 | Direct map | 　 → (space) |
| Katakana ア (A) | U+30A2 | U+FF71 | Lookup table | ア → ｱ |
| Katakana カ (Ka) | U+30AB | U+FF76 | Lookup table | カ → ｶ |
| Katakana ガ (Ga) | U+30AC | U+FF76 U+FF9E | Decompose + dakuten | ガ → ｶﾞ |
| Katakana パ (Pa) | U+30D1 | U+FF8A U+FF9F | Decompose + handakuten | パ → ﾊﾟ |
| HW Katakana Middle Dot | U+30FB | U+FF65 | Lookup table | ・ → ･ |
| HW Katakana Prolonged Sound | U+30FC | U+FF70 | Lookup table | ー → ｰ |
| HW Corner Bracket 「 | U+300C | U+FF62 | Lookup table | 「 → ｢ |
| HW Corner Bracket 」 | U+300D | U+FF63 | Lookup table | 」 → ｣ |
| Fullwidth Won Sign | U+FFE6 | U+20A9 | Direct map | ￦ → ₩ |
| Fullwidth Yen Sign | U+FFE5 | U+00A5 | Direct map | ￥ → ¥ |
| CJK Ideographs (Kanji) | U+4E00 - U+9FFF | N/A | No halfwidth variant | 字 → 字 (unchanged) |
| Hiragana | U+3040 - U+309F | N/A | No halfwidth variant | あ → あ (unchanged) |