
About

Fullwidth characters (zenkaku, 全角) occupy a double-width cell in monospaced grids. The fullwidth equivalents of printable ASCII span Unicode codepoints U+FF01 through U+FF5E, and the ideographic space is U+3000. Mixing fullwidth and halfwidth (hankaku, 半角) characters in form submissions, CSV exports, or database records can cause silent validation failures, broken regex matches, and inflated string lengths. This tool converts between the two representations using codepoint offset arithmetic, $c_{\mathrm{half}} = c_{\mathrm{full}} - \mathrm{0xFEE0}$ for printable ASCII, and a dedicated lookup dictionary that maps halfwidth Katakana (U+FF65 - U+FF9F) to fullwidth Katakana (U+30A1 - U+30F6), including the dakuten and handakuten marks. The conversion is reversible for all supported character classes, provided the input is uniformly one width to begin with.

Limitation: CJK Unified Ideographs (kanji/hanzi) do not have a halfwidth variant in Unicode and pass through unchanged. Halfwidth Katakana sequences (e.g., halfwidth ka + dakuten → ga) are merged into single fullwidth codepoints during half-to-full conversion, and the reverse decomposition is handled during full-to-half conversion. Pro tip: always normalize your data to one width before running string comparisons or calculating column widths in terminal output.
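To illustrate why width matters for column calculations, here is a minimal sketch that estimates terminal display width. The function name `display_width` is hypothetical, and the rule used (2 columns for East Asian Width classes `F` and `W`, 1 column for everything else) is a simplifying assumption that ignores zero-width and ambiguous-width characters.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate terminal columns: 2 for Fullwidth/Wide characters, 1 otherwise.

    Assumption: combining and ambiguous-width characters are counted as 1.
    """
    width = 0
    for ch in text:
        if unicodedata.east_asian_width(ch) in ("F", "W"):
            width += 2
        else:
            width += 1
    return width


print(display_width("ABC"))                 # halfwidth Latin: 3 columns
print(display_width("\uff21\uff22\uff23"))  # fullwidth Latin: 6 columns
print(display_width("\uff76\uff9e"))        # halfwidth ka + dakuten: 2 columns
print(display_width("\u30ac"))              # fullwidth ga: 2 columns
```

The same three letters take 3 or 6 columns depending on width, which is why normalizing first keeps padding and alignment predictable.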


Formulas

The core conversion for ASCII-range fullwidth characters uses a constant offset in the Unicode codepoint space:

$$c_{\mathrm{half}} = c_{\mathrm{full}} - \mathrm{0xFEE0}$$

where $c_{\mathrm{full}} \in [\text{U+FF01}, \text{U+FF5E}]$ and the result $c_{\mathrm{half}} \in [\text{U+0021}, \text{U+007E}]$. The reverse operation adds the same offset:

$$c_{\mathrm{full}} = c_{\mathrm{half}} + \mathrm{0xFEE0}$$

where $c_{\mathrm{half}} \in [\text{U+0021}, \text{U+007E}]$.

The ideographic space is a special case: U+3000 ↔ U+0020.
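The offset arithmetic and the space special case can be sketched as follows. This is an illustrative Python version, not the tool's actual source; the function names `full_to_half_ascii` and `half_to_full_ascii` are hypothetical.

```python
OFFSET = 0xFEE0  # 65248, distance between the fullwidth and ASCII blocks


def full_to_half_ascii(text: str) -> str:
    """Map U+FF01..U+FF5E to U+0021..U+007E and U+3000 to U+0020."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0xFF01 <= cp <= 0xFF5E:
            out.append(chr(cp - OFFSET))
        elif cp == 0x3000:
            out.append(" ")
        else:
            out.append(ch)  # anything else passes through unchanged
    return "".join(out)


def half_to_full_ascii(text: str) -> str:
    """Map U+0021..U+007E to U+FF01..U+FF5E and U+0020 to U+3000."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0x21 <= cp <= 0x7E:
            out.append(chr(cp + OFFSET))
        elif cp == 0x20:
            out.append("\u3000")
        else:
            out.append(ch)
    return "".join(out)


full = "\uff21\uff22\uff23\u3000\uff11\uff12\uff13\uff01"  # fullwidth "ABC 123!"
print(full_to_half_ascii(full))                              # ABC 123!
print(half_to_full_ascii("ABC 123!") == full)                # True
```

Characters outside both ranges, such as kanji, fall through the final `else` branch and are returned as-is.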

For Katakana, no linear offset exists. A lookup dictionary maps each halfwidth Katakana codepoint (U+FF65 - U+FF9F) to its fullwidth equivalent (U+30A1 - U+30F6 range, plus the middle dot U+30FB and the prolonged sound mark U+30FC listed in the reference table). The voiced sound mark (dakuten, U+FF9E) and the semi-voiced sound mark (handakuten, U+FF9F) combine with the preceding base character to form a single fullwidth codepoint. The algorithm scans left-to-right: if a base halfwidth katakana is followed by U+FF9E or U+FF9F, the pair is consumed and mapped to one fullwidth character. In the reverse direction (full → half), voiced fullwidth katakana decompose into base + sound mark.
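The left-to-right scan with lookahead might look like the sketch below. Two assumptions are worth flagging: instead of a hand-written dictionary, the table is derived from Python's `unicodedata` compatibility mappings, and a sound mark that cannot merge with the preceding character is emitted as the spacing mark U+309B or U+309C. The tool itself may handle both points differently, and all names here are hypothetical.

```python
import unicodedata

# Halfwidth punctuation and katakana (U+FF61..U+FF9D) -> fullwidth equivalent.
# Assumption: derived from NFKC instead of a hand-written lookup dictionary.
BASE = {chr(cp): unicodedata.normalize("NFKC", chr(cp))
        for cp in range(0xFF61, 0xFF9E)}

# Halfwidth sound marks -> combining marks used to try composition.
COMBINING = {"\uff9e": "\u3099", "\uff9f": "\u309a"}
# Assumption: a mark that cannot be merged becomes the spacing mark.
STANDALONE = {"\uff9e": "\u309b", "\uff9f": "\u309c"}


def half_to_full_katakana(text: str) -> str:
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        full = BASE.get(ch)
        if full is None:
            # Not a halfwidth form: pass through (or map a stray sound mark).
            out.append(STANDALONE.get(ch, ch))
            i += 1
            continue
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if nxt in COMBINING:
            # Lookahead: try to merge base + mark into one codepoint.
            composed = unicodedata.normalize("NFC", full + COMBINING[nxt])
            if len(composed) == 1:
                out.append(composed)
                i += 2  # consume both the base and the mark
                continue
        out.append(full)
        i += 1
    return "".join(out)


print(half_to_full_katakana("\uff76\uff9e"))  # U+30AC (ga)
print(half_to_full_katakana("\uff8a\uff9f"))  # U+30D1 (pa)
```

Checking `len(composed) == 1` is what keeps invalid pairs (for example a vowel followed by dakuten) from being merged: the base is emitted alone and the mark is handled on the next iteration.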

Variable legend: $c_{\mathrm{full}}$ = fullwidth Unicode codepoint. $c_{\mathrm{half}}$ = halfwidth Unicode codepoint. 0xFEE0 = 65248 in decimal, the constant offset between the fullwidth and halfwidth ASCII blocks in Unicode.

Reference Data

| Character Class | Fullwidth Range | Halfwidth Range | Offset / Method | Example Full → Half |
| Digits 0-9 | U+FF10 - U+FF19 | U+0030 - U+0039 | −0xFEE0 | ３ → 3 |
| Uppercase A - Z | U+FF21 - U+FF3A | U+0041 - U+005A | −0xFEE0 | Ａ → A |
| Lowercase a - z | U+FF41 - U+FF5A | U+0061 - U+007A | −0xFEE0 | ａ → a |
| Symbols ! - ~ | U+FF01 - U+FF5E | U+0021 - U+007E | −0xFEE0 | ！ → ! |
| Ideographic Space | U+3000 | U+0020 | Direct map | (ideographic space) → (space) |
| Katakana ア (A) | U+30A2 | U+FF71 | Lookup table | ア → ｱ |
| Katakana カ (Ka) | U+30AB | U+FF76 | Lookup table | カ → ｶ |
| Katakana ガ (Ga) | U+30AC | U+FF76 U+FF9E | Decompose + dakuten | ガ → ｶﾞ |
| Katakana パ (Pa) | U+30D1 | U+FF8A U+FF9F | Decompose + handakuten | パ → ﾊﾟ |
| HW Katakana Middle Dot | U+30FB | U+FF65 | Lookup table | ・ → ･ |
| HW Katakana Prolonged Sound | U+30FC | U+FF70 | Lookup table | ー → ｰ |
| HW Corner Bracket 「 | U+300C | U+FF62 | Lookup table | 「 → ｢ |
| HW Corner Bracket 」 | U+300D | U+FF63 | Lookup table | 」 → ｣ |
| Fullwidth Won Sign | U+FFE6 | U+20A9 | Direct map | ￦ → ₩ |
| Fullwidth Yen Sign | U+FFE5 | U+00A5 | Direct map | ￥ → ¥ |
| CJK Ideographs (Kanji) | U+4E00 - U+9FFF | N/A | No halfwidth variant | 字 → 字 (unchanged) |
| Hiragana | U+3040 - U+309F | N/A | No halfwidth variant | あ → あ (unchanged) |
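The pairs in the reference table can be cross-checked against Unicode's own compatibility mappings. The sketch below uses Python's `unicodedata.normalize("NFKC", ...)`, which is not a width converter: it folds fullwidth Latin to ASCII but halfwidth Katakana to fullwidth, so each pair is listed in the direction NFKC moves it. The list name `NFKC_PAIRS` and the helper `mismatches` are hypothetical.

```python
import unicodedata

# (input, expected NFKC output), taken from rows of the reference table.
NFKC_PAIRS = [
    ("\uff13", "3"),             # fullwidth digit three
    ("\uff21", "A"),             # fullwidth A
    ("\uff41", "a"),             # fullwidth a
    ("\uff01", "!"),             # fullwidth exclamation mark
    ("\u3000", " "),             # ideographic space
    ("\uff71", "\u30a2"),        # halfwidth a -> fullwidth a
    ("\uff76", "\u30ab"),        # halfwidth ka -> fullwidth ka
    ("\uff76\uff9e", "\u30ac"),  # halfwidth ka + dakuten -> ga
    ("\uff8a\uff9f", "\u30d1"),  # halfwidth ha + handakuten -> pa
    ("\uff65", "\u30fb"),        # middle dot
    ("\uff70", "\u30fc"),        # prolonged sound mark
    ("\uff62", "\u300c"),        # left corner bracket
    ("\uff63", "\u300d"),        # right corner bracket
    ("\uffe6", "\u20a9"),        # won sign
    ("\uffe5", "\u00a5"),        # yen sign
    ("\u5b57", "\u5b57"),        # kanji: unchanged
    ("\u3042", "\u3042"),        # hiragana: unchanged
]


def mismatches(pairs):
    """Return the pairs whose NFKC form differs from the expected value."""
    return [(src, want) for src, want in pairs
            if unicodedata.normalize("NFKC", src) != want]


print(mismatches(NFKC_PAIRS))  # an empty list means every row agrees
```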

Frequently Asked Questions

East Asian legacy encodings (JIS X 0201, KS X 1001) assigned single-byte codes to Latin characters and double-byte codes to CJK characters. When Unicode unified these, it preserved both widths for backward compatibility. Fullwidth Latin (U+FF01 - U+FF5E) exists so that Latin text aligns with CJK character grids in monospaced layouts. Halfwidth Katakana (U+FF65 - U+FF9F) exists because early Japanese systems used single-byte katakana.
Dakuten and handakuten are handled in both directions. When converting halfwidth to fullwidth Katakana, the tool performs a lookahead: if a base halfwidth katakana character (e.g., U+FF76 ｶ) is followed by the halfwidth dakuten mark (U+FF9E), the pair is merged into a single fullwidth voiced character (U+30AC ガ). The same applies to handakuten (U+FF9F) for p-row sounds. In reverse (full to half), a voiced fullwidth katakana decomposes into the base character plus the appropriate sound mark.
CJK Unified Ideographs (kanji/hanzi), Hiragana, emoji, Latin characters already in the target width, and any character outside the defined conversion ranges pass through without modification. The converter only transforms characters that have a defined counterpart in the opposite width.
Fullwidth characters frequently break validation. A fullwidth digit like ３ (U+FF13) fails a regex pattern such as /[0-9]/ because it is outside the ASCII digit range. Similarly, fullwidth letters break email validation, URL parsing, and numeric input fields. Database collation may treat Ａ (fullwidth A, U+FF21) and A (U+0041) as different characters, causing duplicate entries. Normalizing width before validation eliminates these issues.
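The regex failure is easy to reproduce. The following sketch, in Python rather than the JavaScript-style `/[0-9]/` literal quoted above, shows an ASCII-only digit pattern rejecting fullwidth input until the width is normalized. The names `WIDTH_TABLE`, `normalize_width`, and `is_ascii_number` are hypothetical.

```python
import re

OFFSET = 0xFEE0
# Translation table for str.translate: fullwidth ASCII block plus U+3000.
WIDTH_TABLE = {cp: cp - OFFSET for cp in range(0xFF01, 0xFF5F)}
WIDTH_TABLE[0x3000] = 0x0020

ASCII_DIGITS = re.compile(r"[0-9]+")


def normalize_width(text: str) -> str:
    """Fold fullwidth ASCII-range characters to their halfwidth forms."""
    return text.translate(WIDTH_TABLE)


def is_ascii_number(text: str, normalize: bool = False) -> bool:
    """Validate against [0-9]+, optionally normalizing width first."""
    if normalize:
        text = normalize_width(text)
    return ASCII_DIGITS.fullmatch(text) is not None


fullwidth_123 = "\uff11\uff12\uff13"
print(is_ascii_number(fullwidth_123))                  # False
print(is_ascii_number(fullwidth_123, normalize=True))  # True
```

One Python-specific caveat: for `str` patterns, `\d` matches any Unicode decimal digit, fullwidth digits included, so `\d+` would accept the fullwidth input. An explicit `[0-9]` class is needed when only ASCII digits should pass.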
For ASCII-range characters (digits, letters, symbols, space), the conversion is perfectly reversible: full → half → full returns the original string, as long as the original contained no halfwidth characters in those ranges to begin with. For Katakana, the conversion is also reversible, but note that halfwidth katakana sequences (base + dakuten) merge into single fullwidth codepoints. Converting back decomposes them again, so the roundtrip is consistent but the intermediate representation differs in string length.
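The length difference in a Katakana roundtrip can be shown with a small sketch. It assumes the full-to-half table is built by inverting Unicode's compatibility mappings via `unicodedata` (so it also covers the ideographic comma and full stop), and it uses NFKC as a stand-in for the half-to-full step. The tool's own dictionary may differ, and the names are hypothetical.

```python
import unicodedata

# Fullwidth -> halfwidth for the forms in U+FF61..U+FF9D (inverse of NFKC).
FULL_TO_HALF = {unicodedata.normalize("NFKC", chr(cp)): chr(cp)
                for cp in range(0xFF61, 0xFF9E)}
# Combining sound marks -> halfwidth sound marks.
HALF_MARK = {"\u3099": "\uff9e", "\u309a": "\uff9f"}


def full_to_half_katakana(text: str) -> str:
    out = []
    for ch in text:
        if ch in FULL_TO_HALF:
            out.append(FULL_TO_HALF[ch])
            continue
        # Voiced katakana decompose into base + combining mark under NFD.
        parts = unicodedata.normalize("NFD", ch)
        if (len(parts) == 2 and parts[0] in FULL_TO_HALF
                and parts[1] in HALF_MARK):
            out.append(FULL_TO_HALF[parts[0]] + HALF_MARK[parts[1]])
        else:
            out.append(ch)  # kanji, hiragana, Latin, etc. pass through
    return "".join(out)


original = "\u30ac\u30a4\u30c9"  # fullwidth "gaido", 3 codepoints
half = full_to_half_katakana(original)
print(len(original), len(half))                         # 3 5
print(unicodedata.normalize("NFKC", half) == original)  # True
```

The two voiced characters each expand to a base plus a sound mark, which is why any length limit or column check should be applied after choosing a single width.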
The fullwidth yen and won signs are special currency symbols defined in the Halfwidth and Fullwidth Forms block but outside the simple 0xFEE0 offset range. The converter maps ￥ (fullwidth yen, U+FFE5) to U+00A5 (¥) and ￦ (fullwidth won, U+FFE6) to U+20A9 (₩) via explicit lookup rather than arithmetic offset.
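A short sketch shows why these two signs need explicit entries. Subtracting 0xFEE0 from U+FFE5 lands on U+0105, an unrelated Latin letter, so the offset table is extended with a small override dictionary. The names `CURRENCY_OVERRIDES` and `build_full_to_half_table` are hypothetical.

```python
OFFSET = 0xFEE0

# Explicit mappings for symbols outside the U+FF01..U+FF5E offset range.
CURRENCY_OVERRIDES = {
    0xFFE5: 0x00A5,  # fullwidth yen sign -> yen sign
    0xFFE6: 0x20A9,  # fullwidth won sign -> won sign
}


def build_full_to_half_table() -> dict:
    """Offset-based table for fullwidth ASCII, plus space and currency overrides."""
    table = {cp: cp - OFFSET for cp in range(0xFF01, 0xFF5F)}
    table[0x3000] = 0x0020
    table.update(CURRENCY_OVERRIDES)
    return table


TABLE = build_full_to_half_table()

price = "\uffe5\uff11\uff10\uff10"  # fullwidth yen sign + fullwidth "100"
print(price.translate(TABLE))        # ¥100

# Naive offset arithmetic gives the wrong character for the yen sign:
print(hex(0xFFE5 - OFFSET))          # 0x105
```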