About

Fullwidth characters (zenkaku, 全角) occupy a double-width cell in monospaced grids. The fullwidth counterparts of printable ASCII span Unicode codepoints U+FF01 through U+FF5E, and the ideographic space is U+3000. Mixing fullwidth and halfwidth (hankaku, 半角) characters in form submissions, CSV exports, or database records can cause silent validation failures, broken regex matches, and inflated string lengths. This tool converts between the two representations using codepoint offset arithmetic, c_half = c_full − 0xFEE0 for printable ASCII, plus a dedicated lookup dictionary that maps halfwidth Katakana (U+FF65 - U+FF9F) to fullwidth Katakana (mostly U+30A1 - U+30F6), including the halfwidth dakuten and handakuten marks. For text that is uniformly in one width, the conversion is lossless and reversible for all supported character classes.

Limitation: CJK Unified Ideographs (kanji/hanzi) have no halfwidth variant in Unicode and pass through unchanged. Halfwidth Katakana sequences of a base character plus a voicing mark (e.g., halfwidth ka + dakuten → ga) are composed into single fullwidth codepoints during half-to-full conversion, and the reverse decomposition is handled during full-to-half conversion. Pro tip: always normalize your data to one width before running string comparisons or calculating column widths in terminal output.
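To illustrate why width matters for column calculations, the sketch below estimates terminal display width from the Unicode East Asian Width property, using Python's standard `unicodedata` module. The function name `display_width` and the rule "two cells for F and W, one cell otherwise" are assumptions made for illustration and are not part of this tool; real terminals may treat ambiguous-width and combining characters differently.

```python
import unicodedata

def display_width(text: str) -> int:
    """Estimate how many monospace cells are needed to render text."""
    width = 0
    for ch in text:
        # "F" (Fullwidth) and "W" (Wide) characters take two cells.
        # Everything else is counted as one cell in this simplified sketch.
        width += 2 if unicodedata.east_asian_width(ch) in ("F", "W") else 1
    return width

print(display_width("ABC"))                 # 3
print(display_width("\uFF21\uFF22\uFF23"))  # fullwidth ABC: 6
```

Both strings contain three codepoints, so `len()` alone cannot tell them apart when padding columns.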

Formulas

The core conversion for ASCII-range fullwidth characters uses a constant offset in the Unicode codepoint space:

c_half = c_full − 0xFEE0

where c_full ∈ [U+FF01, U+FF5E] and the result c_half ∈ [U+0021, U+007E]. The reverse operation adds the same offset:

c_full = c_half + 0xFEE0

where c_half ∈ [U+0021, U+007E].

The ideographic space is a special case: U+3000 ↔ U+0020.
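The offset rule and the space special case can be sketched in a few lines of Python. The function names `full_to_half_ascii` and `half_to_full_ascii` are hypothetical and chosen for this example; this is a minimal illustration of the arithmetic above, not the tool's actual source.

```python
OFFSET = 0xFEE0  # 65248: distance between U+FF01..U+FF5E and U+0021..U+007E

def full_to_half_ascii(text: str) -> str:
    """Map fullwidth ASCII variants and U+3000 to their halfwidth forms."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0xFF01 <= cp <= 0xFF5E:
            out.append(chr(cp - OFFSET))   # c_half = c_full - 0xFEE0
        elif cp == 0x3000:
            out.append("\u0020")           # ideographic space -> space
        else:
            out.append(ch)                 # everything else passes through
    return "".join(out)

def half_to_full_ascii(text: str) -> str:
    """Map printable ASCII and U+0020 to their fullwidth forms."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0x21 <= cp <= 0x7E:
            out.append(chr(cp + OFFSET))   # c_full = c_half + 0xFEE0
        elif cp == 0x20:
            out.append("\u3000")           # space -> ideographic space
        else:
            out.append(ch)
    return "".join(out)

print(full_to_half_ascii("\uFF21\uFF22\uFF23\uFF11\uFF12\uFF13"))  # ABC123
```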

For Katakana, no linear offset exists. A lookup dictionary maps each halfwidth Katakana codepoint (U+FF65 - U+FF9F) to its fullwidth equivalent (mostly in the U+30A1 - U+30F6 range; the middle dot and prolonged sound mark map to U+30FB and U+30FC). The voicing marks, dakuten (U+FF9E) and handakuten (U+FF9F), combine with the preceding base character to form a single fullwidth codepoint. The algorithm scans left-to-right: if a base halfwidth katakana is followed by U+FF9E or U+FF9F, the pair is consumed and mapped to one fullwidth character. In the reverse (full → half), voiced fullwidth katakana decompose into base + voicing mark.
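The left-to-right scan with a one-character lookahead might look like the following Python sketch. The tables here are deliberately partial (a handful of characters only), and the names `HALF_TO_FULL`, `VOICED`, `SEMI_VOICED`, `half_to_full_katakana`, and `full_to_half_katakana` are assumptions for illustration; a complete converter would need the full lookup dictionary.

```python
DAKUTEN = "\uFF9E"     # halfwidth voiced sound mark
HANDAKUTEN = "\uFF9F"  # halfwidth semi-voiced sound mark

# Partial lookup table (illustrative subset only).
HALF_TO_FULL = {
    "\uFF71": "\u30A2",  # halfwidth a  -> fullwidth a
    "\uFF73": "\u30A6",  # halfwidth u  -> fullwidth u
    "\uFF76": "\u30AB",  # halfwidth ka -> fullwidth ka
    "\uFF8A": "\u30CF",  # halfwidth ha -> fullwidth ha
    "\uFF65": "\u30FB",  # middle dot
    "\uFF70": "\u30FC",  # prolonged sound mark
}
VOICED = {"\u30AB": "\u30AC", "\u30CF": "\u30D0", "\u30A6": "\u30F4"}  # ka->ga, ha->ba, u->vu
SEMI_VOICED = {"\u30CF": "\u30D1"}                                      # ha->pa

def half_to_full_katakana(text: str) -> str:
    out, i = [], 0
    while i < len(text):
        base = HALF_TO_FULL.get(text[i])
        if base is None:
            out.append(text[i])          # not in the table: pass through
            i += 1
            continue
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if nxt == DAKUTEN and base in VOICED:
            out.append(VOICED[base])     # consume base + dakuten as one char
            i += 2
        elif nxt == HANDAKUTEN and base in SEMI_VOICED:
            out.append(SEMI_VOICED[base])
            i += 2
        else:
            out.append(base)
            i += 1
    return "".join(out)

# Reverse table: plain characters invert directly, voiced ones decompose.
FULL_TO_HALF = {full: half for half, full in HALF_TO_FULL.items()}
for base, voiced in VOICED.items():
    FULL_TO_HALF[voiced] = FULL_TO_HALF[base] + DAKUTEN
for base, semi in SEMI_VOICED.items():
    FULL_TO_HALF[semi] = FULL_TO_HALF[base] + HANDAKUTEN

def full_to_half_katakana(text: str) -> str:
    return "".join(FULL_TO_HALF.get(ch, ch) for ch in text)
```

In this sketch a voicing mark that follows a character with no voiced form (for example halfwidth a + dakuten) is left in place rather than dropped, which is one possible design choice among several.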

Variable legend: c_full = fullwidth Unicode codepoint. c_half = halfwidth Unicode codepoint. 0xFEE0 = 65248 in decimal, the constant offset between fullwidth and halfwidth ASCII blocks in Unicode.

Reference Data

Character Class	Fullwidth Range	Halfwidth Range	Offset / Method	Example Full → Half
Digits 0-9	U+FF10 - U+FF19	U+0030 - U+0039	−0xFEE0	３ → 3
Uppercase A - Z	U+FF21 - U+FF3A	U+0041 - U+005A	−0xFEE0	Ａ → A
Lowercase a - z	U+FF41 - U+FF5A	U+0061 - U+007A	−0xFEE0	ａ → a
Symbols ! - ~	U+FF01 - U+FF5E	U+0021 - U+007E	−0xFEE0	！ → !
Ideographic Space	U+3000	U+0020	Direct map	　 → (space)
Katakana ア (A)	U+30A2	U+FF71	Lookup table	ア → ｱ
Katakana カ (Ka)	U+30AB	U+FF76	Lookup table	カ → ｶ
Katakana ガ (Ga)	U+30AC	U+FF76 U+FF9E	Decompose + dakuten	ガ → ｶﾞ
Katakana パ (Pa)	U+30D1	U+FF8A U+FF9F	Decompose + handakuten	パ → ﾊﾟ
HW Katakana Middle Dot	U+30FB	U+FF65	Lookup table	・ → ･
HW Katakana Prolonged Sound	U+30FC	U+FF70	Lookup table	ー → ｰ
HW Corner Bracket 「	U+300C	U+FF62	Lookup table	「 → ｢
HW Corner Bracket 」	U+300D	U+FF63	Lookup table	」 → ｣
Fullwidth Won Sign	U+FFE6	U+20A9	Direct map	￦ → ₩
Fullwidth Yen Sign	U+FFE5	U+00A5	Direct map	￥ → ¥
CJK Ideographs (Kanji)	U+4E00 - U+9FFF	N/A	No halfwidth variant	字 → 字 (unchanged)
Hiragana	U+3040 - U+309F	N/A	No halfwidth variant	あ → あ (unchanged)

Frequently Asked Questions

Legacy East Asian encodings built on standards such as JIS X 0201 and KS X 1001 assigned single-byte codes to Latin characters and double-byte codes to CJK characters. When Unicode unified these, it preserved both widths for backward compatibility. Fullwidth Latin (U+FF01 - U+FF5E) exists so that Latin text aligns with CJK character grids in monospaced layouts. Halfwidth Katakana (U+FF65 - U+FF9F) exists because early Japanese systems used single-byte katakana.

Dakuten and handakuten are handled in both directions. When converting halfwidth to fullwidth Katakana, the tool performs a lookahead: if a base halfwidth katakana character (e.g., U+FF76 ｶ) is followed by the halfwidth dakuten mark (U+FF9E), the pair is merged into a single fullwidth voiced character (U+30AC ガ). The same applies to handakuten (U+FF9F) for p-row sounds. In reverse (full to half), a voiced fullwidth katakana decomposes into the base character plus the appropriate voicing mark.

CJK Unified Ideographs (kanji/hanzi), Hiragana, emoji, Latin characters already in the target width, and any character outside the defined conversion ranges pass through without modification. The converter only transforms characters that have a defined counterpart in the opposite width.

Fullwidth characters frequently cause validation failures. A fullwidth digit like ３ (U+FF13) will fail a regex pattern such as /[0-9]/ because it is not in the ASCII digit range. Similarly, fullwidth letters break email validation, URL parsing, and numeric input fields. Database collation may treat Ａ (fullwidth A, U+FF21) and A (U+0041) as different characters, causing duplicate entries. Normalizing width before validation avoids these issues.
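A short Python sketch of the failure and the fix follows. The helper names `is_ascii_number` and `normalize_ascii_width` are hypothetical and only cover the ASCII offset range; they are meant to show the idea of normalizing before validating, not to reproduce this tool.

```python
import re

ASCII_DIGITS = re.compile(r"[0-9]+")

def is_ascii_number(value: str) -> bool:
    """True only if every character is an ASCII digit 0-9."""
    return ASCII_DIGITS.fullmatch(value) is not None

def normalize_ascii_width(value: str) -> str:
    """Fold fullwidth ASCII variants (U+FF01..U+FF5E) to halfwidth."""
    return "".join(
        chr(ord(ch) - 0xFEE0) if 0xFF01 <= ord(ch) <= 0xFF5E else ch
        for ch in value
    )

raw = "\uFF11\uFF12\uFF13"  # fullwidth digits 1, 2, 3
print(is_ascii_number(raw))                         # False
print(is_ascii_number(normalize_ascii_width(raw)))  # True
print("\uFF21" == "A")                              # False: distinct codepoints
```

One Python-specific caveat: with `str` patterns, `\d` matches any Unicode decimal digit, fullwidth digits included, so the explicit `[0-9]` class is what reproduces the failure described above.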

For ASCII-range characters (digits, letters, symbols, space), the conversion is reversible when the input is uniformly one width: full → half → full returns the original fullwidth string. For Katakana, the conversion is also reversible, but note that halfwidth katakana sequences (base + dakuten) are composed into single fullwidth codepoints. Converting back decomposes them again, so the roundtrip is consistent but the two representations differ in string length.
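The roundtrip property for the ASCII range can be checked with a small Python sketch built on `str.translate`. The table and function names are assumptions for illustration. The example also shows an input that mixes both widths, which cannot round-trip because the first conversion discards the information about which characters were originally halfwidth.

```python
# Translation tables keyed by codepoint, as str.translate expects.
FULL_TO_HALF = {cp: cp - 0xFEE0 for cp in range(0xFF01, 0xFF5F)}
FULL_TO_HALF[0x3000] = 0x0020  # ideographic space -> space
HALF_TO_FULL = {half: full for full, half in FULL_TO_HALF.items()}

def to_half(text: str) -> str:
    return text.translate(FULL_TO_HALF)

def to_full(text: str) -> str:
    return text.translate(HALF_TO_FULL)

uniform = "\uFF21\uFF22\uFF23\u3000\uFF11\uFF12\uFF13"  # all fullwidth
mixed = "ABC\u3000\uFF11\uFF12\uFF13"                   # halfwidth letters, fullwidth rest

print(to_full(to_half(uniform)) == uniform)  # True
print(to_full(to_half(mixed)) == mixed)      # False
```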

The fullwidth yen and won signs are special currency symbols defined in the Halfwidth and Fullwidth Forms block but outside the simple 0xFEE0 offset range. The converter maps ￥ (fullwidth yen, U+FFE5) to U+00A5 (¥) and ￦ (fullwidth won, U+FFE6) to U+20A9 (₩) via explicit lookup rather than arithmetic offset.
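A minimal Python sketch of such an explicit lookup is shown below. The names `CURRENCY_FULL_TO_HALF` and `convert_currency` are hypothetical. The last lines illustrate why the offset cannot be reused here: subtracting 0xFEE0 from U+FFE5 yields U+0105, an unrelated Latin letter.

```python
# Explicit codepoint map for the two currency signs (full -> half).
CURRENCY_FULL_TO_HALF = {
    0xFFE5: 0x00A5,  # fullwidth yen sign -> yen sign
    0xFFE6: 0x20A9,  # fullwidth won sign -> won sign
}

def convert_currency(text: str) -> str:
    """Replace fullwidth yen/won signs; all other characters pass through."""
    return text.translate(CURRENCY_FULL_TO_HALF)

print(convert_currency("\uFFE5100") == "\u00A5100")  # True

# The ASCII offset would give the wrong result for these codepoints:
wrong = chr(0xFFE5 - 0xFEE0)
print(hex(ord(wrong)))  # 0x105, not 0xa5
```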