About

Case conversion errors propagate silently through data pipelines. A mismatched key in a case-sensitive database lookup returns zero rows instead of throwing an error. This tool applies the Unicode uppercase mapping to every code point in your input. For standard ASCII, each lowercase letter from a (0x61) through z (0x7A) is shifted by subtracting 32 (0x20) to yield A (0x41) through Z (0x5A). Non-alphabetic characters pass through unchanged. The conversion covers the full Basic Multilingual Plane (BMP), so accented characters like é correctly become É.

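Limitation: locale-specific mappings (e.g., Turkish i → İ) are not applied. The tool relies on the locale-insensitive toUpperCase(), which always maps i to I regardless of the browser's locale, and it offers no user-selected locale. For bulk normalization tasks, verify that this default mapping matches your target locale.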
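
The difference between the locale-insensitive and the locale-aware method can be sketched as follows. This is a minimal illustration, not the tool's source; the variable names are made up, and the Turkish result assumes the runtime ships locale data for "tr", which is common in modern browsers but not guaranteed everywhere.

```javascript
// Locale-insensitive mapping: "i" always becomes "I" (U+0049).
const plain = "istanbul".toUpperCase();

// Locale-aware mapping: under Turkish rules "i" becomes "İ" (U+0130).
// Assumption: the runtime has Turkish locale data available.
const turkish = "istanbul".toLocaleUpperCase("tr");

console.log(plain);   // ISTANBUL
console.log(turkish); // İSTANBUL when Turkish locale data is present
```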

Formulas

For any character c within the ASCII lowercase range, the uppercase equivalent C is computed by a fixed integer offset:

C = c − 32

This works because the ASCII table places uppercase letters at code points 65–90 and lowercase letters at 97–122. The gap is exactly 0x20 (32 in decimal). Every lowercase letter has bit 5 set, so in binary the subtraction is equivalent to clearing bit 5:

C = c ∧ 0xDF

The general condition for this offset to apply:

0x61 ≤ charCodeAt(c) ≤ 0x7A

Where c = the code of the input character, C = the code of the resulting uppercase character, 32 (0x20) = the fixed offset between ASCII cases, ∧ = bitwise AND, and 0xDF = the bitmask that clears bit 5. Characters outside this range are handled by the Unicode case mapping tables built into the JavaScript engine: digits, punctuation, and already-uppercase letters are returned unchanged, while non-ASCII lowercase letters are mapped to their uppercase forms.
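
The two formulas above can be sketched in JavaScript as follows. The helper names asciiUpper and asciiUpperMask are hypothetical and exist only for illustration; the tool itself relies on the native toUpperCase().

```javascript
// Hypothetical helper: uppercase one ASCII character with C = c - 32.
function asciiUpper(ch) {
  const code = ch.charCodeAt(0);
  // Apply the offset only inside the lowercase range 0x61..0x7A.
  if (code >= 0x61 && code <= 0x7a) {
    return String.fromCharCode(code - 32);
  }
  return ch;
}

// Hypothetical helper: same result with C = c AND 0xDF (clear bit 5).
function asciiUpperMask(ch) {
  const code = ch.charCodeAt(0);
  if (code >= 0x61 && code <= 0x7a) {
    return String.fromCharCode(code & 0xdf);
  }
  return ch;
}

console.log(asciiUpper("q"), asciiUpperMask("q")); // Q Q
console.log(asciiUpper("7"), asciiUpper("Q"));     // 7 Q
```

The range check matters: without it, the offset or the mask would also alter digits and punctuation.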

Reference Data

Character	Lowercase Code	Uppercase	Uppercase Code	Offset
a	97 (0x61)	A	65 (0x41)	−32
b	98 (0x62)	B	66 (0x42)	−32
c	99 (0x63)	C	67 (0x43)	−32
d	100 (0x64)	D	68 (0x44)	−32
e	101 (0x65)	E	69 (0x45)	−32
f	102 (0x66)	F	70 (0x46)	−32
g	103 (0x67)	G	71 (0x47)	−32
h	104 (0x68)	H	72 (0x48)	−32
i	105 (0x69)	I	73 (0x49)	−32
j	106 (0x6A)	J	74 (0x4A)	−32
k	107 (0x6B)	K	75 (0x4B)	−32
l	108 (0x6C)	L	76 (0x4C)	−32
m	109 (0x6D)	M	77 (0x4D)	−32
n	110 (0x6E)	N	78 (0x4E)	−32
o	111 (0x6F)	O	79 (0x4F)	−32
p	112 (0x70)	P	80 (0x50)	−32
q	113 (0x71)	Q	81 (0x51)	−32
r	114 (0x72)	R	82 (0x52)	−32
s	115 (0x73)	S	83 (0x53)	−32
t	116 (0x74)	T	84 (0x54)	−32
u	117 (0x75)	U	85 (0x55)	−32
v	118 (0x76)	V	86 (0x56)	−32
w	119 (0x77)	W	87 (0x57)	−32
x	120 (0x78)	X	88 (0x58)	−32
y	121 (0x79)	Y	89 (0x59)	−32
z	122 (0x7A)	Z	90 (0x5A)	−32

Frequently Asked Questions

Yes. The converter uses JavaScript's native String.prototype.toUpperCase(), which the ECMAScript specification defines in terms of the locale-insensitive Unicode case mappings, including the special mappings that produce more than one character. Characters like é become É, ñ becomes Ñ, and ß becomes SS (the standard German uppercase mapping). However, locale-specific rules (e.g., Turkish i → İ instead of I) are not applied: toUpperCase() ignores the browser's locale, and the tool has no user-configurable locale setting.

They pass through completely unchanged. The uppercase mapping only affects characters that have an uppercase form in the Unicode case mapping data, most of which are classified as Lowercase_Letter (category Ll). Code points for digits (0x30–0x39), spaces (0x20), tabs (0x09), and all standard punctuation have no uppercase variant and are returned as-is.
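
A quick way to check this behavior yourself is sketched below. The sample string and variable names are arbitrary choices for illustration.

```javascript
// Mixed input: letters, a space, digits, a tab, and punctuation.
const sample = "abc 123\t!?-_";
const upper = sample.toUpperCase();

// Collect the characters whose uppercase form differs from the original.
const changed = [...sample].filter((ch) => ch !== ch.toUpperCase());

console.log(JSON.stringify(upper)); // "ABC 123\t!?-_"
console.log(changed.join(""));      // abc
```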

The ASCII table was designed intentionally so that lowercase and uppercase Latin letters differ by exactly one bit, bit 5 (value 32). This made case-insensitive comparison of letters trivial on early hardware: clear bit 5 with a single AND 0xDF instruction. The offset of 32 is a deliberate engineering choice from the 1960s ASCII standardization work (lowercase letters were added in the 1967 revision), not a coincidence.
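
As a rough illustration of that design, an ASCII-only case-insensitive comparison might look like the sketch below. The function names are hypothetical, and the approach is only valid for ASCII letters; it is not a substitute for Unicode-aware comparison.

```javascript
// Hypothetical helper: fold one code unit to uppercase by clearing bit 5.
// The range check is required: masking other characters would corrupt
// them (for example "{" 0x7B would turn into "[" 0x5B).
function foldAscii(code) {
  return code >= 0x61 && code <= 0x7a ? code & 0xdf : code;
}

// Hypothetical helper: compare two strings, ignoring ASCII letter case.
function equalsIgnoreAsciiCase(a, b) {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (foldAscii(a.charCodeAt(i)) !== foldAscii(b.charCodeAt(i))) {
      return false;
    }
  }
  return true;
}

console.log(equalsIgnoreAsciiCase("Hello", "hELLO")); // true
console.log(equalsIgnoreAsciiCase("{", "["));         // false
```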

JavaScript's toUpperCase() operates in O(n) time and is implemented natively in the engine (V8, SpiderMonkey), so it runs at near-C speed. Texts up to several megabytes typically convert in under 50 ms on modern hardware. For texts exceeding 10 MB, consider splitting the input into chunks. The textarea in this tool has no artificial character limit.
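
One possible way to chunk the input is sketched below. The helper name toUpperCaseChunked and the default chunk size are assumptions made for illustration, not part of the tool. The sketch also assumes that the locale-insensitive uppercase mapping of a code point does not depend on its neighbors, so the only care needed is to avoid cutting a surrogate pair in half.

```javascript
// Hypothetical helper: uppercase a large string chunk by chunk.
// chunkSize is measured in UTF-16 code units; the default is arbitrary.
function toUpperCaseChunked(text, chunkSize = 1 << 20) {
  if (chunkSize < 1) throw new RangeError("chunkSize must be at least 1");
  const parts = [];
  let start = 0;
  while (start < text.length) {
    let end = Math.min(start + chunkSize, text.length);
    // If the chunk would end on a high surrogate, extend it by one unit
    // so the surrogate pair stays in the same chunk.
    if (end < text.length) {
      const unit = text.charCodeAt(end - 1);
      if (unit >= 0xd800 && unit <= 0xdbff) end += 1;
    }
    parts.push(text.slice(start, end).toUpperCase());
    start = end;
  }
  return parts.join("");
}
```

In a browser, each chunk could be processed in a separate task to keep the page responsive; the synchronous loop here only shows the splitting logic.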

Per Unicode rules, ß maps to the two-character sequence SS when uppercased. This means the output string can be longer than the input. For example, "straße" (6 characters) becomes "STRASSE" (7 characters). The capital Eszett (ẞ, U+1E9E) was added to Unicode in 2008 but is not the default mapping result.
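
The length change is easy to observe directly. The snippet below is a small illustration with arbitrary variable names.

```javascript
const input = "stra\u00DFe";          // "straße"
const output = input.toUpperCase();

console.log(output, input.length, output.length); // STRASSE 6 7

// The capital Eszett (U+1E9E) is already uppercase, so it stays as-is.
const capitalEszett = "\u1E9E";
console.log(capitalEszett.toUpperCase() === capitalEszett); // true
```

Code that assumes the output has the same length as the input, such as fixed-width buffers or index-based highlighting, should account for this.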

No. Uppercase conversion is a lossy operation. Once applied, the original case information is destroyed. "Hello World" and "hello world" both produce "HELLO WORLD". If you need reversibility, store the original text before converting.
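
A minimal sketch of keeping the original alongside the converted text is shown below. The makeRecord helper and its field names are hypothetical.

```javascript
// Hypothetical record shape: store the original next to the uppercase form.
function makeRecord(text) {
  return { original: text, upper: text.toUpperCase() };
}

const first = makeRecord("Hello World");
const second = makeRecord("hello world");

// Both inputs collapse to the same uppercase string...
console.log(first.upper === second.upper);       // true
// ...but the originals remain distinguishable.
console.log(first.original === second.original); // false
```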