About

Consonant removal is a character-level text transformation that strips every non-vowel letter from a string while preserving whitespace, digits, and punctuation. For basic English text, the operation reduces the 26-letter Latin alphabet to the vowel subset {a, e, i, o, u}. This tool extends coverage to accented variants (á, ë, ö, ü, etc.), for a total of over 60 recognized vowel forms. Applications range from phonological analysis and linguistic pattern research to creative writing constraints and cipher construction. Because misclassifying a character as a consonant or a vowel corrupts downstream analysis, the tool preserves letter case, recognizes extended Latin diacritics, and passes non-letter characters through unchanged.

Note: the classification treats y as a consonant by default, consistent with standard English orthographic convention. In languages where y functions as a vowel (Finnish, Welsh), enable the optional toggle. The tool processes text locally in the browser. No data is transmitted to any server.

Formulas

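The consonant removal function f applies a per-character gate g to each character c_i of an input string S of length n and concatenates the results in order (the operator ⋀ below denotes in-order concatenation):

\[ f(S) = \bigwedge_{i=1}^{n} g(c_i) \]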

where the per-character gate function g is defined as:

\[ g(c) = \begin{cases} c & \text{if } c \in V \\ c & \text{if } c \notin L \\ \varepsilon & \text{otherwise} \end{cases} \]

where V = {a, e, i, o, u, á, é, …} is the set of recognized vowel characters, L is the set of all Unicode letters (detected via regex /\p{L}/u), and ε is the empty string. Characters that are letters but not vowels (consonants) are deleted. All non-letter characters (digits, spaces, punctuation, symbols) pass through unchanged. When the optional toggle is active, y and Y are added to set V.
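The definitions of f and g above can be sketched in JavaScript as follows. This is a minimal illustration, not the tool's actual source: the names `gate`, `removeConsonants`, and `treatYAsVowel` are hypothetical, and the vowel list is assembled from the characters named in this document, with uppercase forms derived automatically.

```javascript
// Lowercase vowels named in this document; normalized so each is one code point.
const LOWER_VOWELS = "aeiouáàâäãåæéèêëíìîïóòôöõøúùûü".normalize("NFC");
// Assumption: uppercase forms are derived with toUpperCase().
const BASE_VOWELS = new Set([...LOWER_VOWELS, ...LOWER_VOWELS.toUpperCase()]);
const LETTER = /\p{L}/u; // matches any Unicode letter

// Hypothetical helper implementing g(c).
function gate(c, vowels) {
  if (vowels.has(c)) return c;   // c ∈ V: keep the vowel
  if (!LETTER.test(c)) return c; // c ∉ L: keep non-letters unchanged
  return "";                     // otherwise: a non-vowel letter, delete it
}

// Hypothetical helper implementing f(S): concatenates g(c_i) in order.
function removeConsonants(text, { treatYAsVowel = false } = {}) {
  const vowels = new Set(BASE_VOWELS);
  if (treatYAsVowel) {
    vowels.add("y");
    vowels.add("Y");
  }
  let out = "";
  for (const c of text) out += gate(c, vowels); // iterates by code point
  return out;
}

console.log(removeConsonants("Hello, World! 123")); // eo, o! 123
console.log(removeConsonants("gym my", { treatYAsVowel: true })); // y y
```

Iterating with `for...of` walks the string by code point rather than by UTF-16 unit, so characters outside the Basic Multilingual Plane are tested as whole characters.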

Reference Data

Letter	Classification	IPA Example	Frequency (English)
a	Vowel	/æ/, /eɪ/	8.167%
b	Consonant	/b/	1.492%
c	Consonant	/k/, /s/	2.782%
d	Consonant	/d/	4.253%
e	Vowel	/ɛ/, /iː/	12.702%
f	Consonant	/f/	2.228%
g	Consonant	/ɡ/, /dʒ/	2.015%
h	Consonant	/h/	6.094%
i	Vowel	/ɪ/, /aɪ/	6.966%
j	Consonant	/dʒ/	0.153%
k	Consonant	/k/	0.772%
l	Consonant	/l/	4.025%
m	Consonant	/m/	2.406%
n	Consonant	/n/	6.749%
o	Vowel	/ɒ/, /oʊ/	7.507%
p	Consonant	/p/	1.929%
q	Consonant	/k/	0.095%
r	Consonant	/ɹ/	5.987%
s	Consonant	/s/, /z/	6.327%
t	Consonant	/t/	9.056%
u	Vowel	/ʌ/, /uː/	2.758%
v	Consonant	/v/	0.978%
w	Consonant	/w/	2.360%
x	Consonant	/ks/	0.150%
y	Consonant*	/j/, /iː/	1.974%
z	Consonant	/z/	0.074%
á, à, â, ä, ã, å	Vowel	Accented /a/	Language-dependent
é, è, ê, ë	Vowel	Accented /e/	Language-dependent
í, ì, î, ï	Vowel	Accented /i/	Language-dependent
ó, ò, ô, ö, õ	Vowel	Accented /o/	Language-dependent
ú, ù, û, ü	Vowel	Accented /u/	Language-dependent
* y is classified as a vowel when the optional "Treat Y as vowel" toggle is enabled.

Frequently Asked Questions

The vowel set contains over 60 entries covering standard Latin vowels plus their accented variants: á, à, â, ä, ã, å, æ, é, è, ê, ë, í, ì, î, ï, ó, ò, ô, ö, õ, ø, ú, ù, û, ü, and their uppercase equivalents. Any letter not in this set is treated as a consonant and removed. Characters like ñ, ç, ß are classified as consonants because they represent consonant phonemes.
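As an illustration of set-based classification, the sketch below builds a vowel set from only the characters listed in this answer and labels a few sample characters. The names `LISTED`, `VOWELS`, and `classify` are hypothetical. The listed characters yield 60 entries once uppercase forms are added, so the tool's real set may include further forms not shown here.

```javascript
// Lowercase vowels listed in this answer, normalized to single code points.
const LISTED = "aeiouáàâäãåæéèêëíìîïóòôöõøúùûü".normalize("NFC");
// Assumption: uppercase equivalents are derived with toUpperCase().
const VOWELS = new Set([...LISTED, ...LISTED.toUpperCase()]);

// Hypothetical helper: labels a single character.
function classify(ch) {
  if (VOWELS.has(ch)) return "vowel";
  return /\p{L}/u.test(ch) ? "consonant" : "other";
}

console.log(VOWELS.size); // 60
for (const ch of "éØñçß7".normalize("NFC")) {
  console.log(ch, classify(ch));
}
```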

In standard English orthography, y functions as a consonant at syllable onsets ("yes", "yellow") and as a vowel in other positions ("gym", "my"). The default consonant classification follows the majority convention used in crossword puzzles, Scrabble dictionaries, and most NLP tokenizers. Enable the "Treat Y as vowel" toggle if you are analyzing Welsh, Finnish, or performing phonological vowel-harmony studies where y consistently represents a vowel phoneme.

The tool uses Unicode letter detection (/\p{L}/u), and only characters present in the explicit vowel set are recognized as vowels. Cyrillic letters like а, е, и are distinct Unicode code points from Latin a, e, i and are not in the vowel set, so they are removed as non-vowel letters. Note that \p{L} also matches CJK ideographs, which belong to Unicode general category Lo ("Other Letter"), so under the same rule they are removed rather than passed through. For Cyrillic vowel extraction, a dedicated tool with a Cyrillic vowel set is recommended.
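Which characters the pattern /\p{L}/u matches can be checked directly. The sketch below tests a few sample characters and also shows that Cyrillic а is a different code point from Latin a. The names `isLetter` and `samples` are hypothetical.

```javascript
// Hypothetical helper: true when ch contains a Unicode letter.
const isLetter = (ch) => /\p{L}/u.test(ch);

const samples = {
  "Latin a": "a",
  "Cyrillic a": "\u0430", // looks like Latin a but is a separate code point
  "CJK ideograph": "\u4E2D", // general category Lo (Other Letter)
  "digit": "7",
  "space": " ",
};

for (const [label, ch] of Object.entries(samples)) {
  const codePoint = "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
  console.log(label, codePoint, "letter:", isLetter(ch), "equals Latin a:", ch === "a");
}
```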

Yes. The algorithm only removes characters classified as consonant letters. All whitespace characters (spaces, tabs, newlines, carriage returns), digits (0-9), punctuation marks, and symbols are passed through unmodified. Line breaks in multi-line input are preserved exactly in the output.

In standard English text, consonants comprise approximately 62% of all letters (based on letter frequency analysis). After removal, expect the output to retain roughly 38% of the original letter count, plus all non-letter characters. Highly consonant-dense words like "strengths" (1 vowel, 8 consonants) reduce to a single character. Vowel-heavy words like "onomatopoeia" (8 vowels, 4 consonants) retain most of their structure.
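The percentages and word counts above can be reproduced from the reference table. The sketch below sums the five vowel frequencies and counts letters in the two example words. The names `VOWEL_FREQ` and `countLetters` are hypothetical, and the consonant share is taken as 100 minus the vowel share, treating the table as summing to 100%.

```javascript
// English letter frequencies (percent) for the five vowels, from the reference table.
const VOWEL_FREQ = { a: 8.167, e: 12.702, i: 6.966, o: 7.507, u: 2.758 };

const vowelShare = Object.values(VOWEL_FREQ).reduce((sum, f) => sum + f, 0);
const consonantShare = 100 - vowelShare; // assumes the table sums to 100%

// Hypothetical helper: counts vowels and consonants in an unaccented English word,
// with y treated as a consonant (the default classification).
function countLetters(word) {
  let vowels = 0;
  let consonants = 0;
  for (const ch of word.toLowerCase()) {
    if ("aeiou".includes(ch)) vowels++;
    else if (/[a-z]/.test(ch)) consonants++;
  }
  return { vowels, consonants };
}

console.log(vowelShare.toFixed(1), consonantShare.toFixed(1)); // 38.1 61.9
console.log(countLetters("strengths"));    // { vowels: 1, consonants: 8 }
console.log(countLetters("onomatopoeia")); // { vowels: 8, consonants: 4 }
```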

The output reveals the vowel skeleton of a text, which is useful for detecting vowel harmony patterns, analyzing assonance in poetry, or studying vocalic structure in morphology. However, this tool operates on orthography, not phonetics. Silent vowel letters (e.g., the final e in "cake") are retained, while consonant letters that help spell a vowel sound (like the w in "cow", part of the diphthong /aʊ/) are removed. For true phonetic analysis, work with IPA transcriptions instead.