Use uppercase T, N, L, R for retroflex. Use a dot (.) to break ambiguous sequences.

About

Tamil script comprises 12 vowels (உயிர் எழுத்துக்கள்), 18 consonants (மெய் எழுத்துக்கள்), 216 consonant-vowel combinations (உயிர்மெய் எழுத்துக்கள்), and the special character aytham (ஆய்த எழுத்து, ஃ), totaling 247 characters. Typing Tamil natively requires a specialized keyboard layout that is unfamiliar to most users trained on QWERTY. Ad hoc transliteration can produce malformed sequences, such as a vowel sign with no base consonant, which render as broken glyphs on many devices. This tool implements a greedy longest-match phonetic mapping algorithm: you type English letter sequences (e.g., ka → க, thi → தி) and receive Tamil Unicode output in real time.

The mapping follows a romanized phonetic convention similar to those used by common Tamil phonetic input methods, covering retroflex consonants (T, N, L in uppercase), Grantha letters, and all 12 vowels together with their dependent vowel signs. Output is standard Unicode from the Tamil block (U+0B80 to U+0BFF), which modern browsers, operating systems, and databases support. Note: this tool handles phonetic approximation. Loanwords from Sanskrit or English containing sounds with no native Tamil phoneme may require manual adjustment.
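As a rough illustration of the Unicode range mentioned above, the following sketch flags any non-whitespace character that falls outside the Tamil block. The function name non_tamil_chars is hypothetical and is not part of the tool; it only shows one way such a check could be written.

```python
from typing import List

# Tamil block: U+0B80 through U+0BFF inclusive (range() excludes the stop value).
TAMIL_BLOCK = range(0x0B80, 0x0C00)


def non_tamil_chars(text: str) -> List[str]:
    """Return the non-whitespace characters of text that lie outside the Tamil block.

    Hypothetical helper for illustration only.
    """
    return [ch for ch in text if not ch.isspace() and ord(ch) not in TAMIL_BLOCK]


if __name__ == "__main__":
    print(non_tamil_chars("\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"))  # every character is in the block
    print(non_tamil_chars("\u0b95a"))                          # the Latin "a" is reported
```

Digits and punctuation are reported as well, since they sit outside the block; a real check would likely allow them.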


Formulas

The transliteration engine uses a greedy longest-match algorithm over an input buffer B. At each keystroke, the algorithm attempts to match the longest prefix of B against the phonetic mapping dictionary D.

\[
\mathrm{transliterate}(B) =
\begin{cases}
D[B] & \text{if } B \in D \\
\mathrm{transliterate}(B[0..n-1]) & \text{otherwise (reduce length)}
\end{cases}
\]

Where B is the current input buffer (sequence of English keystrokes not yet converted), D is the phonetic mapping dictionary containing all valid English-to-Tamil mappings, and n = len(B). The algorithm scans from the full buffer length down to 1, emitting the Tamil character for the first match found and moving the buffer cursor forward by the match length.
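The scan described above can be sketched in a few lines of Python. This is a minimal sketch, not the tool's actual implementation: the dictionary D below is a tiny hypothetical subset, the name MAX_KEY is introduced here, and capping the scan at the longest key length is an assumed shortcut that gives the same result as starting from the full buffer length.

```python
# Hypothetical, deliberately tiny mapping; the real dictionary D is much larger.
D = {
    "a": "அ",
    "k": "க்",
    "ka": "க",
    "n": "ந்",
    "nj": "ஞ்",
    "th": "த்",
    "thi": "தி",
}
# No key is longer than this, so longer candidates can never match.
MAX_KEY = max(len(key) for key in D)


def transliterate(buffer: str) -> str:
    """Greedy longest-match conversion of an English phonetic buffer."""
    out = []
    pos = 0
    while pos < len(buffer):
        # Try the longest candidate first, then shrink toward length 1.
        for length in range(min(MAX_KEY, len(buffer) - pos), 0, -1):
            chunk = buffer[pos:pos + length]
            if chunk in D:
                out.append(D[chunk])
                pos += length  # move the cursor forward by the match length
                break
        else:
            # No mapping at any length: pass the character through unchanged.
            out.append(buffer[pos])
            pos += 1
    return "".join(out)


if __name__ == "__main__":
    print(transliterate("kathi"))  # matches "ka", then "thi"
```

Because the longest candidate wins, "thi" is consumed as one unit and is never split into "th" followed by an unmapped "i".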

Tamil consonant-vowel combination is computed as: combined = consonant_base + vowel_sign, where the vowel sign (combining mark, உயிர் குறி) occupies Unicode codepoints U+0BBE through U+0BCC. A bare consonant with no following vowel receives the pulli (புள்ளி, U+0BCD) to indicate the inherent vowel is suppressed.
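The composition rule can be illustrated with a short sketch. The function name combine and the VOWEL_SIGNS table are hypothetical; the vowel keys follow the reference table on this page, and the inherent vowel a is assumed to add no sign at all.

```python
from typing import Optional

PULLI = "\u0BCD"  # virama that suppresses the inherent vowel

# Vowel key -> combining vowel sign. "a" is inherent in the base consonant,
# so it contributes an empty string.
VOWEL_SIGNS = {
    "a": "",
    "aa": "\u0BBE",
    "i": "\u0BBF",
    "ee": "\u0BC0",
    "u": "\u0BC1",
    "oo": "\u0BC2",
    "e": "\u0BC6",
    "E": "\u0BC7",
    "ai": "\u0BC8",
    "o": "\u0BCA",
    "O": "\u0BCB",
    "au": "\u0BCC",
}


def combine(consonant_base: str, vowel_key: Optional[str]) -> str:
    """Return consonant_base + vowel sign, or consonant_base + pulli if no vowel follows."""
    if vowel_key is None:
        return consonant_base + PULLI
    return consonant_base + VOWEL_SIGNS[vowel_key]


if __name__ == "__main__":
    ka = "\u0B95"
    print(combine(ka, "i"))   # U+0B95 U+0BBF
    print(combine(ka, None))  # U+0B95 U+0BCD
    print(combine(ka, "a"))   # U+0B95 alone
```

The result is always the base consonant followed by at most one combining code point, which is why the reference table only needs to list the base letters.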

Reference Data

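| English Input | Tamil Character | Unicode | Type |
|---|---|---|---|
| a | அ | U+0B85 | Vowel |
| aa / A | ஆ | U+0B86 | Vowel |
| i | இ | U+0B87 | Vowel |
| ee / I | ஈ | U+0B88 | Vowel |
| u | உ | U+0B89 | Vowel |
| oo / U | ஊ | U+0B8A | Vowel |
| e | எ | U+0B8E | Vowel |
| E | ஏ | U+0B8F | Vowel |
| ai | ஐ | U+0B90 | Vowel |
| o | ஒ | U+0B92 | Vowel |
| O | ஓ | U+0B93 | Vowel |
| au | ஔ | U+0B94 | Vowel |
| k | க் | U+0B95 | Consonant |
| ng | ங் | U+0B99 | Consonant |
| ch | ச் | U+0B9A | Consonant |
| nj | ஞ் | U+0B9E | Consonant |
| T | ட் | U+0B9F | Consonant (Retroflex) |
| N | ண் | U+0BA3 | Consonant (Retroflex) |
| th | த் | U+0BA4 | Consonant |
| n | ந் | U+0BA8 | Consonant |
| p | ப் | U+0BAA | Consonant |
| m | ம் | U+0BAE | Consonant |
| y | ய் | U+0BAF | Consonant |
| r | ர் | U+0BB0 | Consonant |
| l | ல் | U+0BB2 | Consonant |
| v | வ் | U+0BB5 | Consonant |
| zh | ழ் | U+0BB4 | Consonant |
| L | ள் | U+0BB3 | Consonant (Retroflex) |
| R | ற் | U+0BB1 | Consonant |
| nn | ன் | U+0BA9 | Consonant |
| s | ஸ் | U+0BB8 | Grantha |
| sh | ஷ் | U+0BB7 | Grantha |
| j | ஜ் | U+0B9C | Grantha |
| h | ஹ் | U+0BB9 | Grantha |
| Sri | ஸ்ரீ | U+0BB8 U+0BCD U+0BB0 U+0BC0 | Grantha Combo |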

Frequently Asked Questions

How do I type retroflex consonants?
Retroflex consonants use uppercase English letters: T for ட, N for ண, and L for ள. Uppercase R follows the same convention and gives ற. The case distinction is critical: lowercase n maps to ந (dental), while uppercase N maps to ண (retroflex).

How does the engine handle ambiguous sequences such as nj?
The engine uses greedy longest-match resolution. For example, typing nj maps to ஞ rather than ந + ஜ, because the two-character sequence nj takes priority over its one-character parts. If you need ந followed by ஜ, insert a separator (period or slash) between them, for example n.j

When is the pulli (்) added?
A consonant followed by another consonant or a space automatically receives the pulli mark (்). For example, typing k followed by a space produces க். More generally, any consonant that is not followed by a vowel sequence is written with the pulli.

Are Grantha letters supported?
Yes. The mapping includes j → ஜ, s → ஸ, sh → ஷ, h → ஹ, and the compound Sri → ஸ்ரீ. These letters are part of the Tamil Unicode block (U+0B9C and U+0BB7 to U+0BB9).

Can I convert a whole passage at once?
Yes. You can either type character-by-character for real-time conversion or paste a full English phonetic text. The engine processes the entire string through the same greedy algorithm, converting each phonetic cluster sequentially. Numbers, punctuation, and unrecognized characters pass through unchanged.

Why does th produce த, and how do I type த followed by ஹ?
The sequence th is a single mapping to த (dental stop). If you need த followed by ஹ separately, use a separator: t.h. The dot acts as a buffer-break signal, forcing the engine to commit t before processing h independently.
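
The buffer-break behavior of the dot can be sketched as follows. This is an illustrative sketch under stated assumptions, not the tool's code: the mini-mapping D, the names SEPARATOR and convert, and the rule that a dot is swallowed only when it sits between two ASCII letters (so that ordinary sentence punctuation still passes through) are all assumptions made here.

```python
# Hypothetical mini-mapping, just large enough to show the nj ambiguity.
D = {"n": "ந்", "j": "ஜ்", "nj": "ஞ்"}
MAX_KEY = max(len(key) for key in D)
SEPARATOR = "."


def is_ascii_letter(ch: str) -> bool:
    return ch.isascii() and ch.isalpha()


def convert(text: str) -> str:
    """Greedy longest-match conversion where a dot between letters breaks the match."""
    out = []
    pos = 0
    while pos < len(text):
        ch = text[pos]
        # Assumed rule: swallow the dot only when it separates two ASCII letters.
        if (ch == SEPARATOR and 0 < pos < len(text) - 1
                and is_ascii_letter(text[pos - 1])
                and is_ascii_letter(text[pos + 1])):
            pos += 1
            continue
        # The dot is not part of any key, so no match can span across it.
        for length in range(min(MAX_KEY, len(text) - pos), 0, -1):
            chunk = text[pos:pos + length]
            if chunk in D:
                out.append(D[chunk])
                pos += length
                break
        else:
            out.append(ch)  # unmapped characters pass through unchanged
            pos += 1
    return "".join(out)


if __name__ == "__main__":
    print(convert("nj"))   # single two-letter match
    print(convert("n.j"))  # two separate one-letter matches, dot removed
```

In this sketch a trailing dot, as in "n.", is kept as punctuation because no letter follows it.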