Use uppercase T, N, L for the retroflex consonants ட, ண, ள and uppercase R for ற. Use a dot (.) to break ambiguous sequences.

About

Tamil script comprises 12 vowels (உயிர் எழுத்துக்கள்), 18 consonants (மெய் எழுத்துக்கள்), 216 consonant-vowel combinations (உயிர்மெய் எழுத்துக்கள்), and the special character āytham (ஃ), totaling 247 characters. Typing Tamil natively requires a specialized keyboard layout unfamiliar to most users trained on QWERTY. An incorrect or ad hoc mapping can produce malformed character sequences that render as broken glyphs across devices. This tool implements a greedy longest-match phonetic mapping algorithm: you type English letter sequences (e.g., ka → க, thi → தி) and receive Tamil Unicode output in real time.
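The character count can be checked by enumeration. The sketch below is illustrative only: the list names are made up for this example, and the code points are taken from the Tamil Unicode block. It builds every consonant-vowel combination as a consonant letter followed by a vowel sign (the vowel a adds no sign), and counts the āytham (ஃ, U+0B83) separately, since it belongs to none of the three groups.

```python
# Code points of the 12 vowels and 18 native consonants (Tamil Unicode block).
VOWELS = [0x0B85, 0x0B86, 0x0B87, 0x0B88, 0x0B89, 0x0B8A,
          0x0B8E, 0x0B8F, 0x0B90, 0x0B92, 0x0B93, 0x0B94]
CONSONANTS = [0x0B95, 0x0B99, 0x0B9A, 0x0B9E, 0x0B9F, 0x0BA3,
              0x0BA4, 0x0BA8, 0x0BAA, 0x0BAE, 0x0BAF, 0x0BB0,
              0x0BB2, 0x0BB5, 0x0BB4, 0x0BB3, 0x0BB1, 0x0BA9]
# Dependent vowel signs in the same order as VOWELS; "a" is inherent, so no sign.
VOWEL_SIGNS = ["", "\u0bbe", "\u0bbf", "\u0bc0", "\u0bc1", "\u0bc2",
               "\u0bc6", "\u0bc7", "\u0bc8", "\u0bca", "\u0bcb", "\u0bcc"]
AYTHAM = "\u0b83"

# Every consonant paired with every vowel sign: 18 x 12 combinations.
combos = [chr(c) + sign for c in CONSONANTS for sign in VOWEL_SIGNS]

# Vowels + consonants + combinations + the single aytham character.
total = len(VOWELS) + len(CONSONANTS) + len(combos) + len(AYTHAM)
print(len(combos), total)
```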

The mapping follows a widely used romanized phonetic convention of the kind found in phonetic Tamil input methods (as opposed to fixed keyboard layouts such as Tamil99). It covers the retroflex consonants (uppercase T, N, L), the Grantha letters (j, s, sh, h), and all 12 vowels with their vowel signs. Output is standard Unicode (range U+0B80 - U+0BFF), supported by modern browsers, operating systems, and databases. Note: this tool performs phonetic approximation. Loanwords from Sanskrit or English that contain sounds with no native Tamil phoneme may require manual adjustment.

Formulas

The transliteration engine uses a greedy longest-match algorithm over an input buffer B. At each keystroke, the algorithm attempts to match the longest prefix of B against the phonetic mapping dictionary D.

\[
\mathrm{transliterate}(B) =
\begin{cases}
D[B] & \text{if } B \in D \\
\mathrm{transliterate}(B[0..n-1]) & \text{otherwise (reduce length)}
\end{cases}
\]

Where B is the current input buffer (sequence of English keystrokes not yet converted), D is the phonetic mapping dictionary containing all valid English-to-Tamil mappings, and n = len(B). The algorithm scans from the full buffer length down to 1, emitting the Tamil character for the first match found and moving the buffer cursor forward by the match length.
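The scan described above can be sketched as follows. This is a minimal illustration, not the tool's actual source: the dictionary D here is a small hypothetical subset of the full mapping, and the function simply emits the mapped string for each match without merging consonants and vowels. Capping the scan at the longest key in D is equivalent to scanning down from the full buffer length, because no longer prefix can match.

```python
# Hypothetical subset of the phonetic mapping dictionary D.
D = {
    "a": "\u0b85",          # அ
    "aa": "\u0b86",         # ஆ
    "ai": "\u0b90",         # ஐ
    "k": "\u0b95\u0bcd",    # க்
    "n": "\u0ba8\u0bcd",    # ந்
    "ng": "\u0b99\u0bcd",   # ங்
    "nj": "\u0b9e\u0bcd",   # ஞ்
    "nn": "\u0ba9\u0bcd",   # ன்
    "th": "\u0ba4\u0bcd",   # த்
}
MAX_KEY = max(len(key) for key in D)

def transliterate(buffer: str) -> str:
    """Greedy longest-match conversion of an input buffer."""
    out = []
    pos = 0
    while pos < len(buffer):
        # Try the longest candidate first, then shrink down to one character.
        for length in range(min(MAX_KEY, len(buffer) - pos), 0, -1):
            chunk = buffer[pos:pos + length]
            if chunk in D:
                out.append(D[chunk])
                pos += length  # move the cursor past the matched keys
                break
        else:
            # Nothing matched: pass the character through unchanged.
            out.append(buffer[pos])
            pos += 1
    return "".join(out)

print(transliterate("nj"))  # two-key entry wins over "n" followed by "j"
```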

A Tamil consonant-vowel combination is computed as: combined = consonant_base + vowel_sign, where the vowel sign (combining mark, உயிர் குறி) occupies Unicode codepoints U+0BBE through U+0BCC. The vowel a adds no sign, because it is already inherent in the consonant base. A bare consonant with no following vowel receives the pulli (புள்ளி, U+0BCD) to indicate that the inherent vowel is suppressed.
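As a worked illustration of this rule, the sketch below concatenates a consonant base with a vowel sign or the pulli. The function name combine and the key spellings in VOWEL_SIGNS are assumptions made for this example (the keys follow the reference table's input column); only the code points come from the Tamil Unicode block.

```python
from typing import Optional

PULLI = "\u0bcd"

# Vowel key -> dependent vowel sign. The vowel "a" is inherent, so it adds nothing.
VOWEL_SIGNS = {
    "a": "", "aa": "\u0bbe", "i": "\u0bbf", "ee": "\u0bc0",
    "u": "\u0bc1", "oo": "\u0bc2", "e": "\u0bc6", "E": "\u0bc7",
    "ai": "\u0bc8", "o": "\u0bca", "O": "\u0bcb", "au": "\u0bcc",
}

def combine(consonant_base: str, vowel: Optional[str]) -> str:
    """Return consonant + vowel sign, or consonant + pulli when no vowel follows."""
    if vowel is None:
        return consonant_base + PULLI
    return consonant_base + VOWEL_SIGNS[vowel]

print(combine("\u0b95", "a"))    # க
print(combine("\u0ba4", "i"))    # தி
print(combine("\u0b95", None))   # க்
```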

Reference Data

English Input	Tamil Character	Unicode	Type
a	அ	U+0B85	Vowel
aa / A	ஆ	U+0B86	Vowel
i	இ	U+0B87	Vowel
ee / I	ஈ	U+0B88	Vowel
u	உ	U+0B89	Vowel
oo / U	ஊ	U+0B8A	Vowel
e	எ	U+0B8E	Vowel
E	ஏ	U+0B8F	Vowel
ai	ஐ	U+0B90	Vowel
o	ஒ	U+0B92	Vowel
O	ஓ	U+0B93	Vowel
au	ஔ	U+0B94	Vowel
k	க்	U+0B95	Consonant
ng	ங்	U+0B99	Consonant
ch	ச்	U+0B9A	Consonant
nj	ஞ்	U+0B9E	Consonant
T	ட்	U+0B9F	Consonant (Retroflex)
N	ண்	U+0BA3	Consonant (Retroflex)
th	த்	U+0BA4	Consonant
n	ந்	U+0BA8	Consonant
p	ப்	U+0BAA	Consonant
m	ம்	U+0BAE	Consonant
y	ய்	U+0BAF	Consonant
r	ர்	U+0BB0	Consonant
l	ல்	U+0BB2	Consonant
v	வ்	U+0BB5	Consonant
zh	ழ்	U+0BB4	Consonant
L	ள்	U+0BB3	Consonant (Retroflex)
R	ற்	U+0BB1	Consonant
nn	ன்	U+0BA9	Consonant
s	ஸ்	U+0BB8	Grantha
sh	ஷ்	U+0BB7	Grantha
j	ஜ்	U+0B9C	Grantha
h	ஹ்	U+0BB9	Grantha
Sri	ஸ்ரீ	U+0BB8 U+0BCD U+0BB0 U+0BC0	Grantha Combo

Frequently Asked Questions

Retroflex consonants, and the hard ற, use uppercase English letters. Type T for ட, N for ண, L for ள, and R for ற. The case distinction is critical: lowercase n maps to ந (dental), while uppercase N maps to ண (retroflex).

The engine uses greedy longest-match resolution. For example, typing nj maps to ஞ rather than ந + ஜ, because nj is itself an entry in the mapping and the longest match always wins. If you need ந followed by ஜ, insert a separator (period or slash) between them: n.j.
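The effect of the separator can be sketched as below. This is an assumption-laden illustration rather than the tool's code: the three-entry dictionary and both function names are hypothetical, and every period or slash is treated as a buffer break and dropped from the output, which a real implementation would probably need to refine for ordinary punctuation.

```python
import re

# Hypothetical three-entry subset of the mapping dictionary.
D = {
    "n": "\u0ba8\u0bcd",   # ந்
    "j": "\u0b9c\u0bcd",   # ஜ்
    "nj": "\u0b9e\u0bcd",  # ஞ்
}

def greedy(segment: str) -> str:
    """Longest-match conversion of one separator-free segment."""
    out, pos = [], 0
    while pos < len(segment):
        for length in (2, 1):  # longest key in this subset is two characters
            chunk = segment[pos:pos + length]
            if chunk in D:
                out.append(D[chunk])
                pos += len(chunk)
                break
        else:
            out.append(segment[pos])  # unmapped character passes through
            pos += 1
    return "".join(out)

def transliterate_with_breaks(text: str) -> str:
    """Split on '.' or '/' so no match can span a separator, then convert."""
    return "".join(greedy(seg) for seg in re.split(r"[./]", text))

print(transliterate_with_breaks("nj"))   # ஞ்
print(transliterate_with_breaks("n.j"))  # ந்ஜ்
```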

A consonant followed by another consonant or a space automatically receives the pulli mark (்). For example, typing k followed by a space produces க். In other words, no extra key is needed: any consonant that is not immediately followed by a vowel sequence is written with pulli.
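The automatic pulli can be sketched as a small engine that looks for a vowel right after each consonant. The tables below are a hypothetical subset of the mapping, and the helper names are invented for this example; the logic is one plausible reading of the rule described here, not the tool's confirmed implementation.

```python
from typing import Optional

PULLI = "\u0bcd"

# Hypothetical subset: consonant key -> base letter (without pulli).
CONSONANTS = {"k": "\u0b95", "th": "\u0ba4", "m": "\u0bae", "zh": "\u0bb4"}
# Vowel key -> (independent vowel letter, dependent vowel sign).
VOWELS = {
    "a": ("\u0b85", ""),
    "aa": ("\u0b86", "\u0bbe"),
    "i": ("\u0b87", "\u0bbf"),
    "u": ("\u0b89", "\u0bc1"),
}

def longest_key(text: str, pos: int, table: dict) -> Optional[str]:
    """Return the longest key of `table` that starts at `pos`, if any."""
    for length in range(min(2, len(text) - pos), 0, -1):
        key = text[pos:pos + length]
        if key in table:
            return key
    return None

def transliterate(text: str) -> str:
    out, pos = [], 0
    while pos < len(text):
        cons = longest_key(text, pos, CONSONANTS)
        if cons is not None:
            pos += len(cons)
            vowel = longest_key(text, pos, VOWELS)
            if vowel is not None:
                # Consonant + vowel: attach the dependent vowel sign.
                out.append(CONSONANTS[cons] + VOWELS[vowel][1])
                pos += len(vowel)
            else:
                # Next key is a consonant, a space, or end of input: add pulli.
                out.append(CONSONANTS[cons] + PULLI)
            continue
        vowel = longest_key(text, pos, VOWELS)
        if vowel is not None:
            out.append(VOWELS[vowel][0])  # standalone vowel letter
            pos += len(vowel)
        else:
            out.append(text[pos])  # unmapped character passes through
            pos += 1
    return "".join(out)

print(transliterate("thamizh"))  # தமிழ்
```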

Grantha letters are supported. The mapping includes j → ஜ, s → ஸ, sh → ஷ, h → ஹ, and the compound Sri → ஸ்ரீ. These letters are encoded in the same Tamil Unicode block as the native consonants (U+0B9C, U+0BB7 - U+0BB9).

Whole passages can be converted at once. You can either type character-by-character for real-time conversion or paste a full English phonetic text. The engine processes the entire string through the same greedy algorithm, converting each phonetic cluster sequentially. Numbers, punctuation, and unrecognized characters pass through unchanged.

The sequence th is a single mapping to த (dental stop). If you need the t and h keystrokes converted separately instead, use a separator: t.h. The dot acts as a buffer-break signal, forcing the engine to commit t before processing h independently.