Category Text Formatting

About

Transliteration errors corrupt data. A single mismatched phonetic rule turns sh into two separate characters instead of the single letter ш, producing gibberish that search engines cannot index and native readers cannot parse. This converter applies a greedy longest-match algorithm to 12 languages, matching multi-letter sequences (such as sh → ш or shch → щ) before falling back to single-character maps. Hebrew output includes nikud vowel pointing using Unicode combining marks in the range U+05B0 - U+05BB. Korean output composes Jamo elements into precomposed Hangul syllable blocks using the standard composition formula, whose block starts at U+AC00.

The tool assumes ISO 9 and BGN/PCGN romanization conventions where applicable, but phonetic input is inherently lossy. Ambiguous sequences default to the most statistically frequent mapping. For example, Russian е and э cannot always be distinguished from Latin e alone. Pro tip: type an apostrophe to insert a soft sign (ь) in Russian, and two apostrophes for a hard sign (ъ). Conversion happens in real time and runs entirely in your browser. No data is transmitted to any server.

Formulas

The converter uses a greedy longest-match algorithm. For an input string S of length n, the algorithm scans from position i = 0 and attempts to match substrings of decreasing length against the mapping table M:

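convert(S): i = 0; while i < n: find the largest k ≤ m (m = length of the longest key in M) with S[i..i+k) ∈ M, emit M[S[i..i+k)], set i = i + k; if no k matches, emit S[i] unchanged and set i = i + 1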
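The scan described above can be sketched in Python as follows. This is a minimal illustration, not the tool's actual source: the name RU_MAP and its contents are a hypothetical, partial Russian table, and mapping plain e to е is an assumption taken from the privet → привет example in the reference data.

```python
# Hypothetical, partial Russian table for illustration only.
# The converter's real mapping table is not shown in this page.
RU_MAP = {
    "shch": "щ", "sh": "ш", "ch": "ч", "zh": "ж", "ts": "ц",
    "yu": "ю", "ya": "я", "yo": "ё",
    "''": "ъ", "'": "ь",
    "a": "а", "b": "б", "v": "в", "g": "г", "d": "д", "e": "е",
    "i": "и", "k": "к", "l": "л", "m": "м", "n": "н", "o": "о",
    "p": "п", "r": "р", "s": "с", "t": "т", "u": "у", "h": "х",
}


def convert(text: str, table: dict) -> str:
    """Greedy longest-match transliteration over a mapping table."""
    max_len = max(len(key) for key in table)  # m, the longest key
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for k in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + k].lower()  # lookup is case-insensitive
            if chunk in table:
                out.append(table[chunk])
                i += k
                break
        else:
            # No key matched: pass the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)
```

With this table, convert("shchi", RU_MAP) returns щи rather than four separate letters, and convert("privet 123!", RU_MAP) returns привет 123! because digits, spaces, and punctuation fall through the else branch.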

Korean Hangul syllable composition follows the Unicode standard formula:

syllable = 0xAC00 + (onset × 21 + nucleus) × 28 + coda

Here onset is the initial consonant index (0 - 18), nucleus is the vowel index (0 - 20), and coda is the final consonant index (0 - 27, where 0 means no coda). The greedy matching algorithm runs in O(n ⋅ m) time, where m is the length of the longest key in the mapping table (typically 4, as in shch). Because m is a small constant, this is effectively linear in n.
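As a worked instance of the composition formula, the short Python sketch below computes a syllable from its three indices. The function name compose_hangul and the constant names are illustrative choices, not identifiers from the tool.

```python
HANGUL_BASE = 0xAC00   # first precomposed syllable
NUCLEUS_COUNT = 21     # number of vowel indices (0-20)
CODA_COUNT = 28        # number of coda indices (0-27, 0 = no coda)


def compose_hangul(onset: int, nucleus: int, coda: int = 0) -> str:
    """Return the precomposed syllable for the given Jamo indices."""
    if not (0 <= onset <= 18 and 0 <= nucleus <= 20 and 0 <= coda <= 27):
        raise ValueError("Jamo index out of range")
    code_point = HANGUL_BASE + (onset * NUCLEUS_COUNT + nucleus) * CODA_COUNT + coda
    return chr(code_point)
```

For example, onset 18 (ㅎ), nucleus 0 (ㅏ), and coda 4 (ㄴ) give 0xAC00 + (18 × 21 + 0) × 28 + 4 = 0xD55C, which is 한.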

Reference Data

Language	Script	Phonetic Input Example	Native Output	Digraphs Supported	Vowel System
Russian	Cyrillic	privet mir	привет мир	sh, ch, shch, zh, ts, yu, ya, yo, je	Inline vowels
Ukrainian	Cyrillic	pryvit svit	привіт світ	sh, ch, shch, zh, ts, yi, ya, yu	Inline vowels
Hebrew	Hebrew + Nikud	shalom	שָׁלוֹם	sh, ch, ts, th	Nikud combining marks
Arabic	Arabic	marhaba	مرحبا	sh, th, dh, gh, kh	Optional diacritics
Greek	Greek	kalimera	καλιμερα	th, ph, ch, ps, ks	Inline vowels
Japanese (Hiragana)	Hiragana	konnichiwa	こんにちわ	sh, ch, ts, fu, n+vowel	CV mora system
Japanese (Katakana)	Katakana	konnichiwa	コンニチワ	sh, ch, ts, fu, n+vowel	CV mora system
Korean	Hangul	annyeonghaseyo	안녕하세요	Jamo onset/coda pairs	Composed syllable blocks
Georgian	Mkhedruli	gamarjoba	გამარჯობა	sh, ch, ts, zh, gh, kh	Inline vowels
Armenian	Armenian	barev dzez	բարև ձեզ	sh, ch, ts, zh, dz	Inline vowels
Thai	Thai	sawatdi	สวัสดี	th, ph, kh, ng	Tone-dependent vowels
Hindi	Devanagari	namaste	नमस्ते	sh, ch, th, dh, bh, ph, kh, gh	Inherent /a/ + matras

Frequently Asked Questions

The algorithm uses greedy longest-match: it always tries the longest possible multi-letter sequence first. As a result, "sh" always maps to a single character (e.g., ш in Russian, שׁ in Hebrew) rather than two separate characters. If you genuinely need "s" + "h" as separate letters, insert a delimiter such as a period or hyphen between them: "s.h" or "s-h".

Hebrew output supports vowel pointing: vowels are rendered as nikud combining marks. Type the vowel immediately after the consonant: "ba" produces בַּ (bet with patach), "bi" produces בִּ (bet with chirik), "bu" produces בּוּ (bet with shuruk). The vowel mapping covers patach (a), chirik (i), shuruk/kubutz (u), segol (e), and cholam (o). Consonants without a following vowel receive no nikud.
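The idea of attaching a combining mark to the preceding consonant can be sketched as below. This is an assumption-laden illustration: the names NIKUD, DAGESH, and point are hypothetical, u is mapped to kubutz only (shuruk, which needs an extra vav, is not modeled), and the rule the tool uses to decide when a dagesh is added is not documented here, so it is left as an explicit flag.

```python
# Vowel letter -> Unicode combining mark (all within U+05B0 - U+05BB).
NIKUD = {
    "a": "\u05B7",  # patach
    "i": "\u05B4",  # chirik
    "e": "\u05B6",  # segol
    "o": "\u05B9",  # cholam
    "u": "\u05BB",  # kubutz (assumption: shuruk is not modeled)
}
DAGESH = "\u05BC"  # dot inside the consonant, outside the vowel range


def point(consonant: str, vowel: str = "", dagesh: bool = False) -> str:
    """Append optional dagesh and vowel mark to a Hebrew consonant."""
    marks = (DAGESH if dagesh else "") + NIKUD.get(vowel, "")
    return consonant + marks
```

Calling point("\u05D1", "a") yields bet followed by patach, and point("\u05D1") returns the bare consonant, matching the rule that a consonant without a following vowel receives no nikud.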

Hangul composition requires a valid onset-nucleus pair at minimum. If the phonetic input provides a consonant without a following vowel (e.g., a trailing 'k'), it cannot form a syllable block and remains as a standalone Jamo character. Ensure each consonant is followed by a vowel sound to produce proper composed blocks. The formula requires at least an onset index and nucleus index to compute a valid syllable at codepoint U+AC00 and above.
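The fallback for a consonant with no vowel can be sketched as follows. The function name block_or_jamo is hypothetical, and emitting a Hangul Compatibility Jamo character (U+3131 and up) for the standalone case is an assumption, since the page does not say which Jamo block the tool uses.

```python
# The 19 onset consonants in Unicode onset-index order, written as
# Hangul Compatibility Jamo (assumed form for standalone output).
ONSET_JAMO = (
    "\u3131\u3132\u3134\u3137\u3138\u3139\u3141\u3142\u3143\u3145"
    "\u3146\u3147\u3148\u3149\u314A\u314B\u314C\u314D\u314E"
)


def block_or_jamo(onset: int, nucleus=None, coda: int = 0) -> str:
    """Compose a syllable block, or fall back to a lone consonant."""
    if nucleus is None:
        # No vowel: a syllable block cannot be formed.
        return ONSET_JAMO[onset]
    return chr(0xAC00 + (onset * 21 + nucleus) * 28 + coda)
```

Here block_or_jamo(15, 0) composes onset ㅋ with vowel ㅏ into 카 (U+CE74), while block_or_jamo(15) returns the standalone ㅋ (U+314B), which is what a trailing 'k' with no vowel would leave behind.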

Input is case-insensitive: all input is normalized to lowercase before lookup, so the original casing has no effect on the output. Target scripts like Cyrillic, Hebrew, Arabic, and Hangul either have their own case systems (handled automatically) or are unicameral (no case distinction).

For Russian, "ye" maps to е when word-initial or after a vowel, while a bare "e" in those positions maps to э. After a consonant, "e" maps to е, as in privet → привет. The letter й maps from "j" or "y" when not followed by a vowel. The digraph "yo" maps to ё, "yu" to ю, "ya" to я. The soft sign ь is typed with an apostrophe, and the hard sign ъ with two apostrophes. These conventions largely follow the BGN/PCGN romanization system.

Unmapped characters pass through unchanged. Numbers, punctuation marks, spaces, and any Unicode characters not in the active mapping table are preserved in their original form in the output. This allows mixed-content text (e.g., 'privet 123!') to convert correctly (привет 123!).