About

System administrators and digital archivists frequently encounter KOI8-R when recovering data from early Unix systems or analyzing pre-2000 Russian internet traffic. This 8-bit character encoding (defined in RFC 1489) assigns the upper 128 bytes to Cyrillic characters and box-drawing symbols. Unlike ISO-8859-5 or Windows-1251, the KOI8-R layout is not alphabetical. The designers arranged Cyrillic letters to map onto phonetically similar ASCII characters if the eighth bit is stripped.

Data corruption often occurs when 8-bit text passes through a 7-bit gateway (such as early email servers). In such cases, the phonetic property of KOI8-R acts as a fail-safe. The text degrades into a readable transliteration rather than random garbage. This tool provides a complete lookup table for debugging encoding errors (mojibake) and verifying byte sequences in hex editors.

Formulas

The core design principle of KOI8-R involves a bitwise relationship between the Cyrillic upper range and the Latin lower range. If the most significant bit (MSB) is dropped, the value correlates to a phonetic equivalent.

$C_{\mathrm{cyr}} \mathbin{\&} \mathtt{0x7F} = C_{\mathrm{lat}}$

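For example, the Cyrillic small letter "а" occupies index 0xC1. Applying the mask yields 0x41, the Latin capital "A". The letter matches phonetically, but its case is inverted.

Let $x = \mathtt{0xC1}$ (Cyrillic 'а')
$x \mathbin{\&} 01111111_2 = \mathtt{0x41}$ (ASCII 'A')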
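
The masking step can be tried on a whole string. The following is a minimal sketch, assuming Python's built-in `koi8_r` codec; the helper name `strip_high_bit` is made up for this illustration and simulates what a 7-bit gateway does to each byte.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Clear the most significant bit of every byte (byte & 0x7F)."""
    return bytes(b & 0x7F for b in data)


# "Привет" is encoded in KOI8-R as the bytes F0 D2 C9 D7 C5 D4.
koi8_bytes = "Привет".encode("koi8_r")

# After masking, every byte falls in the 7-bit ASCII range.
stripped = strip_high_bit(koi8_bytes).decode("ascii")
print(stripped)  # pRIWET
```

The result is still recognizable as "Privet", with the case of each letter flipped and "в" rendered as "W".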

Reference Data

Decimal  Hex   Binary    Char    Description
193      0xC1  11000001  а       CYRILLIC SMALL LETTER A
194      0xC2  11000010  б       CYRILLIC SMALL LETTER BE
225      0xE1  11100001  А       CYRILLIC CAPITAL LETTER A
226      0xE2  11100010  Б       CYRILLIC CAPITAL LETTER BE
128      0x80  10000000  ─       BOX DRAWINGS LIGHT HORIZONTAL
154      0x9A  10011010  U+00A0  NO-BREAK SPACE
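
Rows like these can be regenerated for any byte value. This is a hedged sketch that relies on Python's `koi8_r` codec and the `unicodedata` module for character names; the function name `describe` is an assumption of this example, not part of the tool.

```python
import unicodedata


def describe(byte_value: int) -> tuple:
    """Return (decimal, hex, binary, char, Unicode name) for one KOI8-R byte."""
    char = bytes([byte_value]).decode("koi8_r")
    # Control characters have no Unicode name, so fall back to a placeholder.
    name = unicodedata.name(char, "<unnamed>")
    return (byte_value, f"0x{byte_value:02X}", f"{byte_value:08b}", char, name)


for value in (0xC1, 0xC2, 0xE1, 0xE2, 0x80, 0x9A):
    print(*describe(value), sep="\t")
```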

Frequently Asked Questions

The ordering prioritizes phonetic mapping over alphabetic sorting. Positions 192 through 255 (0xC0-0xFF) align with ASCII characters 64 through 127 (0x40-0x7F). If the eighth bit (value 128) is cleared or lost during transmission, the resulting 7-bit code displays a readable Latin approximation with inverted case (e.g., "Русский" becomes "rUSSKIJ").
KOI8-U is the Ukrainian variation. It is largely identical to KOI8-R but replaces eight box-drawing characters in the 0xA4-0xBF range with the Ukrainian letters Ґ, Є, І, and Ї in both upper and lower case. A system configured for KOI8-R displays these Ukrainian letters as box-drawing characters instead.
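
The difference between the two tables can be listed programmatically. The sketch below assumes Python's built-in `koi8_r` and `koi8_u` codecs follow the published tables; `differing_bytes` is a hypothetical helper name used only here.

```python
def differing_bytes(enc_a: str = "koi8_r", enc_b: str = "koi8_u") -> dict:
    """Map each upper-half byte that decodes differently to its two characters."""
    diffs = {}
    for value in range(0x80, 0x100):
        raw = bytes([value])
        char_a = raw.decode(enc_a)
        char_b = raw.decode(enc_b)
        if char_a != char_b:
            diffs[value] = (char_a, char_b)
    return diffs


for value, (r_char, u_char) in differing_bytes().items():
    print(f"0x{value:02X}  KOI8-R: {r_char}  KOI8-U: {u_char}")
```

Each printed line pairs the box-drawing character that KOI8-R assigns to a byte with the letter KOI8-U assigns to the same byte.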
The mapping effectively inverts the case in many instances. For example, Cyrillic lowercase characters (0xC0-0xDF) map to ASCII uppercase (0x40-0x5F) when the bit is stripped. This is a known side effect of the design choice to align specific phonetic sounds within the limitations of the ASCII table structure.
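
The case inversion can be checked across the whole letter block. This is an illustrative sketch using Python's `koi8_r` codec; the variable names are assumptions of the example. It also collects the letters whose stripped byte is not a Latin letter at all, because 0x40-0x5F and 0x60-0x7F each contain six non-alphabetic positions.

```python
inverted = []      # (cyrillic, latin) pairs whose case is flipped
non_letters = []   # (cyrillic, ascii) pairs where the result is not a letter

for value in range(0xC0, 0x100):
    cyr = bytes([value]).decode("koi8_r")
    lat = chr(value & 0x7F)  # character left after dropping the eighth bit
    if not lat.isalpha():
        non_letters.append((cyr, lat))
    elif cyr.islower() != lat.islower():
        inverted.append((cyr, lat))

print(len(inverted), "letters map to a Latin letter of the opposite case")
print(len(non_letters), "letters map to a non-letter:", non_letters)
```

Under these assumptions, 52 of the 64 letters land on a Latin letter of the opposite case, and the remaining 12 (such as "ю" and "ш") land on punctuation or the DEL control code.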
While UTF-8 has largely replaced single-byte encodings, KOI8-R remains relevant in legacy Russian Unix servers, old IRC logs, mail archives from the 1990s, and certain industrial control systems that were never upgraded to Unicode.
If you open a text file and see Latin characters that read like a case-inverted phonetic transliteration of Russian, you may be viewing KOI8-R text whose eighth bit was stripped. If you instead see Cyrillic letters that form no recognizable words, possibly mixed with box-drawing characters where you expect plain text, the file is probably being decoded with the wrong 8-bit code page. A common mojibake signature is "рТЙЧЕФ", which is how "Привет" (Privet) encoded in KOI8-R appears when decoded as Windows-1251.
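
The wrong-code-page case can be reproduced and reversed in a few lines. This is a minimal sketch, assuming the text was KOI8-R and was decoded as Windows-1251 (`cp1251` in Python); the helper name `redecode` is hypothetical, and the round trip only works if every character survived the wrong decoding.

```python
def redecode(text: str, wrong: str = "cp1251", right: str = "koi8_r") -> str:
    """Undo a wrong decoding by recovering the raw bytes and decoding them again."""
    return text.encode(wrong).decode(right)


original = "Привет"

# Simulate the error: KOI8-R bytes interpreted as Windows-1251.
garbled = original.encode("koi8_r").decode("cp1251")
print(garbled)  # рТЙЧЕФ

# Reverse it: back to the original bytes, then decode with the right table.
repaired = redecode(garbled)
print(repaired)  # Привет
```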