About

ASCII defines 128 characters (code points 0 - 127). ANSI extends this to 256 characters using Windows code pages, where bytes 128 - 255 map to locale-specific glyphs. Confusing code pages corrupts text: a file saved as CP1252 (Western European) and opened as CP1251 (Cyrillic) produces garbled output known as mojibake, and the damage can become permanent once the misread text is saved again. This tool performs real byte-level encoding using complete lookup tables for each Windows code page. It does not guess. It maps every character deterministically and flags anything outside the target code page's repertoire.
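The mojibake effect described above can be reproduced in a few lines. This is a minimal sketch, not the tool's own code; it assumes a runtime whose `TextDecoder` supports the `windows-1252` and `windows-1251` labels (browsers do, and Node.js does when built with full ICU).

```javascript
// Bytes a Windows application would write for "caf\u00e9" under CP1252.
const savedBytes = new Uint8Array([0x63, 0x61, 0x66, 0xe9]);

// Decode the same bytes under two different code pages.
const readAsCp1252 = new TextDecoder("windows-1252").decode(savedBytes);
const readAsCp1251 = new TextDecoder("windows-1251").decode(savedBytes);

console.log(readAsCp1252); // "caf" followed by U+00E9 (e with acute)
console.log(readAsCp1251); // "caf" followed by U+0439 (Cyrillic short i)
```

The ASCII bytes survive either way; only byte 0xE9, in the upper half, changes meaning.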

The converter accepts plain text or uploaded files and produces the exact byte sequence a Windows application would generate for the selected code page. Output is available as hexadecimal, decimal, or binary representation. You can download the raw encoded binary file for integration testing, protocol debugging, or legacy system interoperability. Note: ASCII characters (0 - 127) are identical across all ANSI code pages. Divergence occurs only in the upper half (128 - 255).
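As an illustration of the three output representations, the sketch below renders a byte array as hexadecimal, decimal, or binary text. The function name `formatBytes`, the format keys, and the space separator are assumptions made for this example, not the tool's actual interface.

```javascript
// Hypothetical helper: render bytes in one of three textual formats.
function formatBytes(bytes, format) {
  const render = {
    hex: (b) => b.toString(16).padStart(2, "0"),
    decimal: (b) => String(b),
    binary: (b) => b.toString(2).padStart(8, "0"),
  }[format];
  if (!render) throw new RangeError(`unknown format: ${format}`);
  return Array.from(bytes, render).join(" ");
}

const exampleBytes = new Uint8Array([0x80, 0xe9]);
console.log(formatBytes(exampleBytes, "hex"));     // 80 e9
console.log(formatBytes(exampleBytes, "decimal")); // 128 233
console.log(formatBytes(exampleBytes, "binary"));  // 10000000 11101001
```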

Formulas

The encoding process maps each Unicode code point to a single byte in the target ANSI code page.

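$$
\operatorname{encode}(c) =
\begin{cases}
c & \text{if } 0 \le c \le 127 \quad \text{(ASCII range)} \\
T_{cp}[c] & \text{if } c \in T_{cp} \quad \text{(code page lookup)} \\
\texttt{0x3F} & \text{otherwise} \quad \text{(unmappable} \to \text{'?')}
\end{cases}
$$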

Where c is the Unicode code point of the input character, Tcp is the lookup table for code page cp, and 0x3F is the byte for the replacement character "?". The hex representation converts each byte b to a two-digit hexadecimal string via b.toString(16).padStart(2, "0"). Binary output uses b.toString(2).padStart(8, "0").
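The piecewise definition can be sketched for CP1252 as follows. The names `CP1252_HIGH`, `T_CP1252`, `encodeChar`, and `encode` are hypothetical; the table is written out from the published CP1252 layout, in which bytes 0xA0 - 0xFF equal their Unicode code points and five bytes in 0x80 - 0x9F are undefined.

```javascript
// Unicode code points for CP1252 bytes 0x80-0x9F; null marks the bytes
// CP1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D).
const CP1252_HIGH = [
  0x20ac, null, 0x201a, 0x0192, 0x201e, 0x2026, 0x2020, 0x2021,
  0x02c6, 0x2030, 0x0160, 0x2039, 0x0152, null, 0x017d, null,
  null, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
  0x02dc, 0x2122, 0x0161, 0x203a, 0x0153, null, 0x017e, 0x0178,
];

// Lookup table: Unicode code point -> CP1252 byte.
const T_CP1252 = new Map();
CP1252_HIGH.forEach((cp, i) => {
  if (cp !== null) T_CP1252.set(cp, 0x80 + i);
});
for (let b = 0xa0; b <= 0xff; b++) T_CP1252.set(b, b);

// The three cases of encode(c).
function encodeChar(c, table) {
  if (c >= 0 && c <= 127) return c;           // ASCII range
  if (table.has(c)) return table.get(c);      // code page lookup
  return 0x3f;                                // unmappable -> '?'
}

// Iterate by code point (not UTF-16 code unit) and encode each one.
function encode(text, table) {
  return Uint8Array.from(
    Array.from(text, (ch) => encodeChar(ch.codePointAt(0), table))
  );
}

// Euro sign, n with tilde, and a CJK ideograph that CP1252 cannot represent.
console.log(encode("\u20ac\u00f1\u6f22", T_CP1252)); // bytes 0x80, 0xf1, 0x3f
```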

Reference Data

| Code Page | Name | Region / Language | Unique Range | Notable Characters |
|---|---|---|---|---|
| CP1250 | Windows-1250 | Central European | 128 - 255 | Š, š, Ž, ž, ł, ą |
| CP1251 | Windows-1251 | Cyrillic | 128 - 255 | А - Я, а - я, Ё, ё |
| CP1252 | Windows-1252 | Western European | 128 - 255 | €, ß, ö, ñ, ç |
| CP1253 | Windows-1253 | Greek | 128 - 255 | Α - Ω, α - ω |
| CP1254 | Windows-1254 | Turkish | 128 - 255 | ş, ğ, İ, ı, ç |
| CP1255 | Windows-1255 | Hebrew | 128 - 255 | א - ת, niqqud marks |
| CP1256 | Windows-1256 | Arabic | 128 - 255 | Arabic letters, LRM (U+200E), RLM (U+200F) |
| CP1257 | Windows-1257 | Baltic | 128 - 255 | ā, č, ē, ģ, ķ, ļ, ņ |
| CP1258 | Windows-1258 | Vietnamese | 128 - 255 | ơ, ư, combining tones |
| CP874 | Windows-874 | Thai | 128 - 255 | Thai consonants, vowels, tones |
| ASCII | US-ASCII | Universal | 0 - 127 | Control chars, printable Latin |
Bytes 0x00 - 0x7F are shared across all code pages. Bytes 0x80 - 0x9F in CP1252 contain printable characters (e.g., € at 0x80) where ISO-8859-1 has control codes.

Frequently Asked Questions

Characters without a mapping in the target code page are replaced with byte 0x3F (the "?" character) and flagged in the output with a visual warning indicator. The converter counts unmappable characters and displays the total so you can assess data loss before downloading the encoded file.
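A minimal sketch of the replace-and-count behaviour described above. The function `encodeWithReport` and the shape of its result are assumptions for illustration, and `demoTable` is deliberately partial (two CP1252 entries only), not a complete code page.

```javascript
// Encode `text` with `table` (Map: code point -> byte) and record data loss.
function encodeWithReport(text, table) {
  const bytes = [];
  const unmappable = []; // one { index, char } entry per replaced character
  let index = 0;         // position counted in code points
  for (const ch of text) {
    const c = ch.codePointAt(0);
    if (c <= 127) {
      bytes.push(c);
    } else if (table.has(c)) {
      bytes.push(table.get(c));
    } else {
      bytes.push(0x3f); // '?'
      unmappable.push({ index, char: ch });
    }
    index++;
  }
  return { bytes: Uint8Array.from(bytes), unmappable };
}

// Partial demo table: euro sign and e-acute as CP1252 encodes them.
const demoTable = new Map([[0x20ac, 0x80], [0xe9, 0xe9]]);

// e-acute, euro sign, Cyrillic Zhe (not in the table), exclamation mark.
const report = encodeWithReport("\u00e9\u20ac\u0416!", demoTable);
console.log(report.unmappable.length); // 1
console.log(report.unmappable[0].index); // 2
```

Keeping the positions, rather than only a count, lets a caller highlight exactly which characters were lost.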
ISO-8859-1 assigns C1 control codes (non-printable) to bytes 0x80 - 0x9F. Microsoft's CP1252 repurposed these bytes for printable characters like the Euro sign (€ at 0x80), curly quotes, and em-dashes. This is why HTML pages declared as ISO-8859-1 often actually use CP1252. Browsers compensate silently, but binary tools do not.
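The difference can be seen by decoding the same byte both ways. In this sketch, `CP1252_SPECIALS` lists only a handful of the 0x80 - 0x9F assignments (the ones named above), so `decodeCp1252` is an illustration rather than a complete decoder; both function names are hypothetical.

```javascript
// A few CP1252 assignments in 0x80-0x9F (partial, for illustration only).
const CP1252_SPECIALS = new Map([
  [0x80, "\u20ac"],                   // euro sign
  [0x91, "\u2018"], [0x92, "\u2019"], // curly single quotes
  [0x93, "\u201c"], [0x94, "\u201d"], // curly double quotes
  [0x96, "\u2013"], [0x97, "\u2014"], // en dash, em dash
]);

// ISO-8859-1 maps every byte to the code point with the same value,
// so 0x80-0x9F come out as C1 control characters.
const decodeLatin1 = (byte) => String.fromCharCode(byte);

// Uses the special entry when present; otherwise falls back to the
// ISO-8859-1 value (correct for 0x00-0x7F and 0xA0-0xFF, and a
// simplification for the 0x80-0x9F bytes this partial table omits).
const decodeCp1252 = (byte) =>
  CP1252_SPECIALS.get(byte) ?? String.fromCharCode(byte);

console.log(decodeLatin1(0x80).codePointAt(0).toString(16)); // 80
console.log(decodeCp1252(0x80).codePointAt(0).toString(16)); // 20ac
```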
The converter preserves line endings exactly as entered. Windows ANSI files typically use CR+LF (0x0D 0x0A), Unix uses LF (0x0A), and classic Mac uses CR (0x0D). Since these are all ASCII-range bytes, they pass through unchanged regardless of code page. The hex view shows exactly which line ending bytes are present.
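To check which line endings a byte sequence contains, a scan like the following would do. `countLineEndings` is a hypothetical helper, not part of the tool; it treats CR immediately followed by LF as one CR+LF pair.

```javascript
// Count CR+LF pairs, lone LF bytes, and lone CR bytes.
function countLineEndings(bytes) {
  const counts = { crlf: 0, lf: 0, cr: 0 };
  for (let i = 0; i < bytes.length; i++) {
    if (bytes[i] === 0x0d && bytes[i + 1] === 0x0a) {
      counts.crlf++;
      i++; // skip the LF that belongs to this pair
    } else if (bytes[i] === 0x0a) {
      counts.lf++;
    } else if (bytes[i] === 0x0d) {
      counts.cr++;
    }
  }
  return counts;
}

// "a" CR LF "b" LF "c" CR "d"
const mixed = new Uint8Array([0x61, 0x0d, 0x0a, 0x62, 0x0a, 0x63, 0x0d, 0x64]);
console.log(countLineEndings(mixed)); // { crlf: 1, lf: 1, cr: 1 }
```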
The input field accepts any text your browser can render; the browser holds it internally as UTF-16. The converter reads each Unicode code point and maps it to the corresponding single byte in the selected ANSI code page. A character that UTF-8 stores as a multi-byte sequence, such as 0xC3 0xA9 (é), becomes the single byte 0xE9 in CP1252. Characters beyond the code page's repertoire (e.g., CJK ideographs in CP1252) become 0x3F.
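The byte counts in that example can be checked directly. This sketch uses the standard `TextEncoder` (which always produces UTF-8) and relies on the fact that, for U+00A0 - U+00FF, the CP1252 byte equals the code point; the variable and helper names are illustrative.

```javascript
const sample = "\u00e9"; // e with acute, U+00E9

// UTF-8 needs two bytes for this character.
const utf8Bytes = new TextEncoder().encode(sample);

// In CP1252 the byte for U+00A0-U+00FF equals the code point itself.
const cp1252Bytes = Uint8Array.of(sample.codePointAt(0));

const toHex = (bytes) =>
  Array.from(bytes, (b) => "0x" + b.toString(16).padStart(2, "0")).join(" ");

console.log(toHex(utf8Bytes));   // 0xc3 0xa9
console.log(toHex(cp1252Bytes)); // 0xe9

// JavaScript strings are UTF-16, so a character outside the BMP occupies
// two code units but is still one code point when iterated.
const emoji = "\u{1F600}";
console.log(emoji.length, Array.from(emoji).length); // 2 1
```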
CP1252 (Windows-1252) is the safest default for Western languages. It is the most widely deployed ANSI code page and is the de facto encoding assumed by most legacy Windows applications, older HTTP servers, and email clients in the Americas and Western Europe. For Cyrillic text, use CP1251. For Central European languages with diacritics (Polish, Czech, Hungarian), use CP1250.
The choice of code page does not affect pure ASCII text. Text made up only of code points 0 - 127 produces identical bytes in every ANSI code page, so the conversion is a no-op. Differences only appear when your text contains characters above code point 127, such as accented letters, currency symbols, or typographic punctuation. The tool highlights which characters fall in the extended range.
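A quick way to tell whether a given text is affected is to list its characters above code point 127. The helpers `extendedChars` and `isPureAscii` below are hypothetical names for this sketch.

```javascript
// Return every character whose code point lies above the ASCII range.
function extendedChars(text) {
  return Array.from(text).filter((ch) => ch.codePointAt(0) > 127);
}

// True when the text would encode identically under every ANSI code page.
const isPureAscii = (text) => extendedChars(text).length === 0;

console.log(isPureAscii("Hello, world!\r\n")); // true

// "na" + i with diaeresis + "ve " + euro sign + "5"
console.log(extendedChars("na\u00efve \u20ac5").length); // 2
```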