Clamp ASCII Data
Clamp, filter, or replace characters whose ASCII codes fall outside a specified range. Remove non-printable characters, sanitize text data, and analyze character distributions.
About
Raw text data from legacy systems, sensor logs, or scraped sources frequently contains characters outside the expected ASCII range. Non-printable control codes (ASCII 0 - 31, plus DEL at 127), extended characters above 127, and stray null bytes cause parsing failures in downstream pipelines. A single out-of-range byte can corrupt a CSV import, break a fixed-width record parser, or produce invisible rendering artifacts in terminal output. This tool applies a strict numeric check to each character's code: given bounds lo and hi, every character with a code outside [lo, hi] is either clamped to the nearest bound, replaced with a user-defined substitute, or removed entirely.
The standard printable ASCII range is 32 - 126. Restricting to 48 - 57 isolates digits only. Clamping to 65 - 90 enforces uppercase-only alphabetic data. This tool is a character-level filter that assumes single-byte text. Because codes are read with charCodeAt, characters above 127 are evaluated by their individual code unit value, not by their combined Unicode scalar, so a character stored as a surrogate pair is handled as two separate units. For strict Unicode normalization, a dedicated Unicode tool is required.
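The code-unit behaviour can be checked directly in JavaScript, where charCodeAt returns one UTF-16 code unit per string index. The snippet below is only an illustrative sketch; `codeUnits` is a hypothetical helper name and not part of the tool.

```javascript
// Hypothetical helper: collect the code unit at every index of a string,
// i.e. the value charCodeAt reports for each position.
function codeUnits(text) {
  const units = [];
  for (let i = 0; i < text.length; i++) {
    units.push(text.charCodeAt(i));
  }
  return units;
}

console.log(codeUnits("A"));          // [ 65 ]            inside 32 - 126
console.log(codeUnits("\u00e9"));     // [ 233 ]           one unit, above 126
console.log(codeUnits("\u{1F600}"));  // [ 55357, 56832 ]  surrogate pair, two units
console.log("\u{1F600}".codePointAt(0)); // 128512, the combined Unicode scalar
```

With bounds of 32 - 126, the last character would therefore count as two out-of-range units rather than one character.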
Formulas
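For each character c in the input string with code v = charCodeAt(c), and user-defined bounds [lo, hi], the output character c' depends on the selected mode:

$$
c' =
\begin{cases}
c & \text{if } lo \le v \le hi \\
\mathrm{chr}(lo) & \text{if } v < lo \quad \text{(clamp)} \\
\mathrm{chr}(hi) & \text{if } v > hi \quad \text{(clamp)} \\
r & \text{if } v < lo \lor v > hi \quad \text{(replace)} \\
\text{(empty string)} & \text{if } v < lo \lor v > hi \quad \text{(remove)}
\end{cases}
$$

Where v = integer code of the character, lo = minimum allowed ASCII value, hi = maximum allowed ASCII value, r = user-defined replacement character, chr(n) = String.fromCharCode(n). The modification count M = number of characters where v < lo ∨ v > hi. The modification ratio is

$$
\frac{M}{N} \times 100\%
$$

where N = total character count.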
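The per-character rule and the two statistics can be sketched in a few lines of JavaScript. This is a minimal illustration under assumptions: the function name `clampAscii`, the mode strings, the default replacement "?", and the shape of the returned object are all hypothetical and not taken from the tool itself.

```javascript
// Minimal sketch of the clamp / replace / remove rule.
// All names and the return shape are assumptions made for illustration.
function clampAscii(text, lo, hi, mode = "clamp", replacement = "?") {
  if (lo > hi) throw new RangeError("lo must not exceed hi");
  if (!["clamp", "replace", "remove"].includes(mode)) {
    throw new Error(`unknown mode: ${mode}`);
  }

  let output = "";
  let modified = 0; // M: characters with a code outside [lo, hi]

  for (let i = 0; i < text.length; i++) {
    const v = text.charCodeAt(i);
    if (v >= lo && v <= hi) {
      output += text[i]; // in range: keep unchanged
      continue;
    }
    modified++;
    if (mode === "clamp") {
      output += String.fromCharCode(v < lo ? lo : hi); // nearest bound
    } else if (mode === "replace") {
      output += replacement;
    }
    // mode === "remove": append nothing
  }

  const total = text.length; // N
  const ratio = total === 0 ? 0 : (modified * 100) / total; // percent
  return { output, modified, total, ratio };
}

const sample = "Hi\tthere\u0000!"; // contains TAB (9) and NUL (0)
console.log(clampAscii(sample, 32, 126, "clamp").output);   // "Hi there !"
console.log(clampAscii(sample, 32, 126, "replace").output); // "Hi?there?!"
console.log(clampAscii(sample, 32, 126, "remove").output);  // "Hithere!"
console.log(clampAscii(sample, 32, 126, "clamp").ratio);    // 20
```

In the sample, 2 of 10 characters fall below 32, so M = 2 and the modification ratio is 20%. Note that in clamp mode both control characters become chr(32), a space, because 32 is the nearest bound.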
Reference Data
| Range | Dec | Description | Common Use |
|---|---|---|---|
| NUL - US | 0 - 31 | Control characters | Terminal control, line endings (LF=10, CR=13, TAB=9) |
| SP | 32 | Space | Word separator |
| ! - / | 33 - 47 | Punctuation & symbols | Exclamation, quotes, hash, dollar, percent |
| 0-9 | 48 - 57 | Digits | Numeric data |
| : - @ | 58 - 64 | Symbols | Colon, semicolon, angle brackets, equals, at-sign |
| A - Z | 65 - 90 | Uppercase letters | Identifiers, constants |
| [ - ` | 91 - 96 | Brackets & symbols | Array notation, backslash, caret, underscore, backtick |
| a - z | 97 - 122 | Lowercase letters | Text, variable names |
| { - ~ | 123 - 126 | Braces & symbols | Code blocks, pipe, tilde |
| DEL | 127 | Delete control | Legacy terminal delete |
| Extended | 128 - 255 | Extended 8-bit sets (e.g. Latin-1, CP437) | Accented chars, currency symbols, box drawing |
| Printable | 32 - 126 | All printable ASCII | Standard safe text range |
| Alphanumeric | 48 - 57, 65 - 90, 97 - 122 | Letters and digits only | Identifiers, filenames |
| Whitespace | 9, 10, 13, 32 | Tab, LF, CR, Space | Text formatting |
| Base64 safe | 43, 47 - 57, 61, 65 - 90, 97 - 122 | Base64 character set | Encoded binary data |
| URL safe | 45 - 46, 48 - 57, 65 - 90, 95, 97 - 122, 126 | Unreserved URI chars (RFC 3986) | URL paths, query parameters |
| Filename safe | 32 - 126 excl. \ / : * ? " < > \| | OS-safe filename chars | Cross-platform file naming |
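Several rows in the table (alphanumeric, Base64, URL safe) are made of more than one interval, so a single [lo, hi] pair cannot express them. One possible approach, sketched below, is to store each set as a list of inclusive ranges and keep only the characters that fall in any of them. The names `PRESETS`, `inRanges`, and `keepOnly` are hypothetical; the numeric ranges are transcribed from the table.

```javascript
// Hypothetical presets: each entry is a list of inclusive [lo, hi] code ranges
// copied from the reference table.
const PRESETS = {
  printable: [[32, 126]],
  digits: [[48, 57]],
  alphanumeric: [[48, 57], [65, 90], [97, 122]],
  whitespace: [[9, 10], [13, 13], [32, 32]],
  base64: [[43, 43], [47, 57], [61, 61], [65, 90], [97, 122]],
  urlSafe: [[45, 46], [48, 57], [65, 90], [95, 95], [97, 122], [126, 126]],
};

// True when the code lies inside at least one of the ranges.
function inRanges(code, ranges) {
  return ranges.some(([lo, hi]) => code >= lo && code <= hi);
}

// Remove every character whose code is outside all of the given ranges.
function keepOnly(text, ranges) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    if (inRanges(text.charCodeAt(i), ranges)) out += text[i];
  }
  return out;
}

console.log(keepOnly("user name_01!.txt", PRESETS.alphanumeric)); // "username01txt"
console.log(keepOnly("a b/c~d?e", PRESETS.urlSafe));              // "abc~de"
```

This sketch only removes characters; clamping to a "nearest bound" is ambiguous for a multi-interval set, so that mode is left out here.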