About

Email spam filters use multi-layered heuristic engines that score messages against hundreds of weighted rules. A single misplaced keyword like FREE combined with excessive capitalization can push your message past the default spam threshold of 5.0 on the SpamAssassin scale. Legitimate marketing emails get flagged at rates between 10% and 20%, costing businesses measurable revenue per campaign. This tool runs your text through 150+ pattern detectors covering urgency triggers, phishing signatures, obfuscated characters, and statistical anomalies to produce a composite spam probability $P_{\text{spam}}$.

The analyzer does not connect to any external service. All processing happens in your browser. It approximates the behavior of Bayesian classifiers by using a curated rule dictionary with empirically assigned weights. Limitations: it cannot evaluate sender reputation, SPF/DKIM headers, or IP blacklists. It analyzes content signals only. For production mail campaigns, cross-reference results with tools like Mail-Tester that inspect delivery infrastructure.

Formulas

The composite spam probability is computed by summing weighted category scores and normalizing through a sigmoid function to bound the output between 0 and 100%.

$$S_{\text{raw}} = \sum_{i=1}^{n} w_i \cdot m_i$$

where $w_i$ is the weight of the $i$-th rule and $m_i$ is the match count (typically 0 or 1, but frequency-scaled for repeated triggers).
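The weighted sum can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the three rules use weights from the reference table on this page, but the regular expressions and the `max_count` cap on repeated matches are assumptions made for the example.

```python
import re

# Hypothetical rule set: (name, weight, pattern). Weights come from the
# reference table on this page; the patterns are illustrative assumptions.
RULES = [
    ("URGENCY_PHRASES", 2.5, re.compile(r"act now|limited time|expires today", re.I)),
    ("FREE_OFFERS", 2.0, re.compile(r"free gift|no cost|complimentary", re.I)),
    ("GENERIC_GREETING", 1.2, re.compile(r"dear sir/madam|dear friend", re.I)),
]


def raw_score(text: str, max_count: int = 1) -> float:
    """Sum w_i * m_i over all rules, capping each match count at max_count."""
    total = 0.0
    for _name, weight, pattern in RULES:
        # m_i: number of matches for this rule, capped (assumed scaling scheme)
        matches = min(len(pattern.findall(text)), max_count)
        total += weight * matches
    return total


print(raw_score("Dear Friend, act now for your free gift"))
```

With the default `max_count=1`, each rule contributes its weight at most once. Raising the cap lets repeated triggers count more than once, which is one possible reading of "frequency-scaled".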

$$P_{\text{spam}} = \frac{100}{1 + e^{-k(S_{\text{raw}} - T)}}$$

where k = 0.35 controls the steepness of the curve and T = 8.0 is the midpoint threshold calibrated against SpamAssassin defaults. A raw score of 5.0 yields approximately 26% probability. A score above 15 yields > 90%.
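As a quick check of the figures above, the sigmoid can be evaluated directly. The sketch below uses the stated k and T; the function name `spam_probability` is chosen for this example only.

```python
import math

K = 0.35  # steepness, as stated in the text
T = 8.0   # midpoint threshold, as stated in the text


def spam_probability(s_raw: float, k: float = K, t: float = T) -> float:
    """Map a raw score to a percentage in (0, 100) using a logistic curve."""
    return 100.0 / (1.0 + math.exp(-k * (s_raw - t)))


# Evaluate the curve at the scores mentioned in the text.
for s in (5.0, 8.0, 15.0):
    print(f"S_raw = {s}: {spam_probability(s):.1f}%")
```

Because the curve is symmetric around T, a raw score of exactly 8.0 always maps to 50% regardless of k. Changing k only stretches or compresses the transition zone on either side.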

Additional heuristic signals are computed as ratios:

$$R_{\text{caps}} = \frac{N_{\text{uppercase}}}{N_{\text{alpha}}} \times 100\%$$

where $R_{\text{caps}} > 30\%$ triggers the ALL CAPS penalty. Similarly, $R_{\text{digits}}$ measures numeric character density and $R_{\text{special}}$ measures non-alphanumeric pollution.
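The ratio signals could be computed along these lines. Only the denominator for the caps ratio (alphabetic characters) is given in the text. Using the count of non-whitespace characters as the denominator for the digit and special-character ratios is an assumption of this sketch.

```python
def char_ratios(text: str) -> dict[str, float]:
    """Return caps, digit, and special-character ratios as percentages."""
    alpha = [c for c in text if c.isalpha()]
    non_space = [c for c in text if not c.isspace()]

    upper = sum(1 for c in alpha if c.isupper())
    digits = sum(1 for c in non_space if c.isdigit())
    special = sum(1 for c in non_space if not c.isalnum())

    return {
        # R_caps: uppercase letters over all letters, as defined in the text
        "caps": 100.0 * upper / len(alpha) if alpha else 0.0,
        # R_digits and R_special: denominators are assumed, not from the text
        "digits": 100.0 * digits / len(non_space) if non_space else 0.0,
        "special": 100.0 * special / len(non_space) if non_space else 0.0,
    }


ratios = char_ratios("FREE money!!!")
print(ratios["caps"] > 30.0)  # True when the ALL CAPS penalty would apply
```

The empty-input guards avoid a division by zero for blank or whitespace-only text.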

Reference Data

| Spam Indicator Category | Example Triggers | Typical Weight | SpamAssassin Rule | Risk Level |
|---|---|---|---|---|
| Urgency / Pressure | "Act now", "Limited time", "Expires today" | 2.5 | URGENCY_PHRASES | High |
| Financial Bait | "$$$", "Million dollars", "Wire transfer" | 3.0 | MONEY_PHRASES | Critical |
| Pharmaceutical | "Viagra", "V1@gra", "Pharmacy", "Pills" | 3.5 | DRUGS_ERECTILE | Critical |
| Phishing / Identity | "Verify your account", "SSN", "Password" | 3.8 | PHISHING_PHRASES | Critical |
| Free Offers | "Free gift", "No cost", "Complimentary" | 2.0 | FREE_OFFERS | Medium |
| ALL CAPS Ratio | > 30% uppercase characters | 1.8 | UPPERCASE_50_75 | Medium |
| Excessive Punctuation | "!!!", "???", "$$$", "***" | 1.5 | EXCL_MARKS | Medium |
| Suspicious URLs | IP-based links, URL shorteners, .xyz/.tk | 2.8 | SUSPICIOUS_URL | High |
| Character Obfuscation | "Fr33", "W1n", "C@sh", zero for O | 3.2 | OBFUSCATED_WORDS | Critical |
| Unsubscribe Absence | Marketing email without opt-out | 1.0 | NO_UNSUBSCRIBE | Low |
| Crypto / Investment | "Bitcoin", "NFT", "ROI guaranteed" | 2.5 | CRYPTO_SPAM | High |
| Adult Content | Explicit terms, dating scam phrases | 3.5 | ADULT_CONTENT | Critical |
| Lottery / Prize | "You've won", "Congratulations", "Claim" | 3.0 | LOTTERY_PRIZE | Critical |
| HTML Anomalies | Invisible text, tiny fonts, hidden divs | 2.2 | HTML_TRICKS | High |
| Sender Impersonation | "From: support@", "Dear Customer" | 2.0 | IMPERSONATION | High |
| Emotional Manipulation | "Help me", "Dying wish", "Orphan" | 2.8 | EMOTIONAL_MANIP | High |
| Malware Indicators | ".exe", ".scr", "Download attachment" | 3.5 | MALWARE_ATTACH | Critical |
| Generic Greeting | "Dear Sir/Madam", "Dear Friend" | 1.2 | GENERIC_GREETING | Low |
| Digit Ratio | > 15% digits in text body | 1.3 | HIGH_DIGIT_RATIO | Low |
| Short Body + Link | Message under 20 words with URL | 2.0 | SHORT_BODY_URL | Medium |
| Encoding Tricks | Base64 body, Unicode homoglyphs | 3.0 | ENCODING_TRICKS | Critical |

Frequently Asked Questions

The sigmoid curve with k = 0.35 and threshold T = 8.0 creates a gradual transition zone between scores of 3 and 13. A raw score of 5.0 maps to roughly 26% probability, while 8.0 maps to exactly 50%. This prevents a single medium-weight trigger from causing a dramatic jump. Messages with only 1-2 low-weight matches (like a generic greeting) will score under 15%, which reflects real-world filter behavior where isolated signals rarely cause rejection.
Marketing emails inherently contain spam-correlated patterns: promotional language ("special offer", "limited time"), calls to action ("click here", "buy now"), and HTML formatting. This is expected. Real spam filters offset this with sender reputation and authentication (SPF, DKIM, DMARC), which this content-only tool cannot evaluate. To reduce your score: replace urgency phrases with specific dates, use your company name instead of generic greetings, include a physical address, and ensure an unsubscribe link is mentioned in the text.
Modern filters detect obfuscation through Levenshtein distance matching, regex pattern libraries, and Unicode normalization. This tool checks for common substitutions: digits for letters (0→O, 1→I, 3→E, 4→A, 5→S), symbols (@→A, $→S), and Unicode homoglyphs (Cyrillic а vs Latin a). In practice, obfuscation now increases spam scores rather than decreasing them, because legitimate senders never obfuscate words. The tool assigns obfuscation a weight of 3.2, among the highest.
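One simple way to catch the substitutions listed above is to normalize each token and compare it against a word list. The sketch below takes that approach. The `WATCHED_WORDS` set and the three Cyrillic entries are illustrative assumptions, and it does not attempt Levenshtein matching or full Unicode normalization.

```python
# Character substitutions named in the text, plus a small illustrative
# subset of Cyrillic homoglyphs (not a complete confusables table).
SUBSTITUTIONS = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a", "5": "s",
    "@": "a", "$": "s",
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic e
    "\u043e": "o",  # Cyrillic o
})

# Hypothetical watch list; a real dictionary would be much larger.
WATCHED_WORDS = {"free", "win", "cash", "viagra"}


def find_obfuscated(text: str) -> list[str]:
    """Return words that match the watch list only after de-obfuscation."""
    hits = []
    for token in text.split():
        word = token.strip(".,!?:;\"'")
        raw = word.lower()
        normalized = raw.translate(SUBSTITUTIONS)
        # Flag only words that were changed by normalization, so a plain
        # "free" is left to the ordinary keyword rules.
        if normalized != raw and normalized in WATCHED_WORDS:
            hits.append(word)
    return hits


print(find_obfuscated("Fr33 C@sh, W1n now"))
```

Flagging only tokens that change under normalization keeps the obfuscation penalty separate from ordinary keyword matches, which matches the idea that the disguise itself is the signal.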
The current rule dictionary focuses on English-language spam indicators. Non-English text will score lower because fewer keyword matches fire, but structural heuristics (CAPS ratio, digit density, punctuation excess, URL analysis) remain language-agnostic and will still detect statistical anomalies. For accurate multilingual analysis, supplement results with a service that maintains localized dictionaries.
SpamAssassin's default rejection threshold is a raw score of 5.0, which our sigmoid maps to approximately 26%. Most enterprise filters reject at 5.0-7.0 (26-41% on our scale). Gmail's neural classifier operates differently but empirically rejects content-only signals around our 50-60% range. If your text scores above 40%, review the flagged indicators. Above 70%, the message will almost certainly be filtered by any major provider.
URLs receive compound scoring. A bare domain scores 0. A URL shortened via bit.ly/tinyurl adds 1.5 weight. An IP-address URL (http://192.168.x.x) adds 2.8. Suspicious TLDs (.xyz, .tk, .top, .buzz) add 1.5. Multiple URLs in a short message trigger the SHORT_BODY_URL rule at weight 2.0. These compound because phishing emails typically combine shortened URLs with urgency language, and each signal reinforces the classification independently.
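The compounding described above might look like the following sketch. The weights are the ones quoted in this answer. The URL regex, the shortener domain list, and the use of the reference table's definition of SHORT_BODY_URL (a message under 20 words containing at least one URL) are assumptions of the example.

```python
import re
from urllib.parse import urlparse

SHORTENERS = {"bit.ly", "tinyurl.com"}              # assumed domain list
SUSPICIOUS_TLDS = (".xyz", ".tk", ".top", ".buzz")  # from the text
IP_HOST = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")    # simple IPv4 check
URL_PATTERN = re.compile(r"https?://\S+")           # assumed URL matcher


def url_score(url: str) -> float:
    """Score one URL; each signal adds its weight independently."""
    host = (urlparse(url).hostname or "").lower()
    score = 0.0
    if host in SHORTENERS:
        score += 1.5
    if IP_HOST.match(host):
        score += 2.8
    if host.endswith(SUSPICIOUS_TLDS):
        score += 1.5
    return score


def message_url_score(text: str, short_body_words: int = 20) -> float:
    """Sum per-URL scores and add 2.0 for a short body containing a URL."""
    urls = [u.rstrip(".,;:!?)") for u in URL_PATTERN.findall(text)]
    score = sum(url_score(u) for u in urls)
    if urls and len(text.split()) < short_body_words:
        score += 2.0  # SHORT_BODY_URL, using the reference table's definition
    return score


print(message_url_score("Claim now http://prize.xyz/claim"))
```

Because each check adds to the total rather than replacing it, a short message linking to a suspicious TLD picks up both penalties, which is the compounding behavior the paragraph describes.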