Supports English prose. Each word is colored by grammatical role.

About

Misidentifying parts of speech leads to flawed sentence structure, weak writing, and failed grammar tests. Manual annotation is slow and error-prone beyond a few sentences. This tool classifies every word in your input automatically, using a heuristic dictionary of 1,500+ English words mapped to part-of-speech categories: nouns, verbs, adjectives, adverbs, prepositions, pronouns, conjunctions, articles, and numerals. Each category receives a distinct color, producing an instant visual parse of your text. Classification works by matching each token in isolation, so it does not resolve contextual ambiguity (e.g., "run" as noun vs. verb). Accuracy is approximately 75-85% on standard prose. Statistical NLP taggers reach about 97% but require server-side models.

Formulas

The tool tokenizes input text using a regular expression pattern that separates words from whitespace and punctuation:

tokenize(text) = match(text, /[a-zA-Z'\u2019]+|\d+[\d,.]*|\S|\s+/g)
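
The tokenization step can be sketched in a few lines of JavaScript. This is a minimal illustration that reuses the pattern from the formula; the names TOKEN_PATTERN and tokenize are chosen here for the example and are not taken from the tool's source.

```javascript
// Pattern copied from the formula: word characters (letters and apostrophes),
// numbers with optional separators, any single non-space character,
// or a run of whitespace.
const TOKEN_PATTERN = /[a-zA-Z'\u2019]+|\d+[\d,.]*|\S|\s+/g;

function tokenize(text) {
  // With a global regex, String.prototype.match returns every match,
  // or null when nothing matches (for example, an empty string).
  return text.match(TOKEN_PATTERN) || [];
}

console.log(tokenize("The dog ran 3.5 miles, quickly!"));
// → ["The", " ", "dog", " ", "ran", " ", "3.5", " ", "miles", ",", " ", "quickly", "!"]
```

Whitespace and punctuation come back as their own tokens, so the original text can be rebuilt by joining the array. With the pattern exactly as written, an apostrophe stays inside its word token, so any further splitting of possessives would have to happen in a later step that is not shown here.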

Each token t is classified by dictionary lookup:

classify(t) = lookup(lowercase(t), D)
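
As a rough sketch of the lookup step, assuming the dictionary is held in a Map keyed by lowercase word form. The five entries below are a hypothetical sample for illustration only; the tool's actual 1,500+ entries are not listed on this page.

```javascript
// Hypothetical miniature dictionary D (illustrative entries only).
const D = new Map([
  ["the", "Article"],
  ["dog", "Noun"],
  ["set", "Verb"],      // single tag per word: context is ignored
  ["quickly", "Adverb"],
  ["between", "Preposition"],
]);

function lookup(word, dictionary) {
  // Returns the stored category, or null when the word has no entry.
  return dictionary.has(word) ? dictionary.get(word) : null;
}

function classify(t) {
  // Lowercase first so "The" and "the" hit the same entry.
  return lookup(t.toLowerCase(), D);
}

console.log(classify("The"));  // → "Article"
console.log(classify("Set"));  // → "Verb"
console.log(classify("zyzzyva")); // → null
```

Because each word form maps to exactly one category, the lookup always returns the same tag for a word regardless of the sentence around it.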

Where D is a dictionary mapping lowercase word forms to part-of-speech categories. If no match exists, suffix heuristics apply:

classify(t) =
  Adjective  if t ends in -ful, -less, -ous, -ive, -able, -ible, -al, -ish
  Adverb     if t ends in -ly
  Verb       if t ends in -ing, -ed, -ize, -ise, -ate
  Noun       if t ends in -tion, -ment, -ness, -ity, -ism, -er, -or
  Unknown    otherwise

Here t is the input token and D is the part-of-speech dictionary containing 1,500+ entries. The suffix heuristic layer catches derived and inflected forms that are missing from the base dictionary. Accuracy on standard English prose is approximately 75-85%.
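
The dictionary-then-suffix fallback could look roughly like the following. This is a sketch under assumptions: the names SUFFIX_RULES, classifyBySuffix, and classify are invented for the example, the rules are checked in the order they are listed above (the page does not say how overlapping suffixes are resolved), and the sample dictionary is hypothetical.

```javascript
// Suffix rules in the order they are listed; the first matching group wins.
// That ordering is an assumption made for this sketch.
const SUFFIX_RULES = [
  ["Adjective", ["ful", "less", "ous", "ive", "able", "ible", "al", "ish"]],
  ["Adverb", ["ly"]],
  ["Verb", ["ing", "ed", "ize", "ise", "ate"]],
  ["Noun", ["tion", "ment", "ness", "ity", "ism", "er", "or"]],
];

function classifyBySuffix(t) {
  const word = t.toLowerCase();
  for (const [category, suffixes] of SUFFIX_RULES) {
    if (suffixes.some((suffix) => word.endsWith(suffix))) return category;
  }
  return "Unknown";
}

function classify(t, dictionary) {
  const word = t.toLowerCase();
  // The dictionary takes priority; suffix rules apply only on a miss.
  return dictionary.has(word) ? dictionary.get(word) : classifyBySuffix(word);
}

// Hypothetical two-entry dictionary, for demonstration only.
const sampleDictionary = new Map([["dog", "Noun"], ["well", "Adverb"]]);

for (const w of ["dog", "hopeful", "slowly", "modernize", "happiness", "xyz"]) {
  console.log(w, "->", classify(w, sampleDictionary));
}
```

Pure suffix matching produces false positives: under these rules "proposal" is tagged Adjective because it ends in -al, which is one reason the heuristic layer sits behind the dictionary and not in front of it.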

Reference Data

| Part of Speech | Abbreviation | Color | Role in Sentence | Example Words |
|---|---|---|---|---|
| Noun | NN | #5B9BD5 | Names a person, place, thing, or idea | dog, city, freedom, Alice |
| Verb | VB | #E06666 | Expresses action or state of being | run, is, become, think |
| Adjective | JJ | #93C47D | Modifies a noun | big, red, complex, silent |
| Adverb | RB | #E69138 | Modifies a verb, adjective, or another adverb | quickly, very, never, well |
| Pronoun | PRP | #C27BA0 | Replaces a noun | he, she, it, they, whom |
| Preposition | IN | #8E7CC3 | Shows relationship between elements | in, on, at, between, through |
| Conjunction | CC | #D5A439 | Connects clauses or words | and, but, or, because, although |
| Article | DT | #76A5AF | Determines specificity of noun | a, an, the, this, that |
| Numeral | CD | #6D9EEB | Represents a number or quantity | one, 42, third, million |
| Interjection | UH | #E67399 | Expresses emotion or exclamation | oh, wow, hey, ouch, bravo |
| Unknown | ?? | #B4A7D6 | Unclassified token | rare/domain-specific words |
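
To illustrate how the color column might be applied, here is a minimal sketch that wraps classified tokens in styled spans. The hex values are copied from the table; the function names, the input shape (pairs of token and category), and the use of the CSS color property are assumptions made for this example, since the page does not show the tool's markup.

```javascript
// Category-to-color map, with hex values taken from the reference table.
const COLORS = {
  Noun: "#5B9BD5", Verb: "#E06666", Adjective: "#93C47D", Adverb: "#E69138",
  Pronoun: "#C27BA0", Preposition: "#8E7CC3", Conjunction: "#D5A439",
  Article: "#76A5AF", Numeral: "#6D9EEB", Interjection: "#E67399",
  Unknown: "#B4A7D6",
};

function escapeHtml(s) {
  // Escape characters that would otherwise be read as markup.
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;")
          .replace(/>/g, "&gt;").replace(/"/g, "&quot;");
}

// tagged: array of [token, category] pairs. A null category marks
// whitespace or punctuation, which is emitted without a span.
function colorize(tagged) {
  return tagged.map(([token, category]) => {
    const text = escapeHtml(token);
    const color = category ? COLORS[category] : undefined;
    return color ? `<span style="color:${color}">${text}</span>` : text;
  }).join("");
}

console.log(colorize([["The", "Article"], [" ", null], ["dog", "Noun"]]));
// → <span style="color:#76A5AF">The</span> <span style="color:#5B9BD5">dog</span>
```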

Frequently Asked Questions

Many English words are ambiguous: "run" can be a noun or a verb, and "light" can be a noun, verb, or adjective. This tool uses a static dictionary without sentence-context parsing, so it assigns each word its most common usage. For example, "set" defaults to verb even when used as a noun ("a set of tools"). Statistical NLP models resolve this via context windows but require server-side processing.
Words not found in the 1,500+ entry dictionary are classified by their ending. Suffixes like -tion, -ment, -ness map to noun. Endings like -ing, -ed, -ize map to verb. Suffixes like -ful, -less, -ous, -ive map to adjective. The suffix -ly maps to adverb. If no suffix matches, the word is tagged as Unknown. This catches most derived and inflected forms.
Yes. Common contractions (don't, can't, I'm, they're, it's, won't, shouldn't) are included in the dictionary. Possessive forms like "John's" are split at the apostrophe by the tokenizer: "John" is classified separately from "'s". The apostrophe token itself receives no color.
On standard prose (news, essays, fiction), expect 75-85% accuracy. On technical text with jargon (medical, legal, engineering), accuracy drops to 50-65% because domain terms are absent from the general dictionary. All unknown terms appear in the Unknown/purple category. You can visually identify gaps by the density of purple tokens.
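
The "density of purple tokens" can also be put into a number. The helper below is a hypothetical addition, not a feature of the tool: it assumes you already have one category label per word token and returns the fraction tagged Unknown.

```javascript
// categories: array of category labels, one per word token
// (whitespace and punctuation are assumed to be excluded already).
function unknownShare(categories) {
  if (categories.length === 0) return 0; // avoid dividing by zero
  const unknownCount = categories.filter((c) => c === "Unknown").length;
  return unknownCount / categories.length;
}

console.log(unknownShare(["Noun", "Unknown", "Verb", "Unknown"])); // → 0.5
```

A high share on a given text suggests that much of its vocabulary falls outside the general dictionary, which is the situation described for technical and domain-specific writing.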
Yes. Use the Copy HTML button to get the colored text as styled HTML spans. Paste into rich-text editors (Google Docs, Word) that accept HTML paste. Alternatively, use the Print function, which formats the colored output for A4 paper with a clean legend. The colors are chosen for WCAG AA contrast against white backgrounds.
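
For readers who want to check a color pair themselves, the WCAG 2.x contrast ratio can be computed as sketched below. The formula follows the WCAG definition of relative luminance; the function names are chosen for this example, and the code assumes six-digit hex colors such as "#5B9BD5". WCAG AA asks for at least 4.5:1 for normal-size text and 3:1 for large text.

```javascript
// Convert one 8-bit sRGB channel to its linear-light value (WCAG 2.x formula).
function channelToLinear(value8bit) {
  const c = value8bit / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a six-digit hex color, from 0 (black) to 1 (white).
function relativeLuminance(hex) {
  const n = parseInt(hex.replace("#", ""), 16);
  const r = (n >> 16) & 0xff;
  const g = (n >> 8) & 0xff;
  const b = n & 0xff;
  return 0.2126 * channelToLinear(r)
       + 0.7152 * channelToLinear(g)
       + 0.0722 * channelToLinear(b);
}

// Contrast ratio between two colors, from 1 (identical) to 21 (black on white).
function contrastRatio(hexA, hexB) {
  const la = relativeLuminance(hexA);
  const lb = relativeLuminance(hexB);
  const lighter = Math.max(la, lb);
  const darker = Math.min(la, lb);
  return (lighter + 0.05) / (darker + 0.05);
}

// Example: ratio of a table color against a white background.
console.log(contrastRatio("#5B9BD5", "#FFFFFF").toFixed(2));
```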
Articles (a, an, the) and demonstratives (this, that, these, those) are grouped under the Article/Determiner category (DT), following Penn Treebank conventions. In context, "that" can be a conjunction, pronoun, or determiner. Without parse trees, the tool defaults to the most frequent tag in corpus data, which for "that" is determiner.