Category: Text Analysis

Paste or type your text. Supports English prose; each word is colored by grammatical role.

About

Misidentifying parts of speech leads to flawed sentence structure, weak writing, and failed grammar tests, and manual annotation becomes slow and error-prone beyond a few sentences. This tool classifies every word in your input automatically, using a heuristic dictionary of 1,500+ English words mapped to syntactic categories: nouns, verbs, adjectives, adverbs, prepositions, pronouns, conjunctions, articles, numerals, and interjections. Each category receives a distinct color, producing an instant visual parse of your text.

The classification matches each token in isolation, so it does not resolve contextual ambiguity (e.g., "run" as noun vs. verb). Accuracy is approximately 75-85% on standard prose. Statistical NLP parsers reach about 97% but require server-side models.

Formulas

The tool tokenizes input text with a regular expression that splits it into words, numbers, single punctuation marks, and runs of whitespace:

tokenize(text) → match(text, /[a-zA-Z'\u2019]+|\d+[\d,.]*|\S|\s+/g)
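
As a rough illustration, the pattern above can be ported to Python's `re` module. This is a sketch rather than the tool's own code: the names `TOKEN_PATTERN` and `tokenize` are chosen here for the example, and the tool itself presumably runs in the browser.

```python
import re

# Port of the pattern shown above: words (with straight or curly
# apostrophes), numbers with optional separators, single non-space
# characters, and runs of whitespace.
TOKEN_PATTERN = re.compile(r"[a-zA-Z'\u2019]+|\d+[\d,.]*|\S|\s+")


def tokenize(text):
    """Split text into word, number, punctuation, and whitespace tokens."""
    return TOKEN_PATTERN.findall(text)


print(tokenize("Don't run 1,500 laps!"))
# ["Don't", ' ', 'run', ' ', '1,500', ' ', 'laps', '!']
```

Because the last two alternatives match any remaining character, joining the tokens reproduces the input exactly, which is what lets the colored output keep the original spacing.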

Each token t is classified by dictionary lookup:

classify(t) = lookup(lowercase(t), D)

Where D is a dictionary mapping lowercase word forms to part-of-speech categories. If no match exists, suffix heuristics apply:

Adjective if t ends in -ful, -less, -ous, -ive, -able, -ible, -al, -ish
Adverb if t ends in -ly
Verb if t ends in -ing, -ed, -ize, -ise, -ate
Noun if t ends in -tion, -ment, -ness, -ity, -ism, -er, -or
Unknown otherwise

Here t is the input token and D is the part-of-speech dictionary of 1,500+ entries. The suffix heuristic layer catches derived and inflected forms that are missing from the base dictionary. Estimated accuracy on standard English prose is 75-85%.
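
The two-stage classification described above (dictionary lookup, then suffix fallback) might look like the following Python sketch. The four-entry dictionary `D` is a toy stand-in, since the tool's actual word list is not shown here, and the rule order is simply the order in which the suffixes are listed.

```python
# Toy stand-in for the tool's dictionary D (illustrative entries only).
D = {"the": "Article", "dog": "Noun", "run": "Verb", "quickly": "Adverb"}

# Suffix rules from the list above, checked in the order given there.
SUFFIX_RULES = [
    ("Adjective", ("ful", "less", "ous", "ive", "able", "ible", "al", "ish")),
    ("Adverb", ("ly",)),
    ("Verb", ("ing", "ed", "ize", "ise", "ate")),
    ("Noun", ("tion", "ment", "ness", "ity", "ism", "er", "or")),
]


def classify(token):
    """Dictionary lookup first, suffix heuristics second, else Unknown."""
    word = token.lower()
    if word in D:
        return D[word]
    for category, suffixes in SUFFIX_RULES:
        if word.endswith(suffixes):  # str.endswith accepts a tuple
            return category
    return "Unknown"


for w in ["The", "hopeful", "walking", "kindness", "family", "xyz"]:
    print(w, classify(w))
```

With this toy dictionary, "family" falls through to the -ly rule and comes out as Adverb, which shows the kind of error the suffix layer introduces when a word is missing from the dictionary.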

Reference Data

Part of Speech	Abbreviation	Color	Role in Sentence	Example Words
Noun	NN	#5B9BD5	Names a person, place, thing, or idea	dog, city, freedom, Alice
Verb	VB	#E06666	Expresses action or state of being	run, is, become, think
Adjective	JJ	#93C47D	Modifies a noun	big, red, complex, silent
Adverb	RB	#E69138	Modifies a verb, adjective, or another adverb	quickly, very, never, well
Pronoun	PRP	#C27BA0	Replaces a noun	he, she, it, they, whom
Preposition	IN	#8E7CC3	Shows relationship between elements	in, on, at, between, through
Conjunction	CC	#D5A439	Connects clauses or words	and, but, or, because, although
Article	DT	#76A5AF	Determines specificity of noun	a, an, the, this, that
Numeral	CD	#6D9EEB	Represents a number or quantity	one, 42, third, million
Interjection	UH	#E67399	Expresses emotion or exclamation	oh, wow, hey, ouch, bravo
Unknown	??	#B4A7D6	Unclassified token	rare/domain-specific words

Frequently Asked Questions

Why are some words tagged with the wrong part of speech?

Many English words are ambiguous: "run" can be a noun or a verb, and "light" can be a noun, verb, or adjective. This tool uses a static dictionary without sentence-context parsing, so it assigns each word its most common usage. For example, "set" defaults to verb even when used as a noun ("a set of tools"). Statistical NLP models resolve such cases using context windows but require server-side processing.
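
To make the limitation concrete, here is a minimal Python sketch of context-free lookup. The dictionary `D` and the `tag` helper are hypothetical stand-ins, not the tool's code; the point is only that a static mapping returns the same tag for "set" in both sentences.

```python
# Hypothetical dictionary fragment: one fixed tag per word form.
D = {
    "a": "Article", "the": "Article", "of": "Preposition",
    "they": "Pronoun", "set": "Verb", "tools": "Noun", "table": "Noun",
}


def tag(sentence):
    """Tag whitespace-separated words by static lookup, ignoring context."""
    return [(word, D.get(word.lower(), "Unknown")) for word in sentence.split()]


print(tag("a set of tools"))      # "set" is a noun here, but is tagged Verb
print(tag("they set the table"))  # "set" is a verb here, and is tagged Verb
```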

How are words outside the dictionary classified?

Words not found in the dictionary of 1,500+ entries are classified by their ending. The suffixes -tion, -ment, and -ness map to noun; -ing, -ed, and -ize map to verb; -ful, -less, -ous, and -ive map to adjective; and -ly maps to adverb (the full suffix list appears under Formulas). If no suffix matches, the word is tagged as Unknown. This catches most derived and inflected forms.

Does the tool handle contractions and possessives?

Yes. Common contractions (don't, can't, I'm, they're, it's, won't, shouldn't) are included in the dictionary. Possessive forms like "John's" are split at the apostrophe, so "John" is classified separately from "'s". The apostrophe token itself receives no color.
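
The page does not show how the possessive split is implemented. The tokenizer pattern under Formulas keeps apostrophes inside word tokens, so one plausible arrangement is a step after tokenization that leaves dictionary contractions whole and splits a trailing "'s" from everything else. The sketch below assumes that arrangement; `CONTRACTIONS`, `POSSESSIVE`, and `split_possessive` are hypothetical names.

```python
import re

# Contractions the FAQ says are stored whole in the dictionary.
CONTRACTIONS = {"don't", "can't", "i'm", "they're", "it's", "won't", "shouldn't"}

# A stem followed by a straight or curly apostrophe and "s".
POSSESSIVE = re.compile(r"^(.+?)(['\u2019]s)$", re.IGNORECASE)


def split_possessive(token):
    """Keep dictionary contractions whole; otherwise split a trailing 's."""
    normalized = token.lower().replace("\u2019", "'")
    if normalized in CONTRACTIONS:
        return [token]
    match = POSSESSIVE.match(token)
    if match:
        return [match.group(1), match.group(2)]
    return [token]


print(split_possessive("John's"))  # ['John', "'s"]
print(split_possessive("it's"))    # ["it's"]
```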

How accurate is the classification?

On standard prose (news, essays, fiction), expect 75-85% accuracy. On technical text with jargon (medical, legal, engineering), accuracy drops to 50-65% because domain terms are absent from the general dictionary. Terms that match neither the dictionary nor a suffix rule appear in the Unknown (purple) category, so the density of purple tokens shows you where the gaps are.
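
If you want a number rather than a visual impression, the share of Unknown tags can be computed directly. The helper below is a hypothetical sketch that takes one tag per word token. It measures dictionary coverage, not accuracy, since a word can also be tagged confidently and still be wrong.

```python
def unknown_share(tags):
    """Fraction of word tokens tagged Unknown (0.0 for empty input)."""
    if not tags:
        return 0.0
    return sum(1 for tag in tags if tag == "Unknown") / len(tags)


print(unknown_share(["Noun", "Unknown", "Verb", "Unknown"]))  # 0.5
```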

Can I export or print the colored text?

Yes. Use the Copy HTML button to get the colored text as styled HTML spans, then paste it into a rich-text editor that accepts HTML (Google Docs, Word). Alternatively, use the Print function, which formats the colored output for A4 paper with a clean legend. The colors are chosen to meet WCAG AA contrast against white backgrounds.
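
For a sense of what "styled HTML spans" could look like, here is a short Python sketch that renders (token, category) pairs using colors from the Reference Data table. The exact markup the Copy HTML button produces is not documented here, so the inline `color` style and the `to_html` helper are assumptions made for illustration.

```python
from html import escape

# Colors copied from the Reference Data table (subset for brevity).
COLORS = {
    "Noun": "#5B9BD5",
    "Verb": "#E06666",
    "Article": "#76A5AF",
    "Unknown": "#B4A7D6",
}


def to_html(tagged_tokens):
    """Render (token, category) pairs as spans; uncolored tokens pass through."""
    parts = []
    for token, category in tagged_tokens:
        text = escape(token)  # guard against <, >, & in the input
        color = COLORS.get(category)
        if color is None:
            parts.append(text)
        else:
            parts.append(f'<span style="color:{color}">{text}</span>')
    return "".join(parts)


print(to_html([("The", "Article"), (" ", None), ("dog", "Noun")]))
# <span style="color:#76A5AF">The</span> <span style="color:#5B9BD5">dog</span>
```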

How are determiners and demonstratives classified?

Articles (a, an, the) and demonstratives (this, that, these, those) are grouped under the Article/Determiner category (DT), following Penn Treebank conventions. In context, "that" can be a conjunction, pronoun, or determiner. Without parse trees, the tool defaults to the most frequent tag in corpus data, which for "that" is determiner.