About

Alphabetical sorting follows the Unicode Collation Algorithm (UCA), which defines ordering rules for roughly 150,000 characters across scripts. A naive sort using raw code points produces incorrect results for accented characters: “éclair” would sort after “zebra”, and “résumé” would land after “rye” instead of next to “resume”. This tool uses Intl.Collator with locale-aware comparison, handling diacritics, ligatures, and case differences according to CLDR (Common Locale Data Repository) rules. Mis-sorted reference lists, bibliographies, or data imports can cause lookup failures and compliance issues in regulated industries (legal filings, pharmacopeias, ISO 9001 document control).
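To illustrate the difference between code-point ordering and locale-aware ordering, here is a minimal sketch. The word list is a hypothetical sample, and the snippet is an illustration rather than the tool's actual source.

```javascript
// Hypothetical sample list (not taken from the tool).
const words = ["zebra", "éclair", "apple", "resume", "résumé"];

// Default Array.prototype.sort compares UTF-16 code units,
// so "é" (U+00E9) lands after "z" (U+007A).
const byCodeUnit = [...words].sort();

// Locale-aware comparison through the ECMAScript Internationalization API.
const collator = new Intl.Collator("en");
const byLocale = [...words].sort(collator.compare);

console.log(byCodeUnit); // apple, resume, résumé, zebra, éclair
console.log(byLocale);   // apple, éclair, resume, résumé, zebra
```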

The tool supports natural numeric collation, so “item2” sorts before “item10” rather than after it. Duplicate detection uses Unicode NFC normalization to catch visually identical strings composed from different code points. Note that sort order is locale-dependent: German phonebook ordering sorts “ö” as “oe” (standard German dictionary ordering treats it as a variant of “o”), while Swedish treats it as a separate letter after “z”. This tool defaults to the English (en) locale, so select the matching locale for text in other languages and adjust expectations for non-Latin scripts accordingly.
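As a small illustration of why NFC normalization matters for duplicate detection, the sketch below compares two spellings of the same visible word. The sample strings are assumptions chosen for the example.

```javascript
// Two spellings of "café": precomposed é (U+00E9)
// versus "e" followed by a combining acute accent (U+0301).
const composed = "caf\u00e9";
const decomposed = "cafe\u0301";

// The raw strings differ in both content and length.
console.log(composed === decomposed);            // false
console.log(composed.length, decomposed.length); // 4 5

// After NFC normalization they are identical.
const sameAfterNfc = composed.normalize("NFC") === decomposed.normalize("NFC");
console.log(sameAfterNfc); // true
```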

Formulas

The core comparison function delegates to the ECMAScript Internationalization API:

result = Intl.Collator(locale, { sensitivity, numeric }).compare(a, b)

Where result < 0 means a precedes b, result > 0 means b precedes a, and result = 0 means they are collation-equivalent.
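The sign convention can be checked directly. This is a minimal sketch with hypothetical sample strings. Only the sign of the return value is specified, so the example applies Math.sign rather than relying on a particular magnitude.

```javascript
const collator = new Intl.Collator("en", { sensitivity: "variant", numeric: false });

// Negative: the first argument precedes the second.
const before = Math.sign(collator.compare("apple", "banana"));
// Positive: the second argument precedes the first.
const after = Math.sign(collator.compare("banana", "apple"));
// Zero: the strings are collation-equivalent.
const same = collator.compare("apple", "apple");

console.log(before, after, same); // -1 1 0
```

Because compare is bound to its collator, it can be passed straight to Array.prototype.sort, as in list.sort(collator.compare).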

The sensitivity parameter controls case and accent handling:

base → a = A = á
accent → a = A ≠ á
case → a ≠ A, a = á
variant → a ≠ A ≠ á
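The four levels above can be reproduced with a short sketch. The helper name equalAt is hypothetical and exists only for this illustration.

```javascript
// Hypothetical helper: true when the collator treats a and b as equivalent.
function equalAt(sensitivity, a, b) {
  return new Intl.Collator("en", { sensitivity }).compare(a, b) === 0;
}

const A_ACUTE = "\u00e1"; // "á"
const results = {};
for (const level of ["base", "accent", "case", "variant"]) {
  results[level] = {
    caseEqual: equalAt(level, "a", "A"),       // does a = A?
    accentEqual: equalAt(level, "a", A_ACUTE), // does a = á?
  };
}

console.log(results);
// base:    caseEqual true,  accentEqual true
// accent:  caseEqual true,  accentEqual false
// case:    caseEqual false, accentEqual true
// variant: caseEqual false, accentEqual false
```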

For duplicate detection, each line is normalized: NFC(trim(line)), then compared via a Set using the active sensitivity level. Article stripping applies the regex pattern /^(a|an|the)\s+/i before comparison but preserves the original text in output.
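One way the duplicate-detection step could work is sketched below. The function name dedupe and its parameters are hypothetical, and the linear scan is an assumption made for clarity: a plain Set compares exact string values, so this sketch asks the collator whether two normalized lines are equivalent instead.

```javascript
// Hypothetical helper: keeps the first occurrence of each collation-equivalent line.
function dedupe(lines, locale = "en", sensitivity = "base") {
  const collator = new Intl.Collator(locale, { sensitivity });
  const kept = [];
  for (const line of lines) {
    // NFC(trim(line)), as described above.
    const key = line.trim().normalize("NFC");
    // O(n^2) scan: simple, but slow for very large lists.
    const seen = kept.some((k) => collator.compare(k, key) === 0);
    if (!seen) kept.push(key);
  }
  return kept;
}

const sample = ["Apple", " apple ", "APPLE", "Banana"];
console.log(dedupe(sample, "en", "base"));    // ["Apple", "Banana"]
console.log(dedupe(sample, "en", "variant")); // ["Apple", "apple", "APPLE", "Banana"]
```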

Where locale = BCP 47 language tag (e.g., en, de, sv), numeric = true enables natural number sorting, and a, b = the two strings being compared.

Reference Data

Feature	Description	Use Case
A → Z Sort	Standard ascending lexicographic order per locale	Glossaries, directories, indexes
Z → A Sort	Descending reverse alphabetical order	Reverse lookups, priority lists
Case Insensitive	Treats a = A during comparison	Mixed-case data normalization
Case Sensitive	Treats a ≠ A during comparison; uppercase-first ordering corresponds to the collator option caseFirst: "upper" (the CLDR default places lowercase first)	Programming identifiers, CSV headers
Natural Numeric	item2 < item10 (not string order)	File names, version numbers
Remove Duplicates	Eliminates identical lines after NFC normalization	Deduplicating mailing lists, tags
Trim Whitespace	Strips leading/trailing spaces per line	Pasted data from spreadsheets
Ignore Articles	Skips “A”, “An”, “The” at line start for sorting	Book titles, movie lists, bibliographies
Custom Separator	Split by newline, comma, semicolon, or tab	CSV fields, inline lists
Ignore Blank Lines	Filters out empty lines before sorting	Pasted multi-paragraph text
Number Lines	Prepends 1. 2. etc. to sorted output	Ordered/numbered lists
Locale: en	English collation (CLDR). “é” groups near “e”	Default for English text
Locale: de	German collation. “ä” sorts as a variant of “a”; phonebook order (de-u-co-phonebk) sorts it as “ae”	German-language lists
Locale: sv	Swedish collation. “ö” after “z”	Scandinavian text
Locale: fr	French collation. Accent-sensitive ordering	French-language bibliographies
Locale: es	Spanish collation. “ñ” between “n” and “o”	Spanish directories
Export as .txt	Downloads sorted result as plain text file	Offline use, archiving
Copy to Clipboard	One-click copy of sorted output	Pasting into documents
Import from File	Load a .txt file directly into the input	Batch processing large lists
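Several of the features in the table can be combined in a few lines. The following is a rough sketch only: the function sortList and its option names are hypothetical and are not the tool's actual API.

```javascript
// Hypothetical pipeline: split, trim, drop blanks, sort, optionally reverse and number.
function sortList(input, { separator = "\n", descending = false, numberLines = false } = {}) {
  const collator = new Intl.Collator("en", { numeric: true });
  const items = input
    .split(separator)               // Custom Separator
    .map((s) => s.trim())           // Trim Whitespace
    .filter((s) => s.length > 0);   // Ignore Blank Lines
  items.sort(collator.compare);     // A → Z
  if (descending) items.reverse();  // Z → A
  // Number Lines: prepend "1. ", "2. ", and so on.
  return numberLines ? items.map((s, i) => `${i + 1}. ${s}`) : items;
}

console.log(sortList("pear, apple,, banana", { separator: ",", numberLines: true }));
// ["1. apple", "2. banana", "3. pear"]
console.log(sortList("pear\napple\nbanana", { descending: true }));
// ["pear", "banana", "apple"]
```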

Frequently Asked Questions

With sensitivity set to "variant" (case-sensitive mode), accented characters are treated as distinct from their base letters. The collator places "résumé" near "r" but after "resume" because the acute accent adds a secondary-level sorting weight. Switch to case-insensitive mode (sensitivity: "base") to treat them as equivalent for sorting purposes.

An Intl.Collator created with numeric: true compares embedded digit sequences by their numeric value rather than character by character. So "file2.txt" (2) sorts before "file10.txt" (10), whereas pure lexicographic sorting would place "file10.txt" before "file2.txt" because the character "1" precedes "2".
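The effect of the numeric option can be seen side by side in this short sketch, which uses hypothetical file names.

```javascript
const files = ["file10.txt", "file2.txt", "file1.txt"];

// Without numeric: digits are compared character by character.
const lexicographic = [...files].sort(new Intl.Collator("en").compare);

// With numeric: digit runs are compared by value.
const natural = [...files].sort(new Intl.Collator("en", { numeric: true }).compare);

console.log(lexicographic); // ["file1.txt", "file10.txt", "file2.txt"]
console.log(natural);       // ["file1.txt", "file2.txt", "file10.txt"]
```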

The "Trim Whitespace" option normalizes each line by stripping leading and trailing spaces and tabs before sorting. If "Remove Duplicates" is also enabled, lines that become identical after trimming are collapsed to a single instance. The first occurrence is preserved. Internal whitespace (spaces within the line) is not modified.

No. Article stripping ("A", "An", "The") is applied only during the comparison phase. The output preserves the original line text, including articles. For example, "The Great Gatsby" sorts under "G" but displays as "The Great Gatsby" in the result.
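A minimal sketch of this behavior follows, using the regex pattern given in the Formulas section. The function name and the sample titles are hypothetical.

```javascript
const ARTICLE = /^(a|an|the)\s+/i;
const collator = new Intl.Collator("en");

// Hypothetical helper: articles are removed only inside the comparator,
// so the returned strings are the untouched originals.
function sortIgnoringArticles(titles) {
  return [...titles].sort((x, y) =>
    collator.compare(x.replace(ARTICLE, ""), y.replace(ARTICLE, ""))
  );
}

const sortedTitles = sortIgnoringArticles([
  "The Great Gatsby",
  "A Farewell to Arms",
  "Emma",
  "An Ideal Husband",
]);
console.log(sortedTitles);
// ["Emma", "A Farewell to Arms", "The Great Gatsby", "An Ideal Husband"]
```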

Each locale defines its own collation rules per the CLDR standard. In German (de), the standard dictionary order treats "ö" as a variant of "o", while the phonebook variant (de-u-co-phonebk) sorts it as "oe"; either way it sorts near "o". In Swedish (sv), "ö" is a distinct letter that sorts after "z". Spanish (es) places "ñ" between "n" and "o". Selecting the wrong locale produces technically valid but culturally incorrect ordering. Default English (en) treats most accented characters as variants of their base letter.
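The locale effect can be observed with a short sketch. The word list is a hypothetical sample, and the output assumes a runtime with full ICU locale data (the default in current browsers and Node.js). The phonebook line uses the BCP 47 tag de-u-co-phonebk and is shown without an expected result.

```javascript
// Hypothetical sample list containing one word that starts with "ö".
const words = ["zebra", "öl", "ost", "ofen"];
const sortWith = (locale) => [...words].sort(new Intl.Collator(locale).compare);

const german = sortWith("de");   // "ö" ordered alongside "o"
const swedish = sortWith("sv");  // "ö" ordered after "z"

console.log(german);  // ["ofen", "öl", "ost", "zebra"]
console.log(swedish); // ["ofen", "ost", "zebra", "öl"]

// German phonebook variant, where "ö" is treated like "oe".
console.log(sortWith("de-u-co-phonebk"));
```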

In V8 (Chrome, Edge, Node.js), Array.prototype.sort is implemented with TimSort (O(n log n)), and Intl.Collator is optimized in modern engines. Lists of up to 100,000 lines sort in under 500 ms on typical hardware. Beyond 250,000 lines, you may notice a delay of 1 to 3 seconds. The tool shows a loading indicator for any sort operation exceeding 200 ms.