About

Miscounting lines in configuration files, CSV datasets, or code patches leads to off-by-one errors that cascade into broken parsers and corrupted imports. The problem is subtle because operating systems encode line endings differently: Unix uses LF (0x0A), Windows uses CRLF (0x0D 0x0A), and legacy Mac systems use bare CR (0x0D). A naive count of \n characters misses every line terminated by a bare CR. This tool treats all three conventions as line breaks when splitting, then reports total lines, non-empty lines, blank lines, and per-line length statistics. It also addresses the trailing-newline edge case, where most editors do not show a final newline as an additional empty line. Note: the tool treats the input literally, so a single character with no line breaks is one line, and an empty input is zero lines.

Formulas

The line count is derived by splitting the input string S on a regular expression that matches the three common line break sequences (CRLF, CR, and LF):

lines = S.split(/\r\n|\r|\n/)

Total line count N equals the length of the resulting array:

N = |lines|
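
As a minimal sketch of the split-and-count step, the snippet below applies the pattern above to a string that mixes all three endings. The helper name `splitLines` and the sample text are illustrative assumptions, not part of the tool's published code.

```javascript
// Pattern from the formula above. CRLF is listed first so that a
// "\r\n" pair is consumed as a single line break.
const LINE_BREAK = /\r\n|\r|\n/;

// Hypothetical helper: returns the array of lines for an input string.
function splitLines(text) {
  return text.split(LINE_BREAK);
}

// Sample input mixing CRLF, bare CR, and LF (illustrative only).
const sample = "alpha\r\nbeta\rgamma\ndelta";
const lines = splitLines(sample);

console.log(lines);        // [ 'alpha', 'beta', 'gamma', 'delta' ]
console.log(lines.length); // 4
```

Note that `"".split(LINE_BREAK)` returns `[""]`, an array of length 1, so reporting zero lines for empty input would need an explicit check before the split. That check is not shown in the formulas here.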

Non-empty line count N_ne filters lines where the trimmed length exceeds zero:

$$N_{ne} = \sum_{i=1}^{N} \begin{cases} 1 & \text{if } \operatorname{trim}(line_i).\text{length} > 0 \\ 0 & \text{otherwise} \end{cases}$$
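
The non-empty count can be sketched as a filter over the split result. The function name `countNonEmpty` is a hypothetical label for illustration.

```javascript
// Hypothetical helper: counts lines whose trimmed length exceeds zero.
function countNonEmpty(lines) {
  return lines.filter((line) => line.trim().length > 0).length;
}

// Four lines: "first", "", "   " (spaces only), "second".
const lines = "first\n\n   \nsecond".split(/\r\n|\r|\n/);

console.log(lines.length);         // 4
console.log(countNonEmpty(lines)); // 2
```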

Average line length L is computed as:

$$L = \frac{\sum_{i=1}^{N} \operatorname{len}(line_i)}{N}$$

Where S = input string, N = total number of lines, line_i = the i-th line after splitting, len() = character count of a line, trim() = removal of leading and trailing whitespace.
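
A worked example of the average, under the definitions above: the line break characters themselves are not part of any line, so they do not contribute to len(). The function name `averageLineLength` and the zero-length guard are assumptions added for illustration.

```javascript
// Hypothetical helper: mean of line.length over all lines.
function averageLineLength(lines) {
  if (lines.length === 0) return 0; // guard against division by zero
  const totalLength = lines.reduce((sum, line) => sum + line.length, 0);
  return totalLength / lines.length;
}

// "abcd\nab\n" splits into ["abcd", "ab", ""], so L = (4 + 2 + 0) / 3.
const lines = "abcd\nab\n".split(/\r\n|\r|\n/);

console.log(lines.length);             // 3
console.log(averageLineLength(lines)); // 2
```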

Reference Data

Line Ending	Escape Sequence	Hex Code	Used By	Unicode Code Point
Line Feed (LF)	\n	0x0A	Unix, Linux, macOS (10+)	U+000A
Carriage Return (CR)	\r	0x0D	Classic Mac OS (pre-X)	U+000D
CR + LF	\r\n	0x0D 0x0A	Windows, DOS, HTTP headers	U+000D U+000A
Next Line (NEL)	-	0x85	IBM mainframes (EBCDIC)	U+0085
Line Separator (LS)	-	0x2028	Unicode standard	U+2028
Paragraph Separator (PS)	-	0x2029	Unicode standard	U+2029
Vertical Tab (VT)	\v	0x0B	Some legacy terminals	U+000B
Form Feed (FF)	\f	0x0C	Printers, page breaks	U+000C
Record Separator (RS)	-	0x1E	Data interchange (ASCII)	U+001E
Null (NUL)	\0	0x00	C string terminators	U+0000
End of Text (ETX)	-	0x03	Legacy serial protocols	U+0003
End of Transmission (EOT)	-	0x04	Terminal Ctrl+D	U+0004
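
The pattern given in the Formulas section lists only the first three rows of this table. Assuming that is the complete pattern the tool uses, the other characters stay inside a line rather than ending it, which the following quick check illustrates. The `others` object is a hypothetical fixture.

```javascript
const LINE_BREAK = /\r\n|\r|\n/;

// A few characters from the table that the pattern does not list.
const others = {
  NEL: "\u0085",
  LS: "\u2028",
  PS: "\u2029",
  VT: "\v",
  FF: "\f",
  RS: "\u001E",
};

const counts = {};
for (const [name, ch] of Object.entries(others)) {
  // Place the character between two letters and count the pieces.
  counts[name] = `a${ch}b`.split(LINE_BREAK).length;
  console.log(name, counts[name]); // each prints 1: no split occurs
}
```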

Frequently Asked Questions

Mixed line endings are handled. The tool splits on the regex pattern /\r\n|\r|\n/, which tries CRLF first because it is the first alternative (JavaScript alternation is ordered, not longest-match), then falls back to bare CR or LF. This means a file with mixed endings, common when contributors use different operating systems, is correctly split at every line boundary regardless of the ending used on each individual line.
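
To illustrate why the order of alternatives matters, the sketch below compares the pattern above with a hypothetical reordering that lists CRLF last. The reordered pattern is not something the tool uses; it is shown only for contrast.

```javascript
const crlfFirst = /\r\n|\r|\n/; // pattern described above
const crlfLast = /\r|\n|\r\n/;  // hypothetical reordering, for comparison only

const mixed = "one\r\ntwo\nthree\rfour";

const good = mixed.split(crlfFirst);
const bad = mixed.split(crlfLast);

console.log(good); // [ 'one', 'two', 'three', 'four' ]
// With CRLF last, "\r" and "\n" match separately and leave a phantom line.
console.log(bad);  // [ 'one', '', 'two', 'three', 'four' ]
```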

A trailing newline does count as an extra line. JavaScript's String.split() produces an additional empty string element when the input ends with a delimiter. For example, the string "hello\n" splits into ["hello", ""], yielding a total of 2 lines. This differs from the POSIX convention, in which a newline terminates a line rather than separating two lines (wc -l reports 1 for the same input), and it may differ from what some editors display. The "Non-Empty Lines" metric helps distinguish meaningful content lines from trailing blanks.
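
The behavior described here can be reproduced directly. The second half of the sketch shows one possible way to get a terminator-style count instead; `splitWithoutTrailingEmpty` is a hypothetical helper and not a documented option of the tool.

```javascript
const LINE_BREAK = /\r\n|\r|\n/;

// Separator semantics: the trailing newline yields a final empty element.
const asSplit = "hello\n".split(LINE_BREAK);
console.log(asSplit); // [ 'hello', '' ]

// Hypothetical variant: drop a single trailing empty element so that a
// final newline terminates the last line instead of starting a new one.
function splitWithoutTrailingEmpty(text) {
  const lines = text.split(LINE_BREAK);
  if (lines.length > 1 && lines[lines.length - 1] === "") {
    lines.pop();
  }
  return lines;
}

console.log(splitWithoutTrailingEmpty("hello\n").length);   // 1
console.log(splitWithoutTrailingEmpty("hello\n\n").length); // 2
```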

Input size is limited by the browser rather than by the counting algorithm. Modern browsers allocate string memory on the heap, typically allowing strings up to several hundred megabytes. However, textarea rendering becomes sluggish above roughly 1-5 MB of text. The line-counting algorithm itself is O(n) and processes 10 MB in under 50 ms on typical hardware. For very large files, paste the text in one operation rather than typing or editing it in place, because re-rendering the textarea in the DOM is the bottleneck, not the computation.
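
For illustration of the O(n) claim, a single pass over the string can produce the same total without allocating an array of lines. This is a sketch of an alternative approach, assuming only the total is needed; it is not presented as the tool's implementation, and `countLines` is a hypothetical name.

```javascript
// Hypothetical single-pass counter. Starts at 1 because n line breaks
// separate n + 1 pieces, which matches the length of the split() result.
function countLines(text) {
  let count = 1;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code === 0x0a) {
      count++; // LF
    } else if (code === 0x0d) {
      count++; // CR, possibly the start of CRLF
      if (text.charCodeAt(i + 1) === 0x0a) {
        i++; // skip the LF so CRLF counts as one break
      }
    }
  }
  return count;
}

console.log(countLines("one\r\ntwo\nthree\rfour")); // 4
```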

A line containing only whitespace characters (spaces, tabs, non-breaking spaces) is classified as an "Empty Line" because the tool applies a trim() operation before checking length. This matches the common developer convention where blank-looking lines are not considered content. The "Total Lines" count still includes them.
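
The classification rule can be sketched with the standard String.prototype.trim(), which removes spaces, tabs, and non-breaking spaces (U+00A0) among other whitespace. The name `isBlank` is a hypothetical label.

```javascript
// Hypothetical predicate: true when a line has no visible content.
const isBlank = (line) => line.trim().length === 0;

console.log(isBlank(""));       // true
console.log(isBlank(" \t "));   // true
console.log(isBlank("\u00A0")); // true: a non-breaking space is trimmed
console.log(isBlank("  x  "));  // false
```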

Unicode text, including emoji, is supported. JavaScript strings are sequences of UTF-16 code units. The line-splitting regex correctly identifies \n, \r, and \r\n regardless of surrounding non-ASCII characters, because neither U+000A nor U+000D can occur as half of a surrogate pair. Line length is reported in UTF-16 code units, meaning a single emoji composed of a surrogate pair (e.g., 😀) counts as 2 units. This matches JavaScript's native String.length behavior.
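
A short sketch of the difference between code units and code points. The code point count via the spread operator is shown only for comparison; nothing in this page says the tool reports it.

```javascript
// "\u{1F600}" is the grinning face emoji, stored as a surrogate pair.
const line = "hi \u{1F600}";

const codeUnits = line.length;       // UTF-16 code units
const codePoints = [...line].length; // iterating a string yields code points

console.log(codeUnits);  // 5
console.log(codePoints); // 4

// Splitting is unaffected by the surrogate pairs around the line break.
const parts = "\u{1F600}\n\u{1F600}".split(/\r\n|\r|\n/);
console.log(parts.length);    // 2
console.log(parts[0].length); // 2
```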