Common Log Format to JSON Converter
Convert Apache/Nginx Common Log Format (CLF) and Combined Log Format entries to structured JSON. Paste, upload, or drag-and-drop log files.
About
Web server access logs in Common Log Format (CLF) are dense, single-line records defined by the NCSA. Each line encodes the remote host (h), identity (l), authenticated user (u), timestamp (t), HTTP request (r), status code (s), and response size in bytes (b). The Combined variant appends the Referer and User-Agent headers. Parsing these with naive string splitting fails on edge cases: quoted strings containing spaces, missing byte counts represented as -, and date formats that vary between servers. Malformed entries corrupt downstream analytics pipelines silently. This tool applies a strict regular expression against the NCSA specification, flags unparseable lines with their line numbers, and outputs validated JSON with ISO 8601 dates.
The converter handles both CLF and Combined formats automatically. Byte values of - become null per convention. Status codes are cast to integers. Timestamps are normalized to UTC ISO 8601 regardless of the source timezone offset. Note: this tool approximates the CLF spec as implemented by Apache mod_log_config and Nginx log_format. Custom log formats with non-standard field ordering will not parse correctly.
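A minimal sketch of the line-by-line behaviour described above, assuming the pattern listed under Formulas. The names `scanLog` and `LINE_PATTERN`, the choice to skip blank lines, and the shape of the error objects are illustrative assumptions, not the tool's actual code.

```javascript
// Pattern taken from the Formulas section of this page.
const LINE_PATTERN = /^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]+)" (\d{3}) (\S+)/;

// Hypothetical helper: splits raw log text into parsed entries and flagged lines.
function scanLog(text) {
  const entries = [];
  const errors = [];
  text.split(/\r?\n/).forEach((line, index) => {
    if (line.trim() === '') return; // assumption: blank lines are ignored, not flagged
    const m = LINE_PATTERN.exec(line);
    if (m === null) {
      // Line numbers are 1-based so they match what a text editor shows.
      errors.push({ line: index + 1, text: line });
      return;
    }
    entries.push({
      remoteHost: m[1],
      status: parseInt(m[6], 10),                      // status cast to integer
      bytes: m[7] === '-' ? null : parseInt(m[7], 10), // "-" becomes null
    });
  });
  return { entries, errors };
}
```

Collecting the failures separately, rather than throwing on the first bad line, keeps one malformed entry from hiding the rest of the file.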
Formulas
The Common Log Format line is matched against the following regular expression pattern. Each capture group maps to a JSON field.
Pattern: ^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]+)" (\d{3}) (\S+)
For Combined Log Format, two additional quoted fields are appended:
Extended: ... "([^"]*)" "([^"]*)"
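The capture-group mapping can be sketched as follows. This assumes both patterns are merged into one expression by making the two Combined fields an optional trailing group and adding an end anchor. That merge, and the names `CLF_PATTERN` and `matchClfLine`, are assumptions made for illustration.

```javascript
// Base pattern from the text, with the Combined fields as an optional tail.
const CLF_PATTERN =
  /^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]+)" (\d{3}) (\S+)(?: "([^"]*)" "([^"]*)")?$/;

// Hypothetical helper: returns the raw string fields, or null if the line does not match.
function matchClfLine(line) {
  const m = CLF_PATTERN.exec(line.trim());
  if (m === null) return null;
  const entry = {
    remoteHost: m[1],
    remoteLogName: m[2],
    authUser: m[3],
    date: m[4],    // still in CLF bracket format at this stage
    request: m[5],
    status: m[6],  // still a string; cast separately
    bytes: m[7],   // still a string; may be "-"
  };
  // The optional group only participates for Combined Log Format lines.
  if (m[8] !== undefined) {
    entry.referer = m[8];
    entry.userAgent = m[9];
  }
  return entry;
}
```

A plain CLF line yields seven keys. A Combined line yields the same seven plus `referer` and `userAgent`.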
The date field requires secondary parsing. The standard CLF timestamp format is:
dd/Mon/yyyy:HH:mm:ss ±hhmm
This is decomposed and reconstructed as an ISO 8601 string:
yyyy-MM-ddTHH:mm:ssZ
Where Mon is a three-letter English month abbreviation mapped to its zero-padded numeric index (01-12). The timezone offset ±hhmm is applied to convert to UTC. The byte field transformation follows:
bytes = null if b = "-", otherwise b parsed as an integer
Where b = raw byte string from the log entry. Status code s is always cast via parseInt(s, 10) to ensure numeric type in JSON output.
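The timestamp conversion described above can be sketched as below. The helper name `clfDateToIso` and the decision to return null on unparseable input are assumptions. Note that JavaScript's `toISOString()` always includes milliseconds, so the result carries a `.000` component as in the Reference Data example.

```javascript
const MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'];

// Hypothetical helper: "10/Oct/2000:13:55:36 -0700" -> "2000-10-10T20:55:36.000Z"
function clfDateToIso(clfDate) {
  const m = /^(\d{2})\/([A-Za-z]{3})\/(\d{4}):(\d{2}):(\d{2}):(\d{2}) ([+-])(\d{2})(\d{2})$/
    .exec(clfDate);
  if (m === null) return null;
  const month = MONTHS.indexOf(m[2]); // zero-based month index for Date.UTC
  if (month === -1) return null;
  // Read the wall-clock fields as if they were already UTC...
  const asUtc = Date.UTC(Number(m[3]), month, Number(m[1]),
                         Number(m[4]), Number(m[5]), Number(m[6]));
  // ...then subtract the signed offset to get the real UTC instant.
  const sign = m[7] === '-' ? -1 : 1;
  const offsetMs = sign * (Number(m[8]) * 60 + Number(m[9])) * 60000;
  return new Date(asUtc - offsetMs).toISOString();
}
```

Subtracting the offset is the key step: a local time stamped -0700 is seven hours behind UTC, so the UTC value is seven hours later.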
Reference Data
| CLF Field | Directive | JSON Key | Type | Example Value | Notes |
|---|---|---|---|---|---|
| Remote Host | %h | remoteHost | string | 192.168.1.1 | IPv4, IPv6, or hostname |
| Remote Logname | %l | remoteLogName | string | - | Almost always - (identd disabled) |
| Auth User | %u | authUser | string | - | - if no auth |
| Timestamp | %t | date | string (ISO 8601) | 2024-01-15T08:23:45.000Z | Converted from CLF bracket format |
| Request Line | "%r" | request | string | GET /index.html HTTP/1.1 | Full method + path + protocol |
| Status Code | %>s | status | number | 200 | Final status after internal redirects |
| Response Bytes | %b | bytes | number or null | 10305 | - → null |
| Referer | "%{Referer}i" | referer | string | http://example.com/ | Combined format only |
| User-Agent | "%{User-agent}i" | userAgent | string | Mozilla/5.0 ... | Combined format only |
| HTTP Method | - | method | string | GET | Extracted from request (optional) |
| Request Path | - | path | string | /index.html | Extracted from request (optional) |
| Protocol | - | protocol | string | HTTP/1.1 | Extracted from request (optional) |
| Status Class | - | - | - | 2xx | 1xx=Info, 2xx=Success, 3xx=Redirect, 4xx=Client Error, 5xx=Server Error |
| CLF Date Format | - | - | - | [10/Oct/2000:13:55:36 -0700] | dd/Mon/yyyy:HH:mm:ss ±hhmm |
| RFC Date Variant | - | - | - | [Wed, 11 Jun 2014 16:24:02 GMT] | Some servers use RFC 2822 style |
| Apache Directive | - | - | - | LogFormat "%h %l %u %t \"%r\" %>s %b" | Common format config |
| Nginx Directive | - | - | - | log_format combined ... | Default is combined |
| Max Line Length | - | - | - | ~8192 bytes typical | Server-dependent buffer size |
| IPv6 Example | %h | remoteHost | string | ::1 | Localhost in IPv6 |
| Null Bytes (%B) | %B | bytes | number | 0 | %B uses 0 instead of - |
Frequently Asked Questions
... referer and userAgent in the JSON output.
... LogFormat directive.
... split first.
The request field contains the full request line (e.g., GET /index.html HTTP/1.1). Enable the "Split Request Fields" option to additionally output method, path, and protocol as separate JSON keys. This is useful for filtering by HTTP method or aggregating by endpoint path in downstream analytics.
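A rough sketch of what the "Split Request Fields" option might do. The helper name `splitRequestLine` and the fallback of null values for request lines that do not have exactly three parts are assumptions, not documented behaviour.

```javascript
// Hypothetical helper mirroring the "Split Request Fields" option.
function splitRequestLine(request) {
  const parts = request.trim().split(/\s+/);
  if (parts.length !== 3) {
    // Assumption: malformed or empty request lines yield nulls rather than an error.
    return { method: null, path: null, protocol: null };
  }
  const [method, path, protocol] = parts;
  return { method, path, protocol };
}
```

For example, `splitRequestLine('GET /index.html HTTP/1.1')` gives `method` "GET", `path` "/index.html", and `protocol` "HTTP/1.1".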