
About

Testing observability pipelines with production data is a security risk. Sharing real logs exposes IPs, usernames, session tokens, and internal infrastructure topology. This generator produces structurally valid log entries across multiple formats - Apache Combined, Syslog RFC 5424, JSON (Elastic Common Schema), NGINX access, and NDJSON - with realistic field distributions. Severity levels follow configurable weighted random selection, so your ERROR-to-INFO ratio mirrors actual systems (the defaults yield roughly 1:10). Timestamps distribute uniformly across a user-defined window and sort chronologically. HTTP status codes follow a weighted distribution: ~78% are 2xx, ~12% are 3xx, ~7% are 4xx, and ~3% are 5xx, matching real-world traffic patterns.

The tool also generates OpenTelemetry-compatible trace and span IDs (32- and 16-character hex strings, respectively) and time-series metric data using a random walk algorithm with configurable drift d and volatility σ. All generation runs client-side; no data leaves your browser. Output is limited to 100,000 lines per batch due to browser memory constraints. For load testing at scale, generate multiple batches and concatenate them. Note: generated IPs use the full 0.0.0.0 - 255.255.255.255 range without filtering reserved blocks - filter post-generation if realism in that dimension matters.


Formulas

Timestamps are generated using uniform random distribution within the specified window, then sorted ascending to simulate chronological log flow:

t_i = t_start + rand() × (t_end − t_start)

Where t_start and t_end are Unix epoch milliseconds of the user-defined range, and rand() produces a value in [0, 1).
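A minimal sketch of this step (function name hypothetical), assuming Math.random() as the uniform source:

```javascript
// Generate `count` timestamps uniformly distributed in [startMs, endMs),
// then sort ascending to mimic chronological log flow.
function generateTimestamps(count, startMs, endMs) {
  const stamps = [];
  for (let i = 0; i < count; i++) {
    // rand() in [0, 1) scales the window width
    stamps.push(startMs + Math.random() * (endMs - startMs));
  }
  return stamps.sort((a, b) => a - b); // numeric ascending sort
}
```

Sorting after generation is cheaper than generating monotonically increasing gaps, and it preserves the uniform marginal distribution.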

HTTP status code selection uses weighted random sampling. Given weights w_k for each status category, the cumulative distribution function determines selection:

P(status = k) = w_k / Σ_{i=1}^{n} w_i
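The same CDF walk works for any weighted category (status classes, severity levels). A sketch, with hypothetical names:

```javascript
// Weighted random selection: subtract weights from a uniform draw scaled
// by the weight sum; the category that drives it below zero is chosen.
function weightedPick(items, weights) {
  const total = weights.reduce((s, w) => s + w, 0);
  let r = Math.random() * total;
  for (let i = 0; i < items.length; i++) {
    r -= weights[i];
    if (r < 0) return items[i];
  }
  return items[items.length - 1]; // guard against float rounding
}

// Status categories with the distribution described above
const status = weightedPick(["2xx", "3xx", "4xx", "5xx"], [78, 12, 7, 3]);
```

Weights need not sum to 100; normalization happens implicitly via the weight sum.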

Metric data uses a bounded random walk to produce realistic time-series:

v_{t+1} = clamp(v_t + d + σ · N(0,1), min, max)

Where d is the drift (trend bias), σ the volatility, and N(0,1) a standard normal deviate produced via the Box-Muller transform: z = √(−2 ln u₁) · cos(2π u₂).
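The walk and the Box-Muller step can be sketched as follows (parameter defaults are illustrative, not the tool's actual values):

```javascript
// Standard normal deviate via the Box-Muller transform
function boxMuller() {
  const u1 = 1 - Math.random(); // shift to (0, 1] so ln(u1) is defined
  const u2 = Math.random();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

// Bounded random walk: v_{t+1} = clamp(v_t + d + sigma * N(0,1), min, max)
function randomWalk(steps, { start = 50, drift = 0.1, sigma = 2, min = 0, max = 100 } = {}) {
  const series = [start];
  for (let t = 0; t < steps; t++) {
    const next = series[t] + drift + sigma * boxMuller();
    series.push(Math.min(max, Math.max(min, next))); // clamp to bounds
  }
  return series;
}
```

Clamping keeps metrics like CPU percentage inside a plausible range while the drift term imposes a long-run trend.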

UUID v4 generation follows RFC 4122: 122 random bits with version nibble set to 0100 and variant bits to 10.
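A compact sketch of the v4 layout; Math.random() stands in for a CSPRNG here since these are synthetic test IDs, not security tokens:

```javascript
// RFC 4122 v4 UUID: 16 random bytes with the version nibble forced to 0100
// and the variant bits to 10, leaving 122 bits of randomness.
function uuidv4() {
  const bytes = Array.from({ length: 16 }, () => Math.floor(Math.random() * 256));
  bytes[6] = (bytes[6] & 0x0f) | 0x40; // version nibble = 4
  bytes[8] = (bytes[8] & 0x3f) | 0x80; // variant bits = 10xxxxxx
  const hex = bytes.map(b => b.toString(16).padStart(2, "0")).join("");
  return `${hex.slice(0, 8)}-${hex.slice(8, 12)}-${hex.slice(12, 16)}-` +
         `${hex.slice(16, 20)}-${hex.slice(20)}`;
}
```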

Reference Data

Format | Standard | Fields Generated | Use Case | Example System
Apache Combined | CLF + Referer/UA | IP, identity, user, timestamp, method, path, protocol, status, bytes, referer, user-agent | Web server access log testing | Apache HTTPD, HAProxy
NGINX Access | Custom (default format) | IP, timestamp, method, path, protocol, status, bytes, referer, user-agent, request_time | Reverse proxy log analysis | NGINX, OpenResty
Syslog RFC 5424 | RFC 5424 | Priority, version, timestamp, hostname, app-name, procid, msgid, structured-data, message | System/daemon log ingestion | rsyslog, syslog-ng
JSON (ECS) | Elastic Common Schema 1.x | @timestamp, log.level, message, host.name, service.name, trace.id, span.id, event.dataset | Structured log pipelines | Elasticsearch, Datadog
NDJSON | Newline Delimited JSON | timestamp, level, msg, pid, hostname, req_id, duration_ms, caller | Streaming JSON ingestion | Bunyan, Pino, Loki
Metric (Prometheus) | Prometheus Exposition | metric_name, labels, value, timestamp_ms | Time-series metric testing | Prometheus, Grafana
CSV | RFC 4180 | Configurable columns matching selected format | Spreadsheet/DB import testing | Excel, PostgreSQL COPY
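As an illustration of one row of this table, a single NDJSON record with the listed field set could be emitted like this (all values and names here are illustrative, not the tool's actual output):

```javascript
// One NDJSON record: a complete JSON object on a single line
function ndjsonLine(ts) {
  return JSON.stringify({
    timestamp: new Date(ts).toISOString(),
    level: "info",
    msg: "request completed",
    pid: 1234,
    hostname: "web-01",
    req_id: "a1b2c3",
    duration_ms: 42,
    caller: "handler.js:17",
  });
}
```

NDJSON's only structural rule is one valid JSON object per line, which is what makes it trivial to stream and concatenate.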

Frequently Asked Questions

Are the generated IP addresses and user agents realistic?

IP addresses are randomly generated across the full IPv4 range (0.0.0.0 - 255.255.255.255) without filtering RFC 1918 private ranges or IANA reserved blocks. User agents are sampled from a curated pool of 20+ real-world browser strings (Chrome, Firefox, Safari, Edge, bots) with a weighted distribution favoring Chrome (~65%). Because the IPs are synthetic, they carry no PII risk for GDPR-sensitive testing.
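If you do need public-looking addresses, the post-generation filtering mentioned above could look like this (a sketch; only the RFC 1918 blocks are checked, not every IANA reservation):

```javascript
// Random IPv4 across the full range
function randomIPv4() {
  return Array.from({ length: 4 }, () => Math.floor(Math.random() * 256)).join(".");
}

// RFC 1918 private blocks: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
function isPrivate(ip) {
  const [a, b] = ip.split(".").map(Number);
  return a === 10 || (a === 172 && b >= 16 && b <= 31) || (a === 192 && b === 168);
}

// Rejection sampling: redraw until the address falls outside private space
function randomPublicIPv4() {
  let ip = randomIPv4();
  while (isPrivate(ip)) ip = randomIPv4();
  return ip;
}
```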
Will the output pass schema validation?

JSON format follows Elastic Common Schema 1.x field naming (@timestamp, log.level, host.name, trace.id). Syslog output conforms to RFC 5424 structure, including the PRI calculation: PRI = facility × 8 + severity. Both will pass basic structural validation. However, semantic consistency (e.g., trace IDs correlating across spans) is not enforced - each line is generated independently.
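The PRI calculation is a one-liner; for example, facility 16 (local0) at severity 3 (error) yields PRI 131:

```javascript
// RFC 5424 PRI value: facility * 8 + severity
function syslogPri(facility, severity) {
  return facility * 8 + severity;
}

// RFC 5424 message header starts with <PRI> followed by the version (1)
function priPrefix(facility, severity) {
  return `<${syslogPri(facility, severity)}>1`;
}
```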
How many lines can I generate at once?

The tool caps at 100,000 lines per batch. Generation runs in a Web Worker to prevent UI freezing. At 100K lines of Apache Combined format, output is approximately 25 - 35 MB; JSON format is larger (40 - 60 MB). If your browser tab crashes, reduce the count or use the download option, which streams to a Blob rather than rendering to the DOM.
Can I control the severity-level distribution?

You can adjust relative weights for each severity level (DEBUG, INFO, WARN, ERROR, FATAL). The default distribution mirrors production systems: INFO 60%, DEBUG 20%, WARN 12%, ERROR 6%, FATAL 2%. Weights are normalized to probabilities, so setting all weights equal produces a uniform distribution, while setting ERROR to 100 and the others to 0 generates error-only logs for fault-injection testing.
Are trace and span IDs compatible with distributed tracing tools?

Trace IDs are 32-character lowercase hexadecimal strings and span IDs are 16-character lowercase hex, matching the W3C Trace Context format. Each log line receives a unique trace/span pair. The tool does not generate correlated spans (parent-child relationships) - every line represents an independent trace root. For distributed tracing pipeline testing, you would need to post-process and assign shared trace IDs across related entries.
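The post-processing step mentioned above could be sketched like this (function names hypothetical): generate W3C-shaped IDs, then stamp a batch of lines with one shared trace ID.

```javascript
// W3C Trace Context IDs: 32 hex chars for a trace, 16 for a span.
// Math.random() suffices for synthetic test data.
function randomHex(chars) {
  let out = "";
  for (let i = 0; i < chars; i++) out += Math.floor(Math.random() * 16).toString(16);
  return out;
}
const traceId = () => randomHex(32);
const spanId = () => randomHex(16);

// Give a batch of independent log lines a shared trace ID so they look
// like spans of a single distributed trace.
function correlate(lines) {
  const shared = traceId();
  return lines.map(line => ({ ...line, "trace.id": shared, "span.id": spanId() }));
}
```

Real parent-child span trees would additionally need a parent span ID per entry, which this sketch does not attempt.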
What timestamp formats and timezones are used?

Apache/NGINX use the CLF format: [dd/Mon/yyyy:HH:mm:ss +0000]. Syslog uses ISO 8601 with millisecond precision. The JSON @timestamp field uses ISO 8601 with the UTC designator (Z). All timestamps are generated in UTC. The date range picker operates in your local timezone but converts to UTC for generation. If you need a specific timezone offset, the Syslog format includes the offset field in structured data.
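Formatting a UTC epoch value into the CLF shape is straightforward; a sketch:

```javascript
// Format epoch milliseconds as Apache CLF: [dd/Mon/yyyy:HH:mm:ss +0000]
const MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
                "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];
function clfTimestamp(ms) {
  const d = new Date(ms);
  const pad = n => String(n).padStart(2, "0");
  return `[${pad(d.getUTCDate())}/${MONTHS[d.getUTCMonth()]}/${d.getUTCFullYear()}:` +
         `${pad(d.getUTCHours())}:${pad(d.getUTCMinutes())}:${pad(d.getUTCSeconds())} +0000]`;
}
```

Using the getUTC* accessors keeps the output independent of the machine's local timezone, matching the all-UTC behavior described above.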