About

Miscalculating audio storage requirements leads to truncated recordings, exhausted disk space during sessions, and incorrect bandwidth provisioning for streaming infrastructure. This calculator computes exact uncompressed PCM file sizes from f_s (sample rate), b (bit depth), c (channel count), and t (duration). It then estimates compressed sizes for lossy formats (MP3, AAC, OGG, WMA) using standard CBR bitrate models, and for lossless formats (FLAC, ALAC) using empirical compression ratios. Results account for raw audio data only. Container overhead (RIFF/WAV header ≈ 44 bytes, ID3 tags, metadata) is excluded because it is negligible for files longer than a few seconds.

The tool assumes constant bitrate encoding for lossy formats. Real-world VBR encoders produce files whose sizes differ from the CBR estimate by 5-15%, depending on signal complexity. Silence-heavy recordings compress smaller; dense orchestral material compresses larger. For lossless formats like FLAC, the stated ratio (0.55-0.65× PCM) is an empirical average across mixed-genre corpora. Pro tip: always budget 10% headroom above calculated values when provisioning storage for recording sessions.

Formulas

The raw uncompressed PCM audio data size is computed as:

S_raw = f_s × b × c × t ÷ 8

where S_raw = file size in bytes, f_s = sample rate in Hz (samples per second), b = bit depth (bits per sample), c = number of channels (1 for mono, 2 for stereo), and t = duration in seconds. Division by 8 converts bits to bytes.
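As a minimal sketch of the formula above, the computation could be written in Python as follows. The function name `pcm_size_bytes` and its parameter names are illustrative assumptions, not part of the calculator itself.

```python
def pcm_size_bytes(sample_rate_hz, bit_depth, channels, duration_s):
    """Raw PCM size in bytes: S_raw = f_s * b * c * t / 8 (hypothetical helper)."""
    total_bits = sample_rate_hz * bit_depth * channels * duration_s
    return total_bits / 8  # 8 bits per byte


# One minute of CD-quality stereo: 44,100 Hz, 16-bit, 2 channels
print(pcm_size_bytes(44_100, 16, 2, 60))  # 10584000.0
```

The result excludes any container header or metadata, matching the scope stated in the About section.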

The corresponding data rate (bitrate) for uncompressed audio is:

R = f_s × b × c

where R is in bits/s. For CD-quality audio: 44100 × 16 × 2 = 1,411,200 bits/s = 1411.2 kbps.
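The data-rate formula can be checked the same way. This is a small sketch; `pcm_bitrate_bps` is a hypothetical name chosen for illustration.

```python
def pcm_bitrate_bps(sample_rate_hz, bit_depth, channels):
    """Uncompressed data rate in bits per second: R = f_s * b * c."""
    return sample_rate_hz * bit_depth * channels


cd_rate = pcm_bitrate_bps(44_100, 16, 2)
print(cd_rate)         # 1411200
print(cd_rate / 1000)  # 1411.2 (kbps, using 1 kbps = 1000 bits/s)
```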

For lossy compressed formats using constant bitrate encoding:

S_compressed = BR × t ÷ 8

where BR = target bitrate in bits/s and t = duration in seconds.
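For a constant-bitrate file, the size is the bitrate times the duration, divided by 8 to convert bits to bytes. The sketch below assumes the bitrate is given in bits/s; `cbr_size_bytes` is an illustrative name only.

```python
def cbr_size_bytes(bitrate_bps, duration_s):
    """Estimated size in bytes of a constant-bitrate stream (hypothetical helper)."""
    return bitrate_bps * duration_s / 8  # bits -> bytes


# A 4-minute track encoded at 128 kbps
print(cbr_size_bytes(128_000, 240))  # 3840000.0
```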

For lossless compressed formats (FLAC, ALAC):

S_lossless ≈ S_raw × r

where r is the empirical compression ratio, typically 0.55 to 0.65 depending on source material complexity.
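Because r is a range rather than a single value, it can be more honest to report a low and a high estimate. The following sketch assumes the 0.55 to 0.65 range quoted above as defaults; the function name and parameters are hypothetical.

```python
def lossless_size_range(raw_bytes, r_low=0.55, r_high=0.65):
    """Return (low, high) estimated sizes in bytes for a lossless encode.

    r_low and r_high are empirical ratios, not guarantees: real files
    can fall outside this band depending on the source material.
    """
    return raw_bytes * r_low, raw_bytes * r_high


# One minute of CD-quality stereo is 10,584,000 bytes of raw PCM
low, high = lossless_size_range(10_584_000)
print(round(low), round(high))  # 5821200 6879600
```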

Binary unit conversions follow IEC 80000-13: 1 KiB = 1024 bytes, 1 MiB = 1024² bytes, 1 GiB = 1024³ bytes.
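A byte count can be converted to these binary units with a short loop. This is a sketch only; `format_binary` and its two-decimal output format are assumptions made for illustration.

```python
def format_binary(size_bytes):
    """Format a byte count using IEC binary prefixes (1 KiB = 1024 bytes)."""
    units = ["bytes", "KiB", "MiB", "GiB", "TiB"]
    value = float(size_bytes)
    for unit in units:
        # Stop at the first unit that keeps the value below 1024,
        # or at the largest unit in the list.
        if value < 1024 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


print(format_binary(1_036_800_000))  # 988.77 MiB
```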

Reference Data

Format	Type	Typical Bitrate	Compression Ratio vs PCM	Container	Common Use
WAV (PCM)	Uncompressed	N/A (raw)	1.00×	RIFF	Studio recording, mastering
AIFF (PCM)	Uncompressed	N/A (raw)	1.00×	IFF/AIFF	macOS studio workflows
FLAC	Lossless	~900 kbps (CD)	0.55 - 0.65×	FLAC/OGG	Archival, audiophile playback
ALAC	Lossless	~900 kbps (CD)	0.55 - 0.65×	MP4/M4A	Apple ecosystem archival
MP3 (CBR 128)	Lossy	128 kbps	~0.09×	MP3	Podcasts, speech
MP3 (CBR 192)	Lossy	192 kbps	~0.14×	MP3	General music
MP3 (CBR 320)	Lossy	320 kbps	~0.23×	MP3	High-quality distribution
AAC (128)	Lossy	128 kbps	~0.09×	MP4/M4A	Streaming (Spotify, YouTube)
AAC (256)	Lossy	256 kbps	~0.18×	MP4/M4A	Apple Music, iTunes
OGG Vorbis (160)	Lossy	160 kbps	~0.11×	OGG	Open-source, gaming
OGG Vorbis (320)	Lossy	320 kbps	~0.23×	OGG	High-quality open format
WMA Standard	Lossy	128 kbps	~0.09×	ASF	Legacy Windows media
WMA Lossless	Lossless	~900 kbps (CD)	0.55 - 0.65×	ASF	Windows archival
Opus (64)	Lossy	64 kbps	~0.05×	OGG/WebM	VoIP, low-latency speech
Opus (128)	Lossy	128 kbps	~0.09×	OGG/WebM	Streaming, music
CD Audio (Red Book)	Uncompressed	1411.2 kbps	1.00×	N/A	44100 Hz, 16-bit, stereo
DVD Audio	Uncompressed	4608 kbps	1.00×	N/A	96000 Hz, 24-bit, stereo
DSD64	Uncompressed (1-bit)	2822.4 kbps per channel (5644.8 kbps stereo)	1.00×	DSF/DFF	Super Audio CD (SACD)
Telephony (G.711)	Companded PCM (μ-law/A-law)	64 kbps	N/A	RTP	8000 Hz, 8-bit, mono
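The ratio column for the lossy rows appears to be each bitrate divided by the CD-quality PCM rate of 1411.2 kbps. The sketch below reproduces those values under that assumption; the names are illustrative.

```python
# CD-quality PCM reference rate: 44,100 Hz x 16 bits x 2 channels
CD_PCM_KBPS = 44_100 * 16 * 2 / 1000  # 1411.2 kbps


def ratio_vs_cd_pcm(bitrate_kbps):
    """Size ratio of a CBR stream relative to CD-quality PCM."""
    return bitrate_kbps / CD_PCM_KBPS


for kbps in (64, 128, 160, 192, 256, 320):
    print(f"{kbps} kbps -> ~{ratio_vs_cd_pcm(kbps):.2f}x")
```

Against a different PCM source (for example 96 kHz / 24-bit), the same bitrates yield smaller ratios, so the column should be read as relative to CD audio.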

Frequently Asked Questions

This calculator computes raw PCM audio data size. Actual WAV files include a 44-byte RIFF header, and may contain additional metadata chunks (BWF, LIST/INFO, iXML) that add anywhere from a few hundred bytes to several kilobytes. For files longer than a few seconds, this overhead is negligible (<0.01%). However, WAV files exceeding 4 GiB require the RF64 extension, which adds a ds64 chunk. If your file is very short (under 1 second), header overhead becomes proportionally significant.
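To see how small the header overhead is, and when the 4 GiB limit comes into play, a rough check could look like this. It assumes a minimal 44-byte header with no extra metadata chunks and uses a simplified 4 GiB threshold; the helper names are hypothetical.

```python
WAV_HEADER_BYTES = 44          # minimal canonical RIFF/WAV header
RIFF_LIMIT_BYTES = 2 ** 32     # 4 GiB, from the 32-bit size fields (simplified)


def header_overhead_fraction(data_bytes):
    """Fraction of the total file taken up by the 44-byte header."""
    return WAV_HEADER_BYTES / (data_bytes + WAV_HEADER_BYTES)


def needs_rf64(data_bytes):
    """Approximate check: does the file exceed what plain RIFF can describe?"""
    return data_bytes + WAV_HEADER_BYTES > RIFF_LIMIT_BYTES


ten_seconds_cd = 44_100 * 16 * 2 * 10 // 8      # 1,764,000 bytes
print(f"{header_overhead_fraction(ten_seconds_cd):.6%}")
print(needs_rf64(ten_seconds_cd))               # False
```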

This tool models constant bitrate (CBR) encoding, which produces predictable file sizes. VBR encoders allocate more bits to complex passages (transients, polyphony) and fewer to simple ones (silence, sustained tones). Real VBR files typically deviate ±5-15% from the CBR estimate. Speech recordings with pauses compress 10-20% smaller than the CBR value. Dense orchestral recordings may exceed it by 5-10%. Because dense material can exceed the CBR figure, treat the estimate as a planning baseline rather than a strict upper bound, and add headroom for storage planning.
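The deviation band described above can be turned into a rough planning range around the CBR figure. This sketch assumes a symmetric ±15% band, the outer end of the typical range quoted; the function name and default are illustrative.

```python
def vbr_size_range(cbr_bytes, deviation=0.15):
    """Return (low, high) byte estimates for a VBR encode around a CBR figure.

    deviation is an assumed symmetric band, not an encoder guarantee.
    """
    return cbr_bytes * (1 - deviation), cbr_bytes * (1 + deviation)


# 4 minutes at a nominal 128 kbps is 3,840,000 bytes under CBR
low, high = vbr_size_range(3_840_000)
print(round(low), round(high))  # 3264000 4416000
```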

Per the Nyquist-Shannon sampling theorem, a signal can be captured without aliasing only if its frequency content lies below half the sample rate (f_max < f_s / 2); a component at exactly f_s / 2 is not guaranteed to be reconstructed. A 44,100 Hz sample rate therefore covers frequencies up to just under 22,050 Hz, which spans the full human hearing range (20-20,000 Hz). Rates of 88,200 Hz or 96,000 Hz raise this limit to 44,100 Hz or 48,000 Hz respectively, which is relevant for preserving ultrasonic content during processing or for reducing anti-aliasing filter artifacts.
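The half-sample-rate limit is simple to tabulate for the common rates mentioned here. A minimal sketch, with `nyquist_hz` as an illustrative name:

```python
def nyquist_hz(sample_rate_hz):
    """Nyquist frequency: half the sample rate, the limit for alias-free capture."""
    return sample_rate_hz / 2


for rate in (44_100, 88_200, 96_000):
    print(rate, "Hz ->", nyquist_hz(rate), "Hz")
```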

For uncompressed PCM formats, file size scales in direct proportion to bit depth. Doubling the bit depth from 16 to 32 exactly doubles the file size because each sample occupies twice as many bits. For lossless compressed formats like FLAC, the relationship is not proportional and depends on what the additional bits contain. 24-bit FLAC files are typically 30-50% larger than 16-bit FLAC of the same source when the additional 8 bits carry little information; low-level noise in those bits is close to random, compresses poorly, and pushes the increase higher. For lossy formats, bit depth is irrelevant since the encoder targets a fixed bitrate regardless of input depth.

To size a multitrack session, multiply the single-track (mono) size by the number of simultaneous tracks. For example, a 24-track session at 96 kHz / 24-bit for 60 minutes produces: 96,000 × 24 × 1 × 3,600 / 8 = 1,036,800,000 bytes per track (≈ 989 MiB). Multiplying by 24 tracks gives 24,883,200,000 bytes, or ≈ 23.2 GiB of raw audio data. Add 10% headroom for file system overhead and metadata. Also account for DAW project files, autosave snapshots, and undo history, which can add 20-50% on top of raw audio.
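The session arithmetic above can be sketched as a small helper. It assumes every track is mono and applies the 10% headroom figure as a default; the function and parameter names are hypothetical.

```python
def session_size_bytes(sample_rate_hz, bit_depth, tracks, duration_s, headroom=0.10):
    """Return (per_track, raw_total, with_headroom) sizes in bytes.

    Assumes mono tracks. DAW project files, autosaves, and undo history
    are not included and need to be budgeted separately.
    """
    per_track = sample_rate_hz * bit_depth * duration_s / 8
    raw_total = per_track * tracks
    return per_track, raw_total, raw_total * (1 + headroom)


per_track, raw_total, padded = session_size_bytes(96_000, 24, 24, 3_600)
print(f"{per_track / 2**20:.0f} MiB per track")   # 989 MiB per track
print(f"{raw_total / 2**30:.1f} GiB raw")         # 23.2 GiB raw
print(f"{padded / 2**30:.1f} GiB with headroom")  # 25.5 GiB with headroom
```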

FLAC's compression ratio varies with the source because the codec relies on linear prediction and Rice coding. Signals with high redundancy (sustained tones, silence, simple harmonic content) compress closer to 0.50×. Complex, noise-like signals (distorted guitars, dense percussion, audience applause) compress poorly, approaching 0.70× or worse. The 0.55-0.65 range represents a corpus average across mixed genres. FLAC compression levels (0-8) mainly affect encoding speed and have only a small effect on the ratio: the difference between level 0 and level 8 is typically under 3% in file size.