About

Extracting audio from video files typically requires desktop software or server-side processing. This tool performs the entire operation client-side using the Web Audio API and MediaRecorder API. Your files never leave the browser. It decodes the media container, isolates the audio track as raw PCM (Float32 samples), and re-encodes it into a downloadable format. The waveform visualization renders amplitude peaks calculated from the decoded AudioBuffer, giving you precise visual feedback for trimming. Note: decoding support depends on your browser's built-in codec library. Chromium-based browsers handle MP4 (AAC/H.264), WebM (Opus/Vorbis), and OGG reliably. Safari may reject WebM containers.

If you need a specific segment, the trim controls map your time selection to sample indices at the file's native sample rate (f_s), typically 44100 or 48000 Hz. WAV output is uncompressed PCM - lossless but large. WebM/OGG output uses the browser's Opus encoder at approximately 128 kbps. For files exceeding 200 MB, expect processing delays proportional to duration. Pro tip: if extraction fails, try converting the source to MP4 with AAC audio first - per the table below, it is the only video container with full decoding support in Chrome, Firefox, and Safari.

Formulas

WAV files use the RIFF container with a WAVE format chunk. The tool constructs the binary header manually before appending interleaved PCM sample data.

File Size = f_s × C × B × T + 44

Where f_s = sample rate in Hz, C = number of channels, B = bytes per sample (typically 2 for 16-bit PCM), T = duration in seconds, and 44 = size in bytes of the canonical PCM WAV header (12-byte RIFF descriptor, 24-byte fmt chunk, 8-byte data chunk header).
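As a rough illustration of the header layout and the size formula, here is a minimal sketch of a 16-bit PCM WAV writer. The function name encodeWav and its parameters are hypothetical, and the rounding step is an assumption; the tool's actual implementation may differ.

```javascript
// Sketch: build a 16-bit PCM WAV file from per-channel Float32Array data.
// encodeWav is an illustrative name, not the tool's real API.
function encodeWav(channels, sampleRate) {
  const numChannels = channels.length;
  const numFrames = channels[0].length;
  const bytesPerSample = 2; // B in the formula: 16-bit PCM
  const blockAlign = numChannels * bytesPerSample;
  const dataSize = numFrames * blockAlign;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);

  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  // All multi-byte fields in a RIFF file are little-endian.
  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // fmt chunk body size for PCM
  view.setUint16(20, 1, true); // audio format 1 = linear PCM
  view.setUint16(22, numChannels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * blockAlign, true); // byte rate
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, 8 * bytesPerSample, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  // Interleave: frame 0 of every channel, then frame 1, and so on.
  let offset = 44;
  for (let frame = 0; frame < numFrames; frame++) {
    for (let ch = 0; ch < numChannels; ch++) {
      // Rounding is an assumption of this sketch; clamping avoids overflow.
      const scaled = Math.round(channels[ch][frame] * 32768);
      view.setInt16(offset, Math.max(-32768, Math.min(32767, scaled)), true);
      offset += bytesPerSample;
    }
  }
  return buffer;
}

// One second of stereo silence at 48000 Hz.
const wav = encodeWav([new Float32Array(48000), new Float32Array(48000)], 48000);
console.log(wav.byteLength); // 48000 * 2 * 2 * 1 + 44 = 192044
```

The resulting byte length matches the File Size formula exactly, because the header is a fixed 44 bytes and every frame contributes C × B bytes.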

Trimming maps time boundaries to sample offsets:

S_start = floor(t_start × f_s)

S_end = floor(t_end × f_s)

Where t_start and t_end are the trim boundaries in seconds. The tool slices each channel's Float32Array between these indices.

For waveform rendering, the buffer is partitioned into N buckets (one per pixel column), and each bucket's peak amplitude is computed as:

peak_i = max(|sample_j|) for j ∈ bucket i
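The bucket reduction can be sketched as follows. This is a minimal illustration, assuming one channel is passed in as a Float32Array and that bucket boundaries are found by flooring fractional offsets; computePeaks is a hypothetical helper name, not the tool's API.

```javascript
// Sketch: reduce one channel to `buckets` peak values (one per pixel column).
function computePeaks(samples, buckets) {
  const peaks = new Float32Array(buckets);
  const bucketSize = samples.length / buckets; // may be fractional
  for (let i = 0; i < buckets; i++) {
    const start = Math.floor(i * bucketSize);
    // Guarantee at least one sample per bucket without running past the end.
    const end = Math.min(
      samples.length,
      Math.max(start + 1, Math.floor((i + 1) * bucketSize))
    );
    let peak = 0;
    for (let j = start; j < end; j++) {
      const magnitude = Math.abs(samples[j]);
      if (magnitude > peak) peak = magnitude;
    }
    peaks[i] = peak;
  }
  return peaks;
}

const demoPeaks = computePeaks(
  new Float32Array([0.1, -0.5, 0.25, 0.75, -1, 0.5]),
  3
);
console.log(Array.from(demoPeaks)); // [0.5, 0.75, 1]
```

A single pass over the buffer keeps the cost linear in the number of samples, regardless of how many pixel columns are drawn.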

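16-bit PCM conversion scales floating-point samples in [−1, 1] by 32768 and clamps the result to the signed 16-bit integer range [−32768, 32767], so a full-scale sample of +1.0 maps to 32767 rather than overflowing: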

pcm₁₆ = clamp(sample × 32768, −32768, 32767)
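A small sketch of this conversion is shown below. The helper name floatTo16BitPcm is hypothetical, and rounding to the nearest integer before clamping is an assumption, since the formula above does not specify a rounding mode.

```javascript
// Sketch: convert Float32 samples to signed 16-bit PCM values.
function floatTo16BitPcm(samples) {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Assumed rounding mode: nearest integer.
    const scaled = Math.round(samples[i] * 32768);
    // Clamp so +1.0 (and any overshoot) cannot wrap around in Int16.
    out[i] = Math.max(-32768, Math.min(32767, scaled));
  }
  return out;
}

const pcm = floatTo16BitPcm(new Float32Array([0, 0.5, 1, -1, 1.5]));
console.log(Array.from(pcm)); // [0, 16384, 32767, -32768, 32767]
```

Without the clamp, assigning 32768 to an Int16Array element would wrap to −32768 and produce an audible click on full-scale peaks.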

Reference Data

Container Format	Common Audio Codecs	Chrome Support	Firefox Support	Safari Support	Typical Bitrate	Max Channels
MP4 (.mp4, .m4v)	AAC, MP3, AC-3	✅ Full	✅ Full	✅ Full	128 - 320 kbps	8
WebM (.webm)	Opus, Vorbis	✅ Full	✅ Full	⚠️ Partial	64 - 510 kbps	8
OGG (.ogg, .ogv)	Vorbis, Opus, FLAC	✅ Full	✅ Full	❌ None	80 - 500 kbps	8
MOV (.mov)	AAC, ALAC, PCM	✅ Full	⚠️ Partial	✅ Full	128 - 1411 kbps	6
MKV (.mkv)	AAC, Vorbis, FLAC, AC-3	⚠️ Partial	⚠️ Partial	❌ None	128 - 1411 kbps	8
AVI (.avi)	MP3, PCM, AC-3	❌ None	❌ None	❌ None	128 - 320 kbps	6
WAV (.wav)	PCM (uncompressed)	✅ Full	✅ Full	✅ Full	1411 kbps (16-bit stereo)	2
FLAC (.flac)	FLAC (lossless)	✅ Full	✅ Full	✅ (14.1+)	400 - 1200 kbps	8
MP3 (.mp3)	MP3	✅ Full	✅ Full	✅ Full	64 - 320 kbps	2
AAC (.aac, .m4a)	AAC-LC, HE-AAC	✅ Full	✅ Full	✅ Full	64 - 320 kbps	6
3GP (.3gp)	AMR, AAC	⚠️ Partial	⚠️ Partial	⚠️ Partial	8 - 128 kbps	2
WMA (.wma)	WMA, WMA Pro	❌ None	❌ None	❌ None	64 - 384 kbps	6

Frequently Asked Questions

Browser audio decoding relies on built-in codecs. If the video uses a codec your browser lacks (e.g., AC-3 in Firefox, or WMA in any browser), decodeAudioData() will reject the promise. Convert the source to MP4 (AAC) or WebM (Opus) using a desktop tool first, then re-extract here.

WAV output is uncompressed 16-bit PCM at the source's native sample rate - lossless but large. A 5-minute stereo track at 48000 Hz produces approximately 57.6 MB. WebM output uses the browser's built-in Opus encoder at roughly 128 kbps, yielding a file of about 4.8 MB for the same duration with perceptually transparent quality for most speech and music.
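The two size figures can be reproduced with a few lines of arithmetic. This is only an estimate: the function names are hypothetical, the 128 kbps bitrate is taken from the text as a constant, and container overhead for WebM is ignored.

```javascript
// Sketch: estimate output sizes for the two export formats.
function estimateWavBytes(sampleRate, channels, seconds) {
  const bytesPerSample = 2; // 16-bit PCM
  return sampleRate * channels * bytesPerSample * seconds + 44;
}

function estimateCompressedBytes(bitsPerSecond, seconds) {
  return (bitsPerSecond * seconds) / 8; // ignores container overhead
}

const wavBytes = estimateWavBytes(48000, 2, 300); // 5 minutes, stereo
const opusBytes = estimateCompressedBytes(128000, 300);
console.log((wavBytes / 1e6).toFixed(1)); // "57.6"
console.log((opusBytes / 1e6).toFixed(1)); // "4.8"
```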

Large files can be processed in principle, but the entire file must be loaded into memory as an ArrayBuffer, then decoded into floating-point PCM. A 500 MB video with 10 minutes of stereo 48 kHz audio will consume roughly 230 MB of heap for the AudioBuffer alone. Browsers may crash on devices with limited RAM. For very large files, consider trimming the video first or using a desktop tool.
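The 230 MB figure follows from 4 bytes per Float32 sample. A quick sketch of the estimate, using a hypothetical helper name and counting only the decoded samples (not the source ArrayBuffer or any browser overhead):

```javascript
// Sketch: approximate heap used by decoded Float32 PCM.
function estimateDecodedBytes(sampleRate, channels, seconds) {
  const bytesPerFloat32 = 4;
  return sampleRate * channels * bytesPerFloat32 * seconds;
}

// 10 minutes of stereo audio at 48 kHz.
console.log(estimateDecodedBytes(48000, 2, 600) / 1e6); // 230.4
```

Note that the estimate depends on the audio duration, channel count, and sample rate, not on the size of the video file itself.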

Trimming itself does not degrade audio quality. It operates on the decoded PCM buffer - it simply slices the Float32Array at the calculated sample indices. No re-encoding or resampling occurs for WAV output. For WebM output, the trimmed PCM is re-encoded via MediaRecorder, which introduces one generation of lossy compression, same as full-length extraction.
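The slicing step might look like the sketch below. The helper name trimChannels is hypothetical, and both the boundary clamping and the half-open range [S_start, S_end) are assumptions of this example rather than confirmed behavior of the tool.

```javascript
// Sketch: trim every channel to the samples between tStart and tEnd (seconds).
function trimChannels(channels, sampleRate, tStart, tEnd) {
  const length = channels[0].length;
  // Map times to sample offsets, then clamp to the valid range.
  const start = Math.max(0, Math.min(length, Math.floor(tStart * sampleRate)));
  const end = Math.max(start, Math.min(length, Math.floor(tEnd * sampleRate)));
  // slice() copies the samples; the values themselves are left unchanged.
  return channels.map((channel) => channel.slice(start, end));
}

// One second of mono audio at 48000 Hz, trimmed to 0.25 s through 0.75 s.
const trimmed = trimChannels([new Float32Array(48000)], 48000, 0.25, 0.75);
console.log(trimmed[0].length); // 24000
```

Because the samples are copied rather than recomputed, the trimmed WAV output is bit-identical to the corresponding region of the decoded buffer.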

Real audio signals can have DC offset and asymmetric transients. Drums, for instance, often have sharper positive peaks than negative. The waveform displays true peak amplitudes per bucket - visual asymmetry reflects the actual signal characteristics, not a rendering error.

WAV output uses the sample rate of the AudioBuffer as decoded by the browser (commonly 44100 Hz or 48000 Hz). Note that decodeAudioData() resamples decoded audio to the AudioContext's sample rate, so the source file's native rate is preserved only when the two match. WebM output uses the AudioContext's default sample rate, which matches the system's audio hardware - typically 44100 or 48000 Hz. If the rates differ, the browser handles resampling internally during MediaRecorder capture.