Category Audio Tools
Supported input formats: MP4, WebM, MOV, OGG, WAV, MP3, FLAC

About

Extracting audio from video files typically requires desktop software or server-side processing. This tool performs the entire operation client-side using the Web Audio API and MediaRecorder API. Your files never leave the browser. It decodes the media container, isolates the audio track as raw PCM (Float32 samples), and re-encodes it into a downloadable format. The waveform visualization renders amplitude peaks calculated from the decoded AudioBuffer, giving you precise visual feedback for trimming. Note: decoding support depends on your browser's built-in codec library. Chromium-based browsers handle MP4 (AAC/H.264), WebM (Opus/Vorbis), and OGG reliably. Safari may reject WebM containers.

If you need a specific segment, the trim controls map your time selection to sample indices at the file's native sample rate ($f_s$), typically 44100 or 48000 Hz. WAV output is uncompressed PCM: lossless but large. WebM/OGG output uses the browser's Opus encoder at approximately 128 kbps. For files exceeding 200 MB, expect processing delays that grow with the audio duration. Pro tip: if extraction fails, try converting the source to MP4 (AAC) first, since it has the broadest decoding support across modern browsers; WebM can fail in Safari.
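The mapping from a time selection to sample indices can be sketched as follows. This is a minimal illustration, not the tool's actual source: the function name `trimChannels` and its parameters are hypothetical, and it assumes each channel has already been decoded into a `Float32Array` of equal length.

```typescript
// Hypothetical helper: slices every channel to the selected time range.
function trimChannels(
  channels: Float32Array[],
  sampleRate: number,
  startSec: number,
  endSec: number
): Float32Array[] {
  const frameCount = channels[0]?.length ?? 0;
  // Keep indices inside [0, frameCount] so out-of-range selections are safe.
  const clampIndex = (i: number): number => Math.min(Math.max(i, 0), frameCount);
  // Seconds -> sample index at the given sample rate.
  const startIdx = clampIndex(Math.floor(startSec * sampleRate));
  const endIdx = clampIndex(Math.floor(endSec * sampleRate));
  // slice() copies the half-open range [startIdx, endIdx) of each channel.
  return channels.map((ch) => ch.slice(startIdx, Math.max(startIdx, endIdx)));
}
```

Because the slice is half-open, the sample at the end index is excluded, so adjacent selections do not overlap.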


Formulas

WAV files use the RIFF container with the WAVE form type, which holds a fmt chunk (the format description) and a data chunk (the samples). The tool constructs the binary header manually before appending interleaved PCM sample data.
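A canonical 44-byte PCM header can be written with a `DataView`, as in the sketch below. The function name `buildWavHeader` and its parameters are illustrative assumptions rather than the tool's real code; the field layout follows the standard RIFF/WAVE structure for integer PCM, with all multi-byte fields little-endian.

```typescript
// Hypothetical helper: builds the 44-byte RIFF/WAVE header for integer PCM.
function buildWavHeader(
  sampleRate: number,
  numChannels: number,
  bytesPerSample: number,
  dataBytes: number
): ArrayBuffer {
  const header = new ArrayBuffer(44);
  const view = new DataView(header);
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };
  const blockAlign = numChannels * bytesPerSample; // bytes per sample frame

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataBytes, true); // total file size minus 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // fmt chunk body size for plain PCM
  view.setUint16(20, 1, true); // audio format 1 = integer PCM
  view.setUint16(22, numChannels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * blockAlign, true); // byte rate
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bytesPerSample * 8, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataBytes, true); // size of the PCM payload that follows
  return header;
}
```

The interleaved sample data is appended directly after these 44 bytes.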

$$\text{File Size} = f_s \times C \times B \times T + 44$$

Where $f_s$ = sample rate in Hz, $C$ = number of channels, $B$ = bytes per sample (typically 2 for 16-bit PCM), $T$ = duration in seconds, and 44 = size in bytes of the canonical WAV header (the RIFF, fmt, and data chunk headers).
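As a quick worked example of the size formula, the sketch below computes the expected WAV size for one minute of 16-bit stereo audio at 44100 Hz. The function name `wavFileSizeBytes` is hypothetical, and flooring the duration to whole sample frames is an assumption made here so the result is always an integer byte count.

```typescript
// Hypothetical helper: expected WAV size in bytes for integer PCM audio.
function wavFileSizeBytes(
  sampleRate: number,
  channels: number,
  bytesPerSample: number,
  durationSec: number
): number {
  // Only whole sample frames are stored, so floor the frame count.
  const frames = Math.floor(durationSec * sampleRate);
  return frames * channels * bytesPerSample + 44; // 44-byte header
}

console.log(wavFileSizeBytes(44100, 2, 2, 60)); // 10584044 bytes, about 10.6 MB
```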

Trimming maps time boundaries to sample offsets:

$$S_{\text{start}} = \lfloor t_{\text{start}} \times f_s \rfloor$$
$$S_{\text{end}} = \lfloor t_{\text{end}} \times f_s \rfloor$$

Where $t_{\text{start}}$ and $t_{\text{end}}$ are the trim boundaries in seconds. The tool slices each channel's Float32Array between these indices. For waveform rendering, the buffer is partitioned into $N$ buckets (one per pixel column), and each bucket's peak amplitude is computed as:

$$\text{peak}_i = \max_{j \in \text{bucket}_i} \lvert \text{sample}_j \rvert$$
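The per-bucket peak calculation might look like the sketch below. The name `computePeaks` is hypothetical, and the bucket boundaries (evenly spaced, with every bucket covering at least one sample where possible) are an assumption; the tool may partition the buffer differently.

```typescript
// Hypothetical helper: one peak value per bucket (for example, per pixel column).
function computePeaks(samples: Float32Array, bucketCount: number): Float32Array {
  const peaks = new Float32Array(bucketCount);
  const bucketSize = samples.length / bucketCount; // may be fractional
  for (let i = 0; i < bucketCount; i++) {
    const start = Math.floor(i * bucketSize);
    // Cover at least one sample per bucket, without running past the buffer.
    const end = Math.min(
      samples.length,
      Math.max(start + 1, Math.floor((i + 1) * bucketSize))
    );
    let peak = 0;
    for (let j = start; j < end; j++) {
      const magnitude = Math.abs(samples[j] ?? 0);
      if (magnitude > peak) peak = magnitude;
    }
    peaks[i] = peak;
  }
  return peaks;
}
```

Scanning each bucket once keeps the cost linear in the number of samples, regardless of how many pixel columns are drawn.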

16-bit PCM conversion scales floating-point samples in [−1, 1] to signed 16-bit integers in [−32768, 32767], clamping any value that overshoots the range:

$$\text{pcm}_{16} = \operatorname{clamp}(\text{sample} \times 32768,\ -32768,\ 32767)$$
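The conversion and the interleaving step can be sketched together as below. The names `floatToPcm16` and `interleaveToPcm16` are hypothetical, and rounding to the nearest integer before clamping is an assumption; the formula above does not say how fractional values are handled.

```typescript
// Hypothetical helper: scale, round (assumed), then clamp to the int16 range.
function floatToPcm16(sample: number): number {
  const scaled = Math.round(sample * 32768);
  return Math.min(32767, Math.max(-32768, scaled));
}

// Hypothetical helper: interleaves channels frame by frame (L R L R ...).
function interleaveToPcm16(channels: Float32Array[]): Int16Array {
  const channelCount = channels.length;
  const frameCount = channels[0]?.length ?? 0;
  const out = new Int16Array(frameCount * channelCount);
  for (let frame = 0; frame < frameCount; frame++) {
    for (let c = 0; c < channelCount; c++) {
      out[frame * channelCount + c] = floatToPcm16(channels[c]?.[frame] ?? 0);
    }
  }
  return out;
}
```

Note that a full-scale sample of exactly 1.0 scales to 32768 and is clamped to 32767, which is why the clamp is needed even for in-range input.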

Reference Data

| Container Format | Common Audio Codecs | Chrome Support | Firefox Support | Safari Support | Typical Bitrate | Max Channels |
|---|---|---|---|---|---|---|
| MP4 (.mp4, .m4v) | AAC, MP3, AC-3 | ✅ Full | ✅ Full | ✅ Full | 128 - 320 kbps | 8 |
| WebM (.webm) | Opus, Vorbis | ✅ Full | ✅ Full | ⚠️ Partial | 64 - 510 kbps | 8 |
| OGG (.ogg, .ogv) | Vorbis, Opus, FLAC | ✅ Full | ✅ Full | ❌ None | 80 - 500 kbps | 8 |
| MOV (.mov) | AAC, ALAC, PCM | ✅ Full | ⚠️ Partial | ✅ Full | 128 - 1411 kbps | 6 |
| MKV (.mkv) | AAC, Vorbis, FLAC, AC-3 | ⚠️ Partial | ⚠️ Partial | ❌ None | 128 - 1411 kbps | 8 |
| AVI (.avi) | MP3, PCM, AC-3 | ❌ None | ❌ None | ❌ None | 128 - 320 kbps | 6 |
| WAV (.wav) | PCM (uncompressed) | ✅ Full | ✅ Full | ✅ Full | 1411 kbps (16-bit stereo) | 2 |
| FLAC (.flac) | FLAC (lossless) | ✅ Full | ✅ Full | ✅ (14.1+) | 400 - 1200 kbps | 8 |
| MP3 (.mp3) | MP3 | ✅ Full | ✅ Full | ✅ Full | 64 - 320 kbps | 2 |
| AAC (.aac, .m4a) | AAC-LC, HE-AAC | ✅ Full | ✅ Full | ✅ Full | 64 - 320 kbps | 6 |
| 3GP (.3gp) | AMR, AAC | ⚠️ Partial | ⚠️ Partial | ⚠️ Partial | 8 - 128 kbps | 2 |
| WMA (.wma) | WMA, WMA Pro | ❌ None | ❌ None | ❌ None | 64 - 384 kbps | 6 |

Frequently Asked Questions

Browser audio decoding relies on built-in codecs. If the video uses a codec your browser lacks (e.g., AC-3 in Firefox, or WMA in any browser), decodeAudioData() will reject the promise. Convert the source to MP4 (AAC) or WebM (Opus) using a desktop tool first, then re-extract here.
WAV output is uncompressed 16-bit PCM at the source's native sample rate - lossless but large. A 5-minute stereo track at 48000 Hz produces approximately 57 MB. WebM output uses the browser's built-in Opus encoder at roughly 128 kbps, yielding files around 4-5 MB for the same duration with perceptually transparent quality for speech and music.
Very large files can technically be processed, but the entire file must be loaded into memory as an ArrayBuffer and then decoded into floating-point PCM. A 500 MB video with 10 minutes of stereo 48 kHz audio will consume roughly 230 MB of heap for the AudioBuffer alone, on top of the 500 MB source buffer. The browser tab may crash on devices with limited RAM. For very large files, consider trimming the video first or using a desktop tool.
Trimming by itself does not reduce audio quality. It operates on the decoded PCM buffer and simply slices each Float32Array at the calculated sample indices. No re-encoding or resampling occurs for WAV output. For WebM output, the trimmed PCM is re-encoded via MediaRecorder, which introduces one generation of lossy compression, the same as for full-length extraction.
An asymmetric waveform is normal. Real audio signals can carry DC offset and asymmetric transients; drum hits, for instance, often peak harder in one polarity than the other. The waveform displays true peak amplitudes per bucket, so visual asymmetry reflects the actual signal characteristics, not a rendering error.
WAV output preserves the source file's native sample rate as decoded by the browser (commonly 44100 Hz or 48000 Hz). WebM output uses the AudioContext's default sample rate, which matches the system's audio hardware - typically 44100 or 48000 Hz. If these differ, the browser handles resampling internally during MediaRecorder capture.
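The size and memory figures quoted in the answers above follow from simple arithmetic. As a check, assuming decimal megabytes (1 MB = 10^6 bytes), 16-bit WAV samples, 32-bit float samples in the decoded AudioBuffer, and a constant 128 kbps Opus stream with container overhead ignored:

```latex
\begin{aligned}
\text{WAV, 5 min, stereo, 48 kHz:}\quad
  & 48000 \times 2 \times 2 \times 300 + 44 = 57\,600\,044 \text{ bytes} \approx 57.6 \text{ MB} \\
\text{Opus, 5 min, 128 kbps:}\quad
  & \frac{128\,000}{8} \times 300 = 4\,800\,000 \text{ bytes} = 4.8 \text{ MB} \\
\text{AudioBuffer, 10 min, stereo, 48 kHz:}\quad
  & 48000 \times 2 \times 4 \times 600 = 230\,400\,000 \text{ bytes} \approx 230 \text{ MB}
\end{aligned}
```

The decoded buffer uses 4 bytes per sample instead of 2, which is why in-memory audio takes twice the space of the 16-bit WAV written from it.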