From microphone to .WAV with getUserMedia and Web Audio

Update: The new MediaStream Recording specification aims to solve this use case through a much simpler API. Follow the conversations on the mailing list.

A few years ago, I wrote a little ActionScript 3 library called MicRecorder, which allowed you to record the microphone input and export it to a .WAV file. Very simple, but pretty handy. The other day I thought it would be cool to port it to JavaScript, and I quickly realized that it is not as easy. In Flash, the SampleDataEvent directly provides the byte stream (PCM samples) from the microphone. With getUserMedia, the Web Audio APIs are required to extract the samples. Note that getUserMedia and Web Audio are not broadly supported yet, but support is coming. Firefox has also landed Web Audio recently, which is great news.

Because I did not find an article that went through the steps involved, here is a short one on how it all works, from getting access to the microphone to the final .WAV file; it may be useful to you in the future. The most helpful resource I came across was this nice HTML5 Rocks article, which pointed to Matt Diamond’s example containing the key piece I was looking for to get the Web Audio APIs hooked up. Thanks so much Matt! Credit also goes to Matt for the merging and interleaving code for the buffers, which works very nicely.

First, we need to get access to the microphone, and we use the getUserMedia API for that.
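A minimal sketch of that call, assuming a 2013-era browser where getUserMedia is still prefixed and callback-based (the variable and callback names here are mine, not from a particular library; modern browsers expose navigator.mediaDevices.getUserMedia instead):

```javascript
// Feature-detect the prefixed implementations of getUserMedia.
// The `nav` indirection just keeps this sketch from assuming a browser.
var nav = typeof navigator !== 'undefined' ? navigator : {};
nav.getUserMedia = nav.getUserMedia ||
                   nav.webkitGetUserMedia ||
                   nav.mozGetUserMedia;

// We only want the microphone here.
var constraints = { audio: true };

function onSuccess(stream) {
  // We receive a MediaStream; the Web Audio wiring happens from here.
  console.log('Microphone access granted');
}

function onFailure(error) {
  console.log('Microphone access denied:', error);
}

if (nav.getUserMedia) {
  nav.getUserMedia(constraints, onSuccess, onFailure);
}
```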

The first argument of the getUserMedia API describes what we want to access (here, the microphone). If we wanted access to the camera, we would pass an object with the video flag set instead:
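For illustration, the two shapes of that first argument side by side (a sketch; in a real call each object is passed straight to getUserMedia):

```javascript
// The constraint object passed as getUserMedia's first argument:
var microphoneConstraints = { audio: true }; // what this article uses
var cameraConstraints = { video: true };     // the camera variant
```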

The two other arguments are callbacks to handle successful access to the hardware or failure. At this point, the success callback will be triggered if the user clicks “Allow” through this panel:


Once the user has allowed access to the microphone, we need to start querying the PCM samples. This is where it becomes tricky and where the Web Audio API comes into the game. If you have not checked the Web Audio spec yet, you will see that its surface is very large and quite scary at first sight, and that’s because the Web Audio API can do a lot: audio filters, synthesized music, 3D audio engines and more. But all we need here are the PCM samples, which we will store and pack inside a WAV container using a simple ArrayBuffer.

So our user has clicked “Allow”; we now go on and create an audio context and start capturing the audio data:
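A sketch of that capture setup, assuming the success callback received a MediaStream named `stream`. Variable names like `recordingLength` are mine, chosen to mirror the demo source linked from this post; createScriptProcessor is the later name for createJavaScriptNode, so both are tried:

```javascript
// Per-channel storage: plain Arrays of Float32Array chunks (see below
// for why Float32Array cannot be used directly for the whole recording).
var leftChannel = [];
var rightChannel = [];
var recordingLength = 0;
var sampleRate = 44100; // overwritten with the context's real rate

function startRecording(stream) {
  var audioContext = new (window.AudioContext || window.webkitAudioContext)();
  // The real sample rate of the context (44100 or 48000 depending on hardware).
  sampleRate = audioContext.sampleRate;

  // Route the microphone stream through a gain node into our processor.
  var audioInput = audioContext.createMediaStreamSource(stream);
  var volume = audioContext.createGain();
  audioInput.connect(volume);

  // 2048-sample buffer: the lower the value, the lower the latency and
  // the more often audioprocess fires. Must be a power of two.
  var bufferSize = 2048;
  var recorder = audioContext.createScriptProcessor
    ? audioContext.createScriptProcessor(bufferSize, 2, 2)
    : audioContext.createJavaScriptNode(bufferSize, 2, 2);

  recorder.onaudioprocess = function (e) {
    var left = e.inputBuffer.getChannelData(0);
    var right = e.inputBuffer.getChannelData(1);
    // Clone the buffers: the underlying memory is reused by the engine.
    leftChannel.push(new Float32Array(left));
    rightChannel.push(new Float32Array(right));
    recordingLength += bufferSize;
  };

  volume.connect(recorder);
  recorder.connect(audioContext.destination);
}
```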

The createJavaScriptNode API takes as its first argument the buffer size you want to retrieve; as noted in the comments, this value dictates how frequently the audioprocess event is dispatched. For best latency, choose a low value, like 2048 (remember, it needs to be a power of two). Every time the event is dispatched, we call the getChannelData API for each channel (left and right), get a new Float32Array buffer for each channel, clone it (sorry GC) and store the clones in two separate Arrays. This code would be much simpler and more GC friendly if it were possible to write each channel into a Float32Array directly, but given that these cannot have an undefined length, we need to fall back to plain Arrays.

So why do we have to clone the channels? This actually drove me nuts for many hours. What happens is that the returned channel buffers are pointers to the current samples coming in, so you need to snapshot (clone) them; otherwise you will end up with samples reflecting only the sound coming from the microphone at the instant you stopped recording.
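The aliasing problem can be demonstrated with pure JavaScript, no audio APIs involved (the variable names are mine; `live` stands in for the buffer returned by getChannelData):

```javascript
var live = new Float32Array([0.1, 0.2]); // stands in for getChannelData(0)
var alias = live;                        // storing the pointer only
var snapshot = new Float32Array(live);   // cloning it

live[0] = 0.9; // the audio engine overwrites the buffer with new samples

// alias[0] now reads ~0.9 (corrupted); snapshot[0] still reads ~0.1 (safe)
```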

Once we have our arrays of buffers, we need to flatten each channel:
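A sketch of that flattening step, based on the merging code credited to Matt above (the function name mirrors the demo source): concatenate the list of per-event chunks into one Float32Array per channel.

```javascript
function mergeBuffers(channelBuffer, recordingLength) {
  var result = new Float32Array(recordingLength);
  var offset = 0;
  for (var i = 0; i < channelBuffer.length; i++) {
    // Copy this chunk into the flat buffer at the running offset.
    result.set(channelBuffer[i], offset);
    offset += channelBuffer[i].length;
  }
  return result;
}
```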

Once flattened, we can interleave both channels together:
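A sketch of the interleaving, again following the approach credited to Matt: alternate one left sample and one right sample into a single buffer.

```javascript
function interleave(leftChannel, rightChannel) {
  var length = leftChannel.length + rightChannel.length;
  var result = new Float32Array(length);
  var inputIndex = 0;
  for (var index = 0; index < length; ) {
    result[index++] = leftChannel[inputIndex];  // left sample
    result[index++] = rightChannel[inputIndex]; // right sample
    inputIndex++;
  }
  return result;
}
```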

We then add the little writeUTFBytes utility function:
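The utility simply writes an ASCII string byte by byte into a DataView; a minimal version looks like this:

```javascript
function writeUTFBytes(view, offset, string) {
  for (var i = 0; i < string.length; i++) {
    view.setUint8(offset + i, string.charCodeAt(i));
  }
}
```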

We are now ready for WAV packaging, you can change the volume variable if needed (from 0 to 1):
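A sketch of the packaging, assuming 16-bit stereo PCM and the canonical 44-byte WAV header (the function name `encodeWAV` is mine; writeUTFBytes is repeated here so the snippet is self-contained):

```javascript
function writeUTFBytes(view, offset, string) {
  for (var i = 0; i < string.length; i++) {
    view.setUint8(offset + i, string.charCodeAt(i));
  }
}

function encodeWAV(interleaved, sampleRate, volume) {
  // 44-byte header + 2 bytes per 16-bit sample.
  var buffer = new ArrayBuffer(44 + interleaved.length * 2);
  var view = new DataView(buffer);

  // RIFF chunk descriptor
  writeUTFBytes(view, 0, 'RIFF');
  view.setUint32(4, 36 + interleaved.length * 2, true);
  writeUTFBytes(view, 8, 'WAVE');

  // fmt sub-chunk
  writeUTFBytes(view, 12, 'fmt ');
  view.setUint32(16, 16, true);             // sub-chunk size
  view.setUint16(20, 1, true);              // PCM (uncompressed)
  view.setUint16(22, 2, true);              // stereo
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * 4, true); // byte rate: 2 channels * 2 bytes
  view.setUint16(32, 4, true);              // block align
  view.setUint16(34, 16, true);             // bits per sample

  // data sub-chunk
  writeUTFBytes(view, 36, 'data');
  view.setUint32(40, interleaved.length * 2, true);

  // Write the PCM samples, scaled from [-1, 1] floats to 16-bit integers.
  var index = 44;
  for (var i = 0; i < interleaved.length; i++) {
    view.setInt16(index, interleaved[i] * (0x7FFF * volume), true);
    index += 2;
  }
  return view;
}
```

In the browser, the resulting view would then be wrapped in a Blob of type audio/wav.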

Obviously, if WAV packaging becomes too expensive, it is an ideal task to offload to a background worker ;)
Once done, we can save our blob locally or remotely, or even post-process it. You can also check the live demo here for more fun.

Posted on July 22, 2013 by Thibault Imbert · 50 comments
  • TomOlivier

    Thanks for the post, really interesting!

    • Thibault Imbert

      Happy you liked it!

      • priy

        its not recording its silent and not working please help me

  • Pingback: Bruce Lawson’s personal site  : Reading List

  • Pingback: News » Blog Archive » Digest of interesting materials from the world of web development and IT for the past week, No. 67 (July 21–27, 2013)

  • Chris Matthieu

    Interesting post indeed. I’ve tried running the live demo on both the latest version of Chrome and Firefox on Mac and the recorded audio file plays but it’s silent – nothing recorded. Is it possible the Chrome and Firefox updates may have broken your demo?

    • Thibault Imbert

      Hi Chris,

      Weird. I just tried in Chrome 28.0.1500.71 and it worked. Which version of Chrome are you using? On Firefox, are you using the Nightly builds? Web Audio is partially implemented and they are still working on a bug (for createMediaStreamSource – https://bugzilla.mozilla.org/show_bug.cgi?id=856361), so it is expected if it does not work in Firefox for now. Hopefully it will soon.

      Let me know.

      • Chris Matthieu

        Yep, I’m running Chrome 28.0.1500.71 also with no luck; however, I was able to get the demo to run on Chrome Canary. Thanks.

        • Thibault Imbert

          Ah good. Thanks for the feedback on the issues, this is helpful.

          • priy

            help me the same in case of video

  • http://www.mathewporter.co.uk/ Mathew Porter

    Brilliant, the demo works brilliantly in chrome, great post and run through.

  • Daniel J

    Yay! It finally works! I have tried this several times but I have never got it to work (I’m using Ubuntu).
    Have a colorful demo! http://djazz.mine.nu/apps/visualizer/

    It lets you see the audio with a spectrum. It mutes your speakers to prevent loopback, but you can use the volume slider to listen to the mic input (use headphones!).
    - djazz

  • Erin Mongkey

    Thank you for your excellent code. That is correct code that I was find. But it is run on the only chrome. Is there any method that can voice record on any web browser such as mobile? Thank you.

  • Daniele Baldo

    Thank you,

    great post.

    i have implemented a java server based on nanoHTTP and java Websocket server

    Do you think it is possible to save the channel data or clone the samples directly to the server using websockets, for live broadcasting to the server?

    • Thibault Imbert

      Hi Daniele,

      Yes, sounds totally doable. Not sure about the latency though, but give it a try!

  • Sharun

    Very helpful write up. Thanks a lot.
    Why does the “live audio” indicator in the chrome tab(near the favicon) continue to blink after recording has stopped? Anyone know what the right way to release the audio nodes is?

    • Thibault Imbert

      Good catch, I am not releasing the nodes in this example. I will have a look when I have some time.

      Check the lifetime section on the spec: https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html

      Post it here if you find any solution. Thanks!

      • Harsh Shah

        Hi Thibault, how can I modify the above code to get an mp4 file?

  • Rick

    Thanks for the great article Thibault. This is one of the most useful WebRTC related posts for what I was trying to do.

    I started packaging your snippet into a more modular version in coffeescript with a simplified API. https://github.com/rickcarlino/simple_audio

    • Thibault Imbert

      Hi Rick, happy to hear it was useful. I wrote this because I also did not find much on the web about it.

      Very cool wrapper you wrote, love it. Thanks for the heads up!

  • olegk

    You have errors in your code samples. You never define leftChannel or rightChannel, yet you try to push data into them

    • Thibault Imbert

      Hi olegk,

      You should check the code source on the demo link, you will see the complete code. The snippets I posted here in the blog post are just samples from the code I wanted to highlight. In other words, the code is incomplete here in the post.

  • Anshu

    how to get audio recorded file in our file system????

  • Anshu

    This getusermedia() functionality works on IE???

  • Anshu

    After debugging, I had few questions :-

    1) getusermedia() is nt working. So, how i can do audio recording for IE version??
    2) In Mozilla, i got recordingLength = 0 due to which i didn’t get any output.wav file for audio recording??
    3) what does it means left channel and right channel.. If i do audio recording via mobile headphone in google chrome, i got nthing audio voice record in the file?
    Can u please review these question and waiting for positive feedback from ur side..

  • kryptogoloc

    This looks suspiciously like you have purloined the code from recorder.js and removed it from its web worker home.

    • Thibault Imbert


      I actually pointed out in the article clearly that part of the code comes from Matt’s example here: http://webaudiodemos.appspot.com/AudioRecorder/index.html

      The intent was also to explain how this whole thing works, when there is almost no resource on the web that actually explains from A to Z how to do this.

      • kryptogoloc

        My sincere apologies.

        • Thibault Imbert

          No worries!

  • Harish Dubey

    How can I reduce the file size? A 16 sec file is over an mb. I tried to reduce the sampling rate and drop one channel data but then the playbacks at slower speed. Any suggestions?

    • allannaranjo

      Hi Harish, were you able to reduce the file size, I’m having exactly the same results when reduced the sample rate to 22050.

    • RelativeCausality

      I know your comment was posted over a year ago, but I wanted to leave this explanation in case someone else stumbles across this like I did via a Google search. Hopefully I can save someone else from having to stay up all night to figure this out.

      With the way the demo code is written, the slow playback speed you’re encountering is expected behavior. The reason for this is that the sampleRate variable is effectively being used to store the playback speed in the WAV file. It does *not* control the number of audio bytes being stored in the wav file.

      The number of bytes being recorded is determined by the sample rate of the audio context. See https://developer.mozilla.org/en-US/docs/Web/API/AudioContext/sampleRate for more information.

      As far as I know, the sample-rate is a read-only property of the audio context. If the sample rate value being stored in the WAV file’s header is below the sampling rate of the audio context, it will result in slower playback. If it’s above the sampling rate of the audio context, it will result in accelerated playback. In fact, the sampling rate being used by the demo of 44100 is completely incorrect. At the time of writing this both FireFox and Chrome appear to record at 48000 Hz. Consequently, setting the variable to 44100 Hz will result in reduced playback speed.

      The important take-away is this: the sampleRate variable should be set to the sampling rate of the browser’s audio context. In this way, the playback speed will match the recording speed and the file will sound normal.

      In order to reduce the file size, you must reduce the number of bytes being stored in the file. You can accomplish this in one of three ways. First, you can have a mono file (single channel) instead of a stereo file (two channel). See http://stackoverflow.com/a/16320400 for an example on how to do this. This will effectively cut the file size in half, with the exception of the 44-byte file header.

      Second, you can down-sample the recording from the audio context’s sample rate to a reduced one. (See http://stackoverflow.com/a/26245260 for an example). Audacity’s documentation has a list of common sampling rates at http://wiki.audacityteam.org/wiki/Sample_Rates.

      The third option is to combine the two above methods for an even smaller file.

      • Thibault Imbert

        Hi RelativeCausality,

        Funny you bring this up, because on a current project I am working on I actually experienced the slow playback problem and actually updated the code here a few days ago in the article but not on the live demo. If you look at the snippets I posted in the article you will see that I query the audio context sample rate and use it for the WAV packaging. I came across that bug on a Chromebook where the sample rate is indeed 48k whereas my MacBook Air returns 44k.

        For the file size reduction Harish, I might cover this in a future article. Steps from RelativeCausality are good ways to achieve this.

        • Thibault Imbert

          Btw, I also updated the live demo here for the sample rate issue.

          • RelativeCausality


            BTW, I forgot to mention that if someone down-samples the recording they will need to make sure that the reduced sampling rate is stored in the WAV file instead of the original sampling rate.

  • http://yamwav.net/ puf

    hello there!
    First of all, thx a lot for this code! It works fine :) but I would like to know how to upload the “blob” on server side ? I’m a bit confused, should I use a post method ? I heard about node-formidable for node.js, but I also read some stuff concerning binary.js or some way to stream the data to the server but it’s a bit obscure for me right now. For you what would be the best way to do that ?
    It might be a dumb question but i’m quite new with web developping and particulary node.js
    thx in advance!!

  • Vlad Titov

    Thanks Thibaul such a great post.

    How it is compare to AS3? in terms of speed.

    manipulating byte arrays from script is it faster?

  • masud

    when do the recording on mac browser it works good and play back fine but pc(windows) records and plays back in a deeper voice. can any one help me with this please?
    Many thanks

  • Thibault Imbert

    FYI. I updated the Web Audio code so that it works with the latest implementations.

  • ziyue

    Thank you, one of my applications need a microphone volume, without the need to download the wav file. Ask how real-time detecting and obtaining the microphone volume?

  • http://nitinsurana.com Nitin Surana

    getUserMedia (with webkit prefixed) not working on Mobile Safari 7 on iOS 7, any solution/polyfill ?

  • James Kleeh

    I’ve found the code fails to record if you switch tabs for more than ~2 seconds in Chrome (possibly FF as well). The fix I came up with revolves around connecting and disconnecting the recorder to and from the destination. This also prevents having onaudioprocess to run continually. Do you have this code on Github so I can contribute the change?

  • goodbedford

    Thanks for the article, very superb. I also commend you for updating the article and the comments.

  • Neeraj Sharma

    Hi Thibault, My code is not working even i wrote the way the code is written. First of all, i tried to hear the audio blob and couldn’t get any voice then on converting it into base64 data, i discovered that blob was empty If you want me i can give you my code.

  • Shiob Mohammed A

    Hi, thanks for the article its really knowledgeable, I have one doubt, can we set, sample rate(16000) and audio channel to mono, without breaking up audio?

  • http://www.kielsoft.net Olayode ‘Kielsoft’ Ezekiel

    Thanks for this post. Can you help to show how this can be use over socket.io i.e, from Microphone -> Socket.io (nodejs server) -> socket.io (another peer(s)) ->audio element.


  • TimMc

    I’ve been looking over this code for a while, but I still don’t understand this line while encoding into a WAV:

    var buffer = new ArrayBuffer(44 + interleaved.length * 2);
    What is the “* 2” for? Doesn’t this just double the length of the WAV file? If the data is already ‘interleaved’, then won’t the length of the ‘interleaved’ data be the correct length anyway? Shouldn’t it just be:

    var buffer = new ArrayBuffer(44 + interleaved.length);

    • http://www.felixsansaccent.me Felix Beaulieu

      I think it’s because of the size of a sample.

      Samples are 16 bits (2 bytes) long in a standard .wav file.

  • Mickael fraga

    I have a code here that looks more like the error on the mobile “Android” … could help me to verify … it seems that it is high latency that the android owns, but I can not solve it, it keeps the voice failing