From microphone to .WAV with: getUserMedia and Web Audio

Update: The new MediaStream Recording specification aims to solve this use case through a much simpler API. Follow the conversations on the mailing list.

A few years ago, I wrote a little ActionScript 3 library called MicRecorder, which allowed you to record the microphone input and export it to a .WAV file. Very simple, but pretty handy. The other day I thought it would be cool to port it to JavaScript, and I quickly realized that it is not as easy. In Flash, the SampleDataEvent directly provides the byte stream (PCM samples) from the microphone. With getUserMedia, the Web Audio APIs are required to extract the samples. Note that getUserMedia and Web Audio are not broadly supported yet, but support is coming. Firefox has also landed Web Audio recently, which is great news.

Because I did not find an article that went through the steps involved, here is a short one on how it works, from getting access to the microphone to the final .WAV file. It may be useful to you in the future. The most helpful resource I came across was this nice HTML5 Rocks article, which pointed to Matt Diamond’s example, which contains the key piece I was looking for to get the Web Audio APIs hooked up. Thanks so much, Matt! Credit also goes to Matt for the buffer merging and interleaving code, which works very nicely.

First, we need to get access to the microphone, and we use the getUserMedia API for that.
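
A minimal sketch of that call, assuming the callback-based, vendor-prefixed form of getUserMedia that browsers shipped at the time (current browsers expose a promise-based navigator.mediaDevices.getUserMedia instead). The helper name requestMicrophone and its nav parameter are made up for illustration:

```javascript
// Hypothetical helper: asks for microphone access through whichever
// callback-style getUserMedia the given navigator object exposes.
function requestMicrophone(nav, onSuccess, onError) {
  var getUserMedia = nav.getUserMedia ||
                     nav.webkitGetUserMedia ||
                     nav.mozGetUserMedia;

  if (!getUserMedia) {
    onError(new Error('getUserMedia is not supported in this browser'));
    return false;
  }

  // First argument: what we want access to. Then the two callbacks.
  getUserMedia.call(nav, { audio: true }, onSuccess, onError);
  return true;
}

// In a page: requestMicrophone(navigator, success, failure);
```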

The first argument of the getUserMedia API describes what we want to get access to (here, the microphone). If we wanted to get access to the camera, we would have passed an object with the video flag on:
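
For illustration, the constraint objects might look like this (the variable names are made up):

```javascript
// Microphone only, as used in this article.
var audioOnly = { audio: true };

// Camera only: the video flag is on instead.
var cameraOnly = { video: true };

// Both devices in a single request.
var audioAndVideo = { audio: true, video: true };
```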

The two other arguments are callbacks to handle successful access to the hardware or failure. At this point, the success callback will be triggered if the user clicks “Allow” through this panel:

[Image: the browser’s microphone access permission panel]

Once the user has allowed access to the microphone, we need to start querying the PCM samples. This is where it becomes tricky and the Web Audio APIs come into play. If you have not checked the Web Audio spec yet, you will see that its surface is very large and quite scary at first sight. That is because the Web Audio APIs can do a lot: audio filters, synthesized music, 3D audio engines and more. But all we need here are the PCM samples, which we will store and pack inside a WAV container using a simple ArrayBuffer.

So our user has clicked “Allow”. We now go on and create an audio context and start capturing the audio data:
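
A sketch of what that success callback could do. The recording object and the startCapture name are illustrative, not taken from the demo, and the function takes the audio context as a parameter so that it can be exercised outside a browser:

```javascript
// Hypothetical recording state; the names are illustrative.
var recording = { left: [], right: [], length: 0 };

// `context` is an AudioContext, for example
// new (window.AudioContext || window.webkitAudioContext)(),
// and `stream` is the MediaStream handed to the success callback.
function startCapture(context, stream, state) {
  var volume = context.createGain();
  var audioInput = context.createMediaStreamSource(stream);
  audioInput.connect(volume);

  // Frames per audioprocess event; must be a power of two.
  var bufferSize = 2048;

  // createJavaScriptNode is the older name of createScriptProcessor.
  var createNode = context.createScriptProcessor || context.createJavaScriptNode;
  var recorder = createNode.call(context, bufferSize, 2, 2);

  recorder.onaudioprocess = function (e) {
    var left = e.inputBuffer.getChannelData(0);
    var right = e.inputBuffer.getChannelData(1);
    // Clone each block before storing it.
    state.left.push(new Float32Array(left));
    state.right.push(new Float32Array(right));
    state.length += left.length;
  };

  volume.connect(recorder);
  recorder.connect(context.destination);
  return recorder;
}
```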

The createJavaScriptNode API (renamed createScriptProcessor in later versions of the Web Audio spec) takes as its first argument the buffer size you want to retrieve. This value dictates how frequently the audioprocess event is dispatched. For best latency, choose a low value, like 2048 (remember it needs to be a power of two). Every time the event is dispatched, we call getChannelData for each channel (left and right) and get a Float32Array for each one, which we clone (sorry GC) and store in two separate Arrays. This code would be much simpler and more GC friendly if it were possible to write each channel into a single Float32Array directly, but since a typed array’s length is fixed at creation and we do not know the recording length in advance, we need to fall back to plain Arrays.
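
To get a feel for the numbers, here is a small calculation assuming a 44100 Hz sample rate (the real value is whatever the audio context reports in its sampleRate property). The helper name is made up:

```javascript
// Hypothetical helper: how long one audioprocess block lasts, in ms.
function blockDurationMs(bufferSize, sampleRate) {
  return (bufferSize / sampleRate) * 1000;
}

blockDurationMs(2048, 44100);   // about 46 ms between events
blockDurationMs(16384, 44100);  // about 372 ms between events
```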

So why do we have to clone the channels? It actually drove me nuts for many hours. What happens is that the returned channel buffers are pointers to the current samples coming in, so you need to snapshot (clone) them. Otherwise, every stored buffer ends up reflecting the sound coming from the microphone at the instant you stopped recording.
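
The effect can be reproduced without a microphone. The sketch below simulates a source that reuses one buffer for every block, which is an assumption about how the browser behaved here, and compares storing the buffer itself with storing a clone:

```javascript
// One buffer that the simulated audio source overwrites on every block.
var shared = new Float32Array(4);

var withoutClone = [];
var withClone = [];

for (var block = 1; block <= 3; block++) {
  shared.fill(block);                        // new samples arrive
  withoutClone.push(shared);                 // stores a reference
  withClone.push(new Float32Array(shared));  // stores a snapshot
}

// withoutClone holds the same buffer three times, so every entry
// shows the last block. withClone keeps each block as it was.
```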

Once we have our arrays of buffers, we need to flatten each channel:
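
A sketch of the flattening step, in the spirit of the merging code credited to Matt above. The function and parameter names are illustrative:

```javascript
// Copies a list of Float32Array blocks into one contiguous Float32Array.
// recordingLength is the total number of samples across all blocks.
function mergeBuffers(channelBuffers, recordingLength) {
  var result = new Float32Array(recordingLength);
  var offset = 0;
  for (var i = 0; i < channelBuffers.length; i++) {
    var buffer = channelBuffers[i];
    result.set(buffer, offset);  // copy this block at the current offset
    offset += buffer.length;
  }
  return result;
}
```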

Once flat, we can interleave both channels together:
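
A sketch of the interleaving step, assuming both channels have the same length. The names are illustrative:

```javascript
// Builds one buffer that alternates left and right samples:
// L0, R0, L1, R1, ...
function interleave(leftChannel, rightChannel) {
  var length = leftChannel.length + rightChannel.length;
  var result = new Float32Array(length);
  var inputIndex = 0;
  for (var index = 0; index < length; inputIndex++) {
    result[index++] = leftChannel[inputIndex];
    result[index++] = rightChannel[inputIndex];
  }
  return result;
}
```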

We then add the little writeUTFBytes utility function:
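
A sketch of such a utility. It writes one byte per character, which is enough for the ASCII chunk identifiers of a WAV header:

```javascript
// Writes each character of `string` as a single byte into the DataView,
// starting at `offset`.
function writeUTFBytes(view, offset, string) {
  for (var i = 0; i < string.length; i++) {
    view.setUint8(offset + i, string.charCodeAt(i));
  }
}
```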

We are now ready for WAV packaging. You can change the volume variable if needed (from 0 to 1):
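
A sketch of the packaging step, assuming 16-bit stereo PCM and the standard 44-byte WAV header. The function name and its parameters are illustrative, and the string writer is repeated inside so that the block stands on its own:

```javascript
// Packs interleaved stereo samples (floats in [-1, 1]) into a
// 16-bit PCM WAV file and returns it as a DataView.
function encodeWav(interleaved, sampleRate, volume) {
  var bytesPerSample = 2;
  var channels = 2;
  var dataLength = interleaved.length * bytesPerSample;
  var view = new DataView(new ArrayBuffer(44 + dataLength));

  function writeString(offset, string) {
    for (var i = 0; i < string.length; i++) {
      view.setUint8(offset + i, string.charCodeAt(i));
    }
  }

  // RIFF chunk descriptor
  writeString(0, 'RIFF');
  view.setUint32(4, 36 + dataLength, true);  // file size minus the first 8 bytes
  writeString(8, 'WAVE');
  // fmt sub-chunk
  writeString(12, 'fmt ');
  view.setUint32(16, 16, true);              // size of the fmt chunk for PCM
  view.setUint16(20, 1, true);               // audio format 1 = PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * channels * bytesPerSample, true); // byte rate
  view.setUint16(32, channels * bytesPerSample, true);              // block align
  view.setUint16(34, 8 * bytesPerSample, true);                     // bits per sample
  // data sub-chunk
  writeString(36, 'data');
  view.setUint32(40, dataLength, true);

  // Samples: scale by volume, clamp, convert to little-endian 16-bit integers.
  var offset = 44;
  for (var i = 0; i < interleaved.length; i++, offset += 2) {
    var sample = Math.max(-1, Math.min(1, interleaved[i] * volume));
    view.setInt16(offset, sample * 0x7FFF, true);
  }
  return view;
}

// In a browser:
// var blob = new Blob([encodeWav(samples, context.sampleRate, 1)], { type: 'audio/wav' });
```

The sample rate written in the header should be the one the audio context actually recorded at. If the two differ, players will play the file back at the wrong speed and pitch.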

Obviously, if WAV packaging becomes too expensive, it is an ideal task to offload to a background worker ;)

Once done, we can do whatever we want with our blob: save it locally or remotely, or even post-process it. You can also check the live demo here for more fun.
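
One possible way to save the blob locally is a temporary download link. This is a sketch, not the demo’s code: the saveBlob helper is made up, and its doc and urlApi parameters exist only so that the browser’s document and URL objects can be swapped out:

```javascript
// Hypothetical helper: offers a blob as a download through a temporary link.
// `doc` and `urlApi` default to the browser's document and URL objects.
function saveBlob(blob, filename, doc, urlApi) {
  doc = doc || document;
  urlApi = urlApi || URL;

  var url = urlApi.createObjectURL(blob);
  var link = doc.createElement('a');
  link.href = url;
  link.download = filename;   // suggested file name for the download

  doc.body.appendChild(link);
  link.click();
  doc.body.removeChild(link);

  // The caller can pass this to URL.revokeObjectURL once the
  // download has started, to release the memory held by the URL.
  return url;
}

// In a page: saveBlob(blob, 'output.wav');
```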

Posted on July 22, 2013 by Thibault Imbert · 31 comments
  • TomOlivier

    Thanks for the post, really interesting!

    • Thibault Imbert

      Happy you liked it!

  • Chris Matthieu

    Interesting post indeed. I’ve tried running the live demo on both the latest version of Chrome and Firefox on Mac and the recorded audio file plays but it’s silent – nothing recorded. Is it possible the Chrome and Firefox updates may have broken your demo?

    • Thibault Imbert

      Hi Chris,

      Weird. I just tried in Chrome 28.0.1500.71 and it worked. Which version of Chrome are you using? On Firefox, are you using the Nightly builds? Web Audio is partially implemented and they are still working on a bug (for createMediaStreamSource – https://bugzilla.mozilla.org/show_bug.cgi?id=856361), so it is expected if it does not work in Firefox for now. Hopefully it will soon.

      Let me know.

      • Chris Matthieu

        Yep, I’m running Chrome 28.0.1500.71 also with no luck; however, I was able to get the demo to run on Chrome Canary. Thanks.

        • Thibault Imbert

          Ah good. Thanks for the feedback on the issues, this is helpful.

  • http://www.mathewporter.co.uk/ Mathew Porter

    Brilliant, the demo works brilliantly in chrome, great post and run through.

  • Daniel J

    Yay! It finally works! I have tried this several times but I have never got it to work (I’m using Ubuntu).
    Have a colorful demo! http://djazz.mine.nu/apps/visualizer/

    It lets you see the audio as a spectrum. It mutes your speakers to prevent loopback, but you can use the volume slider to listen to the mic input (use headphones!).
    - djazz

  • Erin Mongkey

    Thank you for your excellent code. It is exactly the code I was looking for, but it only runs in Chrome. Is there any method to record voice in any web browser, including on mobile? Thank you.

  • Daniele Baldo

    Thank you,

    great post.

    I have implemented a Java server based on nanoHTTP and a Java WebSocket server.

    Do you think it is possible to send the channel data or the cloned samples directly to the server using WebSockets, for live broadcasting to the server?

    • Thibault Imbert

      Hi Daniele,

      Yes, sounds totally doable. Not sure about the latency though, but give it a try!

  • Sharun

    Very helpful write up. Thanks a lot.
    Why does the “live audio” indicator in the Chrome tab (near the favicon) continue to blink after recording has stopped? Does anyone know the right way to release the audio nodes?

  • Rick

    Thanks for the great article Thibault. This is one of the most useful WebRTC related posts for what I was trying to do.

    I started packaging your snippet into a more modular version in coffeescript with a simplified API. https://github.com/rickcarlino/simple_audio

    • Thibault Imbert

      Hi Rick, happy to hear it was useful. I wrote this because I also did not find much on the web about it.

      Very cool wrapper you wrote, love it. Thanks for the heads up!

  • olegk

    You have errors in your code samples. You never define leftChannel or rightChannel, yet you try to push data into them

    • Thibault Imbert

      Hi olegk,

      You should check the source code on the demo link, where you will see the complete code. The snippets I posted here in the blog post are just samples from the code I wanted to highlight. In other words, the code is incomplete here in the post.

  • Anshu

    How can we get the recorded audio file onto our file system?

  • Anshu

    Does this getUserMedia() functionality work in IE?

  • Anshu

    After debugging, I have a few questions:

    1) getUserMedia() is not working in IE. How can I do audio recording in IE?
    2) In Mozilla, I got recordingLength = 0, so I did not get any output.wav file for the recording.
    3) What do left channel and right channel mean? If I record audio through a mobile headphone in Google Chrome, nothing is recorded in the file.
    Could you please review these questions? Waiting for positive feedback from your side.

  • kryptogoloc

    This looks suspiciously like you have purloined the code from recorder.js and removed it from its web worker home.

    • Thibault Imbert

      Hi,

      I actually pointed out in the article clearly that part of the code comes from Matt’s example here: http://webaudiodemos.appspot.com/AudioRecorder/index.html

      The intent was also to explain how this whole thing works, when there is almost no resource on the web that actually explains from A to Z how to do this.

      • kryptogoloc

        My sincere apologies.

        • Thibault Imbert

          No worries!

  • Harish Dubey

    How can I reduce the file size? A 16-second file is over 1 MB. I tried reducing the sampling rate and dropping one channel’s data, but then it plays back at a slower speed. Any suggestions?

    • allannaranjo

      Hi Harish, were you able to reduce the file size? I’m getting exactly the same results when I reduce the sample rate to 22050.

  • http://yamwav.net/ puf

    Hello there!
    First of all, thanks a lot for this code! It works fine :) but I would like to know how to upload the “blob” to the server side. I’m a bit confused: should I use a POST method? I heard about node-formidable for node.js, but I also read some stuff concerning binary.js or some way to stream the data to the server, and it’s a bit obscure for me right now. For you, what would be the best way to do that?
    It might be a dumb question, but I’m quite new to web development and particularly node.js.
    Thanks in advance!!

  • Vlad Titov

    Thanks Thibault, such a great post.

    How does it compare to AS3 in terms of speed?

    Is manipulating byte arrays from script faster?

  • masud

    When I record in a browser on Mac, it works well and plays back fine, but on PC (Windows) it records and plays back in a deeper voice. Can anyone help me with this, please?
    Many thanks