A few weeks ago I got the germ of an idea for a personal web application that required recording and playing video, something with which I have very little experience. I had seen how effortless it is to play video with HTML5, so I thought recording would be simple too. After searching countless sites looking for the HTML5 magic bullet for recording both audio & video, I had pretty much given up.
If you have stumbled upon this article also looking for a way to record audio & video together, you can stop searching now. I can say with fairly strong confidence that such a mechanism does not yet exist (as of the publish date of this article). However, I believe I have a workable solution for the time being.
The full source for the example application is on GitHub.
The most frustrating part is that the MediaStream API provides a nearly effortless way to route video and audio but doesn’t yet support recording the entire stream. The MediaRecorder API would be perfect, but it isn’t currently supported by any major browser. (It is in development for Firefox and Chrome but doesn’t appear to be on the Safari radar. Status for IE is unknown.)
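When MediaRecorder does arrive, a simple feature check will let an application prefer it and fall back to the split approach described below; a minimal sketch:
[sourcecode language="javascript"]
// Prefer the native recorder once the browser ships it;
// otherwise fall back to recording audio and video separately.
if (window.MediaRecorder) {
  // record the MediaStream directly with MediaRecorder
} else {
  // record video with Whammy and audio with Recorder.js (this article)
}
[/sourcecode]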
So where do we go from here? We have the audio and the video…just not as a single stream. Let’s record and upload them separately and, later, retrieve and play them together. Obviously this isn’t a perfect solution but it should be sufficient for some applications.
I put together a simple example that uses the Whammy WebM encoder (to record & encode the video) and Recorder.js (to record & encode the audio). I used Meteor.js for the framework since it is great for building quick prototypes, especially if you are primarily focused on HTML and JavaScript (as this example is). Since both of the recording classes provide output blob data, a class wrapper around the FileReader class was used to convert the blob data to binary data, which can then be inserted into the MongoDB instance.
The Meteor Project
If you don’t care about understanding the project and just want to see how the HTML5 audio & video recording works, skip to the “Recording Audio and Video” section.

Even if you haven’t used Meteor before, the code is quite simple and easy to follow. Named mustache templates are used for the front-facing HTML, and JavaScript is used for both the server and client code. Meteor gives you the freedom to organize the files and folders of a project however you want, so here is an overview of how this particular project is structured.
/lib/router.js
This file contains the configuration and URL mapping for the iron-router package. Three routes are defined: the ‘home’ route, which matches the root path; the ‘record’ route, which matches /record/ and shows the ‘record’ template; and the ‘showVideo’ route, which matches /video/:_id, where “:_id” is the id of a specific user.
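The ‘home’ and ‘record’ routes are presumably simple one-liners along these lines (a sketch, not the exact source):
[sourcecode language="javascript"]
this.route('home', { path: '/' });
this.route('record', { path: '/record/' });
[/sourcecode]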
The most important code to note here is in the ‘showVideo’ route:
[sourcecode language="javascript"]
this.route('showVideo', {
  path: '/video/:_id',
  template: 'showVideo',
  data: function () {
    var id = this.params._id;
    console.log("route-showVideo: " + id);
    var user = Meteor.users.findOne({ _id: id });
    if (!user) {
      console.log("! no user found for route");
      return null;
    }
    console.log("Finding audio/video for user: " + id);
    var vid = UserVideos.findOne({ userId: id },
      { sort: { save_date: -1 } });
    var aud = UserAudios.findOne({ userId: id },
      { sort: { save_date: -1 } });
    return {
      audio: aud,
      video: vid,
      userId: id
    };
  }
});
[/sourcecode]
Note that the ‘data’ function finds the most recent video and audio for the user id passed through the URL and returns an object containing the audio data, the video data, and the userId. This audio and video data is generated and added to the database from the ‘record’ template, discussed in more detail later.
/lib/models.js
This file defines the models used in this app: UserAudios and UserVideos, the two collections that store the audio & video data, respectively, for each user.
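The file is presumably just two collection declarations, something like:
[sourcecode language="javascript"]
UserVideos = new Meteor.Collection('uservideos');
UserAudios = new Meteor.Collection('useraudios');
[/sourcecode]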
/server/publishing.js
Because “auto-publish” is disabled for this Meteor project, this file specifies the data that should be published to the client. Since this is a sample application, it is intentionally liberal.
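A deliberately liberal publication might look something like this sketch (the publication names here are illustrative):
[sourcecode language="javascript"]
// Publish everything in both collections; fine for a demo,
// far too permissive for a production application.
Meteor.publish('userVideos', function () {
  return UserVideos.find();
});
Meteor.publish('userAudios', function () {
  return UserAudios.find();
});
[/sourcecode]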
/client/layout.html
All of the templates for this project live in the client folder. The ‘layout’ and ‘header’ templates are both in layout.html and provide the application shell and login functionality.
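Assuming iron-router renders routes into the layout and the standard accounts-ui package provides the login UI, the file is probably not much more than this sketch:
[sourcecode language="html"]
<template name="layout">
  {{> header}}
  {{> yield}}
</template>

<template name="header">
  {{loginButtons}}
</template>
[/sourcecode]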
/client/templates.html
This view file contains all of the functional templates for this application. The helper scripts for each template are in matching ‘.js’ files. For example, the ‘home’ template has a matching ‘home.js’ file with template helpers and associated script methods.
– home, userVideo: together these templates display a list of all recorded user videos. Scripts for these templates are in home.js.
[sourcecode language="html"]
<template name="home">
  <div id="video-list">
    {{#each userList}}
      {{> userVideo}}
    {{/each}}
  </div>
</template>

<template name="userVideo">
  <div id="{{_id}}" class="video-item">
    {{email}}: <a href="{{pathFor 'showVideo'}}">Video</a>
  </div>
</template>
[/sourcecode]
– record: this template displays an html5 video element and controls for interacting with it. The important part of this template is in the script handlers, which are in record.js. See “Recording Audio & Video” below for a description of the actual interactions going on (the important part).
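The markup itself is minimal; judging from the element ids that record.js references, it looks roughly like this reconstruction (not the exact source):
[sourcecode language="html"]
<template name="record">
  {{onLoad}}
  <div id="media-error" hidden="true">
    Unable to access the webcam and microphone.
  </div>
  <div id="uploading" hidden="true">Uploading...</div>
  <div id="record" hidden="true">
    <video id="live_video" width="320" height="240" autoplay muted></video>
    <button id="start-recording">Start Recording</button>
    <button id="stop-recording">Stop Recording</button>
  </div>
</template>
[/sourcecode]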
– showVideo: this template shows a user-recorded video using ‘audio’ and ‘video’ elements.
[sourcecode language="html"]
<template name="showVideo">
  {{onLoad}}
  {{userEmail}}
  {{#if hasVideo}}
    <button id="play-recording">Play</button>
    <video id="review_video" width="320" height="240" src="{{userVideo}}">
    </video>
    <audio hidden="true" id="review_audio" src="{{userAudio}}">
    </audio>
  {{else}}
    {{#if hasUser}}
      Please wait while the video is retrieved.
    {{else}}
      No user found.
    {{/if}}
  {{/if}}
</template>
[/sourcecode]
Recording Audio and Video
The meat of this application, the actual recording and storing of the audio and video, is performed in the ‘record’ template and its associated scripts (in record.js). When the user clicks the ‘start-recording’ button, the event handler calls startRecording, which uses the Whammy WebM encoder to record the video and Recorder.js to record the audio.
Setup (utilizing navigator.getUserMedia)
First let’s take a look at the code that sets up the media. Notice in the ‘record’ template code above, there is a call to the {{onLoad}} template parameter. I’m doing this as a way to get some code to be called as the page is being processed. That template code just calls the setupMedia function, which is the real center of the functionality.
[sourcecode language="javascript"]
Template.record.onLoad = function () { setupMedia(); };
[/sourcecode]
Let’s examine the setupMedia function:
[sourcecode language="javascript"]
function setupMedia() {
  if (supportsMedia()) {
    audioContext = new AudioContext();
    navigator.getUserMedia(
      {
        video: true,
        audio: true
      },
      function (localMediaStream) {
        // map the camera
        var video = document.getElementById('live_video');
        video.src = window.URL.createObjectURL(localMediaStream);
        // create the canvas & get a 2d context
        videoCanvas = document.createElement('canvas');
        videoContext = videoCanvas.getContext('2d');
        // setup audio recorder
        var audioInput =
          audioContext.createMediaStreamSource(localMediaStream);
        //audioInput.connect(audioContext.destination);
        // had to replace the above with the following to
        // mute playback (so you don't get feedback)
        var audioGain = audioContext.createGain();
        audioGain.gain.value = 0;
        audioInput.connect(audioGain);
        audioGain.connect(audioContext.destination);
        audioRecorder = new Recorder(audioInput);
        mediaStream = localMediaStream;
        mediaInitialized = true;
        document.getElementById('uploading').hidden = true;
        document.getElementById('media-error').hidden = true;
        document.getElementById('record').hidden = false;
      },
      function (e) {
        console.log('web-cam & microphone not initialized: ', e);
        document.getElementById('media-error').hidden = false;
      }
    );
  }
}
[/sourcecode]
To sum up the important steps of this function:
- Call navigator.getUserMedia to get the audio and video from the user’s webcam and microphone. Note that the user must accept this or the page will not work.
- Send the webcam video to the <video id='live_video'> element’s src so the user can see what is being recorded.
- Create a canvas element and 2d context, which will be used later to pull & record video frames.
- Create an AudioContext object (http://www.w3.org/TR/webaudio/#AudioContext-section) and capture the local audio as a stream.
- Create a Recorder.js object to receive the audio stream. This will be used to record the audio.
A note on feedback: to ensure that we don’t get feedback while recording audio, we create a gain node from the audioContext, set its gain to 0, and route the microphone input through it to the destination.
Starting the Recording
Now that we have the video and audio streams routed to some objects that we can access, we need to figure out what to do when the user clicks the ‘start-recording’ button. The handler for that button simply ensures there is a valid user and calls startRecording.
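In Meteor, those handlers are entries in the template’s event map; something like this sketch (the exact code in record.js may differ):
[sourcecode language="javascript"]
Template.record.events({
  'click #start-recording': function () {
    // only record for a logged-in user
    if (Meteor.user()) {
      startRecording();
    }
  },
  'click #stop-recording': function () {
    // recordFrame notices this flag and calls completeRecording
    recording = false;
  }
});
[/sourcecode]
startRecording itself is straightforward: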
[sourcecode language="javascript"]
function startRecording() {
  console.log("Begin Recording");
  videoElement = document.getElementById('live_video');
  videoCanvas.width = videoElement.width;
  videoCanvas.height = videoElement.height;
  imageArray = [];
  // request frames until the user stops recording
  recording = true;
  frameTime = new Date().getTime();
  requestAnimationFrame(recordFrame);
  // begin recording audio
  audioRecorder.record();
}
[/sourcecode]
Simply enough, this resets the information we have about the video, creates an empty array to store video frames, calls requestAnimationFrame to begin capturing frames with the ‘recordFrame’ method, and calls ‘record’ on the Recorder.js audioRecorder created in setupMedia above.
Recording Video
As mentioned, the recordFrame method performs the actual frame saving during recording.
[sourcecode language="javascript"]
function recordFrame() {
  if (recording) {
    // draw the video to the context, then get the image data
    var video = document.getElementById('live_video');
    var width = video.width;
    var height = video.height;
    videoContext.drawImage(video, 0, 0, width, height);
    // optionally get the image, do some filtering on it, then
    // put it back to the context
    imageData = videoContext.getImageData(0, 0, width, height);
    // - do some optional image manipulations on imageData
    videoContext.putImageData(imageData, 0, 0);
    var frameDuration = new Date().getTime() - frameTime;
    console.log("duration: " + frameDuration);
    // NOTE: adding frames to the encoder while recording was
    // attempted here, but it saved the frame durations
    // incorrectly, so the frames are buffered instead.
    //whammyEncoder.add(videoContext, frameDuration);
    imageArray.push({
      duration: frameDuration,
      image: imageData
    });
    frameTime = new Date().getTime();
    // request another frame
    requestAnimationFrame(recordFrame);
  }
  else {
    completeRecording();
  }
}
[/sourcecode]
To sum up this code, we are drawing the current video image to our video context (created in setupMedia), then pulling the image data, then doing some image effects (left out here for simplicity) and putting it back to the context. Lastly, we store off the image data and the amount of time since the last frame was recorded. These two items of data will be used by the encoder when recording is complete.
Note that pushing the image data back to the context could be skipped if no effects are performed. This sample code just shows where you would do it if you wanted to.
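As a purely illustrative example, a grayscale effect on the pulled frame would look like this:
[sourcecode language="javascript"]
// imageData.data is a flat RGBA byte array (4 bytes per pixel)
var data = imageData.data;
for (var i = 0; i < data.length; i += 4) {
  // standard luma weights for converting RGB to grayscale
  var gray = 0.299 * data[i] + 0.587 * data[i + 1] + 0.114 * data[i + 2];
  data[i] = data[i + 1] = data[i + 2] = gray;
}
[/sourcecode]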
The ‘stop-recording’ button handler simply sets the ‘recording’ flag to false, so that the recordFrame method here calls completeRecording, which does the processing and uploading for both audio and video.
Saving and Uploading the Data
We saw above in the startRecording method that we call audioRecorder.record() (on the Recorder.js object) to begin audio recording. Unlike with the video, we don’t need to take any action while recording; once the ‘recording’ flag is cleared, completeRecording takes over.
[sourcecode language="javascript"]
function completeRecording() {
  // stop & export the recorded audio
  audioRecorder.stop();
  var user = Meteor.user();
  if (!user) {
    // must be a logged in user
    console.log("completeRecording - NO USER LOGGED IN");
    return;
  }
  console.log("completeRecording: " + user._id);
  document.getElementById('uploading').hidden = false;
  audioRecorder.exportWAV(function (audioBlob) {
    // save to the db
    BinaryFileReader.read(audioBlob, function (err, fileInfo) {
      UserAudios.insert({
        userId: user._id,
        audio: fileInfo,
        save_date: Date.now()
      });
    });
    console.log("Audio uploaded");
  });
  // do the video encoding
  // note: tried doing this in real-time as the frames were captured,
  // but the result didn't handle durations correctly.
  var whammyEncoder = new Whammy.Video();
  for (var i = 0; i < imageArray.length; i++) {
    videoContext.putImageData(imageArray[i].image, 0, 0);
    whammyEncoder.add(videoContext, imageArray[i].duration);
    // release each frame as we go to keep memory usage down
    delete imageArray[i];
  }
  var videoBlob = whammyEncoder.compile();
  BinaryFileReader.read(videoBlob, function (err, fileInfo) {
    UserVideos.insert({
      userId: user._id,
      video: fileInfo,
      save_date: Date.now()
    });
  });
  console.log("Video uploaded");
  // stop the stream & redirect to show the video
  mediaStream.stop();
  Router.go('showVideo', { _id: user._id });
}
[/sourcecode]
Audio: First, we stop the audio recording. Since we aren’t recording any more matching video frames, we don’t want any more audio to be recorded. After some administrative work (enabling/disabling buttons and such), we save the audio: the Recorder.js object has an exportWAV method, which exports the audio data as a blob. Then we convert the blob to binary file data and insert it into our UserAudios collection.
You’ll notice that we use the BinaryFileReader object (shown below) to convert the blob, which is basically a simple wrapper around the FileReader class. You could just as easily use the FileReader class directly.
Video: For the video we create a new Video instance using the Whammy encoder class. We go through each image that was put into the array during recording, send it to the video context, and tell whammy to add the image from the context (passing in the saved duration as well). Lastly, we do the video encoding to get a blob for the video and, like the audio, convert the blob to file data.
To clean up, we stop the local media stream so we aren’t holding onto the client’s webcam and microphone and then reroute to show the recorded video.
[sourcecode language="javascript"]
var BinaryFileReader = {
  read: function (file, callback) {
    var reader = new FileReader();
    var fileInfo = {
      name: file.name,
      type: file.type,
      size: file.size,
      file: null
    };
    reader.onload = function () {
      fileInfo.file = new Uint8Array(reader.result);
      callback(null, fileInfo);
    };
    reader.onerror = function () {
      callback(reader.error);
    };
    reader.readAsArrayBuffer(file);
  }
};
[/sourcecode]
Playing the Video
The ‘showVideo’ template performs the task of displaying the video. If you’ll remember from much earlier, the route for ‘showVideo’ pulls the data that we saved and sends it to the ‘showVideo’ template. At that point, the template has an object with this info, which was added to the database during the recording process:
[sourcecode language="javascript"]
{
  audio: [the file data for the audio],
  video: [the file data for the video],
  userId: [the id for the user]
}
[/sourcecode]
The ‘showVideo’ template then takes that information in the handlers for ‘userVideo’ and ‘userAudio’ and converts it to something useful:
[sourcecode language="javascript"]
Template.showVideo.userVideo = function () {
  console.log("Template.showVideo.userVideo: " + this.userId);
  if (!this.video) {
    return "";
  }
  var blob = new Blob([this.video.video.file],
    { type: this.video.video.type });
  return window.URL.createObjectURL(blob);
};

Template.showVideo.userAudio = function () {
  console.log("Template.showVideo.userAudio: " + this.userId);
  if (!this.audio) {
    return "";
  }
  var blob = new Blob([this.audio.audio.file],
    { type: this.audio.audio.type });
  return window.URL.createObjectURL(blob);
};
[/sourcecode]
The template uses the above data as the src for the video and audio:
[sourcecode language="html"]
<button id="play-recording">Play</button>
<video id="review_video" width="320" height="240" src="{{userVideo}}"></video>
<audio hidden="true" id="review_audio" src="{{userAudio}}"></audio>
[/sourcecode]
The handler for ‘play-recording’ simply calls play for the video and audio elements:
[sourcecode language="javascript"]
document.getElementById("review_video").play();
document.getElementById("review_audio").play();
[/sourcecode]
Conclusion
I am under no delusions that this is the perfect way to record both audio and video but, after an extensive search, I do not think there is a simple method for recording both using html5 at the moment. I fully welcome someone proving me wrong. This approach gets the job done in a fairly straightforward manner.
Obviously this isn’t a production application; I have only tested it in Chrome. To use this in a production environment you would probably need a development phase focused on the idiosyncrasies of all modern browsers. There are a number of things that could be improved upon and fixed. Some of the things I considered or noticed but never got around to addressing:
- Use Meteor-CollectionFS for cleanly uploading files.
- Fix the error in Firefox when saving video while recording.
- Figure out why Safari won’t play the video at all.
- Fix the occasional stuttering that occurs.