Have you ever used a web-based video interview or online conferencing tool? They all support sharing your screen and turning on your camera, and they run entirely in the browser. As a front-end developer, have you ever been curious about how these features are implemented?
The browser's audio and video communication capabilities are exposed through WebRTC (Web Real-Time Communication), a standard set of browser APIs that emerged as network speeds grew and audio/video requirements kept increasing.
Audio and video communication consists of five steps: capture, encoding, transmission, decoding, and rendering.
These five steps are easy to understand, but each one involves a lot of detail.
Today, let's implement the capture part to get started quickly and get an intuitive feel for what WebRTC can do.
We will record the screen and the camera, play back the recorded content, and support downloading it.
Let's start.
Approach
The browser provides the navigator.mediaDevices.getDisplayMedia and navigator.mediaDevices.getUserMedia APIs. As the names suggest, getDisplayMedia obtains the screen stream, while getUserMedia obtains the user-related streams, that is, the microphone and camera streams.
After a stream is obtained, it can be played by assigning it to the srcObject property of a video element.
If you want to record the video, you need the MediaRecorder API, which lets you listen for data from the stream; we can save the received chunks into an array, build a Blob from them later, and play it in a second video element.
The download is also based on the data recorded by MediaRecorder: after converting it to a Blob, the download is triggered through an a tag.
Now that the idea is sorted out, let's write the code.
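As a quick preview, obtaining a screen stream and showing it looks roughly like this (a minimal sketch; the element id "preview" is a placeholder, and the implementation section below uses more detailed constraints):

// Minimal sketch: capture the screen and show it in a <video> element.
// "preview" is a placeholder id, not part of the real page below.
async function previewScreen() {
    const stream = await navigator.mediaDevices.getDisplayMedia({ video: true });
    document.querySelector('#preview').srcObject = stream;
}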
Code implementation
We put two video tags on the page: one for watching the captured stream in real time, and the other for playback.
Then put some buttons.
<section>
    <video autoplay id="player"></video>
    <video id="recordPlayer"></video>
</section>
<section>
    <button id="startScreen">Start screen recording</button>
    <button id="startCamera">Turn on the camera</button>
    <button id="stop">Stop</button>
    <button id="reply">Playback</button>
    <button id="download">Download</button>
</section>

When the "Start screen recording" and "Turn on the camera" buttons are clicked, recording starts, but in different ways.
startScreenBtn.addEventListener('click', () => {
    record('screen');
});
startCameraBtn.addEventListener('click', () => {
    record('camera');
});
One uses the getUserMedia API to obtain microphone and camera data, the other uses the getDisplayMedia API to obtain screen data.
async function record(recordType) {
    const getMediaMethod = recordType === 'screen' ? 'getDisplayMedia' : 'getUserMedia';

    const stream = await navigator.mediaDevices[getMediaMethod]({
        video: {
            width: 500,
            height: 300,
            frameRate: 20
        }
    });

    player.srcObject = stream;
}
We specify constraints such as width, height, and frame rate, and assign the returned stream to the video element's srcObject property so the captured stream can be watched in real time.
Next, to record, create a MediaRecorder from the stream and call its start method.
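If you also want to capture microphone audio along with the camera video (not done in the code above, just an assumption for illustration), getUserMedia accepts an audio constraint as well:

// Sketch: requesting microphone audio in addition to camera video.
// To be used inside an async function; the constraint values are placeholders.
const stream = await navigator.mediaDevices.getUserMedia({
    audio: true,
    video: { width: 500, height: 300, frameRate: 20 }
});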
let blobs = [], mediaRecorder;

mediaRecorder = new MediaRecorder(stream, {
    mimeType: 'video/webm'
});
mediaRecorder.ondataavailable = (e) => {
    blobs.push(e.data);
};
mediaRecorder.start(100);
The argument to start is the timeslice: passing 100 means a chunk of data is emitted roughly every 100 ms.
We listen to the dataavailable event and push the received chunks into the blobs array.
After that, a Blob is created from the chunks in the blobs array and used for playback and for download:
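The Stop button (wired up in the full code below) simply calls mediaRecorder.stop(). A sketch that additionally stops the underlying tracks, so the browser releases the camera or screen capture, could look like this (stopping the tracks is my addition, not part of the original code):

stopBtn.addEventListener('click', () => {
    // Stop the recorder; any remaining data is flushed via dataavailable.
    mediaRecorder && mediaRecorder.stop();
    // Also stop every track so the capture indicator goes away
    // (assumption: player.srcObject still holds the stream we recorded).
    if (player.srcObject) {
        player.srcObject.getTracks().forEach(track => track.stop());
    }
});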
Playback:
replyBtn.addEventListener('click', () => {
    const blob = new Blob(blobs, { type: 'video/webm' });

    recordPlayer.src = URL.createObjectURL(blob);
    recordPlayer.play();
});
The Blob has to be turned into an object URL with URL.createObjectURL before the video element can play it.
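An object URL keeps the Blob alive until it is revoked, so it can be worth releasing it once playback is done. A small sketch (not in the original code):

// Sketch: free the object URL once playback has finished.
recordPlayer.onended = () => {
    URL.revokeObjectURL(recordPlayer.src);
};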
Download:
downloadBtn.addEventListener('click', () => {
    const blob = new Blob(blobs, { type: 'video/webm' });
    const url = URL.createObjectURL(blob);

    const a = document.createElement('a');
    a.href = url;
    a.style.display = 'none';
    a.download = 'record.webm';
    a.click();
});

We generate a hidden a tag, set its download attribute so clicking it triggers a download, and then fire its click event programmatically.
So far, we have implemented recording of the microphone, camera, and screen, with support for playback and download.
Let's look at the effect:
The complete code has been uploaded to GitHub: https://tygithub.com/QuarkGluonPlasma/webrtc-exercize
A copy is also posted here:
<html>
<head>
    <title>Record and download</title>
</head>
<body>
    <section>
        <video autoplay id="player"></video>
        <video id="recordPlayer"></video>
    </section>
    <section>
        <button id="startScreen">Start screen recording</button>
        <button id="startCamera">Turn on the camera</button>
        <button id="stop">Stop</button>
        <button id="reply">Playback</button>
        <button id="download">Download</button>
    </section>
    <script>
        const player = document.querySelector('#player');
        const recordPlayer = document.querySelector('#recordPlayer');

        let blobs = [], mediaRecorder;

        async function record(recordType) {
            const getMediaMethod = recordType === 'screen' ? 'getDisplayMedia' : 'getUserMedia';

            const stream = await navigator.mediaDevices[getMediaMethod]({
                video: {
                    width: 500,
                    height: 300,
                    frameRate: 20
                }
            });

            player.srcObject = stream;

            mediaRecorder = new MediaRecorder(stream, {
                mimeType: 'video/webm'
            });
            mediaRecorder.ondataavailable = (e) => {
                blobs.push(e.data);
            };
            mediaRecorder.start(100);
        }

        const downloadBtn = document.querySelector('#download');
        const startScreenBtn = document.querySelector('#startScreen');
        const startCameraBtn = document.querySelector('#startCamera');
        const stopBtn = document.querySelector('#stop');
        const replyBtn = document.querySelector('#reply');

        startScreenBtn.addEventListener('click', () => {
            record('screen');
        });
        startCameraBtn.addEventListener('click', () => {
            record('camera');
        });
        stopBtn.addEventListener('click', () => {
            mediaRecorder && mediaRecorder.stop();
        });
        replyBtn.addEventListener('click', () => {
            const blob = new Blob(blobs, { type: 'video/webm' });

            recordPlayer.src = URL.createObjectURL(blob);
            recordPlayer.play();
        });
        downloadBtn.addEventListener('click', () => {
            const blob = new Blob(blobs, { type: 'video/webm' });
            const url = URL.createObjectURL(blob);

            const a = document.createElement('a');
            a.href = url;
            a.style.display = 'none';
            a.download = 'record.webm';
            a.click();
        });
    </script>
</body>
</html>
Summary
Audio and video communication is divided into five steps: capture, encoding, transmission, decoding, and rendering. The browser APIs for audio and video communication are collectively called WebRTC.
We implemented the capture part as a way to get started with WebRTC, with support for playback and download.
There are three APIs involved:
- navigator.mediaDevices.getUserMedia: gets the microphone and camera streams
- navigator.mediaDevices.getDisplayMedia: gets the screen stream
- MediaRecorder: listens to the stream's data and implements recording
We used the first two APIs to get the screen, microphone, and camera streams, recorded them with MediaRecorder, saved the data into an array, and then generated a Blob from it.
A video element can play a stream directly when it is assigned to the srcObject property; to play a Blob, it first has to be converted with URL.createObjectURL.
The download is implemented by pointing an a tag at the Blob's object URL, specifying the download behavior with the download attribute, and then triggering a click programmatically.
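In short, the two ways of feeding a video element look like this (just a recap of the code above):

// Live preview: assign the MediaStream directly.
player.srcObject = stream;

// Playback of recorded data: build a Blob and go through an object URL.
recordPlayer.src = URL.createObjectURL(new Blob(blobs, { type: 'video/webm' }));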
We have learned how to capture data with WebRTC, which is the data source of audio and video communication. Next come encoding, decoding, and transmission, which together make up a complete RTC pipeline; we will explore them later.
Now that we have an intuitive feel for what WebRTC can do, doesn't this field seem interesting?