Building a Real-Time WebRTC to HLS Streaming Platform: From P2P Video Calls to Live Broadcasting
I recently built a full-stack application that bridges the gap between real-time video calls and scalable live broadcasting: a system where users can hold WebRTC video calls while simultaneously broadcasting to HLS viewers. Here's the technical journey and the key learnings from the project.
The Challenge: Two Worlds of Video Streaming
The project had an interesting dual requirement:
- WebRTC side: Enable real-time, low-latency video calls between participants (think Google Meet)
- HLS side: Allow viewers to watch the ongoing conversation as a live stream (think YouTube Live)
This presents a fascinating technical challenge because WebRTC and HLS serve different purposes:
- WebRTC: Ultra-low latency (sub-second), P2P connections, perfect for interactive communication
- HLS: Higher latency (3-10 seconds), CDN-friendly, scalable to millions of viewers
Architecture Overview
The solution involved several key components working in harmony:
Frontend (Next.js + TypeScript)
- Stream Page (/stream): WebRTC participants with camera/microphone access
- Watch Page (/watch): HLS viewers consuming the live stream
- Real-time Communication: Socket.io for signaling and coordination
Backend (Node.js + TypeScript)
- Mediasoup: SFU (Selective Forwarding Unit) for WebRTC media routing
- FFmpeg: Media transcoding from WebRTC to HLS format
- Socket.io: WebSocket management for real-time signaling
- Express: HTTP server for HLS segment delivery
The Technical Deep Dive
1. WebRTC with Mediasoup SFU
Instead of direct P2P connections, I used Mediasoup as an SFU, an architecture with several advantages (listed after the code). Creating a server-side WebRTC transport looks like this:
// Creating WebRTC transports for media exchange
socket.on('createWebRtcTransport', async ({ sender }, callback) => {
  const transport = await router.createWebRtcTransport({
    // announcedIp is the externally reachable address of the server
    listenIps: [{ ip: '0.0.0.0', announcedIp: '192.168.1.38' }],
    enableUdp: true,
    enableTcp: true,
    // Note: STUN/TURN servers are configured on the client's transport, not here;
    // mediasoup itself answers ICE checks as an ice-lite endpoint
  });
  // Transport configuration and callback...
});
The SFU approach means:
- Each participant sends their media once to the server (see the client-side sketch after this list)
- The server forwards streams to other participants
- Much more scalable than full-mesh P2P for multiple participants
- Enables server-side processing (crucial for our HLS conversion)
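For context, the client half of this exchange looks roughly like the sketch below with mediasoup-client; routerRtpCapabilities and transportParams are assumed to arrive over Socket.io:
// Client side: each participant produces its media exactly once
import { Device } from 'mediasoup-client';

const device = new Device();
await device.load({ routerRtpCapabilities });                  // router capabilities via Socket.io
const sendTransport = device.createSendTransport(transportParams);
// ('connect' and 'produce' transport events must be wired to the signaling layer; omitted here)

const stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true });
await sendTransport.produce({ track: stream.getVideoTracks()[0] });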
2. The WebRTC to HLS Bridge
The most technically challenging part was converting real-time WebRTC streams to HLS format. Here's how it works:
Step 1: Extract RTP Streams
// Create a plain transport to carry RTP out of mediasoup
const transport = await router.createPlainTransport({
  listenIp: '127.0.0.1',
  rtcpMux: false,     // FFmpeg expects RTCP on a separate port
  comedia: false,     // we tell the transport where to send (see below)
  enableSrtp: false,  // plain RTP, no encryption, since it stays on localhost
});

// Consume the WebRTC producer and forward it to the plain transport
const consumer = await transport.consume({
  producerId: videoProducer.producer.id,
  rtpCapabilities: router.rtpCapabilities,
  paused: false
});
Step 2: Generate SDP for FFmpeg
// Create SDP file describing the RTP streams
const sdpString = `v=0
o=- 0 0 IN IP4 127.0.0.1
s=FFMPEG
c=IN IP4 127.0.0.1
t=0 0
${sdpMedia}`;
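What ${sdpMedia} contains depends on what was negotiated. A minimal sketch for one video stream, reading payload type, codec name, and clock rate from the consumer and assuming the RTP port 5004 used above:
// Build the media section from the consumer's negotiated RTP parameters
const videoCodec = consumer.rtpParameters.codecs[0];
const sdpMedia = [
  `m=video 5004 RTP/AVP ${videoCodec.payloadType}`,
  `a=rtpmap:${videoCodec.payloadType} ${videoCodec.mimeType.split('/')[1]}/${videoCodec.clockRate}`,
  'a=sendonly',
].join('\n');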
Step 3: FFmpeg Transcoding Pipeline
const ffmpegArgs = [
  '-protocol_whitelist', 'file,udp,rtp',   // required to read a local SDP file plus RTP over UDP
  '-f', 'sdp', '-i', sdpFilePath,
  '-filter_complex', filterComplex,
  '-map', '[vout]',                        // use the composited video from the filter graph
  '-map', '0:a?',                          // pass audio through if the SDP carries any
  '-c:v', 'libx264',
  '-preset', 'ultrafast',                  // minimize encoding CPU cost and delay
  '-tune', 'zerolatency',                  // no lookahead or frame buffering
  '-c:a', 'aac',                           // HLS-friendly audio codec
  '-f', 'hls',
  '-hls_time', '1',                        // 1-second segments
  '-hls_list_size', '3',                   // keep a short live playlist
  '-hls_flags', 'delete_segments+round_durations+independent_segments',
  outputPath
];
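The argument list is then handed to a child process. A minimal sketch, assuming the ffmpeg binary is on the server's PATH:
import { spawn } from 'child_process';

const ffmpeg = spawn('ffmpeg', ffmpegArgs);
ffmpeg.stderr.on('data', (data) => console.log(`[ffmpeg] ${data}`)); // FFmpeg writes its log to stderr
ffmpeg.on('exit', (code) => console.log(`FFmpeg exited with code ${code}`));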
3. Handling Multiple Video Streams
One interesting challenge was compositing multiple video streams for HLS output:
// Single participant: simple scaling
if (videoProducers.length === 1) {
  filterComplex += '[0:v:0]scale=1280:720[vout];';
}
// Multiple participants: side-by-side layout
else if (videoProducers.length === 2) {
  filterComplex += '[0:v:0]scale=640:720[v0];';
  filterComplex += '[0:v:1]scale=640:720[v1];';
  filterComplex += '[v0][v1]hstack=inputs=2[vout];';
}
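The same approach scales to larger groups. For example, a 2x2 grid for four participants could be built with xstack (a hypothetical extension, not part of the current layout logic):
// Four participants: 2x2 grid (each tile 640x360 of a 1280x720 canvas)
let filterComplex = '';
for (let i = 0; i < 4; i++) {
  filterComplex += `[0:v:${i}]scale=640:360[v${i}];`;
}
filterComplex += '[v0][v1][v2][v3]xstack=inputs=4:layout=0_0|640_0|0_360|640_360[vout];';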
Key Technical Challenges and Solutions
1. Codec Compatibility
Problem: WebRTC typically uses VP8/VP9, while HLS prefers H.264.
Solution: Real-time transcoding with FFmpeg, optimized for low-latency with ultrafast preset and zerolatency tuning.
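If transcoding cost becomes a problem, one option is to negotiate H.264 end-to-end so FFmpeg can often copy the video stream instead of re-encoding it (compositing multiple participants still forces a decode, though). A sketch of a mediasoup router that offers H.264, assuming the participating browsers support it:
const router = await worker.createRouter({
  mediaCodecs: [
    { kind: 'audio', mimeType: 'audio/opus', clockRate: 48000, channels: 2 },
    {
      kind: 'video',
      mimeType: 'video/H264',
      clockRate: 90000,
      parameters: {
        'packetization-mode': 1,
        'profile-level-id': '42e01f',     // Constrained Baseline, broadly supported
        'level-asymmetry-allowed': 1,
      },
    },
  ],
});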
2. Synchronization Issues
Problem: Audio and video streams could drift out of sync during transcoding.
Solution: Careful RTP timestamp handling and periodic keyframe requests:
const keyFrameInterval = setInterval(() => {
  consumers.forEach((consumer) => {
    if (consumer && consumer.kind === 'video' && !consumer.closed) {
      consumer.requestKeyFrame();   // ask the producer for a fresh keyframe
    }
  });
}, 4000);
3. Latency Optimization
Problem: Each step in the pipeline adds latency.
Solution (the player side needs matching tuning; see the sketch after this list):
- 1-second HLS segments (vs typical 4-6 seconds)
- Aggressive FFmpeg settings for minimal buffering
- Direct RTP forwarding without unnecessary re-encoding
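On the watch page, the player has to cooperate: with 1-second segments, default hls.js buffering would add several seconds back. A sketch of a tighter configuration, assuming hls.js as the player (playlist path and buffer values are illustrative):
import Hls from 'hls.js';

const hls = new Hls({
  liveSyncDurationCount: 2,   // stay ~2 segments behind the live edge
  maxBufferLength: 4,         // seconds of forward buffer to keep
});
hls.loadSource('/hls/stream.m3u8');   // hypothetical playlist URL served by Express
hls.attachMedia(videoElement);        // videoElement: the <video> tag on the watch page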
4. Resource Management
Problem: FFmpeg processes and media transports need proper cleanup.
Solution: Comprehensive cleanup logic on client disconnect, sketched here with an assumed per-socket peers map, FFmpeg process handle, and HLS output directory:
socket.on('disconnect', () => {
  const peer = peers.get(socket.id);                          // assumed per-socket state
  peer?.producers.forEach((producer) => producer.close());    // closing a producer closes its consumers
  peer?.transports.forEach((transport) => transport.close());
  peers.delete(socket.id);
  ffmpegProcess?.kill('SIGINT');                              // let FFmpeg finalize the playlist
  fs.rmSync(hlsOutputDir, { recursive: true, force: true });  // remove leftover segments
});
Development Experience and Learnings
The Good
- Mediasoup: Excellent documentation and TypeScript support made WebRTC much more manageable
- FFmpeg flexibility: The filter complex system is incredibly powerful for video composition
- Next.js integration: Seamless development experience with both frontend and backend in one project
The Challenging
- Debugging media issues: When streams don't work, it's often unclear whether the fault lies in WebRTC, the SFU, FFmpeg, or the network
- Platform differences: Media handling varies significantly between browsers and devices
- Resource intensive: Multiple FFmpeg processes can quickly consume system resources
Development Setup
The project uses a simple npm script setup built on concurrently:
{
  "scripts": {
    "dev": "concurrently \"npm run server\" \"npm run next\"",
    "server": "ts-node server.ts",
    "next": "next dev"
  }
}
This allows both the Next.js frontend and Node.js backend to run simultaneously during development.
Performance Considerations
CPU Usage
FFmpeg transcoding is CPU-intensive. For production, consider:
- Hardware-accelerated encoding (VAAPI, NVENC); see the sketch after this list
- Multiple quality streams for adaptive bitrate
- Load balancing across multiple servers
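As an example of the first point, moving from libx264 to NVENC is mostly a matter of swapping the encoder arguments; a sketch assuming an NVIDIA GPU and an FFmpeg build with NVENC support:
// Hardware H.264 encoding on NVIDIA GPUs (replaces the libx264 arguments above)
const encoderArgs = [
  '-c:v', 'h264_nvenc',
  '-preset', 'p1',      // fastest NVENC preset
  '-tune', 'll',        // low-latency tuning
  '-bf', '0',           // no B-frames, keeps encode delay down
];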
Memory Management
- Proper cleanup of MediaStream objects
- FFmpeg process monitoring
- WebRTC connection state management
Network Optimization
- STUN/TURN server configuration for WebRTC (see the sketch after this list)
- CDN integration for HLS delivery
- Bandwidth adaptation based on participant count
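On the STUN/TURN point: since mediasoup answers ICE itself (ice-lite), STUN/TURN servers are supplied on the client when the transport is created. A sketch with placeholder TURN credentials:
const sendTransport = device.createSendTransport({
  ...transportParams,   // id, iceParameters, iceCandidates, dtlsParameters from the server
  iceServers: [
    { urls: 'stun:stun.l.google.com:19302' },
    { urls: 'turn:turn.example.com:3478', username: 'user', credential: 'secret' }, // placeholders
  ],
});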
Future Enhancements
This foundation opens up many possibilities:
- Multi-quality HLS streams for adaptive bitrate
- Recording capabilities by extending the FFmpeg pipeline
- Chat integration alongside video streams
- Authentication and room management
- Mobile app support with React Native
Conclusion
Building a WebRTC to HLS bridge taught me that modern video streaming is beautifully complex. The intersection of real-time communication protocols, media processing, and web technologies creates fascinating engineering challenges.
The key insight is that you don't need to choose between WebRTC and HLS; you can have both. WebRTC provides the interactive, low-latency experience for active participants, while HLS enables scalable broadcasting to passive viewers.
If you're interested in video streaming technology, I'd highly recommend diving into projects like this. The combination of Mediasoup, FFmpeg, and modern web frameworks provides a powerful toolkit for building next-generation streaming applications.
The complete source code for this project demonstrates practical implementations of WebRTC SFU architecture, real-time media processing, and HLS streaming—all tied together in a modern TypeScript/Next.js application.