P2P Technology (3) Complete Implementation of WebRTC and AWS KVS - In-depth Tutorial on Real-time Communication and Streaming Technology
What is WebRTC?
WebRTC (Web Real-Time Communication) is an open-source project initiated by Google and standardized by the W3C and IETF. It is also a set of APIs built into browsers. Its goal is to let developers implement real-time voice, video, and data communication in the browser without installing any plugins.
In the previous two articles, we learned the theoretical foundations of P2P and core technologies like STUN, TURN, and ICE. WebRTC is the practical framework that integrates these technologies. Under the hood, it uses SDP to describe sessions and ICE, STUN, and TURN to traverse NATs and establish P2P connections.
What Does a Signaling Server Do?
In the WebRTC architecture, the signaling server plays the role of a “matchmaker” or “introducer”. Its main task is to help two endpoints exchange the basic information they need to establish a P2P connection.
The signaling server is responsible for exchanging:
- SDP (Session Description Protocol): Describes media formats and transport parameters
- ICE Candidates: Various possible connection path information
- Control Signals: Call start, end, and other state management
Important Concept: The signaling server only takes part in the preliminary work of connection establishment. Once the P2P connection is established, audio and video data travel directly between the two endpoints and no longer pass through the signaling server.
As for implementation, the WebRTC standard does not mandate any specific signaling mechanism. You are free to use WebSocket, HTTP, MQTT, or another protocol.
TIP
The signaling server does not carry audio/video data. It only handles the exchange of connection information, so you can choose its communication protocol based on your requirements.
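To make the “matchmaker” role concrete, here is a minimal sketch of what a signaling server does: it forwards small messages between peers and never touches media. The `SignalingRelay` class, the message field names, and the placeholder payloads are all hypothetical choices of mine, not part of any WebRTC API. A real server would do the same forwarding over WebSocket, HTTP, or MQTT instead of in-memory lists.

```python
import json
from collections import defaultdict


class SignalingRelay:
    """Toy in-memory stand-in for a signaling server: it only forwards messages."""

    ALLOWED_TYPES = ("offer", "answer", "candidate", "bye")

    def __init__(self):
        # peer id -> queued JSON strings waiting to be delivered
        self.inboxes = defaultdict(list)

    def send(self, sender, recipient, kind, payload):
        if kind not in self.ALLOWED_TYPES:
            raise ValueError(f"unknown signaling message type: {kind}")
        message = {"from": sender, "type": kind, "payload": payload}
        self.inboxes[recipient].append(json.dumps(message))

    def receive(self, peer):
        # Deliver and clear everything queued for this peer.
        messages = [json.loads(raw) for raw in self.inboxes[peer]]
        self.inboxes[peer].clear()
        return messages


relay = SignalingRelay()
# Payloads below are placeholders, not real SDP or candidate strings.
relay.send("alice", "bob", "offer", {"sdp": "<alice's SDP offer>"})
relay.send("alice", "bob", "candidate", {"candidate": "<alice's ICE candidate>"})
relay.send("bob", "alice", "answer", {"sdp": "<bob's SDP answer>"})

bob_inbox = relay.receive("bob")
alice_inbox = relay.receive("alice")
print([m["type"] for m in bob_inbox])
print([m["type"] for m in alice_inbox])
```

Note that the relay never inspects the payloads. That is why the choice of transport protocol for signaling is left entirely to the application.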
What is SDP?
SDP (Session Description Protocol) is a format for describing multimedia sessions. It was first specified in RFC 2327 and is currently defined by RFC 8866. You can think of it as the “communication rules” that two people agree on before they start talking.
SDP’s main function is to define all parameters of media streams in detail, including:
- Media Formats: Which audio and video codecs are supported
- Transport Parameters: Network protocols and port ranges used
- Connection Information: IP addresses, ports, and other network connection details
- Media Attributes: The direction of each stream, that is, whether an endpoint both sends and receives, only sends, or only receives
Simply put, SDP allows two endpoints that want to conduct WebRTC communication to understand each other: “What can you support? What can I support? How should we communicate?”
Here’s an SDP example (the sample session description from RFC 2327):
v=0
o=mhandley 2890844526 2890842807 IN IP4 126.16.64.4
s=SDP Seminar
i=A Seminar on the session description protocol
u=http://www.cs.ucl.ac.uk/staff/M.Handley/sdp.03.ps
e=mjh@isi.edu (Mark Handley)
c=IN IP4 224.2.17.12/127
t=2873397496 2873404696
a=recvonly
m=audio 49170 RTP/AVP 0
m=video 51372 RTP/AVP 31
m=application 32416 udp wb
a=orient:portrait
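To show how this text is structured, here is a small sketch that splits the example into session-level fields and one section per `m=` line. The `parse_sdp` function is a hypothetical helper written for this article, not a library API, and it deliberately ignores many details of the real grammar (repeated fields, bandwidth lines, and so on).

```python
def parse_sdp(text):
    """Split SDP text into session-level fields and a list of media sections."""
    session = {"attributes": []}
    media = []
    current = session  # lines before the first m= line are session-level
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition("=")
        if key == "m":
            kind, port, proto, *formats = value.split()
            current = {
                "kind": kind,
                "port": int(port.split("/")[0]),  # drop an optional "/count" suffix
                "proto": proto,
                "formats": formats,
                "attributes": [],
            }
            media.append(current)
        elif key == "a":
            # a= lines attach to the section they appear in.
            current["attributes"].append(value)
        else:
            current[key] = value
    return session, media


SDP_EXAMPLE = """\
v=0
o=mhandley 2890844526 2890842807 IN IP4 126.16.64.4
s=SDP Seminar
i=A Seminar on the session description protocol
u=http://www.cs.ucl.ac.uk/staff/M.Handley/sdp.03.ps
e=mjh@isi.edu (Mark Handley)
c=IN IP4 224.2.17.12/127
t=2873397496 2873404696
a=recvonly
m=audio 49170 RTP/AVP 0
m=video 51372 RTP/AVP 31
m=application 32416 udp wb
a=orient:portrait
"""

session, media = parse_sdp(SDP_EXAMPLE)
print(session["s"], session["attributes"])
for section in media:
    print(section["kind"], section["port"], section["proto"], section["formats"])
```

The key point is the scoping rule: `a=recvonly` appears before any `m=` line, so it applies to the whole session, while `a=orient:portrait` belongs only to the last media section.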
What is an ICE Candidate?
An ICE candidate is a very important concept in WebRTC. It represents a “candidate connection path”. You can think of candidates as the different routes you could take from your home to a friend’s house.
An ICE candidate contains:
- IP Address: Could be internal IP, public IP, or TURN server IP
- Port Number: Port used for connection
- Transport Protocol: Different transport methods like UDP, TCP
- Candidate Type: host (local), srflx (obtained via STUN), relay (TURN relay)
Generation Process:
Each time WebRTC initiates a connection, it automatically gathers multiple candidates for each network interface (network card): local (host) addresses, public addresses obtained via STUN, TURN relay addresses, and so on. After the two endpoints exchange their candidate lists, they perform connectivity tests and select the best transmission path.
ICE Candidate Format Example:
{
"sdpMLineIndex": 0,
"sdpMid": "",
"candidate": "a=candidate:2999745851 1 udp 2113937151 192.168.56.1 51411 typ host generation 0"
}
Let’s parse this candidate string:
- 2999745851: foundation, an identifier shared by candidates of the same type and base address (often loosely called the candidate ID)
- 1: component ID (usually 1 is RTP, 2 is RTCP)
- udp: transport protocol
- 2113937151: priority (higher number = higher priority)
- 192.168.56.1: IP address
- 51411: port
- typ host: candidate type (host = local address)
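The same breakdown can be done in code. The sketch below is a hypothetical `parse_candidate` helper of my own, not a WebRTC API. It assumes the first six fields are always present and treats everything after them as name/value pairs such as `typ host` or `generation 0`.

```python
def parse_candidate(line):
    """Parse an ICE candidate attribute line into a dictionary."""
    # Accept both "a=candidate:..." and "candidate:..." spellings.
    if line.startswith("a="):
        line = line[2:]
    if not line.startswith("candidate:"):
        raise ValueError("not an ICE candidate line")

    fields = line[len("candidate:"):].split()
    foundation, component, transport, priority, address, port = fields[:6]
    parsed = {
        "foundation": foundation,
        "component": int(component),
        "transport": transport.lower(),
        "priority": int(priority),
        "address": address,
        "port": int(port),
    }
    # Remaining tokens come in name/value pairs, e.g. "typ host".
    extras = fields[6:]
    for name, value in zip(extras[0::2], extras[1::2]):
        parsed[name] = value
    return parsed


candidate = parse_candidate(
    "a=candidate:2999745851 1 udp 2113937151 192.168.56.1 51411 typ host generation 0"
)
for name, value in candidate.items():
    print(f"{name}: {value}")
```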
Exchange and Selection Process:
- Two endpoints exchange their respective Candidate lists through Signaling Server
- WebRTC uses ICE mechanism to perform connection testing on all possible paths
- Finally, select the highest-priority path that passed the tests, which is normally the most direct one and therefore tends to have the lowest latency
This process is like finding the fastest and most stable route among multiple roads to reach the destination.
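Where does a priority value like 2113937151 come from? RFC 8445 recommends the formula priority = 2^24 × type preference + 2^8 × local preference + (256 − component ID), with recommended type preferences of 126 for host, 110 for peer-reflexive, 100 for server-reflexive, and 0 for relayed candidates. The sketch below applies that formula and also reverses it. The function names are my own, and the local preference of 30 is simply what falls out of decomposing the example value, not a value the standard prescribes.

```python
# Type preferences recommended by RFC 8445.
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type, local_preference, component_id):
    """priority = 2^24 * type_pref + 2^8 * local_pref + (256 - component_id)"""
    return (
        (TYPE_PREFERENCE[cand_type] << 24)
        + (local_preference << 8)
        + (256 - component_id)
    )


def decompose_priority(priority):
    """Recover (type_pref, local_pref, component_id) from a priority value."""
    type_pref = priority >> 24
    local_pref = (priority >> 8) & 0xFFFF
    component_id = 256 - (priority & 0xFF)
    return type_pref, local_pref, component_id


# The priority from the candidate example in this article.
print(decompose_priority(2113937151))

# Same local preference and component, different candidate types.
for cand_type in ("host", "srflx", "relay"):
    print(cand_type, candidate_priority(cand_type, 30, 1))
```

Because the type preference occupies the most significant bits, a host candidate always outranks a server-reflexive one, which in turn always outranks a relay. This is how ICE ends up preferring direct paths.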
WebRTC Connection Establishment Flow

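The WebRTC connection establishment process seems complex, but it can be organized into the following four phases. In practice the phases overlap: candidates are usually sent to the other side as soon as they are gathered rather than all at once.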
Phase 1: Message Exchange (Signaling)
- Both endpoints connect to Signaling Server
- Exchange SDP information (media capabilities and desired communication parameters)
- Exchange ICE Candidates (various possible connection paths)
Phase 2: Network Discovery
- Each endpoint queries a STUN server to learn its public IP address and port as seen from outside the NAT
- Collect local network interfaces, public addresses obtained via STUN, etc.
Phase 3: Connectivity Checks
- ICE mechanism performs connection testing on all Candidate combinations
- Prefer direct P2P paths (host candidates and public addresses obtained via STUN)
- If no direct path works, fall back to a TURN relay connection
Phase 4: Connection Establishment
- Select the best path from all successful connections
- Establish stable bidirectional communication channel
- Begin media stream transmission (audio, video, data)
The entire process is designed to ensure connections can be established in various network environments while prioritizing the most efficient communication methods.
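Phases 3 and 4 can be sketched as “form every local/remote pair, sort by pair priority, take the first pair that works”. The pair priority formula below is the one from RFC 8445: 2^32 × min(G, D) + 2 × max(G, D) + (1 if G > D else 0), where G and D are the candidate priorities of the controlling and controlled agents. Everything else is an assumption for illustration: the candidate dictionaries, the `is_reachable` callback that stands in for real STUN connectivity checks, and treating the local side as the controlling agent. Real ICE also runs checks incrementally rather than scanning a sorted list.

```python
from itertools import product


def pair_priority(controlling, controlled):
    """RFC 8445 pair priority from the two candidate priorities."""
    g, d = controlling, controlled
    return (min(g, d) << 32) + 2 * max(g, d) + (1 if g > d else 0)


def select_pair(local, remote, is_reachable):
    """Return the highest-priority reachable (local, remote) pair, or None."""
    pairs = sorted(
        product(local, remote),
        key=lambda p: pair_priority(p[0]["priority"], p[1]["priority"]),
        reverse=True,
    )
    for local_cand, remote_cand in pairs:
        if is_reachable(local_cand, remote_cand):
            return local_cand, remote_cand
    return None


def make_candidates(prefix):
    # Priorities follow the RFC 8445 formula with local preference 30, component 1.
    return [
        {"type": "host", "address": f"{prefix}-lan", "priority": 2113937151},
        {"type": "srflx", "address": f"{prefix}-public", "priority": 1677729535},
        {"type": "relay", "address": f"{prefix}-turn", "priority": 7935},
    ]


local, remote = make_candidates("alice"), make_candidates("bob")

# Scenario 1: every path works, so the direct host/host pair wins.
open_network = select_pair(local, remote, lambda l, r: True)

# Scenario 2: only paths that involve a TURN relay work.
strict_nat = select_pair(
    local, remote, lambda l, r: "relay" in (l["type"], r["type"])
)

print([c["type"] for c in open_network])
print([c["type"] for c in strict_nat])
```

In the second scenario the winning pair combines a local host candidate with the remote relay candidate, meaning traffic flows through the other side’s TURN server. The relay is only used once the higher-priority direct pairs have failed.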
What is AWS KVS?
Amazon Kinesis Video Streams with WebRTC (abbreviated here as KVS) is AWS’s fully managed WebRTC service. It addresses the challenges developers face when building and operating WebRTC infrastructure themselves.
Core Advantages of KVS:
Complete Infrastructure
- Signaling Server: Built-in WebSocket-based signaling server
- STUN/TURN Services: Globally distributed NAT traversal infrastructure
- Load Balancing: Automatically handles high-concurrency connection requests
Enterprise-grade Security
- IAM Integration: Seamless integration with AWS identity authentication system
- Data Encryption: Media is encrypted in transit between peers (WebRTC mandates DTLS-SRTP), and signaling runs over secure WebSocket
- Access Control: Fine-grained permission management
Multi-platform Support
- Web: Native JavaScript SDK
- iOS: Native Swift/Objective-C SDK
- Android: Native Java/Kotlin SDK
Use Cases:
KVS is particularly suitable for scenarios that need quick deployment without maintaining complex infrastructure, such as remote monitoring, IoT device control, and online education platforms. By integrating the corresponding SDK, you can set up a stable bidirectional audio-video streaming service in a short time.
TIP
KVS is a good fit for IoT, remote monitoring, IP cameras, and similar scenarios. You do not need to maintain signaling or relay servers yourself, which saves significant development and operational cost.
Results Demonstration
Below are the streaming results on iOS and Android after successful implementation:


Practical Gotcha Notes
During development, I encountered some technical issues worth sharing, hoping to help other developers avoid the same troubles.
AWS KVS WebRTC Android SDK Issue
Problem Description: When implementing AWS KVS WebRTC on Android, I replaced the officially recommended WebSocket library, Tyrus, with the more commonly used OkHttp. The connection then failed at runtime with a 403 Forbidden error.
Root Cause: The problem turned out to be double URL encoding. OkHttp automatically encodes the URL it is given, but the AWS Signature Version 4 signature has already been calculated over the original URL. After the second round of encoding, the URL no longer matches what was signed, so signature verification fails.
Solution: Either make sure the URL is not encoded a second time when using OkHttp, or stick with the officially recommended Tyrus library.
Related Resources: 🔗 GitHub Issue #74
This example reminds us that when using third-party libraries, we need to pay special attention to their internal implementation details, which may affect cloud service authentication mechanisms.
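The effect is easy to reproduce outside Android. The sketch below is not real Signature Version 4 and does not use OkHttp. It signs a single query parameter with a plain HMAC to show why encoding an already-encoded URL breaks verification. The secret and the credential string are made-up example values.

```python
import hashlib
import hmac
from urllib.parse import quote


def sign(secret, canonical_query):
    """Simplified stand-in for a request signature: HMAC-SHA256 over the query."""
    return hmac.new(secret, canonical_query.encode(), hashlib.sha256).hexdigest()


secret = b"example-secret"  # hypothetical key, for illustration only
credential = "AKIDEXAMPLE/20240101/us-west-2/kinesisvideo/aws4_request"

# The signer percent-encodes the value once and signs that form.
encoded_once = "X-Amz-Credential=" + quote(credential, safe="")
signature = sign(secret, encoded_once)

# An HTTP client that encodes the URL again turns every "%" into "%25".
encoded_twice = "X-Amz-Credential=" + quote(quote(credential, safe=""), safe="")

print(encoded_once)
print(encoded_twice)
print("signature still valid:", sign(secret, encoded_twice) == signature)
```

Each `/` becomes `%2F` after the first pass and `%252F` after the second. The server recomputes the signature over what it actually received, gets a different value, and rejects the request.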
Series Summary: Complete P2P Technology Map
Through three in-depth articles, we have established a complete P2P technology knowledge system:
First Article: Basic Architecture and NAT Problems
- Understood the differences between centralized, decentralized, and distributed architectures
- Grasped the relationship between IPv4 address scarcity and NAT technology
- Learned characteristics and limitations of various NAT types
Second Article: Core NAT Traversal Technologies
- STUN: Solves the address discovery problem, letting devices learn their public IP and port
- TURN: Provides relay services, solving connection issues in strict NAT environments
- ICE: Integration framework, intelligently selecting optimal communication paths
Third Article: Practical Frameworks and Cloud Services
- WebRTC: Mainstream standard implementation for modern P2P communication
- SDP and ICE Candidates: Core information exchange mechanisms for connection establishment
- AWS KVS: Commercial-grade WebRTC cloud solution
Technology Development Context:
P2P Requirements (Basic Architecture) → NAT Problems (Traversal Challenges) → STUN/TURN/ICE (Core Technologies) → WebRTC (Practical Framework) → Cloud Services (Commercial Applications)
This technology stack forms a complete P2P communication solution from underlying network principles to cloud service applications. Understanding the development context of these technologies helps in choosing appropriate technical solutions for different scenarios.
TIP
If you encounter implementation bottlenecks while developing WebRTC or integrating AWS KVS, feel free to comment or email for discussion. I will continue to organize practical experience to help more developers.