Sandro Gauci, Enable Security

Alfred Farrugia, Enable Security

A Novel DoS Vulnerability affecting WebRTC Media Servers

Last updated on Jun 25, 2024

Executive summary (TL;DR)

A critical denial-of-service (DoS) vulnerability has been identified in media servers that process WebRTC’s DTLS-SRTP, specifically in their handling of ClientHello messages. This vulnerability arises from a race condition between ICE and DTLS traffic and can be exploited to disrupt media sessions, compromising the availability of real-time communication services. Mitigations include filtering packets based on ICE-validated IP and port combinations. The article also describes safe testing methods and strategies for detecting the attack.


The Significance of DoS in WebRTC

In the early years when we started focusing on VoIP penetration testing, we were most concerned about vulnerabilities leading to toll fraud, illegal wiretapping, or full remote compromise. While denial-of-service (DoS) vulnerabilities were a concern, they didn’t always receive the same level of attention. However, with real-time communications, constant availability is crucial. In the RTC industry, every millisecond is critical, and downtime results in a total loss of value for that period.

Recognizing the critical nature of DoS vulnerabilities, especially based on client feedback, we started identifying these vulnerabilities across various areas, particularly application DoS. Unlike volumetric DoS attacks, which can often be mitigated by rate limiting, network scrubbing, and increasing bandwidth, application-level DoS vulnerabilities usually require code changes, making the solutions relatively inexpensive. However, addressing application-level DoS requires significant knowledge, skill, and intuition, which makes this problem area particularly interesting to us.

Even more intriguing are DoS vulnerabilities arising from protocol security issues or the interaction between different components or network protocols. The vulnerability discussed in this blog post is one such example.

Introduction to WebRTC and DTLS for Security Professionals

This section provides a concise introduction to ensure that the rest of the article is clear and coherent.

WebRTC (Web Real-Time Communication) enables web browsers and mobile applications to facilitate real-time voice, video, and data communication. Although WebRTC supports peer-to-peer connections, most platforms utilize media servers to relay voice and video between parties. This blog post focuses on the latter scenario. Note that for brevity, we will often refer to voice and video collectively as media.

On the web browser’s end, WebRTC functionality is implemented via browser APIs such as getUserMedia and RTCPeerConnection, which grant web applications access to the user’s camera and microphone, allowing streams to be captured and used for audio and video communication.

To put it in practical terms, when a user starts a video conference on a service like Google Meet, the web page will use the WebRTC browser APIs to trigger this functionality. The web browser then uses several components to start sending audio and video. This process relies on the Secure Real-time Transport Protocol (SRTP), which carries the media and operates over UDP. However, before the browser can encrypt and decrypt SRTP, several security steps must be completed. Let’s outline the key steps and security design decisions for WebRTC voice and video streaming.

As Tsahi Levent-Levi highlighted in a 2022 article, WebRTC is the most secure open standard VoIP protocol. This is because confidentiality and integrity are part of its design. A web browser will not initiate a voice or video call unless the web page is served over HTTPS. The media travels over SRTP, which ensures encryption, message authentication, and integrity protection. Additionally, the master keys for the SRTP streams are established using DTLS. Before this exchange, an ICE check using STUN (Session Traversal Utilities for NAT) provides authentication and message integrity. In WebRTC, these protocols typically flow through the same port, and the media server at the other end multiplexes and handles them accordingly.
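To make the multiplexing step concrete, here is a minimal sketch of how a receiver can tell these protocols apart on a shared port by looking at the first byte of each datagram, using the ranges defined in RFC 7983. The function name and return labels are our own choices for illustration, not taken from any particular media server.

```python
def classify_packet(datagram: bytes) -> str:
    """Classify a UDP payload by its first byte (ranges from RFC 7983)."""
    if not datagram:
        return "empty"
    first = datagram[0]
    if first <= 3:
        return "stun"          # ICE connectivity checks
    if 20 <= first <= 63:
        return "dtls"          # handshake, alerts, application data
    if 128 <= first <= 191:
        return "rtp_or_rtcp"   # SRTP and SRTCP share this range
    return "other"             # e.g. ZRTP (16-19) or TURN channel data (64-79)


print(classify_packet(b"\x16\xfe\xfd"))  # DTLS handshake record
```

Note that this classification says nothing about who sent the packet, which is the property the rest of this post is concerned with.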

Given that this vulnerability is related to DTLS, we’ll also introduce the protocol briefly. DTLS (Datagram Transport Layer Security) is based on TLS and is designed to provide privacy and data integrity for datagram-based applications using UDP. It adapts TLS to transports where packets can be lost or reordered and where low latency matters, while still providing message confidentiality, authenticity, and integrity. This makes it well suited to real-time applications such as VoIP (including WebRTC), online gaming, IoT, and VPN services.

Client                            Server
  |                                  |
  |------ STUN Binding Request ----->|
  |                                  |
  |<----- STUN Binding Success ------|
  |                                  |
  |------ DTLS ClientHello --------->|
  |                                  |
  |<----- DTLS ServerHello ----------|
  |                                  |
  |<----- DTLS Certificate ----------|
  |                                  |
  |<----- DTLS ServerKeyExchange ----|
  |                                  |
  |<----- DTLS ServerHelloDone ------|
  |                                  |
  |------ DTLS Certificate --------->|
  |                                  |
  |          ...                     |
  |                                  |
  |<----- SRTP traffic ------------->|

The accompanying diagram shows the DTLS flow, which closely resembles TLS. Our focus is on the ClientHello DTLS message, the initial message in the DTLS flow. This message includes the supported cipher suites, which play an important part in this attack.
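For readers who want to see where the cipher suites sit on the wire, the following is a rough sketch of a parser that pulls the cipher suite list out of a DTLS 1.0/1.2 ClientHello. It assumes the message arrives unfragmented in a single record at the start of the datagram; the function name is hypothetical, and a production parser would need stricter length checks.

```python
import struct
from typing import List, Optional

RECORD_HEADER_LEN = 13     # type(1) version(2) epoch(2) sequence(6) length(2)
HANDSHAKE_HEADER_LEN = 12  # type(1) length(3) msg_seq(2) frag_offset(3) frag_len(3)
CONTENT_TYPE_HANDSHAKE = 22
HANDSHAKE_CLIENT_HELLO = 1


def client_hello_cipher_suites(datagram: bytes) -> Optional[List[int]]:
    """Return the offered cipher suite IDs, or None if this is not a ClientHello."""
    try:
        if datagram[0] != CONTENT_TYPE_HANDSHAKE:
            return None
        if datagram[RECORD_HEADER_LEN] != HANDSHAKE_CLIENT_HELLO:
            return None
        pos = RECORD_HEADER_LEN + HANDSHAKE_HEADER_LEN
        pos += 2 + 32                 # client_version + random
        pos += 1 + datagram[pos]      # session_id (1-byte length prefix)
        pos += 1 + datagram[pos]      # cookie (1-byte length prefix)
        (length,) = struct.unpack_from("!H", datagram, pos)
        pos += 2
        if length % 2 or pos + length > len(datagram):
            return None
        raw = datagram[pos:pos + length]
        return [suite for (suite,) in struct.iter_unpack("!H", raw)]
    except (IndexError, struct.error):
        return None  # truncated or malformed input
```

A result of `[0x0000]` would correspond to a ClientHello that offers only TLS_NULL_WITH_NULL_NULL.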

In WebRTC media servers, an ephemeral port is often allocated on the media server for each specific media session. Here, a DTLS error might cause the server to terminate that particular user’s media session. This behavior is also a critical element in this attack.

While this overview omits many details, it should help you better understand the subsequent sections of this post.

Details of the Vulnerability and Potential Exploitation

Having covered the basics, we can now delve into the core topic: how does the vulnerability arise, and how could it be exploited?

When a user initiates a WebRTC voice and video call, the media server allocates UDP ports to handle the media streams. The IP and port combination of the media server is communicated to the user through signaling. The user’s web browser then uses ICE media consent verification to determine how to reach the media server, employing STUN in the process. After a successful STUN process, a DTLS session is initiated to establish the SRTP master keys, followed by the switch to SRTP for delivering the media stream.

The vulnerability arises from a race condition between the ICE media consent verification and the initiation of DTLS traffic. An attacker could potentially send a DTLS ClientHello message before the legitimate user does. The ClientHello message, which includes a list of supported cipher suites, can be manipulated by the attacker to offer only a cipher suite that the server will not accept, such as TLS_NULL_WITH_NULL_NULL. This triggers a DTLS-level error on a vulnerable media server, which then prevents the establishment of the SRTP master keys, effectively blocking the SRTP session.

[Figure: DTLS cipher suites set in the ClientHello message]

For the attack to succeed, the attacker must guess the UDP ports on the media server that are handling incoming media sessions. Attackers have an advantage because they can continuously scan media servers by sending UDP packets to all ports designated for media. Each packet contains a ClientHello message with the null cipher suite to exploit the vulnerability. If the attack is successful, the media streams will not be established.

Thus an attack might look like the following diagram:

Client                            Server                             Attacker
  |                                  |<-------------- ClientHello ------|
  |------ STUN Binding Request ----->|<-------------- ClientHello ------|
  |                                  |<-------------- ClientHello ------|
  |<----- STUN Binding Success ------|<-------------- ClientHello ------|
  |                                  |<-------------- ClientHello ------|
  |------ DTLS ClientHello --------->|<-------------- ClientHello ------|
  |                                  |<-------------- ClientHello ------|
  |<---- Alert (Handshake failure) --|                                  |
  |      ^^ (implementation specific)|                                  |
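The sequence in the diagram can be modelled without any network traffic. The sketch below is a toy model only: `ToyMediaPort`, its states, and the accepted cipher suite set are hypothetical and do not describe any specific media server. It shows why a single shared DTLS state per ephemeral port lets whoever sends the first ClientHello decide the outcome.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

Addr = Tuple[str, int]
SERVER_SUITES: Set[int] = {0xC02B, 0xC02F}  # assumed suites the toy server accepts


@dataclass
class ToyMediaPort:
    """Hypothetical ephemeral media port with one shared DTLS state."""
    dtls_state: str = "waiting"
    events: List[str] = field(default_factory=list)

    def on_client_hello(self, src: Addr, offered: List[int]) -> None:
        if self.dtls_state == "failed":
            self.events.append(f"{src}: dropped, session already failed")
        elif SERVER_SUITES.intersection(offered):
            self.dtls_state = "handshaking"
            self.events.append(f"{src}: handshake continues")
        else:
            # The source address is never compared with the ICE-validated
            # peer, so any sender can trigger the fatal handshake error.
            self.dtls_state = "failed"
            self.events.append(f"{src}: no shared cipher suite, session failed")


port = ToyMediaPort()
port.on_client_hello(("203.0.113.9", 40000), [0x0000])    # attacker wins the race
port.on_client_hello(("198.51.100.7", 51000), [0xC02B])   # legitimate browser
print(port.dtls_state)  # failed
```

If the two calls are swapped, the legitimate handshake starts first in this model, which is why the attack is a race rather than a guaranteed outcome.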

The Beauty of Vulnerabilities That Live Between Protocols

Following the discovery of this denial-of-service (DoS) vulnerability, we were left with two key questions:

  1. What type of DoS vulnerability is this?
  2. Does it affect all DTLS implementations, or is it a protocol design issue?

The Security Considerations for WebRTC (RFC 8826) document offers insights on similar attacks for ICE. Specifically, Section 4.2 on Communications Consent Verification states:

It is important to remember here that the site initiating ICE is presumed malicious; in order for the handshake to be secure, the receiving element MUST demonstrate receipt/knowledge of some value not available to the site (thus preventing the site from forging responses).

What is missing is a similar verification mechanism for DTLS. Unlike ICE, the initial DTLS messages do not include any form of verification that relies on previously exchanged credentials.

However, it seems this is not a vulnerability in the DTLS protocol itself. When DTLS operates as a server on a static port rather than an ephemeral port, it can handle multiple DTLS clients without stopping after a single client error, such as sending the null cipher in the ClientHello message. In such cases, only the DTLS session initiated by the erroneous party is terminated. This contrasts with scenarios where an ephemeral port is used to handle all media traffic, including both DTLS and SRTP, which we found vulnerable.

It is therefore a vulnerability that arises from the way DTLS is used in certain WebRTC media servers that use ephemeral UDP ports. It appears to stem from the assumption that, since ICE does not have this weakness, the DTLS traffic that follows inherits the same protection. That holds only when the ICE consent verification is used to filter out any network traffic that does not come from the verified IP and port combination. Thus the vulnerability is due to the gap between ICE (STUN) and DTLS in the WebRTC media establishment flow.

DTLS 1.2 supports the HelloVerifyRequest message to prevent certain attacks, but this protection only guards against attacks from spoofed IP addresses. RFC 6347 states:

This mechanism does not provide any defense against DoS attacks mounted from valid IP addresses.

DTLS 1.3 offers similar protection with the HelloRetryRequest, but with the same limitation, as our attack does not involve spoofed IP addresses.

We also attempted to categorize this DoS vulnerability. After extensive research, the closest match was CWE-703: Improper Check or Handling of Exceptional Conditions. However, the CWE entry itself discourages mapping vulnerabilities to this category.

How to Reproduce This Vulnerability Safely

In previous sections, we explained how an attacker could exploit this vulnerability against media servers. The attack is straightforward, using readily available tools like Scapy to replay DTLS ClientHello messages and target the range of ports allocated for media traffic on the servers.

As penetration testers, performing such an attack directly on live systems is often inadvisable, especially when those systems are in active use. We also needed to test publicly available platforms that lack dedicated test or lab environments. Therefore, we developed a safe methodology to conduct these tests without introducing risks.

Here’s the high-level process we followed:

  1. The victim user initiates a WebRTC media session (e.g., starts a conference call).
  2. The victim informs the attacker of the IP and port combination.
  3. The attacker sends packets only to those specific ports targeting the media server.
  4. The vulnerable media server would then prevent the victim from establishing the WebRTC media session.

In practice, we modified Chromium to act as the victim user while communicating with a remote attacker tool that sends the attack payload. This approach ensures that details such as signaling (often a custom protocol), the ICE 4-way handshake, the full DTLS flow, and SRTP are all managed automatically by the web browser code.

Specifically, we patched the JsepTransport::AddRemoteCandidates function. Whenever this function is called, it sends a POST request to the attacker tool, providing the IP and port combinations (candidate.address) of the media server. The attacker tool then starts sending DTLS ClientHello messages with the null cipher as the supported cipher suite.

This method proved effective, enabling us to safely test several public platforms. As a result, we were able to identify vulnerabilities and report them to the respective service providers.

When Are Implementations Not Vulnerable?

We’ve discussed how a WebRTC media server might be vulnerable to this security issue, but there are instances where this is not the case. Here are some scenarios where the vulnerability was absent:

  1. DTLS Server Configuration: When the web browser functions as the DTLS server, it expects the ClientHello message from the media server. In WebRTC, the initiator of the session can be either an active DTLS client or a passive DTLS server. If the browser acts as the DTLS server, the media server sends the ClientHello rather than receiving it, so the vulnerability cannot be reproduced against the media server. Similarly, the vulnerability was not reproducible against web browsers, which did not appear to be susceptible during our testing.

  2. Single Port Usage: When the media server uses a single port instead of ephemeral ports for handling media sessions. In this configuration, the media server must manage multiple states on the same port for DTLS. This design prevents media streams from interfering with each other, thereby avoiding the vulnerability.

  3. Security Fix Implementations: In many cases, we recommended developers implement specific security fixes to address the vulnerability. These solutions were effective in mitigating the issue.
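The single-port case in item 2 can be sketched as a state table keyed by the sender's address. This is an illustrative model under our own assumptions (the class name, state labels, and accepted cipher suites are hypothetical), showing only that a failed handshake from one source leaves other peers untouched.

```python
from typing import Dict, List, Set, Tuple

Addr = Tuple[str, int]
SERVER_SUITES: Set[int] = {0xC02B, 0xC02F}  # assumed suites the toy server accepts


class SharedPortDtls:
    """Toy single-port server that keeps separate DTLS state per source address."""

    def __init__(self) -> None:
        self.peers: Dict[Addr, str] = {}

    def on_client_hello(self, src: Addr, offered: List[int]) -> str:
        acceptable = bool(SERVER_SUITES.intersection(offered))
        # Only the entry for this sender changes; other peers are unaffected.
        self.peers[src] = "handshaking" if acceptable else "failed"
        return self.peers[src]


server = SharedPortDtls()
server.on_client_hello(("203.0.113.9", 40000), [0x0000])   # null cipher only
server.on_client_hello(("198.51.100.7", 51000), [0xC02B])  # legitimate browser
print(server.peers[("198.51.100.7", 51000)])  # handshaking
```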

Solutions to the Problem and Caveats

While examining Janus, the open-source WebRTC server, to determine whether it was vulnerable to this attack, we initially struggled to reproduce the issue. With assistance from the lead developer, we discovered that Janus uses libnice, which had implemented a solution to this vulnerability in version 0.1.15, released in December 2018.

The changelog notes:

Now drops all packets from addresses that have not been validated by an ICE check

This solution is what we had been recommending for other open-source solutions, such as Asterisk, FreeSWITCH, and RTP Engine, each of which released security fixes for this vulnerability. The fix involves trusting the ICE process and processing packets only from the IP and port combinations validated through ICE. This solution can be readily implemented by most existing vulnerable software without redesigning the media server.
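A minimal sketch of this filtering idea follows. It is not the libnice implementation or any vendor's patch; `IceValidatedFilter` and its method names are hypothetical. The assumption here is that STUN packets are always passed to the ICE agent (which authenticates them itself), while everything else is processed only if it comes from an address that already passed an ICE check.

```python
from typing import Set, Tuple

Addr = Tuple[str, int]


class IceValidatedFilter:
    """Drop non-STUN packets from sources that have not passed an ICE check."""

    def __init__(self) -> None:
        self._validated: Set[Addr] = set()

    def mark_validated(self, src: Addr) -> None:
        """Call after a connectivity check from src has succeeded."""
        self._validated.add(src)

    def should_process(self, src: Addr, datagram: bytes) -> bool:
        # First byte 0-3 identifies STUN (RFC 7983); ICE still needs to see it.
        is_stun = len(datagram) > 0 and datagram[0] <= 3
        return is_stun or src in self._validated


ice_filter = IceValidatedFilter()
ice_filter.mark_validated(("198.51.100.7", 51000))
print(ice_filter.should_process(("203.0.113.9", 40000), b"\x16\xfe\xfd"))  # False
```

With a check like this in front of the DTLS stack, a ClientHello from an unvalidated address never reaches the handshake state machine.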

A common concern is the risk of IP spoofing. While we agree that IP addresses or IP and port combinations should not be used as authentication mechanisms, we believe this is still a good solution for this specific problem. An attacker attempting to exploit this vulnerability by spoofing IP and port combinations would need to guess or know the correct combination. Brute-forcing all source and destination ports would itself amount to a volumetric DoS attack, which makes it impractical as a targeted exploit. The primary way to obtain the source IP and port combination is by being positioned as a man-in-the-middle (MITM). In such cases, various other methods can cause DoS due to the design of the involved protocols. Therefore, MITM scenarios are beyond the scope of this threat model, which aims to prevent attacks from remote attackers without access to the network traffic of the victim client and server.
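To get a feel for the scale behind the brute-force argument above, here is a back-of-the-envelope calculation. The 10,000-port media range and the 150-byte packet size are assumptions chosen for illustration, and the attacker is assumed to already know the victim's IP address, so only the source port and the destination port remain to be guessed.

```python
from typing import Tuple

SOURCE_PORTS = 2 ** 16  # upper bound on possible spoofed source ports


def spoofed_sweep_cost(media_ports: int, packet_bytes: int) -> Tuple[int, int]:
    """Packets and bytes needed to try every source/destination port pair once."""
    packets = SOURCE_PORTS * media_ports
    return packets, packets * packet_bytes


packets, total_bytes = spoofed_sweep_cost(media_ports=10_000, packet_bytes=150)
print(f"{packets:,} packets, about {total_bytes / 1e9:.1f} GB per sweep")
# 655,360,000 packets, about 98.3 GB per sweep
```

Under these assumptions a single sweep is already in the range of a volumetric attack, and it would have to be repeated for as long as new sessions are being set up.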

How the Attack Can Be Detected at the Network Level

Detecting this attack at the network level is potentially straightforward. One approach is to identify DTLS ClientHello messages that specify the null cipher, though this might not be ideal. A more effective method is to analyze metadata for IP addresses sending UDP packets to multiple ports on the media server. This generic detection method can identify attacks exploiting this vulnerability and similar ones, such as RTP inject and RTP bleed, which rely on RTP packets instead of DTLS.

However, if the attack is distributed across many different source IPs, it could evade detection based on the number of ports used by any single IP address.

Another method is to detect attackers sending large amounts of UDP packets to ports that are closed on the media servers.
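The per-source metadata approach can be sketched as a sliding-window counter of distinct destination ports. The class name, the 10-second window, and the threshold of 20 ports are all assumptions for illustration; real deployments would need to tune them and, as noted above, this heuristic alone would miss an attack spread across many source IPs.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple


class PortSpreadDetector:
    """Flag source IPs that hit many distinct media ports within a time window."""

    def __init__(self, window_seconds: float = 10.0, max_ports: int = 20) -> None:
        self.window_seconds = window_seconds
        self.max_ports = max_ports
        self._seen: Dict[str, Deque[Tuple[float, int]]] = defaultdict(deque)

    def observe(self, timestamp: float, src_ip: str, dst_port: int) -> bool:
        """Record one UDP packet; return True if src_ip now looks like a scanner."""
        seen = self._seen[src_ip]
        seen.append((timestamp, dst_port))
        # Forget packets that fell out of the sliding window.
        while seen[0][0] < timestamp - self.window_seconds:
            seen.popleft()
        distinct_ports = {port for _, port in seen}
        return len(distinct_ports) > self.max_ports


detector = PortSpreadDetector()
alerts = [detector.observe(i * 0.1, "203.0.113.9", 10_000 + i) for i in range(25)]
print(alerts.index(True))  # 20: the 21st distinct port crosses the threshold
```

A legitimate client that keeps sending to the one port allocated for its session never trips this check, regardless of packet rate.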

There are likely other methods to detect these types of attacks, and we are interested in hearing from anyone implementing network or application-level attack detection for media servers.

Credits

Alfred Farrugia of Enable Security discovered this vulnerability, developed the tools for testing it, and conducted the majority of the testing. His work was invaluable in shaping the content and resources provided here. Special thanks also go to Philipp Hancke, a self-proclaimed purveyor of the dark side of WebRTC, and Lorenzo Miniero, the author of the Janus WebRTC Server, for their significant contributions.


Sandro Gauci

CEO, Chief Mischief Officer at Enable Security

Sandro Gauci leads the operations and research at Enable Security. He is the original developer of SIPVicious OSS, the SIP security testing toolset. His role is to focus on the vision of the company, design offensive security tools and engage in security research and testing. Therefore, he is the proud owner of the title of Chief Mischief Officer at Enable Security.


Alfred Farrugia

R&D, Chief Demolition Officer at Enable Security

Alfred Farrugia is the lead developer of SIPVicious PRO and works on reverse engineering, fuzzing, DoS simulation and security research. He is an amazing pentester and often finds denial-of-service vulnerabilities where least expected. Hence he is the proud owner of the title of Chief Demolition Officer at Enable Security.