Claroty’s Team82 has disclosed a vulnerability in Belledonne Communications’ Linphone SIP Protocol Stack.
CVE-2021-33056 is a NULL pointer dereference vulnerability in the belle-sip library used to implement various SIP layers.
Belle-sip versions prior to 4.5.20 are affected, as used in Linphone and other SIP-based products such as IoT firmware and popular VoIP mobile applications.
An attacker can crash the client by sending a specially crafted SIP message header to any vulnerable device.
The vulnerability was fixed in v4.5.20 of the SIP protocol stack.
Since many networks are a mix of operational technology and internet of things (IoT) devices, it makes sense to analyze these connected physical systems from a research perspective. The risk any IoT vulnerability poses can be substantial, and every day we’re seeing more evidence of attackers and researchers demonstrating ways to leverage that connectivity to either exploit a device directly, or move laterally through a network.
One common IoT use case prevalent inside the enterprise—and home networks—is voice and video devices. Today, this goes well beyond just a VoIP phone to include surveillance cameras, and even connected doorbells that record video as part of an overall security system. Protocols such as the Session Initiation Protocol (SIP) are the foundation of these devices and are used to facilitate message transport. As part of Team82’s research, we have been examining different SIP-based platforms and related software, including the Linphone SIP client suite.
Linphone is a 20-year-old open source voice over IP (VoIP) project touting itself as the first open source application to use the Session Initiation Protocol (SIP) on Linux. Developers use its SIP software to build communications systems that include instant messaging, voice, and video.
VoIP messages and calls are made over an IP network rather than over traditional public switched telephone networks (PSTN). Messages are sent using control protocols, such as SIP, the Skinny Client Control Protocol (SCCP), or various others that are proprietary. Many VoIP services are free and convenient for users, but a compromise of such a service can give an attacker a foothold onto a corporate network and possibly the IoT/OT network. One example would be modern security cameras and doorbells that use VoIP protocols to transfer audio data when initiating a “call” with an IoT device.
During the course of our research, Team82 found a null pointer dereference vulnerability in the Linphone belle-sip component. Belle-sip is a C library with an object-oriented API used to implement SIP transport, transaction, and dialog layers; there’s also an HTTP/HTTPS client implementation. The vulnerability is remotely exploitable and requires no action from the victim. This is a dangerous zero-click attack: sending a single invalid SIP message header is enough to crash the client and create a denial-of-service condition.
All belle-sip versions prior to v4.5.20, as used in Linphone and likely other similar products, are affected. The vulnerability was fixed in v4.5.20 of the SIP protocol stack (Commit with the fix). As with most third-party components, patching the core protocol stack is the right first step, but those updates must be applied downstream as well by vendors using the affected SIP stack in their respective products. Linphone’s website, for example, cites close to 30 reference customers, including some giants such as BT, Acer, and Swisscom, all of whom develop VoIP applications with Linphone at their core.
Here is a quick summation of Team82’s attack:
Adding </ to any of the From, To, or Diversion headers (or their short forms f, t, and d) will trigger a null pointer dereference that crashes any SIP client application that uses belle-sip to handle and parse SIP messages.
Linphone is a free voice over IP softphone, SIP client and service. It may be used for audio and video direct calls and calls through any VoIP softswitch or IP-PBX. Under the hood, Linphone uses the belle-sip component for handling low-level SIP message parsing.
Linphone’s website states it has more than 200 corporate customers worldwide in various sectors including telecommunications, secure communications, social networking, home automation, telepresence, IoT, and more.
Session Initiation Protocol (SIP)
SIP is a signaling protocol used for initiating, maintaining, terminating and modifying real-time sessions between two or more IP-based endpoints that involve voice, video, messaging, and other communications applications and services. The SIP protocol and its extensions are defined across multiple RFCs, including: RFC3261, RFC3311, RFC3428, RFC3515, RFC3903, RFC6086, RFC6665, and others.
SIP is a text-based protocol with syntax similar to that of HTTP. There are two different types of SIP messages: requests and responses. The first line of a request has a method, which defines the nature of the request, and a Request-URI, which indicates where the request should be sent. The first line of a response has a response code. For example, here is a typical session:
Figure 2: Establishment of a session through a back-to-back user agent.
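The request/response distinction described above can be sketched in a few lines of Python. This is a simplified illustration, not belle-sip's parser: the function name classify_start_line and the two regular expressions are assumptions made for this example, and they only cover the common "METHOD Request-URI SIP/2.0" and "SIP/2.0 code reason" shapes.

```python
import re

# Simplified patterns for the first line of a SIP message.
# Real parsers accept more variation than these two expressions do.
REQUEST_LINE = re.compile(r"^([A-Z]+) (\S+) SIP/2\.0$")
STATUS_LINE = re.compile(r"^SIP/2\.0 (\d{3}) (.*)$")


def classify_start_line(line: str) -> dict:
    """Classify a SIP start line as a request or a response."""
    request = REQUEST_LINE.match(line)
    if request:
        # A request carries a method and a Request-URI.
        return {
            "kind": "request",
            "method": request.group(1),
            "request_uri": request.group(2),
        }
    status = STATUS_LINE.match(line)
    if status:
        # A response carries a numeric response code and a reason phrase.
        return {
            "kind": "response",
            "code": int(status.group(1)),
            "reason": status.group(2),
        }
    raise ValueError(f"not a SIP start line: {line!r}")
```

Calling classify_start_line on an INVITE line yields the method and Request-URI, while a line such as "SIP/2.0 200 OK" yields the response code.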
One of the key SIP messages is the INVITE request used to initiate a dialog for establishing a call. The request is sent by a SIP client to a SIP server. The SIP server will transfer the INVITE request to the destination SIP client. For example, if Alice wants to call Bob using SIP, she will probably send something like this:
Figure 3: An example of a SIP INVITE message.
The message is straightforward: Alice invites Bob to establish a session. We can see the request was initiated From Alice and directed To Bob. SIP parsers usually follow RFC3261 to parse such messages and ensure all headers are constructed as described in the SIP RFC. For example, SIP parsers will expect to find a SIP URI in the To/From headers as defined in sections 8.1.1.2 (To) and 8.1.1.3 (From) of the RFC.
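To make the To/From discussion concrete, here is a minimal Python sketch that splits an INVITE into headers and pulls the URI out of an address header. The sample message is hypothetical and modelled on the Alice/Bob example in RFC3261; the helper names parse_headers and address_uri are assumptions for this illustration and do not correspond to belle-sip functions.

```python
# Compact header names defined in RFC 3261: "f" is From, "t" is To.
COMPACT_NAMES = {"f": "from", "t": "to"}

# Hypothetical sample message modelled on the Alice/Bob example in RFC 3261.
SAMPLE_INVITE = (
    "INVITE sip:bob@biloxi.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP pc33.atlanta.com;branch=z9hG4bK776asdhds\r\n"
    "Max-Forwards: 70\r\n"
    "To: Bob <sip:bob@biloxi.com>\r\n"
    "From: Alice <sip:alice@atlanta.com>;tag=1928301774\r\n"
    "Call-ID: a84b4c76e66710@pc33.atlanta.com\r\n"
    "CSeq: 314159 INVITE\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)


def parse_headers(message: str) -> dict:
    """Return a dict of lower-cased header names to raw header values."""
    head = message.split("\r\n\r\n", 1)[0]
    headers = {}
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        key = name.strip().lower()
        headers[COMPACT_NAMES.get(key, key)] = value.strip()
    return headers


def address_uri(value: str) -> str:
    """Return the URI from 'Name <uri>;params' or from a bare 'uri;params'."""
    start = value.find("<")
    if start != -1:
        end = value.find(">", start)
        return value[start + 1:end] if end != -1 else value[start + 1:]
    return value.split(";", 1)[0].strip()
```

For the sample message, address_uri(parse_headers(SAMPLE_INVITE)["from"]) gives Alice's SIP URI. This sketch does no validation of the URI itself.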
In fact, every resource of a SIP network, such as a user agent, call router, or voicemail box, is identified by a Uniform Resource Identifier (URI). The syntax of the URI follows the general standard syntax also used in web services and e-mail. The URI scheme used for the SIP protocol is simply “sip” and a typical SIP URI has the form sip:user@host; the general form given in RFC3261 is sip:user:password@host:port;uri-parameters?headers.
RFC3261 specifies that every valid SIP URI must have a scheme, which can be sip or sips, similar to http vs. https. However, it also states that “SIP elements MAY support Request-URIs with schemes other than ‘sip’ and ‘sips,’ for example, the ‘tel’ URI scheme of RFC2806.” This means that multiple schemes could be supported, but in any case, a scheme must be present.
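The requirement that a scheme must be present can be illustrated with a small Python sketch. The names uri_scheme and is_sip_uri are hypothetical, and the regular expression follows the generic URI rule that a scheme is a letter followed by letters, digits, "+", "-", or "."; it is not taken from belle-sip.

```python
import re

# Generic URI scheme: a letter followed by letters, digits, "+", "-" or ".",
# terminated by a colon.
SCHEME = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):")


def uri_scheme(uri: str):
    """Return the lower-cased scheme of a URI, or None if it has none."""
    match = SCHEME.match(uri)
    return match.group(1).lower() if match else None


def is_sip_uri(uri: str) -> bool:
    """True only for URIs whose scheme is sip or sips."""
    return uri_scheme(uri) in ("sip", "sips")
```

Note that a string such as "/" has no scheme at all, so uri_scheme returns None for it. Any caller has to handle that case explicitly.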
The question is: How are SIP parsers, specifically belle-sip, parsing SIP messages? And more specifically, how are they parsing SIP URIs?
To define the SIP grammar, belle-sip uses ANTLR, a parser generator that helps developers create parsers. Parsers take written text and transform it into an organized structure called an Abstract Syntax Tree (AST). An AST is a tree representation of the abstract syntactic structure of source code written in a programming language. Each node of the tree denotes a construct occurring in the source code. In other words, we can think of the AST as a story describing the content of the code.
Figure 4: Graphical representation of an AST for the Euclidean algorithm.
ANother Tool for Language Recognition, also known as ANTLR, is a language tool offering support for tree construction, tree walking, translation, error recovery, and error reporting. ANTLR provides a framework for constructing recognizers, interpreters, compilers, and translators.
Belle-sip uses the ANTLR framework to define the grammar used to construct SIP messages according to the Session Initiation Protocol (SIP) RFC3261. The ANTLR compiler then converts the SIP grammar, defined in belle_sip_message.g, to a C source code file, belle_sip_messageParser.c, which can later be compiled into an executable.
As mentioned above, belle-sip uses a very detailed and complex grammar to parse SIP messages. For example, this is how the SIP From header is defined and parsed with ANTLR:
Figure 5: from_token: definition of the SIP From header.
As we can see, the image with the source code contains three parts (marked 1, 2, 3):
1. Header name, which must be From or f.
2. Colon : (space and tabs ignored).
3. From header value, which should be parsed as from_spec.
The third part of the from_token is defined as from_spec. As we can see at the bottom of the image, it can be parsed in one of two ways: as name_addr_with_generic_uri or as paramless_addr_spec_with_generic_uri. Since both options are considered a valid SIP From header, belle-sip will try both when parsing a From header as defined in its grammar.
The first option that belle-sip will try is to parse the From header as a name_addr_with_generic_uri header:
Next, we have addr_spec_with_generic_uri. This is an interesting case. First, it will try to parse the address URI as simply URI; if it fails, it will try to parse it as a generic URI ($generic_uri), and will extract the scheme out of the parsed URI, and compare it to sip or sips.
But before the comparison and before the scheme is extracted, the URI must be parsed first. If we look at generic_uri, we can see that there are two valid options:
More complicated parsing involves parsing the scheme part, opaque part, and hier part.
Figure 8: generic_uri grammar definition.
If we follow the hier_part definition, we can see that there are three options for parsing hier_part data. What’s interesting is that all of them eventually use the path_segments definition to parse the data.
Figure 9: hier_part grammar definition.
However, path_segments accepts a single slash (/) followed by any number of slashes and segments. So for example, these are considered valid path segments:
Figure 10: path_segments grammar definition.
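The acceptance rule described above (a leading slash followed by any mix of slashes and segments) can be modelled loosely in Python. This is an approximation of the prose description only, not a transcription of the belle-sip grammar: the function name is_path_segments and the set of characters allowed inside a segment are assumptions made for this sketch.

```python
import re

# Loose model: one leading "/" followed by any mix of "/" and segment
# characters. The segment character set is an approximation chosen for
# this example.
PATH_SEGMENTS = re.compile(r"/[A-Za-z0-9._~!$&'()*+,;=:@%/-]*")


def is_path_segments(text: str) -> bool:
    """Return True if text looks like '/' followed by segments and slashes."""
    return PATH_SEGMENTS.fullmatch(text) is not None
```

Under this model a lone "/" is accepted, as are "//" and "/a/b/c", while a string that does not start with a slash is rejected. The important point for what follows is that the shortest accepted input, a single slash, carries no scheme.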
Here is a diagram that shows the general code flow we just presented. We can see that a simple forward slash ( / ) is considered as a valid SIP URI because the URI is checked and parsed as a generic URI instead of a SIP URI.
The underlying bug here is that non-SIP URIs are accepted as valid SIP header values. Therefore, a generic URI such as a simple single forward slash (/), will be considered a SIP URI. This means that the given URI will not contain a valid SIP scheme (scheme will be NULL), and so when the compare function is called with the non-existent scheme (NULL), a null pointer dereference will be triggered and crash the SIP client.
In other words, a simple INVITE packet such as this will crash the SIP client application because of a NULL pointer dereference:
Figure 13: Malicious packet that could crash any SIP client using the belle-sip library to parse SIP messages.
In fact, any of the From / To / Diversion headers, including their short names (f, t, and d, respectively), are vulnerable to this attack. Therefore, simply having a header such as “d: </” will trigger the NULL pointer dereference vulnerability, which will crash the SIP client receiver and create a denial-of-service condition.
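From a defender's point of view, the affected headers can be screened before a message reaches a vulnerable parser. The Python sketch below flags From / To / Diversion headers (and the short names given above) whose URI carries no scheme. The function name suspicious_headers is hypothetical, the parsing is deliberately simplistic, and such a filter is not a substitute for upgrading belle-sip.

```python
import re

# Header names, long and short, that the article identifies as affected.
ADDRESS_HEADERS = {"from", "f", "to", "t", "diversion", "d"}

# A URI scheme: a letter followed by letters, digits, "+", "-" or ".", then ":".
SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def suspicious_headers(message: str) -> list:
    """Return the names of address headers whose URI has no scheme."""
    flagged = []
    head = message.split("\r\n\r\n", 1)[0]
    for line in head.split("\r\n")[1:]:  # skip the start line
        name, sep, value = line.partition(":")
        if not sep or name.strip().lower() not in ADDRESS_HEADERS:
            continue
        value = value.strip()
        start = value.find("<")
        if start != -1:
            # name-addr form: take what follows "<", up to ">" if present.
            end = value.find(">", start)
            uri = value[start + 1:end] if end != -1 else value[start + 1:]
        else:
            # bare addr-spec form: drop any header parameters.
            uri = value.split(";", 1)[0].strip()
        if not SCHEME.match(uri):
            flagged.append(name.strip())
    return flagged
```

A message containing the header d: </ would be reported as ["d"], while a well-formed INVITE produces an empty list.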
The C source code that is being generated portrays the flow nicely:
Given the malicious header value </, this is what happens:
generic_uri is called to parse a given header value, which should be a valid URI. Since our URI is /, it will pass all checks and a generic URI will be created.
Since a forward slash is considered a valid path segment, no errors will be raised.
Next, a scheme will be extracted from the parsed URI. Since there is no scheme, a NULL pointer will be returned. The returned pointer is not checked.
Finally, strcasecmp will get called with a NULL pointer and “sip” as arguments. The first parameter will be dereferenced, and a segmentation fault will occur.
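The four steps above can be modelled in Python to show why the missing check matters. This is a stand-in for the C flow, not the generated belle-sip code: the Python strcasecmp below imitates the C function and fails with AttributeError on None where the C version would dereference a NULL pointer, and the helper names generic_uri_scheme, is_sip_unchecked, and is_sip_checked are assumptions for this sketch. The guarded variant shows one possible fix and is not necessarily what the upstream commit does.

```python
def strcasecmp(a, b) -> int:
    """Python stand-in for C's strcasecmp: 0 when equal ignoring case."""
    a, b = a.lower(), b.lower()  # raises AttributeError when a is None
    return (a > b) - (a < b)


def generic_uri_scheme(uri: str):
    """Return the scheme of a generic URI, or None when it has none."""
    head, sep, _ = uri.partition(":")
    if sep and head and "/" not in head:
        return head
    return None


def is_sip_unchecked(uri: str) -> bool:
    # Mirrors the flawed flow: the scheme is compared without a None check.
    scheme = generic_uri_scheme(uri)
    return strcasecmp(scheme, "sip") == 0 or strcasecmp(scheme, "sips") == 0


def is_sip_checked(uri: str) -> bool:
    # One possible guard: reject URIs with no scheme before comparing.
    scheme = generic_uri_scheme(uri)
    if scheme is None:
        return False
    return strcasecmp(scheme, "sip") == 0 or strcasecmp(scheme, "sips") == 0
```

With the URI "/", is_sip_unchecked raises an exception (the Python analogue of the segmentation fault), whereas is_sip_checked simply returns False.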
Every SIP client in a SIP network is listening for INVITE requests from other clients. Once an INVITE message is received, the SIP client will parse the message and respond accordingly. Therefore, it is possible to exploit this vulnerability without user interaction (zero-click).
All that is needed to exploit this remotely is to send to any SIP client in the network an INVITE SIP request with a specifically crafted From/To/Diversion header that will trigger the NULL pointer dereference vulnerability. Any application that uses belle-sip under the hood to parse SIP messages is vulnerable and will crash upon receiving a malicious SIP “call.”
Successful exploits targeting IoT vulnerabilities have demonstrated they can provide an effective foothold onto enterprise networks. A flaw in a foundational protocol such as the SIP stack in VoIP phones and applications can be especially troublesome given the scale and reach shown by attacks against numerous other third-party components used by developers in software projects.
SIP is a popular protocol at the core of many VoIP applications; it facilitates real-time messaging over voice, video, or text between IP-based endpoints. This is what prompted Team82 to examine the security of the Linphone SIP client suite, the first open source application to use SIP on Linux.
During our work we discovered that a simple, misplaced slash in an invalid SIP message header could trigger a NULL pointer dereference in the belle-sip C library used to implement SIP transport, transaction, and dialog layers. This zero-click vulnerability is remotely exploitable and could crash the VoIP client on any device running a vulnerable version of the belle-sip library.
The vulnerability, CVE-2021-33056, was fixed in v4.5.20 of the SIP protocol stack (commit with the fix), and users should ensure their devices, applications, and development environments are running updated versions of the stack.