Deep Packet Inspection in LTE Networks
Meeting Performance and Flexibility Requirements with Intel® Xeon® Processors
Deep Packet Inspection (DPI) is a key technology within Long Term Evolution (LTE) network infrastructure, both for enabling next-generation services and for maximizing average revenue per user (ARPU). The big challenge in deploying DPI is scalability. DPI solutions must deliver wire-speed performance in networks where data capacity is exploding, even as packet inspection, classification, and steering grow more complex. Meeting these requirements calls for a careful balance of high-performance processors, dedicated hardware accelerators, and optimized, flexible software.
This article discusses how the latest Intel® Xeon® processors, together with specialized packet processing software, enable a range of DPI applications to deliver the performance necessary for LTE networks. It explains how this platform provides a scalable, flexible solution for the complex workloads involved. As a case study, the article describes the use of the 6WINDGate* packet processing software for DPI and reports the performance of this software on the ATCA-7365 blade from Emerson Network Power, a Premier member of the Intel® Embedded Alliance.
Evolving Market Needs
A few years ago, the main networking applications for DPI technology were in wireline equipment. Cable system operators, for example, used DPI to identify harmful network traffic as well as to throttle or block applications that were determined to be “bandwidth hogs.” Recently, however, the adoption of DPI has accelerated due to the proliferation of new services, such as video, in both wireline and wireless networks and because of the business pressures faced by operators. The need for DPI is particularly keen in LTE networks, which introduce a new set of tough demands.
From the subscribers’ point of view, LTE networks have created expectations of constant access to advanced multimedia services, delivered via smartphones, tablets, or laptops, regardless of their location. Subscribers who select premium rate plans expect to receive an experience commensurate with the higher costs. Streaming video should be a TV-like experience, and parental control and filtering should be available for both fixed and mobile devices. In short, operators are required to provide a wide range of enhanced services to a fast-growing subscriber base while delivering more traffic per subscriber, supporting more applications, and offering an ever-expanding array of client devices with advanced capabilities.
At the same time, network operators face an ongoing challenge to maximize ARPU to offset constant increases in capital expenditures (CAPEX) caused by investments in network capacity expansion as well as growing operating expenditures (OPEX) due to the adoption of advanced services. Network intelligence (NI) is one key to maximizing ARPU: If the operator has real-time knowledge about the traffic characteristics and demand in their network, segmented by application, by user and by time-of-day, then they can deliver customized services that provide high value to specific customers. In addition, real-time application-level analysis of network traffic, including advanced heuristics or pattern recognition, allows operators to defend their networks against ever more sophisticated security threats and attacks.
DPI technology can address both subscriber expectations and the operator’s need to maximize ARPU. As illustrated in Figure 1, DPI-based networking equipment looks deep into a packet’s payload in order to classify traffic flows by application and to capture appropriate application-specific content from those flows. This network intelligence enables networking functions such as forwarding, security, usage analysis, traffic management, resource allocation, quality of service (QoS) management, and billing.
Figure 1. Deep Packet Inspection provides visibility into the packet’s payload. (Source: Ericsson)
Similar functions are possible with shallow packet inspection that looks only at the packet header, but the network intelligence provided by DPI enables these functions to be performed with much more sophistication. With DPI, operators gain access to a wealth of information as a result of data mining, profiling and analytics, all of which are key to the creation of personalized service packages based on a deeper understanding of customers’ needs.
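To make the difference between shallow and deep inspection concrete, here is a minimal Python sketch. It is not taken from any DPI product: the port table, the three payload signatures, and the function names are all illustrative assumptions, and real DPI engines rely on far richer signatures and heuristics.

```python
from typing import Optional

# Hypothetical port table for header-only (shallow) classification.
PORT_HINTS = {53: "dns", 80: "http", 443: "tls"}


def classify_shallow(dst_port: int) -> Optional[str]:
    """Guess the application from the Layer 4 destination port alone."""
    return PORT_HINTS.get(dst_port)


def classify_deep(payload: bytes) -> Optional[str]:
    """Guess the application from the first payload bytes (toy signatures)."""
    if payload.startswith((b"GET ", b"POST ", b"HEAD ")):
        return "http"
    # TLS record: content type 0x16 (handshake) followed by major version 0x03.
    if len(payload) >= 2 and payload[0] == 0x16 and payload[1] == 0x03:
        return "tls"
    # BitTorrent handshake: length byte 19 followed by the protocol string.
    if payload.startswith(b"\x13BitTorrent protocol"):
        return "bittorrent"
    return None


def classify(dst_port: int, payload: bytes) -> str:
    """Prefer the payload-based answer; fall back to the port hint."""
    return classify_deep(payload) or classify_shallow(dst_port) or "unknown"
```

With these toy rules, an HTTP request sent to port 8080 is invisible to the port table but is still labeled "http" from its payload, and peer-to-peer traffic hiding on port 80 is labeled "bittorrent" rather than "http".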
Within an LTE wireless network, DPI functions are generally divided into two categories:
- Policy and Charging Control (PCC) functions provide operators with advanced tools for QoS and billing control. Within the network, the Policy and Charging Enforcement Function (PCEF) uses DPI to identify and associate applications and/or users with specific traffic flows, applying policies to individual sessions based on requirements defined by a network element called the Policy and Charging Rules Function (PCRF).
- Network offload and traffic filtering functions include video optimization gateways, wireless offload gateways, lawful interception, network monitoring, and edge caching. In these network elements, DPI is used to extract flows matching specific criteria so that they can be processed and redirected appropriately. A mixture of header (Layer 4) and application (Layer 7) information is used to make these decisions.
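The PCEF behavior described in the first category can be sketched as a rule lookup: DPI supplies an application label for the flow, and the PCEF selects the matching rule supplied by the PCRF. The rule fields, plan names, rates, and charging keys below are hypothetical values chosen for illustration; they do not reflect the 3GPP rule format or any vendor's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyRule:
    """One hypothetical rule, as a PCRF might hand it to the PCEF."""
    application: str   # Layer 7 label produced by DPI
    plan: str          # subscriber plan, e.g. "premium" or "basic"
    max_kbps: int      # rate limit to enforce; 0 means block the flow
    charging_key: str  # label used when rating the traffic


# Illustrative rule set; a real PCEF receives its rules from the PCRF.
RULES = [
    PolicyRule("video", "premium", 8000, "video-hd"),
    PolicyRule("video", "basic", 1500, "video-sd"),
    PolicyRule("p2p", "basic", 0, "blocked"),
]

DEFAULT_RULE = PolicyRule("*", "*", 2000, "default")


def select_rule(application: str, plan: str) -> PolicyRule:
    """Return the first rule matching the flow, or the default rule."""
    for rule in RULES:
        if rule.application == application and rule.plan == plan:
            return rule
    return DEFAULT_RULE
```

A linear scan is enough for a sketch; equipment handling millions of flows would more plausibly index rules by subscriber and application.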
Architecturally, the PCEF in an LTE network is part of the Packet Data Network Gateway (PGW). Physically, it can be implemented as a dedicated blade, as a function integrated into the PGW itself, or as a co-located but separate appliance. Regardless of the actual implementation, however, the advanced DPI functions required for PCEF place significant performance stress on the underlying processor platform due to the real-time processing bandwidth required to evaluate and act on sophisticated heuristics or to generate advanced billing information.
DPI Performance for LTE
The Intel Xeon processor provides an excellent match for these performance requirements, with up to 12 cores in a dual-socket configuration. In addition to providing the compute power necessary for DPI, Intel Xeon processors offer strong software support for these applications. The wide array of operating system (OS) and networking software options available for these processors makes it easier to build DPI equipment using popular components. These processors also provide a great deal of flexibility. The readily programmable nature of Intel Xeon processors allows developers to respond to evolving demands while the consistency of Intel® architecture and the platforms’ clear roadmap provide a smooth migration path for future designs.
The question for developers is how best to use the multi-core performance of Intel Xeon processors. A standard networking stack uses services provided by an OS such as Linux, and is subject to significant OS overheads associated with functions such as preemptions, threads, timers, and locking. These overheads are imposed on each packet passing through the system, resulting in a significant performance penalty. Although some improvements can be made to an OS stack to support multi-core architectures, performance fails to scale linearly across multiple cores for the complex packet processing required in DPI. In short, a standard OS stack does a poor job of exploiting the potential packet processing performance of a multi-core processor, and an optimized software solution is required.
One such DPI platform solution, optimized for Intel Xeon processors, is the 6WINDGate software from 6WIND. The 6WINDGate software splits the networking stack into two layers. The lower layer, typically called the “fast path,” processes the majority of packets outside the Linux environment without incurring any of the OS overheads that degrade overall performance. Only those rare packets that require complex processing are forwarded to the OS networking stack, which performs the necessary management, signaling, and control functions. To increase performance, packet processing can use the services of a dedicated executive outside the Linux environment. The Intel® Data Plane Development Kit (Intel® DPDK) provides these kinds of services on Intel Xeon processors.
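The two-layer split can be illustrated with a toy flow cache. This is a conceptual Python sketch only and does not reproduce the 6WINDGate or Intel DPDK APIs; the `FastPath` class, the `FlowKey` tuple, and the `slow_path` callback are names invented for the example.

```python
from typing import Callable, Dict, Tuple

# Flow key: source IP, destination IP, IP protocol, source port, destination port.
FlowKey = Tuple[str, str, int, int, int]


class FastPath:
    """Toy two-layer stack: packets of known flows never leave the fast path."""

    def __init__(self, slow_path: Callable[[FlowKey, bytes], str]) -> None:
        self.flow_table: Dict[FlowKey, str] = {}
        self.slow_path = slow_path  # stands in for the OS networking stack
        self.fast_hits = 0
        self.exceptions = 0

    def process(self, key: FlowKey, payload: bytes) -> str:
        """Return the action for this packet, consulting the slow path on a miss."""
        action = self.flow_table.get(key)
        if action is not None:
            self.fast_hits += 1
            return action
        # Exception packet: hand it to the slow path, then cache the decision
        # so later packets of the same flow are handled in the fast path.
        self.exceptions += 1
        action = self.slow_path(key, payload)
        self.flow_table[key] = action
        return action
```

If the same flow is seen three times, only the first packet is counted as an exception; the remaining two are served from the flow table.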
Figure 2 illustrates the use of the 6WINDGate platform, running on an Intel Xeon platform, to accelerate DPI in support of PCEF. While many different flows are possible, the configuration shown represents a typical usage model within an LTE Packet Data Network Gateway.
Figure 2. The 6WINDGate* platform for Intel® Xeon® processors accelerates DPI in support of PCEF.
Network traffic, typically 40 Gbps today and soon 100 Gbps, enters the platform and is immediately decrypted using the high-performance security protocols within the 6WINDGate fast path. The decrypted flow is then inspected at the Layer 2/Layer 3 level using VLAN tagging, GTP flow identification, and header inspection techniques within the fast path to perform a pre-identification of the flow. Flows that cannot be processed within the fast path are passed to external DPI software for additional analysis, after which the flow table in the fast path is updated with the appropriate application details.
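As an illustration of the GTP flow identification step, the sketch below reads the mandatory part of a GTPv1-U header and extracts the Tunnel Endpoint Identifier (TEID), which identifies the tunnel a packet belongs to. The field layout follows our reading of the GTPv1-U specification; the function and type names are assumptions made for the example, and the code deliberately does not walk chained extension headers.

```python
import struct
from typing import NamedTuple

G_PDU = 0xFF  # GTP-U message type that carries an encapsulated user packet


class GtpuHeader(NamedTuple):
    version: int
    message_type: int
    length: int      # bytes that follow the mandatory 8-byte header
    teid: int        # Tunnel Endpoint Identifier
    header_len: int  # 8, or 12 when optional fields are present


def parse_gtpu(packet: bytes) -> GtpuHeader:
    """Parse the fixed GTPv1-U header at the start of a UDP payload."""
    if len(packet) < 8:
        raise ValueError("truncated GTP-U header")
    flags, msg_type, length, teid = struct.unpack_from("!BBHI", packet, 0)
    version = flags >> 5
    if version != 1:
        raise ValueError(f"unsupported GTP version {version}")
    # If any of the E, S, or PN flags is set, four more bytes follow the TEID
    # (sequence number, N-PDU number, next extension header type).
    header_len = 12 if flags & 0x07 else 8
    if len(packet) < header_len:
        raise ValueError("truncated GTP-U optional fields")
    return GtpuHeader(version, msg_type, length, teid, header_len)
```

In a fast path, the TEID recovered this way could serve as part of the flow table key, with the encapsulated user packet starting at offset `header_len` when the message type is `G_PDU` and no extension headers are chained.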
Online DPI heuristics – simple but frequent DPI rules applied to a large part of the traffic – can be performed within the 6WINDGate fast path in order to improve real-time performance. The fast path also updates the flow statistics database that can be used by the PCEF software to measure user traffic. The PCEF software, which is an application running under Linux outside the fast path, correlates the flow statistics and the information about the user’s subscription in order to update the policy database (when necessary), so that the appropriate policy can be applied to the flow using the QoS function in the fast path. The fast path can maintain a large number of per-flow traffic conditioners. The processed traffic is then encrypted, using the appropriate high-performance fast path security protocol, before the traffic egresses from the platform.
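A per-flow traffic conditioner is commonly built on a token bucket. The article does not say which algorithm the 6WINDGate QoS function uses, so the following Python sketch is an assumption chosen for illustration; the class name and parameters are hypothetical.

```python
class TokenBucket:
    """Single-rate token bucket: packets conform while tokens remain."""

    def __init__(self, rate_bps: float, burst_bytes: int) -> None:
        self.rate_bytes_per_s = rate_bps / 8.0
        self.burst_bytes = burst_bytes
        self.tokens = float(burst_bytes)  # start with a full bucket
        self.last_time = 0.0

    def conforms(self, packet_len: int, now: float) -> bool:
        """Refill for the elapsed time, then try to spend packet_len tokens."""
        elapsed = max(0.0, now - self.last_time)
        self.last_time = max(self.last_time, now)
        self.tokens = min(
            float(self.burst_bytes),
            self.tokens + elapsed * self.rate_bytes_per_s,
        )
        if packet_len <= self.tokens:
            self.tokens -= packet_len
            return True
        return False  # caller may drop, delay, or re-mark the packet
```

Keeping one such object per flow, for example in a dictionary keyed by the flow identifier, gives the "large number of per-flow traffic conditioners" described above in miniature. A bucket configured for 8,000 bps with a 1,500-byte burst admits one full-size packet immediately and then refills at 1,000 bytes per second.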
Bringing DPI to Market
How can developers best deliver this DPI technology to an expectant customer base? The telecom network core imposes stringent reliability and enhanced environmental specifications that can complicate equipment design. One speedy and effective way to deal with these requirements is to use AdvancedTCA* (ATCA) boards and systems. ATCA is a well-supported open standard for commercial off-the-shelf (COTS) hardware and systems designed for use in carrier-grade applications. The standard was originally developed by major network equipment providers in tandem with both blade and systems vendors such as Emerson Network Power and technology leaders such as Intel. ATCA has steadily grown in stature to the point where it is now widely adopted and supported throughout the telecom industry.
The Emerson Network Power ATCA-7365 blade exemplifies the potential of ATCA blades for packet processing applications (Figure 3). The blade offers a choice of two 2.13 GHz Intel® Xeon® processors L5638 for use in high-temperature telecom environments or two 2.4 GHz Intel® Xeon® processors E5645 for use in cooler data center environments. It provides each CPU with six memory slots for a total of up to 96 GB of balanced memory, along with a dual-redundant 10 Gbps ATCA fabric connection. The blade also supports a range of connectivity options that include 1 Gbps and 10 Gbps Ethernet terminations, provided via a directly attached Rear Transition Module (RTM).
Figure 3. The Emerson Network Power ATCA-7365 blade offers robust compute power and 10G Ethernet connectivity.
Performance testing by 6WIND indicates that for a basic fast path configuration including VLAN, IP forwarding, GTP-U tunneling, flow accounting, and QoS conditioning, an Intel Xeon processor E5645 can process 2.5 million packets per second (Mpps) per core with an average packet size of 512 bytes. This is roughly 10 Gbps of packet ingress and 10 Gbps of packet egress per core, which matches well with a 4 x 10G RTM for the ATCA-7365. The tests indicate that the ATCA-7365 could handle the basic communications on 4 x 10G terminations using only 4 of the 12 cores available, leaving ample headroom for additional packet analysis and inspection. When deployed in Emerson Network Power’s carrier-grade Centellis 2000 2-slot ATCA platform, a system can offer up to 80 Gbps of throughput in a 3U form factor (Figure 4). The same core architecture can even scale up to 480 Gbps in a 13U form factor.
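The arithmetic behind these figures can be checked in a few lines of Python. The inputs are the numbers quoted above; the helper names are our own, and the calculation counts payload bits only, ignoring Ethernet framing overhead.

```python
import math

PACKETS_PER_SECOND_PER_CORE = 2.5e6  # 2.5 Mpps per core, as reported by 6WIND
AVERAGE_PACKET_BYTES = 512
RTM_PORTS = 4
PORT_RATE_GBPS = 10
TOTAL_CORES = 12                     # two six-core processors per blade
BLADES_IN_CENTELLIS_2000 = 2


def gbps_per_core(pps: float, packet_bytes: int) -> float:
    """Convert a packet rate and average packet size to gigabits per second."""
    return pps * packet_bytes * 8 / 1e9


def cores_needed(total_gbps: float, per_core_gbps: float) -> int:
    """Smallest whole number of cores that covers the offered load."""
    return math.ceil(total_gbps / per_core_gbps)


per_core = gbps_per_core(PACKETS_PER_SECOND_PER_CORE, AVERAGE_PACKET_BYTES)
blade_gbps = RTM_PORTS * PORT_RATE_GBPS
busy_cores = cores_needed(blade_gbps, per_core)
spare_cores = TOTAL_CORES - busy_cores
system_gbps = blade_gbps * BLADES_IN_CENTELLIS_2000
```

Here `per_core` evaluates to 10.24 Gbps, so four cores cover the 40 Gbps arriving on a 4 x 10G RTM, eight cores remain for deeper inspection, and two such blades account for the 80 Gbps quoted for the 2-slot chassis.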
Figure 4. Emerson Network Power Centellis 2000 supports up to 80 Gbps of throughput in a 3U form factor.
Performance for Today and Tomorrow
LTE operators are eager to deploy DPI solutions to meet rising customer expectations and to maximize ARPU. With the combination of Intel Xeon processor-based ATCA blades and 6WINDGate packet processing software, developers have a solution that enables them to deploy high-performance, flexible DPI equipment quickly and effectively. What’s more, this platform provides the scalability and roadmap reliability to meet new demands as LTE markets evolve and mature.
For more on building flexible networking solutions, see intel.com/go/embedded-consolidation
6WIND is an Affiliate member of the Intel® Embedded Alliance and a leading supplier of packet processing software for networking and telecom platforms. With optimized support for multi-core platforms, 6WINDGate allows system developers to focus their design efforts on their own value-added software components. With 6WINDGate, customers achieve wire-speed performance while accelerating their time-to-market and minimizing their systems’ power consumption.
This entry was posted on Friday, September 2nd, 2011 at 4:44 pm and is filed under Articles.