Inside network programming with SML

Posted: 17 Nov 2003

Keywords: sml, programming, network programming language, rtsp, udp

Service management language (SML) falls into a niche of programming languages and frameworks called network-programming languages. P-Cube created SML after realizing that no earlier language had provided the ability to describe and define network events at all protocol layers, including the application layer. Extending protocol abstraction to the application layer improves productivity by supporting a top-down design approach, which masks the details of protocol messages and packet formats from the programmer.

Most network-programming languages focus on setting rules and constraints for packet headers and, as a result, take a bottom-up approach in which a set of constraints is defined for each packet. While this makes it simple to create programs that classify traffic according to packet characteristics (such as "drop all UDP packets" or "count all TCP packets to destination subnet A with source port number XYZ"), such languages do not provide a mechanism to track, analyze, filter and control application flows (such as "drop all HTTP flows with a user-agent field X and no cookie information" or "count the number of RTSP connections with at least one video and two audio transmissions"). This is the problem that SML addresses.

Although network communication is packetized (a message is divided into smaller packets when transmitted from source to destination), logical application information cannot be obtained from any single packet. Hence the need for a new programming language for application-layer traffic analysis: accurate and complete analysis cannot be performed using packet-based rules.

Rather, a programming language in which code is used to parse message exchanges with application-level awareness, regardless of how payloads were divided into packets, is necessary for even basic levels of traffic analysis sophistication.

The SML framework provides programmers with a per-application flow context and allows them to create programs that parse all payload bytes of a connection as one contiguous stream of bytes. For example, to search for the string "404 not found," the programmer can simply match that string against all open connections, regardless of whether the network has divided this text into one, two or 10 packets.
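The idea of matching against a stream rather than against packets can be sketched in Python. This is not SML syntax, which the article does not show; the class name StreamMatcher and its feed method are hypothetical. The sketch keeps the last few bytes of the stream so that a string split across packet boundaries is still found.

```python
class StreamMatcher:
    """Finds a byte pattern in a stream that arrives as arbitrary packet payloads."""

    def __init__(self, pattern: bytes):
        if not pattern:
            raise ValueError("pattern must be non-empty")
        self.pattern = pattern
        self._tail = b""       # last len(pattern) - 1 bytes of the stream so far
        self.matched = False

    def feed(self, payload: bytes) -> bool:
        """Feed the next payload; return True once the pattern has been seen."""
        if self.matched:
            return True
        window = self._tail + payload
        if self.pattern in window:
            self.matched = True
        else:
            keep = len(self.pattern) - 1
            # Retain just enough bytes to catch a match spanning the next boundary.
            self._tail = window[-keep:] if keep else b""
        return self.matched


matcher = StreamMatcher(b"404 not found")
for packet in (b"HTTP/1.1 40", b"4 not f", b"ound\r\n"):
    matcher.feed(packet)
print(matcher.matched)
```

The caller never reasons about packet boundaries; it only feeds payloads in order, which is the property the paragraph above attributes to SML's flow context.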

Further, SML provides the means to address advanced protocols in which application flows traverse more than a single network connection and bundles these flows into a single simplified context. For example, an RTSP application flow consists of one control flow (used by the streaming client and server to initiate streams, pause, rewind and the like) and multiple media flows carrying the actual digitized audio and video information. For these application flows, SML provides a simple means to relate multiple flows and maintain state information for the entire application flow.

Using the flow

SML code is executed in the context of a flow and concerns itself with the traffic stream rather than requiring logical code for each individual packet. A flow is a uniquely identified IP conversation between two entities: its quintuple representation includes both conversing entities' IP addresses and ports, and the protocol number being used (typically TCP or UDP). Each flow creates a virtual machine with an execution pointer and complete state information.
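A minimal Python sketch of flow identification by quintuple follows. The names FlowKey, FlowContext and FlowTable are hypothetical and only illustrate the concept; the per-flow "virtual machine" is reduced here to a small context object holding counters.

```python
from dataclasses import dataclass
from typing import Dict, NamedTuple


class FlowKey(NamedTuple):
    ip_a: str
    port_a: int
    ip_b: str
    port_b: int
    protocol: int  # IP protocol number: 6 = TCP, 17 = UDP


def flow_key(src_ip, src_port, dst_ip, dst_port, protocol) -> FlowKey:
    """Build a direction-independent key so both directions map to one flow."""
    a, b = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return FlowKey(a[0], a[1], b[0], b[1], protocol)


@dataclass
class FlowContext:
    packets: int = 0
    bytes_seen: int = 0


class FlowTable:
    def __init__(self) -> None:
        self._flows: Dict[FlowKey, FlowContext] = {}

    def on_packet(self, src_ip, src_port, dst_ip, dst_port, protocol,
                  payload: bytes) -> FlowContext:
        """Look up (or create) the context for the packet's flow and update it."""
        key = flow_key(src_ip, src_port, dst_ip, dst_port, protocol)
        ctx = self._flows.setdefault(key, FlowContext())
        ctx.packets += 1
        ctx.bytes_seen += len(payload)
        return ctx

    def __len__(self) -> int:
        return len(self._flows)
```

Sorting the two endpoints is one design choice among several; it makes a request and its reply land in the same context without tracking which side opened the conversation.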

Flow bundling simplifies the SML coding for the analysis and control of multiflow protocols such as Session Initiation Protocol (SIP), RTSP and FTP. Additionally, because regular-expression patterns can be defined to match the application-layer payload, signature-based detection of protocols is straightforward. This is extremely useful for the detection of P2P protocols such as Kazaa, WinMX or eDonkey, which typically are not confined to any particular port number and are essentially undetectable when packet and header-level rules are used.
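Signature-based detection can be illustrated with Python's re module. The patterns below are simplified, illustrative signatures written for this sketch, not the signatures P-Cube ships; the function name classify is hypothetical. The point is that classification looks only at the first bytes of the payload stream and ignores the port number entirely.

```python
import re
from typing import Optional

# Illustrative, simplified signatures matched against the start of a flow's payload.
SIGNATURES = {
    "http": re.compile(rb"^(GET|POST|HEAD|PUT|DELETE|OPTIONS) \S+ HTTP/1\.[01]\r\n"),
    "rtsp": re.compile(rb"^(DESCRIBE|SETUP|PLAY|PAUSE|TEARDOWN|OPTIONS) \S+ RTSP/1\.0\r\n"),
    "gnutella": re.compile(rb"^GNUTELLA CONNECT/\d+\.\d+\r\n"),
}


def classify(stream_prefix: bytes) -> Optional[str]:
    """Return the name of the first signature matching the stream prefix, if any."""
    for name, pattern in SIGNATURES.items():
        if pattern.match(stream_prefix):
            return name
    return None


# The same payload is classified identically on port 80, 8080 or 31337.
print(classify(b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n"))
```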

A typical SML application monitors IP traffic and tracks application-specific message exchanges. During the lifetime of the application flow, the SML program can use variables defined on a per-flow basis to maintain state information for flows according to the way the application progresses.

It is important to note that multiple application flows typically progress in parallel, and therefore, in high-speed networks it is not uncommon for hundreds of thousands of these flows to traverse the network concurrently. The environment in which the SML program executes automatically maintains the state information for these application flows. Cross-flow state information can be maintained and managed with the definition of global parameters, which can be shared between application flows as needed: for example, in counting the number of failed HTTP connections on the entire link.
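The split between per-flow state and link-wide state can be sketched as follows. LinkMonitor and its fields are hypothetical names, and treating any status code of 400 or above as a "failed" HTTP connection is an assumption made for this example only.

```python
from collections import defaultdict


class LinkMonitor:
    def __init__(self) -> None:
        self.flow_state = defaultdict(dict)       # per-flow variables, keyed by flow id
        self.global_vars = {"failed_http": 0}     # shared across all flows on the link

    def on_response_line(self, flow_id, status_line: bytes) -> None:
        """Record the HTTP status for a flow; count each failed flow once globally."""
        parts = status_line.split(b" ", 2)
        if len(parts) < 2 or not parts[0].startswith(b"HTTP/") or not parts[1].isdigit():
            return  # not an HTTP status line
        status = int(parts[1])
        state = self.flow_state[flow_id]
        state["last_status"] = status
        # Assumption for this sketch: status >= 400 marks the connection as failed.
        if status >= 400 and not state.get("counted_failed"):
            state["counted_failed"] = True
            self.global_vars["failed_http"] += 1
```

The per-flow flag prevents a single flow from being counted twice, while the global counter accumulates across every flow on the link.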

SML structure

When attempting to analyze application protocols, two levels require consideration. The first is the specific protocol messages, their encoding and the content of the various fields in the packets. The second is the interrelation between the different messages that comprise the protocol. Interrelations can include, for example, a response message from one end that always follows a request message from the other end of the communications channel. SML is constructed so that analysis of packets can occur at both of these levels using abstract data types (ADTs).
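The second level, the interrelation between messages, amounts to a small state machine per flow. The sketch below is a hypothetical Python illustration (the class RequestResponseTracker is not part of SML) of the rule that a response must follow a request from the other end.

```python
class RequestResponseTracker:
    """Checks that every response on a flow is preceded by an unanswered request."""

    def __init__(self) -> None:
        self.pending = 0      # requests still waiting for a response
        self.violations = 0   # responses that arrived with no request outstanding

    def on_message(self, kind: str) -> None:
        if kind == "request":
            self.pending += 1
        elif kind == "response":
            if self.pending == 0:
                self.violations += 1
            else:
                self.pending -= 1
        else:
            raise ValueError(f"unknown message kind: {kind!r}")
```

How each message is recognized as a request or a response is the first level (message encoding and fields); this tracker only consumes the result of that step.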

SML's ADTs are used for describing protocol messages covering the protocol message specification, encoding and content. ADTs can include text-based patterns, regular expressions or binary encoding schemes. An ADT may be represented by using message description languages such as ASN.1 or XML, by using regular expressions to define textual patterns or by using raw structures for protocols that use binary encoding.

An ADT is defined independently of the programs in which it is used, which facilitates the abstraction of protocol-specific events, identification and reuse. The protocols currently implemented in SML include HTTP, FTP, SIP, H.323, RTSP, SDP, POP3, SNMP, NNTP, WAP, MMS, Kazaa, eDonkey, WinMX, Gnutella and many others. Prefabricated ADTs can be used quickly in SML code to detect protocols in network traffic.

SML supports the normal set of C-like conditional statements, loops and expressions. In addition, an extensive set of predefined system, networking and general-purpose functions is provided as part of the language, leveraging important functionality of the service engine platform (and hardware optimizations).

SML variables can be defined in one of four memory contexts: flow, bound, party or global. Flow variables hold a state or value that is limited to the scope of the current flow being analyzed. Bound variables hold state shared by a set of dynamically bundled flows, within that group's scope, while state shared by all flows of a specific user is of the party type and scope. Global variables are shared across all application flows.
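The four scopes can be pictured as four lookup tables keyed at different granularities. The following Python sketch uses hypothetical names (Flow, ContextStore, vars) and is not SML syntax; it only shows which flows end up sharing a variable in each scope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    flow_id: int     # unique per flow
    bundle_id: int   # shared by flows bundled together (e.g. RTSP control + media)
    party_id: str    # shared by all flows of one user


class ContextStore:
    def __init__(self) -> None:
        self._stores = {"flow": {}, "bound": {}, "party": {}}
        self._global = {}

    def vars(self, scope: str, flow: Flow) -> dict:
        """Return the variable dictionary visible to `flow` in the given scope."""
        if scope == "global":
            return self._global
        key = {"flow": flow.flow_id,
               "bound": flow.bundle_id,
               "party": flow.party_id}[scope]
        return self._stores[scope].setdefault(key, {})


store = ContextStore()
control = Flow(flow_id=1, bundle_id=100, party_id="alice")
media = Flow(flow_id=2, bundle_id=100, party_id="alice")
store.vars("bound", control)["media_streams"] = 2
print(store.vars("bound", media)["media_streams"])  # same bundle, same variable
```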

SML code's main building blocks are events, organized in Java-like packages. An SML event may include a combination of matches against other events and/or ADTs, invocations of the extensive set of networking actions available (system calls), and a set of variables with their respective manipulating expressions.

Platforms running the SML code may encounter up to a million concurrent flows relating to hundreds of thousands of users. Each flow or user has its own flow or party context. Specific customized memory management subsystems are required to optimize and limit the footprint of large and complex memory contexts. The SML compiler optimizations significantly increase the number of flows that can be handled concurrently and scrub SML code for efficient dynamic allocation of memory when producing executables.

SML development

SML is the vehicle that enables P-Cube to succeed in quickly developing a variety of applications. Designers developed these applications on top of an infrastructure component called protocol libraries.

Protocol libraries include the implementation of the ADTs and the messages' interrelations for the specific protocol. Applications connect to protocol libraries through a set of protocol-related callbacks, allowing the application developer to enjoy the benefits of code reuse and protocol abstraction. The protocol library is depicted at the most detailed level, and the SML compiler optimizes the memory signature based on the specific use of the protocol library that the application makes.

In addition to protocol libraries, the SML development environment includes an SML compiler that accepts SML code as input and translates it into an executable file for the target platform. P-Cube's SML cross-compiler can create executables for various types of service engine hardware, or generate files in a binary format that can be run on the SML simulator and SML debugger for testing and debugging purposes. Additional SML back ends developed for other platforms and binary formats can support the execution of SML code on any platform that provides the necessary foundations for the execution of stateful network applications with Layer 7 inspection.

- Opher Reviv

Compiler Group Manager

P-Cube Ltd
