L-7 Protocol analysis
Traditional intrusion detection systems have always relied on protocol-specific analysers to extract the context of a traffic stream. Essentially, an intrusion detection system inspects the packets flowing through the network and looks for anomalous behaviour in the stream. Traditional methods rely on standard well-known port numbers, an assumption that may not hold in the current network landscape. As attacks evolve towards more sophisticated means, the defence landscape needs to evolve and adapt to them. A few measures have already been implemented: detecting applications that do not use their standard ports, analysing the payload of each data transfer in the FTP protocol, and detecting C&C servers that use IRC as their underlying channel.
The major challenge posed to the classical method of analysis is identifying the protocol used within a stream. The classical method assumes well-known ports for certain applications, so if an attacker manually sets a different port for an application, the method breaks down. Sometimes an application is wrapped in another protocol in order to evade administrative and firewall restrictions. In such cases, L-7 analysers must first strip the outer wrapper of the leveraged common protocol and then interpret the inner semantics of the payload being sent. A simple example is torrent and file distribution chains that run hidden FTP servers on ports other than port 21. For these reasons, L-7 protocol decoders have to employ alternative means rather than relying on well-known port schemes. Leading IDSs such as Snort and Intrushield rely on protocol-specific analysis. This paper focuses on dynamic analysis of L-7 protocols. The basic principle is: "Detect the protocol, trigger the analyser".
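The "strip the outer wrapper, then inspect the inner payload" step can be sketched as follows. This is a minimal illustration, assuming the outer protocol is HTTP and the tunnelled data sits in the request body. The function names `strip_http_wrapper` and `looks_like_irc`, and the short lists of HTTP methods and IRC commands, are hypothetical choices for this sketch and are not taken from any particular IDS.

```python
from typing import Optional

# Hypothetical, non-exhaustive hints used only for this illustration.
HTTP_METHODS = (b"GET ", b"POST ", b"PUT ", b"HEAD ")
IRC_COMMANDS = (b"NICK ", b"USER ", b"JOIN ", b"PRIVMSG ", b"PING ", b"PONG ")


def strip_http_wrapper(data: bytes) -> Optional[bytes]:
    """Return the body of an HTTP request, or None if data is not HTTP."""
    if not data.startswith(HTTP_METHODS):
        return None
    _head, sep, body = data.partition(b"\r\n\r\n")
    if not sep:
        return None  # headers are not complete yet
    return body


def looks_like_irc(payload: bytes) -> bool:
    """True if the first line of the payload starts with a known IRC command."""
    first_line = payload.split(b"\r\n", 1)[0]
    return first_line.startswith(IRC_COMMANDS)


# Made-up example: IRC commands tunnelled inside an HTTP POST body.
wrapped = (
    b"POST /chat HTTP/1.1\r\nHost: example.test\r\n\r\n"
    b"NICK bot42\r\nUSER bot 0 * :bot\r\n"
)
inner = strip_http_wrapper(wrapped)
is_tunnelled_irc = inner is not None and looks_like_irc(inner)
```

Here the outer layer identifies the stream as HTTP, while the inner payload is what decides which analyser should ultimately handle it.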
Two of the noted methodologies for application protocol identification are statistical analysis of network traffic and searching the payload for protocol-specific byte patterns. The time lag between packets and the distribution of packet sizes can be used to distinguish messaging applications such as gTalk, Yahoo Messenger and Skype. Signature-based (byte-pattern) detection is a popular technique in virus scanners and anti-malware tools. Combining multiple methods, including manual detection, can yield results but is cumbersome in practice. Thus, most IDS/IPS systems prefer signature-based detection, as their infrastructure readily supports this sort of analysis. Connections that fail to complete the initial TCP handshake, or that complete it without exchanging any payload, cannot be analysed, as there is no content to inspect.
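As a rough sketch of the statistical approach, the features mentioned here (time lag between packets and the distribution of packet sizes) can be summarised per flow as shown below. The function name `flow_features`, the `(timestamp, size)` tuple layout and the sample values are assumptions made for illustration. How such features map to a specific application is left open, since that depends on a trained model or thresholds not given here.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple

Packet = Tuple[float, int]  # (arrival time in seconds, size in bytes)


def flow_features(packets: List[Packet]) -> Dict[str, float]:
    """Summarise a flow by its inter-arrival gaps and packet sizes."""
    if len(packets) < 2:
        raise ValueError("need at least two packets to compute gaps")
    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    # Gap between each packet and the one before it.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "mean_gap": mean(gaps),
        "gap_stdev": pstdev(gaps),
        "mean_size": mean(sizes),
        "size_stdev": pstdev(sizes),
    }


# Made-up sample flow: four packets, half a second apart.
sample = [(0.0, 100), (0.5, 120), (1.0, 100), (1.5, 120)]
features = flow_features(sample)
```

A feature vector like this would typically be passed to a classifier that compares it against profiles of known applications.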
Malware detection comes in various flavours in today's engines. The simplest engines perform only signature-based matching and report on that basis alone, whereas the more complex ones perform full stateful protocol analysis, including anomaly-based detection. Snort, a widely available open-source IDS, does not ship with signatures, but the signatures provided by its contributors can be used in our analysis. It supports regular expressions as well as raw byte pattern analysis. Another approach to detecting backdoors in traffic is to examine packet intervals and size distributions, and then analyse hard-coded patterns within the stream.
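The two matching styles mentioned here, raw byte patterns and regular expressions, can be illustrated with a small sketch. This is not Snort's rule language or engine. The `Signature` class, the `scan` function and the three example signatures are invented purely for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Signature:
    name: str
    raw: Optional[bytes] = None          # literal byte pattern
    regex: Optional[re.Pattern] = None   # compiled bytes regex

    def matches(self, payload: bytes) -> bool:
        if self.raw is not None and self.raw in payload:
            return True
        return self.regex is not None and self.regex.search(payload) is not None


# Illustrative signatures only; not taken from any real rule set.
SIGNATURES = [
    Signature("ftp-banner", regex=re.compile(rb"^220[ -].*FTP", re.IGNORECASE)),
    Signature("irc-join", regex=re.compile(rb"(?m)^JOIN #\S+")),
    Signature("elf-binary", raw=b"\x7fELF"),
]


def scan(payload: bytes) -> List[str]:
    """Return the names of all signatures that match the payload."""
    return [sig.name for sig in SIGNATURES if sig.matches(payload)]


hits = scan(b"NICK bot42\r\nJOIN #channel\r\n")
```

Raw byte patterns are cheap to test and suit fixed markers such as file headers, while regular expressions handle text protocols whose fields vary between sessions.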
When dealing with a malicious stream of packets, it is very important to have an environment that we are willing to sacrifice, because the behaviour of the malware is unknown when it is analysed for the first time. Attackers have advanced modern malware design by introducing polymorphic malware, which changes its signature on the fly. Though the defences against polymorphic packets are limited, a few methodologies do exist, such as cryptanalysis and emulation. Dynamic L-7 protocol analysis requires the flexibility to use multiple methods in parallel, such as running port-based detection hand in hand with signature-based detection. Sometimes a single packet is not enough to determine the protocol being used. In such cases, the design has to allow a stream of packets to be monitored before the protocol in use is decided. A classic example of dynamic L-7 protocol analysis is as follows: suppose the payload arriving on port 80 (typically HTTP) looks like an IRC session; the entire analysis is then carried out as IRC analysis instead of the usual HTTP analysis. This is where dynamic analysis scores over static analysis of protocol streams.
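The IRC-on-port-80 example can be sketched as a classifier that treats the port as a provisional hint, lets payload evidence override it, and withholds a verdict until enough of the stream has been seen. The prefix tables, the `MIN_BYTES` threshold and the `classify` function are assumptions chosen for illustration. A real detector would use much richer signatures.

```python
from typing import Optional

PORT_HINTS = {21: "ftp", 80: "http", 6667: "irc"}  # well-known port guesses

# Illustrative payload prefixes; real detectors use richer signatures.
PAYLOAD_PREFIXES = {
    "http": (b"GET ", b"POST ", b"HEAD ", b"HTTP/1."),
    "irc": (b"NICK ", b"USER ", b"PASS ", b"JOIN "),
    "ftp": (b"220 ", b"220-"),
}

MIN_BYTES = 8  # assumed threshold before a payload verdict is attempted


def classify(port: int, stream: bytes) -> Optional[str]:
    """Return a protocol name, or None while the verdict is still open."""
    if len(stream) < MIN_BYTES:
        return None  # keep monitoring the stream
    for proto, prefixes in PAYLOAD_PREFIXES.items():
        if stream.startswith(prefixes):
            return proto  # payload evidence overrides the port
    return PORT_HINTS.get(port)  # fall back to the port-based guess


# Packets arriving on port 80 that actually carry an IRC session.
packets = [b"NICK", b" bot42\r\n", b"USER bot 0 * :bot\r\n"]
stream = b""
verdicts = []
for pkt in packets:
    stream += pkt
    verdicts.append(classify(80, stream))
```

The first packet alone is too short to decide, so the verdict stays open. Once the second packet arrives, the payload identifies the session as IRC even though port 80 suggests HTTP.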
In the above section, we saw how useful protocol decoders are for stripping off the wrapper of protocols that are leveraged to evade administrative restrictions and firewalls. In this section, we look at decoders in more detail. A decoder takes raw packets as input and processes them into structured data in real time. A decoder must be able to restore the application layer data and provide uniform, structured data as its output. It should have real-time processing capacity and must aggregate data. Data at Layer 7 is a raw stream of packets consisting of application request/response data, and represents the transfer of data objects. Any decoder design must be scalable enough to accommodate emerging protocols and modular enough to avoid becoming overly complex. Decoders can be protocol-specific or can be used to analyse a combination of protocols.
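One way to picture these requirements (uniform structured output, and modularity so that new protocols can be added) is a small registry of per-protocol decoders that all emit the same record type. The sketch below is illustrative only: the `Record` type, the `register` decorator and both decoders are hypothetical, and the parsing is deliberately simplified (for instance, the IRC decoder ignores the optional message prefix).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Record:
    """Uniform structured output shared by every decoder."""
    protocol: str
    command: str
    fields: Dict[str, str] = field(default_factory=dict)


Decoder = Callable[[bytes], List[Record]]
DECODERS: Dict[str, Decoder] = {}


def register(protocol: str) -> Callable[[Decoder], Decoder]:
    """Decorator that plugs a decoder into the registry."""
    def wrap(func: Decoder) -> Decoder:
        DECODERS[protocol] = func
        return func
    return wrap


@register("irc")
def decode_irc(raw: bytes) -> List[Record]:
    # Simplified: one record per line, optional ":prefix" not handled.
    records = []
    for line in raw.decode("utf-8", errors="replace").split("\r\n"):
        if line:
            command, _, params = line.partition(" ")
            records.append(Record("irc", command, {"params": params}))
    return records


@register("http")
def decode_http(raw: bytes) -> List[Record]:
    # Simplified: only the request line is decoded.
    request_line = raw.decode("latin-1").split("\r\n", 1)[0]
    method, _, rest = request_line.partition(" ")
    target = rest.split(" ", 1)[0]
    return [Record("http", method, {"target": target})]


def decode(protocol: str, raw: bytes) -> List[Record]:
    """Dispatch raw bytes to the decoder registered for the protocol."""
    return DECODERS[protocol](raw)


irc_records = decode("irc", b"NICK bot42\r\nJOIN #chan\r\n")
```

Supporting an emerging protocol then means adding one more registered function, without touching the existing decoders.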
Dynamic analysis of protocols poses many challenges along with a number of benefits. The major challenge is that a protocol analyser cannot be started in the middle of a session, so the stream of packets must be buffered before the analyser is kicked off. A dynamic analyser adds and removes components as and when required, as illustrated by the example of an IRC session over an HTTP port. A major share of today's network traffic uses nonstandard ports, which calls for dynamic analysis of application layer protocols. The basic reason for using nonstandard ports, as stated above, is to evade security monitoring and policy enforcement. Various studies indicate that dynamic analysis of protocols helps increase the number of security breaches detected in a corporate network.
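The buffering requirement can be sketched as follows: data is held per flow until the protocol is known, and the buffered data is replayed when an analyser is attached, so that the analyser sees the session from its start. The `Flow` class and its method names are hypothetical and serve only to illustrate the idea.

```python
from typing import Callable, List, Optional

Analyser = Callable[[bytes], None]


class Flow:
    """Buffers a session's data until an analyser is attached."""

    def __init__(self) -> None:
        self.buffer: List[bytes] = []
        self.analyser: Optional[Analyser] = None

    def feed(self, chunk: bytes) -> None:
        if self.analyser is None:
            self.buffer.append(chunk)  # protocol not known yet
        else:
            self.analyser(chunk)

    def attach(self, analyser: Analyser) -> None:
        """Start an analyser and replay everything seen so far."""
        self.analyser = analyser
        for chunk in self.buffer:
            analyser(chunk)
        self.buffer.clear()

    def detach(self) -> None:
        """Remove the analyser, e.g. when its protocol guess proves wrong."""
        self.analyser = None


seen: List[bytes] = []
flow = Flow()
flow.feed(b"NICK bot42\r\n")
flow.feed(b"USER bot 0 * :bot\r\n")
flow.attach(seen.append)       # detection finished: replay the buffer
flow.feed(b"JOIN #chan\r\n")   # later data goes straight to the analyser
```

In this simplified version the buffer is discarded once it has been replayed, so an analyser attached after a `detach` would miss the earlier data. A fuller design would likely keep the buffer until the protocol verdict is confirmed.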