Tuesday, July 2, 2013

PAPER: Behavioral Clustering of HTTP-Based Malware and Signature Generation Using Malicious Network Traces

I've recently read another paper for the first-week quiz of the Coursera course "Malicious Software and its Underground Economy: Two Sides to Every Story" and I'm happy to share it with you.

Roberto Perdisci, Wenke Lee, and Nick Feamster

We present a novel network-level behavioral malware clustering system. We focus on analyzing the structural similarities among malicious HTTP traffic traces generated by executing HTTP-based malware. Our work is motivated by the need to provide quality input to algorithms that automatically generate network signatures. Accordingly, we define similarity metrics among HTTP traces and develop our system so that the resulting clusters can yield high-quality malware signatures. We implemented a proof-of-concept version of our network-level malware clustering system and performed experiments with more than 25,000 distinct malware samples. Results from our evaluation, which includes real-world deployment, confirm the effectiveness of the proposed clustering system and show that our approach can aid the process of automatically extracting network signatures for detecting HTTP traffic generated by malware-compromised machines.
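To make the idea of structural similarity between HTTP traces concrete, here is a minimal sketch in Python. It is not the metric defined in the paper: the token scheme, the Jaccard measure, the 0.5 threshold, and the names `request_tokens`, `jaccard`, and `cluster_requests` are all assumptions chosen for illustration.

```python
from urllib.parse import urlsplit, parse_qsl


def request_tokens(method, url):
    """Reduce a request to structural tokens: method, path segments, parameter names.

    Host names and parameter values are ignored on purpose (an assumption of
    this sketch), so requests to different servers can still look alike.
    """
    parts = urlsplit(url)
    tokens = {"method:" + method.upper()}
    tokens.update("path:" + seg for seg in parts.path.split("/") if seg)
    tokens.update("param:" + name
                  for name, _ in parse_qsl(parts.query, keep_blank_values=True))
    return frozenset(tokens)


def jaccard(a, b):
    """Jaccard similarity of two token sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_requests(requests, threshold=0.5):
    """Group requests whose similarity reaches the threshold.

    Uses union-find over all pairs, which is equivalent to cutting a
    single-linkage clustering at the given similarity threshold.
    """
    feats = [request_tokens(m, u) for m, u in requests]
    parent = list(range(len(feats)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(feats)):
        for j in range(i + 1, len(feats)):
            if jaccard(feats[i], feats[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i, req in enumerate(requests):
        groups.setdefault(find(i), []).append(req)
    return list(groups.values())
```

Each resulting group could then be handed to a signature generator, which is the role the abstract assigns to the clusters.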


PAPER: BotMiner: Clustering Analysis of Network Traffic for Protocol- and Structure-Independent Botnet Detection

I've recently read another paper for the first-week quiz of the Coursera course "Malicious Software and its Underground Economy: Two Sides to Every Story" and I'm happy to share it with you.

Guofei Gu, Junjie Zhang, Wenke Lee, and Roberto Perdisci

Botnets are now the key platform for many Internet attacks, such as spam, distributed denial-of-service (DDoS), identity theft, and phishing. Most of the current botnet detection approaches work only on specific botnet command and control (C&C) protocols (e.g., IRC) and structures (e.g., centralized), and can become ineffective as botnets change their C&C techniques. In this paper, we present a general detection framework that is independent of botnet C&C protocol and structure, and requires no a priori knowledge of botnets (such as captured bot binaries and hence the botnet signatures, and C&C server names/addresses). We start from the definition and essential properties of botnets. We define a botnet as a coordinated group of malware instances that are controlled via C&C communication channels. The essential properties of a botnet are that the bots communicate with some C&C servers/peers, perform malicious activities, and do so in a similar or correlated way. Accordingly, our detection framework clusters similar communication traffic and similar malicious traffic, and performs cross-cluster correlation to identify the hosts that share both similar communication patterns and similar malicious activity patterns. These hosts are thus bots in the monitored network. We have implemented our BotMiner prototype system and evaluated it using many real network traces. The results show that it can detect real-world botnets (IRC-based, HTTP-based, and P2P botnets including Nugache and Storm worm), and has a very low false positive rate.
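The cross-cluster correlation step can be pictured with a small Python sketch. This is a simplification rather than BotMiner's actual scoring: it assumes the communication and activity clusters are already available as sets of host addresses, and the function name `correlate` and the "at least two shared hosts" rule are hypothetical.

```python
def correlate(comm_clusters, activity_clusters, min_shared=2):
    """Return hosts that sit together in both kinds of cluster.

    comm_clusters: iterable of sets of hosts with similar communication traffic.
    activity_clusters: iterable of sets of hosts with similar malicious activity.
    A host is flagged only if it shares BOTH a communication cluster and an
    activity cluster with at least (min_shared - 1) other hosts.
    """
    suspects = set()
    for comm in comm_clusters:
        for activity in activity_clusters:
            shared = set(comm) & set(activity)
            if len(shared) >= min_shared:
                suspects |= shared
    return suspects


if __name__ == "__main__":
    comm = [{"10.0.0.1", "10.0.0.2", "10.0.0.3"}, {"10.0.0.4"}]
    activity = [{"10.0.0.1", "10.0.0.2"}, {"10.0.0.4", "10.0.0.5"}]
    print(sorted(correlate(comm, activity)))
```

In this toy input only 10.0.0.1 and 10.0.0.2 behave alike on both planes, so they are the only hosts reported; 10.0.0.4 appears in both kinds of cluster but never together with another host.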

PAPER: Mining the Network Behavior of Bots

I've recently read another paper for the first-week quiz of the Coursera course "Malicious Software and its Underground Economy: Two Sides to Every Story" and I'm happy to share it with you.

Lorenzo Cavallaro, Christopher Kruegel, and Giovanni Vigna

A botnet is a network of compromised hosts that fulfills the malicious intents of an attacker. Once installed, a bot is typically used to steal sensitive information, send spam, perform DDoS attacks, and carry out other illegal activities. Research in botnet detection has been quite prolific in the past years, producing detection mechanisms that focus on specific command and control structures, or on the correlation between the activities of the bots and the communication patterns shared by multiple infected machines. We present an approach that aims to detect bot-infected hosts. Our approach (i) is independent of the underlying botnet structure, (ii) is able to detect individually infected hosts, (iii) deals with encrypted communication, (iv) does not rely on the presence of noisy malicious activities and can thus detect legitimate-resembling communication patterns, and (v) has a low false positive rate. Our technique starts by monitoring a network trace produced by a bot sample B, which is summarized into a set of network flows. Similar flows are then grouped together by relying on a hierarchical clustering algorithm. The resulting clusters are analyzed for evidence of periodic behaviors. If no periodic behaviors are found, an output-based system selects those clusters that recur the most across different network traces obtained by running the sample B multiple times. Finally, our analysis automatically produces a network behavior model of B, which is deployed on a Bro NIDS sensor that operates in real-time and realistic settings, raising few false positives.
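The abstract does not say how periodic behavior is detected, so the following Python sketch is only one plausible way to test a flow cluster for it. It assumes each flow in a cluster carries a start timestamp in seconds; the function name `looks_periodic`, the coefficient-of-variation test, and both thresholds are assumptions.

```python
from statistics import mean, pstdev


def looks_periodic(timestamps, max_cv=0.1, min_events=4):
    """Heuristic periodicity check on the start times of clustered flows.

    Computes the gaps between consecutive start times and reports True when
    their coefficient of variation (std / mean) is small, i.e. the flows are
    spaced almost evenly.
    """
    if len(timestamps) < min_events:
        return False  # too few flows to say anything
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_gap = mean(gaps)
    if average_gap <= 0:
        return False  # all flows started at the same instant
    return pstdev(gaps) / average_gap <= max_cv


if __name__ == "__main__":
    beacon = [0, 60, 121, 180, 240]   # roughly one flow per minute
    browsing = [0, 5, 100, 130, 400]  # irregular, human-like spacing
    print(looks_periodic(beacon), looks_periodic(browsing))
```

A bot that checks in with its controller at a fixed interval would look like the first list, while ordinary user traffic tends to resemble the second.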

PAPER: From Throw-Away Traffic to Bots: Detecting the Rise of DGA-Based Malware

I've recently read this paper for the first-week quiz of the Coursera course "Malicious Software and its Underground Economy: Two Sides to Every Story" and I'm happy to share it with you.

Manos Antonakakis, Roberto Perdisci, Yacin Nadji, Nikolaos Vasiloglou, Saeed Abu-Nimeh, Wenke Lee and David Dagon

Many botnet detection systems employ a blacklist of known command and control (C&C) domains to detect bots and block their traffic. Similar to signature-based virus detection, such a botnet detection approach is static because the blacklist is updated only after running an external (and often manual) process of domain discovery. As a response, botmasters have begun employing domain generation algorithms (DGAs) to dynamically produce a large number of random domain names and select a small subset for actual C&C use. That is, a C&C domain is randomly generated and used for a very short period of time, thus rendering detection approaches that rely on static domain lists ineffective. Naturally, if we know how a domain generation algorithm works, we can generate the domains ahead of time and still identify and block botnet C&C traffic. The existing solutions are largely based on reverse engineering of the bot malware executables, which is not always feasible. In this paper we present a new technique to detect randomly generated domains without reversing. Our insight is that most of the DGA-generated (random) domains that a bot queries would result in Non-Existent Domain (NXDomain) responses, and that bots from the same botnet (with the same DGA algorithm) would generate similar NXDomain traffic. Our approach uses a combination of clustering and classification algorithms. The clustering algorithm clusters domains based on the similarity in the make-ups of domain names as well as the groups of machines that queried these domains. The classification algorithm is used to assign the generated clusters to models of known DGAs. If a cluster cannot be assigned to a known model, then a new model is produced, indicating a new DGA variant or family. We implemented a prototype system and evaluated it on real-world DNS traffic obtained from large ISPs in North America. We report the discovery of twelve DGAs. Half of them are variants of known (botnet) DGAs, and the other half are brand new DGAs that have never been reported before.
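The two signals the abstract mentions, the make-up of the domain names and the set of machines that queried them, can be illustrated with a short Python sketch. It is not the paper's feature set or clustering algorithm: the three features, the exact-match grouping, and the names `name_features` and `group_by_querying_hosts` are assumptions, and the input is assumed to be a list of (host, domain) pairs taken from NXDomain responses.

```python
import math
from collections import Counter, defaultdict


def name_features(domain):
    """Return (length, character entropy in bits, digit ratio) of the leftmost label."""
    label = domain.split(".")[0].lower()
    if not label:
        return 0, 0.0, 0.0
    counts = Counter(label)
    entropy = sum(-(n / len(label)) * math.log2(n / len(label))
                  for n in counts.values())
    digit_ratio = sum(ch.isdigit() for ch in label) / len(label)
    return len(label), entropy, digit_ratio


def group_by_querying_hosts(nx_log):
    """Group NXDomain names by the exact set of hosts that queried them.

    nx_log: iterable of (host, domain) pairs. A real system would use a
    similarity measure between host sets; exact matching keeps the sketch short.
    """
    hosts_per_domain = defaultdict(set)
    for host, domain in nx_log:
        hosts_per_domain[domain].add(host)
    groups = defaultdict(list)
    for domain, hosts in hosts_per_domain.items():
        groups[frozenset(hosts)].append(domain)
    return {hosts: sorted(domains) for hosts, domains in groups.items()}
```

Domains that land in the same group and also have similar name features (for example long, high-entropy labels) would be candidates for one DGA cluster, which a classifier could then compare against models of known DGAs.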

Monday, July 1, 2013

PAPER: Know Your Enemy: Fast-Flux Service Networks

I've recently read that paper from The Honeynet Project to better understand what fast-flux is and I'm happy to share it with you. 

An Ever Changing Enemy

Primary Authors:
William Salusky
Robert Danford
Last Modified: 13 July, 2007

One of the most active threats we face today on the Internet is cyber-crime. Increasingly capable criminals are constantly developing more sophisticated means of profiting from online criminal activity. This paper demonstrates a growing, sophisticated technique called fast-flux service networks which we are seeing increasingly used in the wild. Fast-flux service networks are a network of compromised computer systems with public DNS records that are constantly changing, in some cases every few minutes. These constantly changing architectures make it much more difficult to track down criminal activities and shut down their operations.

In this paper we will first provide an overview of what fast-flux service networks are, how they operate, and how the criminal community is leveraging them, including two types which we have designated as single-flux and double-flux service networks. We then provide several examples of fast-flux service networks recently observed in the wild. Next we detail how fast-flux service network malware operates and present the results of research where a honeypot was purposely infected with a fast-flux agent. Finally we cover how to detect, identify, and mitigate fast-flux service networks, primarily in large networking environments. At the end we supply five appendixes providing additional information for those interested in digging into more technical detail.
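Since the defining trait of a fast-flux name is DNS records that change every few minutes, a rough detection heuristic can be sketched in Python. This is not the method from the Honeynet paper: it assumes repeated lookups of one name were already collected as (TTL, list of IPs) pairs, and the function name `flux_summary` and both thresholds are hypothetical.

```python
def flux_summary(observations, ttl_threshold=300, min_distinct_ips=5):
    """Summarize repeated A-record lookups of a single domain name.

    observations: list of (ttl_seconds, [ip, ...]) tuples, one per lookup.
    The name is marked suspicious when every answer had a short TTL and the
    lookups together revealed many different addresses.
    """
    distinct_ips = set()
    short_ttl_answers = 0
    for ttl, ips in observations:
        distinct_ips.update(ips)
        if ttl <= ttl_threshold:
            short_ttl_answers += 1
    all_short = bool(observations) and short_ttl_answers == len(observations)
    return {
        "distinct_ips": len(distinct_ips),
        "all_ttls_short": all_short,
        "suspicious": all_short and len(distinct_ips) >= min_distinct_ips,
    }


if __name__ == "__main__":
    # Addresses come from the documentation ranges; the data is made up.
    flux_like = [
        (180, ["192.0.2.10", "198.51.100.7"]),
        (180, ["203.0.113.5", "192.0.2.44"]),
        (180, ["198.51.100.90", "203.0.113.61"]),
    ]
    stable = [(86400, ["192.0.2.1"]), (86400, ["192.0.2.1"])]
    print(flux_summary(flux_like)["suspicious"], flux_summary(stable)["suspicious"])
```

Legitimate services such as content delivery networks can also rotate addresses with short TTLs, so a heuristic like this would only be a first filter before closer inspection.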

Malicious Software and its Underground Economy: Two Sides to Every Story @ Coursera

You all probably know what Coursera is, and I'm very happy that a course dedicated to malware has been available on the Coursera platform since the first of June.

The course name is: Malicious Software and its Underground Economy: Two Sides to Every Story
by Lorenzo Cavallaro.

If you don't know what Coursera is, here is a short excerpt from its About page:

"Coursera is an education company that partners with the top universities and organizations in the world to offer courses online for anyone to take, for free. Our technology enables our partners to teach millions of students rather than hundreds."

If you want more information about the course, here is a short excerpt from the course home page:

Cybercrime has become both more widespread and harder to battle. Researchers and anecdotal experience show that the cybercrime scene is becoming increasingly organized and consolidated, with strong links also to traditional criminal networks. Modern attacks are indeed stealthy and often profit oriented.
Malicious software (malware) is the traditional way in which cybercriminals infect user and enterprise hosts to gain access to their private, financial, and intellectual property data. Once stolen, such information can enable more sophisticated attacks, generate illegal revenue, and allow for cyber-espionage.
By mixing a practical, hands-on approach with the theory and techniques behind the scene, the course discusses the current academic and underground research in the field, trying to answer the foremost question about malware and underground economy, namely, "Should we care?".
Students will learn how traditional and mobile malware work, how they are analyzed and detected, peering through the underground ecosystem that drives this profitable but illegal business. Understanding how malware operates is of paramount importance to form knowledgeable experts, teachers, researchers, and practitioners able to fight back. Besides, it allows us to gather intimate knowledge of the systems and the threats, which is a necessary step to successfully devise novel, effective, and practical mitigation techniques.