Date of Award

12-2012

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Legacy Department

Electrical Engineering

Committee Chair/Advisor

Brooks, Richard R

Committee Member

Brooks, Richard R

Committee Member

Russell, Harlan B

Committee Member

Hoover, Adam

Committee Member

Lund, Robert

Abstract

Network traffic analysis is widely used to infer information from Internet
traffic. This is possible even if the traffic is encrypted. Previous work uses
traffic characteristics, such as port numbers, packet sizes, and frequency,
without looking for more subtle patterns in the network traffic. In this work,
we use stochastic grammars, specifically hidden Markov models (HMMs) and
probabilistic context-free grammars (PCFGs), as pattern recognition tools for
traffic analysis.

HMMs are widely used for pattern recognition and detection. We use an HMM
inference approach. With inferred HMMs, we use confidence intervals (CIs) to
detect whether a data sequence matches the HMM. To compare HMMs, we define a
normalized Markov metric. A statistical test is used to determine model
equivalence. Our metric systematically removes the least likely events from both
HMMs until the remaining models are statistically equivalent. This defines the
distance between models. We extend the use of HMMs to PCFGs, which have more
expressive power. We estimate PCFG production probabilities from data. A
statistical test is used for detection.
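
The confidence-interval check described above can be illustrated with a minimal
sketch. It is not the dissertation's procedure, which this abstract does not
specify. It assumes a simplified model in which each state is directly
observable (a plain Markov chain rather than a full HMM), uses a Wilson score
interval as the CI, and introduces the hypothetical names `wilson_interval` and
`fraction_within_ci` purely for illustration.

```python
import math
from collections import Counter

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if trials == 0:
        return (0.0, 1.0)
    p_hat = successes / trials
    denom = 1 + z * z / trials
    center = (p_hat + z * z / (2 * trials)) / denom
    half = z * math.sqrt(
        p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials)
    ) / denom
    return (max(0.0, center - half), min(1.0, center + half))

def fraction_within_ci(model, sequence, z=1.96, eps=1e-9):
    """Fraction of model transition probabilities that fall inside the CI
    of the transition frequencies observed in `sequence`.

    model: {state: {next_state: probability}} (assumed, simplified form)
    sequence: list of observed states
    """
    pair_counts = Counter(zip(sequence, sequence[1:]))
    state_counts = Counter(sequence[:-1])
    checked = inside = 0
    for state, transitions in model.items():
        n = state_counts[state]
        if n == 0:
            continue  # state never visited, so nothing to compare
        for nxt, prob in transitions.items():
            lo, hi = wilson_interval(pair_counts[(state, nxt)], n, z)
            checked += 1
            # eps guards against floating-point rounding at the interval ends
            inside += (lo - eps) <= prob <= (hi + eps)
    return inside / checked if checked else 0.0

# Hypothetical model and traffic symbols, for illustration only.
model = {"a": {"a": 0.5, "b": 0.5}, "b": {"a": 1.0}}
matching = list("aab" * 50)      # transition frequencies agree with the model
mismatching = list("ab" * 75)    # state "a" never repeats, unlike the model
match_score = fraction_within_ci(model, matching)
mismatch_score = fraction_within_ci(model, mismatching)
```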

We present three applications of HMM and PCFG detection to network traffic
analysis. First, we infer the presence of protocol tunneling through the Tor
(The Onion Router) anonymization network. The Markov metric quantifies the
similarity of network traffic HMMs in Tor to identify the protocol. It also
measures communication noise in the Tor network.

Second, we use HMMs to detect centralized botnet traffic. We infer HMMs from
botnet traffic data and use them to detect botnet infections. Experimental
results show that HMMs can accurately detect Zeus botnet traffic.

Third, we use PCFGs to detect peer-to-peer (P2P) botnet traffic. To better hide
their locations, newer botnets use P2P control structures, and hierarchical P2P
botnets contain recursive and hierarchical patterns. Experiments on real-world
traffic data show that PCFGs can accurately differentiate between P2P botnet
traffic and normal Internet traffic.
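
Estimating PCFG production probabilities from data, as mentioned in the
abstract, is commonly done by relative frequency: each production's count is
divided by the total count of productions sharing its left-hand side. The
sketch below shows that standard estimator only. It assumes the productions
used in each derivation are already available, and the function name
`estimate_production_probs` and the toy grammar are hypothetical. The abstract
does not state which estimator the dissertation uses.

```python
from collections import Counter

def estimate_production_probs(observed_productions):
    """Relative-frequency estimate of PCFG production probabilities.

    observed_productions: iterable of (lhs, rhs) pairs, where rhs is a tuple
    of symbols, collected from derivations of the training data (assumed given).
    """
    counts = Counter(observed_productions)
    lhs_totals = Counter()
    for (lhs, _rhs), count in counts.items():
        lhs_totals[lhs] += count
    # P(lhs -> rhs) = count(lhs -> rhs) / count(lhs)
    return {
        (lhs, rhs): count / lhs_totals[lhs]
        for (lhs, rhs), count in counts.items()
    }

# Toy recursive grammar, for illustration only: S -> A S | A, A -> a
observed = [
    ("S", ("A", "S")),
    ("S", ("A", "S")),
    ("S", ("A",)),
    ("A", ("a",)),
    ("A", ("a",)),
    ("A", ("a",)),
]
probs = estimate_production_probs(observed)
```

By construction, the estimated probabilities for productions with the same
left-hand side sum to one, which is the normalization a PCFG requires.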
