Date of Award


Document Type


Degree Name

Master of Science (MS)

Legacy Department

Computer Engineering


Birchfield, Stan

Committee Member

Dean , Brian

Committee Member

Hoover , Adam


Our current work presents an approach to tackle the challenging task of tracking objects in Internet videos taken from large web repositories such as YouTube. Such videos more often than not, are captured by users using their personal hand-held cameras and cellphones and hence suffer from problems such as poor quality, camera jitter and unconstrained lighting and environmental settings. Also, it has been observed that events being recorded by such videos usually contain objects moving in an unconstrained fashion. Hence, tracking objects in Internet videos is a very challenging task in the field of computer vision since there is no a-priori information about the types of objects we might encounter, their velocities while in motion or intrinsic camera parameters to estimate the location of object in each frame. Hence, in this setting it is clearly not possible to model objects as single homogenous distributions in feature space. The feature space itself cannot be fixed since different objects might be discriminable in different sub-spaces.
Keeping these challenges in mind, in the current proposed technique, each object is divided into multiple fragments or regions and each fragment is represented in Gaussian Mixture model (GMM) in a joint feature-spatial space. Each fragment is automatically selected from the image data by adapting to image statistics using a segmentation technique. We introduce the concept of strength map which represents a probability distribution of the image statistics and is used to detecting the object. We extend our goal of tracking object to tracking them with accurate boundaries thereby making the current task more challenging. We solve this problem by modeling the object using a level sets framework, which helps in preserving accurate boundaries of the object and as well in modeling the target object and background. These extracted object boundaries are learned dynamically over time, enabling object tracking even during occlusion. Our proposed algorithm performs significantly better than any of the existing object modeling techniques. Experimental results have been shown in support of this claim. Apart from tracking, the present algorithm can also be applied to different scenarios. One such application is contour-based object detection. Also, the idea of strength map was successfully applied to track objects such as vessels and vehicles on a wide range of videos, as a part of the summer internship program.