Date of Award

12-2018

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Electrical and Computer Engineering (Holcomb Dept. of)

Committee Member

Melissa C. Smith, Committee Chair

Committee Member

Robert J. Schalkoff

Committee Member

Harlan Russell

Committee Member

Amy Apon

Abstract

Computer vision has become ubiquitous in today's society, with applications ranging from medical imaging and visual diagnostics to aerial monitoring and self-driving vehicles. Common to many of these applications are visual perception systems, which consist of classification, localization, detection, and segmentation components, among others. Recently, the development of deep neural networks (DNNs) has led to great advancements in state-of-the-art performance in each of these areas. Unlike traditional computer vision algorithms, DNNs have the ability to generalize features that were previously hand-crafted by engineers for each specific application; this models the human visual system's ability to generalize its surroundings. Moreover, convolutional neural networks (CNNs) have been shown not only to match but to exceed the performance of traditional computer vision algorithms, as the filters of the network are able to learn "important" features present in the data. In this research we aim to develop numerous applications, including visual warehouse diagnostics and shipping yard management systems, aerial monitoring and tracking from the perspective of a drone, a perception system model for an autonomous vehicle, and vehicle re-identification for surveillance and security. The deep learning models developed for each application attempt to match or exceed state-of-the-art performance in both accuracy and inference time; however, designing a network typically involves a trade-off in which only one of the two can be maximized. We investigate numerous object-detection architectures, including Faster R-CNN, SSD, YOLO, and a few other variations, in an attempt to determine the best architecture for each application. We constrain the performance metrics to inference times rather than training times, as none of the optimizations performed in this research have any effect on training time.
Further, we investigate re-identification of vehicles as a separate add-on application to the object-detection pipeline. Re-identification allows for a more robust representation of the data while leveraging techniques for security and surveillance. We also compare architectures, which could lead to the development of new architectures able not only to perform inference relatively quickly (in or close to real time) but also to match the state of the art in accuracy. New architecture development, however, depends on the application and its requirements; some applications need to run on edge-computing (EC) devices, while others have slightly larger inference windows, which allow for cloud computing with powerful accelerators.
