All Theses

Connecting Architecture, Fitness, Optimizations and Performance using an Anisotropic Diffusion Filter

Sumedh Naik, Clemson University

Date of Award

12-2012

Document Type

Thesis

Degree Name

Master of Science (MS)

Legacy Department

Computer Engineering

Committee Chair/Advisor

Smith, Melissa C

Committee Member

Ligon, Walter

Committee Member

Birchfield, Stanley

Abstract

Over the past decade, computing architectures have continued to exploit multiple levels of parallelism in applications. This increased interest in parallel computing has not only fueled the growth of multi-core processors but has also led to the emergence of several non-traditional computing architectures such as General Purpose Graphical Processing Units (GP-GPUs), Cell Processors, and Field Programmable Gate Arrays (FPGAs). Of these non-traditional computing architectures, GP-GPUs have gained widespread popularity due to their massively parallel computational abilities and relative ease of programmability.
Several software development ecosystems have emerged to harness the power of these parallel architectures. Although several threading libraries such as POSIX Threads, OpenMP, and MPI are available for multi-core processors, support for GP-GPUs remains limited to just two frameworks: Compute Unified Device Architecture (CUDA) and Open Computing Language (OpenCL). These threading libraries and frameworks each provide a powerful set of programming features that directly influence application performance.
In this work, we characterize the behavior of an anisotropic diffusion filter and identify the hardware bottlenecks that limit the performance of the filter. We choose an image processing filtering algorithm for this study owing to its massively parallel nature. We then utilize a recently developed fitness model from the literature to predict the fitness of this algorithm for the selected architectures and identify the causes for its failure.
We also report and analyze the variation of performance with problem size scaling, available optimization techniques, and execution configurations. We observe a best runtime of 3156 ms on the multi-core processors and 55.66 ms on the GP-GPUs. Our results and analysis highlight different architecture-specific optimization techniques and identify the best match out of the selected architectures for this algorithm using a performance prediction model.

Recommended Citation

Naik, Sumedh, "Connecting Architecture, Fitness, Optimizations and Performance using an Anisotropic Diffusion Filter" (2012). All Theses. 1522.
https://tigerprints.clemson.edu/all_theses/1522

Included in

Computer Engineering Commons
