Date of Award
Master of Science (MS)
Dr. Long Cheng
Dr. Yingjie Lao
Dr. Federico Iuricich
Through artificial intelligence, algorithms can classify arrays of data, such as images or videos, into a predefined set of categories. Given enough labeled data, a classifier learns to analyze an input's components and assign confidence scores to each category. However, machine learning relies heavily on approximation, which attackers can exploit by supplying adversarial
examples. Specifically, an attacker can modify an input so that the victim classifier can no longer label it correctly, while a human observer cannot notice the difference.
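To illustrate the "imperceptible perturbation" idea above, the following sketch bounds a random perturbation by an ℓ∞ budget ε so that no pixel changes by more than ε. The budget 8/255 is a common choice in the adversarial-examples literature, not a value taken from this thesis:

```python
import numpy as np

def perturb_linf(x, epsilon=8 / 255, seed=0):
    """Add a random perturbation whose l-infinity norm is at most epsilon.

    x is assumed to be an image array scaled to [0, 1]; epsilon=8/255 is an
    illustrative budget, not a parameter from the thesis.
    """
    rng = np.random.default_rng(seed)
    delta = rng.uniform(-epsilon, epsilon, size=x.shape)
    # Clipping keeps the result a valid image; it can only shrink the change.
    return np.clip(x + delta, 0.0, 1.0)

x = np.full((32, 32, 3), 0.5)              # dummy CIFAR10-sized "image"
x_adv = perturb_linf(x)
print(np.abs(x_adv - x).max() <= 8 / 255)  # perturbation respects the budget
```

A real attack would choose the perturbation adversarially rather than at random; the point here is only the per-pixel bound that keeps the change visually negligible.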
This thesis proposes Gaslight, a system that uses deep reinforcement learning to generate adversarial examples against a victim classifier. Gaslight is a "black-box" and "hard-label" attacker, meaning it receives no information from the victim except the input shape, the input range, and the top-1 label. Gaslight learns to attack the victim by modifying randomly generated inputs, rewarding the agent for successful misclassifications and for keeping distortion low. Over many iterations, Gaslight improves its ability to generate effective perturbations for any
test input it is given. Once training completes, the agent can attack any input of the correct shape and range. Experiments on the CIFAR10 and ImageNet datasets show that Gaslight can successfully perturb inputs with a single query at a high success rate, improving upon existing methods that can take hundreds or even thousands of queries to produce a misclassification. Compared against other state-of-the-art hard-label attacks, Gaslight achieves similar ℓ2 and ℓ∞ norms with 90% fewer queries. Gaslight's code can be found at https://github.com/RajatSethi2001/Gaslight.
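The reward structure described above, which rewards a misclassification while penalizing distortion, might be sketched as follows. The trade-off weight `alpha`, the helper `classify_top1`, and the toy victim are illustrative assumptions, not the thesis's actual implementation:

```python
import numpy as np

def hard_label_reward(x, x_adv, true_label, classify_top1, alpha=1.0):
    """Hard-label reward: the attacker observes only the victim's top-1 label.

    Returns a bonus for a misclassification, minus an l2 distortion penalty.
    alpha is an assumed trade-off weight between the two objectives.
    """
    predicted = classify_top1(x_adv)        # the only signal available
    success = 1.0 if predicted != true_label else 0.0
    distortion = np.linalg.norm((x_adv - x).ravel())  # l2 perturbation size
    return success - alpha * distortion

# Toy victim: labels an image by its mean intensity (illustrative only).
victim = lambda img: int(img.mean() > 0.5)

x = np.full((4, 4), 0.4)       # victim labels this 0
x_adv = x + 0.2                # mean rises above 0.5, so the label flips to 1
r = hard_label_reward(x, x_adv, true_label=0, classify_top1=victim, alpha=0.1)
print(r)                       # success bonus minus weighted distortion
```

An unmodified input earns zero reward (no misclassification, no distortion), so the agent is driven toward perturbations that flip the label while staying small.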
Sethi, Rajat, "Gaslight: Attacking Hard-Label Black-Box Classifiers via Deep Reinforcement Learning" (2023). All Theses. 4012.