Towards Adversarial and Non-Adversarial Robustness of Machine Learning and Signal Processing: Fundamental Limits and Algorithms.
Record type:
Bibliographic - Electronic resource : Monograph/item
Title / Author:
Towards Adversarial and Non-Adversarial Robustness of Machine Learning and Signal Processing: Fundamental Limits and Algorithms. / Yi, Jirong.
Author:
Yi, Jirong.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2021
Pagination:
385 p.
Note:
Source: Dissertations Abstracts International, Volume: 83-02, Section: B.
Contained by:
Dissertations Abstracts International, 83-02B.
Subject:
Computer engineering.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28410403
ISBN:
9798534660685
Yi, Jirong.
Towards Adversarial and Non-Adversarial Robustness of Machine Learning and Signal Processing: Fundamental Limits and Algorithms.
- Ann Arbor : ProQuest Dissertations & Theses, 2021 - 385 p.
Source: Dissertations Abstracts International, Volume: 83-02, Section: B.
Thesis (Ph.D.)--The University of Iowa, 2021.
This item must not be sold to any third party vendors.
We are currently in a century of data, where massive amounts of data are collected and processed every day, and machine learning plays a critical role in automatically processing the data and mining useful information from it for making decisions. Despite the wide and successful applications of machine learning in different fields, the robustness of such systems cannot always be guaranteed, in the sense that small variations in a certain aspect of them can result in completely wrong decisions. This prevents the application of machine learning systems in many safety-critical scenarios such as autonomous driving. In this thesis, we push forward the research on the robustness of machine learning systems in both regression and classification tasks.

In classification tasks, the research focus is on how adversarial perturbations can affect decision making. Classifiers, especially deep-learning-based classifiers, have been shown to be susceptible to adversarial attacks: minor, well-crafted changes to the input of a classifier can dramatically alter its outputs while being unnoticeable to humans. Our starting point is an analogy between the communication model and the classification model, which offers an information-theoretic viewpoint on the robustness of classifiers. We present a simple hypothesis about a feature compression property of these artificial intelligence (AI) classifiers, and give theoretical arguments to show that the feature compression property can account for the observed fragility or vulnerability of AI classifiers to small adversarial perturbations.
We quantify theoretically the difference in the robustness of well-trained deep learning classification systems to adversarial perturbations and to random perturbations. The feature compression hypothesis and the practice in communication lead naturally to a general class of defenses ("Trust, but Verify") for detecting the existence of adversarial input perturbations, and we further give theoretical arguments for the detection performance of the proposed method. We conduct experiments with a speech recognition task and an image recognition task, and our results demonstrate the effectiveness of the proposed detection methods. They also show that adversarial perturbations can cause a large decrease in the mutual information between the estimate of the corresponding adversarial sample and its predicted candidate label.

The experimental results from Trust, but Verify (TbV) imply that an adversary can fool a classification system simply by adding an adversarial perturbation to its input to reduce the mutual information between the adversarial sample and its true label, which motivates us to study adversarial attacks from an information-theoretic viewpoint. We consider the problem of designing optimal adversarial attacks on decision systems that maximally degrade the achievable performance of the system, as measured by the mutual information between the degraded input signal and the label of interest. We establish conditions characterizing the optimality of adversarial attacks. Our theoretical results hold for all machine learning systems, which provides theoretical arguments for the transferability of adversarial samples across different deep learning classifiers. We derive the optimal adversarial attacks for discrete and continuous signals of interest, and we also show that it is much harder to achieve adversarial attacks that minimize mutual information when multiple redundant copies of the input signal are used.
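As a rough illustration of the "Trust, but Verify" idea described above -- flag inputs whose predicted label carries unusually little information -- the toy detector below thresholds the entropy of a classifier's softmax output. The logits, the one-bit threshold, and the entropy-based proxy are illustrative assumptions, not the thesis's exact detector.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def label_information(logits):
    """Bits of information the output distribution carries about the
    label: log2(K) minus the entropy of the softmax over K classes."""
    p = softmax(logits)
    entropy = -np.sum(p * np.log2(p + 1e-12))
    return np.log2(len(p)) - entropy

def trust_but_verify(logits, threshold=1.0):
    """Flag an input as suspicious when its predicted label carries less
    than `threshold` bits of information (illustrative decision rule)."""
    return label_information(logits) < threshold

# A confident prediction on a clean input keeps label information high,
clean = np.array([8.0, 0.0, 0.0, 0.0])
# while an adversarially perturbed input tends to yield a flatter output
# distribution, i.e. reduced mutual information with the label.
attacked = np.array([1.2, 1.0, 1.1, 0.9])

assert not trust_but_verify(clean)
assert trust_but_verify(attacked)
```

The thesis estimates mutual information between feature estimates and candidate labels; the entropy proxy here only mimics the qualitative effect that adversarial perturbations reduce the information the output carries about the label.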
The latter result supports the "feature compression" hypothesis as a potential explanation for the adversarial vulnerability of deep learning classifiers. We also present results from computational experiments to illustrate our theoretical results. Our results show that the design of the optimal adversarial attacks can be independent of the specific classification models, and that the optimal adversarial attack designed in our way for a particular classification system can also fool other classifiers. This implies that the decision regions of different classification systems can be similar. However, these results do not say much about how the decision regions of different classifiers differ from each other, or whether we can arbitrarily manipulate their labels by designing the perturbation. This motivates us to study the question of selective fooling: given multiple machine learning systems for solving the same classification task, is it possible to construct a perturbation to the input sample so that the outputs of the multiple machine learning systems can be arbitrarily manipulated simultaneously? We formulate the problem of "selective fooling" as a novel optimization problem, and conduct experiments on the MNIST dataset. Our results show that it is in fact very easy to perform selective adversarial attacks when the classifiers are identical in their architectures, training algorithms, and training datasets, except for random initialization in training. This suggests that multiple machine learning systems which achieve the same level of high classification accuracy do not in fact "think alike" at all, and their decision regions can differ greatly from each other.

In the regression task, we focus on the robustness of linear sparse regression and its variants, which are essentially signal recovery problems such as super-resolution, spectrally sparse signal recovery, outlier detection, low-rank matrix recovery, and error correction coding. Our attention is attracted by the robustness of such systems to random variations or perturbations during the training or learning process.

In spectrally sparse signal recovery, we investigate the robustness of recovering the signal against basis mismatch between the underlying frequencies of the signal and the on-grid frequencies. We propose a Hankel matrix recovery approach for solving the problem, and show that the proposed approach can super-resolve the complex exponentials and identify their frequencies from compressed non-uniform measurements, regardless of the distance among the underlying frequencies. We propose a new concept of orthonormal atomic norm minimization (OANM), and argue that the success of Hankel matrix recovery in separation-free super-resolution, and its robustness to basis mismatch, are due to the fact that the nuclear norm of a Hankel matrix is essentially the orthonormal atomic norm. We show that the underlying parameter or frequency values must be well separated for the traditional atomic norm minimization (ANM) approach to achieve successful signal recovery when the atoms are continuous functions of a continuously-valued parameter; the OANM, however, can still succeed when the original atoms are arbitrarily close. As a byproduct of this research, we provide a matrix-theoretic inequality for the nuclear norm, and give its proof using the theory of compressed sensing. Our experimental results demonstrate the superiority of the proposed method over ANM.

In outlier detection, we look into the robustness of recovering the ground truth signal against sparse outliers. We propose a generative model approach for recovering the ground truth signals, and design efficient algorithms for the recovery. Different from the traditional L1 minimization approach, where the sparsity of the ground truth signal is required, our approach is completely free of the sparsity assumption; instead, the generative model is used to learn and extract useful structural information for signal recovery. We establish guarantees for the recovery of signals using learned generative models, thus achieving robustness of signal recovery against the outliers. Our results are applicable to both linear and nonlinear generative models, and we conduct extensive experiments on real datasets using variational auto-encoders (VAE) and deep convolutional generative adversarial networks (DCGAN).
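The generative-model recovery idea above -- fit a latent code z so that the generator output G(z) explains the measurements, letting a robust loss absorb the sparse outliers -- can be sketched with a toy linear "generator" standing in for a learned VAE/DCGAN decoder. The dimensions, the IRLS solver, and the smoothed L1 loss are illustrative assumptions, not the thesis's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "generator" G: maps a 5-dim latent code to a 50-dim signal.
# (A stand-in for a learned VAE/DCGAN decoder.)
G = rng.normal(size=(50, 5))
z_true = rng.normal(size=5)
signal = G @ z_true

# Corrupt a few measurements with large sparse outliers.
y = signal.copy()
y[[3, 17, 41]] += 25.0

def recover_latent(y, G, iters=60):
    """Minimize the outlier-robust L1 loss ||G z - y||_1 over the latent
    code z via iteratively reweighted least squares (IRLS). No sparsity
    assumption is placed on the signal itself."""
    z = np.linalg.lstsq(G, y, rcond=None)[0]   # least-squares warm start
    for _ in range(iters):
        r = G @ z - y
        w = 1.0 / np.sqrt(r * r + 1e-8)        # L1 weights, smoothed at 0
        Gw = G * w[:, None]                    # weighted design matrix
        z = np.linalg.solve(G.T @ Gw, Gw.T @ y)
    return z

z_hat = recover_latent(y, G)
# The robust fit ignores the corrupted entries and recovers the signal.
assert np.max(np.abs(G @ z_hat - signal)) < 1e-3
```

An ordinary least-squares fit would smear the three outliers across all coordinates; the reweighting step drives the weights of the outlier rows toward zero, which is the mechanism that makes the recovery robust.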
LDR
:11237nmm a2200445 4500
001
2343168
005
20220502104151.5
008
241004s2021 ||||||||||||||||| ||eng d
020
$a
9798534660685
035
$a
(MiAaPQ)AAI28410403
035
$a
AAI28410403
040
$a
MiAaPQ
$c
MiAaPQ
100
1
$a
Yi, Jirong.
$3
3681632
245
1 0
$a
Towards Adversarial and Non-Adversarial Robustness of Machine Learning and Signal Processing: Fundamental Limits and Algorithms.
260
1
$a
Ann Arbor :
$b
ProQuest Dissertations & Theses,
$c
2021
300
$a
385 p.
500
$a
Source: Dissertations Abstracts International, Volume: 83-02, Section: B.
500
$a
Advisor: Wu, Xiaodong; Xu, Weiyu.
502
$a
Thesis (Ph.D.)--The University of Iowa, 2021.
506
$a
This item must not be sold to any third party vendors.
520
$a
The experimental results show that the signals can be successfully recovered under outliers using our approach, and that it outperforms the traditional Lasso and L2 minimization approaches. In the separation-free super-resolution and outlier detection problems, the null space condition plays a critical role in establishing the recovery guarantees, and this motivates us to generalize the null space condition for signal recovery to one for matrix recovery. Oymak et al. established a null space condition for successful recovery of a given low-rank matrix (the weak null space condition) using nuclear norm minimization, and derived the phase transition for the nuclear norm minimization. We show that the weak null space condition proposed by Oymak et al. is only a sufficient condition for successful matrix recovery using nuclear norm minimization, and is not a necessary condition as claimed. We further give a weak null space condition which is both necessary and sufficient for the success of nuclear norm minimization. Finally, we consider error correction coding problems in the virus testing application, and we investigate the robustness of efficient virus testing via pooling tests against noise in the pooling process. We propose a novel method to increase the reliability and robustness of COVID-19 virus or antibody tests by using a specially designed pooling strategy. The ideas of our approach come from compressed sensing and error-correction coding, which can correct for a certain number of errors in the test results. We present simulations and theoretical arguments to show that our method is significantly more efficient in improving diagnostic accuracy than individual testing, and the results run against the traditional belief that, "even though pooled testing increased test capacity, pooled tests were less reliable than testing individuals separately."
590
$a
School code: 0096.
650
4
$a
Computer engineering.
$3
621879
650
4
$a
Electrical engineering.
$3
649834
650
4
$a
Information science.
$3
554358
650
4
$a
Artificial intelligence.
$3
516317
650
4
$a
Sparsity.
$3
3680690
650
4
$a
Accuracy.
$3
3559958
650
4
$a
Deep learning.
$3
3554982
650
4
$a
Random variables.
$3
646291
650
4
$a
Antibodies.
$3
709277
650
4
$a
Success.
$3
518195
650
4
$a
Communication.
$3
524709
650
4
$a
Signal processing.
$3
533904
650
4
$a
Defense.
$3
3681633
650
4
$a
Information theory.
$3
542527
650
4
$a
COVID-19.
$3
3554449
650
4
$a
Error correction & detection.
$3
3480646
650
4
$a
Hypotheses.
$3
3560118
650
4
$a
Experiments.
$3
525909
650
4
$a
Voice recognition.
$3
3564741
650
4
$a
Decision making.
$3
517204
650
4
$a
Neural networks.
$3
677449
650
4
$a
Classification.
$3
595585
650
4
$a
Phase transitions.
$3
3560387
650
4
$a
Design.
$3
518875
650
4
$a
Algorithms.
$3
536374
650
4
$a
Coronaviruses.
$3
894828
653
$a
Adversarial robustness
653
$a
Deep generative model
653
$a
Deep learning classification
653
$a
Information theory
653
$a
Sparse linear regression
653
$a
Statistical signal processing
653
$a
Machine learning
690
$a
0464
690
$a
0544
690
$a
0723
690
$a
0800
690
$a
0389
690
$a
0459
710
2
$a
The University of Iowa.
$b
Electrical and Computer Engineering.
$3
1018779
773
0
$t
Dissertations Abstracts International
$g
83-02B.
790
$a
0096
791
$a
Ph.D.
792
$a
2021
793
$a
English
856
4 0
$u
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28410403
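The pooled-testing idea in the abstract combines pooling with error-correcting structure, so that a noisy pool result can be corrected rather than trusted blindly. A minimal sketch for a single infected sample, using a hand-built pooling design from the affine plane AG(2,3) and a nearest-pattern decoder -- the design, sizes, and decoder are illustrative, not the thesis's compressed-sensing construction:

```python
import numpy as np

# Pooling design from the affine plane AG(2,3): 12 sample columns over
# 9 pools, any two columns sharing at most one pool. This gives pairwise
# Hamming distance >= 4, so any single flipped pool result is correctable.
blocks = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),      # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),      # columns
    (0, 4, 8), (1, 5, 6), (2, 3, 7),      # diagonals
    (0, 5, 7), (1, 3, 8), (2, 4, 6),      # anti-diagonals
]
A = np.zeros((9, 12), dtype=int)          # pool-membership matrix
for j, rows in enumerate(blocks):
    A[list(rows), j] = 1

def decode_single_infection(pool_results, A):
    """Pick the sample whose ideal pool pattern is nearest in Hamming
    distance to the observed (possibly noisy) pool results."""
    dists = (A != pool_results[:, None]).sum(axis=0)
    return int(np.argmin(dists))

truth = 7                                  # one infected sample
for flipped_pool in range(9):              # corrupt each pool in turn
    results = A[:, truth].copy()
    results[flipped_pool] ^= 1             # one erroneous test result
    assert decode_single_infection(results, A) == truth
```

Because the observed pattern is at distance 1 from the true column but at distance at least 3 from every other column, the decoder corrects the corrupted pool -- 12 samples are screened with 9 tests while tolerating a test error, the effect the abstract contrasts with the belief that pooling is less reliable than individual testing.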
Holdings (1 item):
Barcode: W9465606 · Location: Electronic resources · Circulation category: 11. Online reading_V · Material type: E-book · Call number: EB · Use type: Normal · Loan status: On shelf · Holds: 0