Sparse Machine Learning Methods for Prediction and Personalized Medicine.
Record type: Bibliographic - Electronic resource : Monograph/item
Title/Author: Sparse Machine Learning Methods for Prediction and Personalized Medicine.
Author: Yu, Hang.
Publisher: Ann Arbor : ProQuest Dissertations & Theses, 2021
Extent: 92 p.
Note: Source: Dissertations Abstracts International, Volume: 83-03, Section: B.
Contained by: Dissertations Abstracts International, 83-03B.
Subject: Statistics.
Electronic resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28650492
ISBN: 9798538118809
Thesis (Ph.D.)--The University of North Carolina at Chapel Hill, 2021.
This item must not be sold to any third party vendors.
With growing interest in using black-box machine learning for complex data with many feature variables, it is critical to obtain a prediction model that depends on only a small set of features to maximize generalizability. Feature selection therefore remains an important and challenging problem in modern applications. Most existing methods for feature selection are based on either parametric or semiparametric models, so their performance can suffer severely from model misspecification when high-order nonlinear interactions among the features are present. Only a limited number of nonparametric feature selection approaches have been proposed, and they are computationally intensive and may not even converge; nonparametric feature selection for high-dimensional data thus remains an important open problem in statistics and machine learning. Furthermore, in precision medicine, machine learning techniques are usually applied to large health datasets containing patient information to find an optimal individual treatment rule (ITR), which makes the learning process computationally demanding. Identifying the truly important feature variables shortens computation time and saves the cost of collecting redundant data. This dissertation therefore develops machine learning techniques that perform variable selection for both prediction and personalized medicine.

In the first project, we propose a novel and computationally efficient approach for nonparametric feature selection in regression, based on a tensor-product kernel function over the feature space. The importance of each feature is governed by a parameter in the kernel function, which can be computed efficiently and iteratively with a modified alternating direction method of multipliers (ADMM) algorithm. We prove the oracle selection property of the proposed method. Finally, we demonstrate the superior performance of our approach compared to existing methods via simulation studies and an application to the prediction of Alzheimer's disease.

In the second project, we propose a more general framework for nonparametric feature selection in both regression and classification problems. Under this framework, we learn prediction functions through empirical risk minimization over a reproducing kernel Hilbert space (RKHS) generated by a novel tensor-product kernel that depends on a set of parameters determining the importance of the features. Computationally, we minimize the penalized empirical risk to estimate the prediction and kernel parameters simultaneously; the solution can be obtained by iteratively solving convex optimization problems. We study the theoretical properties of the kernel feature space and prove the oracle selection property and Fisher consistency of the proposed method. Finally, we demonstrate its superior performance compared to existing methods via extensive simulation studies and an application to a microarray study of eye disease in animals.

Finally, we apply the nonparametric feature selection framework to treatment decision making with high-dimensional data. We directly estimate the decision function in an RKHS generated by a newly constructed tensor-product kernel with parameters capturing the importance of each variable. Computationally, we separate the estimation and tuning procedures into two steps, which makes the computation faster and more stable. We demonstrate the superior performance of our approach compared to existing methods via a simulation study and an application to type 2 diabetes.
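The core construction in the abstract is a tensor-product kernel whose per-feature parameters control feature importance: setting a parameter to zero removes that feature from the model. The sketch below is a minimal illustration of this idea with a Gaussian-type tensor-product kernel and plain kernel ridge regression; it is not the dissertation's actual estimator, and the ADMM/penalized steps for learning the importance parameters are omitted. All function names here are illustrative.

```python
import numpy as np

def tensor_product_kernel(X, Z, theta):
    """Tensor-product kernel with nonnegative importance parameters:
        K(x, z) = prod_j exp(-theta_j * (x_j - z_j)**2).
    A feature j with theta_j = 0 contributes a constant factor 1,
    i.e. it is dropped from the kernel entirely."""
    # Pairwise squared differences per feature: shape (n, m, p).
    d2 = (X[:, None, :] - Z[None, :, :]) ** 2
    return np.exp(-(d2 * theta).sum(axis=-1))

def kernel_ridge_fit(K, y, lam):
    """Penalized empirical risk minimization with squared loss:
    solve (K + n*lam*I) alpha = y for the representer coefficients."""
    n = K.shape[0]
    return np.linalg.solve(K + n * lam * np.eye(n), y)

# Toy data: the response depends only on the first of five features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=100)

# A sparse theta keeps only feature 0 in the fitted prediction function.
theta = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
K = tensor_product_kernel(X, X, theta)
alpha = kernel_ridge_fit(K, y, lam=0.01)
pred = K @ alpha
```

In the dissertation's framework, theta itself is estimated from the data (jointly with the prediction function, under a sparsity-inducing penalty) rather than fixed as above.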
ISBN: 9798538118809
Subjects--Topical Terms: Statistics.
Subjects--Index Terms: Machine Learning
LDR    04998nmm a2200409 4500
001    2352143
005    20221118093822.5
008    241004s2021 ||||||||||||||||| ||eng d
020    $a 9798538118809
035    $a (MiAaPQ)AAI28650492
035    $a AAI28650492
040    $a MiAaPQ $c MiAaPQ
100 1  $a Yu, Hang. $3 1916190
245 10 $a Sparse Machine Learning Methods for Prediction and Personalized Medicine.
260 1  $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2021
300    $a 92 p.
500    $a Source: Dissertations Abstracts International, Volume: 83-03, Section: B.
500    $a Advisor: Zeng, Donglin; Zhang, Kai.
502    $a Thesis (Ph.D.)--The University of North Carolina at Chapel Hill, 2021.
506    $a This item must not be sold to any third party vendors.
590    $a School code: 0153.
650  4 $a Statistics. $3 517247
650  4 $a Medical imaging. $3 3172799
650  4 $a Operations research. $3 547123
650  4 $a Information science. $3 554358
653    $a Machine Learning
653    $a Personalized medicine
653    $a Predictive modeling
653    $a Nonparametric feature selection
653    $a Health dataset
653    $a Individual Treatment Rule
653    $a Electronic health records
690    $a 0463
690    $a 0574
690    $a 0796
690    $a 0723
710 2  $a The University of North Carolina at Chapel Hill. $b Statistics and Operations Research. $3 3179567
773 0  $t Dissertations Abstracts International $g 83-03B.
790    $a 0153
791    $a Ph.D.
792    $a 2021
793    $a English
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28650492
Holdings (1 item):
Barcode: W9474581
Location: Electronic resources
Circulation category: 11. Online reading_V
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Holds: 0
1
多媒體
評論
新增評論
分享你的心得
Export
取書館
處理中
...
變更密碼
登入
(1)帳號:一般為「身分證號」;外籍生或交換生則為「學號」。 (2)密碼:預設為帳號末四碼。
帳號
.
密碼
.
請在此電腦上記得個人資料
取消
忘記密碼? (請注意!您必須已在系統登記E-mail信箱方能使用。)