Dong Hwa University Library

Deep Learning in Protein Design Studies and a Rank-Based Point Process Model.

Record type:	Bibliographic, electronic resource : Monograph/item
Title/Author:	Deep Learning in Protein Design Studies and a Rank-Based Point Process Model./
Author:	Chen, Yang.
Publisher:	Ann Arbor : ProQuest Dissertations & Theses, 2021
Extent:	135 p.
Note:	Source: Dissertations Abstracts International, Volume: 83-01, Section: B.
Contained By:	Dissertations Abstracts International 83-01B.
Subject:	Statistics.
Electronic resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28319879
ISBN:	9798516065958

Deep Learning in Protein Design Studies and a Rank-Based Point Process Model. - Ann Arbor : ProQuest Dissertations & Theses, 2021 - 135 p.

Thesis (Ph.D.)--The Florida State University, 2021.

This item must not be sold to any third party vendors.

This dissertation consists of three projects in two research areas: (1) modeling of temporal point processes (one project) and (2) protein design in computational structural biology (two projects). My research work in these three projects can be summarized as follows.

Project 1 is on methodology development. Classical temporal point process theory focuses only on a time-based framework, in which a conditional intensity function at each given time fully describes the process. However, such a framework cannot directly capture important overall features or patterns in the process, such as characterizing a center-outward rank or identifying outliers in a given sample. I therefore propose a new, data-driven model for regular point processes: a probabilistic model built from two factors, (1) the number of events in the process and (2) the conditional distribution of these events given that number. The second factor is the key challenge. Based on the equivalent inter-event representation, I propose two frameworks on the inter-event times (IETs) to capture large variability in a given process: one models the IETs directly with a Dirichlet mixture, and the other models the isometric log-ratio transformed IETs with a classical Gaussian mixture. Both mixture models can be properly estimated using a Dirichlet process (for the number of components) and the Expectation-Maximization algorithm (for the model parameters). In particular, I thoroughly examine the new models on the commonly used Poisson processes. Finally, I demonstrate the new framework on two simulated datasets and point out its effectiveness in characterizing center-outward ranks.

Project 2 is a computational investigation in which I use deep learning methods to tackle inverse protein folding (IPF), a challenging problem in computational structural biology. I begin by formulating the IPF problem as predicting the residue type given the 3D structural environment around the target residue. I then design a nine-layer deep convolutional neural network (CNN) that takes as input a gridded box with the atomic coordinates and types around the residue, captures structural information at various scales, and outputs a predicted residue type. Trained on thousands of protein structures, the method, called ProDCoNN (Protein Design with Convolutional Neural Network), achieved state-of-the-art performance when tested on large numbers of test proteins and benchmark datasets.

Project 3 is another computational investigation, in which I use deep learning and reinforcement learning methods to study the challenging problem of protein-ligand docking in computational structural biology. To find the optimal binding modes of protein-ligand complexes, I propose a supervised search algorithm based on the Asynchronous Advantage Actor-Critic (A3C) algorithm from reinforcement learning. I take the gridded box with protein atomic information around the target binding site as the environment, and design one CNN as the actor to guide the ligand's movement in the box and another CNN as the critic to predict the binding affinity. The new framework is trained and tested using hundreds of protein-ligand complexes with the single-atom ligands copper and zinc. The results show that the new framework can find the target binding site efficiently and predict a reasonable binding affinity.
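The isometric log-ratio (ILR) step mentioned for Project 1 can be illustrated with a minimal sketch. It is not taken from the dissertation: the function names, the observation window [0, T], the choice to include the gaps to both endpoints among the IETs, and the Helmert-type basis are all assumptions made here for illustration. The idea is that the IETs of a realization sum to T, so after normalization they form a composition, and the ILR transform maps that composition to an unconstrained vector on which a Gaussian mixture could then be fitted.

```python
import numpy as np

def inter_event_times(event_times, T):
    """Gaps between consecutive events on [0, T].

    Assumption (not from the abstract): the gaps from 0 to the first event
    and from the last event to T are included, so the gaps sum to T.
    """
    t = np.sort(np.asarray(event_times, dtype=float))
    return np.diff(np.concatenate(([0.0], t, [T])))

def helmert_basis(D):
    """D x (D-1) matrix with orthonormal columns, each orthogonal to the ones vector."""
    V = np.zeros((D, D - 1))
    for j in range(1, D):
        V[:j, j - 1] = 1.0 / j
        V[j, j - 1] = -1.0
        V[:, j - 1] *= np.sqrt(j / (j + 1.0))
    return V

def ilr(composition):
    """Isometric log-ratio transform of a strictly positive composition."""
    x = np.asarray(composition, dtype=float)
    x = x / x.sum()                      # close the composition to sum to 1
    log_x = np.log(x)
    clr = log_x - log_x.mean()           # centered log-ratio coordinates
    return clr @ helmert_basis(x.size)   # project onto the (D-1)-dim basis

def ilr_inverse(z):
    """Map ILR coordinates back to a composition that sums to 1."""
    z = np.asarray(z, dtype=float)
    clr = helmert_basis(z.size + 1) @ z
    x = np.exp(clr)
    return x / x.sum()

# Hypothetical example: three events observed on the window [0, 4].
gaps = inter_event_times([0.5, 2.0, 3.5], T=4.0)
coords = ilr(gaps)                       # unconstrained vector of length 3
recovered = ilr_inverse(coords) * 4.0    # rescale back to the window length
```

The transform requires strictly positive gaps, so tied event times would need separate handling. Equally spaced events map to the zero vector, which gives a natural center from which a center-outward ordering of realizations could be measured under this representation.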

ISBN: 9798516065958

Subjects--Topical Terms:
517247 Statistics.

Subjects--Index Terms:
Deep learning

LDR 04558nmm a2200373 4500
001 2347427
005 20220801062156.5
008 241004s2021 ||||||||||||||||| ||eng d
020 $a 9798516065958
035 $a (MiAaPQ)AAI28319879
035 $a AAI28319879
040 $a MiAaPQ $c MiAaPQ
100 1 $a Chen, Yang. $3 1250210
245 1 0 $a Deep Learning in Protein Design Studies and a Rank-Based Point Process Model.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2021
300 $a 135 p.
500 $a Source: Dissertations Abstracts International, Volume: 83-01, Section: B.
500 $a Advisor: Wu, Wei.
502 $a Thesis (Ph.D.)--The Florida State University, 2021.
506 $a This item must not be sold to any third party vendors.
590 $a School code: 0071.
650 4 $a Statistics. $3 517247
650 4 $a Bioinformatics. $3 553671
653 $a Deep learning
653 $a Dirichlet process
653 $a Inverse protein folding
653 $a Point process
653 $a Protein-ligand docking
653 $a Reinforcement learning
690 $a 0463
690 $a 0715
710 2 $a The Florida State University. $b Statistics. $3 3185231
773 0 $t Dissertations Abstracts International $g 83-01B.
790 $a 0071
791 $a Ph.D.
792 $a 2021
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28319879