Machine Learning for Prediction of Protein Properties.
Record type:
Bibliographic - electronic resource : Monograph/item
Title proper/Author:
Machine Learning for Prediction of Protein Properties.
Author:
Kool, Daniel.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2023.
Extent:
221 p.
Notes:
Source: Dissertations Abstracts International, Volume: 84-12, Section: B.
Contained By:
Dissertations Abstracts International 84-12B.
Subject:
Bioinformatics.
Electronic resource:
https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30420042
ISBN:
9798379737405
Thesis (Ph.D.)--Iowa State University, 2023.
The first part of this thesis presents a comprehensive investigation into the intricate relationship between protein residues, geometries, and pocket features and their impact on protein functionality. The primary investigative tool used throughout this study is machine learning, with a focus on the eXtreme Gradient Boosting (XGBoost) tree-based classification method. The approach emphasizes the importance of accurately treating and preparing the data to obtain reliable and insightful results, and careful attention is given to the preparation and analysis of input features to gain a comprehensive understanding of the underlying mechanisms, particularly in terms of the characteristics of the individual amino acids participating.
One of the main challenges addressed in this thesis is how to deal with highly imbalanced datasets. To address this challenge, various scaling/standardization functions and techniques have been employed to generate synthetic samples. The results highlight significant differences and consistencies between these different data preparation schemes. Additionally, we used the SHAP method to identify important features and variables for the machine learning model, obtaining global and residue-level importance values. By identifying these key features and variables, we gain a deeper understanding of some of the details of the underlying mechanisms that influence protein functionality.
The methods and data preparation strategies are extended in this study to predict ligand binding residues. Specifically, the binding of two biologically significant ligands, HEM and PLP, is investigated using similar geometric and physicochemical properties. The insights gained from this study can inform future experimental work and accelerate the discovery of new therapies for diseases. By accurately predicting ligand binding residues, we can better understand how proteins interact with their environment in general and in specific ways, and how we can modify these interactions to improve health outcomes.
In addition to ligand binding, we also explore the use of machine learning to predict free energy and phenotype changes caused by mutations in proteins. By understanding how mutations affect protein function, we can better understand the mechanisms of diseases and ultimately develop more effective restorative treatments. We also present a novel method to predict the melting temperature of proteins from different datasets with high accuracy, utilizing a neural network approach and considering two different sets of input features. This method can be used to better understand how proteins behave under different conditions, and to develop more stable and effective proteins for use in biotechnology and medicine.
Finally, this study describes the development of BioMakie.jl, a Julia programming package that provides a range of tools for investigating proteins. The package currently allows users to view proteins and multiple sequence alignments, with ongoing development focused on creating new visualizations and connecting Julia's event systems to web/JavaScript. By reducing the need to know multiple coding languages and lowering the learning curve for protein analysis, BioMakie.jl aims to make it easier for researchers to explore and understand protein structures and functions. This tool can be used by researchers across a wide range of disciplines to better understand the fundamental building blocks of life and their mechanisms.
Overall, we demonstrate the power of machine learning and feature importance methods in analyzing complex biological systems, such as proteins. By gaining a deeper understanding of the underlying mechanisms that influence protein functionality, we can eventually develop more effective therapies for diseases. Additionally, the development of BioMakie.jl provides a powerful tool for researchers to investigate proteins and gain new insights into their structures and functions.
Subjects--Topical Terms:
553671
Bioinformatics.
Subjects--Index Terms:
Feature importance
LDR 05139nmm a2200409 4500
001 2401056
005 20241015112509.5
006 m o d
007 cr#unu||||||||
008 251215s2023 ||||||||||||||||| ||eng d
020 $a 9798379737405
035 $a (MiAaPQ)AAI30420042
035 $a AAI30420042
035 $a 2401056
040 $a MiAaPQ $c MiAaPQ
100 1 $a Kool, Daniel. $3 3771118
245 1 0 $a Machine Learning for Prediction of Protein Properties.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2023
300 $a 221 p.
500 $a Source: Dissertations Abstracts International, Volume: 84-12, Section: B.
500 $a Advisor: Jernigan, Robert L.
502 $a Thesis (Ph.D.)--Iowa State University, 2023.
590 $a School code: 0097.
650 4 $a Bioinformatics. $3 553671
650 4 $a Biochemistry. $3 518028
650 4 $a Biology. $3 522710
650 4 $a Systematic biology. $3 3173492
653 $a Feature importance
653 $a Machine learning
653 $a Protein biology
653 $a Imbalanced datasets
653 $a BioMakie.jl
690 $a 0715
690 $a 0487
690 $a 0306
690 $a 0423
710 2 $a Iowa State University. $b Biochem, Biophysics, and Molecular Biology. $3 3688747
773 0 $t Dissertations Abstracts International $g 84-12B.
790 $a 0097
791 $a Ph.D.
792 $a 2023
793 $a English
856 4 0 $u https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30420042
Holdings (1 item)
Barcode: W9509376
Location: Electronic resource
Circulation category: 11. Online viewing_V
Material type: E-book
Call number: EB
Usage type: Normal
Loan status: On shelf
Hold status: 0