Predicting, Engineering and Interpreting Gene Regulatory Sequences and Proteins with Deep Learning.
Record Type:
Electronic resources : Monograph/item
Title/Author:
Predicting, Engineering and Interpreting Gene Regulatory Sequences and Proteins with Deep Learning./
Author:
Linder, Johannes Staffan Anders.
Published:
Ann Arbor : ProQuest Dissertations & Theses, 2021
Description:
126 p.
Notes:
Source: Dissertations Abstracts International, Volume: 83-02, Section: B.
Contained By:
Dissertations Abstracts International, 83-02B.
Subject:
Computer science.
Online resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28545714
ISBN:
9798535503622
Thesis (Ph.D.)--University of Washington, 2021.
This item must not be sold to any third party vendors.
The vast majority of the 3.1 billion base-pairs in the (haploid) human genome do not code for a particular protein, yet mutations in these non-coding regions can have a profound impact on phenotype and be deleterious. The reason is that within these regions - enhancers, promoters, introns and untranslated regions (UTRs) - resides a cis-regulatory code which governs gene expression and is sensitive to disruption. Ongoing efforts to map the relationship between genetic variants and disease phenotype are limited by the available data and by a lack of generalizability. Furthermore, engineering de novo gene-regulatory sequences and proteins according to target specifications, which would aid the development of vaccines, medical therapeutics, molecular sensing devices and more, is hampered by the lack of methods that can reliably generate large sets of diverse and optimized candidate designs for high-throughput screening.
This dissertation presents an approach combining Massively Parallel Reporter Assays (MPRAs) with Deep Learning to obtain a sequence-predictive model of Alternative Polyadenylation (APA), a regulatory process occurring mainly in the 3' UTR of pre-mRNA. The trained neural network predicts 3'-end cleavage at base-pair resolution and can accurately prioritize human variants. By developing methods to visualize features learned in higher-order network layers, we extract a cis-regulatory APA code that aligns well with established biology.
Next, the dissertation presents a family of methods that were developed to design de novo biological sequences based on the response of a differentiable fitness predictor. These methods, which are based on activation maximization, can be used to efficiently generate millions of diverse, optimized sequence designs on the basis of a deep generative model. Finally, we present a feature attribution method for interpreting neural network predictions. The method, which learns input masks that either reconstruct or destroy the prediction, implements a masking operator based on probabilistic sampling that is shown to be particularly well-suited for interpreting biological sequence models. The developed design and interpretation methods are demonstrated on several DNA, RNA and protein function predictors and outperform state-of-the-art methods for multiple target applications.
ISBN: 9798535503622
Subjects--Topical Terms:
523869
Computer science.
Subjects--Index Terms:
Massively Parallel Reporter Assays
LDR 03598nmm a2200397 4500
001 2342370
005 20220318093125.5
008 241004s2021 ||||||||||||||||| ||eng d
020 $a 9798535503622
035 $a (MiAaPQ)AAI28545714
035 $a AAI28545714
040 $a MiAaPQ $c MiAaPQ
100 1 $a Linder, Johannes Staffan Anders. $3 3680722
245 1 0 $a Predicting, Engineering and Interpreting Gene Regulatory Sequences and Proteins with Deep Learning.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2021
300 $a 126 p.
500 $a Source: Dissertations Abstracts International, Volume: 83-02, Section: B.
500 $a Advisor: Seelig, Georg.
502 $a Thesis (Ph.D.)--University of Washington, 2021.
506 $a This item must not be sold to any third party vendors.
590 $a School code: 0250.
650 4 $a Computer science. $3 523869
650 4 $a Bioinformatics. $3 553671
650 4 $a Molecular biology. $3 517296
650 4 $a Computational chemistry. $3 3350019
650 4 $a Deep learning. $3 3554982
650 4 $a Mutation. $3 837917
650 4 $a Optimization. $3 891104
650 4 $a Binding sites. $3 3560349
650 4 $a Amino acids. $3 558768
650 4 $a Genomes. $3 592593
650 4 $a Biology. $3 522710
650 4 $a Kinases. $3 3558077
650 4 $a Entropy. $3 546219
650 4 $a Design techniques. $3 3561498
650 4 $a Annealing. $2 lcstt $3 3267268
650 4 $a Proteins. $3 558769
650 4 $a Gene expression. $3 643979
650 4 $a Temperature. $3 711968
650 4 $a Neural networks. $3 677449
650 4 $a Medical research. $2 bicssc $3 1556686
650 4 $a Engineering. $3 586835
650 4 $a Methods. $3 3560391
653 $a Massively Parallel Reporter Assays
653 $a Trained neural networks
653 $a Sequence-predictive model
653 $a Alternative polyadenylation
690 $a 0984
690 $a 0307
690 $a 0219
690 $a 0715
690 $a 0306
690 $a 0537
710 2 $a University of Washington. $b Computer Science and Engineering. $3 2097608
773 0 $t Dissertations Abstracts International $g 83-02B.
790 $a 0250
791 $a Ph.D.
792 $a 2021
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28545714
Inventory Number:
W9464808
Location Name:
Electronic resources
Item Class:
11. Online viewing_V
Material type:
E-book
Call number:
EB
Usage Class:
Normal use
Loan Status:
On shelf
No. of reservations:
0