Optimisation & Generalisation in Networks of Neurons.
Record Type:
Electronic resources : Monograph/item
Title/Author:
Optimisation & Generalisation in Networks of Neurons. / Bernstein, Jeremy.
Author:
Bernstein, Jeremy.
Description:
1 online resource (99 pages)
Notes:
Source: Dissertations Abstracts International, Volume: 84-12, Section: B.
Contained By:
Dissertations Abstracts International, 84-12B.
Subject:
Applied mathematics. - Deep learning. - Computer peripherals. - Hilbert space. - Neural networks. - Computer engineering. - Algorithms.
Online resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30548381
click for full text (PQDT)
ISBN:
9798379694326
Optimisation & Generalisation in Networks of Neurons.
Bernstein, Jeremy. Optimisation & Generalisation in Networks of Neurons. - 1 online resource (99 pages)
Source: Dissertations Abstracts International, Volume: 84-12, Section: B.
Thesis (Ph.D.)--California Institute of Technology, 2023.
Includes bibliographical references
The goal of this thesis is to develop the optimisation and generalisation theoretic foundations of learning in artificial neural networks. The thesis tackles two central questions. Given training data and a network architecture: Which weight setting will generalise best to unseen data, and why? What optimiser should be used to recover this weight setting?

On optimisation, an essential feature of neural network training is that the network weights affect the loss function only indirectly through their appearance in the network architecture. This thesis proposes a three-step framework for deriving novel "architecture aware" optimisation algorithms. The first step, termed functional majorisation, is to majorise a series expansion of the loss function in terms of functional perturbations. The second step is to derive architectural perturbation bounds that relate the size of functional perturbations to the size of weight perturbations. The third step is to substitute these architectural perturbation bounds into the functional majorisation of the loss and to obtain an optimisation algorithm via minimisation. This constitutes an application of the majorise-minimise meta-algorithm to neural networks.

On generalisation, a promising recent line of work has applied PAC-Bayes theory to derive non-vacuous generalisation guarantees for neural networks. Since these guarantees control the average risk of ensembles of networks, they do not address which individual network should generalise best. To close this gap, the thesis rekindles an old idea from the kernels literature: the Bayes point machine. A Bayes point machine is a single classifier that approximates the aggregate prediction of an ensemble of classifiers. Since aggregation reduces the variance of ensemble predictions, Bayes point machines tend to generalise better than other ensemble members. The thesis shows that the space of neural networks consistent with a training set concentrates on a Bayes point machine if both the network width and normalised margin are sent to infinity. This motivates the practice of returning a wide network of large normalised margin.

Potential applications of these ideas include novel methods for uncertainty quantification, more efficient numerical representations for neural hardware, and optimisers that transfer hyperparameters across learning problems.
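To make the three-step framework in the abstract concrete, here is a minimal sketch in assumed notation (the loss L, network function f_w, functional perturbation Δf, weight perturbation Δw, curvature constant λ, architectural constant C, and posterior p(w | data) are illustrative symbols, not the thesis's own definitions or exact bounds):

% Illustrative majorise-minimise sketch; all symbols are assumed notation, not the thesis's exact bounds.
\begin{align*}
  &\text{Step 1 (functional majorisation):}
    && L(f_{w+\Delta w}) \le L(f_w) + \langle \nabla_f L, \Delta f \rangle + \tfrac{\lambda}{2} \, \|\Delta f\|^2 \\
  &\text{Step 2 (architectural perturbation bound):}
    && \|\Delta f\| \le C(w, \mathrm{architecture}) \, \|\Delta w\| \\
  &\text{Step 3 (minimise the majorant):}
    && \Delta w^\star = \arg\min_{\Delta w} \Big[ \langle \nabla_w L, \Delta w \rangle + \tfrac{\lambda}{2} \, C(w, \mathrm{architecture})^2 \, \|\Delta w\|^2 \Big]
\end{align*}
% The generalisation half appeals to a Bayes point machine: a single network whose prediction
% approximates the ensemble-average prediction under an assumed posterior p(w | data):
\[
  f_{\mathrm{BPM}}(x) \approx \mathbb{E}_{w \sim p(w \mid \mathrm{data})} \big[ f_w(x) \big]
\]

In this sketch the architecture enters the update only through the constant C, which is what would make the resulting optimiser "architecture aware"; the final display restates the Bayes point machine idea as a single network approximating the ensemble-average prediction.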
Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2023.
Mode of access: World Wide Web
ISBN: 9798379694326
Subjects--Topical Terms:
Applied mathematics.
Index Terms--Genre/Form:
Electronic books.
Optimisation & Generalisation in Networks of Neurons.
LDR  03618nmm a2200349K 4500
001  2360144
005  20230925052836.5
006  m o d
007  cr mn ---uuuuu
008  241011s2023 xx obm 000 0 eng d
020    $a 9798379694326
035    $a (MiAaPQ)AAI30548381
035    $a (MiAaPQ)Caltech_oaithesislibrarycaltechedu15041
035    $a AAI30548381
040    $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1  $a Bernstein, Jeremy. $3 924212
245 10 $a Optimisation & Generalisation in Networks of Neurons.
264  0 $c 2023
300    $a 1 online resource (99 pages)
336    $a text $b txt $2 rdacontent
337    $a computer $b c $2 rdamedia
338    $a online resource $b cr $2 rdacarrier
500    $a Source: Dissertations Abstracts International, Volume: 84-12, Section: B.
500    $a Advisor: Yue, Yisong.
502    $a Thesis (Ph.D.)--California Institute of Technology, 2023.
504    $a Includes bibliographical references
520    $a The goal of this thesis is to develop the optimisation and generalisation theoretic foundations of learning in artificial neural networks. The thesis tackles two central questions. Given training data and a network architecture: Which weight setting will generalise best to unseen data, and why? What optimiser should be used to recover this weight setting? On optimisation, an essential feature of neural network training is that the network weights affect the loss function only indirectly through their appearance in the network architecture. This thesis proposes a three-step framework for deriving novel "architecture aware" optimisation algorithms. The first step, termed functional majorisation, is to majorise a series expansion of the loss function in terms of functional perturbations. The second step is to derive architectural perturbation bounds that relate the size of functional perturbations to the size of weight perturbations. The third step is to substitute these architectural perturbation bounds into the functional majorisation of the loss and to obtain an optimisation algorithm via minimisation. This constitutes an application of the majorise-minimise meta-algorithm to neural networks. On generalisation, a promising recent line of work has applied PAC-Bayes theory to derive non-vacuous generalisation guarantees for neural networks. Since these guarantees control the average risk of ensembles of networks, they do not address which individual network should generalise best. To close this gap, the thesis rekindles an old idea from the kernels literature: the Bayes point machine. A Bayes point machine is a single classifier that approximates the aggregate prediction of an ensemble of classifiers. Since aggregation reduces the variance of ensemble predictions, Bayes point machines tend to generalise better than other ensemble members. The thesis shows that the space of neural networks consistent with a training set concentrates on a Bayes point machine if both the network width and normalised margin are sent to infinity. This motivates the practice of returning a wide network of large normalised margin. Potential applications of these ideas include novel methods for uncertainty quantification, more efficient numerical representations for neural hardware, and optimisers that transfer hyperparameters across learning problems.
533    $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2023
538    $a Mode of access: World Wide Web
650  4 $a Applied mathematics. $3 2122814
650  4 $a Deep learning. $3 3554982
650  4 $a Computer peripherals. $3 659962
650  4 $a Hilbert space. $3 558371
650  4 $a Neural networks. $3 677449
650  4 $a Computer engineering. $3 621879
650  4 $a Algorithms. $3 536374
655  7 $a Electronic books. $2 lcsh $3 542853
690    $a 0364
690    $a 0464
690    $a 0800
710 2  $a ProQuest Information and Learning Co. $3 783688
710 2  $a California Institute of Technology. $b Biology and Biological Engineering. $3 3700756
773 0  $t Dissertations Abstracts International $g 84-12B.
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30548381 $z click for full text (PQDT)
Items (1 record)
Inventory Number: W9482500
Location Name: Electronic resources (電子資源)
Item Class: 11.線上閱覽_V (online reading)
Material type: E-book (電子書)
Call number: EB
Usage Class: Normal use (一般使用)
Loan Status: On shelf
No. of reservations: 0