A Data Science Perspective on Searches for New Physics at the LHC.
Record Type:
Electronic resources : Monograph/item
Title/Author:
A Data Science Perspective on Searches for New Physics at the LHC./
Author:
Espejo Morales, Irina.
Published:
Ann Arbor : ProQuest Dissertations & Theses, 2023
Description:
128 p.
Notes:
Source: Dissertations Abstracts International, Volume: 85-04, Section: B.
Contained By:
Dissertations Abstracts International 85-04B.
Subject:
Computer science.
Online resource:
https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30314708
ISBN:
9798380623179
Espejo Morales, Irina.
A Data Science Perspective on Searches for New Physics at the LHC.
- Ann Arbor : ProQuest Dissertations & Theses, 2023 - 128 p.
Source: Dissertations Abstracts International, Volume: 85-04, Section: B.
Thesis (Ph.D.)--New York University, 2023.
This item must not be sold to any third party vendors.
The use of Machine Learning (ML) techniques in the field of Physical Sciences has gained growing attention in recent years due to its potential to accelerate scientific discovery. In particular, Computational High Energy Physics (HEP) presents distinctive challenges to Data Science (DS), from methodology design to deployment of solutions, when searching for new physics at the Large Hadron Collider (LHC). In this thesis, we address key bottlenecks in HEP by leveraging the knowledge and practices accumulated from decades of Data Science (DS) end-to-end research. We provide practical solutions to contemporary HEP issues, emphasizing re-usability, with potential for mainstream deployments at scale in the ATLAS Experiment.
In the first project, we develop scalable cyberinfrastructure that integrates pre-existing ML techniques into a standard analysis pipeline for new physics searches. The outcome is a distributed workflow that is containerized, parametrized, with a shallow learning curve, and reproducible. Experiments to test the scalability limits of this approach were performed at the National Energy Research Scientific Computing Center (NERSC). The processing time for 11 million collision samples was reduced from days to 5 hours.
In the second project, we address the curse of dimensionality in producing confidence limit contours for hypothesis testing of new physics theories. We use the Active Learning framework and low-fidelity data to train a Multitask Gaussian Process (GP) to intelligently evaluate the high-fidelity hypothesis testing pipeline only in regions of interest. This approach leads to the production of 4D contours, an improvement compared to traditional 1-2D contours. The study is performed in real-world settings without approximations, as reviewed by the ATLAS Experiment.
Finally, we conclude with a discussion on the interplay between Data Science and High Energy Physics, focusing on leveraging domain knowledge in a general way, producing long-lasting results for the community to build upon, and systematically exploiting past results beyond benchmarking studies.
ISBN: 9798380623179
Subjects--Topical Terms:
523869
Computer science.
Subjects--Index Terms:
Active learning
LDR :03337nmm a2200409 4500
001 2395642
005 20240517104940.5
006 m o d
007 cr#unu||||||||
008 251215s2023 ||||||||||||||||| ||eng d
020 $a 9798380623179
035 $a (MiAaPQ)AAI30314708
035 $a AAI30314708
040 $a MiAaPQ $c MiAaPQ
100 1 $a Espejo Morales, Irina. $3 3765152
245 1 2 $a A Data Science Perspective on Searches for New Physics at the LHC.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2023
300 $a 128 p.
500 $a Source: Dissertations Abstracts International, Volume: 85-04, Section: B.
500 $a Advisor: Cranmer, Kyle S.
502 $a Thesis (Ph.D.)--New York University, 2023.
506 $a This item must not be sold to any third party vendors.
590 $a School code: 0146.
650 4 $a Computer science. $3 523869
650 4 $a Computational physics. $3 3343998
650 4 $a Information technology. $3 532993
653 $a Active learning
653 $a Data science
653 $a High Energy Physics
653 $a Hypothesis testing
653 $a Machine learning
690 $a 0984
690 $a 0216
690 $a 0489
690 $a 0800
710 2 $a New York University. $b Center for Data Science. $3 3686663
773 0 $t Dissertations Abstracts International $g 85-04B.
790 $a 0146
791 $a Ph.D.
792 $a 2023
793 $a English
856 4 0 $u https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30314708
Items
Inventory Number: W9503962
Location Name: Electronic resources
Item Class: 11. Online viewing_V
Material type: E-book
Call number: EB
Usage Class: General use (Normal)
Loan Status: On shelf
No. of reservations: 0