Modeling Dependence in Large and Complex Data Sets.
Record type:
Bibliographic - electronic resource : Monograph/item
Title proper/Author:
Modeling Dependence in Large and Complex Data Sets./
Author:
Zhang, Chao.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2022
Extent:
135 p.
Notes:
Source: Dissertations Abstracts International, Volume: 84-03, Section: A.
Contained By:
Dissertations Abstracts International 84-03A.
Subject:
Statistics.
Electronic resource:
https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=29206390
ISBN:
9798841779292
Thesis (Ph.D.)--University of California, Santa Barbara, 2022.
Classical statistical theory mostly focuses on independent samples that reside in finite-dimensional vector spaces. While such methods are often appropriate and yield fruitful results, practical data analyses often go beyond the scope of these classical settings. In particular, with technological advancements, the computing power to record large volumes of data points at a high frequency is becoming more accessible than ever before. The large volume of data sets makes it possible to produce metadata on sample points (such as distributions, networks, or shapes, to name a few), and the high frequency of data records enables one to model data dependency structures at a fine temporal and/or spatial resolution that would not have been possible with sparsely recorded data. In the age of big data, the study of data atoms that constitute complex data objects and the statistical modeling of high-resolution signals endowed with rich dependency structures are hitting their stride.
In this dissertation, we consider two specific instances of such big data. One is time-dependent distributional data represented by the corresponding probability density functions. Indeed, data consisting of time-indexed distributions of cross-sectional or intraday returns have been extensively studied in finance, and provide one example in which the data atoms consist of serially dependent probability distributions. Motivated by such data, we propose an autoregressive model for density time series by exploiting the tangent space structure on the space of distributions that is induced by the Wasserstein metric. The densities themselves are not assumed to have any specific parametric form, leading to flexible forecasting of future unobserved densities. The main estimation targets in the order-p Wasserstein autoregressive model are Wasserstein autocorrelations and the vector-valued autoregressive parameter. We propose suitable estimators and establish their asymptotic normality, which is verified in a simulation study. The new order-p Wasserstein autoregressive model leads to a prediction algorithm, which includes a data-driven order selection procedure. Its performance is compared to existing prediction procedures via application to four financial return data sets, where a variety of metrics are used to quantify forecasting accuracy. For most metrics, the proposed model outperforms existing methods in two of the data sets, while the best empirical performance in the other two data sets is attained by existing methods based on functional transformations of the densities.
The second instance is brain functional magnetic resonance imaging (fMRI) signals that are contaminated by spatiotemporal noise at the voxel level. Such data feature a rich spatiotemporal dependency structure due to a fine acquisition resolution. In neuroscience studies, resting-state brain functional connectivity quantifies the similarity between pairs of brain regions, each of which consists of voxels at which dynamic signals are acquired via neuroimaging techniques, for example, the blood-oxygen-level-dependent (BOLD) signals that quantify an fMRI scan. Pearson correlation and similar metrics have been adopted to estimate inter-regional connectivity, often through averaging of signals within regions. However, dependencies between signals within each region and the presence of noise contaminate such inter-regional correlation estimates. We propose a mixed-effects model with a simple spatiotemporal covariance structure that explicitly isolates the different sources of variability in the observed BOLD signals, including correlated regional signals, local spatiotemporal noise, and measurement error. Methods for tackling the computational challenges associated with restricted maximum likelihood estimation are discussed. Large-sample properties are established under mild and practically verifiable sufficient conditions. Simulation results demonstrate that the parameters of the proposed model can be accurately estimated and that the model is superior to the Pearson correlation of averages in the presence of spatiotemporal noise. The model was also implemented on data collected from a dead rat and an anesthetized live rat. Brain networks were constructed from estimated model parameters. Large-scale parallel computing and GPU acceleration were implemented to speed up connectivity estimation.
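The abstract's "tangent space structure induced by the Wasserstein metric" can be illustrated for one-dimensional distributions. There the 2-Wasserstein distance equals the L2 distance between quantile functions, and the Wasserstein barycenter has the pointwise mean quantile function. The sketch below is only an illustration under those standard facts, not the dissertation's estimators, which the abstract does not spell out. The function names, the grid size, and the toy normal samples are all assumptions made for this example.

```python
import numpy as np

def quantile_function(sample, grid):
    """Empirical quantile function of a 1-D sample, evaluated on a grid in (0, 1)."""
    return np.quantile(np.asarray(sample, dtype=float), grid)

def wasserstein2(q1, q2):
    """2-Wasserstein distance from two quantile functions on a uniform grid.

    The mean over the grid approximates the integral of (Q1 - Q2)^2 on [0, 1].
    """
    return float(np.sqrt(np.mean((q1 - q2) ** 2)))

def tangent_vectors(quantiles):
    """Center quantile functions at their pointwise mean.

    In one dimension the pointwise mean quantile function is that of the
    Wasserstein barycenter; the centered curves represent each distribution
    in the tangent space at the barycenter (quantile-scale representation).
    """
    barycenter = quantiles.mean(axis=0)
    return quantiles - barycenter, barycenter

# Hypothetical toy data: three samples standing in for a short density time series.
rng = np.random.default_rng(0)
grid = (np.arange(200) + 0.5) / 200  # midpoints of a uniform partition of (0, 1)
samples = [rng.normal(loc=m, scale=1.0, size=500) for m in (0.0, 0.5, 1.0)]
quantiles = np.stack([quantile_function(s, grid) for s in samples])
vectors, barycenter = tangent_vectors(quantiles)
```

Once the distributions are represented by centered quantile curves, they sit in a linear space, and autoregressive-type modeling becomes possible there. How the dissertation defines its Wasserstein autocorrelations and autoregressive parameter is not stated in this record.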
ISBN: 9798841779292
Subjects--Topical Terms:
517247
Statistics.
Subjects--Index Terms:
Functional connectivity
LDR 05576nmm a2200385 4500
001 2399123
005 20240909100728.5
006 m o d
007 cr#unu||||||||
008 251215s2022 ||||||||||||||||| ||eng d
020 $a 9798841779292
035 $a (MiAaPQ)AAI29206390
035 $a AAI29206390
040 $a MiAaPQ $c MiAaPQ
100 1 $a Zhang, Chao. $3 1620564
245 1 0 $a Modeling Dependence in Large and Complex Data Sets.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2022
300 $a 135 p.
500 $a Source: Dissertations Abstracts International, Volume: 84-03, Section: A.
500 $a Advisor: Petersen, Alexander.
502 $a Thesis (Ph.D.)--University of California, Santa Barbara, 2022.
590 $a School code: 0035.
650 4 $a Statistics. $3 517247
650 4 $a Statistical physics. $3 536281
650 4 $a Information science. $3 554358
653 $a Functional connectivity
653 $a Functional data analysis
653 $a Object-oriented statistics
653 $a Spatiotemporal modeling
653 $a Time series
690 $a 0463
690 $a 0723
690 $a 0217
710 2 $a University of California, Santa Barbara. $b Statistics and Applied Probability. $3 3170160
773 0 $t Dissertations Abstracts International $g 84-03A.
790 $a 0035
791 $a Ph.D.
792 $a 2022
793 $a English
856 4 0 $u https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=29206390
Holdings (1 item):
Barcode: W9507443
Location: Electronic resources
Circulation category: 11.線上閱覽_V (online viewing)
Material type: E-book
Call number: EB
Usage type: Normal
Loan status: On shelf
Holds: 0