Statistical Methods to Incorporate External Summary-Level Information into a Current Study.
Record type:
Bibliographic - Electronic resource : Monograph/item
Title/Author:
Statistical Methods to Incorporate External Summary-Level Information into a Current Study./
Author:
Gu, Tian.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2021
Physical description:
115 p.
Notes:
Source: Dissertations Abstracts International, Volume: 83-05, Section: B.
Contained By:
Dissertations Abstracts International 83-05B.
Subject:
Biostatistics.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28844406
ISBN:
9798471100381
Gu, Tian.
Statistical Methods to Incorporate External Summary-Level Information into a Current Study.
- Ann Arbor : ProQuest Dissertations & Theses, 2021 - 115 p.
Source: Dissertations Abstracts International, Volume: 83-05, Section: B.
Thesis (Ph.D.)--University of Michigan, 2021.
This item must not be sold to any third party vendors.
In the era of big data, it is becoming increasingly common for researchers to consider incorporating external information from large studies to improve the accuracy of statistical inference instead of relying on a modestly sized dataset collected internally. We consider a general statistical problem where there are some known regression models or risk calculators to predict an outcome of interest from a set of commonly used predictors. Different types of summary information are available for these external models. An internal modest-sized dataset containing individual-level data for the variables in the known models and some new variables is available for our current analysis. In all three chapters below, we consider different settings to achieve the same goal--to build an improved prediction model that includes the new variables, using both the internal individual-level data and summary information obtained from the known external model(s).
In Chapter 2, we focus on the simple case where there is only one large, well-characterized previous study from the external population. We propose a synthetic data approach, which first converts the external information into synthetic data, and then analyzes a combined dataset consisting of the observed internal data and the synthetic data. A theoretical justification and extensive simulation studies establish the efficiency gain and improved prediction performance of the proposed data integration method. We also illustrate that even under less restrictive requirements on the information that is available externally, the combined estimates have the same asymptotic properties as an alternative constraint maximum likelihood estimation approach.
In Chapter 3, we consider a more complicated but quite plausible situation where several external prediction models are available to aid inference and prediction for the internal study. We assume that each of the external studies developed a prediction model for the same outcome but may use a slightly different set of covariates. We propose a meta-inference framework using an empirical Bayes estimation approach, which adaptively combines the estimates from the external models. This adaptive approach diminishes the influence of information that is less compatible with the internal data while balancing the bias-variance trade-off. The estimators we proposed are more efficient than the naive analysis of the internal data.
In Chapter 4, we first extend the synthetic data method from Chapter 2 to accommodate the situation with multiple external prediction models, and further allow for heterogeneity of covariate effects across the external populations. Each external model could potentially be built on slightly different subsets of covariates that are measured in the internal study. The proposed approach generates synthetic outcome data in each population, uses stacked multiple imputation to create a long dataset with complete covariate information, and finally analyzes the imputed data with weighted regression. Leveraging multiple sources of auxiliary information from a broad class of externally fitted predictive models or established risk calculators based on parametric regression or machine learning methods, this new strategy can make statistical inference more accurate for both the internal population and the external populations. We evaluate the proposed methods through extensive simulations and apply them to improve models for predicting the risk of high-grade prostate cancer.
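The Chapter 2 idea of converting external summary information into synthetic records and then analyzing the combined data can be illustrated with a minimal Python sketch. It is not the dissertation's implementation. It assumes a binary outcome, an external logistic model for the outcome given the common predictors whose coefficients `beta_ext` are known, and an internal dataset with common predictors `X_int`, new variables `B_int`, and outcomes `y_int`. All names, the replication factor `n_copies`, and the choice to reuse the internal covariate rows for the synthetic records are illustrative assumptions.

```python
import numpy as np


def fit_logistic(design, y, n_iter=50):
    """Fit a logistic regression by Newton-Raphson.

    `design` must already contain an intercept column.
    """
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-design @ beta))
        w = p * (1.0 - p)
        grad = design.T @ (y - p)                  # score vector
        hess = design.T @ (design * w[:, None])    # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def synthetic_data_fit(X_int, B_int, y_int, beta_ext, n_copies=5, seed=0):
    """Fit the expanded model Y ~ X + B on internal plus synthetic data.

    Assumption (hypothetical setup): `beta_ext` holds the intercept and
    slopes of an external logistic model for Y given X only.
    """
    rng = np.random.default_rng(seed)
    n = X_int.shape[0]

    # Outcome probabilities implied by the external (reduced) model.
    design_ext = np.column_stack([np.ones(n), X_int])
    p_ext = 1.0 / (1.0 + np.exp(-design_ext @ beta_ext))

    # Synthetic records: replicate internal covariates, draw outcomes
    # from the external model.
    X_syn = np.tile(X_int, (n_copies, 1))
    B_syn = np.tile(B_int, (n_copies, 1))
    y_syn = rng.binomial(1, np.tile(p_ext, n_copies))

    # Stack observed and synthetic rows, then fit the expanded model.
    X_all = np.vstack([X_int, X_syn])
    B_all = np.vstack([B_int, B_syn])
    y_all = np.concatenate([y_int, y_syn])
    design_full = np.column_stack([np.ones(len(y_all)), X_all, B_all])
    return fit_logistic(design_full, y_all)
```

The returned vector holds the intercept, the coefficients for the common predictors, and the coefficients for the new variables, in that order. The sketch covers point estimation only; it does not attempt variance estimation or the weighting and imputation steps mentioned for Chapter 4.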
ISBN: 9798471100381
Subjects--Topical Terms:
1002712
Biostatistics.
Subjects--Index Terms:
Data integration
LDR :04779nmm a2200373 4500
001 2346917
005 20220706051323.5
008 241004s2021 ||||||||||||||||| ||eng d
020 $a 9798471100381
035 $a (MiAaPQ)AAI28844406
035 $a (MiAaPQ)umichrackham003740
035 $a AAI28844406
040 $a MiAaPQ $c MiAaPQ
100 1 $a Gu, Tian. $3 3686124
245 1 0 $a Statistical Methods to Incorporate External Summary-Level Information into a Current Study.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2021
300 $a 115 p.
500 $a Source: Dissertations Abstracts International, Volume: 83-05, Section: B.
500 $a Advisor: Mukherjee, Bhramar; Taylor, Jeremy M. G.
502 $a Thesis (Ph.D.)--University of Michigan, 2021.
506 $a This item must not be sold to any third party vendors.
506 $a This item must not be added to any third party search indexes.
590 $a School code: 0127.
650 4 $a Biostatistics. $3 1002712
650 4 $a Statistics. $3 517247
650 4 $a Statistical physics. $3 536281
653 $a Data integration
653 $a Prediction models
653 $a Regression inference
690 $a 0308
690 $a 0217
690 $a 0463
710 2 $a University of Michigan. $b Biostatistics. $3 3352160
773 0 $t Dissertations Abstracts International $g 83-05B.
790 $a 0127
791 $a Ph.D.
792 $a 2021
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28844406
Holdings:
Barcode: W9469355
Location: Electronic resource
Circulation category: 11.線上閱覽_V (online viewing)
Material type: E-book
Call number: EB
Use type: General use (Normal)
Loan status: On shelf
Holds: 0