Learning with Imperfect Data and Supervision for Visual Perception and Understanding.
Record type:
Bibliographic record, electronic resource : Monograph/item
Title proper/Author:
Learning with Imperfect Data and Supervision for Visual Perception and Understanding./
Author:
Zhang, Cheng.
Extent:
1 online resource (231 pages)
Notes:
Source: Dissertations Abstracts International, Volume: 84-04, Section: A.
Contained By:
Dissertations Abstracts International 84-04A.
Subject:
Computer engineering.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30013409 (click for full text, PQDT)
ISBN:
9798351442631
Zhang, Cheng.
Learning with Imperfect Data and Supervision for Visual Perception and Understanding.
- 1 online resource (231 pages)
Source: Dissertations Abstracts International, Volume: 84-04, Section: A.
Thesis (Ph.D.)--The Ohio State University, 2022.
Includes bibliographical references
Having access to large amounts of well-balanced and well-labeled data is one of the most important components of learning-based perception systems. However, gathering and labeling high-quality training examples are often time-consuming, expensive, and error-prone. For example, human-curated datasets may come with a variety of unsatisfactory forms such as noisy, weak labels, and long-tailed data distributions. As a result, machine learning models trained with these imperfect data and annotations appear to be brittle in complex real-world scenarios. In this dissertation, we provide a set of techniques for taming imperfections of data and supervision in different applications, ranging from instance-level object perception to multimodal scene understanding. The core idea is to exploit the massive amounts of raw data and multimodal structures from different sources and integrate them with deep learning. The main body of this dissertation consists of three parts. First, we focus on long-tailed visual learning. Specifically, we investigate how to leverage heterogeneous, out-of-domain data to facilitate long-tailed object detection and instance segmentation. Second, going beyond instance-level perception, we explore data-efficient learning for visual question answering (VQA) with multiple levels of focus, including bootstrapping VQA datasets, and understanding multimodal information with graph-based representations, and debiasing VQA models with question-conditioned calibration. Third, we introduce a case study on building practical perception systems with imperfect, multi-sensory signals. We design and implement a real-time, accurate sports analysis system based on vision and motion sensor integration. In each part, we introduce the problem and present our solutions, followed by demonstrating the effectiveness of the developed methods on well-benchmarked datasets, tasks, and real-world applications.
Electronic reproduction.
Ann Arbor, Mich. :
ProQuest,
2023
Mode of access: World Wide Web
ISBN: 9798351442631
Subjects--Topical Terms:
621879
Computer engineering.
Subjects--Index Terms:
Visual perception
Index Terms--Genre/Form:
542853
Electronic books.
LDR 03283nmm a2200373K 4500
001 2363154
005 20231116093809.5
006 m o d
007 cr mn ---uuuuu
008 241011s2022 xx obm 000 0 eng d
020 $a 9798351442631
035 $a (MiAaPQ)AAI30013409
035 $a (MiAaPQ)OhioLINKosu1658476769549166
035 $a AAI30013409
040 $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1 $a Zhang, Cheng. $3 1035638
245 1 0 $a Learning with Imperfect Data and Supervision for Visual Perception and Understanding.
264 0 $c 2022
300 $a 1 online resource (231 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertations Abstracts International, Volume: 84-04, Section: A.
500 $a Advisor: Chao, Wei-Lun; Xuan, Dong.
502 $a Thesis (Ph.D.)--The Ohio State University, 2022.
504 $a Includes bibliographical references
520 $a Having access to large amounts of well-balanced and well-labeled data is one of the most important components of learning-based perception systems. However, gathering and labeling high-quality training examples are often time-consuming, expensive, and error-prone. For example, human-curated datasets may come with a variety of unsatisfactory forms such as noisy, weak labels, and long-tailed data distributions. As a result, machine learning models trained with these imperfect data and annotations appear to be brittle in complex real-world scenarios. In this dissertation, we provide a set of techniques for taming imperfections of data and supervision in different applications, ranging from instance-level object perception to multimodal scene understanding. The core idea is to exploit the massive amounts of raw data and multimodal structures from different sources and integrate them with deep learning. The main body of this dissertation consists of three parts. First, we focus on long-tailed visual learning. Specifically, we investigate how to leverage heterogeneous, out-of-domain data to facilitate long-tailed object detection and instance segmentation. Second, going beyond instance-level perception, we explore data-efficient learning for visual question answering (VQA) with multiple levels of focus, including bootstrapping VQA datasets, and understanding multimodal information with graph-based representations, and debiasing VQA models with question-conditioned calibration. Third, we introduce a case study on building practical perception systems with imperfect, multi-sensory signals. We design and implement a real-time, accurate sports analysis system based on vision and motion sensor integration. In each part, we introduce the problem and present our solutions, followed by demonstrating the effectiveness of the developed methods on well-benchmarked datasets, tasks, and real-world applications.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2023
538 $a Mode of access: World Wide Web
650 4 $a Computer engineering. $3 621879
650 4 $a Computer science. $3 523869
650 4 $a Information science. $3 554358
653 $a Visual perception
653 $a Visual Question Answering
655 7 $a Electronic books. $2 lcsh $3 542853
690 $a 0984
690 $a 0464
690 $a 0723
710 2 $a ProQuest Information and Learning Co. $3 783688
710 2 $a The Ohio State University. $b Computer Science and Engineering. $3 1674144
773 0 $t Dissertations Abstracts International $g 84-04A.
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30013409 $z click for full text (PQDT)
Holdings (1 item):
Barcode: W9485510
Location: Electronic resource
Circulation category: 11.線上閱覽_V (online viewing)
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Holds: 0