Towards Comprehensive Action Understanding in Videos.
Record Type:
Electronic resources : Monograph/item
Title/Author:
Towards Comprehensive Action Understanding in Videos.
Author:
Ji, Jingwei.
Published:
Ann Arbor : ProQuest Dissertations & Theses, 2021
Description:
141 p.
Notes:
Source: Dissertations Abstracts International, Volume: 83-03, Section: B.
Contained By:
Dissertations Abstracts International 83-03B.
Subject:
Cooperative learning.
Online resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28688315
ISBN:
9798544203698
Thesis (Ph.D.)--Stanford University, 2021.
This item must not be sold to any third party vendors.
An enormous amount of videos are created, spread, and watched daily. In the ocean of videos, the actions and activities of humans are often the pivots. We desire machines to understand human actions in videos as this is essential to various applications, including but not limited to healthcare, security system, and human-robot interactions. For these applications to be realized, action understanding must go beyond simply answering "what is the action", but more comprehensive. An intelligent agent should be able to know "who/where is the actor", "what/where is the object", "what interaction is happening between the actor and the object", "when does an action start and end", and more. Achieving comprehensive action understanding is non-trivial since the need for data and labels combinatorially increases when trying to solve multiple problems, not to mention that video data and labels are expensive to collect, store, and consume. Therefore, to obtain comprehensive action understanding, we not only need to perform multiple tasks but also have to ensure data efficiency.
In this dissertation, we discuss three questions to realize data-efficient and comprehensive action understanding. How to reduce the need for data and labels? How to perform multiple tasks without combinatorial growth of data? How to solve new problems efficiently with some other problems solved? For the first question, our works on few-shot video classification and semi-supervised temporal action proposals introduce video-specific techniques and strategies for learning with less supervision. For the second question, we demonstrate how to avoid enumerating all combinations of categories from subtasks by knowledge disentanglement in a study on actor-action segmentation. For the third question, we propose constructing compositional representation from human-object relationships in videos, and such representation leads to better generalizability in action recognition models.
ISBN: 9798544203698
Subjects--Topical Terms:
3682760
Cooperative learning.
LDR :02927nmm a2200301 4500
001 2348624
005 20220912135621.5
008 241004s2021 ||||||||||||||||| ||eng d
020 $a 9798544203698
035 $a (MiAaPQ)AAI28688315
035 $a (MiAaPQ)STANFORDwc099nh9969
035 $a AAI28688315
040 $a MiAaPQ $c MiAaPQ
100 1 $a Ji, Jingwei. $3 3687989
245 1 0 $a Towards Comprehensive Action Understanding in Videos.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2021
300 $a 141 p.
500 $a Source: Dissertations Abstracts International, Volume: 83-03, Section: B.
500 $a Advisor: Li, Fei-Fei.
502 $a Thesis (Ph.D.)--Stanford University, 2021.
506 $a This item must not be sold to any third party vendors.
520 $a An enormous amount of videos are created, spread, and watched daily. In the ocean of videos, the actions and activities of humans are often the pivots. We desire machines to understand human actions in videos as this is essential to various applications, including but not limited to healthcare, security system, and human-robot interactions. For these applications to be realized, action understanding must go beyond simply answering "what is the action", but more comprehensive. An intelligent agent should be able to know "who/where is the actor", "what/where is the object", "what interaction is happening between the actor and the object", "when does an action start and end", and more. Achieving comprehensive action understanding is non-trivial since the need for data and labels combinatorially increases when trying to solve multiple problems, not to mention that video data and labels are expensive to collect, store, and consume. Therefore, to obtain comprehensive action understanding, we not only need to perform multiple tasks but also have to ensure data efficiency. In this dissertation, we discuss three questions to realize data-efficient and comprehensive action understanding. How to reduce the need for data and labels? How to perform multiple tasks without combinatorial growth of data? How to solve new problems efficiently with some other problems solved? For the first question, our works on few-shot video classification and semi-supervised temporal action proposals introduce video-specific techniques and strategies for learning with less supervision. For the second question, we demonstrate how to avoid enumerating all combinations of categories from subtasks by knowledge disentanglement in a study on actor-action segmentation. For the third question, we propose constructing compositional representation from human-object relationships in videos, and such representation leads to better generalizability in action recognition models.
590 $a School code: 0212.
650 4 $a Cooperative learning. $3 3682760
650 4 $a Genomes. $3 592593
650 4 $a Localization. $3 3560711
650 4 $a Ablation. $3 3562462
650 4 $a Graph representations. $3 3560730
650 4 $a Semantics. $3 520060
650 4 $a Genetics. $3 530508
650 4 $a Accuracy. $3 3559958
650 4 $a Datasets. $3 3541416
650 4 $a Methods. $3 3560391
650 4 $a Experiments. $3 525909
650 4 $a Classification. $3 595585
690 $a 0369
710 2 $a Stanford University. $3 754827
773 0 $t Dissertations Abstracts International $g 83-03B.
790 $a 0212
791 $a Ph.D.
792 $a 2021
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28688315
Inventory Number: W9471062
Location Name: Electronic resources
Item Class: 11.線上閱覽_V (online viewing)
Material type: E-book
Call number: EB
Usage Class: General use (Normal)
Loan Status: On shelf
No. of reservations: 0