Machine Learning Methods for Autonomous Driving: Visual Privacy, 3D Depth Perception and Trajectory Prediction Modeling.
Record type:
Bibliographic record - electronic resource : Monograph/item
Title/Author:
Machine Learning Methods for Autonomous Driving: Visual Privacy, 3D Depth Perception and Trajectory Prediction Modeling. /
Author:
Elezovikj, Semir.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2024
Pagination:
133 p.
Notes:
Source: Dissertations Abstracts International, Volume: 85-11, Section: A.
Contained by:
Dissertations Abstracts International, 85-11A.
Subject:
Computer engineering.
Electronic resource:
https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=31235105
ISBN:
9798382754833
Thesis (Ph.D.)--Temple University, 2024.
Autonomous driving could bring profound benefits to our society, ranging from economic and safety gains due to fewer traffic accidents to environmental gains due to reduced traffic congestion. However, the utopian future of self-driving vehicles is yet to come. To this end, we propose machine learning methods to address three pivotal aspects of autonomous driving: visual privacy, 3D depth perception, and trajectory prediction modeling.

We begin by exploring the crucial issue of visual privacy within person-aware visual systems. We propose the use of depth information to protect privacy while preserving important foreground subjects and scene structures. We aim to preserve the identity of foreground subjects while hiding superfluous details in the background that may contain sensitive information. In particular, for an input color and depth image pair, we first create a sensitivity map that favors background regions (where privacy should be preserved) and low depth-gradient pixels (which often relate strongly to scene structure but little to identity). We then combine this per-pixel sensitivity map with an inhomogeneous image obscuration process for privacy protection. We tested the proposed method on data covering different scenarios, including various illumination conditions, varying numbers of subjects, and different contexts. The experiments demonstrate that the method preserves the identity of humans and the edges obtained from the depth information while obscuring privacy-intrusive information in the background.

Next, we focus on the label layout problem: AR technologies can overlay virtual annotations directly onto the real-world view of a self-driving vehicle (SDV).
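The sensitivity-map idea above can be sketched in a few lines. This is a minimal illustrative sketch in NumPy, not the dissertation's implementation: the grayscale image, binary foreground mask, box-blur obscuration, and the function names (`sensitivity_map`, `inhomogeneous_obscure`) are all our own assumptions.

```python
import numpy as np

def sensitivity_map(depth, fg_mask):
    # High sensitivity = obscure more. Background pixels are sensitive,
    # but high depth-gradient pixels (scene edges) are spared, so
    # structure recovered from the depth channel survives obscuration.
    gy, gx = np.gradient(depth.astype(float))
    grad = np.hypot(gx, gy)
    grad = grad / (grad.max() + 1e-8)           # normalize to [0, 1]
    background = 1.0 - fg_mask.astype(float)    # 1 where background
    return np.clip(background * (1.0 - grad), 0.0, 1.0)

def inhomogeneous_obscure(img, sens, k=5):
    # Blend each pixel toward a local box-blur in proportion to its
    # sensitivity: sens = 1 -> fully blurred, sens = 0 -> untouched.
    pad = k // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    blurred = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            blurred[i, j] = padded[i:i + k, j:j + k].mean()
    return (1.0 - sens) * img + sens * blurred
```

With this weighting, foreground subjects (mask = 1) come out pixel-identical, while flat background regions are blended toward their local average.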
Overlaying virtual annotations directly onto the real-world view of an SDV can provide additional context, such as highlighting important information or projecting the future trajectories of other participants. Designing a label layout that does not violate domain-specific design requirements while satisfying aesthetic and functional principles of good design can be a daunting task even for skilled visual designers. Presenting the annotations in 3D object space instead of projection space preserves spatial and depth cues and yields stable layouts in dynamic environments, since the annotations are anchored in 3D space. In this domain, we make two major contributions. First, we propose a technique for managing the layout and rendering of annotations in virtual/augmented reality scenarios by manipulating the annotations directly in 3D space. For this, we make use of Artificial Potential Fields and adapt them to 3D space with 3D geometric constraints. Second, we introduce PartLabeling, an open-source platform in the form of a web application that acts as a much-needed generic framework for easily adding labeling algorithms and 3D models. This serves as a catalyst for researchers in this field to make their algorithms and implementations publicly available, as well as to ensure research reproducibility. The PartLabeling framework relies on a dataset that we generate as a subset of the original PartNet dataset, consisting of models suitable for the label management task; it contains 1,000 3D models with part annotations.

Finally, we focus on the trajectory prediction task in the context of autonomous driving. Predicting the trajectories of multiple participating agents is a challenging problem due to the complexity of the traffic scene and the interactions between the agents.
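An artificial-potential-field label layout of the kind named above typically combines an attractive force pulling each label toward its 3D anchor with a short-range repulsive force keeping labels apart. The sketch below shows one gradient step of that standard scheme; the gains, radius, and exact potential are illustrative assumptions, not the dissertation's parameters.

```python
import numpy as np

def apf_layout_step(labels, anchors, step=0.1, k_att=1.0, k_rep=0.5, r=1.0):
    # One artificial-potential-field step: each 3D label position is
    # attracted to its anchor and repelled from every other label
    # closer than radius r (classic Khatib-style repulsive gradient).
    labels = np.asarray(labels, dtype=float)
    anchors = np.asarray(anchors, dtype=float)
    force = k_att * (anchors - labels)          # attractive term
    for i in range(len(labels)):
        for j in range(len(labels)):
            if i == j:
                continue
            d = labels[i] - labels[j]
            dist = np.linalg.norm(d)
            if 1e-9 < dist < r:
                # repulsion grows sharply as two labels approach
                force[i] += k_rep * (1.0 / dist - 1.0 / r) * d / dist**2
    return labels + step * force
```

Iterating this step until the forces vanish yields a layout where labels stay near their anchored objects yet do not crowd each other; because positions live in 3D object space, the layout stays stable as the viewpoint moves.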
Autonomous vehicles need to effectively anticipate the behavior of other moving participants in the traffic scene (pedestrians, cyclists, animals, other moving vehicles). Modeling human driver behavior, as well as the interactions between traffic participants, must be addressed to enable safe and optimized autonomous vehicle systems. Traffic participants take many factors into consideration in order to interact safely with one another, and human drivers have sophisticated interaction strategies that come naturally to them. Given the highly interactive nature of traffic scenarios, representing the interactions between multiple agents in a traffic scene as a graph is a natural choice. To leverage the influences between multiple agents, we structure the scene as a graph whose nodes represent the traffic participants, with each node's features given by that agent's surrounding context encoded as a raster image. For this purpose, we leverage R-GCNs (Relational Graph Convolutional Networks). We then propose a novel Cross-Modal Attention Network (CMAN) to encourage interactions between two modalities: 1) the latent features of an ego-agent's raster image and 2) the latent features of the surrounding agents' influences on the ego-agent, in a manner that allows the two modalities to complement each other.
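The abstract does not specify CMAN's architecture, but the cross-modal interaction it describes can be illustrated with a generic single-head cross-attention block: queries from the ego-agent's raster features, keys and values from the surrounding agents' influence features. Everything in this NumPy sketch (names, dimensions, single head, lack of projections back) is an assumption for illustration.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_modal_attention(ego_feats, ctx_feats, Wq, Wk, Wv):
    # Queries come from the ego-agent's raster features; keys/values
    # come from the surrounding agents' influence features, so the ego
    # stream is re-weighted by whichever neighbors matter most to it.
    q = ego_feats @ Wq                          # (n_ego, d)
    k = ctx_feats @ Wk                          # (n_ctx, d)
    v = ctx_feats @ Wv                          # (n_ctx, d)
    scores = q @ k.T / np.sqrt(k.shape[-1])     # scaled dot-product
    return softmax(scores) @ v                  # context-enriched ego features
```

In a full model the two directions would typically be run symmetrically (context attending to ego as well) so that the modalities complement each other, and the result fed to a trajectory decoder.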
Subjects--Topical Terms:
Computer engineering.
Subjects--Index Terms:
Autonomous driving
LDR 06526nmm a2200421 4500
001 2402222
005 20241028051500.5
006 m o d
007 cr#unu||||||||
008 251215s2024 ||||||||||||||||| ||eng d
020 $a 9798382754833
035 $a (MiAaPQ)AAI31235105
035 $a AAI31235105
035 $a 2402222
040 $a MiAaPQ $c MiAaPQ
100 1 $a Elezovikj, Semir. $3 3772446
245 1 0 $a Machine Learning Methods for Autonomous Driving: Visual Privacy, 3D Depth Perception and Trajectory Prediction Modeling.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2024
300 $a 133 p.
500 $a Source: Dissertations Abstracts International, Volume: 85-11, Section: A.
500 $a Advisor: Tan, Chiu C.
502 $a Thesis (Ph.D.)--Temple University, 2024.
520 $a
Autonomous driving could bring profound benefits to our society, ranging from economic and safety gains due to fewer traffic accidents to environmental gains due to reduced traffic congestion. However, the utopian future of self-driving vehicles is yet to come. To this end, we propose machine learning methods to address three pivotal aspects of autonomous driving: visual privacy, 3D depth perception, and trajectory prediction modeling. We begin by exploring the crucial issue of visual privacy within person-aware visual systems. We propose the use of depth information to protect privacy while preserving important foreground subjects and scene structures. We aim to preserve the identity of foreground subjects while hiding superfluous details in the background that may contain sensitive information. In particular, for an input color and depth image pair, we first create a sensitivity map that favors background regions (where privacy should be preserved) and low depth-gradient pixels (which often relate strongly to scene structure but little to identity). We then combine this per-pixel sensitivity map with an inhomogeneous image obscuration process for privacy protection. We tested the proposed method on data covering different scenarios, including various illumination conditions, varying numbers of subjects, and different contexts. The experiments demonstrate that the method preserves the identity of humans and the edges obtained from the depth information while obscuring privacy-intrusive information in the background. Next, we focus on the label layout problem: AR technologies can overlay virtual annotations directly onto the real-world view of a self-driving vehicle (SDV). Overlaying virtual annotations directly onto the real-world view of an SDV can provide additional context, such as highlighting important information or projecting the future trajectories of other participants. Designing a label layout that does not violate domain-specific design requirements while satisfying aesthetic and functional principles of good design can be a daunting task even for skilled visual designers. Presenting the annotations in 3D object space instead of projection space preserves spatial and depth cues and yields stable layouts in dynamic environments, since the annotations are anchored in 3D space. In this domain, we make two major contributions. First, we propose a technique for managing the layout and rendering of annotations in virtual/augmented reality scenarios by manipulating the annotations directly in 3D space. For this, we make use of Artificial Potential Fields and adapt them to 3D space with 3D geometric constraints. Second, we introduce PartLabeling, an open-source platform in the form of a web application that acts as a much-needed generic framework for easily adding labeling algorithms and 3D models. This serves as a catalyst for researchers in this field to make their algorithms and implementations publicly available, as well as to ensure research reproducibility. The PartLabeling framework relies on a dataset that we generate as a subset of the original PartNet dataset, consisting of models suitable for the label management task; it contains 1,000 3D models with part annotations. Finally, we focus on the trajectory prediction task in the context of autonomous driving. Predicting the trajectories of multiple participating agents is a challenging problem due to the complexity of the traffic scene and the interactions between the agents. Autonomous vehicles need to effectively anticipate the behavior of other moving participants in the traffic scene (pedestrians, cyclists, animals, other moving vehicles). Modeling human driver behavior, as well as the interactions between traffic participants, must be addressed to enable safe and optimized autonomous vehicle systems. Traffic participants take many factors into consideration in order to interact safely with one another, and human drivers have sophisticated interaction strategies that come naturally to them. Given the highly interactive nature of traffic scenarios, representing the interactions between multiple agents in a traffic scene as a graph is a natural choice. To leverage the influences between multiple agents, we structure the scene as a graph whose nodes represent the traffic participants, with each node's features given by that agent's surrounding context encoded as a raster image. For this purpose, we leverage R-GCNs (Relational Graph Convolutional Networks). We then propose a novel Cross-Modal Attention Network (CMAN) to encourage interactions between two modalities: 1) the latent features of an ego-agent's raster image and 2) the latent features of the surrounding agents' influences on the ego-agent, in a manner that allows the two modalities to complement each other.
590 $a School code: 0225.
650 4 $a Computer engineering. $3 621879
650 4 $a Automotive engineering. $3 2181195
650 4 $a Information technology. $3 532993
650 4 $a Transportation. $3 555912
653 $a Autonomous driving
653 $a 3D object space
653 $a Machine learning
653 $a Person-aware visual systems
653 $a Trajectory prediction modeling
690 $a 0800
690 $a 0489
690 $a 0464
690 $a 0709
690 $a 0540
710 2 $a Temple University. $b Computer and Information Science. $3 1065462
773 0 $t Dissertations Abstracts International $g 85-11A.
790 $a 0225
791 $a Ph.D.
792 $a 2024
793 $a English
856 4 0 $u https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=31235105
Holdings:
Barcode: W9510542
Location: Electronic resource
Circulation category: 11. Online reading_V
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Holds: 0