Chain of Thought Reasoning for Robotic Arm Grasping and Embodied Spatial Perception.
Record type:
Bibliographic - Electronic resource : Monograph/item
Title/Author:
Chain of Thought Reasoning for Robotic Arm Grasping and Embodied Spatial Perception.
Author:
Yang, Fan.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2024
Physical description:
66 p.
Notes:
Source: Masters Abstracts International, Volume: 85-11.
Contained By:
Masters Abstracts International, 85-11.
Subject:
Computer engineering.
Electronic resource:
https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=31297337
ISBN:
9798382715230
Thesis (M.S.)--New York University Tandon School of Engineering, 2024.
The rapid development of language models such as BERT, GPT-3, and GPT-4 in recent years has promoted the emergence of visual language models and multi-modal models, further enhancing models' scene perception and interaction capabilities. At the same time, with the development of robots and embedded artificial intelligence, research on embodied artificial intelligence (AI) has grown continuously. This article introduces how to apply large language models (LLMs), visual models, and multi-modal models to robot tasks to enhance their scene perception and interaction capabilities.
Our research is divided into three experiments. The first experiment focuses on environmental perception, improving the text output quality of the visual language model in the current scene through carefully designed prompt engineering and auxiliary prompts. The second experiment further explores the interaction between robotic agents and the scene. We design an end-to-end system based on a large language model and a Thought-to-Action Reasoning (TAR) module to enhance the robotic arm's understanding of target grasping tasks. The third experiment focuses on spatial information understanding: we propose the Embodied Spatial Reasoning (EMBOSR) module to enhance the robotic agent's understanding of the 3D point cloud scene and to answer various questions based on that scene. We propose a human instruction analysis system for robotic arm grasping and a 3D scene perception and question-answering system based on LLMs. The comprehensive reasoning ability of the systems is demonstrated through various simulated and real experiments. These experiments indicate the important role of prompt engineering and chain of thought reasoning in completing robotic tasks, as well as the importance and potential value of applying large language models to human-robot interaction tasks.
ISBN: 9798382715230
Subjects--Topical Terms:
621879
Computer engineering.
Subjects--Index Terms:
Computer vision
LDR :03047nmm a2200385 4500
001 2404422
005 20241209114629.5
006 m o d
007 cr#unu||||||||
008 251215s2024 ||||||||||||||||| ||eng d
020 $a 9798382715230
035 $a (MiAaPQ)AAI31297337
035 $a AAI31297337
040 $a MiAaPQ $c MiAaPQ
100 1 $a Yang, Fan. $3 1020735
245 1 0 $a Chain of Thought Reasoning for Robotic Arm Grasping and Embodied Spatial Perception.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2024
300 $a 66 p.
500 $a Source: Masters Abstracts International, Volume: 85-11.
500 $a Advisor: Fang, Yi.
502 $a Thesis (M.S.)--New York University Tandon School of Engineering, 2024.
520 $a The rapid development of language models such as BERT, GPT-3, and GPT-4 in recent years has promoted the emergence of visual language models and multi-modal models, further enhancing models' scene perception and interaction capabilities. At the same time, with the development of robots and embedded artificial intelligence, research on embodied artificial intelligence (AI) has grown continuously. This article introduces how to apply large language models (LLMs), visual models, and multi-modal models to robot tasks to enhance their scene perception and interaction capabilities. Our research is divided into three experiments. The first experiment focuses on environmental perception, improving the text output quality of the visual language model in the current scene through carefully designed prompt engineering and auxiliary prompts. The second experiment further explores the interaction between robotic agents and the scene. We design an end-to-end system based on a large language model and a Thought-to-Action Reasoning (TAR) module to enhance the robotic arm's understanding of target grasping tasks. The third experiment focuses on spatial information understanding: we propose the Embodied Spatial Reasoning (EMBOSR) module to enhance the robotic agent's understanding of the 3D point cloud scene and to answer various questions based on that scene. We propose a human instruction analysis system for robotic arm grasping and a 3D scene perception and question-answering system based on LLMs. The comprehensive reasoning ability of the systems is demonstrated through various simulated and real experiments. These experiments indicate the important role of prompt engineering and chain of thought reasoning in completing robotic tasks, as well as the importance and potential value of applying large language models to human-robot interaction tasks.
590 $a School code: 1988.
650 4 $a Computer engineering. $3 621879
650 4 $a Robotics. $3 519753
653 $a Computer vision
653 $a Vision Language Model
653 $a Large language models
653 $a Embodied Spatial Reasoning
653 $a Human-robot interaction
690 $a 0800
690 $a 0464
690 $a 0771
710 2 $a New York University Tandon School of Engineering. $b Electrical & Computer Engineering. $3 3694303
773 0 $t Masters Abstracts International $g 85-11.
790 $a 1988
791 $a M.S.
792 $a 2024
793 $a English
856 4 0 $u https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=31297337
Holdings (1 item):
Barcode: W9512742
Location: Electronic resources
Circulation category: 11.線上閱覽_V (online viewing)
Material type: E-book
Call number: EB
Usage type: Normal
Loan status: On shelf
Reservations: 0