On characteristics of Markov decision processes and reinforcement learning in large domains.
Record Type:
Electronic resources : Monograph/item
Title/Author:
On characteristics of Markov decision processes and reinforcement learning in large domains.
Author:
Ratitch, Bohdana.
Published:
Ann Arbor : ProQuest Dissertations & Theses, 2005.
Description:
284 p.
Notes:
Source: Dissertations Abstracts International, Volume: 68-02, Section: B.
Contained By:
Dissertations Abstracts International, 68-02B.
Subject:
Computer science.
Online resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=NR12934
ISBN:
9780494129340
Thesis (Ph.D.)--McGill University (Canada), 2005.
This item must not be sold to any third party vendors.
Reinforcement learning is a general computational framework for learning sequential decision strategies from the interaction of an agent with a dynamic environment. In this thesis, we focus on value-based learning methods, which rely on computing utility values for different behavior strategies. Value-based reinforcement learning methods have a solid theoretical foundation and a growing history of successful applications to real-world problems. However, most existing theoretically sound algorithms work for small problems only. For complex real-world decision tasks, approximate methods have to be used; in this case there is a significant gap between the existing theoretical results and the methodologies applied in practice. This thesis is devoted to the analysis of various factors that contribute to the difficulty of learning with popular reinforcement learning algorithms, as well as to developing new methods that facilitate the practical application of reinforcement learning techniques.

In the first part of this thesis, we investigate properties of reinforcement learning tasks that influence the performance of value-based algorithms. We present five domain-independent quantitative attributes that can be used to measure various task characteristics. We study the effect of these characteristics on learning and how they can be used for improving the efficiency of existing algorithms. In particular, we develop one application that uses measurements of the proposed attributes for improving exploration (the process by which the agent gathers experience for learning good behavior strategies).

In large realistic domains, function approximation methods have to be incorporated into reinforcement learning algorithms. The second part of this thesis is devoted to the use of a function approximation model based on Sparse Distributed Memories (SDMs) in approximate value-based methods. As with all other function approximators, the success of using SDMs in reinforcement learning depends, to a large extent, on a good choice of the structure of the approximator. We propose a new technique for automatically selecting certain structural parameters of the SDM model on-line based on training data. Our algorithm takes into account the interaction of function approximation with reinforcement learning algorithms and avoids some of the difficulties faced by other methods in the existing literature. In our experiments, this method provides very good performance and is computationally efficient.
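The value-based learning the abstract describes can be illustrated with a minimal tabular Q-learning sketch on a toy chain MDP with epsilon-greedy exploration. The environment, hyperparameters, and names below are hypothetical illustrations of the general technique, not the thesis's algorithms or experiments.

```python
import random

# Toy 5-state chain MDP: states 0..4, actions 0 (left) and 1 (right);
# reaching state 4 ends the episode with reward 1.
N_STATES, ACTIONS = 5, (0, 1)
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

def step(state, action):
    """Deterministic chain dynamics: action 1 moves right, 0 moves left."""
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy exploration: mostly exploit, sometimes explore
            if rng.random() < EPSILON:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: q[(s, act)])
            s2, r, done = step(s, a)
            # Q-learning update toward the bootstrapped one-step target
            target = r + (0.0 if done else GAMMA * max(q[(s2, act)] for act in ACTIONS))
            q[(s, a)] += ALPHA * (target - q[(s, a)])
            s = s2
    return q

q = train()
# Greedy policy from each non-terminal state (expected: always "right")
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

Exploration here is the fixed epsilon-greedy rule; the attribute-driven exploration scheme the abstract mentions would replace that branch with a choice informed by measured task characteristics.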
LDR    03584nmm a2200301 4500
001    2206923
005    20190906083345.5
008    201008s2005 ||||||||||||||||| ||eng d
020    $a 9780494129340
035    $a (MiAaPQ)AAINR12934
035    $a AAINR12934
040    $a MiAaPQ $c MiAaPQ
100 1  $a Ratitch, Bohdana. $3 3319190
245 10 $a On characteristics of Markov decision processes and reinforcement learning in large domains.
260 1  $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2005
300    $a 284 p.
500    $a Source: Dissertations Abstracts International, Volume: 68-02, Section: B.
500    $a Publisher info.: Dissertation/Thesis.
502    $a Thesis (Ph.D.)--McGill University (Canada), 2005.
506    $a This item must not be sold to any third party vendors.
506    $a This item must not be added to any third party search indexes.
520    $a Reinforcement learning is a general computational framework for learning sequential decision strategies from the interaction of an agent with a dynamic environment. In this thesis, we focus on value-based learning methods, which rely on computing utility values for different behavior strategies. Value-based reinforcement learning methods have a solid theoretical foundation and a growing history of successful applications to real-world problems. However, most existing theoretically sound algorithms work for small problems only. For complex real-world decision tasks, approximate methods have to be used; in this case there is a significant gap between the existing theoretical results and the methodologies applied in practice. This thesis is devoted to the analysis of various factors that contribute to the difficulty of learning with popular reinforcement learning algorithms, as well as to developing new methods that facilitate the practical application of reinforcement learning techniques. In the first part of this thesis, we investigate properties of reinforcement learning tasks that influence the performance of value-based algorithms. We present five domain-independent quantitative attributes that can be used to measure various task characteristics. We study the effect of these characteristics on learning and how they can be used for improving the efficiency of existing algorithms. In particular, we develop one application that uses measurements of the proposed attributes for improving exploration (the process by which the agent gathers experience for learning good behavior strategies). In large realistic domains, function approximation methods have to be incorporated into reinforcement learning algorithms. The second part of this thesis is devoted to the use of a function approximation model based on Sparse Distributed Memories (SDMs) in approximate value-based methods. As with all other function approximators, the success of using SDMs in reinforcement learning depends, to a large extent, on a good choice of the structure of the approximator. We propose a new technique for automatically selecting certain structural parameters of the SDM model on-line based on training data. Our algorithm takes into account the interaction of function approximation with reinforcement learning algorithms and avoids some of the difficulties faced by other methods in the existing literature. In our experiments, this method provides very good performance and is computationally efficient.
590    $a School code: 0781.
650  4 $a Computer science. $3 523869
690    $a 0984
710 2  $a McGill University (Canada). $3 1018122
773 0  $t Dissertations Abstracts International $g 68-02B.
790    $a 0781
791    $a Ph.D.
792    $a 2005
793    $a English
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=NR12934
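The SDM-style function approximation described in the abstract can be sketched with a fixed set of prototype locations: a query activates every location within a radius, the prediction is the average of the activated locations' stored values, and training distributes the prediction error over the active set. This one-dimensional, fixed-structure version is a hypothetical simplification; the thesis's contribution is precisely to select the memory's structural parameters online, which is omitted here.

```python
class SDMApproximator:
    """Minimal SDM-like value approximator over a 1-D input space."""

    def __init__(self, locations, radius, lr=0.1):
        self.locations = list(locations)        # prototype addresses
        self.values = [0.0] * len(self.locations)  # stored contents
        self.radius = radius
        self.lr = lr

    def _active(self, x):
        # all locations within `radius` of the query participate
        return [i for i, c in enumerate(self.locations)
                if abs(c - x) <= self.radius]

    def predict(self, x):
        idx = self._active(x)
        if not idx:
            return 0.0
        return sum(self.values[i] for i in idx) / len(idx)

    def update(self, x, target):
        # distribute the prediction error over all active locations
        err = target - self.predict(x)
        for i in self._active(x):
            self.values[i] += self.lr * err

# Fit a simple target function V(x) = 2x on [0, 1] with 11 fixed locations.
sdm = SDMApproximator(locations=[i / 10 for i in range(11)], radius=0.15)
for _ in range(2000):
    for x in [i / 20 for i in range(21)]:
        sdm.update(x, 2.0 * x)
print(round(sdm.predict(0.5), 2))
```

In a value-based RL loop, `update` would be driven by bootstrapped targets (as in Q-learning) rather than supervised labels, which is the interaction between approximation and learning that the thesis's structure-selection method accounts for.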
Items (1 record):
Inventory Number: W9383472
Location: 電子資源 (Electronic resources)
Item Class: 11.線上閱覽_V (Online reading)
Material Type: 電子書 (E-book)
Call Number: EB
Usage Class: 一般使用 (Normal)
Loan Status: On shelf
No. of Reservations: 0