A novel hybrid focused crawling algorithm to build domain-specific collections.
Record Type:
Language materials, printed : Monograph/item
Title/Author:
A novel hybrid focused crawling algorithm to build domain-specific collections.
Author:
Chen, Yuxin.
Description:
85 p.
Notes:
Adviser: Edward A. Fox.
Contained By:
Dissertation Abstracts International 68-03B.
Subject:
Computer Science.
Online resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3256116
Thesis (Ph.D.)--Virginia Polytechnic Institute and State University, 2007.
The Web, containing a large amount of useful information and resources, is expanding rapidly. Collecting domain-specific documents and information from the Web is one of the most important methods of building digital libraries for the scientific community. Focused crawlers can selectively retrieve Web documents relevant to a specific domain to build collections for domain-specific search engines or digital libraries. Traditional focused crawlers, which normally adopt the simple Vector Space Model and local Web search algorithms, typically find relevant Web pages with only low precision. Recall is also often low, since they explore a limited sub-graph of the Web that surrounds the starting URL set and ignore relevant pages outside this sub-graph. In this work, we investigated how to apply an inductive machine learning algorithm and a meta-search technique to the traditional focused crawling process to overcome the above-mentioned problems and improve performance. We proposed a novel hybrid focused crawling framework based on Genetic Programming (GP) and meta-search. We showed that our novel hybrid framework can be applied to traditional focused crawlers to accurately find more relevant Web documents for use in digital libraries and domain-specific search engines. The framework is validated through experiments performed on test documents from the Open Directory Project [22]. Our studies have shown that improvement can be achieved relative to the traditional focused crawler if genetic programming and meta-search methods are introduced into the focused crawling process.
Subjects--Topical Terms:
626642
Computer Science.
LDR      02450nam 2200253 a 45
001      969742
005      20110920
008      110921s2007 eng d
035      $a (UMI)AAI3256116
035      $a AAI3256116
040      $a UMI $c UMI
100 1    $a Chen, Yuxin. $3 1293800
245 1 2  $a A novel hybrid focused crawling algorithm to build domain-specific collections.
300      $a 85 p.
500      $a Adviser: Edward A. Fox.
500      $a Source: Dissertation Abstracts International, Volume: 68-03, Section: B, page: 1717.
502      $a Thesis (Ph.D.)--Virginia Polytechnic Institute and State University, 2007.
520      $a The Web, containing a large amount of useful information and resources, is expanding rapidly. Collecting domain-specific documents and information from the Web is one of the most important methods of building digital libraries for the scientific community. Focused crawlers can selectively retrieve Web documents relevant to a specific domain to build collections for domain-specific search engines or digital libraries. Traditional focused crawlers, which normally adopt the simple Vector Space Model and local Web search algorithms, typically find relevant Web pages with only low precision. Recall is also often low, since they explore a limited sub-graph of the Web that surrounds the starting URL set and ignore relevant pages outside this sub-graph. In this work, we investigated how to apply an inductive machine learning algorithm and a meta-search technique to the traditional focused crawling process to overcome the above-mentioned problems and improve performance. We proposed a novel hybrid focused crawling framework based on Genetic Programming (GP) and meta-search. We showed that our novel hybrid framework can be applied to traditional focused crawlers to accurately find more relevant Web documents for use in digital libraries and domain-specific search engines. The framework is validated through experiments performed on test documents from the Open Directory Project [22]. Our studies have shown that improvement can be achieved relative to the traditional focused crawler if genetic programming and meta-search methods are introduced into the focused crawling process.
590      $a School code: 0247.
650 4    $a Computer Science. $3 626642
690      $a 0984
710 2 0  $a Virginia Polytechnic Institute and State University. $3 1017496
773 0    $t Dissertation Abstracts International $g 68-03B.
790      $a 0247
790 1 0  $a Fox, Edward A., $e advisor
791      $a Ph.D.
792      $a 2007
856 4 0  $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3256116
Inventory Number:
W9128230
Location Name:
Electronic resources
Item Class:
11. Online viewing_V
Material type:
E-book
Call number:
EB W9128230
Usage Class:
General use (Normal)
Loan Status:
On shelf
No. of reservations:
0