Dong Hwa University Library (東華大學圖書館)

Comparative Mining of Multiple Web Data Source Contents with Object Oriented Model.

Record Type:	Language materials, printed : Monograph/item
Title/Author:	Comparative Mining of Multiple Web Data Source Contents with Object Oriented Model.
Author:	Alahmad, Yanal.
Description:	122 p.
Notes:	Source: Masters Abstracts International, Volume: 51-04.
Contained By:	Masters Abstracts International 51-04(E).
Subject:	Computer Science.
Online resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=MR84909
ISBN:	9780494849095

Thesis (M.Sc.)--University of Windsor (Canada), 2013.

Web contents usually contain different types of data embedded in different complex structures. Existing approaches for extracting data contents from the web are manual wrappers, supervised wrapper induction, and automatic data extraction. The WebOMiner system is an automatic extraction system that attempts to extract diverse heterogeneous web contents by modeling web sites as object-oriented schemas. The goal is to generate and integrate various web site object schemas for deeper comparative querying of historical and derived contents of Business-to-Customer (B2C) web sites such as BestBuy and Future Shop. The current WebOMiner system generates and extracts from only one product list page (e.g., a computer page) of B2C web sites, and still needs to generate and extract from more comprehensive web site object schemas (e.g., those of Computer, Laptop and Desktop products). The current WebOMiner system also does not yet handle historical aspects of data objects from different web pages.

ISBN: 9780494849095

Subjects--Topical Terms:

626642
Computer Science.

Comparative Mining of Multiple Web Data Source Contents with Object Oriented Model.
LDR:02716nam a2200289 4500
001 1964140
005 20141010091534.5
008 150210s2013 ||||||||||||||||| ||eng d
020 $a 9780494849095
035 $a (MiAaPQ)AAIMR84909
035 $a AAIMR84909
040 $a MiAaPQ $c MiAaPQ
100 1 $a Alahmad, Yanal. $3 2100536
245 1 0 $a Comparative Mining of Multiple Web Data Source Contents with Object Oriented Model.
300 $a 122 p.
500 $a Source: Masters Abstracts International, Volume: 51-04.
500 $a Adviser: Christie I. Ezeife.
502 $a Thesis (M.Sc.)--University of Windsor (Canada), 2013.
520 $a Web contents usually contain different types of data embedded in different complex structures. Existing approaches for extracting data contents from the web are manual wrappers, supervised wrapper induction, and automatic data extraction. The WebOMiner system is an automatic extraction system that attempts to extract diverse heterogeneous web contents by modeling web sites as object-oriented schemas. The goal is to generate and integrate various web site object schemas for deeper comparative querying of historical and derived contents of Business-to-Customer (B2C) web sites such as BestBuy and Future Shop. The current WebOMiner system generates and extracts from only one product list page (e.g., a computer page) of B2C web sites, and still needs to generate and extract from more comprehensive web site object schemas (e.g., those of Computer, Laptop and Desktop products). The current WebOMiner system also does not yet handle historical aspects of data objects from different web pages.
520 $a This thesis extends and advances the WebOMiner system to automatically generate a more comprehensive web site object schema, extract and mine structured web contents from different web pages based on similarity matching of object patterns, and store the extracted objects in a historical object-oriented data warehouse. The approaches used include similarity matching of DOM tree tag nodes to identify data blocks and data regions that contain similar data objects, and automatic Non-Deterministic and Deterministic Finite Automata (NFA and DFA) for generating web site object schemas and extracting content. Experimental results show that our system is effective and able to extract and mine structured data tuples from different web sites with 79% recall and 100% precision. The average execution time of our system is 21.8 seconds.
590 $a School code: 0115.
650 4 $a Computer Science. $3 626642
650 4 $a Information Technology. $3 1030799
690 $a 0984
690 $a 0489
710 2 $a University of Windsor (Canada). $b Computer Science. $3 2092546
773 0 $t Masters Abstracts International $g 51-04(E).
790 $a 0115
791 $a M.Sc.
792 $a 2013
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=MR84909
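
Illustrative note on the abstract: the second 520 field mentions similarity matching of DOM tree tag nodes to identify data blocks and data regions. The record does not describe the thesis's actual algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: sibling subtrees are compared by their pre-order tag sequences using Python's standard difflib.SequenceMatcher (a stand-in similarity measure, not necessarily the one used in WebOMiner), and a run of adjacent, similar siblings is treated as one data region. All names (Node, TreeBuilder, find_data_regions) and the 0.8 threshold are hypothetical.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser

# Elements that never have a closing tag; they are added without descending.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class Node:
    """A minimal DOM element: tag name plus child elements (text is ignored)."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a Node tree; assumes the input HTML is well formed."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.current.parent is not None:
            self.current = self.current.parent


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    return builder.root


def tag_sequence(node):
    """Pre-order list of tag names in the subtree rooted at node."""
    seq = [node.tag]
    for child in node.children:
        seq.extend(tag_sequence(child))
    return seq


def similarity(a, b):
    """Similarity in [0, 1] between the tag sequences of two subtrees."""
    return SequenceMatcher(None, tag_sequence(a), tag_sequence(b)).ratio()


def find_data_regions(node, threshold=0.8, min_blocks=2):
    """Return runs of adjacent sibling subtrees with similar tag structure."""
    regions, run = [], []
    for child in node.children:
        if run and similarity(run[-1], child) >= threshold:
            run.append(child)
        else:
            if len(run) >= min_blocks:
                regions.append(run)
            run = [child]
    if len(run) >= min_blocks:
        regions.append(run)
    # Recurse so that regions nested deeper in the tree are also found.
    for child in node.children:
        regions.extend(find_data_regions(child, threshold, min_blocks))
    return regions
```

Given a product list such as three `<li>` items that each hold an `<h3>` and a `<span>`, calling `find_data_regions(parse(html))` would group the three items into one region, and each item would be a candidate data block. A lower threshold tolerates more variation between blocks (for example, an optional "sale" label) at the cost of more false groupings.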