Information Extraction and Knowledge Graph Development for Manufacturing Science Domain Using Natural Language Processing.
Record type:
Bibliographic - Electronic resource : Monograph/item
Title/Author:
Information Extraction and Knowledge Graph Development for Manufacturing Science Domain Using Natural Language Processing.
Author:
Kumar, Aman.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2023.
Extent:
152 p.
Notes:
Source: Dissertations Abstracts International, Volume: 85-01, Section: A.
Contained By:
Dissertations Abstracts International, 85-01A.
Subject:
Search engines.
Electronic resource:
https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30516422
ISBN:
9798379880675
Thesis (Ph.D.)--North Carolina State University, 2023.
This item must not be sold to any third party vendors.
The number of manufacturing science articles published digitally in scientific journals and on the broader web has grown exponentially since the 1990s. For a novice engineer or an experienced researcher, assimilating this knowledge requires a significant synthesis of the published material to find answers to both basic and complex queries. Recent advances in Technical Language Processing (TLP), a sub-field of Natural Language Processing (NLP), have created opportunities to leverage the rich data available from multiple procedural sources, including research articles, magazines, and textbooks, to develop new applications in data-driven manufacturing, education, and data-guided design. TLP enables fast and efficient extraction of valuable text-based scientific information using word embeddings. This dissertation focuses on developing tools for information extraction and structured representation of knowledge in the manufacturing science domain.
One of the significant challenges in analyzing manufacturing vocabulary is the lack of a named entity recognition (NER) model that enables algorithms to classify the words of a manufacturing corpus under manufacturing semantic categories. In this work, we present a supervised machine learning approach that categorizes unstructured text from manufacturing science abstracts and from relevant technical books on discrete manufacturing, labeling it under various manufacturing topic categories. We created the FabNER dataset and model to perform manufacturing-specific information extraction. Two use cases, topic modeling and similar-entity recommendation, demonstrate the value of the developed NER model as a TLP workflow for manufacturing science documents. The textual data is further used to create contextual word embeddings through transformer-based language models.
To address the limitations of vanilla language models, we introduce ManuBERT (Bidirectional Encoder Representation from Transformers for Manufacturing text), an extension of the state-of-the-art BERT model with manufacturing domain-specific language information. We evaluate this language model on named entity recognition, mask prediction, and binary classification tasks. Furthermore, traditional word embeddings are combined with entity recognition to conduct unsupervised NLP research on solar cell materials, anticipating new solar cell materials from the existing knowledge in 1.72 million abstracts and patents.
The information extraction tools are then used to create a structured representation of manufacturing knowledge through Knowledge Graphs (KG). A KG can power question-answering systems, organize knowledge, and infer results through queries. We created FabKG (Manufacturing knowledge graph) from structured and unstructured knowledge sources. For the structured sources, textbook index words, research paper keywords, and FabNER (manufacturing NER) were used to extract a sub-knowledge base contained within Wikidata. For the unstructured sources, we propose a novel crowdsourcing method for KG creation that leverages student notes, which hold invaluable information but are not captured as meaningful information beyond their use in personal preparation for learning and written exams. The knowledge graph developed from notes and human annotations is evaluated by annotators. Finally, some potential use cases and the types of questions the knowledge graph can answer are shown.
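The abstract's statement that a KG can organize knowledge and infer results through queries can be illustrated with a minimal sketch. This is not FabKG or any code from the dissertation: the `TripleStore` class, the predicate names (`subclass_of`, `uses_material`), and the three example triples are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class TripleStore:
    """Tiny in-memory knowledge graph of (subject, predicate, object) triples.

    Hypothetical illustration only; not the FabKG implementation.
    """

    def __init__(self):
        # Maps each subject to the set of (predicate, object) pairs it has.
        self._by_subject = defaultdict(set)

    def add(self, subject, predicate, obj):
        self._by_subject[subject].add((predicate, obj))

    def objects(self, subject, predicate):
        """Direct lookup: every object o such that (subject, predicate, o) is stored."""
        pairs = self._by_subject.get(subject, ())
        return {o for p, o in pairs if p == predicate}

    def ancestors(self, subject, predicate="subclass_of"):
        """Simple inference: follow a transitive predicate to collect all ancestors."""
        seen, stack = set(), [subject]
        while stack:
            node = stack.pop()
            for parent in self.objects(node, predicate):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Hypothetical example triples (not taken from the dissertation).
kg = TripleStore()
kg.add("laser powder bed fusion", "subclass_of", "additive manufacturing")
kg.add("additive manufacturing", "subclass_of", "manufacturing process")
kg.add("laser powder bed fusion", "uses_material", "metal powder")

# Direct question: which material does the process use?
material = kg.objects("laser powder bed fusion", "uses_material")

# Inferred answer: categories the process belongs to, including indirect ones.
categories = kg.ancestors("laser powder bed fusion")
```

Here `material` comes from a stored fact, while `categories` includes "manufacturing process", which is never stated directly for laser powder bed fusion and is only reached by following the `subclass_of` chain. A production KG would more likely use an RDF store and a query language such as SPARQL.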
LDR 04901nmm a2200349 4500
001 2395351
005 20240517100604.5
006 m o d
007 cr#unu||||||||
008 251215s2023 ||||||||||||||||| ||eng d
020 $a 9798379880675
035 $a (MiAaPQ)AAI30516422
035 $a (MiAaPQ)NCState_Univ18402040758
035 $a AAI30516422
040 $a MiAaPQ $c MiAaPQ
100 1 $a Kumar, Aman. $3 3764858
245 1 0 $a Information Extraction and Knowledge Graph Development for Manufacturing Science Domain Using Natural Language Processing.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2023
300 $a 152 p.
500 $a Source: Dissertations Abstracts International, Volume: 85-01, Section: A.
500 $a Advisor: Cohen, Paul; Ngaile, Gracious; Lynch, Collin; Shirwaiker, Rohan; Starly, Binil.
502 $a Thesis (Ph.D.)--North Carolina State University, 2023.
506 $a This item must not be sold to any third party vendors.
590 $a School code: 0155.
650 4 $a Search engines. $3 869493
650 4 $a Neural networks. $3 677449
650 4 $a Web studies. $3 2122754
690 $a 0800
690 $a 0338
690 $a 0646
710 2 $a North Carolina State University. $3 1018772
773 0 $t Dissertations Abstracts International $g 85-01A.
790 $a 0155
791 $a Ph.D.
792 $a 2023
793 $a English
856 4 0 $u https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30516422
Holdings:
Barcode: W9503671
Location: Electronic resources
Circulation category: 11. Online viewing_V
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Reservation status: 0