
Mineral question-answering system in Chinese based on multi-hop reasoning in knowledge graphs
JI Xiaohui, DONG Yuhang, YANG Zhongji, YANG Mei, HE Mingyue, WANG Yuzhu
Mineral knowledge is important for geoscience research. Existing mineral databases store and retrieve mineral knowledge, and general-purpose search engines can also be used to look it up. However, mineral databases cannot answer questions posed in natural language, and the results returned by search engines must be filtered further. Knowledge graphs have been applied to mineral question answering, but current approaches can only answer simple questions that involve a single triple; they cannot answer complex questions that involve multiple triples and require multi-hop reasoning. This paper presents a method for answering complex mineral questions based on multi-hop reasoning in knowledge graphs. The ComplEx model is used to represent mineral entities, relations, and questions as complex-valued vectors in order to better capture the semantic and reasoning relations among them. Given a mineral question, Bert-LSTM-CRF extracts its head word, and a candidate entity set for the head word is obtained using edit distance and word segmentation. A fully connected network then selects the most relevant candidate as the starting entity of the reasoning. The starting entity vector is concatenated with the question vector and passed through a fully connected network to obtain the most relevant relation for the current hop. From the starting entity and this relation, the next entity is retrieved from the mineral knowledge graph and serves as the starting entity of the next hop. The question for the next hop is updated to the original question concatenated with the most relevant relation of the current hop, which carries the reasoning information of the current hop into the next one. The process repeats until the most relevant relation is a predefined stop token. The entity obtained in the last hop is returned as the answer, together with the reasoning path.
The method is implemented in Python with TensorFlow and compared with related models, and the comparison demonstrates its effectiveness. Based on this method, a question-answering system for complex mineral questions is developed with a decoupled front-end/back-end architecture using a RESTful API, React, Ajax, ECharts, and Flask, providing a platform and tool for acquiring mineral knowledge and supporting related geological research.
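The hop-by-hop procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy graph `KG`, the stop token name `STOP`, the `[REL]` marker, and all function names are hypothetical. The trained components (ComplEx embeddings, Bert-LSTM-CRF head-word extraction, and the fully connected networks) are replaced by simple stand-ins: entity linking picks the nearest entity by edit distance alone, and relation selection uses keyword matching.

```python
from typing import Callable, Dict, List, Tuple

STOP = "<STOP>"  # hypothetical name for the predefined stop token

# Toy knowledge graph (illustrative data): (head entity, relation) -> tail entity.
KG: Dict[Tuple[str, str], str] = {
    ("calcite", "polymorph"): "aragonite",
    ("aragonite", "crystal_system"): "orthorhombic",
}

# Keyword stand-in for the trained relation classifier (an assumption).
KEYWORDS = {"polymorph": "polymorph", "crystal system": "crystal_system"}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def link_entity(head_word: str, entities: List[str]) -> str:
    """Stand-in for candidate retrieval and entity ranking:
    return the entity closest to the head word in edit distance."""
    return min(entities, key=lambda e: edit_distance(head_word, e))


def toy_relation_scorer(entity: str, hop_question: str) -> str:
    """Stand-in for the relation network: pick a relation that the question
    mentions, that was not used in the previous hop, and that exists for
    the current entity; otherwise emit the stop token."""
    for phrase, rel in KEYWORDS.items():
        asked = phrase in hop_question
        used = f"[REL]{rel}" in hop_question
        if asked and not used and (entity, rel) in KG:
            return rel
    return STOP


def answer(question: str, head_word: str,
           scorer: Callable[[str, str], str] = toy_relation_scorer,
           max_hops: int = 5) -> Tuple[str, List[Tuple[str, str, str]]]:
    """Follow relations hop by hop until the scorer returns STOP."""
    entities = sorted({h for h, _ in KG} | set(KG.values()))
    entity = link_entity(head_word, entities)
    hop_question = question
    path: List[Tuple[str, str, str]] = []
    for _ in range(max_hops):
        relation = scorer(entity, hop_question)
        if relation == STOP or (entity, relation) not in KG:
            break
        nxt = KG[(entity, relation)]
        path.append((entity, relation, nxt))
        entity = nxt
        # Next-hop question: original question joined with the current relation.
        hop_question = f"{question} [REL]{relation}"
    return entity, path


if __name__ == "__main__":
    q = "What is the crystal system of the polymorph of calcite?"
    print(answer(q, head_word="calcit"))
```

In this sketch the misspelled head word `calcit` is linked to `calcite`, and the loop follows `polymorph` and then `crystal_system` before the scorer returns the stop token. Rebuilding the next-hop question from the original question plus only the current relation mirrors the update rule stated in the abstract.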
mineral / question answering / knowledge graph / multi-hop reasoning
TP391.1;P628