Machine Reading Comprehension for Document-level Person Aspect Term Extraction
LIU Ziyun, ZHANG Shiqi, CHEN Wenliang
Person aspect term extraction aims to extract attributes of a person, such as gender and nationality, from text describing that person. Existing methods typically train sequence labeling models on distantly supervised data. However, such data contains inaccurate annotations and overlapping values for different attributes, and the resulting models lack scalability and generalizability. To address these problems, this paper recasts the task as a machine reading comprehension (MRC) problem: the model fills in a person attribute-value table by reading the person's profile. The paper constructs a person attribute recognition dataset in the reading comprehension framework from encyclopedia entries on persons, and builds two baseline models: BERT-MRC, which combines bidirectional encoder representations from transformers (BERT) with MRC, and BERT-CRF-MRC, which additionally uses a conditional random field (CRF). On average, BERT-CRF-MRC scores three percentage points higher in F1 than BERT-MRC; its average F1 is about 92% on short-text person profiles and about 75% on long-text person profiles. The constructed dataset and code are publicly available on GitHub.
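The MRC formulation described in the abstract can be illustrated with a minimal sketch. It assumes one natural-language question per attribute and B/I/O tags over the profile tokens as the answer encoding. The question templates, function names, and hand-written tags below are hypothetical and do not come from the paper; a trained model such as BERT-MRC or BERT-CRF-MRC would produce the tags in practice.

```python
from typing import Dict, List

# Hypothetical question templates; the paper's actual query wording is not
# given in the abstract.
QUESTION_TEMPLATES: Dict[str, str] = {
    "gender": "What is the person's gender?",
    "nationality": "What is the person's nationality?",
}


def build_mrc_inputs(profile: str, attributes: List[str]) -> List[Dict[str, str]]:
    """Pair one question per attribute with the same profile text."""
    return [
        {"attribute": a, "question": QUESTION_TEMPLATES[a], "context": profile}
        for a in attributes
    ]


def decode_bio(tokens: List[str], tags: List[str]) -> List[str]:
    """Collect answer spans from B/I/O tags aligned with the context tokens."""
    spans: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new span starts; close any span that is still open.
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            # "O" (or a stray "I") ends the open span, if any.
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


# Toy profile, whitespace-tokenized for simplicity.
profile = "Marie Curie was a Polish and French physicist ."
inputs = build_mrc_inputs(profile, ["gender", "nationality"])
tokens = profile.split()

# Hand-written tags standing in for model output on the "nationality" question.
nationality_tags = ["O", "O", "O", "O", "B", "O", "B", "O", "O"]
table = {"nationality": decode_bio(tokens, nationality_tags)}
```

Because each attribute gets its own question and its own tag sequence, the same span of text can serve as the answer to more than one attribute, which is one way an MRC formulation can sidestep the overlapping-value problem of a single shared tag sequence.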