Document-Level Relation Extraction Method Based on Attention Semantic Enhancement
Authors:

LIU Xianhui, WU Wenda, ZHAO Weidong, HOU Wenlong

Affiliation:

1.College of Electronic and Information Engineering, Tongji University, Shanghai 201804, China; 2.Shanghai Visual Perception and Intelligent Computing Engineering Technology Research Center, Shanghai 200092, China

About the author:

LIU Xianhui (柳先辉), associate professor, Doctor of Engineering; main research interest: natural language learning. E-mail: lxh@tongji.edu.cn

Corresponding author:

WU Wenda (吴文达), master's student; main research interest: knowledge graphs. E-mail: 2130766@tongji.edu.cn

CLC number:

TP391

Fund Project:

National Key Research and Development Program of China (2020YFB1709303)

    Abstract:

    Document-level relation extraction aims to extract the relations between multiple entity pairs in a document and is a highly complex task. To address the multiple entities, relation correlations, and imbalanced relation distribution found in document-level relation extraction, this paper proposes a method based on attention semantic enhancement that supports inference of the relations between entity pairs. Specifically, the data encoding module first improves the encoding strategy by introducing more entity information, captures the semantic features of the document through an encoding network, and produces an entity-pair matrix. Next, a U-Net built on an attention gating mechanism captures local information and aggregates global information over the entity-pair matrix, achieving semantic enhancement. Finally, an adaptive focal loss function mitigates the imbalanced relation distribution. The proposed Att-DocuNet model is evaluated on four public document-level relation extraction datasets (DocRED, CDR, GDA, and DWIE) and achieves good experimental results.
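
The last step of the abstract relies on a focal-style loss to counter the imbalanced relation distribution. As a minimal sketch of the underlying idea, the Python below implements the standard binary focal loss, not the adaptive variant used by Att-DocuNet, whose details the abstract does not give. The function names and the default `gamma=2.0` are illustrative assumptions.

```python
import math


def focal_loss(p, y, gamma=2.0):
    """Standard binary focal loss for one prediction.

    p     -- predicted probability of the positive class, in [0, 1]
    y     -- ground-truth label, 0 or 1
    gamma -- focusing parameter (assumed default; gamma=0 gives cross-entropy)
    """
    # Probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    # Clamp to avoid log(0) on fully wrong predictions.
    p_t = min(max(p_t, 1e-12), 1.0)
    # The factor (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def cross_entropy(p, y):
    """Plain binary cross-entropy, i.e. focal loss with gamma = 0."""
    return focal_loss(p, y, gamma=0.0)


if __name__ == "__main__":
    for p in (0.9, 0.5, 0.1):
        print(p, cross_entropy(p, 1), focal_loss(p, 1))
```

With `gamma=2.0`, a confidently correct prediction (`p = 0.9` for a positive label) keeps only 1% of its cross-entropy loss, while a badly wrong one (`p = 0.1`) keeps 81%, so rare or hard relations weigh more in training than the many easy ones.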

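Cite this article

LIU Xianhui, WU Wenda, ZHAO Weidong, HOU Wenlong. Document-Level Relation Extraction Method Based on Attention Semantic Enhancement[J]. Journal of Tongji University (Natural Science), 2024, 52(5): 822-828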

History
  • Received: 2022-11-29
  • Published online: 2024-05-24