Making LLMs as Fine-Grained Relation Extraction Data Augmentor

Yifan Zheng, Wenjun Ke, Qi Liu, Yuting Yang, Ruizhuo Zhao, Dacheng Feng, Jianwei Zhang, Zhi Fang

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 6660-6668. https://doi.org/10.24963/ijcai.2024/736

Relation Extraction (RE) identifies relations between entities in text and typically relies on supervised models that demand abundant high-quality training data. Various approaches, including Data Augmentation (DA), have been proposed as promising solutions to the low-resource challenges in RE. However, owing to the fine-grained nature of RE, existing DA methods often struggle to keep generated data consistent and contextually diverse. Inspired by the extensive generative capabilities of large language models (LLMs), we introduce ConsistRE, a novel framework that aims to maintain context consistency in RE. ConsistRE first collects a substantial corpus from external resources and applies statistical algorithms and semantic measures to identify keyword hints closely tied to relation instances. These keyword hints are then injected as contextual constraints during sentence generation, ensuring that the LLM preserves both relation dependence and contextual diversity. Additionally, we apply syntactic dependency selection to improve the syntactic structure of the generated sentences. Experimental results on the SemEval, TACRED, and TACREV datasets demonstrate that ConsistRE outperforms other baselines, improving F1 by 1.76%, 3.92%, and 2.53%, respectively, particularly under low-resource experimental conditions.
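For concreteness, the following is a minimal sketch of the pipeline the abstract describes, under stated assumptions: TF-IDF stands in for the paper's unspecified statistical algorithms, TF-IDF weights in the instance sentence serve as a crude proxy for the semantic scoring, and spaCy dependency arcs approximate the syntactic dependency selection. The helper names (keyword_hints, build_prompt, dependency_overlap) are hypothetical, and the actual LLM call is omitted; this is not the authors' implementation.

```python
# Illustrative sketch only -- not the authors' code. Assumptions are
# noted inline; any unstated detail of ConsistRE is approximated.
import spacy
from sklearn.feature_extraction.text import TfidfVectorizer

_nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed


def keyword_hints(corpus, instance_sentence, top_k=5):
    """Rank terms that are statistically salient in the external corpus
    and also present in the relation instance sentence."""
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform(corpus + [instance_sentence])
    vocab = vectorizer.get_feature_names_out()
    salience = matrix[:-1].mean(axis=0).A1   # corpus-level TF-IDF weight per term
    proximity = matrix[-1].toarray()[0]      # weight in the instance sentence itself
    scores = salience * proximity
    ranked = scores.argsort()[::-1]
    return [vocab[i] for i in ranked[:top_k] if scores[i] > 0]


def build_prompt(head, tail, relation, hints):
    """Embed the keyword hints as contextual constraints for LLM generation."""
    return (
        f"Write one sentence expressing the relation '{relation}' between "
        f"'{head}' and '{tail}'. Use these context words where natural: "
        f"{', '.join(hints)}."
    )


def dependency_overlap(original, candidate):
    """Score a generated sentence by how much of the original sentence's
    dependency-arc structure it reproduces."""
    def arcs(text):
        return {(tok.dep_, tok.head.pos_, tok.pos_) for tok in _nlp(text)}
    a, b = arcs(original), arcs(candidate)
    return len(a & b) / max(len(a), 1)
```

In a pipeline of this shape, candidates scoring highest under dependency_overlap would be kept as augmented training instances; a faithful reproduction would replace the TF-IDF proxy with the semantic keyword scoring the abstract alludes to.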
Keywords:
Natural Language Processing: NLP: Language generation
Natural Language Processing: NLP: Information extraction
Natural Language Processing: NLP: Information retrieval and text mining
Natural Language Processing: NLP: Resources and evaluation