LEEC for Judicial Fairness: A Legal Element Extraction Dataset with Extensive Extra-Legal Labels
Zongyue Xue, Huanghai Liu, Yiran Hu, Yuliang Qian, Yajing Wang, Kangle Kong, Chenlu Wang, Yun Liu, Weixing Shen
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
AI for Good. Pages 7527-7535.
https://doi.org/10.24963/ijcai.2024/833
An extensive label system is pivotal to facilitating judicial fairness and social justice. Prior empirical research and our interview with legal professionals underscore the importance of extra-legal factors in criminal trials. To help identify sentencing biases and facilitate downstream applications, we introduce the Legal Element ExtraCtion (LEEC) dataset, comprising 15,919 judicial documents and 155 labels. The dataset was constructed in two main steps: first, legal experts designed the label system based on prior empirical research that identified the critical factors driving, and the processes generating, sentencing outcomes in criminal cases; second, judicial documents were annotated, drawing on legal knowledge, according to the label system and annotation guideline. LEEC represents the most extensive and domain-specific legal element extraction dataset for the Chinese legal system. Our experiments reveal that, despite certain capabilities, both Document Event Extraction (DEE) models and Large Language Models (LLMs) face significant limitations in legal element extraction tasks. Finally, our empirical analysis based on LEEC provides evidence of judicial unfairness in Chinese criminal sentencing and confirms the applicability of LEEC to future empirical research and other downstream applications. LEEC and related resources are available at https://github.com/THUlawtech/LEEC.
Keywords:
Multidisciplinary Topics and Applications: General
Natural Language Processing: General