Knowledge-enhanced Graph Topic Transformer for Explainable Biomedical Text Summarization
2024 (English) In: IEEE Journal of Biomedical and Health Informatics, ISSN 2168-2194, E-ISSN 2168-2208, Vol. 8, no 4, p. 1836-1847. Article in journal (Refereed) Published
Abstract [en]
Given the overwhelming and rapidly increasing volume of published biomedical literature, automatic biomedical text summarization has long been an important task. Recently, great advances in the performance of biomedical text summarization have been facilitated by fine-tuned pre-trained language models (PLMs). However, existing PLM-based summarization methods do not capture domain-specific knowledge. This can result in generated summaries with low coherence that include redundant sentences or omit important domain knowledge conveyed in the full-text document. Furthermore, the black-box nature of transformers means that they lack explainability, i.e. it is not clear to users how and why a summary was generated. Domain-specific knowledge and explainability are both crucial for the accuracy and transparency of biomedical text summarization methods. In this article, we aim to address these issues by proposing a novel domain knowledge-enhanced graph topic transformer (DORIS) for explainable biomedical text summarization. The model integrates a graph neural topic model and domain-specific knowledge from the Unified Medical Language System (UMLS) into a transformer-based PLM to improve explainability and accuracy. Experimental results on four biomedical literature datasets show that our model outperforms existing state-of-the-art (SOTA) PLM-based methods on biomedical extractive summarization. Furthermore, our use of graph neural topic modeling means that our model possesses the desirable property of being explainable, i.e. it is straightforward for users to understand how and why the model selects particular sentences for inclusion in the summary. The domain-specific knowledge helps our model learn more coherent topics, which in turn better explain its performance. © IEEE
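To make the abstract's idea concrete, the sketch below illustrates the general pattern of combining sentence representations with topic information to score and select sentences for an extractive summary. It is a minimal, illustrative assumption, not the DORIS implementation: it uses TF-IDF vectors in place of PLM sentence embeddings, plain NMF in place of the paper's graph neural topic model, omits UMLS knowledge entirely, and the function name, weighting, and scoring scheme are invented for illustration.

```python
# Illustrative sketch only -- NOT the DORIS model. It mimics the high-level idea of
# fusing sentence salience with topic alignment to pick sentences for an extractive
# summary. TF-IDF and NMF are stand-ins for PLM embeddings and the graph neural
# topic model; all names and weights here are assumptions.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import NMF
from sklearn.metrics.pairwise import cosine_similarity


def extractive_summary(sentences, n_topics=3, n_select=2):
    """Score each sentence by (a) similarity to the document centroid and
    (b) alignment with the document-level topic mixture, then keep the top ones."""
    vec = TfidfVectorizer(stop_words="english")
    X = vec.fit_transform(sentences)                      # sentence-term matrix

    # Topic-model stand-in: per-sentence topic weights and the document's topic mix.
    nmf = NMF(n_components=n_topics, init="nndsvda", random_state=0)
    sent_topics = nmf.fit_transform(X)
    doc_topics = sent_topics.mean(axis=0)

    centroid = np.asarray(X.mean(axis=0))                 # document centroid
    salience = cosine_similarity(X, centroid).ravel()     # content salience per sentence
    topic_align = cosine_similarity(sent_topics, doc_topics[None, :]).ravel()

    scores = 0.5 * salience + 0.5 * topic_align           # simple fusion of the two signals
    top = np.argsort(scores)[::-1][:n_select]
    return [sentences[i] for i in sorted(top)]            # keep original sentence order


if __name__ == "__main__":
    doc = [
        "Biomedical literature is growing rapidly and is hard to survey manually.",
        "Automatic summarization selects the most informative sentences from an article.",
        "Domain knowledge such as UMLS concepts can make summaries more accurate.",
        "Topic models expose which themes drive sentence selection, aiding explainability.",
    ]
    print(extractive_summary(doc))
```

The per-sentence topic weights also hint at the explainability claim in the abstract: for each selected sentence, one can inspect which topics contributed most to its score.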
Place, publisher, year, edition, pages
Piscataway, NJ: IEEE, 2024. Vol. 8, no 4, p. 1836-1847
Keywords [en]
Biological system modeling, Biomedical text summarization, Correlation, domain knowledge, explainability, graph neural topic model, Knowledge based systems, pre-trained language models, Semantics, Task analysis, Transformers, Unified modeling language
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:hh:diva-51639
DOI: 10.1109/JBHI.2023.3308064
ISI: 001197865400005
PubMedID: 37610905
Scopus ID: 2-s2.0-85168715034
OAI: oai:DiVA.org:hh-51639
DiVA, id: diva2:1798584
Note
Funding: The New Energy and Industrial Technology Development Organization (NEDO)
Available from: 2023-09-19 Created: 2023-09-19 Last updated: 2024-06-26 Bibliographically approved