Authors
Jan Bakker
Taiki Papandreou-Lazos
Jaap Kamps
Date (dd-mm-yyyy)
2025
Title
Biomedical Text Simplification Models Trained on Aligned Abstracts and Lay Summaries
Publication Year
2025
Number of pages
8
Publisher
National Institute of Standards and Technology. NIST Special Publication 1329
Document type
Conference contribution
Abstract
This paper documents the University of Amsterdam’s participation in the TREC 2024 Plain Language Adaptation of Biomedical Abstracts (PLABA) Track. We investigated the effectiveness of text simplification models trained on aligned pairs of sentences from biomedical abstracts and plain language summaries. We participated in Task 2 on Complete Abstract Adaptation and conducted post-submission experiments on Task 1 on Term Replacement. Our main findings are the following. First, we used text simplification models trained on aligned real-world scientific abstracts and plain language summaries, and observed better performance for the context-aware model than for the sentence-level model. Second, our experiments show the value of training on external corpora and demonstrate reasonable out-of-domain performance on the PLABA data. Third, more generally, our models are conservative, avoiding gratuitous edits and information insertions. This caution preserves the fidelity of the generated output and limits the risk of overgeneration or hallucination.
Permalink
https://hdl.handle.net/11245.1/70f0d5bc-87be-4430-89f8-8db7d08d96ee