This article introduces the LAMBADA dataset, developed in 2016 to evaluate the ability of computational NLP models to understand
texts longer than a single sentence. The dataset consists of passages from unpublished novels where the final word has been
masked. While human speakers can easily guess the missing word when provided with the broad context preceding it, this task
becomes nearly impossible when only the target sentence is available. At the time of its release, language models performed
poorly on LAMBADA, revealing significant gaps in their ability to leverage broader contexts for accurate word prediction.
Since its introduction, the landscape of NLP has changed dramatically with the advent of the Transformer architecture that
powered a new generation of models trained on next-word prediction as part of the language modeling objective. These models
have demonstrated substantial improvements in handling larger contextual information, and LAMBADA has become an essential
benchmark for measuring their quality and progress. In this article, I provide a detailed overview of the dataset, its design,
and its role in shaping the development of current state-of-the-art language models over the past eight years and continuing
to the present day.