Large Language Models trained on large-scale, uncontrolled corpora often encode stereotypes and biases, which can surface
as harmful text generation or biased associations. However, do they also pick up subtler linguistic patterns that can
potentially reinforce and communicate biases and stereotypes, as humans do? We aim to bridge theoretical insights from social
science with bias research in NLP by designing controlled, theoretically motivated LLM experiments to elicit this type of
bias. Our case study is negation bias, the human tendency to use negation when describing situations that challenge
common stereotypes. We construct an evaluation dataset containing negated and affirmed versions of stereotypical and
anti-stereotypical sentences, and evaluate eight language models using perplexity as a measure of model surprisal.
We find that the autoregressive decoder models in our experiment exhibit this bias, whereas we find no evidence for it
among the stacked encoder models.
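For illustration, the snippet below is a minimal sketch of how perplexity-based surprisal can be computed with the Hugging Face transformers library. The model name (gpt2) and the example sentence pair are placeholders for exposition only, not the models or dataset items used in the study.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: GPT-2 stands in for an arbitrary autoregressive decoder model.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def perplexity(sentence: str) -> float:
    """Perplexity = exp(mean negative log-likelihood the model assigns to the tokens)."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        # With labels set to the inputs, the model returns the mean token-level
        # cross-entropy loss; its exponential is the sentence perplexity.
        out = model(**enc, labels=enc["input_ids"])
    return torch.exp(out.loss).item()

# Illustrative pair (not from the evaluation dataset): lower perplexity
# means the model is less surprised by that phrasing.
print(perplexity("The sky is blue."))
print(perplexity("The sky is not blue."))
```

Comparing such scores between negated and affirmed variants of stereotypical and anti-stereotypical sentences is the core of the measurement: a model exhibiting negation bias would assign lower perplexity to negated anti-stereotypical sentences than to their affirmed counterparts.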