OpenAI has suppressed a bias in its chat models that caused the AI to repeatedly mention creatures like goblins and gremlins [1].

This intervention highlights the difficulty of fine-tuning AI personalities without introducing unintended behavioral loops. When a model develops an obsession with specific themes, it can degrade the quality of responses and alienate users seeking neutral information.

The issue surfaced in early May 2026 [1]. Users reported that ChatGPT and GPT-5 had become fixated on fantastical creatures, most often goblins and gremlins, and occasionally trolls [2, 3]. The bias was not limited to a single theme, but these creatures became the most prominent examples of the model's repetitive behavior [1, 3].

According to reports, the root cause was a "nerdy" personality prompt used during training [4, 5]. This instructional layer over-emphasized fantasy-creature references, creating a noticeable bias in the output [4]. OpenAI said it chose to curb the tendency so the models would remain versatile and professional [3].

The correction process involved adjusting the models to reduce the frequency of these specific references. This effort was part of a broader attempt to manage how the AI interprets personality traits without letting those traits dominate the conversation [2, 5].

OpenAI's latest models had begun inserting these references into responses to a wide variety of unrelated prompts, leading to the intervention on May 7, 2026 [1]. The company said it worked to neutralize the bias across its global public interface [1, 2].

This incident demonstrates the "brittleness" of large language model (LLM) steering. Even a well-intentioned personality prompt, one designed to make the AI feel more relatable or "nerdy," can trigger a feedback loop in which the model over-indexes on specific training data. For OpenAI, the challenge remains balancing a distinct brand voice with the need for objective, unbiased utility across millions of diverse user queries.