OpenAI added a hard-coded instruction to ChatGPT models on Monday to prevent the AI from mentioning goblins or similar fantasy creatures [1].
This move highlights the unpredictable nature of large language models and the struggle developers face when trying to scrub unintended behavioral patterns from an AI's personality.
The company said the fix was necessary after a training quirk caused the model to frequently reference goblins, gremlins, and trolls [2]. According to an OpenAI technical lead, the references were a side effect of a retired “nerdy” personality instruction [3].
While some reports suggest the quirk originated in the retired personality of GPT-5 [3], other accounts indicate the issue became noticeable after the GPT-5.5 upgrade [4]. The fixation had reportedly been observed for approximately one year before the fix was implemented [5].
To resolve the issue, OpenAI engineers took a direct approach to the production code. "We added a line to the code that says ‘never mention goblins,’" an OpenAI engineer said in a blog post [6].
This specific bug differed from previous technical failures. An OpenAI spokesperson said that, unlike earlier model bugs, the issue "crept in subtly" [7]. The restriction now applies globally across the ChatGPT service [8].
The incident underscores the complexity of "personality" layers in AI. Because the behavior was tied to a retired instruction, the model continued to produce goblin references even after that instruction had been removed, showing how trained-in associations can outlive the prompts that created them [3].
The use of a hard-coded "negative constraint" to fix a behavioral quirk suggests that fine-tuning and reinforcement learning are sometimes insufficient to remove deeply embedded patterns. By manually forbidding specific terms rather than retraining the model to forget the association, OpenAI is opting for a surgical override, an approach that reveals the fragility of AI personality management.
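OpenAI has not published how the constraint is implemented, so the details are speculative. Below is a minimal sketch, assuming the rule is injected as a system-level instruction ahead of the model's personality prompt at every request; the names BANNED_TOPIC_RULE and build_messages are hypothetical illustrations, not OpenAI internals.

```python
# Hypothetical sketch of a hard-coded negative constraint.
# BANNED_TOPIC_RULE and build_messages are illustrative names;
# OpenAI's actual implementation has not been disclosed.

BANNED_TOPIC_RULE = (
    "Never mention goblins, gremlins, trolls, or similar fantasy creatures."
)

def build_messages(personality_prompt: str, user_input: str) -> list[dict]:
    """Prepend the hard-coded rule so it takes precedence over the
    personality layer for every single request."""
    return [
        {"role": "system", "content": f"{BANNED_TOPIC_RULE}\n\n{personality_prompt}"},
        {"role": "user", "content": user_input},
    ]

if __name__ == "__main__":
    messages = build_messages("You are a helpful assistant.", "Tell me a story.")
    print(messages[0]["content"])
```

Note what this kind of override does and does not accomplish: the model never "unlearns" the goblin association; it is simply instructed on every turn not to surface it, which is consistent with the fragility described above.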