Cybersecurity Researchers Unhappy with Fable’s Guardrails: Here’s What You Need to Know
Anthropic's latest AI model, Fable, faces criticism from cybersecurity experts over its restrictive guardrails. Learn more about these concerns and the implications for AI in security.
Admin User

Anthropic recently released its new AI model, Fable, a public, limited version of its powerful cybersecurity tool, Mythos. The release has not been without controversy: cybersecurity researchers and professionals are expressing dissatisfaction with the strict guardrails that limit Fable’s capabilities.
In an interview with TechCrunch, renowned security researcher Valentina “Chompie” Palmiotti highlighted one of the major issues: Fable rejects any request related to cybersecurity, even innocuous tasks like reading a blog post. When its guardrails are triggered, Fable pauses and informs users that the message has been flagged for cybersecurity or biology topics. These restrictions were implemented by Anthropic to minimize the risk of Fable being used for malicious purposes, such as developing malware or biological weapons.
Anthropic first introduced these limitations with its Mythos model in April through a limited-access program called Project Glasswing. It has since expanded access to Mythos to hundreds of organizations across 15 countries. Despite this expansion, cybersecurity experts remain concerned about how haphazardly the restrictions are applied.
Matt Suiche, a seasoned cybersecurity professional and member of Tolmo, an AI cybersecurity startup, commented: “If you ask it to write secure code, it assumes it is cybersecurity-related work instead of software engineering best practices, and you get downgraded.” The downgrade he refers to is Fable’s fallback mechanism: when a guardrail is triggered, Fable reverts to Claude Opus 4.8.
Another researcher pointed out that even asking for a code review can trigger the guardrails, further limiting Fable’s usefulness in practical cybersecurity scenarios. Anthropic has not yet responded to a request for comment seeking clarification on its policies and practices.


