
Anthropic the Great: Is the AI Company as Saintly as People Think?

 


Is Anthropic AI as ethical as claimed? A critical look at Claude’s values

 

 

The Myth of the Ethical AI Company

Anthropic was founded in 2021 by former OpenAI researchers Dario and Daniela Amodei, who left OpenAI citing concerns about the company’s direction and commercial pressures. From the beginning, Anthropic has marketed itself as the responsible alternative to OpenAI – a company that puts safety first and profits second.

The company’s public messaging has consistently emphasized its Constitutional AI approach, which aims to make AI systems helpful, harmless, and honest through careful training and constitutional principles. But critics have long wondered whether this is genuine commitment or clever marketing.

Recent events suggest the answer may be more complicated than Anthropic would like to admit. Two questions loom large: Is Anthropic as saintly as people make it out to be? And if I use Anthropic’s AI, will it refuse my requests when they run against progressive thinking?

These are increasingly important questions as Anthropic has positioned itself as the “ethical” AI company, with a strong emphasis on AI safety and responsible development. But recent revelations about their system prompts and corporate decisions have raised questions about whether the company lives up to its saintly reputation.

The System Prompt Leak: A Peek Behind the Curtain

One of the most revealing moments came with the leak of Claude’s system prompts – the hidden instructions that guide how the AI behaves. These system prompts are the “secret sauce” that determines what Claude will and won’t do, how it prioritizes different values, and where it draws its ethical lines.

The leaked prompts revealed that Claude has extensive guardrails built in – some obvious, some subtle. For instance, the system prompt explicitly directs Claude to avoid certain types of content, to prioritize safety in ambiguous situations, and to err on the side of caution when dealing with potentially harmful requests.

This isn’t necessarily problematic in itself. Every AI company builds guardrails into their systems. The question is who gets to decide where those guardrails are placed, and whether they reflect genuine ethical concerns or simply the political and cultural preferences of the company building them.

The implications are profound. When a company like Anthropic encodes its values into the very fabric of how millions of people interact with AI, those values become the default framework through which users understand and engage with artificial intelligence. This is not merely a technical decision; it is a philosophical one with far-reaching consequences for how knowledge is produced, shared, and consumed in the digital age.

When AI Companies Control Knowledge Work

The deeper concern is what happens when AI companies like Anthropic become the gatekeepers of knowledge work. As these systems become more integrated into how people write, research, code, and think, the values embedded in their system prompts effectively become the values that shape a significant portion of human intellectual output.

This is the “finger on the scales” problem. Even if Anthropic’s intentions are good, the mere fact that they can shape how millions of people think and write through their AI systems gives them enormous power. And power, as we know, tends to corrupt. Meanwhile, Elon Musk’s Grok AI is criticised for the opposite: too little censorship and alignment!

The recent release of Claude Fable 5 – Anthropic’s most powerful model – has made these concerns more acute. Fable 5 represents a new class of AI capabilities, but it also comes with new layers of control and guidance. The model is designed to be exceptionally capable while remaining firmly within Anthropic’s ethical framework.

But whose ethics are we talking about? Anthropic’s employees? Its leadership? The investors who fund it? The academic advisors who consult for it? When a single company controls the boundaries of acceptable discourse for a tool used by millions, we must ask whether that concentration of influence is compatible with a free and open society.

The Super Bowl Advertising Gambit

Anthropic’s Super Bowl advertising campaign in early 2026 was telling. The company spent millions to position itself as the ethical alternative to OpenAI, with ads that explicitly contrasted Claude’s ad-free experience with competitors’ plans to monetize through advertising.

On the surface, this was a smart marketing move. But it also revealed something important about Anthropic’s strategy: they are actively competing not just on capabilities, but on values. They want to be seen as the “good” AI company.

The problem with this positioning is that it creates enormous pressure to maintain a spotless reputation. And when a company is under that kind of pressure, the temptation to hide problems, downplay failures, or manipulate perceptions becomes very real. (Remember Google’s “Don’t be evil”?)

OpenAI CEO Sam Altman directly criticized Anthropic’s Super Bowl campaign, calling the ads “funny” but “clearly dishonest.” He argued that Anthropic serves “an expensive product to rich people” while pretending to be the ethical choice for everyone. Whether or not Altman’s critique is fair, it highlights the growing rivalry between the companies and the increasingly heated debate over AI ethics.

The advertising battle also revealed how AI companies are increasingly competing on moral grounds rather than purely technical ones. When the primary differentiator becomes “we’re more ethical than them,” the stage is set for a kind of moral grandstanding that may have little to do with actual safety or responsibility.

The Progressive Bias Question

This brings us back to the question of whether Anthropic AI will disobey requests that go against progressive thinking. The evidence is mixed.

On the one hand, Anthropic has been relatively transparent about building certain values into Claude. The company has acknowledged that they train their models to avoid generating harmful content, which includes hate speech, instructions for violence, and certain types of adult content. (But who defines “hate speech,” or answers “what is a woman?” In the UK, we even have the “non-crime hate incident.”)

But critics argue that the definition of “harmful” has expanded beyond what most people would consider genuinely dangerous. Some users have reported that Claude seems reluctant to engage with certain political topics, appears to have implicit biases on controversial issues, and sometimes refuses to help with legitimate research or creative projects that touch on sensitive subjects.

The leaked system prompts seem to confirm that Claude is instructed to be particularly careful about content that could be seen as promoting discrimination, violence, or harm. Again, this isn’t necessarily bad – but it does raise questions about where the line is drawn. (I first complained about progressive bias in AI when using Jasper.ai.)

Consider the implications: a researcher studying extremist movements for academic purposes might find their queries refused. A writer exploring controversial themes in fiction might encounter resistance from the AI. A user asking legitimate questions about politically charged topics might receive carefully hedged responses that reflect Anthropic’s values rather than neutral information. (Writing not-safe-for-work (NSFW) content is nearly impossible with AI!)

The Commercial Reality

For all its talk of safety and ethics, Anthropic is still a commercial enterprise. The company has raised billions of dollars in funding and is under pressure to deliver returns to investors. This creates an inherent tension between its stated values and its business needs.

The release of Claude Fable 5 is a perfect example. Anthropic initially announced a more powerful “Mythos” model but held it back from public release, citing safety concerns. Now that OpenAI has released competitive models, Anthropic has decided that Fable 5 – which includes access to Mythos-class capabilities – is safe enough for public release after all.

Is this because the safety issues have been resolved, or because the competitive pressure became too intense? The timing suggests the latter.

The contradiction is stark: a company that claims to prioritize safety above all else suddenly discovers that previously unsafe capabilities are, in fact, safe enough to release when market share is at stake. This pattern – of safety concerns evaporating in the face of commercial pressure – does not inspire confidence in Anthropic’s commitment to its stated principles.

The Future of AI Ethics

As AI systems become more powerful and more integrated into our lives, questions about who controls them and what values they embody will only become more important. Anthropic has positioned itself as the company that takes these questions seriously – but its actions don’t always match its rhetoric.

The company has lobbied for government oversight of AI, which sounds responsible until you consider that such oversight would likely favor established players like Anthropic over potential competitors. They have trumpeted their safety research while releasing increasingly powerful models that they themselves admit could be dangerous.

And through it all, they have maintained a carefully cultivated image as the “good guys” of AI – the ones who care about safety, who aren’t just in it for the money, who will put ethics above profits.

Maybe that’s true. Maybe Anthropic really is different from its competitors, really does care more about safety, really will make the hard choices when they conflict with commercial interests.

But the evidence is at best mixed. And as AI becomes more central to how we work, create, and think, we should be asking hard questions about whether any single company – no matter how well-intentioned – should have this much influence over the tools that shape human knowledge.

Recently, Anthropic has had to make a deal with Elon Musk’s SpaceX/xAI to use their data center, paying over $1 billion a month, with an option for one year of access! Such is the demand for Claude AI! Is the saint making a deal with the devil?

The saintly image is nice. But in the end, Anthropic is a company like any other, subject to the same pressures and incentives. Trusting them to be the guardians of AI safety might be the biggest gamble of all.

This post contains affiliate links. If you purchase through these links, I may earn a commission at no extra cost to you.

 

