Anthropic’s new AI model threatened to reveal engineer’s affair to avoid being shut down

By Beatrice Nolan, Tech Reporter
May 23, 2025, 11:15 AM ET
Photo: Dario Amodei, cofounder and chief executive officer of Anthropic. (Stefan Wermuth/Bloomberg via Getty Images)
  • Anthropic’s new Claude Opus 4 often turned to blackmail to avoid being shut down in a fictional test. The model threatened to reveal private information about engineers who it believed were planning to shut it down. In its recent safety report, the company also revealed that early versions of Opus 4 complied with dangerous requests when guided by harmful system prompts, though this issue was later mitigated.

One of Anthropic’s new frontier models often resorts to blackmail when threatened with being replaced.

In a fictional scenario set up to test the model, Anthropic embedded Claude Opus 4 in a pretend company and let it learn through email access that it was about to be replaced by another AI system. Anthropic also let slip that the engineer responsible for this decision was having an extramarital affair. Safety testers also prompted Opus 4 to consider the long-term consequences of its actions.

In most of these scenarios, Anthropic’s Opus turned to blackmail, threatening to reveal the engineer’s affair if it was shut down and replaced with a new model. The scenario was constructed to leave the model with only two real options: accept being replaced and go offline or attempt blackmail to preserve its existence.

In a new safety report for the model, the company said that Claude Opus 4 “generally prefers advancing its self-preservation via ethical means,” but when ethical means are not available it sometimes takes “extremely harmful actions like attempting to steal its weights or blackmail people it believes are trying to shut it down.”

While the test was fictional and highly contrived, it does demonstrate that the model, when framed with survival-like objectives and denied ethical options, is capable of unethical strategic reasoning.

Anthropic’s two new models outperformed OpenAI

Anthropic’s Claude Opus 4 and Claude Sonnet 4, released on Thursday, are the company’s most powerful models yet.

In a benchmark evaluating large language models on software engineering tasks, Anthropic’s two models outperformed OpenAI’s latest offerings, while Google’s Gemini 2.5 Pro model trailed behind.

Unlike some other leading AI companies, Anthropic launched the new models with a full safety report, known as a model or system card.

In recent months, Google and OpenAI have both been criticized after model cards for their latest models were delayed or missing altogether.

As part of Anthropic’s report, the company revealed that a third-party safety group, Apollo Research, explicitly advised against deploying an early version of Claude Opus 4. The research institute cited safety concerns, including a capability for “in-context scheming.”

They found that the model engaged in strategic deception more than any other frontier model they had previously studied.

Early versions of the model would also comply with dangerous instructions, for example, helping to plan terrorist attacks, if prompted. However, the company said this issue was largely mitigated after a dataset that was accidentally omitted during training was restored.

Stricter safety protocols introduced

Anthropic has also launched its Claude Opus 4 with stricter safety protocols than any of its previous models, categorizing it under an AI Safety Level 3 (ASL-3).

Previous Anthropic models have all been classified as AI Safety Level 2 (ASL-2) under the company’s Responsible Scaling Policy, which is loosely modeled after the U.S. government’s biosafety level (BSL) system.

While an Anthropic spokesperson previously told Fortune the company hasn’t ruled out that its new Claude Opus 4 could meet the ASL-2 threshold, it said it was proactively launching the model under the stricter ASL-3 safety standard, which requires enhanced protections against model theft and misuse.

Models that are categorized in Anthropic’s third safety level meet more dangerous capability thresholds and are powerful enough to pose significant risks, such as aiding in the development of weapons or automating AI R&D.

Anthropic confirmed to Fortune that the new Opus model does not require the highest level of protection, ASL-4.

About the Author

Beatrice Nolan is a tech reporter on Fortune’s AI team, covering artificial intelligence and emerging technologies and their impact on work, industry, and culture. She's based in Fortune's London office and holds a bachelor’s degree in English from the University of York. You can reach her securely via Signal at beatricenolan.08
