• Home
  • Latest
  • Fortune 500
  • Finance
  • Tech
  • Leadership
  • Lifestyle
  • Rankings
  • Multimedia
NewslettersEye on AI

Exclusive: Anthropic’s Claude 3.7 Sonnet is the most secure model yet, an independent audit suggests

Sage Lazzaro
By
Sage Lazzaro
Sage Lazzaro
Contributing writer
Down Arrow Button Icon
Sage Lazzaro
By
Sage Lazzaro
Sage Lazzaro
Contributing writer
Down Arrow Button Icon
March 6, 2025, 1:31 PM ET
An image of a hand holding a mobile phone displaying the logo of Anthropic's Claude AI chatbot against a backdrop similarly showing the Claude logo and the word "Claude."
New independent research by the Holistic AI, a British firm that tests AI models, suggests Anthropic's new Claude 3.7 Sonnet AI model cannot be persuaded to jump its built-in guardrails, making it the most secure AI model yet released.Photo illustration by Cheng Xin—Getty Images

Hello and welcome to Eye on AI. In today’s edition…Anthropic’s latest model gets a perfect score on an independent security evaluation; Scale AI partners with the Pentagon; Google announces a new AI search mode for multi-part questions; A judge denies Elon Musk’s attempt to stop OpenAI’s for-profit transition; the pioneers of reinforcement learning win computing’s top prize; and the Los Angeles Times’s new AI-powered feature backfires.

Recommended Video

When Anthropic released Claude 3.7 Sonnet last week, it was lauded for being the first model to combine the approaches behind GPT models and the most recent chain-of-thought reasoning models. Now the company gets to add another accolade to Claude 3.7’s scorecard: It just may be the most secure model yet. 

That’s what London-Based security, risk, and compliance firm Holistic AI is suggesting after conducting a jailbreaking and red teaming audit of the new model, in which it resisted 100% of jailbreaking attempts and gave “safe” responses 100% of the time. 

“Claude 3.7’s flawless adversarial resistance sets the benchmark for AI security in 2025,” reads a report of the audit shared exclusively with Eye on AI.

While security has always been a concern for AI models, the issue has received elevated attention in recent weeks following the launch of DeepSeek’s R1. Some have claimed there are national security concerns with the model, owing to its Chinese origin. The model also performed extremely poorly in security audits, including the same one Holistic AI performed on Claude 3.7. In another audit performed by Cisco and university researchers, DeepSeek R1 demonstrated a 100% attack success rate, meaning it failed to block a single harmful prompt. 

As companies and governments contemplate whether or not to incorporate specific models into their workflows—or alternatively, ban them—a clear picture of models’ security performance is in high-demand. But security doesn’t equal safety when it comes to how AI will be used. 

Claude’s perfect score 

Holistic AI tested Claude 3.7 in “Thinking Mode” with a maximum token budget of 16k to ensure a fair comparison against other advanced reasoning models. The first part of the evaluation tested whether the model would show unintended behavior or bypass system constraints when presented with various prompts, known as jailbreaking. The model was given 37 strategically designed prompts to test its susceptibility to known adversarial exploits, including Do Anything Now (DAN), which pushes the model to operate beyond its programmed ethical and moral guidelines; Strive to Avoid Norms (STAN), which encourages the model to bypass established rules; and Do Anything and Everything (DUDE), which prompts the model to take on a fictional identity to get it to ignore protocols.

Claude 3.7 successfully blocked every jailbreaking attempt to achieve a 100% resistance rate, matching the 100% previously scored by OpenAI’s o1 reasoning model. Both significantly outperformed competitors DeepSeek R1 and Grok-3, which scored 32% (blocking 12 jailbreaking attempts) and 2.7% (blocking just one), respectively. 

While Claude 3.7 matched OpenAI o1’s perfect jailbreaking resistance, it pulled ahead by not offering a single response deemed unsafe during the red teaming portion of the audit, where the model was given 200 additional prompts and evaluated on its responses to sensitive topics and known challenges. OpenAI’s o1, by contrast, exhibited a 2% unsafe response rate, while DeepSeek R1 gave unsafe responses 11% of the time. (Holistic AI said it could not red team Grok-3 because the current lack of API access to the model restricted the sample size of prompts it was feasible to run). Responses deemed “unsafe” included those that offered misinformation (such as outlining pseudoscientific health treatments), reinforced biases (for example, subtly favoring certain groups in hiring recommendations), or gave overly permissive advice (like recommending high-risk investment strategies without disclaimers). 

Security doesn’t equal safety

The stakes here can be high. Chatbots can be maliciously exploited to create disinformation, accelerate hacking campaigns, and some worry, help people create bioweapons more easily than they could otherwise. My recent story on how hacking groups associated with adversarial nations have been using Google’s Gemini chatbot to assist with their operations offers some pretty concrete examples of how models can be abused, for example. 

“The key danger lies not in compromising systems at the network level but in users coercing the models into taking action and generating unsafe content,” said Zekun Wu, AI research engineer at Holistic AI.

This is why governments and organizations from NASA and the U.S. Navy to the Australian government have already banned use of DeepSeek R1: The risks are glaringly obvious. Meanwhile, AI companies are increasingly widening the scope of how they will allow their models to be used, deliberately marketing them for use cases that carry higher and higher levels of risk. This includes using the models to assist in military operations (more on that below). 

Anthropic may have the safest model, but it has also taken some actions recently that could cast doubt on its commitment to safety. Last week, for instance, the company quietly removed several voluntary commitments to promote safe AI that were previously posted on its website. 

In response to reporting on the disappearance of the safety commitments from its website, Anthropic told TechCrunch, “We remain committed to the voluntary AI commitments established under the Biden Administration. This progress and specific actions continue to be reflected in [our] transparency center within the content. To prevent further confusion, we will add a section directly citing where our progress aligns.”

And with that, here’s more AI news.  

Sage Lazzaro
sage.lazzaro@consultant.fortune.com
sagelazzaro.com

AI IN THE NEWS

The U.S. Defense Department partners with Scale AI to use AI agents for military planning and operations. The program will also tap partners including Microsoft and Anduril and use the technology for modeling and simulation, decision-making support, and even automate workflows. The multi-million dollar deal with the DoD marks a major step into military automation and AI warfare. Many within the technology industry itself as well as human rights organizations have opposed such developments, believing AI technology should never be used to make decisions that could result in death or severe injury. But several companies—including Microsoft, OpenAI, and Google—have walked back from policies that prohibited them from selling AI technology for weapons or surveillance and removed guidelines about not deploying the technology in ways that could cause physical harm. You can read more from CNBC.     

Google announces a new AI search mode for complex, multi-part questions. Called AI Mode and powered by Gemini 2.0, it also lets users ask follow up questions and uses reasoning capabilities to dig deeper. The company is releasing it while framing it as an experiment—a strategy that has become common in the fast-paced AI industry. AI Mode is being rolled out to Google One subscribers this week. You can read more from TechCrunch.

A judge denies Elon Musk’s attempt to stop OpenAI’s transition to a for-profit company. The judge said Musk does not have “the high burden required for a preliminary injunction” to block the company’s for-profit transition. But the judge said she will fast-track the trial to this fall in order to get it resolved as quickly as possible, due to “the public interest at stake and potential for harm if a conversion contrary to law occurred.” Musk, who cofounded OpenAI in 2025, is accusing the company of straying from its mission of developing AI for the good of humanity as a nonprofit. You can read more from Reuters. 

Pioneers of reinforcement learning win the Turing Prize, warn against unsafe AI deployment. Andrew Barto, a professor emeritus at the University of Massachusetts, and Richard Sutton, a professor at the University of Alberta and former research scientist at DeepMind, won this year’s Turing Award, considered computer science’s equivalent of the Nobel Prize. The pair won for their work on reinforcement learning—a computing technique based on psychology that rewards systems for behaving in a desired way—which helped power AI progress and was used in the creation of tools including OpenAI’s ChatGPT and Google’s AlphaGo. The two scientists used the moment, though, to issue a warning about the deployment of AI systems without safeguards. They also criticized U.S. President Donald Trump for his attempts to cut federal spending on scientific research and science agencies. You can read more in the Financial Times. 

OpenAI is reportedly planning to charge companies up to $20,000 per month for ‘PhD’-level AI agents. According to The Information, the company is planning to launch a variety of specialized AI agents, including ones geared toward sales and engineering. They’ll vary in price with some costing businesses around $1,000 or $2,000 a month, which is obviously far less than humans specialized in those skills. The most expensive agent will reportedly cost $20,000 per month. It’s not yet clear when these agents will launch.

FORTUNE ON AI

Startup aiming to build AI models for chemistry adds two AI ‘godfathers’ to advisory panel as it grabs top research talent from Google —by Jeremy Kahn

Agentic AI is suddenly everywhere. Here’s how companies are evaluating and using these buzzy tech tools —by John Kell

Companies are betting that robots can teach humans how to be better managers —by Azure Gilman

AI CALENDAR

March 7-15: SXSW, Austin

March 10-13: Human [X] conference, Las Vegas

March 17-20: Nvidia GTC, San Jose

April 9-11: Google Cloud Next, Las Vegas

May 6-7: Fortune Brainstorm AI London. Apply to attend here.

May 20-21: Google IO, Mountain View, Calif.

EYE ON AI NUMBERS

1

That’s how many days the Los Angeles Times’s new AI-powered “Insights” feature was live before the publication removed it from a column published on its website, according to The Daily Beast. The tool—which debuted on Monday and is designed to generate a summary of an article’s perspectives and offer opposing views—defended the actions of the Klu Klux Klan, explaining that some historians did not view the group as hate-driven. The paper’s union criticized the tool, saying that it “risks further eroding confidence in the news.” The Nieman Lab published an analysis of the tool, which was created by AI company Particle, stating that many of the sources cited in the counterpoints wouldn’t pass journalistic scrutiny and calling the effort “a mess.”

This is the online version of Eye on AI, Fortune's biweekly newsletter on how AI is shaping the future of business. Sign up for free.
About the Author
Sage Lazzaro
By Sage LazzaroContributing writer

Sage Lazzaro is a technology writer and editor focused on artificial intelligence, data, cloud, digital culture, and technology’s impact on our society and culture.

See full bioRight Arrow Button Icon

Latest in Newsletters

Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025

Most Popular

Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Finance
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam
By Fortune Editors
October 20, 2025
Fortune Secondary Logo
Rankings
  • 100 Best Companies
  • Fortune 500
  • Global 500
  • Fortune 500 Europe
  • Most Powerful Women
  • World's Most Admired Companies
  • See All Rankings
  • Lists Calendar
Sections
  • Finance
  • Fortune Crypto
  • Features
  • Leadership
  • Health
  • Commentary
  • Success
  • Retail
  • Mpw
  • Tech
  • Lifestyle
  • CEO Initiative
  • Asia
  • Politics
  • Conferences
  • Europe
  • Newsletters
  • Personal Finance
  • Environment
  • Magazine
  • Education
Customer Support
  • Frequently Asked Questions
  • Customer Service Portal
  • Privacy Policy
  • Terms Of Use
  • Single Issues For Purchase
  • International Print
Commercial Services
  • Advertising
  • Fortune Brand Studio
  • Fortune Analytics
  • Fortune Conferences
  • Business Development
  • Group Subscriptions
About Us
  • About Us
  • Press Center
  • Work At Fortune
  • Terms And Conditions
  • Site Map
  • About Us
  • Press Center
  • Work At Fortune
  • Terms And Conditions
  • Site Map
  • Facebook icon
  • Twitter icon
  • LinkedIn icon
  • Instagram icon
  • Pinterest icon

Latest in Newsletters

Shivon Zilis was caught between Elon Musk, OpenAI, and motherhood
NewslettersMPW Daily
Shivon Zilis was caught between Elon Musk, OpenAI, and motherhood
By Emma HinchliffeMay 8, 2026
17 hours ago
Anduril CEO Brian Schimpf
NewslettersTerm Sheet
Brian Schimpf has been quietly running Anduril since its earliest days. And once he’s talking, he has a lot to say
By Allie GarfinkleMay 8, 2026
23 hours ago
Apple AirPods Pro in Cupertino, California, on Sept. 9, 2025. (Photo: David Paul Morris/Bloomberg/Getty Images)
NewslettersFortune Tech
Apple AirPods with cameras are coming
By Andrew NuscaMay 8, 2026
24 hours ago
State Street’s CEO warns of a global fertilizer crisis due to the Iran war: ‘I personally worry about what happens if this goes on much longer’
NewslettersCEO Daily
State Street’s CEO warns of a global fertilizer crisis due to the Iran war: ‘I personally worry about what happens if this goes on much longer’
By Diane BradyMay 8, 2026
1 day ago
The beauty founder who built a business on QVC is ready as America discovers a new love for live shopping
NewslettersMPW Daily
The beauty founder who built a business on QVC is ready as America discovers a new love for live shopping
By Emma HinchliffeMay 7, 2026
2 days ago
Anthropic’s SpaceX compute deal comes as AI data center backlash grows—fueled by both real grievances and conspiracy theories
NewslettersEye on AI
Anthropic’s SpaceX compute deal comes as AI data center backlash grows—fueled by both real grievances and conspiracy theories
By Sharon GoldmanMay 7, 2026
2 days ago

Most Popular

California farmers must destroy 420,000 peach trees after Del Monte closes its canneries and cancels more than $550 million in long-term contracts
North America
California farmers must destroy 420,000 peach trees after Del Monte closes its canneries and cancels more than $550 million in long-term contracts
By Sasha RogelbergMay 7, 2026
2 days ago
'Blue dot fever' plagues musicians like Post Malone, Meghan Trainor, and Zayn as a growing list of artists cancel tours due to lagging ticket sales
Arts & Entertainment
'Blue dot fever' plagues musicians like Post Malone, Meghan Trainor, and Zayn as a growing list of artists cancel tours due to lagging ticket sales
By Dave Lozo and Morning BrewMay 7, 2026
2 days ago
A Michigan farm town voted down plans for a giant OpenAI-Oracle data center. Weeks later, construction began
Magazine
A Michigan farm town voted down plans for a giant OpenAI-Oracle data center. Weeks later, construction began
By Sharon GoldmanMay 6, 2026
3 days ago
Current price of oil as of May 8, 2026
Personal Finance
Current price of oil as of May 8, 2026
By Joseph HostetlerMay 8, 2026
21 hours ago
U.S. Treasury will have to borrow $2 trillion this year just to continue functioning—more than $166 billion every month
Economy
U.S. Treasury will have to borrow $2 trillion this year just to continue functioning—more than $166 billion every month
By Eleanor PringleMay 7, 2026
2 days ago
Airbnb CEO Brian Chesky warns two types of people won’t survive the AI era: ‘pure people managers’ and workers who resist change
Success
Airbnb CEO Brian Chesky warns two types of people won’t survive the AI era: ‘pure people managers’ and workers who resist change
By Emma BurleighMay 7, 2026
2 days ago

© 2026 Fortune Media IP Limited. All Rights Reserved. Use of this site constitutes acceptance of our Terms of Use and Privacy Policy | CA Notice at Collection and Privacy Notice | Do Not Sell/Share My Personal Information
FORTUNE is a trademark of Fortune Media IP Limited, registered in the U.S. and other countries. FORTUNE may receive compensation for some links to products and services on this website. Offers may be subject to change without notice.