Newsletters: Eye on AI

Cerebras hopes planned IPO will supercharge its race against Nvidia and fellow chip startups for the fastest generative AI

By Sharon Goldman, AI Reporter
October 1, 2024, 3:13 PM ET
Andrew Feldman, CEO of Cerebras Systems. Photo: Ramsey Cardy/Sportsfile for Collision via Getty Images
Hello and welcome to Eye on AI! In this edition: Governor Newsom vetoes SB 1047; ByteDance plans new AI model based on Huawei chips; Microsoft announces AI models will improve Windows search; and the U.S. Commerce Department sets a new rule that eases restrictions on AI chip shipments to the Middle East.

Cerebras has a need for speed. In a bid to take on Nvidia, the AI chip startup is rapidly moving toward an IPO after announcing its filing for one yesterday. At the same time, the company is in a fierce race with fellow AI chip startups Groq and SambaNova for the title of ‘fastest generative AI.’ All three are pushing the boundaries of their highly specialized hardware and software so that AI models can generate responses at speeds that outperform even Nvidia GPUs.

Here’s what that means: When you ask an AI assistant a question, it must sift through all of the knowledge in its AI model to quickly come up with an answer. In industry parlance, that process is known as “inference.” But large language models don’t sift through words during the inference process. When you ask a question or give a chatbot a prompt, the AI breaks that into smaller pieces called “tokens”—which could represent a word, or a chunk of a word—to process its answer and respond. 
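To make the idea of tokens concrete, here is a toy sketch in Python. It is not how any real model tokenizes text: production tokenizers learn their vocabularies from data, while the hypothetical `toy_tokenize` function below simply cuts words into pieces of at most four characters (an arbitrary assumption) to show how a prompt becomes a sequence of word-sized and sub-word-sized chunks.

```python
import re


def toy_tokenize(text, max_piece=4):
    """Split text into pieces of at most `max_piece` characters.

    Hypothetical stand-in for a real tokenizer, which learns its
    vocabulary from data rather than cutting at fixed lengths.
    """
    tokens = []
    # Pull out runs of letters/digits and individual punctuation marks.
    for chunk in re.findall(r"\w+|[^\w\s]", text):
        # Long words are broken into several smaller pieces.
        for start in range(0, len(chunk), max_piece):
            tokens.append(chunk[start:start + max_piece])
    return tokens


prompt = "Why is inference speed important?"
tokens = toy_tokenize(prompt)
print(tokens)
print(len(tokens), "tokens")
```

Short words such as "Why" stay whole, while longer ones such as "inference" are split into several pieces, which is why a token count usually runs higher than a word count.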

Pushing for faster and faster output

So what does “ultra-fast” inference mean? If you’ve tried chatbots like OpenAI’s ChatGPT, Anthropic’s Claude, or Google’s Gemini, you probably think the output of your prompts arrives at a perfectly reasonable pace. In fact, you may be impressed by how quickly it spits out answers to your queries. But in February 2024, demos of a Groq chatbot based on a Mistral model produced answers far faster than people could read. It went viral. The setup served up 500 tokens per second to produce answers that were nearly instantaneous. By April, Groq delivered an even speedier 800 tokens per second, and by May SambaNova boasted it had broken the 1,000 tokens per second barrier. 

Today, Cerebras, SambaNova, and Groq are all delivering over 1,000 tokens per second, and the “token wars” have revved up considerably. At the end of August, Cerebras claimed it had launched the “world’s fastest AI inference” at 1,800 tokens per second, and last week Cerebras said it had beaten that record and become the “first hardware of any kind” to exceed 2,000 tokens per second on one of Meta’s Llama models. 
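For a rough sense of what those numbers mean in practice, the short Python sketch below converts each quoted rate into the time needed to produce a single answer. The 500-token answer length and the reading pace of about 5 tokens per second are illustrative assumptions, not figures from the companies, and the rates are the round thresholds quoted above.

```python
def seconds_to_generate(num_tokens, tokens_per_second):
    """Time to produce `num_tokens` at a steady output rate."""
    return num_tokens / tokens_per_second


ANSWER_TOKENS = 500         # assumption: length of one longish chatbot answer
READING_TOKENS_PER_SEC = 5  # assumption: rough pace of a human reader

# Output rates quoted in the article, in tokens per second.
quoted_rates = {
    "Groq, February demo": 500,
    "Groq, April": 800,
    "SambaNova, May": 1000,
    "Cerebras, late August": 1800,
    "Cerebras, last week": 2000,
}

for label, rate in quoted_rates.items():
    print(f"{label}: {seconds_to_generate(ANSWER_TOKENS, rate):.2f} s")

reading_time = seconds_to_generate(ANSWER_TOKENS, READING_TOKENS_PER_SEC)
print(f"Time to read the same answer: {reading_time:.0f} s")
```

Under these assumptions, every rate on the list finishes the answer in a second or less, while a person would need well over a minute to read it.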

When will fast be fast enough?

This led me to ask: Why would anyone need generative AI output to be that fast? When will fast be fast enough?

According to Cerebras CEO Andrew Feldman, generative AI speed is essential since search results will increasingly be powered by generative AI, as well as new capabilities like streaming video. Those are two areas where latency, or the delay between an action and a response, is particularly annoying. 

“Nobody’s going to build a business on an application that makes you sit around and wait,” he told Fortune. 

In addition, AI models are quickly being used to power far more complex applications than just chat. One rapidly growing area of interest is developing application workflows based on AI agents, in which a user asks a question or prompts an action that doesn’t simply involve one query to one model. Instead it leads to multiple queries to multiple models that can go off and do things like search the web or a database. 

“Then the performance really matters,” said Feldman, explaining that an output speed that feels tolerable for a single query today could quickly become painfully slow when many queries are chained together.
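The effect is easy to see with some back-of-the-envelope arithmetic, sketched here in Python. All of the numbers are hypothetical: the sketch assumes an agent workflow of 10 steps that run one after another, 400 generated tokens per step, and a baseline of 50 tokens per second, and it ignores network and tool-call overhead.

```python
def workflow_seconds(steps, tokens_per_step, tokens_per_second):
    """Total generation time when each step must wait for the previous one."""
    return steps * tokens_per_step / tokens_per_second


TOKENS_PER_STEP = 400  # assumption: tokens generated by each model call
SLOW_RATE = 50         # assumption: tokens per second for a slower service
FAST_RATE = 2000       # the 2,000 tokens-per-second mark cited in the article

single_chat = workflow_seconds(1, TOKENS_PER_STEP, SLOW_RATE)
agent_slow = workflow_seconds(10, TOKENS_PER_STEP, SLOW_RATE)
agent_fast = workflow_seconds(10, TOKENS_PER_STEP, FAST_RATE)

print(f"One chat reply at {SLOW_RATE} tokens/s: {single_chat:.0f} s")
print(f"10-step agent at {SLOW_RATE} tokens/s: {agent_slow:.0f} s")
print(f"10-step agent at {FAST_RATE} tokens/s: {agent_fast:.0f} s")
```

With these made-up figures, a wait of 8 seconds for one reply stretches to 80 seconds once ten calls are chained, and drops to 2 seconds at the faster rate.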

Unlocking AI potential with speed

The bottom line is that speed matters because faster inference unlocks greater potential in applications built with AI, Mark Heaps, chief technology evangelist at Groq, told Fortune. That is especially true for data-heavy applications in fields like financial trading, traffic monitoring, and cybersecurity: “You need insights in real time, a form of instant intelligence that keeps up with the moment,” he said. “The race to increase speed…will provide better quality, accuracy, and potential for greater ROI.” 

It’s worth noting, he pointed out, that AI models still have nowhere near as many neural connections as the human brain. “As the models get more advanced, bigger, or layered with lots of agents using smaller models, it will require more speed to keep the application useful,” he explained, adding that this has been an issue throughout history. “Why do we need cars to get beyond 50 mph? Was it so we could go fast? Or producing an engine that could do 100 mph enabled the ability to carry more weight at 50 mph?” 

Rodrigo Liang, CEO and cofounder of SambaNova, agreed. Inference speed, he told Fortune, “is where the rubber hits the road: where all the training, the building of models, gets put to work to deliver real business value.” That’s particularly true now that the AI industry is shifting more of its focus from training AI models to putting them into production. “The world is looking for the most efficient way to produce tokens so you can support an ever-growing number of users,” he said. “Speed allows you to service many customers concurrently.”

Sharon Goldman
sharon.goldman@fortune.com

AI IN THE NEWS

Governor Newsom vetoes California’s SB-1047. On Sunday, news spread quickly through Silicon Valley that Governor Newsom had vetoed SB-1047, a widely debated and ambitious AI regulatory proposal. The bill, if enacted, would have required developers to conduct safety testing on large AI models before public release, the New York Times reported. Critics, however, raised concerns over provisions granting the state’s attorney general the authority to sue companies for harm caused by their technologies. The bill also mandated a “kill switch” to shut down AI systems in the event of potential threats like biowarfare, mass casualties, or significant property damage. “I do not believe this is the best approach to protecting the public from real threats posed by the technology,” Newsom said in a statement. “Instead, the bill applies stringent standards to even the most basic functions—so long as a large system deploys it.”

Sources say ByteDance plans new AI model trained with Huawei chips. Reuters reported that TikTok's Chinese parent ByteDance plans to develop an AI model trained primarily with chips from China’s Huawei Technologies. It's a response to U.S. moves since 2022 to restrict exports of advanced AI chips, particularly from market leader Nvidia. According to the report's sources, ByteDance's next step in the AI race is to use Huawei's Ascend 910B chip to train a large language model, but ByteDance denied that a new model is being developed.

Microsoft announces AI models will improve Windows search on Copilot Plus PCs. Microsoft said today its new Copilot Plus PCs will use AI models to improve Windows search, available starting in November, including a new Click to Do feature that is similar to Google’s Circle to Search function. “AI-powered search makes it dramatically easier to find virtually anything,” said Yusuf Mehdi, executive vice president and consumer chief marketing officer at Microsoft, as reported by the Verge. “You no longer need to remember file names and document locations, nor even specific names of words. Windows will better understand your intent and match the right document, image, file, or email.”

U.S. Commerce Department sets new rule that eases restrictions on AI chip shipments to Middle East. According to Reuters, yesterday the U.S. Commerce Department unveiled a rule that could ease shipments of AI chips like those from Nvidia to Middle East data centers. Since October 2023, U.S. exporters have been required to obtain licenses before shipping advanced chips to parts of the Middle East and Central Asia. But now, data centers will be able to apply for status that will allow them to receive chips, rather than requiring their suppliers to obtain individual licenses to ship to them.

FORTUNE ON AI

Before Mira Murati’s surprise exit from OpenAI, staff grumbled its o1 model had been released prematurely—by Jeremy Kahn, Kali Hays and Sharon Goldman

Why investors want startup founders to own equity—including OpenAI’s Sam Altman—by Sharon Goldman, Kali Hays and Verne Kopytoff

Nvidia shares fall and its Chinese rivals soar after Beijing urges AI companies to look elsewhere for chips—by David Meyer

Mark Cuban warns the U.S. must win the AI race ‘or we lose everything’—by Jason Ma

AI CALENDAR

Oct. 22-23: TedAI, San Francisco

Oct. 28-30: Voice & AI, Arlington, Va.

Nov. 19-22: Microsoft Ignite, Chicago

Dec. 2-6: AWS re:Invent, Las Vegas

Dec. 8-12: Neural Information Processing Systems (NeurIPS) 2024, Vancouver, British Columbia

Dec. 9-10: Fortune Brainstorm AI, San Francisco

EYE ON AI RESEARCH

Could generative AI chatbots help reduce belief in conspiracy theories? New research published in Science by Thomas Costello of American University and Gordon Pennycook of Cornell found that discussions with AI chatbots could reduce individuals’ beliefs in conspiracy theories. Human participants described a conspiracy theory that they subscribed to, and OpenAI’s GPT-4 Turbo then engaged them in a back-and-forth conversation, offering persuasive, evidence-based arguments that refuted their beliefs. According to the research, “the AI chatbot’s ability to sustain tailored counterarguments and personalized in-depth conversations reduced their beliefs in conspiracies for months, challenging research suggesting that such beliefs are impervious to change.”

BRAIN FOOD

Want a glimpse of your future self using generative AI? If you’ve ever wanted to receive a visit from your future self like in Back to the Future, you may be interested in new research from MIT that created a chatbot for users to have a conversation with an “AI-generated simulation of their potential future self.” The tool, called “Future You,” uses a large language model and information provided by the user to help young people “improve their sense of future self-continuity, a psychological concept that describes how connected a person feels with their future self.” What if the Future You tool offers negative predictions, causing young people to freak out? The researchers explained that the tool cautions users that its results are only one potential version of their future self, and they can still change their lives. “This is not a prophecy, but rather a possibility,” the lead researcher said.

This is the online version of Eye on AI, Fortune's biweekly newsletter on how AI is shaping the future of business. Sign up for free.
About the Author

Sharon Goldman is an AI reporter at Fortune and co-authors Eye on AI, Fortune’s flagship AI newsletter. She has written about digital and enterprise tech for over a decade.
