
Who Needs Big AI Models?
The AI world continues to evolve rapidly, especially since the introduction of DeepSeek and its followers. Many have concluded that enterprises don't really need the large, expensive AI models touted by OpenAI, Meta, and Google, and are focusing instead on smaller models, such as DeepSeek V2-Lite with 2.4B active parameters, or Llama 4 Scout and Maverick with 17B active parameters, which can provide decent accuracy at a lower cost. It turns out that this is not the case for coders, or more accurately, for the models that can and will replace many coders. Nor does the smaller-is-better mantra apply to reasoning or agentic AI, the next big thing.
AI code generators require large models with a wide context window, one capable of accommodating approximately 100,000 lines of code. The mixture-of-experts (MoE) models supporting agentic and reasoning AI are also large. But these massive models are typically quite expensive, costing around $10 to $15 per million output tokens on modern GPUs. Therein lies an opportunity for novel AI architectures to encroach on GPUs' territory.
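To see why code generation pushes context windows so hard, a back-of-envelope sketch helps. The tokens-per-line figure below is an illustrative assumption (real code averages very roughly 8-15 tokens per line, depending on language and tokenizer):

```python
# Back-of-envelope sketch: how large a context window does a
# 100,000-line codebase need? The tokens-per-line figure is an
# assumption, not a measured value.

def required_context_tokens(lines_of_code: int, tokens_per_line: float = 10.0) -> int:
    """Rough token budget for holding a codebase in context."""
    return int(lines_of_code * tokens_per_line)

budget = required_context_tokens(100_000)  # ~1,000,000 tokens at 10 tokens/line
# By the same rough arithmetic, a 131K-token window holds on the
# order of ~13,000 lines of code, i.e., tens of thousands of lines.
fits_131k = required_context_tokens(13_000) <= 131_000
```

At these assumed rates, even a 131K window holds only a slice of a very large repository, which is why context length has become a headline specification for coding models.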
Cerebras Systems Launches Big AI with Qwen3-235B
Cerebras Systems (a client of Cambrian-AI Research) has announced support for the large Qwen3-235B with a 131K-token context length (about 200-300 pages of text), four times what it previously offered. At the RAISE Summit in Paris, Cerebras touted Alibaba's Qwen3-235B, which uses a highly efficient mixture-of-experts architecture to deliver exceptional compute efficiency. But the real news is that Cerebras can run the model at only $0.60 per million input or output tokens, less than one-tenth the cost of comparable closed-source models. While many consider the Cerebras wafer-scale engine expensive, this data turns that perception on its head.
Agents are a use case that frequently requires very large models.
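The arithmetic behind that cost comparison can be sketched as follows. The per-million-token prices come from the figures above; the monthly workload size is a hypothetical assumption:

```python
# Sketch of the cost comparison in the text. GPU pricing of
# $10-$15 per million output tokens and the $0.60 Cerebras figure
# are from the article; the workload is an illustrative assumption.

def token_cost(tokens: int, usd_per_million: float) -> float:
    """Cost in USD for a given token count at a per-million-token rate."""
    return tokens / 1_000_000 * usd_per_million

workload = 50_000_000                        # hypothetical: 50M output tokens/month
gpu_cost = token_cost(workload, 12.50)       # midpoint of the $10-$15 range
cerebras_cost = token_cost(workload, 0.60)
savings = 1 - cerebras_cost / gpu_cost       # ~95% cheaper at the midpoint
```

At the low end of the GPU range ($10 per million), the same arithmetic yields 94% savings, consistent with the roughly 92% figure cited below.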
One question I frequently get is: if Cerebras is so fast, why doesn't it have more customers? One reason is that it had not supported large context windows and larger models. Those seeking to develop code, for example, do not want to break the problem into smaller fragments to fit, say, a 32K-token context. Now, that barrier to sales has evaporated.
"We're seeing huge demand from developers for frontier models with long context, especially for code generation," said Cerebras Systems CEO and Founder Andrew Feldman. "Qwen3-235B on Cerebras is our first model that stands toe-to-toe with frontier models like Claude 4 and DeepSeek R1. And with full 131K context, developers can now use Cerebras on production-grade coding applications and get answers back in less than a second instead of waiting for minutes on GPUs."
Cerebras is not just 30 times faster, it is 92% cheaper than GPUs.
Cerebras has quadrupled its context length support from 32K to 131K tokens—the maximum supported by Qwen3-235B. This expansion directly impacts the model's ability to reason over large codebases and complex documentation. While 32K context is sufficient for simple code generation use cases, 131K context enables the model to process dozens of files and tens of thousands of lines of code simultaneously, allowing for production-grade application development.
Cerebras is 15-100 times more affordable than GPUs when running Qwen3-235B
Qwen3-235B excels at tasks requiring deep logical reasoning, advanced mathematics, and code generation, thanks to its ability to switch between "thinking mode" (for high-complexity tasks) and "non-thinking mode" (for efficient, general-purpose dialogue). The 131K context length allows the model to ingest and reason over large codebases (tens of thousands of lines), supporting tasks such as code refactoring, documentation, and bug detection.
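That mode switch can be pictured as a simple dispatcher. The task labels and the routing rule below are illustrative assumptions for exposition, not Qwen3's actual API:

```python
# Hypothetical sketch of the mode switch described in the text:
# route high-complexity tasks to "thinking" mode and routine
# dialogue to "non-thinking" mode. The task taxonomy and routing
# rule are illustrative assumptions.

COMPLEX_TASKS = {"code_refactoring", "bug_detection", "advanced_math"}

def select_mode(task_type: str) -> str:
    """Pick the inference mode for a request based on task complexity."""
    return "thinking" if task_type in COMPLEX_TASKS else "non-thinking"

select_mode("bug_detection")  # "thinking": worth the extra reasoning tokens
select_mode("chitchat")       # "non-thinking": fast, cheap dialogue
```

The design point is that thinking mode spends extra tokens (and latency) on intermediate reasoning, so routing only complex tasks there keeps everyday dialogue fast and inexpensive.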
Cerebras also announced the further expansion of its ecosystem, with support from Amazon AWS, as well as DataRobot, Docker, Cline, and Notion. The addition of AWS is huge.
Cerebras has added AWS to its cloud portfolio.
Where is this heading?
Big AI has been continually downsized and optimized, yielding orders-of-magnitude gains in performance and reductions in model size and price. This trend will undoubtedly continue, but it will be constantly offset by increases in capability, accuracy, intelligence, and entirely new features across modalities. So, if you want last year's AI, you're in great shape, as it continues to get cheaper.
But if you want the latest features and functions, you will require the largest models and the longest input context length.
It's the Yin and Yang of AI.