
Latest news with #Qwen3-235B

Who Needs Big AI Models?

Forbes

08-07-2025

  • Business
  • Forbes

Who Needs Big AI Models?

Cerebras Systems CEO and Founder Andrew Feldman

The AI world continues to evolve rapidly, especially since the introduction of DeepSeek and its followers. Many have concluded that enterprises don't really need the large, expensive AI models touted by OpenAI, Meta, and Google, and are focusing instead on smaller models, such as DeepSeek V2-Lite with 2.4B active parameters, or Llama 4 Scout and Maverick with 17B active parameters, which can provide decent accuracy at a lower cost.

It turns out that this is not the case for coders, or more accurately, for the models that can and will replace many coders. Nor does the smaller-is-better mantra apply to reasoning or agentic AI, the next big thing. AI code generators require large models with wide context windows, capable of accommodating tens of thousands of lines of code. Mixture-of-experts (MoE) models supporting agentic and reasoning AI are also large. But these massive models are typically quite expensive, costing around $10 to $15 per million output tokens on modern GPUs. Therein lies an opportunity for novel AI architectures to encroach on GPUs' territory.

Cerebras Systems Launches Big AI with Qwen3-235B

Cerebras Systems (a client of Cambrian-AI Research) has announced support for the large Qwen3-235B with a 131K context length (about 200-300 pages of text), four times what it previously offered. At the RAISE Summit in Paris, Cerebras touted Alibaba's Qwen3-235B, which uses a highly efficient mixture-of-experts architecture to deliver exceptional compute efficiency. But the real news is that Cerebras can run the model at only $0.60 per million input tokens and $1.20 per million output tokens, less than one-tenth the cost of comparable closed-source models. While many consider the Cerebras wafer-scale engine expensive, this data turns that perception on its head.

Agents are another use case that frequently requires very large models. One question I frequently get is: if Cerebras is so fast, why doesn't it have more customers? One reason is that it has not supported large context windows and larger models. Those seeking to develop code, for example, do not want to break a problem into smaller fragments to fit, say, a 32K-token context. Now, that barrier to sales has evaporated.

"We're seeing huge demand from developers for frontier models with long context, especially for code generation," said Cerebras Systems CEO and Founder Andrew Feldman. "Qwen3-235B on Cerebras is our first model that stands toe-to-toe with frontier models like Claude 4 and DeepSeek R1. And with full 131K context, developers can now use Cerebras on production-grade coding applications and get answers back in less than a second instead of waiting for minutes on GPUs."

Cerebras is not just 30 times faster; it is 92% cheaper than GPUs. Cerebras has quadrupled its context length support from 32K to 131K tokens, the maximum supported by Qwen3-235B. This expansion directly impacts the model's ability to reason over large codebases and complex documentation. While a 32K context is sufficient for simple code generation use cases, a 131K context enables the model to process dozens of files and tens of thousands of lines of code simultaneously, allowing for production-grade application development.
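As a quick sanity check on those context-window figures, here is a back-of-the-envelope sketch. The tokens-per-line and tokens-per-page ratios are illustrative assumptions, not figures from Cerebras or Alibaba:

```python
# Back-of-the-envelope check of the 131K context-window claims above.
# Both density ratios below are rough, assumed averages.

CONTEXT_TOKENS = 131_072          # the "131K" window

TOKENS_PER_LINE_OF_CODE = 7       # assumed average for source code
TOKENS_PER_PAGE_OF_TEXT = 500     # assumed average for English prose

max_lines = CONTEXT_TOKENS // TOKENS_PER_LINE_OF_CODE
max_pages = CONTEXT_TOKENS // TOKENS_PER_PAGE_OF_TEXT

print(f"~{max_lines:,} lines of code per window")   # ~18,724
print(f"~{max_pages:,} pages of text per window")   # ~262
```

Under these assumptions, one window holds roughly 19,000 lines of code and about 260 pages of text, consistent with the "tens of thousands of lines" and "200-300 pages" figures quoted above.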
Cerebras is 15-100 times more affordable than GPUs when running Qwen3-235B.

Qwen3-235B excels at tasks requiring deep logical reasoning, advanced mathematics, and code generation, thanks to its ability to switch between a "thinking mode" (for high-complexity tasks) and a "non-thinking mode" (for efficient, general-purpose dialogue). The 131K context length allows the model to ingest and reason over large codebases (tens of thousands of lines), supporting tasks such as code refactoring, documentation, and bug detection.

Cerebras also announced the further expansion of its ecosystem, with support from Amazon AWS, as well as DataRobot, Docker, Cline, and Notion. The addition of AWS to its cloud portfolio is huge.

Where is this heading? Big AI has been continually downsized and optimized, with orders-of-magnitude gains in performance, model size, and price. This trend will undoubtedly continue, but it will be constantly offset by increases in capabilities, accuracy, intelligence, and entirely new features across modalities. So, if you want last year's AI, you're in great shape, as it continues to get cheaper. But if you want the latest features and functions, you will need the largest models and the longest input context length. It's the yin and yang of AI.
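To make the affordability claim concrete, here is a minimal cost sketch using the per-token prices quoted in these articles. The 70/30 input/output split and the GPU input-token prices are illustrative assumptions:

```python
# Compare the cost of a one-million-token job at the prices quoted above:
# Cerebras at $0.60/M input and $1.20/M output tokens, vs. GPU-hosted
# frontier models at roughly $10-15/M output tokens (input prices assumed).

def job_cost(input_tokens, output_tokens, in_price, out_price):
    """Total dollars, with prices given per million tokens."""
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

input_toks, output_toks = 700_000, 300_000   # assumed 70/30 split

cerebras = job_cost(input_toks, output_toks, 0.60, 1.20)
gpu_low  = job_cost(input_toks, output_toks, 3.00, 10.00)  # assumed GPU pricing
gpu_high = job_cost(input_toks, output_toks, 5.00, 15.00)  # assumed GPU pricing

print(f"Cerebras:   ${cerebras:.2f}")                      # $0.78
print(f"GPU-hosted: ${gpu_low:.2f} to ${gpu_high:.2f}")    # $5.10 to $8.00
print(f"Savings:    {1 - cerebras/gpu_low:.0%} to {1 - cerebras/gpu_high:.0%}")
```

Under these assumptions, the same job costs $0.78 on Cerebras versus $5-8 on GPU-hosted frontier models, an 85-90% saving, in the same ballpark as the 92% figure cited above.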

Cerebras Launches Qwen3-235B: World's Fastest Frontier AI Model with Full 131K Context Support

Business Wire

08-07-2025

  • Business
  • Business Wire

Cerebras Launches Qwen3-235B: World's Fastest Frontier AI Model with Full 131K Context Support

PARIS--(BUSINESS WIRE)--Cerebras Systems today announced the launch of Qwen3-235B with full 131K context support on its inference cloud platform. This milestone represents a breakthrough in AI model performance, combining frontier-level intelligence with unprecedented speed at one-tenth the cost of closed-source models, fundamentally transforming enterprise AI deployment.

Frontier Intelligence on Cerebras

Alibaba's Qwen3-235B delivers model intelligence that rivals frontier models such as Claude 4 Sonnet, Gemini 2.5 Flash, and DeepSeek R1 across a range of science, coding, and general-knowledge benchmarks, according to independent tests by Artificial Analysis. Qwen3-235B uses an efficient mixture-of-experts architecture that delivers exceptional compute efficiency, enabling Cerebras to offer the model at $0.60 per million input tokens and $1.20 per million output tokens, less than one-tenth the cost of comparable closed-source models.

Cut Reasoning Time from Minutes to Seconds

Reasoning models are notoriously slow, often taking minutes to answer a simple question. By leveraging the Wafer Scale Engine, Cerebras accelerates Qwen3-235B to an unprecedented 1,500 tokens per second, reducing response times from 1-2 minutes to 0.6 seconds and making coding, reasoning, and deep-RAG workflows nearly instantaneous. Based on Artificial Analysis measurements, Cerebras is the only company globally offering a frontier AI model capable of generating output at over 1,000 tokens per second, setting a new standard for real-time AI performance.

131K Context Enables Production-grade Code Generation

Concurrent with this launch, Cerebras has quadrupled its context length support from 32K to 131K tokens, the maximum supported by Qwen3-235B. This expansion directly impacts the model's ability to reason over large codebases and complex documents. While a 32K context is sufficient for simple code generation use cases, a 131K context allows the model to process dozens of files and tens of thousands of lines of code simultaneously, enabling production-grade application development. With this enhanced context length, Cerebras now directly addresses the enterprise code generation market, one of the largest and fastest-growing segments for generative AI.

Strategic Partnership with Cline

To showcase these new capabilities, Cerebras has partnered with Cline, the leading agentic coding agent for Microsoft VS Code with over 1.8 million installations. Cline users can now access Cerebras Qwen models directly within the editor, starting with Qwen3-32B at 64K context on the free tier. This rollout will expand to include Qwen3-235B with 131K context, delivering 10-20x faster code generation speeds compared to alternatives like DeepSeek R1.

"With Cerebras' inference, developers using Cline are getting a glimpse of the future, as Cline reasons through problems, reads codebases, and writes code in near real-time. Everything happens so fast that developers stay in flow, iterating at the speed of thought. This kind of fast inference isn't just nice to have; it shows us what's possible when AI truly keeps pace with developers," said Saoud Rizwan, CEO of Cline.
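For developers who want to try this, here is a minimal sketch of calling the model through Cerebras' OpenAI-compatible inference API. The base URL follows Cerebras' documented pattern, but the exact model identifier is an assumption and should be checked against the current Cerebras docs:

```python
# Sketch: query Qwen3-235B on Cerebras' inference cloud via its
# OpenAI-compatible endpoint, and time the response to compare against
# the quoted 1,500 tokens/second.
import os
import time

from openai import OpenAI  # pip install openai

client = OpenAI(
    base_url="https://api.cerebras.ai/v1",   # Cerebras OpenAI-compatible endpoint
    api_key=os.environ["CEREBRAS_API_KEY"],
)

start = time.perf_counter()
resp = client.chat.completions.create(
    model="qwen-3-235b-a22b",                # assumed model id; verify in the docs
    messages=[{"role": "user", "content": "Refactor this function for clarity: ..."}],
    max_tokens=900,
)
elapsed = time.perf_counter() - start

out_tokens = resp.usage.completion_tokens
print(resp.choices[0].message.content)
print(f"{out_tokens} tokens in {elapsed:.2f}s = {out_tokens / elapsed:.0f} tok/s")
# At the quoted 1,500 tok/s, ~900 output tokens should take about 0.6 s.
```

The timing check at the end ties directly to the press release's numbers: 900 tokens at 1,500 tokens per second is the 0.6-second response time claimed above.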
Frontier Intelligence at 30x the Speed and 1/10th the Cost

With today's launch, Cerebras has significantly expanded its inference offering, providing developers looking for an open alternative to OpenAI and Anthropic with comparable levels of model intelligence and code generation capabilities. Moreover, Cerebras delivers something that no other AI provider in the world, closed or open, can: instant reasoning speed at over 1,500 tokens per second, increasing developer productivity by an order of magnitude versus GPU solutions. All of this is delivered at one-tenth the token cost of leading closed-source models.

About Cerebras Systems

Cerebras Systems is a team of pioneering computer architects, computer scientists, deep learning researchers, and engineers of all types. We have come together to accelerate generative AI by building from the ground up a new class of AI supercomputer. Our flagship product, the CS-3 system, is powered by the world's largest and fastest commercially available AI processor, our Wafer-Scale Engine-3. CS-3s are quickly and easily clustered together to make the largest AI supercomputers in the world, and they make placing models on the supercomputers dead simple by avoiding the complexity of distributed computing. Cerebras Inference delivers breakthrough inference speeds, empowering customers to create cutting-edge AI applications. Leading corporations, research institutions, and governments use Cerebras solutions to develop pathbreaking proprietary models and to train open-source models with millions of downloads. Cerebras solutions are available through the Cerebras Cloud and on-premises. For further information, visit the Cerebras website or follow us on LinkedIn, X, and Threads.

Alibaba unveils Qwen3 AI models that it says outperform DeepSeek R1

The Star

29-04-2025

  • Business
  • The Star

Alibaba unveils Qwen3 AI models that it says outperform DeepSeek R1

The release comes just days after Baidu introduced two advanced models amid speculation about the imminent release of DeepSeek's R2. — SCMP

Alibaba Group Holding on Tuesday unveiled the highly anticipated third generation of its open-source artificial intelligence (AI) model series, which promises faster processing and enhanced multilingual capabilities, intensifying competition in an already crowded Chinese market.

The Qwen3 family consists of eight models, ranging from 600 million parameters to 235 billion, with enhancements across all models, according to the Qwen team at Alibaba's cloud computing unit. Alibaba owns the South China Morning Post.

In AI, parameters are the variables a model learns during training. They serve as an indicator of sophistication: larger parameter counts typically suggest greater capacity.

Benchmark tests cited by Alibaba showed that models such as Qwen3-235B and Qwen3-4B matched or exceeded the performance of advanced models from both domestic and overseas competitors, including OpenAI's o1, Google's Gemini, and DeepSeek's R1, in areas like instruction following, coding assistance, text generation, mathematical skills, and complex problem solving.

The launch of Qwen3, which was anticipated this month as previously reported by the Post, is expected to solidify Alibaba's position as a leading provider of open-source models. With over 100,000 derivative models built upon it, Qwen is currently the world's largest open-source AI ecosystem, surpassing Facebook parent Meta Platforms' Llama community.

"Qwen3 represents a significant milestone in our journey towards artificial general intelligence and artificial superintelligence," the Qwen team said, adding that the new models achieved a higher level of intelligence through enhanced pre-training and reinforcement learning.

Trained on 36 trillion tokens covering 119 languages and dialects, tripling the language coverage of Qwen2.5, Qwen3 shows improved capabilities in understanding and translating instructions across multiple languages, according to the team.

The Qwen3 model family is available on Microsoft's GitHub, the open-source AI community Hugging Face, and Alibaba's own AI model hosting service, ModelScope. It has also been integrated into the web-based Qwen chatbot as the default model for user queries.

All Qwen3 models feature hybrid reasoning functionality, allowing users to toggle between a "thinking" mode, which is suitable for complex problems and takes longer to respond, and a "non-thinking" mode, which offers quicker responses for everyday tasks.

Alibaba's release of its latest AI model comes just days after Baidu introduced two advanced models amid speculation about the imminent release of DeepSeek's R2. The development underscores the intensifying competition in China's foundational AI model market, as Big Tech firms race to develop and upgrade their offerings.

The Hangzhou-based e-commerce giant has been doubling down on its AI investments, focusing on funding and talent acquisition to maintain its competitive edge and enhance its business operations. Earlier this year, Alibaba pledged more than US$52bil over the next three years to build AI infrastructure, marking the largest computing project by a private company in China. Additionally, the group launched a spring hiring campaign, with half of the internship positions dedicated to AI-focused roles. – South China Morning Post
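The hybrid reasoning toggle described above is exposed in the Qwen team's published Hugging Face examples as an `enable_thinking` flag on the chat template. Here is a minimal sketch using the smallest model in the family; the exact hub id is an assumption worth checking against the model card:

```python
# Sketch: toggle Qwen3's "thinking" vs. "non-thinking" modes with
# Hugging Face transformers, per the Qwen team's published usage.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-0.6B"  # smallest Qwen3 model; assumed hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

messages = [{"role": "user", "content": "What is 17 * 24?"}]

# Thinking mode: the model emits a reasoning trace before its answer.
# Set enable_thinking=False for quicker, non-reasoning replies.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True,
    enable_thinking=True,
)
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=256)

# Print only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(output[0][inputs.input_ids.shape[-1]:],
                       skip_special_tokens=True))
```

The same flag applies across the family, which is what lets users trade response latency for reasoning depth on a per-query basis.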

Alibaba unveils Qwen3 AI models that it says outperform DeepSeek R1

South China Morning Post

29-04-2025

  • Business
  • South China Morning Post

Alibaba unveils Qwen3 AI models that it says outperform DeepSeek R1

The Qwen3 family consists of eight models, ranging from 600 million parameters to 235 billion, with enhancements across all models, according to the Qwen team at Alibaba's cloud computing unit. Alibaba owns the South China Morning Post.

In AI, parameters are the variables a model learns during training. They serve as an indicator of sophistication: larger parameter counts typically suggest greater capacity.

Benchmark tests cited by Alibaba showed that models such as Qwen3-235B and Qwen3-4B matched or exceeded the performance of advanced models from both domestic and overseas competitors, including OpenAI's o1, Google's Gemini, and DeepSeek's R1, in areas like instruction following, coding assistance, text generation, mathematical skills, and complex problem solving.

[Video: How Alibaba is betting on AI to transform e-commerce]

The launch of Qwen3, which was anticipated this month as previously reported by the Post, is expected to solidify Alibaba's position as a leading provider of open-source models. With over 100,000 derivative models built upon it, Qwen is currently the world's largest open-source AI ecosystem, surpassing Facebook parent Meta Platforms' Llama community.
