Alibaba Introduces Qwen3, Setting New Benchmark in Open-Source AI with Hybrid Reasoning - Middle East Business News and Information


Mid East Info – 30-04-2025
April 2025 – Alibaba has launched Qwen3, the latest generation of its open-sourced large language model (LLM) family, setting a new benchmark for AI innovation.
The Qwen3 series features six dense models and two Mixture-of-Experts (MoE) models, offering developers flexibility to build next-generation applications across mobile devices, smart glasses, autonomous vehicles, robotics and beyond.
All Qwen3 models – including dense models (0.6B, 1.7B, 4B, 8B, 14B, and 32B parameters) and MoE models (30B with 3B active, and 235B with 22B active) – are now open sourced and available globally.
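The efficiency of the MoE variants comes from sparse activation: only a fraction of the total parameters is active for any given token, so inference cost tracks the "active" count rather than the total. A quick calculation using the parameter counts listed above (the helper function is our own sketch, not part of any Qwen tooling):

```python
# Active-vs-total parameter ratio for the two Qwen3 MoE variants.
# Counts (in billions) are taken from the model list in the article.

def active_fraction(total_b: float, active_b: float) -> float:
    """Fraction of parameters active per token in an MoE model."""
    return active_b / total_b

moe_models = {
    "Qwen3-30B-A3B": (30, 3),
    "Qwen3-235B-A22B": (235, 22),
}

for name, (total, active) in moe_models.items():
    # e.g. Qwen3-235B-A22B activates under 10% of its weights per token
    print(f"{name}: {active_fraction(total, active):.1%} active per token")
```

So the flagship 235B model runs with roughly the per-token compute of a 22B dense model, which is the basis for the deployment-cost claims below.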
Hybrid Reasoning Combining Thinking and Non-thinking Modes
Qwen3 marks Alibaba's debut of hybrid reasoning models, combining traditional LLM capabilities with advanced, dynamic reasoning. Qwen3 models can seamlessly switch between a thinking mode, for complex, multi-step tasks such as mathematics, coding, and logical deduction, and a non-thinking mode, for fast, general-purpose responses.
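In thinking mode, Qwen3 emits its intermediate reasoning wrapped in `<think>...</think>` tags ahead of the final answer, following the convention published with the model; in non-thinking mode no such block appears. A minimal sketch for splitting the two parts of a raw response string (the function name is our own):

```python
import re

def split_thinking(response: str) -> tuple[str, str]:
    """Separate a <think>...</think> reasoning block from the final answer."""
    match = re.search(r"<think>(.*?)</think>", response, flags=re.DOTALL)
    if match is None:
        # Non-thinking mode: the model produced no reasoning block.
        return "", response.strip()
    reasoning = match.group(1).strip()
    answer = response[match.end():].strip()
    return reasoning, answer

# Thinking-mode output carries both parts; non-thinking output only the answer.
reasoning, answer = split_thinking(
    "<think>2 + 2: add the units digits.</think>The answer is 4."
)
```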
For developers accessing Qwen3 through an API, the model offers granular control over thinking duration (up to 38K tokens), enabling an optimized balance between intelligent performance and compute efficiency. Notably, the Qwen3-235B-A22B MoE model significantly lowers deployment costs compared with other state-of-the-art models, reinforcing Alibaba's commitment to accessible, high-performance AI.
Breakthroughs in Multilingual Skills, Agent Capabilities, Reasoning and Human Alignment
Trained on a massive dataset of 36 trillion tokens, double that of its predecessor Qwen2.5, Qwen3 delivers significant advancements in reasoning, instruction following, tool use and multilingual tasks.
Key capabilities include:
Multilingual Mastery: Supports 119 languages and dialects, with leading performance in translation and multilingual instruction-following.
Advanced Agent Integration: Natively supports the Model Context Protocol (MCP) and robust function-calling, leading open-source models in complex agent-based tasks.
Superior Reasoning: Surpasses previous Qwen models (QwQ in thinking mode and Qwen2.5 in non-thinking mode) in mathematics, coding, and logical reasoning benchmarks.
Enhanced Human Alignment: Delivers improved creative writing, role-playing, and multi-turn dialogue for more natural, engaging conversations.
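The function-calling capability above is typically driven by an OpenAI-style tools schema, which Qwen-family models accept. A sketch of declaring a single tool in that format (the `get_weather` tool is a hypothetical example, not part of any Qwen API):

```python
# An OpenAI-style "tools" entry of the kind Qwen3's function-calling
# consumes. The get_weather function itself is made up for illustration.

get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

# The declaration is passed alongside the chat messages; when the model
# decides the tool is needed, it replies with a structured call (the
# function name plus JSON arguments) rather than free text.
```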
Qwen3 models achieve top-tier results across industry benchmarks
Thanks to advancements in model architecture, increases in training data, and more effective training methods, Qwen3 models achieve top-tier results across industry benchmarks such as AIME25 (mathematical reasoning), LiveCodeBench (coding proficiency), BFCL (tool- and function-calling capabilities), and Arena-Hard (a benchmark for instruction-tuned LLMs). Additionally, to develop the hybrid reasoning model, a four-stage training process was implemented, comprising long chain-of-thought (CoT) cold start, reasoning-based reinforcement learning (RL), thinking mode fusion, and general RL.
Open Access to Drive Innovation:
Qwen3 models are now freely available for download on Hugging Face, GitHub, and ModelScope, and can be explored on chat.qwen.ai. API access will soon be available through Alibaba's AI model development platform, Model Studio. Qwen3 also powers Alibaba's flagship AI super assistant application, Quark.
Since its debut, the Qwen model family has attracted over 300 million downloads worldwide. Developers have created more than 100,000 Qwen-based derivative models on Hugging Face, making Qwen one of the world's most widely adopted open-source AI model series.
About Alibaba Group:
Alibaba Group's mission is to make it easy to do business anywhere. The company aims to build the future infrastructure of commerce. It envisions that its customers will meet, work and live at Alibaba, and that it will be a good company that lasts for 102 years.

Related Articles

Alibaba Unveils Cutting-Edge AI Coding Model Qwen3-Coder - Middle East Business News and Information

Mid East Info – 2 days ago

Alibaba has launched Qwen3-Coder, its most advanced agentic AI coding model to date. Designed for high-performance software development, Qwen3-Coder excels in agentic coding tasks, from generating new code and managing complex coding workflows to debugging across entire codebases. Built on a Mixture-of-Experts (MoE) architecture, the open-sourced Qwen3-Coder-480B-A35B-Instruct model has a total of 480 billion parameters but activates only 35 billion per token, delivering efficiency without sacrificing performance. The model achieves competitive results against leading state-of-the-art (SOTA) models across key benchmarks in agentic coding, browser use, and tool use.

Additionally, Alibaba is open-sourcing Qwen Code, a powerful command-line interface (CLI) tool that enables developers to delegate engineering tasks to AI using natural language. Optimized with custom prompts and interaction protocols, Qwen Code unlocks the full potential of Qwen3-Coder for real-world agentic programming. The model also supports integration with the Claude Code interface, making it even easier for developers to execute their coding tasks.

Trained on an extensive dataset of code and general text data, Qwen3-Coder is engineered for robust agentic coding. It natively supports a context window of 256K tokens, extendable up to 1 million tokens, enabling it to process vast codebases in a single session. Its superior performance stems not only from scaling across tokens, context length, and synthetic data during pre-training, but also from innovative post-training techniques such as long-horizon reinforcement learning (agent RL). This advancement allows the model to solve complex, real-world problems through multi-step interactions with external tools.
As a result, Qwen3-Coder achieves SOTA performance among open-source models on SWE-Bench Verified (a benchmark for evaluating AI models' ability to solve real-world software issues), even without test-time or inference scaling. Agentic AI coding is transforming software development by enabling more autonomous, efficient, and accessible programming workflows. With its open-source availability, strong agentic coding capabilities, and seamless compatibility with popular developer tools and interfaces, Qwen3-Coder is positioned as a valuable tool for developers worldwide.

The Qwen3-Coder-480B-A35B-Instruct model is now available on Hugging Face and GitHub. Developers can also access the model on Qwen Chat or via cost-effective APIs through Model Studio, Alibaba's generative AI development platform. Qwen-based coding models have already surpassed 20 million downloads globally. Tongyi Lingma, Alibaba Cloud's Qwen-powered coding assistant, will soon be upgraded with Qwen3-Coder's enhanced agentic capabilities. Since its launch in June 2024, Tongyi Lingma's 'AI Programmer' feature (offering code completion, optimization, debugging support, snippet search, and batch unit test generation) has generated over 3 billion lines of code.

About Alibaba Cloud: Established in 2009, Alibaba Cloud is the digital technology and intelligence backbone of Alibaba Group. It offers a complete suite of cloud services to customers worldwide, including elastic computing, database, storage, network virtualization services, large-scale computing, security, big data analytics, machine learning and artificial intelligence (AI) services. Alibaba has been named the leading IaaS provider in Asia Pacific by revenue in U.S. dollars since 2018, according to Gartner. It has also maintained its position as one of the world's leading public cloud IaaS service providers since 2018, according to IDC.

China's DeepSeek releases update to AI model that sent US shares tumbling earlier this year

Egypt Independent – 29-05-2025

Shanghai (Reuters) – Chinese artificial intelligence startup DeepSeek released an update to its R1 reasoning model in the early hours of Thursday, stepping up competition with US rivals such as OpenAI. DeepSeek launched R1-0528 on developer platform Hugging Face but has yet to make an official public announcement, and it did not publish a description of the model or comparisons. The LiveCodeBench leaderboard, a benchmark developed by researchers from UC Berkeley, MIT, and Cornell, ranked DeepSeek's updated R1 reasoning model just slightly behind OpenAI's o4-mini and o3 reasoning models on code generation, and ahead of xAI's Grok 3 mini and Alibaba's Qwen3.

Bloomberg earlier reported the update on Wednesday, saying that a DeepSeek representative had told a WeChat group that the company had completed what it described as a 'minor trial upgrade' and that users could start testing it.

DeepSeek earlier this year upended beliefs that US export controls were holding back China's AI advancements after the startup released AI models that were on a par with, or better than, industry-leading models in the United States at a fraction of the cost. The launch of R1 in January sent tech shares outside China plummeting and challenged the view that scaling AI requires vast computing power and investment.

Since R1's release, Chinese tech giants like Alibaba and Tencent have released models claiming to surpass DeepSeek's. Google's Gemini has introduced discounted tiers of access, while OpenAI cut prices and released an o3-mini model that relies on less computing power. DeepSeek is still widely expected to release R2, a successor to R1; Reuters reported in March, citing sources, that R2's release was initially planned for May. DeepSeek also released an upgrade to its V3 large language model in March.

Red Hat Optimizes Red Hat AI to Speed Enterprise AI Deployments Across Models, AI Accelerators and Clouds - Middle East Business News and Information

Mid East Info – 22-05-2025

Red Hat AI Inference Server, validated models and integration of Llama Stack and Model Context Protocol help users deliver higher-performing, more consistent AI applications and agents.

Red Hat, the world's leading provider of open source solutions, today continues to deliver customer choice in enterprise AI with the introduction of Red Hat AI Inference Server, Red Hat AI third-party validated models and the integration of Llama Stack and Model Context Protocol (MCP) APIs, along with significant updates across the Red Hat AI portfolio. With these developments, Red Hat intends to further advance the capabilities organizations need to accelerate AI adoption while providing greater customer choice and confidence in generative AI (gen AI) production deployments across the hybrid cloud.

According to Forrester, open source software will be the spark for accelerating enterprise AI efforts.1 As the AI landscape grows more complex and dynamic, Red Hat AI Inference Server and third-party validated models provide efficient model inference and a tested collection of AI models optimized for performance on the Red Hat AI platform. Coupled with the integration of new APIs for gen AI agent development, including Llama Stack and MCP, Red Hat is working to tackle deployment complexity, empowering IT leaders, data scientists and developers to accelerate AI initiatives with greater control and efficiency.

Efficient inference across the hybrid cloud with Red Hat AI Inference Server: The Red Hat AI portfolio now includes the new Red Hat AI Inference Server, providing faster, more consistent and cost-effective inference at scale across hybrid cloud environments. This key addition is integrated into the latest releases of Red Hat OpenShift AI and Red Hat Enterprise Linux AI, and is also available as a standalone offering, enabling organizations to deploy intelligent applications with greater efficiency, flexibility and performance.
Tested and optimized models with Red Hat AI third-party validated models: Red Hat AI third-party validated models, available on Hugging Face, make it easier for enterprises to find the right models for their specific needs. Red Hat AI offers a collection of validated models, as well as deployment guidance to enhance customer confidence in model performance and outcome reproducibility. Select models are also optimized by Red Hat, leveraging model compression techniques to reduce size and increase inference speed, helping to minimize resource consumption and operating costs. Additionally, the ongoing model validation process helps Red Hat AI customers continue to stay at the forefront of optimized gen AI innovation.

Standardized APIs for AI application and agent development with Llama Stack and MCP: Red Hat AI is integrating Llama Stack, initially developed by Meta, along with Anthropic's MCP, to provide users with standardized APIs for building and deploying AI applications and agents. Currently available in developer preview in Red Hat AI, Llama Stack provides a unified API to access inference with vLLM, retrieval-augmented generation (RAG), model evaluation, guardrails and agents, across any gen AI model. MCP enables models to integrate with external tools by providing a standardized interface for connecting APIs, plugins and data sources in agent workflows.

The latest release of Red Hat OpenShift AI (v2.20) delivers additional enhancements for building, training, deploying and monitoring both gen AI and predictive AI models at scale. These include an optimized model catalog (technology preview) that provides easy access to validated Red Hat and third-party models, enables the deployment of these models on Red Hat OpenShift AI clusters through the web console interface and manages the lifecycle of those models leveraging Red Hat OpenShift AI's integrated registry.
The release also adds distributed training through the KubeFlow Training Operator, enabling the scheduling and execution of InstructLab model tuning and other PyTorch-based training and tuning workloads distributed across multiple Red Hat OpenShift nodes and GPUs; it includes distributed RDMA networking acceleration and optimized GPU utilization to reduce costs. A feature store (technology preview), based on the upstream Kubeflow Feast project, provides a centralized repository for managing and serving data for both model training and inference, streamlining data workflows to improve model accuracy and reusability.

Red Hat Enterprise Linux AI 1.5 brings new updates to Red Hat's foundation model platform for developing, testing and running large language models (LLMs). Key features in version 1.5 include Google Cloud Marketplace availability, expanding customer choice for running Red Hat Enterprise Linux AI in public cloud environments (along with AWS and Azure) to help simplify the deployment and management of AI workloads on Google Cloud, and enhanced multi-language capabilities for Spanish, German, French and Italian via InstructLab, allowing for model customization using native scripts and unlocking new possibilities for multilingual AI applications. Users can also bring their own teacher models for greater control over model customization and testing for specific use cases and languages, with future support planned for Japanese, Hindi and Korean.

The Red Hat AI InstructLab on IBM Cloud service is also now generally available. This new cloud service further streamlines the model customization process, improving scalability and user experience, empowering enterprises to make use of their unique data with greater ease and control.

Red Hat's vision: any model, any accelerator, any cloud. The future of AI must be defined by limitless opportunity, not constrained by infrastructure silos.
Red Hat sees a horizon where organizations can deploy any model, on any accelerator, across any cloud, delivering an exceptional, more consistent user experience without exorbitant costs. To unlock the true potential of gen AI investments, enterprises require a universal inference platform: a standard for more seamless, high-performance AI innovation, both today and in the years to come.

Red Hat Summit: Join the Red Hat Summit keynotes to hear the latest from Red Hat executives, customers and partners: 'Modernized infrastructure meets enterprise-ready AI' on Tuesday, May 20, 8-10 a.m. EDT (YouTube), and 'Hybrid cloud evolves to deliver enterprise innovation' on Wednesday, May 21, 8-9:30 a.m. EDT (YouTube).

Supporting Quotes:

Joe Fernandes, vice president and general manager, AI Business Unit, Red Hat: 'Faster, more efficient inference is emerging as the newest decision point for gen AI innovation. Red Hat AI, with enhanced inference capabilities through Red Hat AI Inference Server and a new collection of validated third-party models, helps equip organizations to deploy intelligent applications where they need to, how they need to and with the components that best meet their unique needs.'

Michele Rosen, research manager, IDC: 'Organizations are moving beyond initial AI explorations and are focused on practical deployments. The key to their continued success lies in the ability to be adaptable with their AI strategies to fit various environments and needs. The future of AI not only demands powerful models, but models that can be deployed with agility and cost-effectiveness. Enterprises seeking to scale their AI initiatives and deliver business value will find this flexibility absolutely essential.'

About Red Hat: Red Hat is the open hybrid cloud technology leader, delivering a trusted, consistent and comprehensive foundation for transformative IT innovation and AI applications.
Its portfolio of cloud, developer, AI, Linux, automation and application platform technologies enables any application, anywhere, from the datacenter to the edge. As the world's leading provider of enterprise open source software solutions, Red Hat invests in open ecosystems and communities to solve tomorrow's IT challenges. Collaborating with partners and customers, Red Hat helps them build, connect, automate, secure and manage their IT environments, supported by consulting services and award-winning training and certification offerings.

Forward-Looking Statements: Except for the historical information and discussions contained herein, statements contained in this press release may constitute forward-looking statements within the meaning of the Private Securities Litigation Reform Act of 1995. Forward-looking statements are based on the company's current assumptions regarding future business and financial performance. These statements involve a number of risks, uncertainties and other factors that could cause actual results to differ materially. Any forward-looking statement in this press release speaks only as of the date on which it is made. Except as required by law, the company assumes no obligation to update or revise any forward-looking statements.
