Latest news with #WeiSun


NBC News
5 days ago
- Business
- NBC News
Another Chinese AI model is turning heads
BEIJING - The latest Chinese generative artificial intelligence model to take on OpenAI's ChatGPT is offering coding capabilities at a lower price.

Alibaba-backed startup Moonshot released its Kimi K2 model late Friday night. It is a low-cost, open-source large language model, combining the two factors that underpinned China-based DeepSeek's industry disruption in January. Open-source technology provides source code access for free, an approach that few U.S. tech giants have taken, other than Meta and Google to some extent.

Coincidentally, OpenAI CEO Sam Altman announced early Saturday that the company's first open-source model would be delayed indefinitely, yet again, due to safety concerns. OpenAI did not immediately respond to a CNBC request for comment on Kimi K2.

One of Kimi K2's strengths is writing computer code for applications, an area in which businesses see potential to reduce or replace staff with generative AI. OpenAI's U.S. rival Anthropic focused on coding with its Claude Opus 4 model, released in late May. In its release announcement on social media platforms X and GitHub, Moonshot claimed Kimi K2 surpassed Claude Opus 4 on two benchmarks and had better overall performance than OpenAI's coding-focused GPT-4.1 model, based on several industry metrics.

"No doubt [Kimi K2 is] a globally competitive model, and it's open sourced," Wei Sun, principal analyst in artificial intelligence at Counterpoint, said in an email Monday.

Cheaper option

"On top of that, it has lower token costs, making it attractive for large-scale or budget-sensitive deployments," she said.

The new K2 model is available via Kimi's app and browser interface for free, unlike ChatGPT or Claude, which charge monthly subscriptions for their latest AI models. Kimi is also charging only 15 cents per 1 million input tokens and $2.50 per 1 million output tokens, according to its website. Tokens are a way of measuring data for AI model processing.

In contrast, Claude Opus 4 charges 100 times as much for input ($15 per million tokens) and 30 times as much for output ($75 per million tokens). Meanwhile, for every 1 million tokens, GPT-4.1 charges $2 for input and $8 for output.

Moonshot AI said on GitHub that developers can use K2 however they wish, with the only requirement that they display "Kimi K2" on the user interface if the commercial product or service has more than 100 million monthly active users, or makes the equivalent of $20 million in monthly revenue.

Hot AI market

Initial reviews of K2 on both English and Chinese social media have largely been positive, although there are some reports of hallucinations, a prevalent issue in generative AI in which the models make up information. Still, K2 is "the first model I feel comfortable using in production since Claude 3.5 Sonnet," Pietro Schirano, founder of MagicPath, a startup that offers AI tools for design, said in a post on X.

Moonshot has open sourced some of its prior AI models. The company's chatbot surged in popularity early last year as China's alternative to ChatGPT, which isn't officially available in the country. But similar chatbots from ByteDance and Tencent have since crowded the market, while tech giant Baidu has revamped its core search engine with AI tools.

Kimi's latest AI release comes as investors eye Chinese alternatives to U.S. tech in the global AI competition. Still, despite the excitement about DeepSeek, the privately held company has yet to announce a major upgrade to its R1 and V3 models. Meanwhile, Manus AI, a Chinese startup that emerged earlier this year as another DeepSeek-type upstart, has relocated its headquarters to Singapore.

Over in the U.S., OpenAI also has yet to reveal GPT-5. Work on GPT-5 may be taking up engineering resources, preventing OpenAI from progressing on its open-source model, Counterpoint's Sun said, adding that it's challenging to release a powerful open-source model without undermining the competitive advantage of a proprietary model.

Grok 4 competitor

Kimi K2 is not the company's only recent release. Moonshot launched a Kimi research model last month and claimed it matched Google's Gemini Deep Research's 26.9 score and beat OpenAI's version on a benchmark called "Humanity's Last Exam." The Kimi research model even got a mention last week during Elon Musk's xAI release of Grok 4, which scored 25.4 on its own on the "Humanity's Last Exam" benchmark but attained a 44.4 score when allowed to use a variety of AI tools and web search.

"Kimi-Researcher represents a paradigm shift in agentic AI," said Winston Ma, adjunct professor at NYU School of Law. He was referring to AI's capability of simultaneously making several decisions on its own to complete a complex task. "Instead of merely generating fluent responses, it demonstrates autonomous reasoning at an expert level, the kind of complex cognitive work previously missing from LLMs," Ma said. He is also the author of "The Digital War: How China's Tech Power Shapes the Future of AI, Blockchain and Cyberspace."
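The price comparison quoted in the article can be checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the per-million-token list prices cited above, while the function names and the sample workload (10 million input tokens, 2 million output tokens) are hypothetical and do not come from any vendor's tooling.

```python
# Per-million-token list prices in USD, as quoted in the article above.
PRICES_PER_MILLION_USD = {
    "Kimi K2": {"input": 0.15, "output": 2.50},
    "Claude Opus 4": {"input": 15.00, "output": 75.00},
    "GPT-4.1": {"input": 2.00, "output": 8.00},
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload at the quoted list prices."""
    price = PRICES_PER_MILLION_USD[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def price_ratio(model: str, baseline: str, kind: str) -> float:
    """Return how many times `model`'s price is of `baseline`'s for `kind` tokens."""
    return PRICES_PER_MILLION_USD[model][kind] / PRICES_PER_MILLION_USD[baseline][kind]


if __name__ == "__main__":
    # Hypothetical workload, chosen only for illustration.
    input_tokens, output_tokens = 10_000_000, 2_000_000
    for model in PRICES_PER_MILLION_USD:
        cost = workload_cost(model, input_tokens, output_tokens)
        print(f"{model}: ${cost:,.2f}")
    for kind in ("input", "output"):
        ratio = price_ratio("Claude Opus 4", "Kimi K2", kind)
        print(f"Claude Opus 4 vs. Kimi K2, {kind}: {ratio:.0f}x")
```

Because output tokens cost more than input tokens for all three models, the overall gap on a real workload depends on its input-to-output mix, so the 100-times and 30-times figures are upper and lower bounds rather than a single multiplier.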


CNBC
5 days ago
- Business
- CNBC
Alibaba-backed Moonshot releases new Kimi AI model that beats ChatGPT, Claude in coding — and it costs less


CNBC
29-04-2025
- Business
- CNBC
Alibaba launches new Qwen LLMs in China's latest open-source AI breakthrough
Alibaba released the next generation of its open-source large language models, Qwen3, on Tuesday, and experts are calling it yet another breakthrough in China's booming open-source artificial intelligence space.

In a blog post, the Chinese tech giant said Qwen3 promises improvements in reasoning, instruction following, tool usage and multilingual tasks, rivaling other top-tier models such as DeepSeek's R1 in several industry benchmarks. The LLM series includes eight variations that span a range of architectures and sizes, offering developers flexibility when using Qwen to build AI applications for edge devices like mobile phones.

Qwen3 is also Alibaba's debut in so-called "hybrid reasoning models," which it says combine traditional LLM capabilities with "advanced, dynamic reasoning." According to Alibaba, such models can seamlessly transition between a "thinking mode" for complex tasks such as coding and a "non-thinking mode" for faster, general-purpose responses.

"Notably, the Qwen3-235B-A22B MoE model significantly lowers deployment costs compared to other state-of-the-art models, reinforcing Alibaba's commitment to accessible, high-performance AI," Alibaba said.

The new models are already freely available for individual users on platforms like Hugging Face and GitHub, as well as Alibaba Cloud's web interface. Qwen3 is also being used to power Alibaba's AI assistant, Quark.

AI analysts told CNBC that Qwen3 represents a serious challenge to Alibaba's counterparts in China, as well as industry leaders in the U.S. In a statement to CNBC, Wei Sun, principal analyst of artificial intelligence at Counterpoint Research, said the Qwen3 series is a "significant breakthrough, not just for its best-in-class performance" but also for several features that point to the "application potential of the models." Those features include Qwen3's hybrid thinking mode, its multilingual support covering 119 languages and dialects, and its open-source availability, Sun added.

Open-source software generally refers to software whose source code is made freely available on the web for possible modification and redistribution. At the start of this year, DeepSeek's open-source R1 model rocked the AI world and quickly became a catalyst for China's AI space and open-source model adoption.

"Alibaba's release of the Qwen 3 series further underscores the strong capabilities of Chinese labs to develop highly competitive, innovative, and open-source models, despite mounting pressure from tightened U.S. export controls," said Ray Wang, a Washington-based analyst focusing on U.S.-China economic and technology competition.

According to Alibaba, Qwen has already become one of the world's most widely adopted open-source AI model series, attracting over 300 million downloads worldwide and more than 100,000 derivative models on Hugging Face. Wang said that this adoption could continue with Qwen3, adding that its performance claims may make it the best open-source model globally, though still behind the world's most cutting-edge models like OpenAI's o3 and o4-mini.

Chinese competitors like Baidu have also rushed to release new AI models after the emergence of DeepSeek, and are making plans to shift toward a more open-source business model. Meanwhile, Reuters reported in February, citing anonymous sources, that DeepSeek is accelerating the launch of the successor to its R1.

"In the broader context of the U.S.-China AI race, the gap between American and Chinese labs has narrowed, likely to a few months, and some might argue, even to just weeks," Wang said. "With the latest release of Qwen 3 and the upcoming launch of DeepSeek's R2, this gap is unlikely to widen, and may even continue to shrink."


Yahoo
28-04-2025
- Business
- Yahoo
DeepSeek is hiring for an 'urgent' role in product management and design
- DeepSeek is hiring for a job in product management and design.
- It's a major shift from the startup's focus on AI model research.
- The rush to hire product talent mirrors a broader trend in the US.

DeepSeek, the Chinese startup that rattled the AI industry earlier this year, is hiring for a product role that illustrates the company's shift from research to commercialization.

In a job notice posted Tuesday on its official WeChat account, DeepSeek said it is looking to fill a "product and design" position on its teams in Beijing and Hangzhou. It is unclear from the notice whether the job refers to a single role or multiple positions. The Hangzhou-based firm labeled the job notice "urgent."

The company wrote that it wants people to help create the "next generation of intelligent product experience" centered on large language models. Candidates are expected to have product management experience and be proficient in product and visual design, the notice said. DeepSeek did not respond to Business Insider's request for comment.

DeepSeek is also hiring a chief financial officer and a chief operating officer, jobs not labeled urgent. The company is expanding its research and engineering teams, according to other listings on its WeChat account.

The move marks a major shift for the company, which has been focused on fundamental AI model research. Last month, DeepSeek released an upgraded version of its open-source V3 large language model, boosting its reasoning and coding capabilities.

Founded in 2023 by Chinese entrepreneur Liang Wenfeng, DeepSeek made headlines and disrupted markets in January after unveiling its low-cost reasoning model, R1. The startup claims R1 can rival top competitors like OpenAI's GPT-4, but at a fraction of the cost.

An analyst told Business Insider earlier this month that DeepSeek's latest models, especially the reasoning-focused R1 and the R2 set to launch later this month or in May, mark a "significant inflection point" in China's AI ambitions.

"These models not only match the best-in-class performance globally, but are also open-sourced under the most permissive MIT License," said Wei Sun, the principal analyst for AI at Counterpoint Research. "That changes the game," she added.

Unlike flagship models in the US, which are typically closed-source and monetized through APIs or enterprise licensing, DeepSeek's models like R1 and V3 are free for anyone to download, modify, and integrate. DeepSeek has been quiet about the progress of its next-generation R2 model.

Amid high costs and chip shortages, Chinese firms have been prioritizing AI integration and consolidation to stay competitive, an analyst told BI earlier this month. Tencent has deployed its Hunyuan model and DeepSeek R1 across its massive ecosystem, including WeChat, said Ray Wang, a Washington-based analyst who specializes in AI and US-China tech statecraft. WeChat, China's biggest social media app, is used by nearly 1.4 billion people. Baidu has also integrated DeepSeek R1 into its search engine, he said.

While details about DeepSeek's hiring process are scant, Liang, the founder, has made it clear that he values creativity over experience when it comes to hiring. In a 2023 interview with Chinese tech publication 36KR, he said that "experience is not that important," even in a similar role. "Basic skills, creativity, and passion are much more important," he added. "Our core technical positions are mainly filled by fresh graduates or those who have graduated one or two years ago," he said.

The rush to hire product talent mirrors a broader trend in the AI world. In the US, product managers are seen as increasingly critical for some companies in the AI era, helping bridge the gap between rapidly advancing AI technology and real-world user needs.

"The future really does belong to product managers," Frank Fusco, a product manager turned CEO of a software company called Silicon Society, told BI in November. As AI becomes more capable of handling coding and other engineering tasks, Fusco said, product managers have an opportunity to take on an even greater role.

OpenAI is hiring for seven product manager roles in its New York and San Francisco offices, and Anthropic is hiring for 11 product-related roles, according to the companies' websites.

However, some tech companies are revisiting their reliance on product managers. Microsoft wants to increase the number of engineers relative to product or program managers, BI's Ashley Stewart reported last month. Other companies like Airbnb and Snap have been rethinking the need for product managers.

Read the original article on Business Insider

Business Insider
28-04-2025
- Business
- Business Insider
DeepSeek is hiring for an 'urgent' role in product management and design