
Latest news with #languageModel

Huawei's AI lab denies that one of its Pangu models copied Alibaba's Qwen

Yahoo

10 hours ago

  • Business
  • Yahoo


BEIJING/SHANGHAI (Reuters) - Huawei's artificial intelligence research division has rejected claims that a version of its Pangu Pro large language model copied elements from an Alibaba model, saying it was independently developed and trained.

The division, called Noah Ark Lab, issued the statement on Saturday, a day after an entity called HonestAGI posted an English-language paper on the code-sharing platform GitHub, saying Huawei's Pangu Pro Moe (Mixture of Experts) model showed "extraordinary correlation" with Alibaba's Qwen 2.5-14B. This suggested that Huawei's model was derived through "upcycling" rather than trained from scratch, the paper said, prompting widespread discussion in AI circles online and in Chinese tech-focused media. The paper added that its findings indicated potential copyright violation, the fabrication of information in technical reports and false claims about Huawei's investment in training the model.

Noah Ark Lab said in its statement that the model was "not based on incremental training of other manufacturers' models" and that it had "made key innovations in architecture design and technical features." It is the first large-scale model built entirely on Huawei's Ascend chips, it added. It also said that its development team had strictly adhered to open-source license requirements for any third-party code used, without elaborating on which open-source models it took as reference.

Alibaba did not immediately respond to a Reuters request for comment. Reuters was unable to contact HonestAGI or learn who is behind the entity.

The release of Chinese startup DeepSeek's open-source model R1 in January this year shocked Silicon Valley with its low cost and sparked intense competition among China's tech giants to offer competitive products. Qwen 2.5-14B was released in May 2024 and is one of the smaller models in Alibaba's Qwen 2.5 family, which can be deployed on PCs and smartphones.
While Huawei entered the large language model arena early with its original Pangu release in 2021, it has since been perceived as lagging behind rivals. It open-sourced its Pangu Pro Moe models on the Chinese developer platform GitCode in late June, seeking to boost adoption of its AI technology by providing free access to developers. While Qwen is more consumer-facing and offers chatbot services similar to ChatGPT, Huawei's Pangu models tend to be used more in government and in the finance and manufacturing sectors.
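The "extraordinary correlation" claim rests on statistical similarity between the two models' weights. As a rough illustration of the general idea only (HonestAGI's actual method is not described here, and all names and numbers below are hypothetical), one can reduce each model to a per-layer statistical fingerprint and compare fingerprints: a model fine-tuned or "upcycled" from another tends to score much closer to its base than an independently trained model does.

```python
import numpy as np

def layer_fingerprint(weights):
    """Per-layer weight standard deviations, normalized to unit length."""
    v = np.array([w.std() for w in weights])
    return v / np.linalg.norm(v)

def fingerprint_similarity(model_a, model_b):
    """Cosine similarity between the two models' layer fingerprints."""
    return float(layer_fingerprint(model_a) @ layer_fingerprint(model_b))

# Toy stand-ins for real checkpoints: 12 layers of 64x64 weights.
rng = np.random.default_rng(0)
base = [rng.normal(0, 0.02 * (i + 1), (64, 64)) for i in range(12)]
# A lightly perturbed copy (mimicking "upcycling") vs. an independent model.
derived = [w + rng.normal(0, 0.001, w.shape) for w in base]
independent = [rng.normal(0, 0.02, (64, 64)) for _ in range(12)]

print(fingerprint_similarity(base, derived))      # close to 1.0
print(fingerprint_similarity(base, independent))  # noticeably lower
```

Real analyses of this kind use far richer statistics than layer-wise standard deviations, but the contrast between a derived and an independent model follows the same pattern.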

A ‘Sputnik' moment in the global AI race

Japan Times

4 days ago

  • Science
  • Japan Times


When Chinese AI startup DeepSeek unveiled the open-source large language model DeepSeek-R1 in January, many referred to it as an "AI Sputnik shock," a reference to the monumental significance of the Soviet Union's 1957 launch of the first satellite into orbit. Much remains uncertain about DeepSeek's LLM, and its capabilities should not be overestimated, but its release has nevertheless sparked intense discussion, especially about its low cost.

DeepSeek claims that its model possesses reasoning abilities on par with, or even superior to, OpenAI's leading models, with training costs at less than one-tenth of OpenAI's (reportedly just $5.6 million), largely due to the use of NVIDIA's lower-cost H800 GPUs rather than the more powerful H200 or H100 models. Tech giants like Meta and Google have spent billions of dollars on high-performance GPUs to develop cutting-edge AI models. DeepSeek's ability to produce a high-performance AI model at a significantly lower cost challenges the prevailing belief that computational power, determined by the number and quality of GPUs, is the primary driver of AI performance.

Microsoft Launches 'Mu,' New On-Device AI Model for Copilot+ PCs

Yahoo

27-06-2025

  • Business
  • Yahoo


Microsoft Corporation (NASDAQ:MSFT) is one of the best US tech stocks to buy now. On June 23, Microsoft officially launched a new small language model called Mu. The AI tool is designed for efficient local operation on personal computers, particularly the new Copilot+ PCs. Unlike larger AI models that rely on cloud processing, Mu operates entirely on a device's Neural Processing Unit (NPU), which enables rapid responses while consuming less power and memory.

Mu is an efficient 330-million-parameter encoder-decoder language model optimized for small-scale deployment on NPUs. Its design was carefully tuned to fit the hardware's parallelism and memory limits and to ensure peak efficiency. The model's development drew on insights gained from Microsoft's earlier Phi models, and it was pre-trained on hundreds of billions of high-quality educational tokens. To enhance its performance despite having fewer parameters, Mu was fine-tuned using advanced techniques such as distillation and low-rank adaptation, and it also incorporates transformer upgrades such as Dual LayerNorm, Rotary Positional Embeddings (RoPE), and Grouped-Query Attention (GQA).

Initially, the Mu model will be applied to the Settings function within Windows, using natural language processing to convert user inputs into system commands. Microsoft Corporation (NASDAQ:MSFT) develops and supports global software, services, devices, and solutions.

Disclosure: None. This article was originally published at Insider Monkey.
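Grouped-Query Attention, one of the transformer upgrades the article lists, is directly relevant to Mu's memory constraints: each group of query heads shares a single key/value head, shrinking the key/value cache that on-device inference must hold in memory. The numpy sketch below illustrates the general mechanism; the shapes and head counts are illustrative, not Mu's actual configuration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def grouped_query_attention(q, k, v):
    """q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d).
    Each group of n_q_heads / n_kv_heads query heads attends
    through one shared key/value head, so only n_kv_heads of
    K and V need to be cached instead of n_q_heads."""
    n_q_heads, seq, d = q.shape
    n_kv_heads = k.shape[0]
    group = n_q_heads // n_kv_heads
    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group                               # shared KV head index
        scores = softmax(q[h] @ k[kv].T / np.sqrt(d)) # (seq, seq) attention
        out[h] = scores @ v[kv]
    return out

rng = np.random.default_rng(0)
q = rng.normal(size=(8, 16, 32))  # 8 query heads
k = rng.normal(size=(2, 16, 32))  # only 2 KV heads to cache
v = rng.normal(size=(2, 16, 32))
print(grouped_query_attention(q, k, v).shape)  # (8, 16, 32)
```

With 8 query heads sharing 2 key/value heads, the KV cache is a quarter of the size that standard multi-head attention would need, at the same output shape.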

Mitsubishi Electric Develops Edge-device Language Model for Domain-specific Manufacturing

Yahoo

18-06-2025

  • Automotive
  • Yahoo


Leverages data augmentation to optimize language-model responses for user applications

TOKYO, June 18, 2025--(BUSINESS WIRE)--Mitsubishi Electric Corporation (TOKYO: 6503) announced today that it has developed a language model tailored for manufacturing processes operating on edge devices. The Maisart®-branded AI technology has been pre-trained with data from Mitsubishi Electric's internal operations, enabling it to support a wide range of applications in specific manufacturing domains. In addition, the model leverages a uniquely developed data-augmentation technique to generate responses optimized for user-specific applications.

The widespread adoption of generative AI is accelerating the use of large language models (LLMs). However, the significant computational and energy costs associated with LLMs are a growing concern. Additionally, there is increasing demand for generative AI solutions that can operate in on-premises environments due to data privacy and confidential information management requirements.

In response, Mitsubishi Electric has developed a domain-specific language model by training a publicly available Japanese base model with the company's proprietary data from its own business domains, including factory automation (FA). Using training data generated through the company's original augmentation techniques enabled effective, task-specific fine-tuning. The resulting model is compact enough to run on limited hardware resources, making it suitable for environments with constrained computing capabilities such as edge devices, as well as for on-premises operations such as call centers that handle sensitive customer information.
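Mitsubishi Electric's augmentation technique is proprietary and not described in the release; the sketch below only illustrates the general pattern such fine-tuning pipelines use, expanding a small set of domain seed facts into a larger instruction-tuning set via templates. All seed terms, template wordings, and field names here are hypothetical.

```python
import itertools
import json

# Hypothetical domain seeds (factory-automation glossary facts).
seeds = [
    {"term": "servo amplifier", "fact": "drives and controls a servo motor"},
    {"term": "PLC", "fact": "executes ladder logic to control factory equipment"},
]

# Question/answer templates; each seed is crossed with each template.
templates = [
    ("What is a {term}?", "A {term} is a device that {fact}."),
    ("Explain the role of a {term}.", "In factory automation, a {term} {fact}."),
]

def augment(seeds, templates):
    """Expand seeds x templates into prompt/response fine-tuning rows."""
    rows = []
    for s, (q, a) in itertools.product(seeds, templates):
        rows.append({"prompt": q.format(**s), "response": a.format(**s)})
    return rows

dataset = augment(seeds, templates)
print(len(dataset))  # 4 rows from 2 seeds x 2 templates
print(json.dumps(dataset[0], indent=2))
```

Multiplying a modest amount of proprietary seed data this way is one reason a compact base model can be fine-tuned effectively for a narrow domain without large-scale data collection.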
For the full text, please visit the source version on Business Wire.

Contacts: Customer Inquiries: Information Technology R&D Center, Mitsubishi Electric. Media Inquiries: Takeyoshi Komatsu, Public Relations Division, Mitsubishi Electric Corporation.
