Latest news with #SenseTime


South China Morning Post
18 hours ago
- Business
- South China Morning Post
Alibaba unveils new AI model for image creation, as open-source approach gains recognition
Alibaba Group Holding has launched a new artificial intelligence (AI) model, Qwen VLo, said to be capable of generating and editing images with a finesse akin to that of a human artist, intensifying the competition in multimodal models as the tech giant seeks to redefine itself as an AI leader. Released on Friday, Qwen VLo was a 'comprehensive upgrade' from previous models like QwenVL and Qwen2.5 VL, the company said. It could better understand input and create more precise images, accommodate open-ended instructions, and support multiple languages, including Chinese and English. A preview is now available on Qwen Chat. Qwen VLo also supports diverse input and output formats, offering increased flexibility for users and making it ideal for creating posters, illustrations, web banners, and social media covers. Alibaba owns the South China Morning Post. The new model adds to the intense competition in China's AI landscape, as rivals such as ByteDance and SenseTime strive to introduce their own multimodal models designed to interpret various types of input data, including text, video, and audio. In contrast, traditional AI models only handle one type of input. Alibaba has been doubling down on AI and cloud computing, as it moves to streamline its sprawling operations. In February, the company pledged to invest more than 380 billion yuan (US$52 billion) in AI infrastructure over the next three years.


South China Morning Post
12-06-2025
- Business
- South China Morning Post
ByteDance, SenseTime unveil model updates as China's AI race heats up
China's artificial intelligence (AI) market is seeing heightened competition, as various model providers – from SenseTime to ByteDance – step up efforts to enhance their services. Hong Kong-listed SenseTime has upgraded its Cantonese-speaking chatbot, SenseChat, with a comprehensive set of new features that include real-time audio and video-interaction capabilities, according to the company's announcement on Thursday. Other enhancements include visual reasoning capabilities, which allow SenseChat to 'see' and 'think' while engaging with users. The feature was made possible by the multimodal reasoning capabilities of SenseTime's SenseNova V6 AI model, the company said. Multimodal models are designed to understand multiple types of input data such as text, video and audio, unlike traditional models that only handle one type. That upgrade comes a day after TikTok parent ByteDance launched a suite of new AI models and tools at reduced prices, underscoring the intensifying competition in the domestic market after Chinese start-up DeepSeek's cost-effective products garnered global attention.


Korea Herald
12-04-2025
- Business
- Korea Herald
SenseTime's SenseNova V6: China's Most Advanced Multimodal Model with the Lowest Cost in the Industry
Integrating AI into Everyday Life

HONG KONG, April 12, 2025 /PRNewswire/ -- SenseTime launched its newly upgraded large model series, SenseNova V6, at its Tech Day event held in several locations, including Shanghai and Shenzhen. Leveraging advances in the training of multimodal long chain-of-thought (CoT), global memory, and reinforcement learning, the model delivers industry-leading multimodal reasoning capabilities while setting a new benchmark for cost efficiency.

The capabilities of the SenseNova V6 model have been greatly enhanced, with strong advantages in long CoT, reasoning, mathematical capabilities, and global memory. Its multimodal reasoning capabilities ranked first in China when benchmarked against GPT-o1, while its data analysis performance outpaced GPT-4o. It also combines high performance with cost efficiency. Its multimodal training efficiency is aligned with that of language models, providing the lowest training costs in the industry. Its reasoning costs are also the lowest in the industry. The new lightweight full-modal interactive model, SenseNova V6 Omni, delivers the most advanced multimodal interactive capabilities in China. It is China's first large model that supports in-depth analysis of 10-minute mid-to-long form videos, benchmarked against Gemini 2.5 Turbo to be among the strongest in its class.

Dr. Xu Li, Chairman of the Board and CEO of SenseTime, said, "AI's true purpose is found in our everyday lives. SenseNova V6 has pushed past the boundaries of multimodality, unlocking infinite possibilities in reasoning and intelligence."

Multimodal long-chain reasoning, reinforcement learning, and global memory: SenseNova V6 leads the way in enabling multimodal deep thinking

As a native Mixture of Experts (MoE)-based multimodal general foundation model with over 600 billion parameters, SenseNova V6 has achieved multiple technological breakthroughs.
A single model is able to perform a range of tasks across text and multimodal domains. In leading benchmark evaluations of reasoning and multimodal capabilities, SenseNova V6 achieved state-of-the-art results across multiple metrics.

Based on more than 200B of high-quality multimodal long CoT data, SenseTime leverages multi-agent collaboration to synthesize and verify long CoT. SenseNova V6 has developed exceptional multimodal reasoning capabilities, supporting multimodal long CoTs up to 64K tokens, enabling the model's long-term thinking capability.

In solving complex real-world problems, SenseNova V6 utilizes its robust hybrid image and text understanding and reasoning capabilities to help users with a range of tasks. For complex document processing scenarios, SenseNova V6 is able to help users with difficult tasks through its strong multimodal reasoning capabilities. For example, in insurance claims processing, SenseNova V6 can assess whether the submitted commercial health insurance claims meet the requirements. It can detect issues such as unnecessary prescriptions and examinations, missing documents, or incomplete submissions.

Leveraging breakthroughs in multimodal reinforcement learning, SenseTime has developed a hybrid reinforcement learning framework for various image-text tasks, based on different difficulty levels and multi-reward models.

China's first model to break the 10-minute barrier in video understanding, achieving analysis of extended content within seconds

With its global memory capability, SenseNova V6 overcomes the limitations of traditional models that could only support short videos, and now supports full-framerate analysis of 10-minute videos. With advanced comprehension capabilities, SenseNova V6 is also able to intelligently edit and extract video highlights, helping users to retain memorable moments.
SenseTime's proprietary technology aligns visual information (images), auditory information (speech and sounds), linguistic information (subtitles and spoken language), and temporal logic to form a multimodal unified sequential representation. Based on this framework, it applies fine-grained cascading compression and content-aware dynamic filtering to achieve high-ratio compression of long videos. A 10-minute video can be compressed into 16K tokens while retaining key semantics.

Human-like interaction: SenseNova V6 Omni launches with multi-industry deployment

With the launch of SenseNova V6, SenseTime has upgraded its real-time interactive unified large model to SenseNova V6 Omni, with deep optimizations across scenarios, including role-playing, translation and reading, cultural tourism guiding, picture book narration, and mathematical explanation. In translation and reading scenarios, SenseNova V6 Omni enables users to achieve precise spatial interactions with a simple finger gesture. The model also accurately understands the relationship between local and global information, providing a more intuitive and human-like interactive experience.

SenseNova V6 Omni features more human-like perceptual and expressive abilities, as well as emotional understanding. It has been deployed across multiple industries and scenarios, including embodied intelligence, becoming the first commercialized full-modality real-time interactive model in China.

Full-featured version of SenseChat launched, now available for preview

SenseTime has released a comprehensive update to SenseChat, along with a brand-new app built on the complete capabilities of SenseNova V6. Through a single access point, users can engage in seamless multimodal interactive streaming experiences across text, images, and video.
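As a back-of-the-envelope reading of the compression figure quoted in this release (a 10-minute video into 16K tokens), the sketch below works out the implied average token budget per second of video. It assumes "16K" means 16,000 tokens (it could equally mean 16,384), and the helper name `tokens_per_second` is hypothetical; neither comes from SenseTime.

```python
def tokens_per_second(total_tokens: int, duration_minutes: float) -> float:
    """Average token budget per second of video (hypothetical helper)."""
    return total_tokens / (duration_minutes * 60)


# Assumption: "16K" is read as 16,000 tokens; the release does not say.
budget = tokens_per_second(16_000, 10)
print(f"{budget:.1f} tokens per second of video")
```

Under that assumption the budget comes to roughly 27 tokens per second of footage, which gives a sense of how aggressive the described compression would have to be.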
The SenseChat app is available for preview and SenseNova V6 is now available for trial via the SenseChat web platform.

RMB100 million in vouchers released to accelerate full-stack scenario implementation

SenseTime also announced a dedicated subsidy of RMB100 million, aimed at advancing emerging fields such as embodied intelligence and AIGC. Through targeted and multi-dimensional initiatives, SenseTime is delivering a one-stop solution designed for high efficiency, low cost, and end-to-end AI implementation, spanning expert consulting, model training, and reasoning validation.

- End -

About SenseTime

SenseTime is a leading AI software company focused on creating a better AI-empowered future through innovation. We are committed to advancing the state of the art in AI research, developing scalable and affordable AI software platforms that benefit businesses, people and society as a whole, while attracting and nurturing top talents to shape the future together. With our roots in the academic world, we invest in our original and cutting-edge research that allows us to offer and continuously improve industry-leading AI capabilities in universal multimodal and multi-task models, covering key fields across perception intelligence, natural language processing, decision intelligence, AI-enabled content generation, as well as key capabilities in AI chips, sensors and computing infrastructure. Our proprietary AI infrastructure, SenseCore, integrates computing power, algorithms, and platforms, enabling us to build the "SenseNova" foundation model sets and R&D system that unlocks the ability to perform general AI tasks at low cost and with high efficiency. Our technologies are trusted by customers and partners in many industry verticals including Generative AI, Computer Vision and Smart Auto.
SenseTime has been actively involved in the development of national and international industry standards on data security, privacy protection, ethical and sustainable AI, working closely with multiple domestic and multilateral institutions on ethical and sustainable AI development. SenseTime was the only AI company in Asia to have its Code of Ethics for AI Sustainable Development selected by the United Nations as one of the key publication references in the United Nations Resource Guide on AI Strategies, which was published in June 2021. SenseTime Group Inc. has successfully listed on the Main Board of the Stock Exchange of Hong Kong Limited (HKEX). We have offices in markets including Hong Kong, Shanghai, Beijing, Shenzhen, Chengdu, Hangzhou, Nanping, Qingdao, Xi'an, Macau, Kyoto, Tokyo, Singapore, Riyadh, Abu Dhabi, Dubai, Kuala Lumpur and South Korea, as well as a presence in Germany, Thailand, Indonesia and the Philippines. For more information, please visit SenseTime's official website or LinkedIn, X, Facebook and YouTube pages.


South China Morning Post
12-04-2025
- Business
- South China Morning Post
SenseTime to expand computing power amid surging AI model demand
SenseTime, an artificial intelligence (AI) pioneer in China, is set to expand its computing power by up to a triple-digit percentage annually in the next two years, as the company continues its efforts to increase the use of domestic chips amid an intensified tech war. Yang Fan, co-founder of SenseTime and president of the SenseCore business group, the company's AI infrastructure unit, said the company's computing power would maintain a 'high double-digit to triple-digit' annual growth rate in the coming 24 months, signalling a strategy to capitalise on surging demand for generative AI models. In 2024, the total computing power operated by SenseCore grew by 92 per cent year on year to over 23,000 petaflops. One of its major efforts in recent years has been to increase the adoption of home-grown chips to mitigate risks from the ongoing US-China tech war. The company launched an upgraded version of SenseCore on Thursday, featuring better performance in computing, among other areas, as well as some industry-wide solutions aimed at accelerating the commercialisation of its infrastructure. The move reflects SenseTime's efforts to capitalise on surging AI demand, fuelled by OpenAI's GPT models and more recently the open-source models from China's DeepSeek, as it targets its first full-year profit in 2026. SenseTime, founded in Hong Kong in 2014 and best known for its AI and computer vision software, is also an early mover in building up computing infrastructure. The company began this effort as early as 2018, according to Yang.
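To put the growth figures in this report in concrete terms, the sketch below backs out the implied 2023 capacity from the 92 per cent rise to 23,000 petaflops, then projects two years ahead. The 50 and 100 per cent rates are illustrative stand-ins for "high double-digit to triple-digit" and are assumptions, not figures from SenseTime. The function names are hypothetical, and because the article says "over 23,000", the results are lower bounds.

```python
def prior_year(current: float, growth_pct: float) -> float:
    """Back out last year's value from this year's value and its growth rate."""
    return current / (1 + growth_pct / 100)


def project(current: float, annual_growth_pct: float, years: int) -> float:
    """Compound a value forward at a fixed annual growth rate."""
    return current * (1 + annual_growth_pct / 100) ** years


PETAFLOPS_2024 = 23_000  # "over 23,000 petaflops", treated as a floor

implied_2023 = prior_year(PETAFLOPS_2024, 92)
low_2026 = project(PETAFLOPS_2024, 50, 2)    # assumed 50% a year
high_2026 = project(PETAFLOPS_2024, 100, 2)  # assumed 100% a year

print(f"Implied 2023 capacity: about {implied_2023:,.0f} petaflops")
print(f"Two-year projection: {low_2026:,.0f} to {high_2026:,.0f} petaflops")
```

Under these assumed rates, capacity would land somewhere between roughly two and four times the 2024 level after 24 months.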


South China Morning Post
11-04-2025
- Business
- South China Morning Post
Chinese AI firm SenseTime bets on multimodal models to stand out from rivals
SenseTime, an artificial intelligence (AI) pioneer in China, has launched new models that it claims surpass OpenAI products in reasoning capabilities, as it bets on multimodal models to secure its position in the competitive AI landscape. The company on Thursday unveiled SenseNova V6 and V6 Reasoner, new iterations of its self-developed AI model series. V6 outperformed OpenAI's GPT-4o across several metrics, including fact-checking, numerical reasoning, data analysis and visualisation, according to SenseTime chairman and CEO Xu Li, citing data from benchmarking platform TableBench. With 600 billion parameters, V6 is China's leading model in multimodal reasoning and also the most cost-effective option for inference across the industry, according to the company. Xu also said that V6 Reasoner outperformed OpenAI's o1 and Google's Gemini 2.0 Flash Thinking in multimodal reasoning abilities. The advances are designed to address an industry-wide challenge: the depletion of high-quality text data for training large language models (LLMs). Unlike traditional LLMs that focus primarily on text, multimodal LLMs integrate various modalities – such as images, audio and video – to improve comprehension and generation capabilities. The industry's initial strategy of expanding model parameters under the scaling law had 'hit a wall', Xu said in an interview in Shanghai on Thursday. 'We've nearly exhausted all text data that can be collected from the internet,' he said.