Latest news with #AIModels


Forbes
5 days ago
- Business
- Forbes
Why Low-Precision Computing Is The Future Of Sustainable, Scalable AI
Lee-Lean Shu, CEO, GSI Technology.

The staggering computational demands of AI have become impossible to ignore. McKinsey estimates that training an AI model costs $4 million to $200 million per training run. The environmental impact is also particularly alarming: training a single large language model can emit as much carbon as five gasoline-powered cars over their entire lifetimes. When enterprise adoption requires server farms full of energy-hungry GPUs just to run basic AI services, we face both an economic and an ecological crisis.

This dual challenge is now shining a spotlight on low-precision AI, a method of running artificial intelligence models using lower-precision numerical representations for the calculations. Unlike traditional AI models that rely on high-precision, memory-intensive storage (such as 32-bit floating-point numbers), low-precision AI uses smaller numerical formats, such as 8-bit or 4-bit integers or smaller, to perform faster and more memory-efficient computations. This approach lowers the cost of developing and deploying AI by reducing hardware requirements and speeding up processing.

The environmental benefits of low-precision AI are particularly important. It helps mitigate climate impact by optimizing computations to use less power. Many of the most resource-intensive AI efforts are building out or considering their own data centers. Because low-precision models require fewer resources, they let companies and researchers innovate on lower-cost, high-performance computing infrastructure, further decreasing energy consumption.

Research shows that by reducing numerical precision from 32-bit floats to 8-bit integers (or lower), most AI applications can maintain accuracy while slashing power consumption by four to five times. We have seen Nvidia GPU architectures, for instance, move from FP32 to FP16 and INT8 over several generations and families. This is achieved through a process called quantization, which effectively maps floating-point values to a discrete set of integer values. There are now even efforts to quantize models to INT4, which would further reduce computational overhead and energy usage, enabling AI models to run more efficiently on low-power devices like smartphones, IoT sensors and edge computing systems.
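
To make the quantization mapping described above concrete, here is a minimal NumPy sketch of symmetric per-tensor INT8 quantization. The random weight tensor, the single per-tensor scale factor and the helper names are illustrative assumptions for the example, not a description of GSI Technology's or Nvidia's implementations.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric quantization: map float32 values onto the int8 grid [-127, 127]."""
    scale = np.max(np.abs(x)) / 127.0          # one scale factor for the whole tensor
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 values from the int8 codes."""
    return q.astype(np.float32) * scale

# Illustrative weight tensor standing in for one layer of a trained model.
weights = np.random.randn(1024, 1024).astype(np.float32)

q, scale = quantize_int8(weights)
reconstructed = dequantize(q, scale)

print("memory (FP32):", weights.nbytes // 1024, "KiB")   # 4 bytes per value
print("memory (INT8):", q.nbytes // 1024, "KiB")         # 1 byte per value
print("max abs error:", np.max(np.abs(weights - reconstructed)))
```

The four-fold storage reduction it prints is the same memory and bandwidth saving the article attributes to INT8; production toolchains typically add per-channel scales and calibration data to hold accuracy.
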
The 32-Bit Bottleneck

For decades, sensor data, whether time-series signals or multidimensional tensors, has been processed as 32-bit floating-point numbers by default. This standard wasn't necessarily driven by how the data was captured from physical sensors, but rather by software compatibility and the historical belief that maintaining a single format throughout the processing pipeline ensured accuracy and simplicity. However, modern systems, especially those leveraging GPUs, have introduced more flexibility, challenging the long-standing reliance on 32-bit floats.

In traditional digital signal processing (DSP), for instance, 32-bit floats were the gold standard. Even early neural networks, trained on massive datasets, defaulted to 32-bit to ensure greater stability. But as AI moved from research labs to real-world applications, especially on edge devices, the limitations of 32-bit became clear. As data-processing requirements have multiplied, particularly for tensor-based AI workloads, 32-bit floats have put tremendous pressure on memory storage as well as on bus transfers between that storage and dynamic processing.

The result is higher compute and storage costs and immense amounts of wasted power, with only small increases in compute performance per major hardware upgrade. In other words, memory bandwidth, power consumption and compute latency are all suffering under the weight of unnecessary precision. This problem is acutely evident in large language models, where the massive scale of parameters and computations magnifies these inefficiencies.

The Implementation Gap

Despite extensive research into low-precision AI, real-world adoption has lagged behind academic progress, with many deployed applications still relying on FP32 and FP16/BF16 precision levels. While OpenCV has long supported low-precision formats like INT8 and INT16 for traditional image processing, its OpenCV 5 release, slated for summer 2025, plans to expand support for low-precision deep learning inference, including formats like bfloat16. That this shift is only now becoming a priority in one of the most widely used vision libraries is a telling indicator of how slowly some industry practices around efficient inference are evolving.

This implementation gap persists even as studies consistently demonstrate the potential for four to five times improvements in power efficiency through precision reduction. The slow adoption stems from several interconnected factors, primarily hardware limitations: current GPU architectures contain a limited number of specialized processing engines optimized for specific bit-widths, with most resources dedicated to FP16/BF16 operations while INT8/INT4 capabilities remain constrained.

However, low-precision computing is proving that many tasks don't need 32-bit floats. Speech recognition models, for instance, now run efficiently in INT8 with minimal loss in accuracy. Convolutional neural networks (CNNs) for image classification can achieve near-floating-point performance with 4-bit quantized weights. Even in DSP, techniques like fixed-point FIR filtering and logarithmic number systems (LNS) enable efficient signal processing without the traditional floating-point overhead.
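
As a rough illustration of the fixed-point FIR filtering mentioned above, the sketch below runs a small moving-average low-pass filter entirely in integer arithmetic using Q15 coefficients; the tap values, input signal and bit-widths are invented for the example rather than taken from any cited system.

```python
import numpy as np

Q = 15                          # Q15 fixed point: value = integer / 2**15
SCALE = 1 << Q

# Illustrative 5-tap moving-average low-pass filter (floating-point reference).
taps_f32 = np.full(5, 0.2, dtype=np.float32)
taps_q15 = np.round(taps_f32 * SCALE).astype(np.int16)     # quantized coefficients

# Illustrative input signal, quantized to int16 as if read from a 16-bit ADC.
t = np.arange(256)
signal_f32 = np.sin(2 * np.pi * t / 32).astype(np.float32)
signal_q15 = np.round(signal_f32 * (SCALE - 1)).astype(np.int16)

# Convolve in integer arithmetic with a wide (64-bit) accumulator, then shift
# the result back down to Q15. No floating-point operations are involved.
acc = np.convolve(signal_q15.astype(np.int64), taps_q15.astype(np.int64))
filtered_q15 = (acc >> Q).astype(np.int16)

# Compare against the ordinary float32 filter.
reference = np.convolve(signal_f32, taps_f32)
max_dev = np.max(np.abs(filtered_q15.astype(np.float32) / (SCALE - 1) - reference))
print("max deviation from float32 reference:", max_dev)
```

The key design point is accumulating the products in a wider integer type before shifting back down to Q15, which is how fixed-point DSP avoids both overflow and floating-point hardware.
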
The Promise Of Flexible Architectures

A key factor slowing the transition to low-precision AI is the need for specialized hardware with dedicated processing engines optimized for different bit-widths. Current GPU architectures, while powerful, face inherent limitations in their execution units. Most modern GPUs prioritize FP16/BF16 operations with a limited number of dedicated INT8/INT4 engines, creating an imbalance in computational efficiency. For instance, while NVIDIA's Tensor Cores support INT8 operations, real-world INT4 throughput is often constrained not by a lack of hardware capability but by limited software optimization and quantization support, dampening potential performance gains. This practical bias toward higher-precision formats forces developers to weigh trade-offs between efficiency and compatibility, slowing the adoption of ultra-low-precision techniques.

The industry is increasingly recognizing the need for hardware architectures specifically designed to handle variable-precision workloads efficiently. Several semiconductor companies and research institutions are working on processors that natively support 1-bit operations and seamlessly scale across different bit-widths, from binary (INT1) and ternary (1.58-bit) up to INT4, INT8 or even arbitrary bit-widths like 1024-bit.

This hardware-level flexibility allows researchers to explore precision as a tunable parameter, optimizing for speed, accuracy or power efficiency on a per-workload basis. For example, a 4-bit model could run just as efficiently as an INT8 or INT16 version on the same hardware, opening new possibilities for edge AI, real-time vision systems and adaptive deep learning.

These new hardware designs have the potential to accelerate the shift toward dynamic precision scaling. Rather than being constrained by rigid hardware limitations, developers could experiment with ultra-low-precision networks for simple tasks while reserving higher precision only where absolutely necessary. This could result in faster innovation, broader accessibility and a more sustainable AI ecosystem.
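
To make the idea of precision as a tunable parameter concrete, here is a minimal sketch that picks the smallest bit-width meeting a given reconstruction-error budget for an illustrative weight tensor. The uniform symmetric scheme, the bit-width ladder and the weight-level error metric are assumptions for the example; a real deployment would judge precision against task accuracy rather than raw weight error.

```python
import numpy as np

def quantize(x: np.ndarray, bits: int):
    """Uniform symmetric quantization of a float32 tensor to a signed `bits`-wide grid."""
    qmax = 2 ** (bits - 1) - 1                    # e.g. 7 for INT4, 127 for INT8
    scale = np.max(np.abs(x)) / qmax
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q, scale

def pick_precision(weights: np.ndarray, max_mean_error: float) -> int:
    """Return the smallest bit-width whose reconstruction error meets the budget."""
    for bits in (2, 4, 8, 16):
        q, scale = quantize(weights, bits)
        err = np.mean(np.abs(weights - q * scale))
        if err <= max_mean_error:
            return bits
    return 32                                     # fall back to full precision

# Illustrative layer weights; a real system would use the trained model's tensors.
weights = np.random.randn(512, 512).astype(np.float32)

for budget in (0.5, 0.05, 0.005):
    print(f"error budget {budget}: INT{pick_precision(weights, budget)}")
```
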


Tahawul Tech
19-06-2025
- Health
- Tahawul Tech
drug discovery Archives
The synthetic data, which SandboxAQ is releasing publicly, can be used to train AI models that can predict whether a new drug molecule is likely to stick to the protein researchers are targeting.


Phone Arena
17-06-2025
- Phone Arena
Adobe's main generative AI app is now available on iOS and Android
– Alexandru Costin, VP, Generative AI, Adobe, June 2025

Adobe Firefly for mobile | Image credits: Adobe

After bringing its Photoshop app to Android devices last month, Adobe is now making phone users happier with the launch of yet another one of its popular apps. For the unaware, Adobe Firefly is a suite of creative generative AI models designed by Adobe that lets users create images, videos and other content by simply using text prompts or by modifying existing content. Firefly started as part of Adobe's Creative Cloud family of apps and as a standalone web app, but starting today the AI app is also available on mobile. The Adobe Firefly app is available on both Android and iOS devices and promises to bring all the main features of the desktop app, including Generative Fill, Generative Expand, Text to Image, Text to Video and Image to Video.

More importantly, Adobe allows Firefly users to choose between its commercially safe Firefly models and partner models from Google and OpenAI, depending on their needs for Text to Image, Text to Video and Image to Video. Adobe also announced a new list of AI models in the Firefly partner ecosystem, so if you're planning to use the app on your phone, here are the choices:

Image models: Black Forest Labs' Flux 1.1 Pro and Flux.1 Kontext; Ideogram's Ideogram 3.0; Google's Imagen 3 and Imagen 4; OpenAI's image generation model; and Runway's Gen-4 Image

Video models: Google's Veo 2 and Veo 3; Luma AI's Ray2; and Pika's text-to-video generator

The models above are available alongside Adobe's family of commercially safe, IP-friendly Firefly models for images, video, audio and vectors. Adobe also confirmed that anything Android and iOS users create in the Firefly mobile app will automatically be synced with their Creative Cloud account, allowing them to start creating on mobile and pick up on desktop (or vice versa).


Bloomberg
17-06-2025
- Climate
- Bloomberg
Huawei's AI Weather Model Among Top Performers in China Tests
China's weather agency is testing more than a dozen artificial intelligence models in an effort to enhance its forecasting, with a system from Huawei Technologies Co. showing accelerated improvement. The best models from the trial will be prioritized for deployment by provincial bureaus, and granted priority access to official weather data, according to the China Meteorological Administration, which is running the program. The CMA has said it wants to ensure 'orderly and standardized development' as the technology rapidly develops at home and abroad.