Latest news with #AndrewWee


Mint
5 days ago
- Business
- Mint
The new chips designed to solve AI's energy problem
'I can't wrap my head around it," says Andrew Wee, who has been a Silicon Valley data-center and hardware guy for 30 years. The 'it" that has him so befuddled—irate, even—is the projected power demands of future AI supercomputers, the ones that are supposed to power humanity's great leap forward. Wee held senior roles at Apple and Meta, and is now head of hardware for cloud provider Cloudflare. He believes the current growth in energy required for AI—which the World Economic Forum estimates will be 50% a year through 2030—is unsustainable. 'We need to find technical solutions, policy solutions and other solutions that solve this collectively," he says. To that end, Wee's team at Cloudflare is testing a radical new kind of microchip, from a startup founded in 2023, called Positron, which has just announced a fresh round of $51.6 million in investment. These chips have the potential to be much more energy efficient than ones from industry leader Nvidia at the all-important task of inference, which is the process by which AI responses are generated from user prompts. While Nvidia chips will continue to be used to train AI for the foreseeable future, more efficient inference could collectively save companies tens of billions of dollars, and a commensurate amount of energy. There are at least a dozen chip startups all battling to sell cloud-computing providers the custom-built inference chips of the future. Then there are the well-funded, multiyear efforts by Google, Amazon and Microsoft to build inference-focused chips to power their own internal AI tools, and to sell to others through their cloud services. The intensity of these efforts, and the scale of the cumulative investment in them, show just how desperate every tech giant—along with many startups—is to provide AI to consumers and businesses without paying the 'Nvidia tax." That's Nvidia's approximately 60% gross margin, the price of buying the company's hardware. Nvidia is very aware of the growing importance of inference and concerns about AI's appetite for energy, says Dion Harris, a senior director at Nvidia who sells the company's biggest customers on the promise of its latest AI hardware. Nvidia's latest Blackwell systems are between 25 and 30 times as efficient at inference, per watt of energy pumped into them, as the previous generation, he adds. To accomplish their goals, makers of novel AI chips are using a strategy that has worked time and again: They are redesigning their chips, from the ground up, expressly for the new class of tasks that is suddenly so important in computing. In the past, that was graphics, and that's how Nvidia built its fortune. Only later did it become apparent graphics chips could be repurposed for AI, but arguably it's never been a perfect fit. Jonathan Ross is chief executive of chip startup Groq, and previously headed Google's AI chip development program. He says he founded Groq (no relation to Elon Musk's xAI chatbot) because he believed there was a fundamentally different way of designing chips—solely to run today's AI models. Groq claims its chips can deliver AI much faster than Nvidia's best chips, and for between one-third and one-sixth as much power as Nvidia's. This is due to their unique design, which has memory embedded in them, rather than being separate. While the specifics of how Groq's chips perform depends on any number of factors, the company's claim that it can deliver inference at a lower cost than is possible with Nvidia's systems is credible, says Jordan Nanos, an analyst at SemiAnalysis who spent a decade working for Hewlett Packard Enterprise. Positron is taking a different approach to delivering inference more quickly. The company, which has already delivered chips to customers including Cloudflare, has created a simplified chip with a narrower range of abilities, in order to perform those tasks more quickly. The company's latest funding round came from Valor Equity Partners, Atreides Management and DFJ Growth, and brings the total amount of investment in the company to $75 million. Positron's next-generation system will compete with Nvidia's next-generation system, known as Vera Rubin. Based on Nvidia's road map, Positron's chips will have two to three times better performance per dollar, and three to six times better performance per unit of electricity pumped into them, says Positron CEO Mitesh Agrawal. Competitors' claims about beating Nvidia at inference often don't reflect all of the things customers take into account when choosing hardware, says Harris. Flexibility matters, and what companies do with their AI chips can change as new models and use cases become popular. Nvidia's customers 'are not necessarily persuaded by the more niche applications of inference," he adds. Cloudflare's initial tests of Positron's chips were encouraging enough to convince Wee to put them into the company's data centers for more long-term tests, which are continuing. It's something that only one other chip startup's hardware has warranted, he says. 'If they do deliver the advertised metrics, we will open the spigot and allow them to deploy in much larger numbers globally," he adds. By commoditizing AI hardware, and allowing Nvidia's customers to switch to more-efficient systems, the forces of competition might bend the curve of future AI power demand, says Wee. 'There is so much FOMO right now, but eventually, I think reason will catch up with reality," he says. One truism of the history of computing is that whenever hardware engineers figure out how to do something faster or more efficiently, coders—and consumers—figure out how to use all of the new performance gains, and then some. Mark Lohmeyer is vice president of AI and computing infrastructure for Google Cloud, where he provides both Google's own custom AI chips, and Nvidia's, to Google and its cloud customers. He says that consumer and business adoption of new, more demanding AI models means that no matter how much more efficiently his team can deliver AI, there is no end in sight to growth in demand for it. Like nearly all other big AI providers, Google is making efforts to find radical new ways to produce energy to feed that AI—including both nuclear power and fusion. The bottom line: While new chips might help individual companies deliver AI more efficiently, the industry as a whole remains on track to consume ever more energy. As a recent report from Anthropic notes, that means energy production, not data centers and chips, could be the real bottleneck for future development of AI. Write to Christopher Mims at

Wall Street Journal
5 days ago
- Business
- Wall Street Journal
The New Chips Designed to Solve AI's Energy Problem
'I can't wrap my head around it,' says Andrew Wee, who has been a Silicon Valley data-center and hardware guy for 30 years. The 'it' that has him so befuddled—irate, even—is the projected power demands of future AI supercomputers, the ones that are supposed to power humanity's great leap forward. Wee held senior roles at Apple and Meta, and is now head of hardware for cloud provider Cloudflare. He believes the current growth in energy required for AI—which the World Economic Forum estimates will be 50% a year through 2030—is unsustainable.