Latest news with #MoorInsights


Forbes
15 hours ago
- Business
- Forbes
AMD Keeps Building Momentum In AI, With Plenty Of Work Still To Do
At the AMD Advancing AI event, CEO Lisa Su touted the company's AI compute portfolio.

At the AMD Advancing AI event in San Jose earlier this month, CEO Lisa Su and her staff showcased the company's progress across many different facets of AI. They had plenty to announce in both hardware and software, including significant performance gains for GPUs, ongoing advances in the ROCm development platform and the forthcoming introduction of rack-scale infrastructure. There were also many references to trust and strong relationships with customers and partners, which I liked, and a lot of emphasis on open hardware and an open development ecosystem, which I think is less of a clear winner for AMD, as I'll explain later. Overall, I think the event was important for showing how AMD is moving the ball down the field for customers and developers.

Under Su, AMD's M.O. is to have clear, ambitious plans and execute against them. Her 'say/do' ratio is high. The company does what it says it will do. This is exactly what it must continue doing to whittle away at Nvidia's dominance in the datacenter AI GPU market. What I saw at the Advancing AI event raised my confidence from last year — although there are a few gaps that need to be addressed. (Note: AMD is an advisory client of my firm, Moor Insights & Strategy.)

AMD's AI Market Opportunity And Full-Stack Strategy

When she took the stage, Su established the context for AMD's announcements by describing the staggering growth that is the backdrop for today's AI chip market. Just take a look at the chart below.

So far, AMD's bullish projections for the growth of the AI chip market have turned out to be accurate.

So this segment of the chip industry is looking at a TAM of half a trillion dollars by 2028, with the whole AI accelerator market increasing at a 60% CAGR. The AI inference sub-segment — where AMD competes on better footing with Nvidia — is enjoying an 80% CAGR. People thought that the market numbers AMD cited last year were too high, but not so. This is the world we're living in. For the record, I never doubted the TAM numbers last year.

AMD is carving out a bigger place in this world for itself. As Su pointed out, its Instinct GPUs are used by seven of the 10 largest AI companies, and they drive AI for Microsoft Office, Facebook, Zoom, Netflix, Uber, Salesforce and SAP. Its EPYC server CPUs continue to put up record market share (40% last quarter), and it has built out a full stack — partly through smart acquisitions — to support its AI ambitions. I would point in particular to the ZT Systems acquisition and the introduction of the Pensando DPU and the Pollara NIC.

GPUs are at the heart of datacenter AI, and AMD's new MI350 series was in the spotlight at this event. Although these chips were slated to ship in Q3, Su said that production shipments had in fact started earlier in June, with partners on track to launch platforms and public cloud instances in Q3. There were cheers from the crowd when they heard that the MI350 delivers a 4x performance improvement over the prior generation. AMD says that its high-end MI355X GPU outperforms the Nvidia B200 to the tune of 1.6x memory, 2.2x compute throughput and 40% more tokens per dollar. (Testing by my company Signal65 showed that the MI355X running DeepSeek-R1 produced up to 1.5x higher throughput than the B200.) To put it in a different perspective, a single MI355X can run a 520-billion-parameter model.
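The 520-billion-parameter claim is easy to sanity-check with back-of-envelope math. The sketch below assumes roughly 288 GB of HBM per MI355X (a figure AMD has published elsewhere, treated here as an assumption since it isn't in this article) and counts only weight storage, ignoring KV cache and activations.

```python
# Rough check (not AMD's methodology): can one GPU hold a 520B-parameter model?
def model_weight_footprint_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate memory needed just to store the model weights, in GB."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

HBM_PER_GPU_GB = 288  # assumed MI355X memory capacity, not stated in this article

for bits in (16, 8, 4):
    needed = model_weight_footprint_gb(520, bits)
    verdict = "fits" if needed <= HBM_PER_GPU_GB else "does not fit"
    print(f"520B params at {bits}-bit weights: ~{needed:.0f} GB -> {verdict} in {HBM_PER_GPU_GB} GB of HBM")
```

Under those assumptions, 16-bit or 8-bit weights would not fit on a single GPU, so the claim presumably assumes low-precision (roughly 4-bit) weights, which need only about 260 GB.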
And I wasn't surprised when Su and others onstage looked ahead to even better performance — maybe 10x better — projected for the MI400 series and beyond. That puts us into the dreamland of an individual GPU running a trillion-parameter model.

By the way, AMD has not forgotten for one second that it is a CPU company. The EPYC Venice processor scheduled to hit the market in 2026 should be better at absolutely everything — 256 high-performance cores, 70% more compute performance than the current generation and so on. EPYC's rapid gains in datacenter market share over the past few years are no accident, and at this point all the company needs to do for CPUs is hold steady on its current up-and-to-the-right trajectory. I am hopeful that Signal65 will get a crack at testing the claims the company made at the event.

This level of performance is needed in the era of agentic AI and a landscape of many competing and complementary AI models. Su predicts — and I agree — that there will be hundreds of thousands of specialized AI models in the coming years. This is specifically true for enterprises that will have smaller models focused on areas like CRM, ERP, SCM, HCM, legal, finance and so on. To support this, AMD talked at the event about its plan to sustain an annual cadence of Instinct accelerators, adding a new generation every year. Easy to say, hard to do — though, again, AMD has a high say/do ratio these days.

AMD's 2026 Rack-Scale Platform And Current Software Advances

On the hardware side, the biggest announcement was the forthcoming Helios rack-scale GPU product that AMD plans to deliver in 2026. This is a big deal, and I want to emphasize how difficult it is to bring together high-performing CPUs (EPYC Venice), GPUs (MI400) and networking chips (next-gen Pensando Vulcano NICs) in a liquid-cooled rack. It's also an excellent way to take on Nvidia, which makes a mint off of its own rack-scale offerings for AI. At the event, Su said she believes that Helios will be the new industry standard when it launches next year (and cited a string of specs and performance numbers to back that up). It's good to see AMD provide a roadmap this far out, but it also had to after Nvidia did at the GTC event earlier this year.

On the software side, Vamsi Boppana, senior vice president of the Artificial Intelligence Group at AMD, started off by announcing the arrival of ROCm 7, the latest version of the company's open source software platform for GPUs. Again, big improvements come with each generation — in this case, a 3.5x gain in inference performance compared to ROCm 6. Boppana stressed the very high cadence of updates for AMD software, with new features being released every two weeks. He also talked about the benefits of distributed inference, which allows the two steps of inference to be tasked to separate GPU pools, further speeding up the process. Finally, he announced — to a chorus of cheers — the AMD Developer Cloud, which makes AMD GPUs accessible from anywhere so developers can use them to test-drive their ideas.

Last year, Meta had kind things to say about ROCm, and I was impressed because Meta is the hardest 'grader' next to Microsoft. This year, I heard companies talking about both training and inference, and again I'm impressed. (More on that below.) It was also great getting some time with Anush Elangovan, vice president for AI software at AMD, for a video I shot with him. Elangovan is very hardcore, which is exactly what AMD needs. Real grinders. Nightly code drops.
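Boppana's reference to distributed inference, handing the two phases of inference to separate GPU pools, generally means splitting prompt processing (prefill) from token generation (decode). The toy scheduler below is only a conceptual sketch of that split; it is not ROCm code, and the class names, pool sizes and request fields are invented for illustration.

```python
# Conceptual sketch of disaggregated inference: the compute-heavy prefill phase
# and the memory-bandwidth-heavy decode phase run on separate GPU pools so each
# can be batched and scaled independently. Purely illustrative.
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Request:
    prompt: str
    max_new_tokens: int
    kv_cache: object = None                     # produced by prefill, consumed by decode
    generated: list = field(default_factory=list)

class GPUPool:
    def __init__(self, name: str, num_gpus: int):
        self.name = name
        self.num_gpus = num_gpus
        self.queue = deque()

    def submit(self, req: Request) -> None:
        self.queue.append(req)

prefill_pool = GPUPool("prefill", num_gpus=4)   # hypothetical sizing
decode_pool = GPUPool("decode", num_gpus=8)

def run_prefill(req: Request) -> None:
    # Stand-in for processing the whole prompt once and building the KV cache.
    req.kv_cache = f"kv-cache({len(req.prompt.split())} prompt tokens)"
    decode_pool.submit(req)                      # hand the request off to the decode pool

def run_decode_step(req: Request) -> None:
    # Stand-in for generating one token per step from the KV cache.
    req.generated.append("<tok>")
    if len(req.generated) < req.max_new_tokens:
        decode_pool.submit(req)                  # decode keeps iterating on its own GPUs

req = Request(prompt="Summarize the Advancing AI keynote.", max_new_tokens=3)
prefill_pool.submit(req)
while prefill_pool.queue:
    run_prefill(prefill_pool.queue.popleft())
while decode_pool.queue:
    run_decode_step(decode_pool.queue.popleft())
print(req.kv_cache, req.generated)
```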
What's Working Well For AMD in AI

So that's (most of) what was new at AMD Advancing AI. In the next three sections, I want to talk about the good, the needs-improvement and the yet-to-be-determined aspects of what I heard during the event. Let's start with the good things that jumped out at me.

What Didn't Work For Me At Advancing AI

While overall I thought Advancing AI was a win for AMD, there were two areas where I thought the company missed the mark — one by omission, one by commission.

The Jury Is Out On Some Elements Of AMD's AI Strategy

In some areas, I suspect that AMD is doing okay or will be doing okay soon — but I'm just not sure. I can't imagine that any of the following items has completely escaped AMD's attention, but I would recommend that the company address them candidly so that customers know what to expect and can maintain high confidence in what AMD is delivering.

What Comes Next In AMD's AI Development

It is very difficult to engineer cutting-edge semiconductors — let alone rack-scale systems and all the attendant software — on the steady cadence that AMD is maintaining. So kudos to Su and everyone else at the company who's making that happen. But my confidence (and Wall Street's) would rise if AMD provided more granularity about what it's doing, starting with datacenter GPU forecasts. Clearly, AMD doesn't need to compete with Nvidia on every single thing to be successful. But it would be well served to fill in some of the gaps in its story to better speak to the comprehensive ecosystem it's creating.

Having spent plenty of time working inside companies on both the OEM and semiconductor sides, I do understand the difficulties AMD faces in providing that kind of clarity. The process of landing design wins can be lumpy, and a few of the non-AMD speakers at Advancing AI mentioned that the company is engaged in the 'bake-offs' that are inevitable in that process. Meanwhile, we're left to wonder what might be holding things back, other than AMD's institutional conservatism — the healthy reticence of engineers not to make any claims until they're sure of the win. That said, with Nvidia's B200s sold out for the next year, you'd think that AMD should be able to sell every wafer it makes, right? So are AMD's yields not good enough yet? Or are hyperscalers having their own problems scaling and deploying? Is there some other gating item? I'd love to know.

Please don't take any of my questions the wrong way, because AMD is doing some amazing things, and I walked away from the Advancing AI event impressed with the company's progress. At the show, Su was forthright about describing the pace of this AI revolution we're living in — 'unlike anything we've seen in modern computing, anything we've seen in our careers, and frankly, anything we've seen in our lifetime.' I'll keep looking for answers to my nagging questions, and I'm eager to see how the competition between AMD and Nvidia plays out over the next two years and beyond. Meanwhile, AMD moved down the field at its event, and I look forward to seeing where it is headed.


Forbes
2 days ago
- Business
- Forbes
Quantum Threats Reshape Commvault's Vision For Data Security
Commvault is incorporating post-quantum cryptography to address future data security risks.

Data protection provider Commvault announced earlier this month that it is adding more quantum-safe capabilities to its platform, building out defenses based on post-quantum cryptography. This is important because, as quantum computing shifts from theoretical to practical use, it brings a new class of cybersecurity threats. To help organizations prepare, Commvault has incorporated NIST-recommended PQC algorithms into its data protection offerings, covering both cloud and on-premises environments. The goal is to ensure long-term data security by protecting backups made today from potential decryption by future quantum systems.

Over the past year, Commvault has introduced multiple post-quantum cryptography capabilities to safeguard data against future risks posed by quantum computing. PQC has important implications for customers, competitors and the broader industry, and all organizations should prepare for a quantum-driven — and quantum-safe — future. (Note: Commvault is an advisory client of my firm, Moor Insights & Strategy.)

Understanding The Quantum Threat To Enterprise Data

First, a little background on why this is so important. Quantum computers apply principles of quantum mechanics to process information in fundamentally different ways from classical computers. While this could unlock incredible advances in medicine, materials science, finance, AI and more, it also introduces new security concerns. This is because current encryption methods such as RSA and elliptic curve cryptography depend on mathematical problems that are very hard to reverse — unless a powerful quantum computer is involved. Once quantum computers that powerful are launched, probably in the next few years, these algorithms could potentially be broken quickly, compromising these widely used encryption methods.

A crucial concern today is the 'harvest now, decrypt later' tactic, where bad actors can intercept and store encrypted data to decrypt it in the future once quantum capabilities mature. HNDL protection is especially critical for sectors with long-term data sensitivity, such as healthcare, finance and government. (Think of any setting in which sensitive information — names, dates of birth, government ID numbers, bank account numbers, medical histories and the like — remains unchanged for many years.) A survey by the Information Systems Audit and Control Association found that 63% of cybersecurity professionals believe quantum computing will shift or expand cyber risks, and half expect it to create compliance challenges.

This image shows how users can enable PQC within Commvault's CommCell environment by selecting a checkbox in the group configuration settings.

Commvault's Post-Quantum Cryptography Response

Commvault has taken a practical, multi-stage approach to quantum-era risks. In August 2024, it introduced a cryptographic agility framework, which is meant to allow organizations to adopt new cryptographic standards for PQC without major system changes. The framework includes several NIST-recommended quantum-resistant algorithms — CRYSTALS-Kyber, CRYSTALS-Dilithium, SPHINCS+ and FALCON. (My colleague Paul Smith-Goodson, who has been covering quantum computing for years, went into more detail about these algorithms in the context of IBM's PQC efforts, also in August 2024.)
Commvault's announcement earlier this month builds on last year's release by adding support for the Hamming Quasi-Cyclic algorithm, which uses error-correcting codes to resist quantum decryption. But rather than focusing only on algorithm support, Commvault also emphasizes operational integration. Its Risk Analysis tools help organizations identify sensitive data, allowing quantum-resistant encryption to be applied where it's most needed. The crypto-agility framework offered by Commvault allows organizations to shift between cryptographic methods via relatively simple configuration changes, without needing to overhaul their existing environments. This flexibility helps minimize disruptions and lowers the costs associated with adapting to new standards as they emerge.

Securing Critical Industries For The Quantum Era

Commvault's PQC features should be especially helpful to organizations in healthcare, finance and government as they address compliance needs, ensure continuity and — most importantly — protect data that is held for decades. As touched on above, these industries are especially at risk for deferred decryption attacks, so implementing PQC features now should help address the risk of HNDL exploits later. Besides the benefits already mentioned, this could help organizations using Commvault maintain trust among regulators, customers and partners for the long haul. As data protection standards in these industries become stricter in anticipation of quantum threats, solutions that incorporate quantum-resistant encryption are increasingly necessary.

Forward-looking IT organizations are already adopting these technologies. For instance, the Nevada Department of Transportation has adopted Commvault's PQC tools to meet government security requirements and protect sensitive information. The company also cited Peter Hands, CISO of the British Medical Association, who said, 'Commvault's rapid integration of NIST's quantum-resistant standards, particularly HQC, gives us great confidence that our critical information is protected now and well into the future.'

The adoption of PQC is accelerating as both technological developments and regulatory requirements create a framework for organizations to address emerging threats from quantum computing. In the United States, for instance, federal agencies have been instructed to integrate post-quantum standards into their procurement and operational practices. Similar regulatory efforts are taking place in the European Union and other jurisdictions, where updates to data protection frameworks increasingly include provisions for quantum-safe encryption. To maintain security and compatibility during the transition, many organizations are implementing hybrid encryption methods that combine traditional and quantum-resistant algorithms. This approach allows for gradual migration to fully quantum-resistant systems while enabling protection against both current and future threats.

PQC Challenges And The Push For Wider Adoption

Commvault's phased introduction of PQC capabilities is a step forward, but current support is mostly limited to cloud-based customers using particular software versions. This creates a gap for organizations relying on hybrid or on-premises environments, which are still widely used in sensitive sectors like those already mentioned. To address this, Commvault would benefit from providing a clear roadmap for extending PQC support across all deployment models.
Such a roadmap should outline which software versions will be supported, specify the technical requirements and offer a realistic timeline for implementation. The broader data protection market is also shifting as major technology providers such as IBM and Microsoft integrate quantum-safe features into their platforms. Other data protection vendors, such as Cohesity, Veeam and Rubrik, are expected to follow suit as industry standards become more established. This means Commvault will likely face growing competition in offering robust PQC solutions. Keeping pace will require not only technical expansion but also practical guidance for customers on how to adopt and apply PQC in various enterprise scenarios. Flexibility and clear communication about available features and best practices will be important for supporting a wide range of customer environments and needs.

Aligning Data Security Strategies For A Quantum Future

Commvault's early efforts in post-quantum cryptography and crypto-agility demonstrate a commitment to long-term data security. However, maintaining progress will depend on expanding access to PQC features for all customers, providing transparent information about costs and continuing to work closely with regulatory bodies. Quantum computing presents both new risks and opportunities. As traditional encryption methods become more vulnerable, the need for quantum-resistant security will grow. Commvault's PQC features offer a practical way for organizations to protect data that must remain secure for years. By focusing on adaptability, compliance and targeted encryption strategies, Commvault helps customers build stronger defenses for the future.

The timeline for quantum decryption could be shorter than many anticipate, making it important for organizations to start preparing now. For enterprises, taking early action is important to avoid exposure and regulatory issues. For vendors, ongoing improvements in accessibility, transparency and alignment with emerging standards will determine long-term success. Simplifying the path to quantum readiness will be a key factor in supporting customers through this transition.
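The hybrid-encryption approach described earlier in this article (combining a traditional algorithm with a quantum-resistant one) is often implemented by deriving a single symmetric key from two independent shared secrets. The sketch below illustrates that pattern generically; it is not Commvault's implementation. It uses the widely available Python cryptography package for the classical X25519 exchange and HKDF, and stubs out the post-quantum KEM step because PQC library bindings (for ML-KEM/Kyber or HQC, for example) vary by environment.

```python
# Conceptual hybrid keying: an attacker must break BOTH the classical exchange
# and the post-quantum KEM to recover the derived key. Illustrative only.
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# 1) Classical shared secret via X25519 key exchange.
alice_priv = X25519PrivateKey.generate()
bob_priv = X25519PrivateKey.generate()
classical_secret = alice_priv.exchange(bob_priv.public_key())

# 2) Post-quantum KEM shared secret (stubbed with random bytes here; in practice
#    this would come from encapsulating against the peer's ML-KEM or HQC key).
pqc_secret = os.urandom(32)

# 3) Derive one symmetric key from both secrets with HKDF.
hybrid_key = HKDF(
    algorithm=hashes.SHA256(),
    length=32,
    salt=None,
    info=b"hybrid-classical-plus-pqc-backup-encryption",
).derive(classical_secret + pqc_secret)

print("derived hybrid key:", hybrid_key.hex())
```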


Forbes
5 days ago
- Forbes
Could The HP Dimension With Google Beam Shake Up Videoconferencing?
HP Dimension with Google Beam

HP and Google have launched the HP Dimension with Google Beam, a 3-D video communication system designed to mimic the dynamics of in-person meetings without the need for headsets or wearables. Unveiled a couple of weeks ago at InfoComm 2025, the system builds on Google's Project Starline, which my colleague Anshel Sag wrote about here. HP primarily designed it for small meeting environments to enhance virtual collaboration, touting it as having a unique sense of presence and spatial depth. (Note that both Google and HP are advisory clients of my firm, Moor Insights & Strategy.)

HP Dimension With Google Beam — Specs And User Experience

The system features a 65-inch 8K light field display, six cameras, adaptive lighting and spatial audio. Despite incorporating these advanced components, the design minimizes visible technology. Greg Baribault, vice president of product and portfolio for the Hybrid Systems unit at HP, called this out during our Six Five On the Road video interview at InfoComm: 'One of the beautiful things we did with the hardware was remove the technology. You see no technology.' Baribault explained that the 3-D rendering creates a natural spatial perception, making users feel as if they are behind the screen rather than experiencing uncomfortable pop-out 3-D effects. He emphasized that with 'no cameras pointed at you,' the design promotes a 'natural face-to-face conversation' without making users feel self-conscious.

During my experience trying out the HP Dimension at InfoComm, I was particularly impressed by the realism of the 3-D rendering and its remarkable ability to create a sense of shared space. It felt like I was sitting across the table from someone, conversing with eye contact and observing the person's natural movements and the immediate objects around us. Preliminary internal testing conducted by HP and Google supports the level of presence I felt. Their tests indicate that users' interactions via the HP Dimension with Google Beam may significantly improve key communication metrics. Reported results show a 28% increase in recall of meeting details, up to 39% more non-verbal communication and at least a 14% increase in focus on conversation partners compared to traditional videoconferencing.

Meeting participants see each other in realistic 3-D using the HP Dimension with Google Beam.

Key Business Use Cases: High-Value Executive And Client Interactions

Given its capabilities and focus on immersive presence, the HP Dimension with Google Beam appears well-suited for high-value enterprise applications. For instance, I think it could offer significant value in executive leadership meetings, where non-verbal cues take on even more importance. It could also enhance critical client engagements by creating greater impact for high-stakes presentations. The system also supports specialized collaborative workflows like design reviews, where three-dimensional visualization can significantly aid problem-solving. Beyond these business applications, the device's enhanced sense of presence could prove invaluable in scenarios requiring deep personal connection across significant distances. For example, it could enable military personnel on deployment to experience more authentic interactions with their families back home.

The HP Dimension with Google Beam is designed for one-on-one interactions, but can also easily handle multi-participant meetings. It supports native Zoom Rooms and Google Meet, and is compatible with Microsoft Teams and Webex.
The Foundational Audio Experience

HP also announced its Poly Studio A2 Audio Solutions at InfoComm 2025, featuring them prominently in the Dimension demo. These solutions, comprising a versatile tabletop microphone setup and an audio bridge, can form a foundational element of the overall system. The microphones expand from one to eight units and connect to the bridge via standard wiring, allowing deployment in diverse room configurations. The Poly Studio A2 microphones and audio bridge can be used with the Dimension system or other Poly video solutions, providing flexibility for organizations with different meeting room setups. The audio solution aims to ensure that high-quality audio complements the video and 3-D experience. I believe its versatility reinforces HP's vision of 'equity at the table so everyone can be heard,' as Baribault put it.

When discussing the role of audio, Baribault emphasized its critical importance in virtual meetings: 'Voice is the most important part of any online meeting. You can usually tolerate a video drop here or there, but if you lose audio, you lose all the context of the discussion.' He also pointed out that distractions, such as the crinkling of a potato chip bag or someone typing loudly on their keyboard, can be highly disruptive for remote participants. To address these issues and ensure clear audio, HP has integrated AI technology in its microphones, audio bridge and video codecs to reduce noise and echo. Additionally, AI is used within management tools to help IT staff optimize audio quality.

Market Positioning And Investment

In a market increasingly saturated with standard videoconferencing tools, HP aims to carve out a premium segment by addressing the limitations of two-dimensional interactions. In this context, the Dimension system represents a notable advance in collaborative technology, offering a distinct solution for enterprises seeking to enhance virtual engagement beyond current capabilities. Platforms such as Cisco Spatial Meetings and VR-based competitors, including Fabrik and Assemblr, offer immersive 3-D spaces that enable teams to collaborate on designs, visualize digital assets and work together in shared environments. Cisco Spatial Meetings, when paired with Apple Vision Pro, can deliver life-like spatial video conferencing that mimics in-person presence. However, these solutions require specialized headsets or glasses, potentially limiting accessibility and complicating deployment.

The HP Dimension with Google Beam also represents a significant investment, given that it's priced at $24,999 for the hardware, with the Google Beam license sold separately. However, its value could become apparent when considering executives who currently take many flights a year between company facilities for in-person meetings. In such scenarios, the system's ability to replicate face-to-face interaction could justify the cost, save significant time and reduce environmental impact. Orders for the system open in June 2025, with shipments beginning in September to the U.S., Canada, the U.K., France, Germany and Japan.

Beyond The Demo: Driving Adoption

The technical innovation and potential for enhanced collaboration are clearly evident as soon as you use the HP Dimension with Google Beam. However, the most significant challenge for its makers could be achieving widespread adoption.
Besides the initial investment, organizations would need to consider the practicalities of network bandwidth, data security and the like that arise when integrating such a specialized system into existing IT infrastructure. Shifting established user habits and workflows to embrace this new virtual presence will also require strategic planning and effective change management. For HP to drive widespread adoption, getting target users — especially executives and IT decision makers — into a live demo will be crucial. The face-to-face impact of the system is its most compelling feature, and experiencing it firsthand will always be the most effective way to convey its potential.
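One way to think about the $24,999 hardware price discussed above is as a travel offset. The per-trip cost and hours saved in the sketch below are hypothetical placeholders, not HP or Google figures, and the separate Google Beam license cost is omitted; the point is only to show how the break-even arithmetic would work.

```python
# Hypothetical break-even sketch for the HP Dimension's hardware price.
HARDWARE_PRICE = 24_999        # USD, from the article
COST_PER_TRIP = 1_800          # assumed airfare + hotel + ground transport per executive trip
HOURS_SAVED_PER_TRIP = 10      # assumed door-to-door travel time avoided

trips_to_break_even = HARDWARE_PRICE / COST_PER_TRIP
print(f"Break-even after ~{trips_to_break_even:.0f} avoided trips "
      f"(~{trips_to_break_even * HOURS_SAVED_PER_TRIP:.0f} executive hours reclaimed)")
```

With those placeholder numbers, the hardware pays for itself after roughly 14 avoided trips; real figures for a given organization would obviously shift the result.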


Forbes
10-06-2025
- Business
- Forbes
Will AI Agents Upend The Software Development Life Cycle?
Engaging Agent Mode on active code in GitHub

May 2025 will go down in history as the month when agentic software development was truly unleashed upon the world. A significant step up from chat-based code assistants, agentic software tools have been positioned as a revolutionary change in the software development life cycle. Announcements from Microsoft (GitHub Copilot Agent Mode), Google (Jules), OpenAI (Codex) and Anthropic (Claude Agents) are all very promising. However, my conclusion is that we are not witnessing a revolution, but are simply seeing a further evolution of AI automation. And to be honest, I think that is good news.

First, let's discuss what has driven this leap forward from coding assistants. A coding assistant is essentially a bot interface to a large language model, and this mechanism has been quite beneficial to many developers. As one proof point for this, at its Build conference in May, Microsoft said that the GitHub Copilot assistant has been used by 15 million developers. (Note: Microsoft is a client of my firm, Moor Insights & Strategy.) But three fundamental AI shifts have led to the creation of these new agentic development tools, which significantly expand the AI benefit to developers.

At the Build conference, Agent Mode in GitHub Copilot was the cornerstone of Microsoft's vision of a new SDLC. The company's demonstrations of how much more quickly work can be completed and the seamless integration between VSCode and GitHub were quite impressive — and were always likely to get the lion's share of media coverage. But what's most impressive is the overall breadth of Microsoft's announcements and the fact that Microsoft may be the only company able to execute such a vision. Here are three non-Copilot things from Microsoft that also improve the SDLC.

I applaud Microsoft for creating innovations across the whole SDLC. And I know that there are other improvements in security and software updates that I did not include. I do believe that Microsoft is likely the only company that can execute a broad vision for the SDLC since it owns some of the biggest pieces (tooling, repositories, security, collaboration) that a developer needs. That said, I'd like to offer up a couple of areas where Microsoft could look next.

I walked away from Build impressed with the technology Microsoft has available and also what is in preview. A great deal of my research over the past 12 months has been around the impact and possibilities of AI agents and agentic workflows. However, I also am not sure we have seen something revolutionary in the SDLC — at least not yet. To me, revolutionary means that it changes the game and how it's played. Evolutionary is introducing new efficiencies to the existing game, which is what I see happening so far in this space. Here are a couple of examples of what I mean.

So far, agent-based tools are more like the outsourcing trend — but that's not necessarily bad. AI is still very new and moving very fast. Revolutionizing everything would likely overwhelm many enterprises, so embracing new tools to do existing work may just be the first step in what could ultimately be a revolutionary movement.


Forbes
10-06-2025
- Business
- Forbes
IBM's Vision For A Large-Scale Fault-Tolerant Quantum Computer By 2029
IBM's vision for its large-scale fault-tolerant Starling quantum computer

IBM has just made a major announcement about its plans to achieve large-scale quantum fault tolerance before the end of this decade. Based on the company's new quantum roadmap, by 2029 IBM expects to be able to run accurate quantum circuits with hundreds of logical qubits and hundreds of millions of gate operations. If all goes according to plan, this stands to be an accomplishment with sweeping effects across the quantum market — and potentially for computing as a whole. In advance of this announcement, I received a private briefing from IBM and engaged in detailed correspondence with some of its quantum researchers for more context. (Note: IBM is an advisory client of my firm, Moor Insights & Strategy.)

The release of the new roadmap offers a good opportunity to review what IBM has already accomplished in quantum, how it has adapted its technical approach to achieve large-scale fault tolerance and how it intends to implement the milestones of its revised roadmap across the next several years. Let's dig in.

First, we need some background on why fault tolerance is so important. Today's quantum computers have the potential, but not yet the broader capability, to solve complex problems beyond the reach of our most powerful classical supercomputers. The current generation of quantum computers is fundamentally limited by high error rates that are difficult to correct and that prevent complex quantum algorithms from running at scale. While there are numerous challenges being tackled by quantum researchers around the world, there is broad agreement that these error rates are a major hurdle to be cleared.

In this context, it is important to understand the difference between fault tolerance and quantum error correction. QEC uses specialized measurements to detect errors in encoded qubits. And although it is also a core mechanism used in fault tolerance, QEC alone can only go so far. Without fault-tolerant circuit designs in place, errors that occur during operations or even in the correction process can spread and accumulate, making it exponentially more difficult for QEC on its own to maintain logical qubit integrity.

Reaching well beyond QEC, fault-tolerant quantum computing is a very large and complex engineering challenge that applies a broad approach to errors. FTQC not only protects individual computational qubits from errors, but also systemically prevents errors from spreading. It achieves this by employing clever fault-tolerant circuit designs, and by making use of a system's noise threshold — that is, the maximum level of errors the system can handle and still function correctly. Achieving the reliability of FTQC also requires more qubits. Even so, FTQC can potentially lower error rates much more efficiently than QEC alone. If an incremental drop in logical error rate is desired, then fault tolerance needs only a small polynomial increase in the number of qubits and gates to achieve the desired level of accuracy for the overall computation. Despite its complexity, this makes fault tolerance an appealing and important method for improving quantum error rates.

IBM's first quantum roadmap, released in 2020

Research on fault tolerance goes back several decades. IBM began a serious effort to build a quantum computer in the late 1990s when it collaborated with several leading universities to build a two-qubit quantum computer capable of running a small quantum algorithm.
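To make the "small polynomial increase" point above concrete, here is the textbook scaling relation for a code of distance d operating below its noise threshold. It is a generic illustration rather than an IBM-specific formula; the constant A, the threshold p_th and the exact exponents depend on the particular code.

```latex
\[
  p_{\text{logical}} \;\approx\; A\left(\frac{p}{p_{\text{th}}}\right)^{\lfloor (d+1)/2 \rfloor},
  \qquad
  N_{\text{physical per logical}} = O(d^{2}) \quad \text{(surface-code example)}.
\]
```

Because the logical error rate falls exponentially in the code distance d while the physical qubit count grows only polynomially in d, each further reduction in logical error costs comparatively few extra qubits and gates, provided the physical error rate p stays below the threshold p_th.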
Continuing fundamental research eventually led to the 2016 launch of the IBM Quantum Experience, featuring a five-qubit superconducting quantum computer accessible via the cloud. IBM's first quantum roadmap, released in 2020 (see the image above), detailed the availability of the company's 27-qubit Falcon processor in 2019 and outlined plans for processors with a growing number of qubits in each of the subsequent years. The roadmap concluded with the projected development in 2023 of a research-focused processor, the 1,121-qubit Condor, that was never made available to the public.

However, as IBM continued to scale its qubit counts and explore error correction and error mitigation, it became clear to its researchers that monolithic processors were insufficient to achieve the long-term goal of fault-tolerant quantum computing. To achieve fault tolerance in the context of quantum low-density parity-check codes (much more on qLDPC below), IBM knew it had to overcome three major issues. This helps explain why fault tolerance is such a large and complex endeavor, and why monolithic processors were not enough. Achieving all of this would require that modularity be designed into the system.

IBM's shift to modular architecture first appeared in its 2022 roadmap with the introduction for 2024 of multi-chip processors called Crossbill and Flamingo. Crossbill was a 408-qubit processor that demonstrated the first application of short-range coupling. And Flamingo was a 1,386-qubit quantum processor that was the first to use long-range coupling. For more background on couplers, I previously wrote a detailed article explaining why IBM needed modular processors and tunable couplers.

Couplers play an important role in IBM's current and future fault-tolerant quantum computers. They allow qubits to be logically scaled but without the difficulty, expense and additional time required to fabricate larger chips. Couplers also provide architectural and design flexibility. Short-range couplers provide chip-to-chip parallelization by extending IBM's heavy-hex lattice across multiple chips, while long-range couplers use cables to connect modules so that quantum information can be shared between processors.

A year later, in 2023, IBM scientists made an important breakthrough by developing a more reliable way to store quantum information using qLDPC codes. These are also called bivariate bicycle codes, and you'll also hear this referred to as the gross code because it has the capability to encode 12 logical qubits into a gross of 144 physical qubits, with 144 ancilla qubits, making a total of 288 physical qubits for error correction. Previously, surface code was the go-to error-correction code for superconducting qubits because it had the ability to tolerate high error rates, along with the abilities to scale, use nearest-neighbor connectivity and protect qubits against bit-flip and phase-flip errors.

It's important to note that IBM has verified that qLDPC performs error correction just as effectively and efficiently as surface code. Yet the two methods do not bring the same level of benefit: although qLDPC code and surface code perform equally well in terms of error correction, qLDPC code has the significant advantage of needing only one-tenth as many qubits. (More details on that below.)

This brings us to today's state of the art for IBM quantum. Currently, IBM has a fleet of quantum computers available over the cloud and at client sites, many of which are equipped with 156-qubit Heron processors.
According to IBM, Heron has the highest performance of any IBM quantum processor. Heron is currently being used in the IBM Quantum System Two, and it is available in other systems as well.

IBM 2025 quantum innovation roadmap, showing developments from 2016 to 2033 and beyond

IBM's new quantum roadmap shows several major developments on the horizon. In 2029 IBM expects to be the first organization to deliver what has long been the elusive goal of the entire quantum industry. After so many years of research and experimentation, IBM believes that in 2029 it will finally deliver a fault-tolerant quantum computer. By 2033, IBM also believes it will be capable of building a quantum-centric supercomputer capable of running thousands of logical qubits and a billion or so gates. Before we go further into specifics about the milestones that IBM projects for this new roadmap, let's dig a little deeper into the technical breakthroughs enabling this work.

As mentioned earlier, one key breakthrough IBM has made comes in its use of gross code (qLDPC) for error correction, which is much more efficient than surface code.

Comparison of surface code versus qLDPC error rates

The above chart shows the qLDPC physical and logical error rates (diamonds) compared to two different surface code error rates (stars). The qLDPC code uses a total of 288 physical qubits (144 physical code qubits and 144 check qubits) to create 12 logical qubits (red diamond). As illustrated in the chart, one instance of surface code requires 2,892 physical qubits to create 12 logical qubits (green star) and the other version of surface code requires 4,044 physical qubits to create 12 logical qubits (blue star). It can be easily seen that qLDPC code uses far fewer qubits than surface code yet produces a comparable error rate.

Connectivity between the gross code and the LPU

Producing a large number of logical and physical qubits with low error rates is impressive; indeed, as explained earlier, large numbers of physical qubits with low error rates are necessary to encode and scale logical qubits. But what really matters is the ability to successfully run gates. Gates are necessary to manipulate qubits and create superpositions, entanglement and operational sequences for quantum algorithms. So, let's take a closer look at that technology.

Running gates with qLDPC codes requires an additional set of physical qubits known as a logical processing unit. The LPU has approximately 100 physical qubits and adds about 35% of ancilla overhead per logical qubit to the overall code. (If you're curious, a similar low to moderate qubit overhead would also be required for surface code to run gates.) LPUs are physically attached to qLDPC quantum memory (gross code) to allow encoded information to be monitored. LPUs can also be used to stabilize logical computations such as Clifford gates (explained below), state preparations and measurements. It is worth mentioning that the LPU itself is fault-tolerant, so it can continue to operate reliably even with component failures or errors.

IBM already understands the detailed connectivity required between the LPU and gross code. For simplification, the drawing of the gross code on the left above has been transformed into a symbolic torus (doughnut) in the drawing on the right; that torus has 12 logical qubits consisting of approximately 288 physical qubits, accompanied by the LPU. (As you look at the drawings, remember that 'gross code' and 'bivariate bicycle code' are two terms for the same thing.)
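The chart comparison above reduces to simple arithmetic. The sketch below uses only the qubit counts quoted in this article (288 physical qubits and an LPU of roughly 100 qubits per 12-logical-qubit block, versus 2,892 and 4,044 physical qubits for the two surface-code instances) to compute overhead per logical qubit and to estimate block counts for the machine sizes discussed later in the article; it is an illustration of the article's numbers, not an IBM sizing tool.

```python
import math

# Qubit counts quoted in the article's chart discussion.
codes = {
    "qLDPC (gross code)":       {"physical": 288,  "logical": 12},
    "surface code (variant A)": {"physical": 2892, "logical": 12},
    "surface code (variant B)": {"physical": 4044, "logical": 12},
}

gross_total = codes["qLDPC (gross code)"]["physical"]
for name, c in codes.items():
    per_logical = c["physical"] / c["logical"]
    ratio = c["physical"] / gross_total
    print(f"{name:26s}: {per_logical:5.0f} physical qubits per logical qubit "
          f"({ratio:.0f}x the gross-code footprint)")

# Rough block count for a target machine size; each gross-code block pairs with
# an LPU of ~100 physical qubits, per the article.
LOGICAL_PER_BLOCK, PHYSICAL_PER_BLOCK, PHYSICAL_PER_LPU = 12, 288, 100
for target_logical in (96, 200):  # the article's worked example and Starling's ~200
    blocks = math.ceil(target_logical / LOGICAL_PER_BLOCK)
    total = blocks * (PHYSICAL_PER_BLOCK + PHYSICAL_PER_LPU)
    print(f"{target_logical} logical qubits -> {blocks} gross-code blocks, "
          f"~{blocks * PHYSICAL_PER_BLOCK} code qubits + {blocks} LPUs "
          f"(~{total} physical qubits in total)")
```

The surface-code variants come out to roughly 10x and 14x the gross-code footprint, consistent with the "one-tenth as many qubits" claim, and a roughly 200-logical-qubit machine lands at about 17 gross-code blocks, matching the Starling figure cited below.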
The drawing on the right appears repeatedly in the diagrams below, and it will likely appear in future IBM documents and discussions about fault tolerance. The narrow rectangle at the top of the right-hand configuration is called a 'bridge' in IBM research papers. Its function is to couple one unit to a neighboring unit with 'L-couplers.' It makes the circuits fault-tolerant inside the LPU, and it acts as a natural connecting point between modules. These long-distance couplers, about a meter in length, are used for Bell-pair generation, a method that allows the entanglement of logical qubits. So what happens when several of these units are coupled together?

IBM fault-tolerant quantum architecture

Above is a generalized configuration of IBM's future fault-tolerant architecture. As mentioned earlier, each torus contains 12 logical qubits created by the gross code through the use of approximately 288 physical qubits. So, for instance, if a quantum computer was designed to run 96 logical qubits, it would be equipped with eight torus code blocks (8 x 12 = 96), which would require a total of approximately 2,304 physical qubits (8 x 288) plus eight LPUs.

Two special quantum operations are needed for quantum computers to run all the necessary algorithms plus perform error correction. These two operations are Clifford gates and non-Clifford gates. Clifford gates — named after the 19th-century British mathematician William Clifford — handle error correction in a way that allows error-correction code to fix mistakes. Clifford gates are well-suited for FTQC because they are able to limit the spread of errors. Reliability is critical for practical fault-tolerant quantum systems, so running Clifford gates helps ensure accurate computations.

The other necessary quantum operation is non-Clifford gates (particularly T-gates). A quantum computer needs both categories of gates so it can perform universal tasks such as chemistry simulations, factoring large numbers and other complex algorithms. However, there is a trick for using both of these operations together. Even though T-gates are important, they also break the symmetry needed by Clifford gates for error correction. That's where the 'magic state factory' comes in. It implements the non-Clifford group (T-gates) by combining a stream of so-called magic states alongside Clifford gates. In that way, the quantum computer can maintain its computational power and fault tolerance. IBM's research has proven it can run fault-tolerant logic within the stabilizer (Clifford) framework. However, without the extra non-Clifford gates, a quantum computer would not be able to execute the full spectrum of quantum algorithms.

IBM fault-tolerant quantum roadmap

Now let's take a closer look at the specific milestones in IBM's new roadmap that will take advantage of the breakthroughs explained above, and how the company plans to create a large-scale fault-tolerant quantum computer within this decade.

IBM expects to begin fabricating and testing the Loon processor sometime this year. The Loon will use two logical qubits and approximately 100 physical qubits. Although the Loon will not use the gross code, it will be using a smaller code with similar hardware requirements. IBM has drawn on its past four-way coupler research to develop and test a six-way coupler using a central qubit connected through tunable couplers to six neighboring qubits, a setup that demonstrates low crosstalk and high fidelity between connections.
IBM also intends to demonstrate the use of 'c-couplers' to connect Loon qubits to non-local qubits. Couplers up to 16 mm in length have been tested, with a goal of increasing that length to 20 mm. Longer couplers allow connections to be made over more areas of the chip. So far, the longer couplers have also maintained low error rates and acceptable coherence times — in the range of several hundred microseconds. In this phase of the roadmap, IBM plans to test one full unit of the gross code, long c-couplers and real-time decoding of the gross code. IBM also plans a demonstration of quantum advantage in 2026 via the Heron (a.k.a. Nighthawk) platform with HPC.

The Cockatoo design employs two blocks of gross code connected to LPUs to create 24 logical qubits using approximately 288 physical qubits. In this year, IBM aims to test L-couplers and module-to-module communications capability. IBM also plans to test Clifford gates between the two code blocks, giving it the ability to perform computations, but not yet universal computations.

A year later, the Starling processor should be equipped with approximately 200 logical qubits. Required components, including magic state distillation, will be tested. Although only two blocks of gross code are shown in the illustrative diagram above, the Starling will in fact require about 17 blocks of gross code, with each block connected to an LPU.

The estimated size of IBM's 2029 large-scale fault-tolerant Starling quantum computer in a datacenter setting, with human figures included for size comparison

This is the year IBM plans to deliver the industry's first large-scale, fault-tolerant quantum computer — equipped with approximately 200 logical qubits and able to execute 100 million gate operations. A processor of this size will have approximately 17 gross code blocks equipped with LPUs and magic state distillation. IBM expects that quantum computers during this period will run billions of gates on several thousand circuits to demonstrate the full power and potential of quantum computing.

IBM milestones in its roadmap for large-scale, fault-tolerant quantum computers

Although there have been a number of significant quantum computing advancements in recent years, building practical, fault-tolerant quantum systems has been — and still remains — a significant challenge. Up until now, this has largely been due to a lack of a suitable method for error correction. Traditional methods such as surface code have important benefits, but limitations, too. Surface code, for instance, is still not a practical solution because of the large numbers of qubits required to scale it. IBM has overcome surface code's scaling limitation through the development of its qLDPC codes, which require only a tenth of the physical qubits needed by surface code. The qLDPC approach has allowed IBM to develop a workable architecture for a near-term, fully fault-tolerant quantum computer.

IBM has also achieved other important milestones such as creating additional layers in existing chips to allow qubit connections to be made on different chip planes. Tests have shown that gates using the new layers are able to maintain high quality and low error rates in the range of existing devices. Still, there are a few areas in need of improvement. Existing error rates are around 3x10^-3, which needs to improve to accommodate advanced applications. IBM is also working on extending coherence times.
Using isolated test devices, IBM has determined that coherence is running between one and two milliseconds, and up to four milliseconds in some cases. Since it appears to me that future utility-scale algorithms and magic state factories will need between 50,000 and 100,000 gates between resets, further improvement in coherence may be required.

As stated earlier, IBM's core strategy relies on modularity and scalability. The incremental improvement of its processors through the years has allowed IBM to progressively develop and test its designs, increasing the number of logical qubits and quantum operations — and, ultimately, expanding quantum computing's practical utility. Without IBM's extensive prior research and its development of qLDPC for error correction, estimating IBM's chance for success would largely be guesswork. With it, IBM's plan to release a large-scale fault-tolerant quantum computer in 2029 looks aggressive but achievable.
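The coherence concern above can be sanity-checked with rough arithmetic. The per-gate durations below are generic assumptions for superconducting hardware, not IBM-quoted figures; they are used only to show why 50,000 to 100,000 gates between resets puts pressure on a 1-4 ms coherence budget.

```python
# Back-of-envelope coherence budget: elapsed time for N gates at an assumed
# per-gate duration, compared with the 1-4 ms coherence range cited above.
GATE_TIME_NS = {"optimistic": 30, "conservative": 100}  # assumed per-gate duration, ns
COHERENCE_MS = (1.0, 4.0)                               # range cited in the article

for label, t_ns in GATE_TIME_NS.items():
    for gates in (50_000, 100_000):
        elapsed_ms = gates * t_ns / 1e6
        verdict = "within" if elapsed_ms <= COHERENCE_MS[1] else "beyond"
        print(f"{gates:>7,} gates x {t_ns} ns ({label}): {elapsed_ms:.1f} ms "
              f"-> {verdict} the {COHERENCE_MS[0]:.0f}-{COHERENCE_MS[1]:.0f} ms coherence range")
```

Under these assumptions the optimistic case barely fits inside today's best coherence times and the conservative case does not, which is consistent with the view that coherence needs further improvement.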