
DeepSeek paper offers new details on how it used 2,048 Nvidia chips to take on OpenAI
DeepSeek has released a new research paper revealing in detail for the first time how it built one of the world's most powerful open-source AI systems at a fraction of the cost of its competitors.
'Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures', co-authored by DeepSeek founder Liang Wenfeng and released on Wednesday, attributes the start-up's breakthrough in training high-performance, cost-efficient AI systems to a hardware-software co-design approach.
'DeepSeek-V3, trained on 2,048 Nvidia H800 GPUs, demonstrates how hardware-aware model co-design can effectively address these challenges, enabling cost-efficient training and inference at scale,' the researchers wrote. DeepSeek and its hedge fund owner High-Flyer had previously stockpiled the H800, which Nvidia originally designed for the China market to comply with US export restrictions but which was banned from export to the country in 2023.
The start-up's training approach stemmed from the team's awareness of hardware constraints and the 'exorbitant costs' of training large language models (LLMs) – the technology behind AI chatbots such as OpenAI's ChatGPT – according to the paper.
The paper details technical optimisations that boost memory efficiency, streamline inter-chip communication, and enhance overall AI infrastructure performance – key advancements for reducing operational costs while scaling capabilities. These offer a 'practical blueprint for innovation in next-generation AI systems', the researchers said.
DeepSeek also highlighted its use of a mixture-of-experts (MoE) model architecture, a machine-learning approach that divides an AI model into separate sub-networks, or experts, each focused on a subset of the input data while working collaboratively.
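The routing idea behind a mixture-of-experts layer can be illustrated with a minimal sketch of top-k expert selection. This shows the general technique, not DeepSeek-V3's actual implementation; all dimensions, weights and names below are illustrative placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only (far smaller than any production model).
d_model, d_hidden, n_experts, top_k = 8, 16, 4, 2

# Each "expert" is a small feed-forward network; a linear router scores them.
experts = [(rng.standard_normal((d_model, d_hidden)) * 0.1,
            rng.standard_normal((d_hidden, d_model)) * 0.1)
           for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts)) * 0.1

def moe_forward(x):
    """Route one token vector to its top-k experts and mix their outputs."""
    logits = x @ router
    chosen = np.argsort(logits)[-top_k:]       # indices of the k highest-scoring experts
    weights = np.exp(logits[chosen])
    weights /= weights.sum()                   # softmax over the selected experts only
    out = np.zeros_like(x)
    for w, i in zip(weights, chosen):
        w_in, w_out = experts[i]
        out += w * (np.maximum(x @ w_in, 0) @ w_out)  # ReLU feed-forward expert
    return out

token = rng.standard_normal(d_model)
y = moe_forward(token)
```

Because only `top_k` of the `n_experts` sub-networks run for each token, compute per token stays roughly constant even as the total parameter count grows, which is the cost advantage the architecture is credited with.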

Related Articles


South China Morning Post
3 hours ago
DeepSeek's namesake chatbot sees a drop in downloads as AI apps for work, education rise
DeepSeek's namesake chatbot recorded lower downloads and user numbers in the second quarter, according to a new report, as artificial intelligence (AI) apps specifically designed for study or office work gained traction.

In the three months to June, average monthly downloads of DeepSeek's chatbot fell 72 per cent to 22.6 million from the previous quarter, when the Hangzhou-based start-up benefited from the wide popularity of its V3 and R1 AI models, according to Monday's report by market research firm QuestMobile in collaboration with state-funded newspaper National Business Daily. While the chatbot's 170 million monthly active users (MAUs) still topped the mainland Chinese market, that number was down 9 per cent quarter on quarter.

DeepSeek's chatbot was overtaken by ByteDance-owned Doubao, which posted average monthly downloads of 29.8 million in the second quarter, up 9.5 per cent from the previous three-month period, the QuestMobile report showed. Doubao's MAUs grew 30 per cent to 130 million in the June quarter.

Some other popular general-purpose chatbots also saw their downloads decline in the second quarter, reflecting a shift in which consumer-facing AI apps are now more popular in China. Average monthly downloads for Tencent Holdings' Yuanbao tumbled 54 per cent to 61.8 million, while those of Moonshot AI's Kimi fell 58 per cent to 35.3 million, according to QuestMobile. 'This trend clearly shows that the phase of mere 'chatbots' is over,' the report said.


South China Morning Post
5 hours ago
Alibaba upgrades flagship Qwen3 model to outperform OpenAI, DeepSeek in maths, coding
Alibaba Group Holding unveiled an upgraded version of its third-generation Qwen3 family of large language models (LLMs), with one member improved to score higher in maths and coding than rival products from OpenAI and DeepSeek. The new Qwen3-235B-A22B-Instruct-2507-FP8 is an open-source model that achieved 'significant improvements in general capabilities, including instruction following, logical reasoning, text comprehension, mathematics, science, coding and tool usage', according to a Tuesday update on artificial intelligence (AI) community HuggingFace and ModelScope, Alibaba's open-source platform. Alibaba owns the Post.

It outperformed some rivals in certain assessments, such as the 2025 American Invitational Mathematics Examination, where the new Alibaba model scored 70.3. By comparison, DeepSeek-V3-0324, the most recent version of the foundational model released in March, scored 46.6, while OpenAI's GPT-4o-0327 scored 26.7. As for coding capabilities, the new Qwen secured 87.9 points on the MultiPL-E benchmark, slightly higher than the 82.2 and 82.7 from the DeepSeek and OpenAI models above, respectively, though it lagged behind Claude Opus 4 Non-thinking, from Anthropic, which scored 88.5.

Alibaba's new release was an upgrade from the Qwen3-235B-A22B-FP8, but it only supports non-thinking mode, in which an AI system provides a direct output without the explicit reasoning steps or chain of thought that a thinking model might employ. As a result, its context length was boosted eightfold to 256,000 tokens, enabling it to handle longer texts in a single conversation.

Also on Tuesday, Alibaba said a Qwen model with 3 billion parameters would be integrated into HP's smart assistant 'Xiaowei Hui' on its personal computers in China, enhancing capabilities including drafting documents and summarising meetings.


South China Morning Post
11 hours ago
The wisdom of Australia's fresh approach to China
Feel strongly about these letters, or any other aspects of the news? Share your views by emailing us your Letter to the Editor at letters@ or filling in this Google form. Submissions should not exceed 400 words, and must include your full name and address, plus a phone number for verification.

Australian Prime Minister Anthony Albanese's recently concluded visit ('How deals are trumping port dispute on Australian PM Albanese's China visit', July 17) is a welcome breath of fresh air in Canberra's approach to China. Since the early 2000s, Australia's economic prosperity has been closely linked with China – Australians became wealthy selling China iron ore, coal and other natural resources that helped power China's extraordinary economic growth and societal transformation.

Under the previous Liberal-National coalition government, Australia took an unwise turn in its foreign policy by uncritically siding with the United States and needlessly antagonising China – which culminated in Canberra effectively blaming China for Covid-19 by calling for an independent inquiry into its origins.

The past year has shown a world undergoing seismic changes. It is becoming clear that China will emerge as a dominant, if not the dominant, country that masters the industries of the future – such as renewable energy, electric vehicles, biotechnology and artificial intelligence, as most vividly demonstrated by the splashy emergence of DeepSeek's AI model to public awareness.