The transformer birthed GenAI. Meet the man who built it

Mint, 06-06-2025
Eight years ago, Ashish Vaswani led a team of Google Brain researchers that invented the transformer, the magic sauce behind generative AI. The new model, which learns and generates human-like text, took the world by storm as it made its way into ChatGPT, and later, others like Gemini, Grok and DeepSeek.
India-born Vaswani, whose startup Essential AI works on building foundational models, now wants to establish an India team, secure access to graphics processing units (GPUs), find clients and spot a local strategic investment partner.
"We're building foundational AI models to automate coding and tasks in science, technology, engineering and mathematics (STEM) applications. The core idea is to build the best models in specific fields, so that we can then partner with large clients who will licence our models to build applications on," Vaswani said in an interview.
'Could have never imagined it'
In June 2017, Vaswani led a Stanford University research project funded under Google Brain that introduced the transformer model, now regarded as one of the world's most significant inventions in computer science, alongside the likes of Frank Rosenblatt's neural networks and Sergey Brin and Larry Page's PageRank.
Asked whether he expected the model to have the kind of impact it did, Vaswani, who is in India after 16 years, said that he "could have never imagined it."
"What I had set out to build was a better version of machine learning, and improving the way machine understanding worked. I never thought it would explode into what it is today, and the way it has taken over our lives."
Vaswani, 39, is chief executive of Essential AI, which he co-founded in 2023 alongside Niki Parmar, a co-inventor of the transformer model. He stayed on at Google until 2021, leaving to build Adept AI Labs—a platform that today has a licensing deal with Amazon to build its AI initiatives. Vaswani and Parmar left Adept in less than two years over reported differences with investors, and started Essential AI.
Fundraise
GenAI burst into prominence when Sam Altman-led OpenAI, a Silicon Valley peer of Vaswani's, unveiled ChatGPT in November 2022. Since then, AI has become a household term, reaching well beyond engineers and researchers.
His startup, which raised $56.5 million in December 2023 and counts AMD, Google and Nvidia as investors, will be looking to raise a second, larger funding round of around $100 million later this year, Vaswani said. "The results of our early foundational models are here, and they look good. We'll be using these results as a reference point for our next fundraise," he said.
As part of this move, Vaswani is open to interest from Indian strategic partners as well. "India has some of the brightest minds, and it is absolutely important that India pursues building its own AI. There's no reason why foundational work in AI cannot happen in India," he said.
Investors negotiating with global ventures concur, saying that foundational work in AI can differentiate the GenAI efforts that ventures across India and abroad are pursuing.
Foundational AI
"One has to look at a big enough problem, and assess how many millions of people a problem impacts," said Anand Daniel, partner at venture capital firm Accel. "Then, we look at the solution being built, and the foundational engineering that a venture is undertaking in order to build for the problem. It's still early days, but the scope for foundational work remains broader in the US, than what Indian startups have so far created," Daniel added.
Both agree that there is room for ventures to exist even in the foundational engineering space in GenAI in the long run, despite a battle for dominance playing out in the US among the likes of Google, Microsoft and OpenAI.
"I fully think that there is enough space now to build products and companies that exist alongside and outside the Big Tech environment, and that will further widen as generative AI evolves. Eventually, there is ample scope for many to disrupt the global technology environment," Vaswani said.
Foundational AI, to be sure, is seen as a tough nut to crack since it requires firms to build and train their own algorithms from scratch. While the advantages include the ability to have a proprietary AI model that squarely targets a specific use case, doing so requires significant working capital, a key challenge in India.
Capability questions
Vaswani, felicitated as one of India's 30 leading minds in AI by Accel in Bengaluru on Wednesday, is based in San Francisco. While Essential AI is headed by Vaswani and Parmar, the core team is in the US, highlighting India's lack of focus on core engineering, driven by access to capital that is much lower than in the US.
"This is certainly an issue, and core engineering capability continues to lag in India. This is one key factor that we're also looking for in startups, but a lot of work happening here goes amiss in terms of core foundational work. Strategic companies doing foundational work will be key to progress in the field," added Prayank Swaroop, AI investor and partner at Accel.
Vaswani, however, said the evolution of GenAI likely has a lot to do with philosophy, alongside computer science and mathematics.
"Is computer science more mathematics or philosophy? It is perhaps both. Steve Jobs was the first person to articulate that successful products are a blend of technology, the liberal arts and philosophy. This is what can lead to us doing visionary work. Eventually, we're building the philosophy of how the world should be. The ethos behind technology is to solve problems, and that's the only job of innovation," he said.