
India's big AI test is here: Making sovereign language models work
English-dominated AI models can hallucinate (fabricate facts), mistranslate key phrases, or miss the cultural context when prompted in Indian languages.
The concern is also over inclusion. With over 1.4 billion people and 22 official languages, alongside thousands of dialects, India can ill afford to be an afterthought in the AI revolution. The country is expected to have more than 500 million non-English internet users by 2030. If AI models can't understand them, the digital divide will only widen.
To address this, the Indian government launched a $1.2 billion IndiaAI Mission in February 2024. One of its central goals: to fund and foster the development of sovereign local language models and small language models (SLMs)—AI systems that are built, trained, and deployed entirely within India, on Indian data.
While large language models (LLMs), such as GPT-4, handle broad tasks, having been trained on copious amounts of data, SLMs are smaller, typically built for specific uses.
In January, the government opened a nationwide call for proposals to develop foundational AI models rooted in Indian languages and datasets. By April, more than 550 pitches had poured in from startups, researchers, and labs eager to build either SLMs or general-purpose LLMs.
In April, the government selected Sarvam AI to lead the charge. The Bengaluru-based startup will develop the country's first foundational model trained on local language datasets. It will build a massive 120-billion parameter open-source model to power new digital governance tools.
Parameters are settings that control how the AI model learns from data before making predictions or decisions. For instance, in a language model like ChatGPT, parameters help decide which word comes next in a sentence based on the words before it.
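To make the idea of parameters concrete, here is a toy sketch in Python. It is not how ChatGPT or any production model works: real language models store billions of neural-network weights, whereas this example simply counts which word follows which. The function names and the three-sentence corpus are invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Learn the 'parameters': how often each word follows another."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Pick the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training text, made up for this example.
corpus = [
    "the farmer called the helpline",
    "the farmer asked a question",
    "the officer read the summary",
]
params = train_bigram(corpus)

# Each distinct (previous word, next word) pair is one learned value.
n_params = sum(len(followers) for followers in params.values())
print(n_params, predict_next(params, "the"))
```

Even in this toy, the learned counts play the role the article describes: they are set from the training data and then decide which word is suggested next.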
On 30 May, the government announced three more model-development efforts—from Soket AI, Gnani AI and Gan AI.
Soket AI, based in Gurugram, will build a 120-billion parameter multilingual model focused on sectors like defence, healthcare, and education; Gnani AI, based in Bengaluru, will develop a 14-billion parameter voice AI model for multilingual speech recognition and reasoning; Gan AI, also based in India's Silicon Valley, is working on a 70-billion parameter model aimed at advanced text-to-speech capabilities.
During the launch of the three additional models, union minister for electronics and information technology, Ashwini Vaishnaw, stressed the importance of more people being able to access technology and get better opportunities. "That's the philosophy with which IndiaAI Mission was created," the minister said.
A senior official from the ministry of electronics and information technology (MeitY), speaking on condition of anonymity, told Mint that a foundational sovereign language model can be expected within the next 12 months. "We will see many more sovereign models after a year or so, hosted on the government's AI marketplace platform," the official added.
Why it matters
Beyond the language gap, the global AI landscape is being shaped by rising concerns around sovereignty, data control, and geopolitical risk. As AI becomes the cornerstone of digital infrastructure, nations are racing to build their own models. For India, the move also aligns with the country's broader vision of 'Atmanirbhar Bharat' (self-reliant India).
India now joins a fast-growing club of countries that have developed or are developing sovereign LLMs—China (Baidu), France (Mistral), Singapore (SEA-LION), UAE (Falcon), Saudi Arabia (Mulhem), and Thailand (ThaiLLM).
Even before Sarvam, India had seen an uptick in language model building activity. BharatGPT (by CoRover), Project Indus (Tech Mahindra), Hanooman (by Seetha Mahalaxmi Healthcare and 3AI), Krutrim (Ola), and Sutra (by Two AI) are some examples.
In October 2024, BharatGen, a government-backed project, released Param-1, a 2.9-billion parameter bilingual model along with 19 Indian language speech models. Led by IIT Bombay, BharatGen's mission is to boost public service delivery and citizen engagement using AI in language, speech, and computer vision.
Imagine a farmer in eastern Uttar Pradesh calling a helpline and interacting with a chatbot that understands and replies fluently in Bhojpuri, while also generating a clear summary for a government officer to act on. Or an AI tutor generating regional-language lessons, quizzes, and spoken explanations for students in languages like Marathi, Tamil, Telugu, or Kannada.
These efforts fit into India's broader digital stack, alongside Aadhaar (digital identity), UPI (unified payments interface), ULI (unified lending interface) and ONDC (the Open Network for Digital Commerce).
In a world where AI models are fast becoming a symbol of digital leadership, "a sovereign LLM is also about owning the narrative, the data, and the future of its digital economy," said Akshay Khanna, managing partner at Avasant, a consulting firm.
"Sovereignty will be a key requirement in all nations including India," says Mitesh Agarwal, Asia-Pacific managing director at Google Cloud. He points out that Google's Gemini 1.5 processes data entirely within its India data centers. "For sensitive projects, we also offer open-source AI models and sovereign cloud options," he added.
Showing the way
Founded in July 2023 by Vivek Raghavan and Pratyush Kumar, Sarvam has raised $41 million from private investors. While the IndiaAI Mission won't inject cash, it will take a minority equity stake in the startup.
For now, Sarvam will receive computing power—over 4,000 Nvidia H100 graphics processing units (GPUs) for six months—to train its model. The aim is to build a multimodal foundation model (text, speech, images, video, code, etc.) capable of reasoning and conversation, optimized for voice interfaces, and fluent in Indian languages.
"When we do so, a universe of applications will unfold," Sarvam co-founder Raghavan said at the launch on 26 April. "For citizens, this means interacting with AI that feels familiar, not foreign. For enterprises, it means unlocking intelligence without sending data beyond borders."
Sarvam is developing three model variants: a large model for "advanced reasoning and generation"; a smaller one for "real-time interactive applications"; and "Sarvam-Edge for compact on-device tasks."
It is partnering with AI4Bharat, a research lab at the Indian Institute of Technology (IIT)-Madras, supported by Infosys co-founder Nandan Nilekani and his philanthropist wife Rohini, to build these models.
Sarvam has already developed Sarvam 1, a two-billion parameter multilingual language model, trained on four trillion tokens using Nvidia H100 GPUs.
The company claims its custom tokenizer (that breaks text into small units, like words or parts of words, so a language model can understand and process it) is up to four times more efficient than leading English-centric models when processing Indian languages, hence reducing costs.
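Why a tokenizer's vocabulary affects cost can be sketched with a toy comparison in Python. This is not Sarvam's tokenizer or any real model's: it assumes a worst case in which an English-centric vocabulary has no entries for Devanagari and falls back to one token per UTF-8 byte, and it compares that with a hypothetical vocabulary that contains Indic subword pieces. Both function names and the two-piece vocabulary are invented for illustration.

```python
def utf8_byte_tokens(text):
    """Worst-case fallback: one token per UTF-8 byte."""
    return [bytes([b]) for b in text.encode("utf-8")]


def greedy_vocab_tokens(text, vocab):
    """Greedy longest-match against a vocabulary of known pieces.

    Characters that match no piece become single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])
            i += 1
    return tokens


# The Hindi word "dard" (pain), written with escapes: DA, RA, VIRAMA, DA.
word = "\u0926\u0930\u094d\u0926"

# Hypothetical Indic-aware vocabulary holding two subword pieces.
indic_vocab = {"\u0926\u0930", "\u094d\u0926"}

byte_count = len(utf8_byte_tokens(word))
piece_count = len(greedy_vocab_tokens(word, indic_vocab))
print(byte_count, piece_count)
```

Models are typically billed and rate-limited per token, so fewer tokens for the same text means lower cost. The gap in this toy depends entirely on the made-up vocabulary and says nothing about the efficiency figure the company cites.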
Sarvam 1 supports 11 languages: Hindi, Bengali, Tamil, Telugu, Kannada, Malayalam, Marathi, Gujarati, Oriya, Punjabi, and English. It powers various generative AI (GenAI) agents and is also hosted on Hugging Face, enabling developers to build Indic-language apps.
Hugging Face is a platform for sharing and hosting open-source AI models and datasets.
Gnani.ai, meanwhile, is building voice-to-voice foundational LLMs that aim to produce near-instant autonomous voice conversations, with very low latency. The models also aim to enable "emotion aware conversations", which preserve intonation, stress and rhythm in the conversations, said Ganesh Gopalan, co-founder and CEO of Gnani.ai. "The model will enable realistic conversations in governance, healthcare and education," he added.
Wait and watch
Sovereign LLMs and SLMs are likely to find strong acceptance in public service delivery and citizen engagement services across the country, much as UPI did. However, enterprises will likely wait until the models mature, are secure enough, and hallucinate less.
Current sovereign models, Sanchit Vir Gogia, founder of Greyhound Research, explained, "lack deployment maturity, robust safety mechanisms, and domain-specific accuracy."
The Greyhound CIO Pulse 2025 survey found that 67% of enterprises exploring Indic LLMs report frequent failures in multilingual task execution, especially with mixed scripts (e.g., Devanagari + Latin), identifying regional slang, or recognizing emotional cues in customer queries.
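Mixed-script input of the kind the survey mentions can at least be detected before it reaches a model. The Python sketch below is one simple way to do that, using only the standard library; the function names are invented, and it checks just the two scripts named above by looking at each character's official Unicode name.

```python
import unicodedata


def scripts_in(text, known=("DEVANAGARI", "LATIN")):
    """Return which of the known scripts appear in the text.

    Relies on Unicode character names, which begin with the script
    name (for example "DEVANAGARI LETTER DA", "LATIN SMALL LETTER M").
    """
    found = set()
    for ch in text:
        name = unicodedata.name(ch, "")
        for script in known:
            if name.startswith(script):
                found.add(script)
    return found


def is_mixed_script(text):
    """True when the text contains more than one of the known scripts."""
    return len(scripts_in(text)) > 1


# "mujhe dard hai" with the middle word in Devanagari (written as escapes).
query = "mujhe \u0926\u0930\u094d\u0926 hai"
print(scripts_in(query), is_mixed_script(query))
```

Detection is only a first step: it lets a system route such queries to transliteration or a specially tuned model, but it does nothing by itself to resolve slang or emotional cues.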
Further, language in India is hyper-local. Hindi spoken in Varanasi differs significantly from Hindi in Patna—not just in accent, but in vocabulary and usage. A health insurance aggregator in Bengaluru faced real-world fallout when its LLM couldn't differentiate between 'dard' (pain) and 'peeda' (suffering), leading to claim errors. The company had to halt rollout and invest in regionally-tuned data, Gogia said.
Moreover, there are limited safeguards against hallucinations. "Without deeper fine-tuning, cultural grounding, and linguistic quality assurance, these models are too brittle for nuanced conversations and too coarse for enterprise-scale adoption," Gogia added. "The ambition is clear, but execution still needs time and investment."
The missing millions
Building sovereign models without government or venture capital funding could also pose a big challenge since developing a foundational model from scratch is an expensive affair. For instance, OpenAI's GPT was in the works for more than six years and cost upwards of $100 million and used an estimated 30,000 GPUs.
Chinese AI lab DeepSeek did build an open-source reasoning model for just $6 million, demonstrating that high-performing models could be developed at low costs. But critics point out that the reported $6 million cheque would have excluded expenses for prior research and experiments on architectures, algorithms, and data.
Effectively, this means that only a lab which has already invested hundreds of millions in foundational research and secured access to extensive computing clusters could train a model of DeepSeek's quality with a $6 million run.
Ankush Sabharwal, founder and CEO of CoRover, says that its BharatGPT chatbot is a "very small sovereign model with 500-million parameters". He has plans to build a 70-billion parameter sovereign model. "But, we will need about $6 million to build and deploy it," Sabharwal says.
Long way to go
A glance at the download numbers for the month of May from Hugging Face underlines the wide gap between some of India's local language models and similar-sized global offerings.
For instance, Sarvam-1's 2-billion model saw just 3,539 downloads during the month. Krutrim, a 12-billion model from Ola-backed Krutrim SI Designs, fared similarly with only 1,451 downloads. Fractal AI's Fathom-R1 14-billion model showed the most promise with 9,582 downloads.
In contrast, international models of comparable or slightly larger size saw far greater traction. Google's Gemma-2 (2-billion) logged 376,800 downloads during the same period, while Meta's Llama 3.2 (3-billion) surpassed 1.5 million. Chinese models, too, outpaced Indian counterparts. Alibaba's Qwen3 (8-billion) recorded over 1.1 million downloads, while a fine-tuned version of the same model, DeepSeek-R1-0528-Qwen3-8B, clocked nearly 94,500 downloads.
The numbers underline the need for a stronger business case for Indian startups.
The senior government official quoted earlier in the story said that sovereign models must stand on their own feet. "The government has created a marketplace where developers can access and build apps on top of sovereign models. But the startups must be able to offer their services first to India, and then globally," he said.
"API revenue, government usage fees, and long-term planning are key," Aakrit Vaish, former CEO of Haptik and mission lead for IndiaAI until March, said.
API revenue is what a company earns by letting others use its software features via an application programming interface. For example, OpenAI charges businesses to access models like ChatGPT through its API for writing, coding, or image generation.
Nonetheless, API access alone won't cover costs or deliver value, Gogia of Greyhound Research said. "Sovereign LLM builders must focus on service-led revenue: co-creating solutions with large enterprises, developing industry-specific applications, and securing government-backed rollouts," he suggested.
Indian buyers, he added, want control over tuning, deployment, and results. "They'll pay for impact, not model access. This isn't LLM-as-a-Service; it's LLM-as-a-Stack."
In short, capability alone won't cut it. To scale and endure, sovereign language models must be backed by viable business propositions and stable funding—from public and private sources alike.
