The risk of shrinking linguistic usage due to a lack of institutional efforts and infrastructure to teach the Nepali language to technology.
KATHMANDU: On May 9, the Deputy Speaker of the House of Representatives, Rubi Kumari Thakur, posted a status on Facebook defending the Constitutional Council’s decision to bypass seniority and recommend Manoj Kumar Sharma as Chief Justice. To strengthen her argument in the status written in the Maithili language, she included a screenshot of a response generated by ChatGPT. The screenshot, written in the Nepali language, mentioned that the senior judges recommended by the Judicial Council for Chief Justice were close to the Nepali Congress and the CPN (UML). On the other hand, it stated that Sharma was not aligned with any political party.
The sources showing the political alignment of the judges in the ChatGPT screenshot were not official. Instead, social media platforms Reddit and Wikipedia were used as sources. After the status became controversial, Thakur deleted it.
Questions arose regarding the use of AI (Artificial Intelligence) tools like ChatGPT by lawmakers of the ruling Rastriya Swatantra Party (RSP) to gather content for parliamentary speeches. Following this, the party cautioned its legislators. On May 20, the party’s Chief Whip, Krantishikha Dhital, warned lawmakers in an RSP WhatsApp group that it would be better not to speak if they lacked content, but they should avoid using AI from now on. Stating that relying entirely on ChatGPT could yield incorrect or incomplete information, she advised them to study independently and present original, fact-based views when addressing parliament.
On June 19, 2026, former Prime Minister and co-coordinator of the Nepal Communist Party (NCP), Madhav Kumar Nepal, also shared an interesting experience regarding the use of AI. During a program organized at the NCP headquarters in Peris Danda, Koteshwor, that day, he mentioned receiving incorrect information when he asked ChatGPT about himself. He said, “I used ChatGPT and asked, ‘Do you know who I am?’ It started responding rapidly. But what it told me was wrong. I asked, ‘Are you giving wrong information?’ It then asked me, ‘What was incorrect?’ I replied, ‘You said I have two sons. While I have one son and one daughter, are both of them sons?’ It then admitted the mistake and said, ‘I will correct it.'” Leader Nepal has one son and one daughter.
The experiences of using AI tools, from leaders’ and MPs’ Facebook statuses to parliamentary speeches, show that these platforms have not yet become reliable for factual information and details regarding the Nepali language and local contexts.
Lately, people’s reliance on AI platforms has been increasing. The use of such platforms has grown for everything from gathering general information and facts and finding ways out of daily difficulties to seeking solutions for life’s complexities.
Lawmakers and leaders have begun using AI tools to write social media statuses and speeches, as well as to make their arguments effective on various issues. Although content prepared using AI tools appears concise and impactful, most people do not seem to notice that it can be factually incorrect, incomplete, and misleading. Failing to realize that these tools, which should be used as research aids, can also make mistakes, and instead treating them as absolute, has led public representatives and major leaders alike to fail on facts and fall into controversy.
Information and details provided by platforms like ChatGPT are considered superior in English compared to other languages. However, because they are not yet well-versed in the Nepali language and context, the margin of error and disruption is quite high. Due to a lack of adequate training efforts to teach technology the Nepali language, the results obtained from them in Nepali tend to be incorrect, misleading, half-baked, and sometimes completely nonsensical.
According to AI experts, the use of AI in the Nepali language can be helpful for writing assistance, preparing summaries, improving language, and tightening the flow of discussions on a topic. However, relying entirely on it to gather facts, seek legal advice, or prepare political analysis carries a high probability of error. While it is suitable for preparing an initial draft on a subject, fact-checking the facts, sources, citations, dates, places, and the implied context is mandatory.
Bal Krishna Bal, a professor at Kathmandu University’s Department of Computer Science and Engineering, explains that such problems occur because currently popular Large Language Model (LLM) platforms, including ChatGPT, Gemini, and Claude, are not tailored to Nepal’s local context. “While platforms like ChatGPT perform reasonably well in English compared to other languages, even those results require fact-checking. However, since it remains in an underdeveloped state for the Nepali language, if we begin to accept search results word-for-word as they appear, it will invite disaster,” he says.

The agreement signed between Nepal and India on June 6, 2026, for the development of language AI and digital infrastructure. Photo Source: pib.gov.in
Bal states that an LLM will deliver highly accurate results only if it is trained to match local culture, civilization, lifestyles, and colloquial speech. He says, “We need information suited to our soil, and for that, LLMs must be habituated with local contextual data.”
According to Bal, when we ask or search for something, an LLM does not simply pull facts and information directly from an official source. It provides a probable answer based on the data it was trained on. Due to the lack of a repository of Nepali language data containing sufficient information about local contexts and situations, the answers provided by AI tools contain many errors. This has constrained the use of AI tools in the Nepali language. Bal says, “If we train external, large-scale multilingual models by providing them with data from our context, it will yield good results.”
Experts note that the collection of Nepali language data is currently at a mediocre stage. Sant Basnet, a machine learning and algorithm programmer, states that pre-training models (models trained on large volumes of data before being customized for a specific task) are currently being used and then “re-trained” and “fine-tuned” according to local data. “It’s not that work isn’t happening here, but instead of an organized effort, work is being carried out by scattered groups,” he says.
Small and fragmented efforts
The high probability of errors in answers generated in the Nepali language by AI tools is not just due to the lack of data training for LLMs, but also because there haven’t been sufficient efforts to digitize the Nepali language. It has been nearly a year since the government introduced a policy on how to use AI in Nepal. The National Artificial Intelligence Policy, 2025, approved by the Council of Ministers in July/August 2025, mentions the establishment of an AI Regulatory Council, a National AI Center, and an AI Center of Excellence. However, no work has been done regarding this. The government’s budget for the upcoming fiscal year 2026/27 mentions establishing a data and computing center in Syuchatar, Kathmandu. Similarly, it states that thousands of AI processor units will be made available to research institutions.
Nepal falls among the weak nations in terms of AI infrastructure and policy readiness. In the ‘Government AI Readiness Index 2025’ prepared by Oxford Insights, Nepal ranks 106th out of 195 countries. Although Nepal jumped 46 positions compared to the previous year, it remains weaker than Bhutan and Bangladesh. On the list, Bhutan ranks 102nd, Bangladesh 75th, India 27th, and China 8th. This demonstrates that Nepal is still lagging behind in terms of AI readiness.
Despite appearing weak in government preparation, Professor Bal notes that the work of teaching technology to understand the Nepali language began two decades ago. According to him, research institutions, universities, and private sector companies have played roles in these efforts. Initially, work was done on spell-checking and later on Optical Character Recognition (OCR) to convert words in image formats into editable formats. Work has also been done on areas like ‘Text to Speech’ (reading out text), ‘Automatic Speech Recognition’ (transcribing spoken words), and machine translation. Under Bal’s leadership, translation work involving English, Nepali, and Tamang languages was carried out. Such efforts in Nepal, however, remain project-based.
Bhim Narayan Regmi, an Assistant Professor of Linguistics at Tribhuvan University, says that neither the state nor universities have worked on this matter in a planned manner. “A few individuals seem to have been involved over the past eight to ten years in organizing and downloading content published in newspapers and online media.
Such work has happened individually, by small groups, and by private companies,” he says. “I don’t see involvement at the level of the state or universities. Some works are project-based. No work has been done on how to preserve the completed work.” He adds that once a project ends, there are no mechanisms to retain the human resources that worked on it.
According to Basnet, the machine learning and algorithm programmer working at Wise-Yak, a Nepali company involved in AI, different groups are working in Nepal to teach machines the Nepali language. “So far, work has only happened on a very small scale; government agencies have also done small-scale work based on their requirements and used it internally. Comprehensive and integrated work applicable to everyone hasn’t happened at all,” Basnet says.
Basnet, who wrote a program to maintain consistency in Nepali language spelling, complains that no government agency has taken ownership to enforce standard spelling. Because of this, he says, different publishing houses and media outlets follow different spelling practices. He states, “The Kantipur newspaper writes one type of spelling, while other media outlets use another. The government has no interest in how to maintain uniformity in writing Nepali spelling.”
Professor Bal of Kathmandu University suggests that government institutions like the Nepal Academy and the Language Commission must take the initiative to establish Nepali language standards in technology. “Whatever work we have done so far has occurred separately through some research institute, company, or university. It has been many years since we raised our voices stating that this must be worked on in an organized and integrated manner, but it hasn’t happened yet,” he says.
Krishna Prasad Neupane, Chairperson of the Language Commission, states that they are not even in a position to work on AI due to a lack of budget. He says, “This year, the government has allocated only Rs 1.8 million for programs. With this budget, programs have to be conducted across all seven provinces; what can possibly be achieved with this?”
Neupane adds that alongside the budget crunch, the Commission also lacks the technical human resource required to work on AI. “Our sanctioned posts are only for computer operators, and AI work cannot be done by them,” he says. He notes that while the Commission has a sanctioned strength of 35 employees, even that workforce is not fully staffed.
Experts point out that because a shared ‘corpus’ (text collection) for the Nepali language has not been prepared, the answers provided by AI regarding the Nepali language and context remain weak. A corpus is a large and structured collection of text/audio data used to teach, test, or improve AI in any language. It includes news, books, government documents, public records, speeches, social media posts, dictionaries, translations, and audio transcriptions. A corpus is used to train, test, or evaluate AI models. However, there is no uniformity in the corpus.
Lack of standardization
According to experts, AI will deliver good results if it is trained on high-quality data available in the Nepali language. They state that for such training, a language standard must be developed. In other words, a maximum amount of high-quality content with linguistic uniformity must be integrated into computing systems.
Various organizations have been digitizing Nepali language written materials. One of them is the Madan Puraskar Pustakalaya (Library). Deepak Aryal, Research Director at the library, says that while putting high-quality Nepali language materials into computing systems is considered good, the sensitive question remains as to whose data will dominate that system.
Specifically, since the Nepali language is spoken in Nepal as well as in India’s northeastern states—including Darjeeling, Sikkim, and Assam—he worries that India’s corpus archiving work could influence us.
The language used across Nepal and Indian states differs not only in speech but also in writing. In this situation, if more Nepali language data from India enters the computing system than from Nepal, questions arise regarding what will happen to the language we speak. “When India works on the Nepali language, it will work on the language spoken and written there. If the Nepali language data of Indian Nepalis becomes dominant, that is what will prevail in technology, and AI will deliver results based on their data,” Aryal says.
He mentions that since many languages are written in the Devanagari script, there is a risk that AI might interpret words from other languages as Nepali, leading to errors. “In such a scenario, how do we distinguish which language the words written in Devanagari belong to?” Aryal asks. “The same word carries different meanings in different languages. Therefore, a mechanism to separate the Nepali language from words in other languages is necessary.”
To eliminate linguistic barriers for users and improve quality, the Government of India launched the AI-based platform ‘Bhashini’ in July 2022. It translates among India’s regional languages, turns speech to text and text to speech, and expands access to language services.
Apart from India, countries like South Korea, Japan, and Singapore have made massive investments in corpora for local languages, spoken language data, translation models, and platforms. South Korea and Japan have developed large models and corpora for their respective languages.
Expensive technology
Improving AI in the Nepali language can easily be done at two levels. First, by standardizing the Nepali corpus, a system can be developed where AI fetches answers from official sources. Second, fine-tuning can be done according to the Nepali context using the corpus, translated words, colloquial audio, and human-verified data.
Beyond this, another option is for the country to build its own LLM, i.e., an AI model. However, this is an extremely expensive and long-term project. It requires GPU (Graphics Processing Unit) infrastructure, data licensing, linguists, engineers, and continuous verification. India spent INR 4.5 billion on the Bhashini platform.
Sanjay Paudel, who teaches AI at Yonsei University in South Korea, notes that because hardware prices have skyrocketed recently, future AI projects will become even more expensive. “In the past two years, hardware prices have almost doubled. Now, a country like Nepal is in a position where it cannot afford the costs of building AI infrastructure.”
Sant Basnet, the machine learning and algorithm programmer, states that Nepalis do not even have the capacity to run AI systems built through their own efforts on local hosts.
“For a demo, looking at one or two seems fine, but no data center in Nepal has the infrastructure to commercially serve a large number of users,” he says.
He mentions that discussions haven’t even begun regarding how much hosting would cost and whether it is sustainable. As technology becomes more expensive, he notes that the inability to mold technology to local needs has increased the likelihood of shrinking the usage of the Nepali language. He adds that challenges persist because Nepal is also weak in infrastructure, including the energy required for computing systems.
Assistant Professor Regmi from Tribhuvan University’s Department of Linguistics warns that if necessary work is not carried out in computing systems, the usage of the Nepali language could shrink. “If AI delivered good results in the Nepali language, the use of the language would increase and flourish; since that is not happening, its usage will keep shrinking, eventually risking extinction,” he says.