Bengali AI is an initiative to develop and promote artificial intelligence in Bangla, the language spoken by more than 200 million people in Bangladesh and India. The initiative was inaugurated in 2018 by a non-profit community of researchers, engineers, and enthusiasts who wanted to create a platform for Bangla speakers to develop, learn, share, and collaborate on AI-related topics.
Bengali AI provides various resources, such as online courses, tutorials, datasets, tools, and competitions, to help people learn and apply AI in their domains of interest. The initiative also organizes events, such as workshops, hackathons, and meetups, to foster a community of Bangla AI enthusiasts and practitioners.
Industry Insider sat down with Siha Haque, Bengali AI’s campaign manager, to discuss the project and its plans.
Whenever we talk about AI, we tend to think of the big tech companies backing its development. Microsoft Corp. is an investor in OpenAI, one of the world’s most prominent AI startups. Alphabet has been pouring money into the development of Google Bard. Nvidia, another tech giant, acquired DeepMap, an AI company that provides high-definition maps for autonomous vehicles. Meta, formerly known as Facebook, bought ReFlect, an AI company that creates realistic virtual humans. These are just a few examples of how big tech companies invest in and acquire AI companies to gain access to their technology and talent. Bengali AI, however, is neither funded by corporations nor aimed at making a profit. Why?
“Bengali AI is a non-profit community that relies on open-source collaboration from researchers and engineers worldwide. So, if we turn it into a profit-making scheme, the open-source collaborations will be adversely affected. We do not want that. Our main goal is to spread all of our research and datasets for the betterment of everyone. Profiteering may create a bottleneck,” elaborated Siha.

So, the Bengali AI team believes that open-source collaboration is the best way to develop AI in Bangla. This may sound unusual for a tech initiative, since tech companies generally have a fixed team of developers. When asked about it, Siha’s reply was intriguing.
“Bengali AI is not an organization per se; we would like to call it a community of researchers and developers who work for the progress of technology in our mother tongue. And for this, loose, flexible open-source research is the most effective way forward.”
There is no business model behind Bengali AI. It is funded entirely by various universities and independent researchers, and the funding goes to individual projects, such as its sign-language-to-text and OCR projects.
How a community operates matters when its members are spread across different countries or even continents. Bengali AI has therefore developed a rather flexible structure and maintains an efficient skeleton crew.
“We work on projects where we recruit annotators, transcribers, and other staff. Sometimes, researchers join us for academic purposes or their own research. Anyone interested in research can work with us.”
There was a persistent notion that the Bengali language was incompatible with modern technology or advancements. This cannot be dismissed entirely, given how little has been done in this field. Bengali AI was the first initiative to make AI available in Bangla. Has that notion changed now?

“Definitely, it has changed. We never think that the Bangla language is incompatible with modern technologies,” Siha explained. “However, we do believe that most of the research has been conducted on English or Latin-based languages. We have a scarcity of research on Bangla or similar oriental languages. So, English and other Latin languages enjoy more acceptance in the technological field.”
“But it does not mean Bangla is incompatible; it just means that not enough research has been done here. We are working tirelessly to fill that gap.”
Bengali AI has cracked the code of public participation in the age of social media. The majority of their dataset is obtained through various social media campaigns. They have conducted dialect campaigns, speaking campaigns, and even cursing campaigns! Did those campaigns help?
“Yes, we launched several campaigns on social media. For example, we asked people through social media to provide their voices through Mozilla Common Voice for text-to-speech dataset collection. We used influencers to promote our campaigns. And the result was quite satisfactory. Two thousand hours’ worth of voice data was collected. We also targeted national holidays like International Mother Language Day, Independence Day, or Victory Day to tap into people’s patriotic sentiments and gather data.”
Incentives were also given for participating in these campaigns, including prize money large enough to motivate people. One of the more interesting campaigns was the cursing competition, in which people were asked to curse in Bangla, record it, and send it to the team. There was also a meme competition held in collaboration with Rangatese, a Facebook group. “It always works if you know how to do it,” Siha replied with a smile.
So, how would the business environment benefit from the development of AI in Bangla? “The business community will benefit in every sense. For example, small and medium online companies’ customer response relies mostly on English because of the lack of Bangla AI. It creates a significant barrier, as most of the people of the country do not speak English. So, the businesses will get a bigger customer base. We can make a Bangla Alexa or a Bangla ChatGPT. Then, people will not need English proficiency to communicate with technology. Like China or Japan, we too can communicate with technology in our own language,” explained Siha.
“Bengali AI can help businesses communicate more effectively with customers, partners, and employees who speak Bangla as their first or second language. It can also assist businesses in reaching out to new markets and audiences who speak Bangla, whether online or offline. Bengali AI can also make content in other languages or formats more accessible and clear for Bangla speakers by offering translation, transcription, summarization, or explanation. It will also assist businesses in preserving and promoting our rich heritage and diversity of language and culture,” Siha added.
Bengali AI can also assist organizations in developing a distinct identity and brand image that expresses their beliefs and goals concerning Bangla, and help them connect emotionally and culturally with their customers and stakeholders.
“Finally, Bengali AI can help businesses conduct research and development in various fields and domains related to or influenced by the Bangla language and culture. It can also help businesses collaborate with other researchers and developers who work on similar or complementary topics or projects involving Bangla. Bengali AI can also help businesses innovate new solutions or applications that address the specific needs or challenges of the Bengali people,” she concluded.
Thus, it is not unreasonable to be hopeful about the future of Bengali AI and the impact it may soon have on bringing our mother tongue and technology together. Let us keep our fingers crossed.