IndicBERT v2 is a free, open language model from AI4Bharat that helps computers understand text written in Indian languages. It is built to power tasks like sorting, tagging, and analysing text, such as detecting sentiment or finding names in a sentence. It is an 'understanding' model, so it does not chat, answer questions conversationally, or write new text on its own.
Works in these languages, in both native script and romanized (Latin-alphabet) text.
An honest look at what it does well and where it struggles.
Who builds and maintains IndicBERT v2.
AI4Bharat is an open-source research lab at IIT Madras dedicated to advancing AI for Indian languages. Its freely released datasets, benchmarks, and models, such as Airavata and the IndicTrans translation series, are widely used across research and industry.
Comparable models you may also want to try.