IndicBERT v2

Base, Open-source, Multilingual, हिन्दी, اردو, ਪੰਜਾਬੀ, +18 more

In plain words

IndicBERT v2 is a free, open language model from AI4Bharat that helps computers understand text written in Indian languages. It is built to power tasks like sorting, tagging, and analysing text, such as detecting sentiment or finding names in a sentence. It is an 'understanding' model, so it does not chat, answer questions conversationally, or write new text on its own.

How to use it

🚧
This model is coming soon!
A step-by-step guide for IndicBERT v2 will appear here once it goes live.

Languages & scripts supported

Works with text in these languages, whether written in the native script or typed in Roman letters.

हिन्दी Hindi, اردو Urdu, ਪੰਜਾਬੀ Punjabi, संस्कृतम् Sanskrit, नेपाली Nepali, मैथिली Maithili, سنڌي Sindhi, کٲشُر Kashmiri, বাংলা Bengali, অসমীয়া Assamese, ଓଡ଼ିଆ Odia, ᱥᱟᱱᱛᱟᱲᱤ Santali, बड़ो Bodo, মৈতৈলোন্ Manipuri (Meitei), मराठी Marathi, ગુજરાતી Gujarati, தமிழ் Tamil, తెలుగు Telugu, ಕನ್ನಡ Kannada, മലയാളം Malayalam, कोंकणी Konkani, + Roman & code-mixed

Strengths & limits

An honest look at what it does well and where it struggles.

Good at
Sorting and tagging text
Detecting sentiment and topics
Finding names and entities
Working across many Indian languages
Where it struggles
Not for chat or writing
Cannot answer questions conversationally
Needs fine-tuning for each task
Built for understanding, not generation

About the maker

Who builds and maintains IndicBERT v2.

AI4Bharat
4 models

AI4Bharat is an open-source research lab at IIT Madras dedicated to advancing AI for Indian languages. Its freely released datasets, benchmarks, and models, such as Airavata and the IndicTrans translation series, are widely used across research and industry.
