Bengaluru, February 8, 2026: While the global AI conversation is often dominated by the US and China, India is beginning to assert itself in core AI innovation, and Bengaluru-based startup Sarvam AI is leading that shift. The company is building what it calls a “sovereign AI” by developing foundational models entirely in India. Its latest tools, Sarvam Vision and Bulbul V3, are now drawing global attention for their strong performance.

Sarvam Vision, the company’s optical character recognition (OCR) model, is outperforming major AI systems such as ChatGPT, Google Gemini and Anthropic Claude on key benchmarks. According to Sarvam AI, the model achieved an accuracy score of 84.3 percent on the olmOCR-Bench, beating Gemini 3 Pro and DeepSeek OCR v2, while ChatGPT ranked significantly lower.

On OmniDocBench v1.5, which evaluates how well AI systems read and understand real-world documents, Sarvam Vision scored an impressive 93.28 percent overall. The model showed especially strong performance on complex layouts, technical tables and mathematical formulas, areas where traditional OCR systems often struggle due to messy formatting and dense content.

Sarvam AI co-founder Pratyush Kumar shared details of these achievements in a series of posts on X, highlighting the progress of the company’s in-house AI models. The strong benchmark results have helped Sarvam gain global recognition, particularly after earlier criticism over its focus on Indic-language models.

Tech commentator Deedy Das, who had previously questioned the value of training smaller Indic-language models, publicly acknowledged that he had underestimated Sarvam AI. In a post on X, Das praised the company’s OCR and speech models for Indian languages, calling them highly valuable and noting that large global AI labs have largely ignored this space. Other users have echoed these sentiments, with many praising the accuracy and reliability of Sarvam’s tools.

In addition to Sarvam Vision, the startup has launched Bulbul V3, a new text-to-speech AI model designed for Indian languages. The tool aims to generate natural and expressive voices, competing with global players like ElevenLabs. Bulbul V3 currently supports more than 35 voices across 11 Indian languages, with plans to expand to 22 languages in the future.

Bulbul has also received strong praise from the ecosystem. Pratik Desai, founder of KissanAI, said that Bulbul is their go-to text-to-speech model for Indic use cases and noted that it has improved with every release, while global alternatives remain expensive and less practical for Indian languages.

