Google has officially introduced Gemma 4, its most intelligent family of open models to date, designed to handle advanced reasoning and complex autonomous tasks. Built using the same research and technology as the Gemini 3 flagship models, Gemma 4 is engineered to deliver a high level of "intelligence-per-parameter," allowing developers to run frontier-level AI on local hardware with reduced overhead.
The release marks a significant milestone for the "Gemmaverse," which has seen over 400 million downloads and the creation of more than 100,000 variants since its inception. To foster further innovation and digital sovereignty, Google has released the Gemma 4 weights under a commercially permissive Apache 2.0 license, providing developers with total control over their data and infrastructure.
Gemma 4: Versatile Model Sizes for Diverse Hardware
Google is releasing Gemma 4 in four distinct sizes tailored for different use cases and computing environments. The 31B Dense model currently ranks as the number three open model in the world on the Arena AI text leaderboard, while the 26B Mixture of Experts (MoE) model holds the sixth spot, outperforming models nearly 20 times its size.
For edge computing and mobile devices, Google introduced the Effective 2B (E2B) and Effective 4B (E4B) models, optimized in collaboration with the Google Pixel team and for hardware from Qualcomm and MediaTek. These smaller models prioritize multimodal capabilities and low-latency processing, running completely offline on devices ranging from smartphones to Raspberry Pi units.
Google Gemma 4 Specifications and Key Features
The Gemma 4 family introduces several technical breakthroughs designed for professional developers. The models now natively support "agentic workflows," which include function-calling, structured JSON output, and system instructions. This allows the AI to interact with external APIs and execute multi-step plans autonomously.
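In a function-calling workflow of this kind, the model replies with a structured JSON object naming a tool and its arguments, and the host application parses that output and executes the call. Below is a minimal Python sketch of that loop; the `get_weather` tool and the hard-coded model response are hypothetical stand-ins, not part of any Gemma API:

```python
import json

# Hypothetical tool the model is allowed to call.
def get_weather(city: str) -> dict:
    # Stand-in for a real weather API request.
    return {"city": city, "temp_c": 21}

TOOLS = {"get_weather": get_weather}

# A function-calling model emits structured JSON instead of prose.
# This string simulates what such a response could look like.
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'

def dispatch(raw: str) -> dict:
    """Parse the model's JSON tool call and execute the named function."""
    call = json.loads(raw)
    fn = TOOLS[call["name"]]           # look up the requested tool
    return fn(**call["arguments"])     # invoke it with the model's arguments

result = dispatch(model_output)
print(result)  # {'city': 'Paris', 'temp_c': 21}
```

In a multi-step agentic loop, the result would be fed back to the model as a tool message so it can plan the next call or produce a final answer.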
In terms of multimodal performance, all models can natively process video and images, excelling at tasks like optical character recognition (OCR) and chart understanding. The E2B and E4B versions also include native audio input for speech recognition. Additionally, the models support over 140 languages and feature expanded context windows, with edge models handling 128K tokens and larger models supporting up to 256K tokens.
To ensure accessibility, the 26B and 31B models are optimized to fit on a single 80GB NVIDIA H100 GPU in their unquantized form, while quantized versions can run on consumer-grade gaming GPUs. This local-first approach allows workstations to function as private AI code assistants without requiring constant cloud connectivity.
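The arithmetic behind these hardware claims is straightforward: weights at 16-bit precision cost 2 bytes per parameter, so a 31B-parameter model needs roughly 62 GB for weights alone, which fits an 80 GB H100 with headroom for the KV cache; 4-bit quantization cuts that to roughly 15.5 GB. A back-of-envelope sketch (the function name is ours, and it ignores activations and cache overhead):

```python
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate VRAM needed for model weights only (no KV cache or activations)."""
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return bytes_total / 1e9  # decimal gigabytes

# 31B dense model: unquantized 16-bit vs. 4-bit quantized.
print(weight_memory_gb(31, 16))  # 62.0 GB -> fits a single 80 GB H100
print(weight_memory_gb(31, 4))   # 15.5 GB -> consumer-GPU territory
```

Real deployments need extra memory for the context window's KV cache, so the practical headroom on an 80 GB card is smaller than the raw weight figure suggests.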
Google has ensured day-one support for major AI tools and platforms, including Hugging Face, NVIDIA NIM, Ollama, and Kaggle. Developers can also scale their projects to production using Google Cloud’s Vertex AI or GKE for regulated workloads. To encourage social innovation, Google has also launched the "Gemma 4 Good Challenge" on Kaggle, inviting the community to build products that create positive global change.
(The above story first appeared on LatestLY on Apr 03, 2026 04:41 PM IST. For more news and updates on politics, world, sports, entertainment and lifestyle, log on to our website latestly.com).















