Is This the Best Open-Source Model of 2026?

The latest batch of open-source models from Google has arrived: the Gemma 4 family. Open-source models have grown popular lately thanks to their privacy benefits and the flexibility to be easily fine-tuned, and now the Gemma 4 family gives us four open-source models that look very promising on paper. So without further ado, let's cut to the chase and see what the hype is all about.
The Gemma family
Gemma is a family of lightweight, open-source language models developed by Google. It’s built using the same research and technology that powers Google’s Gemini models, but designed to be affordable and efficient.
What this means: Gemma models are designed to run in resource-constrained environments, such as laptops, consumer GPUs and mobile devices.
Each model ships in two versions:
- Base versions (for modification and custom fine-tuning)
- Instruction-tuned (IT) versions (good for conversation and general use)
These are the models that come under the umbrella of the Gemma 4 family:
- Gemma 4 E2B: With ~2B active parameters, it is a multimodal model optimized for edge devices such as smartphones.
- Gemma 4 E4B: Similar to the E2B model, but with ~4B active parameters.
- Gemma 4 26B A4B: A Mixture-of-Experts model with 26B total parameters that activates only 3.8B of them (~4B active parameters) during inference. Quantized versions of this model can run on consumer GPUs.
- Gemma 4 31B: A dense model with 31B parameters; it is the most powerful model in this range and is best suited for fine-tuning purposes.
The E2B and E4B models feature a 128K context window, while the larger 26B and 31B feature a 256K context window.
Note: All models are available both as a base model and an instruction-tuned ('IT') model.
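To build intuition for why a Mixture-of-Experts model like the 26B A4B can hold 26B parameters yet only spend ~4B per token, here is a toy sketch of top-k expert routing. All sizes and names below are illustrative, not Gemma's actual architecture:

```python
import numpy as np

def topk_gate(router_logits, k=2):
    """Pick the k experts with the highest router scores for one token."""
    top = np.argsort(router_logits)[-k:]   # indices of the k best-scoring experts
    weights = np.exp(router_logits[top])
    return top, weights / weights.sum()    # normalized mixing weights

rng = np.random.default_rng(0)
n_experts, d = 8, 4
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]  # 8 toy expert layers
token = rng.standard_normal(d)
router = rng.standard_normal(n_experts)    # router scores for this token

idx, w = topk_gate(router, k=2)
# Only 2 of the 8 expert matrices touch this token, so just 25% of the
# expert parameters are "active" even though all 8 exist in memory.
out = sum(wi * experts[i] @ token for wi, i in zip(w, idx))
print(f"active experts: {sorted(idx.tolist())}, active share: {2 / n_experts:.0%}")
```

The total parameter count grows with the number of experts, but the per-token compute only grows with k, which is the total-vs-active split the A4B naming refers to.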
Below are the benchmark scores for the Gemma 4 models:
Key features of Gemma 4
- Code generation: Gemma 4 models can be used to generate code, and their LiveCodeBench benchmark scores look good too.
- Agentic systems: Gemma 4 models can run locally within agent workflows, or stand alone and be integrated into production-grade applications.
- Multilingual support: These models have been trained on more than 140 languages and can be used for multilingual applications or translation.
- Advanced reasoning: These models show significant improvements in math and logic compared to their predecessors. They can be used for agents that require planning and multi-step reasoning.
- Multimodality: These models can natively process images, video and audio. They can be used for tasks such as OCR and speech recognition.
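For the multimodal side, a chat request can mix text and an image in one message. Here is a minimal sketch of that payload, assuming the OpenAI-style schema that huggingface_hub's InferenceClient uses; the image URL is a placeholder you would replace with your own hosted image:

```python
# Sketch of a multimodal chat payload (OpenAI-style schema, as used by
# huggingface_hub's InferenceClient). The URL below is a placeholder.
image_url = "https://example.com/receipt.jpg"

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Transcribe the text in this image."},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }
]

# With a client and a vision-capable checkpoint, the call mirrors the
# text-only example later in this post:
# completion = client.chat.completions.create(model=..., messages=messages)
```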
How to use Gemma 4 with Hugging Face?
Gemma 4 is released under the Apache 2.0 license, so you are free to build with the models and use them in any environment. The models can be accessed via Hugging Face, Ollama and Kaggle. Let's try out 'Gemma 4 26B A4B IT' through the inference providers on Hugging Face; this will give us a better picture of the model's skills.
Prerequisites
Hugging Face access token:
- Go to your Hugging Face account settings and open the access tokens page.
- Create a new token, give it a name, and tick the required permission boxes before creating the token.
- Keep the Hugging Face token handy.
Python code
I will be using Google Colab for the demo, feel free to use what you like.
from getpass import getpass
hf_key = getpass("Enter Your Hugging Face Token: ")
Paste the Hugging Face token when prompted:

Let’s try building the frontend of an e-commerce site and see how the model works.
prompt="""Generate a modern, visually appealing frontend for an e-commerce website using only HTML and inline CSS (no external CSS or JavaScript).
The page should include a responsive layout, navigation bar, hero banner, product grid, category section, product cards with images/prices/buttons, and a footer.
Use a clean modern design, good spacing, and laptop-friendly layout.
"""
Sending the request to an inference provider:
from huggingface_hub import InferenceClient

client = InferenceClient(
    api_key=hf_key,
)

completion = client.chat.completions.create(
    model="google/gemma-4-26B-A4B-it:novita",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": prompt,
                },
            ],
        }
    ],
)

print(completion.choices[0].message)
After copying the generated code into an HTML file, this is the result I got:
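Instead of copying the code by hand, you can save the reply to a file directly. Models often wrap code in markdown fences, so the small `extract_html` helper below (my own utility, not part of any library) strips them first; the `reply` string here stands in for `completion.choices[0].message.content`:

```python
import re
from pathlib import Path

def extract_html(reply: str) -> str:
    """Pull the HTML out of a model reply, dropping any ```html fences."""
    match = re.search(r"```(?:html)?\s*(.*?)```", reply, flags=re.DOTALL)
    return (match.group(1) if match else reply).strip()

# Stand-in for completion.choices[0].message.content
reply = "Here you go:\n```html\n<html><body><h1>Shop</h1></body></html>\n```"

html = extract_html(reply)
Path("index.html").write_text(html, encoding="utf-8")  # open this file in a browser
```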


The output looks good, and the Gemma model seems to be working fine. What do you think?
Conclusion
The Gemma 4 family looks promising not only on paper but also in practice. With a range of capabilities and models built for different needs, Gemma 4 gets a lot of things right. And with open-source AI becoming more and more popular, we have real options to try, test and find the models that best suit our needs. It will also be interesting to see how devices such as mobile phones, Raspberry Pis and other edge hardware benefit from models that use memory dynamically in the future.
Frequently Asked Questions
Q. What does 'E2B' mean?
A. E2B means ~2B active parameters (about 2.3B in practice), while the total parameter count, including embeddings, reaches about 5.1B.
Q. Why don't embeddings count toward the active size?
A. Large embedding tables are mainly used for lookups, so they increase the total number of parameters but not the effective computational size of the model.
Q. What is a Mixture-of-Experts (MoE) model?
A. A Mixture-of-Experts model activates only a small set of specialized expert networks per token, improving efficiency while retaining the capacity of a much larger model. Gemma 4 26B A4B is a MoE model.