Gemma
Quick Facts
- Vendor
- Google DeepMind
- Released
- Gemma 1 (February 2024); Gemma 2 (June 2024); Gemma 3 (2025)
- Current line
- Gemma 3 (1B / 4B / 12B / 27B) · CodeGemma · PaliGemma · ShieldGemma
- License
- Gemma Terms of Use (permissive; commercial use allowed with a use-policy clause)
- Hosting
- Self-hosted (vLLM, Ollama, llama.cpp); Vertex AI; Kaggle; Hugging Face
- Context window
- 128K tokens (Gemma 3)
- Modalities
- Text, vision (Gemma 3 native; PaliGemma specialized)
- Architecture
- Dense transformer based on Gemini research
Summary
Gemma is Google's open-weights line, launched in February 2024 as the open complement to Gemini. The models are trained on similar infrastructure and incorporate architectural choices from Gemini research, but ship with weights, permissive licensing, and a size range (1B–27B) aimed at self-hosted deployments. Gemma 3 (2025) added native multimodality and a 128K context window while keeping the same size-tier structure.
The family extends beyond the base text models. CodeGemma targets code workloads, PaliGemma is specialized for vision-language tasks, and ShieldGemma is the moderation classifier — Google's open answer to Llama Guard.
Model Lineup
- Gemma 3 27B — top tier. Competitive on general benchmarks with Llama 3.x 70B at less than half the parameter count. Native multimodal.
- Gemma 3 12B / 4B / 1B — size-tiered variants. 1B targets on-device; 4B is a strong Mac Mini fit; 12B covers mid-tier GPU hosting.
- CodeGemma — coding-specialized. 2B and 7B variants.
- PaliGemma — vision-language. Multimodal fine-tuning base.
- ShieldGemma — moderation classifier. Input/output safety filtering.
Where Gemma Fits
Gemma is the default when the deployment is self-hosted and multilingual quality matters — Google's training data mix yields notably stronger non-English performance than most peers at similar size. The permissive Gemma Terms of Use beat Llama's Community License for many commercial scenarios (no 700M MAU clause). On a Mac Mini, Gemma 3 4B is a credible alternative to Qwen3 7B or Phi-4-mini.
Tradeoffs
- Tool use and agentic behavior trail Qwen3 and Hermes at similar sizes. Fine-tune or pair with a strong agent framework for complex tool loops.
- License nuance. Gemma Terms of Use is commercial-friendly but includes a use policy that must be accepted and propagated. Not strictly "open source" by OSI definition.
- Ecosystem compared to Llama is smaller — third-party fine-tunes, quantizations, and tooling integrations exist but in smaller quantity.
- No frontier tier. 27B is the top; for frontier-class workloads, go to hosted Gemini or to Llama 4 / Qwen3 235B.
Deployment Notes
Within the Claw ecosystem, Gemma 3 is a strong alternative to Qwen3 for customers who want a Google-lineage model self-hosted — often preferred where existing Google Cloud / Workspace relationships argue for architectural consistency. ShieldGemma is a drop-in alternative to Llama Guard in the FrawdBot preprocessing pipeline. For multilingual-heavy workloads (i18n support, non-English documentation), Gemma is often the best open-weights pick.