Gemma open models
A family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models.
Responsible by design
Incorporating comprehensive safety measures, these models help ensure responsible and trustworthy AI solutions through curated datasets and rigorous tuning.
Unmatched performance for its size
Gemma models achieve exceptional benchmark results at their 2B and 7B sizes, even outperforming some larger open models.
Framework flexible
With Keras 3.0, enjoy seamless compatibility with JAX, TensorFlow, and PyTorch, empowering you to effortlessly choose and switch frameworks depending on your task.
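Keras 3 selects its backend once, at import time, from the KERAS_BACKEND environment variable, which is what makes switching between JAX, TensorFlow, and PyTorch a one-line change. A minimal sketch of that selection pattern (the helper name `select_backend` is ours, and this assumes a Keras 3 installation with the chosen framework available):

```python
import os

def select_backend(name: str) -> None:
    """Keras 3 reads KERAS_BACKEND once, when `import keras` runs,
    so the variable must be set before that first import."""
    assert name in {"jax", "tensorflow", "torch"}
    os.environ["KERAS_BACKEND"] = name

# Pick the framework; the same model code then runs unchanged on it.
select_backend("jax")
# ... followed by `import keras` and ordinary Keras model code.
print(os.environ["KERAS_BACKEND"])  # jax
```

Because the backend is fixed at import time, changing frameworks for an existing script means changing only this variable, not the model code.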
Gemma model variants
Quick-start guides for developers
Partner quick-start guides
Benchmarks
Gemma sets a new bar for state-of-the-art performance for its size compared to popular models like Llama 2 and Mistral 7B.
5-shot, top-1
MMLU
The MMLU benchmark is a test that measures the breadth of knowledge and problem-solving ability acquired by large language models during pretraining.
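The shot counts listed with each benchmark (5-shot, 0-shot, 7-shot, and so on) refer to how many solved examples are placed in the prompt before the test question. As a hypothetical sketch of what k-shot prompting looks like (the exact formatting varies per benchmark; the helper and example data here are illustrative):

```python
def build_few_shot_prompt(examples, question, k=5):
    """Prepend k solved Q/A pairs so the model can infer the task
    format before answering the real question (0-shot means k=0)."""
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in examples[:k])
    return f"{shots}\n\nQ: {question}\nA:"

# Illustrative 2-shot prompt.
examples = [("2 + 2 = ?", "4"), ("Capital of France?", "Paris")]
prompt = build_few_shot_prompt(examples, "3 + 5 = ?", k=2)
print(prompt)
```

The model's completion after the final "A:" is then scored, e.g. top-1 means only its single highest-probability answer is checked.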
0-shot
HellaSwag
The HellaSwag benchmark challenges a language model's ability to understand and apply common sense reasoning by selecting the most logical ending to a story.
0-shot
PIQA
The PIQA benchmark tests a language model's ability to understand and apply physical commonsense knowledge by answering questions about everyday physical interactions.
0-shot
SIQA
The SIQA benchmark evaluates a language model's understanding of social interactions and social common sense by asking questions about people’s actions and their social implications.
0-shot
BoolQ
The BoolQ benchmark tests a language model's ability to answer naturally occurring yes/no questions (generated in unprompted, unconstrained settings), testing the model's ability to perform real-world natural language inference tasks.
partial scoring
Winogrande
The Winogrande benchmark tests a language model's ability to resolve ambiguous fill-in-the-blank tasks with binary options, requiring generalized commonsense reasoning.
7-shot
CQA
The CQA benchmark assesses the performance of language models on multiple-choice question-answering, requiring different types of commonsense knowledge.
OBQA
The OBQA benchmark evaluates a language model's ability to perform advanced question-answering with multi-step reasoning, commonsense knowledge, and rich text comprehension, modeled after open book exams.
ARC-e
The ARC-e benchmark tests a language model's advanced question-answering skills with genuine grade-school level, multiple-choice science questions.
ARC-c
The ARC-c benchmark is a more focused subset of the ARC-e dataset, containing only questions answered incorrectly by common (retrieval-based and word co-occurrence) algorithms.
5-shot
TriviaQA
The TriviaQA benchmark tests reading comprehension skills with question-answer-evidence triples.
pass@1
HumanEval
The HumanEval benchmark tests a language model's code generation abilities by evaluating whether its solutions pass functional unit tests for programming problems.
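Functional scoring of this kind runs the generated code against hidden unit tests rather than comparing text: a problem counts as solved (under pass@1, on the first sample) only if every test passes. A minimal sketch of that check, with an illustrative `add` problem standing in for a real benchmark task:

```python
def passes_unit_tests(candidate_src: str, test_src: str) -> bool:
    """Execute a candidate solution, then its unit tests, in a shared
    namespace; any exception (including a failed assert) means failure."""
    namespace: dict = {}
    try:
        exec(candidate_src, namespace)  # define the candidate function
        exec(test_src, namespace)       # run the asserts against it
        return True
    except Exception:
        return False

# Illustrative problem: implement `add`.
good = "def add(a, b):\n    return a + b"
bad = "def add(a, b):\n    return a - b"
tests = "assert add(2, 3) == 5\nassert add(-1, 1) == 0"

print(passes_unit_tests(good, tests))  # True
print(passes_unit_tests(bad, tests))   # False
```

Real harnesses additionally sandbox execution and enforce timeouts, since the model's code is untrusted.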
3-shot
MBPP
The MBPP benchmark tests a language model's ability to solve basic Python programming problems, focusing on fundamental programming concepts and standard library usage.
maj@1
GSM8K
The GSM8K benchmark tests a language model's ability to solve grade-school-level math problems that frequently require multiple steps of reasoning.
4-shot
MATH
The MATH benchmark evaluates a language model's ability to solve complex mathematical word problems, requiring reasoning, multi-step problem-solving, and the understanding of mathematical concepts.
AGIEval
The AGIEval benchmark tests a language model's general intelligence by using questions derived from real-world exams designed to assess human intellectual abilities (college entrance exams, law exams, etc.).
BBH
The BBH (BIG-Bench Hard) benchmark focuses on tasks deemed beyond the abilities of current language models, testing their limits across various reasoning and understanding domains.
[Charts: per-benchmark scores (0–100%) for Gemma 7B, Gemma 2B, Mistral 7B, LLaMA-2 13B, and LLaMA-2 7B]
*See the technical report for details on performance with other methodologies.
Access Gemma today
Gemma models are available in all your favorite model hubs.
Responsible AI development
Responsibility by Design
Pre-trained on carefully curated data and tuned for safety on top, helping to empower safe and responsible AI development with Gemma models.
Robust and Transparent Evaluation
Comprehensive evaluations and transparent reporting reveal model limitations, supporting a responsible approach for each use case.
Powering Responsible Development
The Responsible Generative AI Toolkit helps developers design and implement Responsible AI best practices.
Optimized for Google Cloud
With Gemma models on Google Cloud, you can deeply customize the model to your specific needs with Vertex AI's fully managed tools or GKE's self-managed option, and deploy it to flexible, cost-efficient, AI-optimized infrastructure.
Accelerating academic research with Google Cloud credits
Advance your research with PaliGemma models in Google Cloud. This new wave of multimodal open models extends our support for cutting-edge research. Apply now to receive Google Cloud credits to push the boundaries of your research and contribute to the advancement of the scientific community.
Join the community
Connect, explore, and share your knowledge with others in the ML model community.