Google models

Vertex AI features a growing list of foundation models that you can test, deploy, and customize for use in your AI-based applications. Foundation models are fine-tuned for specific use cases and offered at different price points. This page summarizes the models that are available in the various APIs and gives you guidance on which models to choose by use case.

To learn more about all AI models and APIs on Vertex AI, see Explore AI models and APIs.

Gemini models

The following table summarizes the models available in the Gemini API:

Model name Description Specifications
Gemini 1.5 Flash
(gemini-1.5-flash)
Multimodal model that is designed for high volume, cost-effective applications. Gemini 1.5 Flash delivers speed and efficiency to build fast, lower-cost applications that don't compromise on quality.
Max total tokens (input and output): 1M
Max output tokens: 8,192
Max raw image size: 20 MB
Max base64 encoded image size: 7 MB
Max images per prompt: 3,000
Max video length: 1 hour
Max videos per prompt: 10
Max audio length: approximately 8.4 hours
Max audio per prompt: 1
Max PDF size: 30 MB
Training data: Up to May 2024
Gemini 1.5 Pro
(gemini-1.5-pro)
Multimodal model that supports adding image, audio, video, and PDF files in text or chat prompts for a text or code response. Gemini 1.5 Pro supports long-context understanding with up to 1 million tokens.
Max total tokens (input and output): 1M
Max output tokens: 8,192
Max images per prompt: 3,000
Max video length (frames only): approximately one hour
Max video length (frame and audio): approximately 45 minutes
Max videos per prompt: 10
Max audio length: approximately 8.4 hours
Max audio per prompt: 1
Max PDF size: 30 MB
Training data: Up to May 2024
Gemini 1.0 Pro
(gemini-1.0-pro)
The best performing model with features for a wide range of text-only tasks.

Supports only text as input.
Supports supervised tuning.
Max total tokens (input and output): 32,760
Max output tokens: 8,192
Training data: Up to Feb 2023
Gemini 1.0 Pro Vision
(gemini-1.0-pro-vision)
The best performing image/video understanding model to handle a broad range of applications.

Supports text, image, and video as inputs.
Max total tokens (input and output): 16,384
Max output tokens: 2,048
Max images per prompt: 16
Max video length: 2 minutes
Max videos per prompt: 1
Training data: Up to Feb 2023
Gemini 1.0 Ultra
(GA with allow list)
Google's most capable text model, optimized for complex tasks, including instruction, code, and reasoning.

Supports only text as input.
Max tokens input: 8,192
Max tokens output: 2,048
Gemini 1.0 Ultra Vision
(GA with allow list)
Google's most capable multimodal vision model, optimized to support joint text, images, and video inputs.
Max tokens input: 8,192
Max tokens output: 2,048
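
The token limits in the table above can be encoded as a small lookup so that a request budget is checked before it is sent. The following is a minimal sketch, not part of any Vertex AI SDK: `GEMINI_LIMITS` and `fits_limits` are hypothetical names, "1M" is assumed to mean 1,000,000 tokens, and token counts are assumed to be computed elsewhere.

```python
# Limits copied from the table above. "1M" is assumed to mean 1,000,000.
GEMINI_LIMITS = {
    "gemini-1.5-flash": {"max_total": 1_000_000, "max_output": 8_192},
    "gemini-1.5-pro": {"max_total": 1_000_000, "max_output": 8_192},
    "gemini-1.0-pro": {"max_total": 32_760, "max_output": 8_192},
    "gemini-1.0-pro-vision": {"max_total": 16_384, "max_output": 2_048},
}


def fits_limits(model: str, input_tokens: int, output_tokens: int) -> bool:
    """Return True if the requested token budget fits the model's limits."""
    limits = GEMINI_LIMITS[model]
    # The output budget has its own cap, independent of the total.
    if output_tokens > limits["max_output"]:
        return False
    # Input and output together must stay within the total limit.
    return input_tokens + output_tokens <= limits["max_total"]
```

For example, `fits_limits("gemini-1.0-pro", 24_000, 8_192)` passes because the sum stays under 32,760, while the same output budget with 30,000 input tokens does not.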

Gemini language support

Gemini models support the following languages:
Arabic (ar), Bengali (bn), Bulgarian (bg), Chinese simplified and traditional (zh), Croatian (hr), Czech (cs), Danish (da), Dutch (nl), English (en), Estonian (et), Finnish (fi), French (fr), German (de), Greek (el), Hebrew (iw), Hindi (hi), Hungarian (hu), Indonesian (id), Italian (it), Japanese (ja), Korean (ko), Latvian (lv), Lithuanian (lt), Norwegian (no), Polish (pl), Portuguese (pt), Romanian (ro), Russian (ru), Serbian (sr), Slovak (sk), Slovenian (sl), Spanish (es), Swahili (sw), Swedish (sv), Thai (th), Turkish (tr), Ukrainian (uk), Vietnamese (vi).

Embeddings models

The following table summarizes the models available in the Embeddings API.

Model name Description Specifications
Embeddings for text
(textembedding-gecko@001, textembedding-gecko@002, textembedding-gecko@003, text-embedding-004)
Returns embeddings for English text inputs.

Supports supervised tuning of the textembedding-gecko models, English only.
Max token input: 3,072 (textembedding-gecko@001), 2,048 (others)
Embedding dimension: up to 768 (text-embedding-004), 768 (others)
Embeddings for text multilingual
(textembedding-gecko-multilingual@001,
text-multilingual-embedding-002)
Returns embeddings for text inputs in over 100 languages.

Supports supervised tuning of the text-multilingual-embedding-002 model.
Supports 100 languages.
Max token input: 2,048

Embedding dimension: up to 768 (text-multilingual-embedding-002), 768 (others)
Embeddings for multimodal
(multimodalembedding)
Returns embeddings for text, image, and video inputs, to compare content across different models.

Converts text, image, and video into the same vector space. Video only supports 1408 dimensions.
English only
Max token input: 32
Max image size: 20 MB
Max video length: 2 minutes

Embedding dimension: 128, 256, 512, or 1408 for text+image input, 1408 for video input.
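
The dimension rule for the multimodal embeddings model can be expressed as a small validation step. This is a minimal sketch under the constraints listed above; `ALLOWED_DIMENSIONS` and `check_dimension` are hypothetical names and do not correspond to any Vertex AI API.

```python
# Allowed embedding dimensions per input type, from the specifications above.
ALLOWED_DIMENSIONS = {
    "text": {128, 256, 512, 1408},
    "image": {128, 256, 512, 1408},
    "video": {1408},  # video only supports 1408 dimensions
}


def check_dimension(input_type: str, dimension: int) -> bool:
    """Return True if `dimension` is valid for the given input type."""
    if input_type not in ALLOWED_DIMENSIONS:
        raise ValueError(f"unknown input type: {input_type!r}")
    return dimension in ALLOWED_DIMENSIONS[input_type]
```

A call such as `check_dimension("video", 512)` returns `False`, which flags the request before it reaches the service.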

Embeddings language support

Text multilingual embedding models support the following languages:
Afrikaans, Albanian, Amharic, Arabic, Armenian, Azerbaijani, Basque, Belarusian, Bengali, Bulgarian, Burmese, Catalan, Cebuano, Chichewa, Chinese, Corsican, Czech, Danish, Dutch, English, Esperanto, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hmong, Hungarian, Icelandic, Igbo, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Kurdish, Kyrgyz, Lao, Latin, Latvian, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Nepali, Norwegian, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Samoan, Scottish Gaelic, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Sotho, Spanish, Sundanese, Swahili, Swedish, Tajik, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, West Frisian, Xhosa, Yiddish, Yoruba, Zulu.

Imagen model

The following table summarizes the models available in the Imagen API:

Model name Description Specifications
Imagen 2
(imagegeneration@006)
This model supports image generation and editing to create high quality images in seconds.

The editing feature supports object removal and insertion, outpainting, and product editing.
Max image output: four
Aspect ratio (for generation): 1:1, 9:16, 16:9, 3:4, 4:3

Resolution: ~1500 pixels (varies by aspect ratio)

Imagen language support

The Imagen model supports the following languages:
English, Chinese (simplified), Chinese (traditional), Hindi, Japanese, Korean, Portuguese, and Spanish.

Code completion model

The following table summarizes the models available in the Codey APIs:

Model name Description Specifications
Codey for Code Completion
(code-gecko)
A model fine-tuned to suggest code completion based on the context in code that's written.
Maximum input tokens: 2,048
Maximum output tokens: 64

MedLM models

The following table summarizes the models available in the MedLM API:

Model name Description Specifications
MedLM-medium
(medlm-medium)
A HIPAA-compliant suite of medically tuned models and APIs powered by Google Research.

This model helps healthcare practitioners with medical question and answer tasks, and summarization tasks for healthcare and medical documents. Provides better throughput and includes more recent data than the medlm-large model.
Max tokens (input + output): 32,768
Max output tokens: 8,192
MedLM-large
(medlm-large)
This model helps healthcare practitioners with medical question and answer tasks, and summarization tasks for healthcare and medical documents.
Max input tokens: 8,192
Max output tokens: 1,024

Model versions and lifecycle

Each Generative AI on Vertex AI language model is available in a stable version and an auto-updated version. See the following topics to learn about how model versioning works with Gemini models. To learn about Imagen on Vertex AI model versions and their lifecycle, see Imagen on Vertex AI model versions and lifecycle.

If you tune a Gemini model, then the tuned model shares the same discontinuation date as the base model that you used in the tuning process. For more information, see Overview of model tuning for Gemini.

Gemini stable version

A stable version of a Gemini model does not change and continues to be available until its discontinuation date. See the tables in Available Gemini stable model versions on this page to learn the discontinuation dates of Gemini models. If a stable version that you use reaches its discontinuation date, you need to switch to a newer available stable version. You can identify the version of a stable model by the three-digit number that's appended to the model name. For example, gemini-1.0-pro-001 is version number one of the stable release of the Gemini 1.0 Pro model.

Google releases stable versions on a regular cadence. You can switch from one stable version to another as long as the other version is still available. When you move from one stable version to a different stable version, you need to run your tuning jobs again because there might be prompt, output, and other differences between the versions.

To use the stable version of a Gemini model, append the three-digit version number to the model name with a hyphen (-). For example, to specify the stable gemini-1.0-pro model that's version two, append -002 to the model's name:

https://us-central1-aiplatform.googleapis.com/v1/projects/my_project/locations/us-central1/publishers/google/models/gemini-1.0-pro-002
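
The naming rule can be sketched as two small helpers that build a versioned model ID and a URL in the same form as the example above. The function names `stable_model_id` and `model_url` are hypothetical, and the URL layout is assumed only from that example.

```python
def stable_model_id(model: str, version: int) -> str:
    """Append a zero-padded three-digit version, for example 2 -> '-002'."""
    if not 1 <= version <= 999:
        raise ValueError("version must be between 1 and 999")
    return f"{model}-{version:03d}"


def model_url(project: str, location: str, model_id: str) -> str:
    """Build a publisher model URL in the form shown in the example above."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project}/locations/{location}/"
        f"publishers/google/models/{model_id}"
    )
```

For example, `model_url("my_project", "us-central1", stable_model_id("gemini-1.0-pro", 2))` produces a URL that ends in `gemini-1.0-pro-002`.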

Available Gemini stable model versions

The following stable model versions are available for generally available Gemini models:

Gemini 1.5 Flash model Release date Discontinuation date
gemini-1.5-flash-001 May 24, 2024 May 24, 2025
Gemini 1.5 Pro model Release date Discontinuation date
gemini-1.5-pro-001 May 24, 2024 May 24, 2025
Gemini 1.0 Pro Vision model Release date Discontinuation date
gemini-1.0-pro-vision-001 February 15, 2024 February 15, 2025
Gemini 1.0 Pro model Release date Discontinuation date
gemini-1.0-pro-001 February 15, 2024 February 15, 2025
gemini-1.0-pro-002 April 9, 2024 April 9, 2025
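
Because each stable version has a fixed discontinuation date, an application can warn its operators ahead of time. The following is a minimal sketch that hard-codes the dates from the tables above; `DISCONTINUATION` and `days_until_discontinued` are hypothetical names. Whether a version is still usable on the discontinuation date itself is not stated here, so treat a result of zero as already due for migration.

```python
from datetime import date

# Discontinuation dates copied from the tables above.
DISCONTINUATION = {
    "gemini-1.5-flash-001": date(2025, 5, 24),
    "gemini-1.5-pro-001": date(2025, 5, 24),
    "gemini-1.0-pro-vision-001": date(2025, 2, 15),
    "gemini-1.0-pro-001": date(2025, 2, 15),
    "gemini-1.0-pro-002": date(2025, 4, 9),
}


def days_until_discontinued(model_id: str, today: date) -> int:
    """Days remaining before the version is discontinued (negative if past)."""
    return (DISCONTINUATION[model_id] - today).days
```

A scheduled job could call `days_until_discontinued("gemini-1.0-pro-001", date.today())` and raise an alert when the result drops below a chosen threshold.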

Gemini auto-updated version

The auto-updated version of a Gemini model points to the most recent stable version. When a new stable version is released, the auto-updated version points to the new version. This means that if you specify the auto-updated version of a Gemini model in your code, it could behave differently without notice when the next stable version is released. Because of this, use an auto-updated version with caution if you tune your model.

To use the auto-updated version of a model, don't append anything to the model name. For example, the following uses the auto-updated version of the gemini-1.0-pro-vision model:

https://us-central1-aiplatform.googleapis.com/v1/projects/my_project/locations/us-central1/publishers/google/models/gemini-1.0-pro-vision

Gemini auto-updated models

The following table shows the available auto-updated Gemini model versions and the stable version each references.

Model name Auto-updated name Referenced stable version
Gemini 1.0 Pro Vision gemini-1.0-pro-vision gemini-1.0-pro-vision-001
Gemini 1.0 Pro gemini-1.0-pro gemini-1.0-pro-002
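
One way to avoid unannounced behavior changes is to pin an auto-updated name to the stable version it currently references. The sketch below hard-codes a snapshot of the table above, so it goes stale as soon as a new stable version is released; `AUTO_UPDATED` and `pin_to_stable` are hypothetical names.

```python
# Snapshot of the table above. The mapping changes when a new stable
# version is released, so this dictionary must be maintained by hand.
AUTO_UPDATED = {
    "gemini-1.0-pro-vision": "gemini-1.0-pro-vision-001",
    "gemini-1.0-pro": "gemini-1.0-pro-002",
}


def pin_to_stable(model_name: str) -> str:
    """Return the stable version that an auto-updated name references.

    Names that are not auto-updated aliases are returned unchanged.
    """
    return AUTO_UPDATED.get(model_name, model_name)
```

Passing an already pinned name such as `gemini-1.0-pro-001` returns it unchanged, so the helper is safe to apply to any configured model name.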

Gemini preview version

The preview version of a Gemini model is a model that's in preview and not generally available (GA). A preview version of a model contains functionality that's not in the most recent stable or auto-updated version of a model. Because a preview model version isn't stable, it's not recommended for use in production.

Each preview model includes its release date as part of the name of the model that you use in your code. The name pattern used by a preview model is model_name-preview-MMDD. For example, gemini-1.5-pro-preview-0409 is the first preview version of the Gemini 1.5 Pro model and it was released on April 9. When a new preview version of a model is released, the previous version is updated to point to the new preview version and is available until its discontinuation date.
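
The model_name-preview-MMDD pattern can be parsed with a regular expression, for example to detect preview models in a configuration file. This is a minimal sketch; `PreviewName` and `parse_preview_name` are hypothetical names, and the pattern carries no year, so none is inferred.

```python
import re
from typing import NamedTuple, Optional

# Matches names of the form model_name-preview-MMDD.
PREVIEW_PATTERN = re.compile(
    r"^(?P<model>.+)-preview-(?P<month>\d{2})(?P<day>\d{2})$"
)


class PreviewName(NamedTuple):
    model: str
    month: int
    day: int


def parse_preview_name(name: str) -> Optional[PreviewName]:
    """Split a preview model name into base model, month, and day.

    Returns None if the name does not follow the preview pattern.
    """
    match = PREVIEW_PATTERN.match(name)
    if match is None:
        return None
    month, day = int(match["month"]), int(match["day"])
    # Reject digit pairs that cannot be a calendar month or day.
    if not (1 <= month <= 12 and 1 <= day <= 31):
        return None
    return PreviewName(match["model"], month, day)
```

Parsing `gemini-1.5-pro-preview-0409` yields the base model `gemini-1.5-pro` with month 4 and day 9, while a stable name such as `gemini-1.0-pro-002` yields `None`.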

Gemini preview models

The following table shows the available preview Gemini model versions and their discontinuation dates.

Model name Preview name Discontinuation date
Gemini 1.5 Flash (Preview) gemini-1.5-flash-preview-0514 June 24, 2024
Gemini 1.5 Pro (Preview) gemini-1.5-pro-preview-0514 June 24, 2024
Gemini 1.5 Pro (Preview) gemini-1.5-pro-preview-0409 (points to and uses gemini-1.5-pro-preview-0514) June 14, 2024

Code completion stable model versions

The following stable versions are available for the generally available code completion model:

code-gecko model Release date Discontinuation date
code-gecko@002 December 6, 2023 October 9, 2024
code-gecko@001 June 29, 2023 July 6, 2024

Embeddings stable model versions

The following model versions are available for embeddings models, including two preview versions:

textembedding-gecko model Release date Discontinuation date
text-embedding-004 May 14, 2024 May 14, 2025
text-embedding-preview-0409 April 9, 2024 June 27, 2024
text-multilingual-embedding-002 May 14, 2024 May 14, 2025
text-multilingual-embedding-preview-0409 April 9, 2024 June 27, 2024
textembedding-gecko@003 December 12, 2023 December 12, 2024
textembedding-gecko-multilingual@001 November 2, 2023 December 12, 2024
textembedding-gecko@002 (regressed, but still supported) November 2, 2023 October 9, 2024
textembedding-gecko@001 June 7, 2023 October 9, 2024
multimodalembedding@001 February 12, 2024 February 12, 2025

MedLM language support

The MedLM model supports the English language.

Explore all models in Model Garden

Model Garden is a platform that helps you discover, test, customize, and deploy Google proprietary and select OSS models and assets. To explore the generative AI models and APIs that are available on Vertex AI, go to Model Garden in the Google Cloud console.


To learn more about Model Garden, including available models and capabilities, see Explore AI models in Model Garden.

What's next