Priced to help you bring your app to the world
Available now
Our fastest multimodal model, with great performance for diverse, repetitive tasks and a 1 million token context window. Now generally available for production use.
Free of charge*
Rate limits**: 15 RPM (requests per minute), 1 million TPM (tokens per minute), 1,500 RPD (requests per day)
Price (input): Free of charge
Context caching (coming soon): Not applicable
Price (output): Free of charge
Prompts/responses used to improve our products: Yes

Pay-as-you-go (prices in USD)***
Rate limits**: 1,000 RPM (requests per minute), 2 million TPM (tokens per minute)
Price (input): $0.35 / 1 million tokens (for prompts up to 128K tokens); $0.70 / 1 million tokens (for prompts longer than 128K)
Context caching (coming soon): Not applicable
Price (output): $1.05 / 1 million tokens (for prompts up to 128K tokens); $2.10 / 1 million tokens (for prompts longer than 128K)
Prompts/responses used to improve our products: No
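
The two-tier pay-as-you-go prices above can be turned into a rough per-request estimate. The sketch below is illustrative only: the function and constant names are hypothetical, "128K" is assumed to mean 128,000 tokens, and the rate tier is assumed to be chosen by prompt length and applied to every token in the request. Actual billing may differ.

```python
from decimal import Decimal

# USD per 1 million tokens, copied from the pay-as-you-go tier above.
INPUT_PRICE = {"short": Decimal("0.35"), "long": Decimal("0.70")}
OUTPUT_PRICE = {"short": Decimal("1.05"), "long": Decimal("2.10")}

# Assumption: "128K" is read as 128,000 tokens.
TIER_THRESHOLD_TOKENS = 128_000
MILLION = Decimal(1_000_000)


def estimate_cost(prompt_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper)."""
    # Assumption: the prompt length alone selects the tier for input and output.
    tier = "short" if prompt_tokens <= TIER_THRESHOLD_TOKENS else "long"
    input_cost = Decimal(prompt_tokens) * INPUT_PRICE[tier] / MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE[tier] / MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    print(estimate_cost(100_000, 2_000))  # prompt within the 128K tier
    print(estimate_cost(200_000, 2_000))  # prompt in the longer-prompt tier
```

Using `Decimal` rather than floats keeps the small per-token amounts exact when many requests are summed.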
Our next-generation model with a breakthrough 1 million token context window. Now generally available for production use.
Free of charge*
Rate limits**: 2 RPM (requests per minute), 32,000 TPM (tokens per minute), 50 RPD (requests per day)
Price (input): Free of charge
Context caching (coming soon): Not applicable
Price (output): Free of charge
Prompts/responses used to improve our products: Yes

Pay-as-you-go (prices in USD)***
Rate limits**: 360 RPM (requests per minute), 2 million TPM (tokens per minute), 10,000 RPD (requests per day)
Price (input): $3.50 / 1 million tokens (for prompts up to 128K tokens); $7.00 / 1 million tokens (for prompts longer than 128K)
Context caching (coming soon): $1.75 / 1 million tokens (for prompts up to 128K tokens); $3.50 / 1 million tokens (for prompts longer than 128K); $4.50 / 1 million tokens per hour (storage)
Price (output): $10.50 / 1 million tokens (for prompts up to 128K tokens); $21.00 / 1 million tokens (for prompts longer than 128K)
Prompts/responses used to improve our products: No
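
To see when the context caching prices above might pay off, the following sketch compares input cost with and without caching. It rests on assumptions the page does not state: the whole prompt is cached, every request bills the cached tokens at the caching rate, storage is billed per token per hour, and "128K" means 128,000 tokens. All names are hypothetical.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)
TIER_THRESHOLD_TOKENS = 128_000  # assumption: "128K" is 128,000 tokens

# USD per 1 million tokens, copied from the pay-as-you-go tier above.
INPUT_PRICE = {"short": Decimal("3.50"), "long": Decimal("7.00")}
CACHED_PRICE = {"short": Decimal("1.75"), "long": Decimal("3.50")}
STORAGE_PRICE_PER_HOUR = Decimal("4.50")  # per 1 million tokens per hour


def _tier(prompt_tokens: int) -> str:
    return "short" if prompt_tokens <= TIER_THRESHOLD_TOKENS else "long"


def input_cost_without_cache(prompt_tokens: int, requests: int) -> Decimal:
    """Input cost when the full prompt is sent uncached on every request."""
    rate = INPUT_PRICE[_tier(prompt_tokens)]
    return Decimal(prompt_tokens) * rate / MILLION * requests


def input_cost_with_cache(prompt_tokens: int, requests: int, hours: float) -> Decimal:
    """Input cost assuming the whole prompt is cached for `hours` hours."""
    rate = CACHED_PRICE[_tier(prompt_tokens)]
    per_request = Decimal(prompt_tokens) * rate / MILLION * requests
    storage = Decimal(prompt_tokens) * STORAGE_PRICE_PER_HOUR / MILLION * Decimal(str(hours))
    return per_request + storage


if __name__ == "__main__":
    # 100,000-token prompt reused by 10 requests within one hour.
    print(input_cost_without_cache(100_000, 10))
    print(input_cost_with_cache(100_000, 10, 1))
```

Under these assumptions, caching only pays off once the per-request savings outweigh the hourly storage charge, so a prompt reused once or twice may cost more cached than uncached.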
Our first-generation model offering only text and image reasoning. Generally available for production use.
Free of charge*
Rate limits**: 15 RPM (requests per minute), 32,000 TPM (tokens per minute), 1,500 RPD (requests per day)
Price (input): Free of charge
Context caching (coming soon): Not applicable
Price (output): Free of charge
Prompts/responses used to improve our products: Yes

Pay-as-you-go (prices in USD)***
Rate limits**: 360 RPM (requests per minute), 120,000 TPM (tokens per minute), 30,000 RPD (requests per day)
Price (input): $0.50 / 1 million tokens
Context caching (coming soon): Not available
Price (output): $1.50 / 1 million tokens
Prompts/responses used to improve our products: No
*Gemini API free tier usage restrictions apply in the EEA (including the EU), the UK, and Switzerland (CH). See Billing FAQs for details.
**Specified rate limits are not guaranteed and actual capacity may vary. Apply for an increased maximum rate limit (for paid tier only).
***Tuned model inference costs are billed at the same price as the base models. To get help with billing, see Get Cloud Billing support.
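
Since the requests-per-minute limits listed above are caps rather than guarantees, a client may want to pace its own calls. The sketch below is one possible client-side sliding-window throttle, not part of any Gemini SDK; the class name and its methods are hypothetical, and it tracks only RPM, not TPM or RPD.

```python
import time
from collections import deque


class RequestThrottle:
    """Sliding-window limiter: at most max_requests per window_seconds."""

    def __init__(self, max_requests, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock    # injectable so the class can be tested without sleeping
        self._sent = deque()   # timestamps of requests still inside the window

    def wait_time(self):
        """Seconds to wait before the next request is allowed (0.0 if allowed now)."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def record(self):
        """Call once per request actually sent."""
        self._sent.append(self._clock())


if __name__ == "__main__":
    throttle = RequestThrottle(max_requests=2)  # e.g. a 2 RPM free-tier limit
    for i in range(3):
        delay = throttle.wait_time()
        if delay > 0:
            time.sleep(delay)
        throttle.record()
        print(f"request {i} sent")
```

A throttle like this reduces but does not eliminate rate-limit errors, so callers would still need to handle rejected requests.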
Build with Vertex AI on Google Cloud