---
title: "Gemini provider"
description: "Setting up and using Google Gemini (3.1 Pro, 3 Flash, 2.5 family) in Cerevisor."
slug: guides/providers/gemini
section: guides
subsection: providers
canonical_url: https://cerevisor.com/docs/guides/providers/gemini
last_verified: 2026-05-18
last_verified_version: "1.2.0"
updated_at: 2026-05-18T15:08:18.053416+00:00
---

## Setup

You need an API key from [aistudio.google.com/apikey](https://aistudio.google.com/apikey).

1. **Settings → Providers → + Add provider → Gemini.**
2. Paste your API key into the **API key** field.
3. Click **Test connection.** Cerevisor sends a tiny `OK` round-trip via the `@google/genai` SDK to confirm auth and model accessibility (default smoke-test timeout: 30 seconds).
4. Click **Save.**

The key is stored in your OS keychain. It never lives in plain text on disk or in any `.cerevisor` file.

> **Dependency note:** Gemini requires the optional `@google/genai` package. Cerevisor ships it bundled in v1.2.0; if a future build doesn't, the setup modal shows an install hint instead of the API-key field.
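
The timeout behaviour of the connection test can be pictured with a small sketch. This is not Cerevisor's implementation: `withTimeout`, `smokeTest`, and the `send` callback are hypothetical names, and `send` stands in for whatever round-trip the SDK performs. Only the 30-second default comes from this page.

```typescript
// Hypothetical helper: rejects if `task` has not settled within `timeoutMs`.
function withTimeout<T>(task: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error("Smoke test timed out")),
      timeoutMs,
    );
    task.then(
      (value) => {
        clearTimeout(timer); // do not keep the process alive after success
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// `send` is an assumed stand-in for the real SDK call that returns the reply text.
async function smokeTest(
  send: () => Promise<string>,
  timeoutMs: number = 30_000, // default smoke-test timeout from this guide
): Promise<boolean> {
  const reply = await withTimeout(send(), timeoutMs);
  return reply.trim().length > 0;
}
```

A caller would treat a rejected promise as a failed test and surface the error message, which is roughly how a "Smoke test timed out" row could arise.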

## Models

Cerevisor reads the live model list from the SDK's `models.list()` and surfaces it everywhere a model picker appears. The result is cached for up to 60 minutes within a provider's lifetime. If the list call fails (network blip, auth issue), Cerevisor falls back to a known set so the picker is never empty.
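
As a rough illustration of the cache-then-fallback behaviour, here is a minimal sketch. The names `createModelCatalog`, `ListModels`, and `FALLBACK_MODELS` are hypothetical, the fallback ids are simply the ones named on this page, and the injected `listModels` function stands in for the SDK call; the real logic may differ.

```typescript
// Assumed shape: a function that resolves to the live model ids.
type ListModels = () => Promise<string[]>;

// Hypothetical fallback set, taken from the models named in this guide.
const FALLBACK_MODELS: string[] = [
  "gemini-3.1-pro-preview",
  "gemini-3.1-flash-lite",
  "gemini-3-flash-preview",
  "gemini-2.5-pro",
  "gemini-2.5-flash",
  "gemini-2.5-flash-lite",
];

const CACHE_TTL_MS = 60 * 60 * 1000; // 60 minutes

function createModelCatalog(
  listModels: ListModels,
  now: () => number = Date.now, // injectable clock, handy for tests
) {
  let cached: { ids: string[]; fetchedAt: number } | undefined;

  return async function getModels(): Promise<string[]> {
    // Serve from cache while the entry is younger than the TTL.
    if (cached && now() - cached.fetchedAt < CACHE_TTL_MS) return cached.ids;
    try {
      const ids = await listModels();
      cached = { ids, fetchedAt: now() };
      return ids;
    } catch {
      // Network or auth failure: return the known set so the picker is never empty.
      return FALLBACK_MODELS;
    }
  };
}
```

In this sketch a failed call is not cached, so the next picker open retries the live list.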

The most-used Gemini models you'll see:

| Model | Best for | Notes |
|---|---|---|
| **gemini-3.1-pro-preview** | Highest quality, deepest reasoning. | Above 200K input tokens, input rates double and output rates rise roughly 50%. Use for senior agents and large-context synthesis. |
| **gemini-3.1-flash-lite** | Fast and cheap. | Strong workhorse for reviewers, classifiers, and high-volume transforms. |
| **gemini-3-flash-preview** | Balanced quality at moderate cost. | The default if no model is set. |
| **gemini-2.5-pro** | Still current. Cheaper than 3.1-pro for similar quality on many tasks. | Also has the >200K tier pricing bump. |
| **gemini-2.5-flash** / **2.5-flash-lite** | Lower-cost alternatives if you don't need 3.x reasoning. | |

If you type a custom model id (e.g. a dated variant like `gemini-3.1-pro-preview-2026-04`), Cerevisor uses prefix matching to find the correct pricing row. Unknown ids bill at $0 and emit a one-shot console warning.
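
The lookup described above might look something like the following sketch. It assumes longest-prefix matching (so that `gemini-2.5-flash-lite-…` does not resolve to the `gemini-2.5-flash` row) and one warning per unknown id; both are assumptions, as are the names `findPricingRow` and `PricingRow`. The rates are placeholders, not Google's prices.

```typescript
type PricingRow = { inputPerMTok: number; outputPerMTok: number };

// Placeholder rates for illustration only; not the contents of gemini-pricing.ts.
const PRICING: Record<string, PricingRow> = {
  "gemini-2.5-flash": { inputPerMTok: 1, outputPerMTok: 2 },
  "gemini-2.5-flash-lite": { inputPerMTok: 0.5, outputPerMTok: 1 },
  "gemini-3.1-pro-preview": { inputPerMTok: 4, outputPerMTok: 8 },
};

const ZERO_ROW: PricingRow = { inputPerMTok: 0, outputPerMTok: 0 };
const warned = new Set<string>();

function findPricingRow(modelId: string): PricingRow {
  // Pick the longest table key that is a prefix of the requested id.
  let bestKey: string | undefined;
  for (const key of Object.keys(PRICING)) {
    if (
      modelId.startsWith(key) &&
      (bestKey === undefined || key.length > bestKey.length)
    ) {
      bestKey = key;
    }
  }
  if (bestKey !== undefined) return PRICING[bestKey];

  // Unknown id: bill at $0 and warn only the first time this id is seen.
  if (!warned.has(modelId)) {
    warned.add(modelId);
    console.warn(`Unknown model "${modelId}": billing at $0`);
  }
  return ZERO_ROW;
}
```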

## Cost

Cerevisor reads token usage from each response's `usageMetadata` and multiplies by the per-model pricing table in `gemini-pricing.ts` (rates verified against `ai.google.dev/gemini-api/docs/pricing` on 2026-05-14). The status bar shows running cost during a run; the audit log saves the breakdown.

Two Gemini-specific quirks Cerevisor handles automatically:

- **Context-tier pricing.** On `gemini-3.1-pro-preview` and `gemini-2.5-pro`, input rates double and output rates rise roughly 50% when the prompt crosses 200,000 tokens. Cerevisor picks the tier per request from the actual `promptTokenCount`, not a static guess.
- **Thinking tokens bill as output.** Per Google's pricing page, "thinking tokens are included in output pricing." Cerevisor folds `thoughtsTokenCount` into the output total it reports so cost numbers match Google's billing console.
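
To make the two quirks concrete, here is a minimal sketch of the arithmetic. The field names follow Gemini's `usageMetadata`; everything else (`estimateCostUsd`, `TieredRates`, and the rates used in the usage note) is hypothetical and only illustrates the shape of the calculation.

```typescript
// Subset of Gemini's usageMetadata that matters for this calculation.
type Usage = {
  promptTokenCount: number;
  candidatesTokenCount: number;
  thoughtsTokenCount?: number;
};

type Rates = { inputPerMTok: number; outputPerMTok: number };

// Hypothetical pricing shape: one row per context tier, in USD per million tokens.
type TieredRates = { standard: Rates; longContext: Rates };

const LONG_CONTEXT_THRESHOLD = 200_000;

function estimateCostUsd(usage: Usage, rates: TieredRates): number {
  // The tier is chosen per request from the actual prompt size.
  const tier =
    usage.promptTokenCount > LONG_CONTEXT_THRESHOLD
      ? rates.longContext
      : rates.standard;

  // Thinking tokens are folded into the output total.
  const outputTokens =
    usage.candidatesTokenCount + (usage.thoughtsTokenCount ?? 0);

  return (
    (usage.promptTokenCount * tier.inputPerMTok +
      outputTokens * tier.outputPerMTok) /
    1_000_000
  );
}
```

With placeholder rates of 1 and 10 USD per million tokens (standard) and 2 and 15 (long context), a 100,000-token prompt with 1,000 answer tokens and 1,000 thinking tokens comes to (100,000 × 1 + 2,000 × 10) / 1,000,000 = 0.12 USD, while a 300,000-token prompt with 1,000 answer tokens comes to (300,000 × 2 + 1,000 × 15) / 1,000,000 = 0.615 USD.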

Audio-input pricing is tracked but not yet billed correctly because `ContentBlock` doesn't carry audio parts in v1.2.0. Until that lands, audio inputs bill at the text rate, so the reported cost can under-report but never over-report.

## Default model

In the Library entry, you can set a **default model** for this provider. When an agent's model preference is **(auto)**, it falls back to this default. If no default is set, Cerevisor uses `gemini-3-flash-preview`.

## Thinking modes

Gemini's reasoning models accept a `thinkingBudget` integer that caps how many internal tokens the model can spend before answering. Cerevisor maps its neutral `thinkingMode` setting (in **Agent Config → Reasoning**) to Google's budgets:

| Cerevisor mode | thinkingBudget |
|---|---|
| **off** | omitted (model decides) |
| **low** | 1024 |
| **medium** | 4096 |
| **high** | 16384 |

Higher budgets cost more, because thinking tokens bill as output, but can improve answer quality on multi-step problems. The mode is ignored on non-thinking models.
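
The mapping in the table can be expressed as a small lookup. This is a sketch under assumptions: `buildThinkingConfig` and the `modelSupportsThinking` flag are hypothetical names, and only the budget values come from this page.

```typescript
type ThinkingMode = "off" | "low" | "medium" | "high";

// Budget values from the table above.
const THINKING_BUDGETS: Record<Exclude<ThinkingMode, "off">, number> = {
  low: 1024,
  medium: 4096,
  high: 16384,
};

// Returns the config fragment to send, or undefined when nothing should be sent.
function buildThinkingConfig(
  mode: ThinkingMode,
  modelSupportsThinking: boolean,
): { thinkingBudget: number } | undefined {
  // Non-thinking models ignore the mode entirely.
  if (!modelSupportsThinking) return undefined;
  // "off" omits the budget, leaving the decision to the model.
  if (mode === "off") return undefined;
  return { thinkingBudget: THINKING_BUDGETS[mode] };
}
```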

## Tools and tool choice

Function calling is fully supported. Cerevisor maps:

| Cerevisor toolChoice | Gemini behavior |
|---|---|
| **auto** (default) | Model chooses when to call a tool. |
| **any** | Model must call at least one of the available tools. |
| **none** | Tool definitions are sent but the model is forbidden from calling them. |

Tool ids are paired across the request/response loop. Older Gemini models that don't emit an id get a synthesized one so the loop still works.
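
A minimal sketch of both behaviours follows. It assumes the Gemini function-calling modes `AUTO`, `ANY`, and `NONE` as the targets of the mapping; `toFunctionCallingMode`, `ensureCallId`, and the `synthetic-call-` id format are hypothetical and not taken from Cerevisor's code.

```typescript
type ToolChoice = "auto" | "any" | "none";
type FunctionCallingMode = "AUTO" | "ANY" | "NONE";

// Assumed one-to-one mapping onto Gemini's function-calling modes.
const MODE_BY_CHOICE: Record<ToolChoice, FunctionCallingMode> = {
  auto: "AUTO",
  any: "ANY",
  none: "NONE",
};

function toFunctionCallingMode(choice: ToolChoice = "auto"): FunctionCallingMode {
  return MODE_BY_CHOICE[choice];
}

// Simplified shape of a function call returned by the model.
type FunctionCall = { id?: string; name: string; args: Record<string, unknown> };

let syntheticCounter = 0;

// Guarantees an id so the tool result can be paired with its call.
function ensureCallId(call: FunctionCall): FunctionCall & { id: string } {
  const id = call.id ?? `synthetic-call-${++syntheticCounter}`;
  return { ...call, id };
}
```

The id returned by `ensureCallId` would then be echoed back on the matching tool result, which keeps the request/response pairing intact even when the model omitted one.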

## Prompt caching

When the model supports it, Cerevisor reads Gemini's `cachedContentTokenCount` from the response and bills those cached input tokens at the cache-read rate (typically 10% of the input rate) instead of the standard input rate. This is fully automatic; you don't need to configure anything.
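
The effect on the input side of the bill can be sketched as below. The sketch assumes that `promptTokenCount` already includes the cached tokens, so the uncached remainder is the difference between the two counts; `inputCostUsd` and its parameters are hypothetical names, and the rates in the usage note are placeholders.

```typescript
// Rates are in USD per million tokens.
function inputCostUsd(
  promptTokenCount: number,
  cachedContentTokenCount: number,
  inputPerMTok: number,
  cacheReadPerMTok: number,
): number {
  // Guard against a cached count larger than the prompt itself.
  const cached = Math.min(Math.max(cachedContentTokenCount, 0), promptTokenCount);
  const uncached = promptTokenCount - cached;
  return (uncached * inputPerMTok + cached * cacheReadPerMTok) / 1_000_000;
}
```

For example, with a placeholder input rate of 1 USD per million tokens and a cache-read rate of 0.1, a 10,000-token prompt of which 4,000 tokens were cached costs (6,000 × 1 + 4,000 × 0.1) / 1,000,000 = 0.0064 USD instead of 0.01 USD.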

## Common errors

| Error | What it means | Fix |
|---|---|---|
| **`@google/genai is not installed`** | The optional SDK is missing from this build. | Reinstall Cerevisor, or run `npm install @google/genai` in your local dev build. |
| **401 / API key invalid** | The key isn't accepted by Google. | Generate a new one at `aistudio.google.com/apikey` and re-test. |
| **Smoke test timed out** | The test request didn't return within 30 seconds. | Check your network. The Gemini API can be slow during regional incidents. |
| **Costs show $0 for a known model** | The model id isn't in `gemini-pricing.ts`. | Check the console for the "Unknown model" warning. New model ids land in patch releases. |

## Where to go next

- [Per-agent provider overrides](./per-agent-overrides.md): mix Gemini with Anthropic or Codex CLI in one workflow.
- [Provider overview](./overview.md): how Cerevisor picks which provider runs which agent.
