Gemini provider

Setting up and using Google Gemini (3.1 Pro, 3 Flash, 2.5 family) in Cerevisor.

Setup

You need an API key from aistudio.google.com/apikey.

1. Settings → Providers → + Add provider → Gemini.
2. Paste your API key into the API key field.
3. Click Test connection. Cerevisor sends a tiny OK round-trip via the @google/genai SDK to confirm auth and model accessibility (default smoke-test timeout: 30 seconds).
4. Click Save.

The key is stored in your OS keychain. It never lives in plain text on disk or in any .cerevisor file.

Dependency note: Gemini requires the optional @google/genai package. Cerevisor ships it bundled in v1.2.0; if a future build doesn't, the setup modal shows an install hint instead of the API-key field.

Models

Cerevisor reads the live model list from the SDK's models.list() and surfaces it everywhere a model picker appears. The list is cached for 60 minutes within a provider's lifetime. If the list call fails (network blip, auth issue), Cerevisor falls back to a known set so the picker is never empty.
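The cache-then-fallback behavior can be sketched roughly as follows. This is a minimal illustration, not Cerevisor's actual code: ModelListCache, FALLBACK_MODELS, and CACHE_TTL_MS are hypothetical names, and the fallback set is assumed to be the models from the table on this page. The network call itself is left out; the caller passes in whatever models.list() returned, or undefined if it failed.

```typescript
// Hypothetical sketch: names and the fallback set are assumptions.
const FALLBACK_MODELS: string[] = [
  "gemini-3.1-pro-preview",
  "gemini-3.1-flash-lite",
  "gemini-3-flash-preview",
  "gemini-2.5-pro",
  "gemini-2.5-flash",
  "gemini-2.5-flash-lite",
];

const CACHE_TTL_MS = 60 * 60 * 1000; // 60 minutes

class ModelListCache {
  private models: string[] | undefined = undefined;
  private fetchedAt = 0;

  // Returns the cached list while it is younger than the TTL, else undefined.
  getFresh(now: number): string[] | undefined {
    if (this.models !== undefined && now - this.fetchedAt < CACHE_TTL_MS) {
      return this.models;
    }
    return undefined;
  }

  // Takes the outcome of a list attempt (undefined means it failed) and
  // returns what the picker should show. Only successful results are cached.
  settle(live: string[] | undefined, now: number): string[] {
    if (live !== undefined && live.length > 0) {
      this.models = live;
      this.fetchedAt = now;
      return live;
    }
    return FALLBACK_MODELS;
  }
}
```

A caller would check getFresh(Date.now()) first and only attempt a live list call when it returns undefined.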

The most-used Gemini models you'll see:

Model	Best for	Notes
gemini-3.1-pro-preview	Highest quality, deepest reasoning.	Per-million pricing doubles above 200K input tokens. Use for senior agents, large-context synthesis.
gemini-3.1-flash-lite	Fast and cheap.	Strong workhorse for reviewers, classifiers, and high-volume transforms.
gemini-3-flash-preview	Balanced quality at moderate cost.	The default if no model is set.
gemini-2.5-pro	Still current. Cheaper than 3.1-pro for similar quality on many tasks.	Also has the >200K tier pricing bump.
gemini-2.5-flash / 2.5-flash-lite	Lower-cost alternatives if you don't need 3.x reasoning.

If you type a custom model id (e.g. a dated variant like gemini-3.1-pro-preview-2026-04), Cerevisor uses prefix matching to find the correct pricing row. Unknown ids bill at $0 and emit a one-shot console warning.
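As a rough sketch of how prefix matching against a pricing table might work: the code below assumes the longest matching key wins and that the warning fires once per unknown id. Both are guesses at the behavior, and PRICING, findPricingRow, and the rates shown are illustrative placeholders rather than the contents of gemini-pricing.ts.

```typescript
interface PricingRow {
  inputPerMTok: number;  // USD per million input tokens
  outputPerMTok: number; // USD per million output tokens
}

// Placeholder rates for illustration only; not Google's published prices.
const PRICING: Record<string, PricingRow> = {
  "gemini-3.1-pro-preview": { inputPerMTok: 2.0, outputPerMTok: 12.0 },
  "gemini-3.1-flash-lite": { inputPerMTok: 0.1, outputPerMTok: 0.4 },
  "gemini-3-flash-preview": { inputPerMTok: 0.5, outputPerMTok: 3.0 },
};

// Remembers which unknown ids have already produced a warning.
const warned = new Set<string>();

function findPricingRow(modelId: string): PricingRow | undefined {
  // Assumption: prefer the longest key that the id starts with, so a more
  // specific row beats a shorter one that also happens to match.
  let best: string | undefined = undefined;
  for (const key of Object.keys(PRICING)) {
    if (modelId.startsWith(key) && (best === undefined || key.length > best.length)) {
      best = key;
    }
  }
  if (best === undefined) {
    if (!warned.has(modelId)) {
      warned.add(modelId);
      console.warn(`Unknown model "${modelId}": cost will be reported as $0`);
    }
    return undefined;
  }
  return PRICING[best];
}
```

With this table, the dated id gemini-3.1-pro-preview-2026-04 resolves to the gemini-3.1-pro-preview row, while an id that matches no key returns undefined and warns only on its first lookup.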

Cost

Cerevisor reads token usage from each response's usageMetadata and multiplies by the per-model pricing table in gemini-pricing.ts (rates verified against ai.google.dev/gemini-api/docs/pricing on 2026-05-14). The status bar shows running cost during a run; the audit log saves the breakdown.

Two Gemini-specific quirks Cerevisor handles automatically:

Context-tier pricing. On gemini-3.1-pro-preview and gemini-2.5-pro, input rates double and output rates rise by roughly 50% when the prompt exceeds 200,000 tokens. Cerevisor picks the right tier per request based on the actual promptTokenCount, not a static guess.
Thinking tokens bill as output. Per Google's pricing page, "thinking tokens are included in output pricing." Cerevisor folds thoughtsTokenCount into the output total it reports so cost numbers match Google's billing console.
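Both quirks can be illustrated with a small cost function. This is a sketch under stated assumptions, not the code in gemini-pricing.ts: TieredRates and estimateCostUsd are hypothetical names, the rates in the usage note are placeholders, and it assumes candidatesTokenCount does not already include thoughtsTokenCount.

```typescript
// Subset of the usageMetadata fields this sketch needs.
interface UsageMetadata {
  promptTokenCount: number;
  candidatesTokenCount: number;
  thoughtsTokenCount?: number;
}

// Hypothetical rate shape, in USD per million tokens.
interface TieredRates {
  inputPerMTok: number;
  outputPerMTok: number;
  // Present only for models that have a long-context tier.
  longInputPerMTok?: number;
  longOutputPerMTok?: number;
}

const LONG_CONTEXT_THRESHOLD = 200_000;

function estimateCostUsd(usage: UsageMetadata, rates: TieredRates): number {
  // The tier is chosen per request from the actual prompt size.
  const longContext = usage.promptTokenCount > LONG_CONTEXT_THRESHOLD;
  const inputRate =
    longContext && rates.longInputPerMTok !== undefined
      ? rates.longInputPerMTok
      : rates.inputPerMTok;
  const outputRate =
    longContext && rates.longOutputPerMTok !== undefined
      ? rates.longOutputPerMTok
      : rates.outputPerMTok;

  // Thinking tokens are folded into the output total.
  const outputTokens = usage.candidatesTokenCount + (usage.thoughtsTokenCount ?? 0);

  return (usage.promptTokenCount * inputRate + outputTokens * outputRate) / 1_000_000;
}
```

For example, with placeholder rates of 2 and 12 (standard) and 4 and 18 (long context), a 100,000-token prompt with 1,000 answer tokens and 1,000 thinking tokens comes to about $0.224, while a 300,000-token prompt with 1,000 answer tokens comes to about $1.218.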

Audio-input pricing is tracked in the pricing table but not yet applied, because ContentBlock doesn't carry audio parts in v1.2.0. Until it does, audio inputs are costed at the text rate; the reported cost can therefore only under-report, never over-report.

Default model

In the Library entry, you can set a default model for this provider. When an agent's model preference is (auto), it falls back to this default. If no default is set, Cerevisor uses gemini-3-flash-preview.
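The fallback order described above amounts to a three-step lookup. A minimal sketch, where resolveModel and BUILT_IN_DEFAULT are hypothetical names and an empty provider default is assumed to count as unset:

```typescript
// Hypothetical helper; not Cerevisor's actual API.
const BUILT_IN_DEFAULT = "gemini-3-flash-preview";

function resolveModel(agentPreference: string, providerDefault?: string): string {
  // An explicit agent preference always wins.
  if (agentPreference !== "(auto)") {
    return agentPreference;
  }
  // "(auto)" defers to the provider default, then to the built-in default.
  return providerDefault || BUILT_IN_DEFAULT;
}
```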

Thinking modes

Gemini's reasoning models accept a thinkingBudget integer that caps how many internal tokens the model can spend before answering. Cerevisor maps its neutral thinkingMode setting (in Agent Config → Reasoning) to Google's budgets:

Cerevisor mode	thinkingBudget
off	omitted (model decides)
low	1024
medium	4096
high	16384

Higher budgets cost more, because thinking tokens bill as output, but can improve answer quality on multi-step problems. The mode is ignored on non-thinking models.
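The mapping in the table can be written as a small lookup. This is a sketch only: thinkingConfigFor and the modelSupportsThinking flag are hypothetical, and how Cerevisor decides whether a model is a thinking model is not covered here.

```typescript
type ThinkingMode = "off" | "low" | "medium" | "high";

const THINKING_BUDGETS: Record<Exclude<ThinkingMode, "off">, number> = {
  low: 1024,
  medium: 4096,
  high: 16384,
};

// Returns the thinking config to attach to a request, or undefined when the
// budget should be omitted ("off", or a model that doesn't support thinking).
function thinkingConfigFor(
  mode: ThinkingMode,
  modelSupportsThinking: boolean,
): { thinkingBudget: number } | undefined {
  if (!modelSupportsThinking || mode === "off") {
    return undefined;
  }
  return { thinkingBudget: THINKING_BUDGETS[mode] };
}
```

In the @google/genai SDK, an object of this shape goes under config.thinkingConfig on a generateContent request.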

Tools and tool choice

Function calling is fully supported. Cerevisor maps:

Cerevisor toolChoice	Gemini behavior
auto (default)	Model chooses when to call a tool.
any	Model must respond with a call to one of the available tools rather than plain text.
none	Tool definitions are sent, but the model is forbidden from calling them.
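A sketch of how the table above might translate into a request fragment. It assumes the Gemini function-calling modes are passed as the strings AUTO, ANY, and NONE; toolConfigFor is a hypothetical helper name, not part of Cerevisor or the SDK.

```typescript
type ToolChoice = "auto" | "any" | "none";
type GeminiFunctionCallingMode = "AUTO" | "ANY" | "NONE";

const MODE_BY_CHOICE: Record<ToolChoice, GeminiFunctionCallingMode> = {
  auto: "AUTO",
  any: "ANY",
  none: "NONE",
};

// Builds the toolConfig fragment for a request; "auto" is the default.
function toolConfigFor(
  choice: ToolChoice = "auto",
): { functionCallingConfig: { mode: GeminiFunctionCallingMode } } {
  return { functionCallingConfig: { mode: MODE_BY_CHOICE[choice] } };
}
```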

Tool ids are paired across the request/response loop. Older Gemini models that don't emit an id get a synthesized one so the loop still works.
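One way to synthesize missing ids is sketched below. The id format and the withCallIds helper are invented for illustration; the only requirement is that each call in a turn gets a stable, unique id that the matching tool result can echo back.

```typescript
interface ToolCall {
  id?: string;
  name: string;
}

interface IdentifiedToolCall {
  id: string;
  name: string;
}

// Keeps ids the model supplied and fills in the rest. The turn number and
// position in the response make the synthesized ids unique within a run.
function withCallIds(calls: ToolCall[], turn: number): IdentifiedToolCall[] {
  return calls.map((call, index) => ({
    name: call.name,
    id: call.id || `synth-${turn}-${index}-${call.name}`,
  }));
}
```

For instance, on turn 3 a response containing an id-less search call followed by a fetch call with id abc yields the ids synth-3-0-search and abc.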

Prompt caching

When the model supports it, Cerevisor passes through Gemini's cachedContentTokenCount and bills cached input at the cache-read rate (typically 10% of the input rate) instead of the standard input rate. This is fully automatic; you don't need to configure anything.
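The arithmetic can be sketched as follows, assuming promptTokenCount already includes the cached tokens, so the uncached portion is the difference between the two counts. inputCostUsd and its rate parameters are hypothetical, and the rates in the example are placeholders.

```typescript
// Rates are USD per million tokens.
function inputCostUsd(
  promptTokenCount: number,
  cachedContentTokenCount: number,
  inputPerMTok: number,
  cacheReadPerMTok: number,
): number {
  // Guard against a cached count larger than the prompt count.
  const cached = Math.min(cachedContentTokenCount, promptTokenCount);
  const uncached = promptTokenCount - cached;
  return (uncached * inputPerMTok + cached * cacheReadPerMTok) / 1_000_000;
}
```

With a placeholder input rate of 2.0 and a cache-read rate of 0.2 (10% of input), a 10,000-token prompt with 8,000 cached tokens costs about $0.0056 in input, versus $0.02 with no cache hit.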

Common errors

Error	What it means	Fix
`@google/genai is not installed`	The optional SDK is missing from this build.	Reinstall Cerevisor, or run `npm install @google/genai` in your local dev build.
401 / API key invalid	The key isn't accepted by Google.	Generate a new one at `aistudio.google.com/apikey` and re-test.
Smoke test timed out	The test request didn't return within 30 seconds.	Check your network. The Gemini API can be slow during regional incidents.
Costs show $0 for a known model	The model id isn't in `gemini-pricing.ts`.	Check the console for the "Unknown model" warning. New model ids land in patch releases.

Where to go next

Per-agent provider overrides: mix Gemini with Anthropic or Codex CLI in one workflow.
Provider overview: how Cerevisor picks which provider runs which agent.
