🧠
Gemini Deep Think
Not a separate model. It's gemini-3-pro with thinking_level: "deep",
which activates extended reasoning at ~50x the compute of a standard query. Also supports
"low", "medium", and "high" levels.
Benchmarks
Codeforces
3455 Elo (Legendary Grandmaster)
ARC-AGI-2
84.6%
Olympiads
Gold-medal level on IMO, IPhO, IChO
Trade-offs
Cost
~$2-4/M input, $12-18/M output, but 10-50x token usage per query. A single deep query can cost $0.50-5+.
Latency
30s-5min per query
Availability
Early access only (Feb 2026). Not yet in Bright Wrapper.
When to use something else
Claude Opus 4.6
Better for production software engineering (SWE-Bench Pro leader). Deep Think leads on algorithmic/scientific problems but trails on real-world codebases.
Gemini Pro (high)
thinking_level: "high" — cheaper middle ground for reasoning tasks that don't need the full deep mode.
💻
Code Generation Models
Current models for writing, reviewing, and debugging code. They have different strengths.
Models
Claude Opus 4.6
Anthropic. Leads SWE-Bench Pro (real-world codebases). Has extended thinking.
~$5/M input, $25/M output.
Gemini Deep Think
Google. Leads Codeforces (3455 Elo). Trails Opus on real-world codebases.
10-50x token usage. Early access only.
GPT-5.3-Codex
OpenAI. Deep-reasoning coding model. Powers the Codex cloud agent.
Available via ChatGPT and API.
gemini-3-flash
Google. 1/10th the cost of Opus. SWE-Bench Verified 78%.
Good enough for simple scripts, formatting, boilerplate.
SWE-Bench Pro
Opus 4.6
Codeforces
Deep Think
Cheapest
gem-3-flash
🤖
Agentic Coding Tools
LLMs wrapped with tool use (file reads, terminal, git, web search) so they can autonomously
write, test, and commit code. The model reasons; the agent harness acts.
Terminal agents
Claude Code
Anthropic. Runs Opus 4.6. Reads project, edits files, runs tests, spawns sub-agents for parallel work.
$20-200/mo subscription or API key. ~$5-20 per complex task. This is what Bright Wrapper is built with.
Gemini CLI
Google. Open-source. Runs Gemini 3 Flash/Pro. Free tier with a Google account.
1M token context. SWE-Bench Verified 78% with Flash.
Async agents (clone repo, deliver PR)
Jules
Google. Clones repo into cloud VM, works in background, delivers a PR.
Free tier: 15 tasks/day. jules-tools CLI available.
OpenAI Codex
OpenAI. Same pattern — sandbox, delivers PR.
Powered by GPT-5.3-Codex (deep reasoning) and Codex-Spark (1000+ tok/s on Cerebras for fast edits).
ChatGPT, CLI, and VS Code extension.
IDEs
Antigravity
Google. Agent-first IDE (public preview, free). Can dispatch multiple agents in parallel across editor, terminal, browser.
Multi-model (Gemini, Claude, OpenAI). Still early — reports of IDE freezes when agents spin up.
Cursor
VS Code fork. Editor-centric — inline completions and chat. Single-agent focus. $20/mo.