Coding and agent model guide
Use this page to decide which model should plan, edit, review, and handle bulk work in an AI coding-agent stack. It combines coding quality, tool use, context length, and API cost into practical routing guidance.
Shortlist
Use frontier models for planning and risky edits. Use cheaper or local models for repetitive, low-risk work.
- **GPT-5.5** (OpenAI) · 1M context · $5/$30 per 1M · Complex reasoning · Coding · Professional workflows
- **GPT-5.4** (OpenAI) · 1M context · $2.5/$15 per 1M · Coding · Agents · Tool integration
- **Claude Opus 4.7** (Anthropic) · 1M context · $5/$25 per 1M · Complex reasoning · Agentic coding · Critical decisions
- **GPT-5.2-Codex** (OpenAI) · 400K context · $1.75/$14 per 1M · Coding-focused tasks · Type inference · Agentic coding
- **Claude Sonnet 4.6** (Anthropic) · 1M context · $3/$15 per 1M · Balanced performance · Production workloads · Cost-efficient
- **Gemini 3.1 Pro Preview** (Google) · 1M context · $2/$12 per 1M · Multimodal tasks · Long context · Search integration
- **DeepSeek V3** (DeepSeek) · 128K context · $0.27/$1.10 per 1M · Budget coding · High-volume · Cost-sensitive
- **Llama 4 (405B)** (Meta) · 128K context · $2/$8 per 1M · Self-hosted · Open source · Customizable
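To compare the prices above concretely, per-request cost follows directly from input and output token counts. A minimal sketch in Python; the model-ID strings are illustrative labels for this guide, not verified API identifiers:

```python
# Rough per-request cost estimator using the $/1M-token prices listed above.
# Model IDs are illustrative labels, not verified API names.
PRICES = {
    # model: (input $/1M tokens, output $/1M tokens)
    "gpt-5.5": (5.00, 30.00),
    "claude-sonnet-4.6": (3.00, 15.00),
    "deepseek-v3": (0.27, 1.10),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated dollar cost of a single request."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000
```

For example, a 100K-token input with a 10K-token response comes to about $0.038 on DeepSeek V3 versus $0.80 on GPT-5.5, which is why high-volume, low-risk work belongs on cheaper models.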
Routing pattern
The best setup is usually a router, not one model doing every step. Split planning, implementation, review, and bulk work.
- **Planning: GPT-5.5 or Claude Opus 4.7.** Use the strongest reasoning model to decompose tasks, choose files, and decide when to stop.
- **Implementation: GPT-5.4 or GPT-5.2-Codex.** Use a coding-optimized model for implementation loops, test fixes, and repository edits.
- **Review: Claude Opus 4.7.** Use a second frontier model for architecture review, hidden assumptions, and regression risk.
- **Bulk work: Claude Sonnet 4.6, DeepSeek V3, or local models.** Route repetitive linting, extraction, and low-risk edits to lower-cost models.
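The role split above can be sketched as a small dispatch table. This is a minimal illustration, assuming one model per role; the model-ID strings are placeholders drawn from this guide, not verified API identifiers:

```python
# Minimal role-based router matching the planning/implementation/review/bulk
# split. Model IDs are placeholder strings, not verified API identifiers.
ROLE_MODELS = {
    "planning": "gpt-5.5",              # decompose tasks, choose files, decide when to stop
    "implementation": "gpt-5.2-codex",  # implementation loops, test fixes, repo edits
    "review": "claude-opus-4.7",        # architecture review, regression risk
    "bulk": "claude-sonnet-4.6",        # linting, extraction, low-risk edits
}

def route(role: str) -> str:
    """Return the model ID configured for a pipeline role."""
    try:
        return ROLE_MODELS[role]
    except KeyError:
        raise ValueError(f"unknown role: {role!r}") from None
```

In a real stack the table would likely hold fallbacks and per-role parameters (temperature, max tokens, tool configuration) rather than bare model IDs.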
Workflow map
Choose based on the job: long context, autonomy, privacy, and retry cost matter as much as raw coding score.
| Workflow | Primary model | Fallback | Why |
|---|---|---|---|
| Large refactor | Claude Opus 4.7 | GPT-5.4 | Prioritize context, careful planning, and review quality. |
| Autonomous feature build | GPT-5.5 | GPT-5.4 | Keep planning and execution separate for cleaner diffs. |
| Bug fix from failing tests | GPT-5.2-Codex | GPT-5.4 | Give the model test output, touched files, and reproduction steps. |
| Repo Q&A / codebase search | Gemini 3.1 Pro Preview | Claude Sonnet 4.6 | Long context helps, but still pair it with retrieval. |
| High-volume code review | Claude Sonnet 4.6 | DeepSeek V3 | Use cheaper models for first-pass comments, then escalate risky files. |
| Private codebase | Llama 4 (405B) | GPT-OSS-120B | Prefer open or local deployment when source cannot leave your environment. |
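The primary/fallback pairs in the table lend themselves to a simple retry wrapper. A sketch under the assumption that `call_model` is your own client function (hypothetical here); it tries the primary once, then retries on the fallback before surfacing the failure:

```python
# Primary/fallback escalation mirroring the workflow table above.
# `call_model` stands in for a real client call; model IDs are placeholders.
WORKFLOWS = {
    "large_refactor": ("claude-opus-4.7", "gpt-5.4"),
    "bug_fix_from_tests": ("gpt-5.2-codex", "gpt-5.4"),
    "repo_qa": ("gemini-3.1-pro-preview", "claude-sonnet-4.6"),
}

def run_workflow(workflow: str, prompt: str, call_model) -> str:
    """Try the primary model; on any error, retry once on the fallback."""
    primary, fallback = WORKFLOWS[workflow]
    try:
        return call_model(primary, prompt)
    except Exception:
        return call_model(fallback, prompt)
```

Production routers usually distinguish retryable errors (rate limits, timeouts) from hard failures instead of catching every exception, but the shape is the same.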
Use the model picker to filter further by budget, context window, tool use, and deployment constraints.