What it is
GPT-5 (released August 2025) is competitive with Claude 4.5 on many agentic tasks. OpenAI's API has the broadest SDK and framework support of any provider we test.
Why we use it
- Most mature API in the market with broadest SDK support
- Competitive pricing vs Claude on equivalent tier
- Strong on coding-specific tasks
- Function calling supported first-class by every agent framework we test
Why we wouldn't
- Tool-use behaviour more variable than Claude on multi-step tasks
- Frequent model deprecations require ongoing config updates
- OpenAI's safety stance is less consistent than Anthropic's
Best for
- Teams already using OpenAI for other workloads
- Coding-heavy agent tasks
- Fallback when Claude is rate-limited
Not for
- Multi-step planning where reliability is the primary metric
Long review
OpenAI's GPT-5 is genuinely strong, particularly on coding tasks, where it edges out Claude in our benchmarks. The API remains the gold standard for SDK breadth and integration coverage: every agent framework we test supports OpenAI first-class, and in some it is the only supported provider besides Anthropic. Pricing is competitive. The trade-offs are threefold. Tool-use behaviour is meaningfully more variable than Claude's on multi-step tasks (we've watched GPT-5 lose track of which tool it called two steps ago). Model deprecations come faster than we'd like. And OpenAI's safety stance wobbled often enough in 2025 that we're cautious about depending on it as our only provider. Our standard recommendation: lead with Claude, and keep OpenAI configured as a fallback for when Claude is unavailable.
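As a rough illustration of the fallback recommendation above, here is a minimal Python sketch of ordered provider fallback. It does not use any real SDK: `Provider`, `ProviderUnavailable`, `complete_with_fallback`, and the two stub functions are all hypothetical names made up for this example. In a real setup, each `complete` callable would wrap the vendor's client and translate its rate-limit or outage errors into `ProviderUnavailable`.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class ProviderUnavailable(Exception):
    """Hypothetical error: the provider is rate-limited or down."""


@dataclass
class Provider:
    name: str
    # Hypothetical interface: takes a prompt, returns completion text.
    complete: Callable[[str], str]


def complete_with_fallback(prompt: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try providers in order; return (provider name, completion) from the first success."""
    errors = []
    for provider in providers:
        try:
            return provider.name, provider.complete(prompt)
        except ProviderUnavailable as exc:
            # Record the failure and move on to the next provider in the list.
            errors.append(f"{provider.name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stand-in callables so the sketch runs without network access or API keys.
def claude_stub(prompt: str) -> str:
    raise ProviderUnavailable("rate limited")


def openai_stub(prompt: str) -> str:
    return f"echo: {prompt}"


# Order encodes the recommendation: Claude first, OpenAI as the fallback.
providers = [Provider("claude", claude_stub), Provider("openai", openai_stub)]
used, text = complete_with_fallback("hi", providers)
print(used, text)
```

Only `ProviderUnavailable` triggers the fallback here; any other exception propagates, so a bug in a prompt or a malformed request is not silently retried against a second provider.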
Alternatives we've tested
- Anthropic (Claude API) — The LLM provider we default to for self-hosted agent workloads. Claude 4.5 Sonnet remains the best agentic model in 2026.
- OpenRouter — LLM gateway with unified API across 100+ providers. Our default for cost-optimisation and provider fallback.