Claude
Quick Facts
- Vendor
- Anthropic (San Francisco)
- Released
- Claude 1 (March 2023); Claude 4 family (2025–2026)
- Current line
- Opus 4.7 · Sonnet 4.6 · Haiku 4.5
- License
- Proprietary; hosted API only
- Hosting
- Anthropic API, Amazon Bedrock, Google Vertex AI
- Context window
- 200K tokens standard; 1M tokens on select Opus tiers
- Modalities
- Text, image (vision), PDF, code
- Alignment approach
- Constitutional AI / RLHF
Summary
Claude is Anthropic's general-purpose LLM family, launched in 2023 as a direct competitor to OpenAI's GPT series. The defining design choice is Constitutional AI: an alignment technique that trains the model against a written set of principles rather than relying solely on human feedback. In practice this produces a model that refuses fewer legitimate requests, explains its reasoning more readily, and degrades more gracefully on ambiguous edge cases.
For agent infrastructure, three features matter most: (1) structured tool use with a well-defined JSON protocol that agent frameworks can target; (2) prompt caching, which drops the cost of long system prompts and repeated context by up to 90% — turning otherwise uneconomical agent loops into viable workloads; and (3) extended thinking, a reasoning mode that lets the model think before responding, closing the gap with dedicated reasoning models on math and code benchmarks.
Model Lineup
- Opus — flagship tier. Highest capability, longest context, slowest and most expensive. Use for planning, deep code edits, and agent orchestration.
- Sonnet — balanced tier. The workhorse for most production agents. Claude Code defaults to Sonnet for most interactive tasks.
- Haiku — speed tier. Sub-second latency on short prompts. Use for classification, routing, tool-call dispatch, and high-volume summarization.
Where Claude Fits
Claude is the default choice when any of the following dominate your requirements: long-running coding agents, multi-step tool loops with strict reliability requirements, or workloads that benefit from prompt caching (long system prompts, RAG pipelines, document-grounded Q&A). The Claude Code CLI and the Claude Agent SDK are built on the same tool-use protocol, so infrastructure built around one transfers cleanly to the other.
Tradeoffs
- Closed weights. No self-hosted option. Data residency is constrained to Anthropic, Bedrock, or Vertex regions. Regulated workloads with on-premise requirements need a different family (Llama, Qwen, Mistral).
- Rate limits on Opus. Flagship tiers are capacity-constrained. Production deployments should plan for multi-provider fallback via Bedrock or Vertex.
- Pricing at the top tier. Opus token pricing is premium. Prompt caching and tiered routing (Opus → Sonnet → Haiku) are essential for cost control.
Deployment Notes
Within the Claw ecosystem, Claude powers the human-facing control plane: Claude Code Slack for operator conversations, Claude Agent SDK for bespoke agent loops, and the observability layer that turns agent traces into KPIs. Open-weights models (Qwen, Llama) handle the high-volume interior loops where self-hosting on Mac Mini edge hardware beats API economics. See The Agent Infrastructure Stack for the full architecture.