Models & Effort Levels
Choose the right Claude model and effort level for cost, speed, and quality trade-offs.
Overview
Praxiom AI runs on Claude models from Anthropic. You can choose between three model tiers and four effort levels, giving you fine-grained control over the cost, speed, and quality of every AI operation.
Models
| Model | Extended Thinking | Large Context (1M) | Typical Cost |
|---|---|---|---|
| Claude Haiku | No | No | Lowest |
| Claude Sonnet | Yes | Yes | Medium |
| Claude Opus | Yes | Yes | Highest |
When to Use Each
- Haiku — Quick lookups, simple Q&A, low-latency operations. Best for frequent, lightweight interactions where speed matters more than depth.
- Sonnet — The default for most workflows. Handles synthesis, recommendations, and document drafting with strong quality at reasonable cost.
- Opus — Complex, multi-step analysis requiring maximum depth. Use for deep research synthesis across many sources, nuanced strategy recommendations, or comprehensive document generation.
Effort Levels
Effort levels control how deeply the agent reasons about your query:
| Effort | UI Name | Extended Thinking | Description |
|---|---|---|---|
| low | Fast | No | Quick responses with minimal tool use |
| medium | Fast | No | Standard responses |
| high | Thorough | No | Deeper analysis, multiple tool calls |
| max | Deep | Yes (Sonnet/Opus) | Extended thinking enabled, maximum reasoning depth |
Effort Capping
Not all models support all effort levels:
- Haiku caps at high: requesting max is automatically downgraded to high
- Sonnet and Opus support all levels, including max with extended thinking
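The capping rule can be sketched as a small lookup. This is an illustrative sketch, not Praxiom's implementation: the tier keys ("haiku", "sonnet", "opus") and the function name cap_effort are hypothetical, and only the ceilings themselves come from the rules above.

```python
# Effort levels in ascending order, as listed in the Effort Levels table.
EFFORT_ORDER = ["low", "medium", "high", "max"]

# Highest effort each model tier supports, per the capping rules above.
# The tier keys are illustrative labels, not API model IDs.
MAX_EFFORT_BY_TIER = {"haiku": "high", "sonnet": "max", "opus": "max"}


def cap_effort(tier: str, requested: str) -> str:
    """Return the effort that would actually apply for a given tier."""
    if requested not in EFFORT_ORDER:
        raise ValueError(f"unknown effort level: {requested!r}")
    if tier not in MAX_EFFORT_BY_TIER:
        raise ValueError(f"unknown model tier: {tier!r}")
    ceiling = MAX_EFFORT_BY_TIER[tier]
    # Downgrade only when the request exceeds the tier's ceiling.
    if EFFORT_ORDER.index(requested) > EFFORT_ORDER.index(ceiling):
        return ceiling
    return requested
```

For example, cap_effort("haiku", "max") yields "high", while cap_effort("sonnet", "max") is left unchanged.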
Credit Cost
Praxiom uses a cost-proportional credit system: 1 credit = $0.08 USD of Anthropic API spend. Your credit charge is determined by the actual token usage of each operation — more expensive models and higher effort levels consume more credits.
Approximate credit ranges by model and effort:
| Combination | Est. Credits | Best For |
|---|---|---|
| Haiku + low | ~0.5–1 | Quick lookups, status checks |
| Haiku + high | ~1–2 | Simple synthesis, basic queries |
| Sonnet + low | ~1–2 | Fast chat responses |
| Sonnet + high | ~2–4 | Standard synthesis and recommendations |
| Sonnet + max | ~3–7 | Deep analysis with extended thinking |
| Opus + low | ~2–5 | Complex queries, quick turnaround |
| Opus + high | ~6–15 | Thorough multi-source synthesis |
| Opus + max | ~12–32 | Maximum depth research and strategy |
Use GET /api/billing/estimate to get a precise low/high range before running an operation.
Failed runs are charged at 25% of actual cost (minimum 0.25 credits). Cancelled runs are charged at 50% of actual cost. Chat messages have a flat floor of 0.25 credits.
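The arithmetic behind these rules can be sketched as follows. The conversion rate, percentages, and floors come from this section. The status names, the function names, and the choice to apply the chat floor after any failure or cancellation adjustment are assumptions made for illustration. No rounding is applied, since the rounding policy is not stated here.

```python
USD_PER_CREDIT = 0.08        # 1 credit = $0.08 of Anthropic API spend
FAILED_RATE = 0.25           # failed runs: 25% of actual cost
FAILED_MINIMUM = 0.25        # failed runs: minimum charge in credits
CANCELLED_RATE = 0.50        # cancelled runs: 50% of actual cost
CHAT_FLOOR = 0.25            # chat messages: flat floor in credits


def credits_for_spend(api_spend_usd: float) -> float:
    """Convert API spend in USD to credits."""
    return api_spend_usd / USD_PER_CREDIT


def credit_charge(api_spend_usd: float, status: str = "completed",
                  is_chat: bool = False) -> float:
    """Estimate the credit charge for one operation.

    The status values ("completed", "failed", "cancelled") are
    hypothetical labels, not documented API values.
    """
    credits = credits_for_spend(api_spend_usd)
    if status == "failed":
        credits = max(credits * FAILED_RATE, FAILED_MINIMUM)
    elif status == "cancelled":
        credits = credits * CANCELLED_RATE
    elif status != "completed":
        raise ValueError(f"unknown status: {status!r}")
    if is_chat:
        # Assumption: the chat floor is applied last.
        credits = max(credits, CHAT_FLOOR)
    return credits
```

Under these assumptions, a run with $0.32 of API spend costs 4 credits when it completes, 1 credit when it fails, and 2 credits when it is cancelled.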
Extended Thinking
When effort is set to max on Sonnet or Opus, the agent uses extended thinking — a mode where the model reasons through complex problems step-by-step before responding. This produces:
- Deeper cross-referencing of research sources
- More nuanced recommendations with better rationale
- More comprehensive documents with stronger evidence chains
Extended thinking blocks are visible in the chat UI as collapsible "thinking" sections, giving you transparency into the agent's reasoning process.
Large Context Window
Sonnet and Opus support the 1M token context window for synthesis, drafting, and full pipeline workflows. This allows the agent to process significantly more research material in a single operation — useful when synthesizing 10+ lengthy interview transcripts or drafting documents from extensive source material.
Large context is enabled automatically for synthesis, drafting, and full pipeline workflows on supported models. You don't need to configure anything.
Setting Model and Effort
In the UI
Use the model selector and reasoning mode toggle in the chat panel to switch between models and effort levels.
Via API
Pass model and reasoning_mode (or effort) in the streaming request:
{
  "workspace_id": "your-workspace-uuid",
  "prompt": "Synthesize all interview sources",
  "model": "claude-sonnet-4-20250514",
  "reasoning_mode": "deep",
  "effort": "max"
}
The reasoning_mode field accepts "fast", "thorough", or "deep". The effort field accepts "low", "medium", "high", or "max".
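A request body like the one above could be assembled with a small helper. This is a sketch under assumptions: the helper name build_request is hypothetical, and the effort-to-reasoning_mode mapping is inferred from the UI Name column of the Effort Levels table (lowercased) rather than confirmed by the API. The endpoint URL is not given here, so the sketch only builds and prints the payload.

```python
import json

# Assumed mapping, inferred from the UI Name column in the Effort Levels table.
EFFORT_TO_REASONING_MODE = {
    "low": "fast",
    "medium": "fast",
    "high": "thorough",
    "max": "deep",
}


def build_request(workspace_id: str, prompt: str, model: str, effort: str) -> dict:
    """Build a request body with a matching effort and reasoning_mode pair."""
    if effort not in EFFORT_TO_REASONING_MODE:
        raise ValueError(f"effort must be one of {sorted(EFFORT_TO_REASONING_MODE)}")
    return {
        "workspace_id": workspace_id,
        "prompt": prompt,
        "model": model,
        "reasoning_mode": EFFORT_TO_REASONING_MODE[effort],
        "effort": effort,
    }


payload = build_request(
    workspace_id="your-workspace-uuid",
    prompt="Synthesize all interview sources",
    model="claude-sonnet-4-20250514",
    effort="max",
)
print(json.dumps(payload, indent=2))
```

Deriving reasoning_mode from effort in one place keeps the two fields from contradicting each other when both are sent.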
Workflow Defaults
Different workflows have default effort floors to ensure quality:
| Workflow | Default Effort | Minimum |
|---|---|---|
| Synthesis | high | high |
| Recommendations | high | high |
| Drafting | high | high |
| Chat | medium | low |
| Full Pipeline | high | high |
You can always increase effort above the default. A request below the workflow's minimum is automatically raised to that minimum.
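The floor behavior can be sketched as a lookup plus a comparison. The defaults and minimums are taken from the table above. The workflow keys and the function name resolve_effort are hypothetical, and the sketch does not model per-model capping.

```python
from typing import Optional

EFFORT_ORDER = ("low", "medium", "high", "max")

# (default effort, minimum effort) per workflow, from the table above.
# The workflow keys are illustrative identifiers.
WORKFLOW_EFFORT = {
    "synthesis": ("high", "high"),
    "recommendations": ("high", "high"),
    "drafting": ("high", "high"),
    "chat": ("medium", "low"),
    "full_pipeline": ("high", "high"),
}


def resolve_effort(workflow: str, requested: Optional[str] = None) -> str:
    """Apply the workflow default and floor to a requested effort."""
    default, minimum = WORKFLOW_EFFORT[workflow]
    if requested is None:
        return default
    if requested not in EFFORT_ORDER:
        raise ValueError(f"unknown effort level: {requested!r}")
    # Requests below the floor are raised to the floor.
    if EFFORT_ORDER.index(requested) < EFFORT_ORDER.index(minimum):
        return minimum
    return requested
```

Here resolve_effort("synthesis", "low") returns "high", while resolve_effort("chat", "low") stays at "low" because chat allows it.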
What's Next
Learn about verification and trust scoring in the Verification & Trust guide.