Models & Effort Levels

Choose the right Claude model and effort level for cost, speed, and quality trade-offs.

Overview

Praxiom AI runs on Claude models from Anthropic. You can choose between three model tiers and four effort levels, giving you fine-grained control over the cost, speed, and quality of every AI operation.

Models

Model	Extended Thinking	Large Context (1M)	Typical Cost
Claude Haiku	No	No	Lowest
Claude Sonnet	Yes	Yes	Medium
Claude Opus	Yes	Yes	Highest

When to Use Each

Haiku — Quick lookups, simple Q&A, low-latency operations. Best for frequent, lightweight interactions where speed matters more than depth.
Sonnet — The default for most workflows. Handles synthesis, recommendations, and document drafting with strong quality at reasonable cost.
Opus — Complex, multi-step analysis requiring maximum depth. Use for deep research synthesis across many sources, nuanced strategy recommendations, or comprehensive document generation.

Effort Levels

Effort levels control how deeply the agent reasons about your query:

Effort	UI Name	Extended Thinking	Description
`low`	Fast	No	Quick responses with minimal tool use
`medium`	Fast	No	Standard responses
`high`	Thorough	No	Deeper analysis, multiple tool calls
`max`	Deep	Yes (Sonnet/Opus)	Extended thinking enabled, maximum reasoning depth

Effort Capping

Not all models support all effort levels:

Haiku caps at high — requesting max is automatically downgraded to high
Sonnet and Opus support all levels including max with extended thinking
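
The capping rule can be sketched as a small lookup. This is an illustration only, not Praxiom's implementation: the short tier names used as keys (haiku, sonnet, opus) and both function names are assumptions made for the example.

```python
# Effort levels in ascending order, as listed in the effort table.
EFFORT_ORDER = ["low", "medium", "high", "max"]

# Highest effort each model tier accepts, per the capping rules above.
# The short tier names used as keys are an assumption for this sketch.
MAX_EFFORT_BY_TIER = {"haiku": "high", "sonnet": "max", "opus": "max"}


def cap_effort(model_tier: str, requested: str) -> str:
    """Return the effort that would apply after the model's cap."""
    cap = MAX_EFFORT_BY_TIER[model_tier]
    if EFFORT_ORDER.index(requested) > EFFORT_ORDER.index(cap):
        return cap  # e.g. Haiku + max is downgraded to high
    return requested


def uses_extended_thinking(model_tier: str, requested: str) -> bool:
    """Extended thinking applies only when the capped effort is max."""
    return cap_effort(model_tier, requested) == "max"


print(cap_effort("haiku", "max"))               # high
print(uses_extended_thinking("sonnet", "max"))  # True
```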

Credit Cost

Praxiom uses a cost-proportional credit system: 1 credit = $0.08 USD of Anthropic API spend. Your credit charge is determined by the actual token usage of each operation — more expensive models and higher effort levels consume more credits.
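
As a rough illustration of the conversion, the sketch below turns API spend into credits and back using the stated rate. The function names are hypothetical, and real charges depend on the actual token usage of each operation.

```python
USD_PER_CREDIT = 0.08  # from the text: 1 credit = $0.08 of Anthropic API spend


def usd_to_credits(api_spend_usd: float) -> float:
    """Convert API spend in USD to credits."""
    return api_spend_usd / USD_PER_CREDIT


def credits_to_usd(credits: float) -> float:
    """Convert a credit amount back to the underlying API spend in USD."""
    return credits * USD_PER_CREDIT


# An operation that consumes $0.40 of API spend costs about 5 credits.
print(round(usd_to_credits(0.40), 2))
# A 12-credit operation corresponds to about $0.96 of API spend.
print(round(credits_to_usd(12), 2))
```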

Approximate credit ranges by model and effort:

Combination	Est. Credits	Best For
Haiku + low	~0.5–1	Quick lookups, status checks
Haiku + high	~1–2	Simple synthesis, basic queries
Sonnet + low	~1–2	Fast chat responses
Sonnet + high	~2–4	Standard synthesis and recommendations
Sonnet + max	~3–7	Deep analysis with extended thinking
Opus + low	~2–5	Complex queries, quick turnaround
Opus + high	~6–15	Thorough multi-source synthesis
Opus + max	~12–32	Maximum depth research and strategy

Use GET /api/billing/estimate to get a precise low/high range before running an operation.

Failed runs are charged at 25% of actual cost (minimum 0.25 credits), and cancelled runs are charged at 50%. Chat messages have a flat floor of 0.25 credits.
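
The billing rules above can be expressed as a short function. This is a sketch under stated assumptions: the status strings and the is_chat flag are hypothetical names, and applying the chat floor after the failed or cancelled adjustment is a guess about how the rules combine, not documented behavior.

```python
def charged_credits(actual_credits: float, status: str, is_chat: bool = False) -> float:
    """Estimate the credits charged for a run, given its actual cost in credits.

    status is one of "completed", "failed", or "cancelled" (hypothetical names).
    """
    if status == "completed":
        charge = actual_credits
    elif status == "failed":
        # 25% of actual cost, with a 0.25-credit minimum.
        charge = max(0.25 * actual_credits, 0.25)
    elif status == "cancelled":
        # 50% of actual cost; no minimum is stated for cancelled runs.
        charge = 0.5 * actual_credits
    else:
        raise ValueError(f"unknown status: {status!r}")

    if is_chat:
        # Assumption: the 0.25-credit chat floor is applied last.
        charge = max(charge, 0.25)
    return charge


print(charged_credits(4.0, "failed"))     # 1.0
print(charged_credits(4.0, "cancelled"))  # 2.0
```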

Extended Thinking

When effort is set to max on Sonnet or Opus, the agent uses extended thinking — a mode where the model reasons through complex problems step-by-step before responding. This produces:

Deeper cross-referencing of research sources
More nuanced recommendations with better rationale
More comprehensive documents with stronger evidence chains

Extended thinking blocks are visible in the chat UI as collapsible "thinking" sections, giving you transparency into the agent's reasoning process.

Large Context Window

Sonnet and Opus support the 1M token context window for synthesis, drafting, and full pipeline workflows. This allows the agent to process significantly more research material in a single operation — useful when synthesizing 10+ lengthy interview transcripts or drafting documents from extensive source material.

Large context is enabled automatically for these workflows on supported models; no configuration is needed.
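
The activation rule amounts to two membership checks. The sketch below is illustrative only; the tier and workflow identifiers are assumed spellings, not values confirmed by the API.

```python
# Tiers and workflows taken from the text; the identifier spellings are assumptions.
LARGE_CONTEXT_TIERS = {"sonnet", "opus"}
LARGE_CONTEXT_WORKFLOWS = {"synthesis", "drafting", "full_pipeline"}


def uses_large_context(model_tier: str, workflow: str) -> bool:
    """True when the 1M-token context window would be enabled automatically."""
    return model_tier in LARGE_CONTEXT_TIERS and workflow in LARGE_CONTEXT_WORKFLOWS


print(uses_large_context("sonnet", "synthesis"))  # True
print(uses_large_context("haiku", "synthesis"))   # False
print(uses_large_context("opus", "chat"))         # False
```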

Setting Model and Effort

In the UI

Use the model selector and reasoning mode toggle in the chat panel to switch between models and effort levels.

Via API

Pass model and reasoning_mode (or effort) in the body of the streaming request. The example below sets both:

{
  "workspace_id": "your-workspace-uuid",
  "prompt": "Synthesize all interview sources",
  "model": "claude-sonnet-4-20250514",
  "reasoning_mode": "deep",
  "effort": "max"
}

The reasoning_mode field accepts "fast", "thorough", or "deep". The effort field accepts "low", "medium", "high", or "max".
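
One way to keep the two fields consistent is to derive reasoning_mode from effort when building the request body. The sketch below assumes the lowercase UI names from the effort table are the matching reasoning_mode values (the sample request pairs "deep" with "max"); the helper name is hypothetical, and whether the API needs both fields is not stated here.

```python
import json

# Assumed mapping, based on the effort table: low and medium both appear as "Fast".
REASONING_MODE_BY_EFFORT = {
    "low": "fast",
    "medium": "fast",
    "high": "thorough",
    "max": "deep",
}


def build_request_body(workspace_id: str, prompt: str, model: str, effort: str) -> str:
    """Serialize a request body whose reasoning_mode matches the chosen effort."""
    if effort not in REASONING_MODE_BY_EFFORT:
        raise ValueError(f"effort must be one of {sorted(REASONING_MODE_BY_EFFORT)}")
    body = {
        "workspace_id": workspace_id,
        "prompt": prompt,
        "model": model,
        "reasoning_mode": REASONING_MODE_BY_EFFORT[effort],
        "effort": effort,
    }
    return json.dumps(body, indent=2)


print(build_request_body(
    "your-workspace-uuid",
    "Synthesize all interview sources",
    "claude-sonnet-4-20250514",
    "max",
))
```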

Workflow Defaults

Each workflow has a default effort level and a minimum (floor) to ensure quality:

Workflow	Default Effort	Minimum
Synthesis	high	high
Recommendations	high	high
Drafting	high	high
Chat	medium	low
Full Pipeline	high	high

You can always raise effort above the default; a request below the workflow's minimum is automatically upgraded to that minimum.
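
The default-and-floor behavior can be sketched as follows. The values come from the table above; the workflow keys and the function name are assumptions made for the example, and model-level capping (for instance Haiku's cap at high) is not modeled here.

```python
from typing import Optional

EFFORT_ORDER = ["low", "medium", "high", "max"]

# (default, minimum) per workflow, from the Workflow Defaults table.
# The workflow keys are assumed spellings for this sketch.
WORKFLOW_EFFORT = {
    "synthesis": ("high", "high"),
    "recommendations": ("high", "high"),
    "drafting": ("high", "high"),
    "chat": ("medium", "low"),
    "full_pipeline": ("high", "high"),
}


def resolve_effort(workflow: str, requested: Optional[str] = None) -> str:
    """Apply the workflow default when nothing is requested, else enforce the floor."""
    default, minimum = WORKFLOW_EFFORT[workflow]
    if requested is None:
        return default
    if EFFORT_ORDER.index(requested) < EFFORT_ORDER.index(minimum):
        return minimum  # below the floor: upgraded automatically
    return requested


print(resolve_effort("synthesis", "low"))  # high
print(resolve_effort("chat"))              # medium
print(resolve_effort("chat", "low"))       # low
```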

What's Next

Learn about verification and trust scoring in the Verification & Trust guide.
