Models & Effort Levels

Choose the right Claude model and effort level for cost, speed, and quality trade-offs.

Overview

Praxiom AI runs on Claude models from Anthropic. You can choose between three model tiers and four effort levels, giving you fine-grained control over the cost, speed, and quality of every AI operation.

Models

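| Model         | Extended Thinking | Large Context (1M) | Typical Cost |
|---------------|-------------------|--------------------|--------------|
| Claude Haiku  | No                | No                 | Lowest       |
| Claude Sonnet | Yes               | Yes                | Medium       |
| Claude Opus   | Yes               | Yes                | Highest      |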

When to Use Each

  • Haiku — Quick lookups, simple Q&A, low-latency operations. Best for frequent, lightweight interactions where speed matters more than depth.
  • Sonnet — The default for most workflows. Handles synthesis, recommendations, and document drafting with strong quality at reasonable cost.
  • Opus — Complex, multi-step analysis requiring maximum depth. Use for deep research synthesis across many sources, nuanced strategy recommendations, or comprehensive document generation.

Effort Levels

Effort levels control how deeply the agent reasons about your query:

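| Effort | UI Name  | Extended Thinking | Description                                        |
|--------|----------|-------------------|----------------------------------------------------|
| low    | Fast     | No                | Quick responses with minimal tool use              |
| medium | Fast     | No                | Standard responses                                 |
| high   | Thorough | No                | Deeper analysis, multiple tool calls               |
| max    | Deep     | Yes (Sonnet/Opus) | Extended thinking enabled, maximum reasoning depth |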

Effort Capping

Not all models support all effort levels:

  • Haiku caps at high — requesting max is automatically downgraded to high
  • Sonnet and Opus support all levels including max with extended thinking
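The capping rule above can be sketched in a few lines. This is an illustrative sketch, not Praxiom's implementation: the tier labels (`haiku`, `sonnet`, `opus`) and the function names are hypothetical, and only the cap values come from this guide.

```python
# Effort levels in ascending order, as listed in the Effort Levels table.
EFFORT_ORDER = ["low", "medium", "high", "max"]

# Highest effort each model tier supports (Haiku caps at high).
# The tier keys are illustrative labels, not Praxiom API model IDs.
MAX_EFFORT = {"haiku": "high", "sonnet": "max", "opus": "max"}


def cap_effort(model_tier: str, requested: str) -> str:
    """Return the effort actually applied after capping for the model tier."""
    ceiling = MAX_EFFORT[model_tier]
    if EFFORT_ORDER.index(requested) > EFFORT_ORDER.index(ceiling):
        return ceiling
    return requested


def uses_extended_thinking(model_tier: str, requested: str) -> bool:
    """Extended thinking applies only when the capped effort is still max."""
    return cap_effort(model_tier, requested) == "max"


print(cap_effort("haiku", "max"))              # high
print(cap_effort("sonnet", "max"))             # max
print(uses_extended_thinking("haiku", "max"))  # False
```

Keeping the levels in an ordered list makes the comparison explicit and avoids relying on string ordering.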

Credit Cost

Praxiom uses a cost-proportional credit system: 1 credit = $0.08 USD of Anthropic API spend. Your credit charge is determined by the actual token usage of each operation — more expensive models and higher effort levels consume more credits.

Approximate credit ranges by model and effort:

| Combination   | Est. Credits | Best For                               |
|---------------|--------------|----------------------------------------|
| Haiku + low   | ~0.5–1       | Quick lookups, status checks           |
| Haiku + high  | ~1–2         | Simple synthesis, basic queries        |
| Sonnet + low  | ~1–2         | Fast chat responses                    |
| Sonnet + high | ~2–4         | Standard synthesis and recommendations |
| Sonnet + max  | ~3–7         | Deep analysis with extended thinking   |
| Opus + low    | ~2–5         | Complex queries, quick turnaround      |
| Opus + high   | ~6–15        | Thorough multi-source synthesis        |
| Opus + max    | ~12–32       | Maximum depth research and strategy    |

Use GET /api/billing/estimate to get a precise low/high range before running an operation.

Failed runs are charged at 25% of actual cost (minimum 0.25 credits), and cancelled runs are charged at 50% of actual cost. Chat messages have a flat floor of 0.25 credits.
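To make the arithmetic concrete, here is a rough sketch of how a charge could be derived from API spend. The function names and outcome labels are hypothetical. The sketch also assumes the 0.25-credit minimum applies only to failed runs and chat messages, since those are the only cases this guide states; actual billing may differ.

```python
CREDIT_USD = 0.08   # 1 credit = $0.08 USD of Anthropic API spend
MIN_CREDITS = 0.25  # minimum quoted for failed runs and chat messages


def credits_for_spend(usd_spend: float) -> float:
    """Convert actual API spend in USD to credits."""
    return usd_spend / CREDIT_USD


def charge(usd_spend: float, outcome: str = "completed", is_chat: bool = False) -> float:
    """Estimate the credit charge for one operation.

    outcome is a hypothetical label: "completed", "failed", or "cancelled".
    """
    full = credits_for_spend(usd_spend)
    if outcome == "failed":
        # Failed runs: 25% of actual cost, never below 0.25 credits.
        return max(full * 0.25, MIN_CREDITS)
    if outcome == "cancelled":
        # Cancelled runs: 50% of actual cost (no minimum is stated).
        return full * 0.50
    if outcome != "completed":
        raise ValueError(f"unknown outcome: {outcome!r}")
    # Assumption: the chat floor applies to completed chat messages.
    return max(full, MIN_CREDITS) if is_chat else full


# $0.40 of API spend is about 5 credits when the run completes.
print(round(charge(0.40), 2))
print(round(charge(0.40, outcome="failed"), 2))
print(round(charge(0.40, outcome="cancelled"), 2))
```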

Extended Thinking

When effort is set to max on Sonnet or Opus, the agent uses extended thinking — a mode where the model reasons through complex problems step-by-step before responding. This produces:

  • Deeper cross-referencing of research sources
  • More nuanced recommendations with better rationale
  • More comprehensive documents with stronger evidence chains

Extended thinking blocks are visible in the chat UI as collapsible "thinking" sections, giving you transparency into the agent's reasoning process.

Large Context Window

Sonnet and Opus support the 1M token context window for synthesis, drafting, and full pipeline workflows. This allows the agent to process significantly more research material in a single operation — useful when synthesizing 10+ lengthy interview transcripts or drafting documents from extensive source material.

Large context is enabled automatically for synthesis, drafting, and full pipeline workflows on supported models. You don't need to configure anything.
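The automatic rule can be read as a simple predicate over model tier and workflow. The sketch below is only an illustration: the tier and workflow identifiers are hypothetical labels, not confirmed API values.

```python
# Tiers and workflows that get the 1M-token context window, per this guide.
# The string identifiers are illustrative labels, not confirmed API values.
LARGE_CONTEXT_TIERS = {"sonnet", "opus"}
LARGE_CONTEXT_WORKFLOWS = {"synthesis", "drafting", "full_pipeline"}


def large_context_enabled(model_tier: str, workflow: str) -> bool:
    """True when both the model tier and the workflow qualify for large context."""
    return model_tier in LARGE_CONTEXT_TIERS and workflow in LARGE_CONTEXT_WORKFLOWS


print(large_context_enabled("sonnet", "synthesis"))  # True
print(large_context_enabled("haiku", "synthesis"))   # False
print(large_context_enabled("opus", "chat"))         # False
```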

Setting Model and Effort

In the UI

Use the model selector and reasoning mode toggle in the chat panel to switch between models and effort levels.

Via API

Pass model and reasoning_mode (or effort) in the streaming request:

{
  "workspace_id": "your-workspace-uuid",
  "prompt": "Synthesize all interview sources",
  "model": "claude-sonnet-4-20250514",
  "reasoning_mode": "deep",
  "effort": "max"
}

The reasoning_mode field accepts "fast", "thorough", or "deep". The effort field accepts "low", "medium", "high", or "max".
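A small client-side helper can catch invalid values before a request is sent. The sketch below only builds and validates the JSON body. The helper name is hypothetical, and it makes no assumption about the endpoint URL or authentication, which this guide does not specify.

```python
import json
from typing import Optional

# Accepted values as documented above.
REASONING_MODES = {"fast", "thorough", "deep"}
EFFORTS = {"low", "medium", "high", "max"}


def build_payload(
    workspace_id: str,
    prompt: str,
    model: str,
    reasoning_mode: Optional[str] = None,
    effort: Optional[str] = None,
) -> str:
    """Return a JSON request body, rejecting undocumented field values."""
    if reasoning_mode is not None and reasoning_mode not in REASONING_MODES:
        raise ValueError(f"reasoning_mode must be one of {sorted(REASONING_MODES)}")
    if effort is not None and effort not in EFFORTS:
        raise ValueError(f"effort must be one of {sorted(EFFORTS)}")

    payload = {"workspace_id": workspace_id, "prompt": prompt, "model": model}
    # Include only the fields the caller actually set.
    if reasoning_mode is not None:
        payload["reasoning_mode"] = reasoning_mode
    if effort is not None:
        payload["effort"] = effort
    return json.dumps(payload)


body = build_payload(
    "your-workspace-uuid",
    "Synthesize all interview sources",
    "claude-sonnet-4-20250514",
    effort="max",
)
print(body)
```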

Workflow Defaults

Each workflow has a default effort level and a minimum effort level (a floor) to ensure quality:

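| Workflow        | Default Effort | Minimum |
|-----------------|----------------|---------|
| Synthesis       | high           | high    |
| Recommendations | high           | high    |
| Drafting        | high           | high    |
| Chat            | medium         | low     |
| Full Pipeline   | high           | high    |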

You can always increase effort above the default. A request below the workflow's minimum is automatically raised to that minimum.
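The default-and-floor behavior can be sketched as a lookup followed by a comparison. The workflow keys and the function name are hypothetical; the default and minimum values are taken from the Workflow Defaults table.

```python
# Effort levels in ascending order.
EFFORT_ORDER = ["low", "medium", "high", "max"]

# (default effort, minimum effort) per workflow, from the Workflow Defaults table.
# The workflow keys are illustrative labels, not confirmed API values.
WORKFLOW_EFFORT = {
    "synthesis": ("high", "high"),
    "recommendations": ("high", "high"),
    "drafting": ("high", "high"),
    "chat": ("medium", "low"),
    "full_pipeline": ("high", "high"),
}


def resolve_effort(workflow: str, requested=None) -> str:
    """Apply the workflow default when nothing is requested, else enforce the floor."""
    default, minimum = WORKFLOW_EFFORT[workflow]
    if requested is None:
        return default
    if EFFORT_ORDER.index(requested) < EFFORT_ORDER.index(minimum):
        return minimum  # requests below the floor are raised to it
    return requested


print(resolve_effort("synthesis", "low"))  # high
print(resolve_effort("chat"))              # medium
print(resolve_effort("chat", "low"))       # low
print(resolve_effort("drafting", "max"))   # max
```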

What's Next

Learn about verification and trust scoring in the Verification & Trust guide.
