Streaming (SSE)

Real-time agent responses via Server-Sent Events for all workflow types.

Overview

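Praxiom AI provides Server-Sent Events (SSE) streaming endpoints for real-time agent responses. Instead of waiting for a blocking response, the frontend opens a streaming connection and receives typed events as the agent executes. The streaming endpoints accept POST requests with a bearer token, which the browser's native EventSource API cannot send (it only issues GET requests and cannot set an Authorization header), so the example on this page reads the stream with fetch.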

Timeout: 300 seconds per query.

Token batching: Text tokens are batched every 50ms to avoid excessive re-renders.
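The page does not say whether the 50ms batching happens on the server or in the frontend. If a client needs to do something similar itself, the idea can be sketched as below. `createTokenBatcher`, its parameters, and the injectable `schedule` argument are hypothetical names for illustration, not part of the Praxiom API.

```javascript
// Hypothetical helper: collects token text and flushes it at most once per interval.
// `schedule` defaults to setTimeout and is injectable so the logic can be tested.
function createTokenBatcher(onFlush, intervalMs = 50, schedule = setTimeout) {
  let pending = "";
  let scheduled = false;

  function flush() {
    scheduled = false;
    if (pending === "") return; // nothing buffered, skip the render
    const text = pending;
    pending = "";
    onFlush(text);
  }

  return {
    // Call for every `token` event; only the first push in a window schedules a flush.
    push(content) {
      pending += content;
      if (!scheduled) {
        scheduled = true;
        schedule(flush, intervalMs);
      }
    },
    // Call on `done` or `error` so trailing text is not lost.
    flush,
  };
}
```

With this shape, many `token` events arriving within one window produce a single UI update containing the concatenated text.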

Event Types

All streaming endpoints emit the following event types:

Event	Description
`token`	Text chunk from the agent (for incremental display)
`thinking`	Agent reasoning / chain-of-thought block (collapsible in UI)
`tool_use`	Tool invocation started (name, status, input summary)
`tool_result`	Tool execution completed (name, output summary, duration, RQS metadata)
`turn`	Agentic turn boundary marker (turn number, total tools)
`progress`	Pipeline progress update (step, percentage, message)
`skill_activated`	Skill activated for this response (skill slug, name, instructions)
`skill_suggestion`	Uninstalled skill suggested for the query (slug, name, reason)
`briefing`	Conversation resumption context (delta, active counts, memory)
`creation`	Entity created during stream (insights, recommendations, or documents with inline result cards)
`validation_sync`	Synchronous trust verification results (citation checks, cross-source, severity)
`mission_proposed`	Complex query decomposed into a multi-agent mission proposal
`agent_thread_start`	A mission subtask (agent) has begun execution
`agent_thread_done`	A mission subtask has completed with artifacts and duration
`harness_subtask_start`	Detailed harness-level subtask lifecycle start
`harness_subtask_done`	Detailed harness-level subtask lifecycle completion
`harness_complexity`	Pre-flight query complexity classification (simple / medium / complex)
`harness_plan_selected`	Pre-flight selected execution plan for COMPLEX queries
`harness_stream_alert`	Real-time quality alert detected mid-stream (500 / 2000 / 5000 chars)
`harness_swarm_activated`	Specialist swarm activated for high-source-count synthesis
`harness_swarm_complete`	Specialist swarm + aggregator finished
`harness_contract`	Post-flight contract evaluation result (pass/fail, score, issues)
`harness_verifier`	Independent verifier quality score and feedback
`harness_workspace_profile`	Workspace maturity + historical quality profile
`harness_retry`	Retry triggered between attempts (healing prompt built from prior issues)
`harness_degraded`	All retries exhausted without reaching the quality threshold
`harness_iteration_complete`	Deep-synthesis iteration run finished
`context_metrics`	Tool-result context consumption summary (emitted just before `done`)
`milestone_unlocked`	Workspace maturity milestone reached (milestone name, credits granted)
`error`	Error during execution
`done`	Stream complete (message ID, summary, model, cost, tokens, credits, optional mission_id)
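With this many event types, a client will typically route each one to its own handler and ignore types it does not know. A minimal sketch of that routing follows; `createDispatcher` and the handler map are hypothetical names, not something the API provides.

```javascript
// Hypothetical dispatcher: routes a parsed event to a handler keyed by event type.
// Unknown types go to `onUnknown` so newly added event types do not break the client.
function createDispatcher(handlers, onUnknown = () => {}) {
  return function dispatch(eventType, data) {
    const known = Object.prototype.hasOwnProperty.call(handlers, eventType);
    if (known && typeof handlers[eventType] === "function") {
      handlers[eventType](data);
      return true;
    }
    onUnknown(eventType, data);
    return false;
  };
}
```

A caller might register handlers for `token`, `error`, and `done` first and add others (for example `tool_use` or `creation`) as the UI grows.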

Done Event Detail

The `done` event includes execution metrics. When a mission completes, `mission_id` is included:

{
  "message_id": "uuid",
  "summary": "Identified top 3 pain points",
  "model": "claude-sonnet-4-20250514",
  "num_turns": 3,
  "tool_count": 5,
  "duration_ms": 12400,
  "credits": 2,
  "mission_id": "uuid-or-null"
}

Creation Event Detail

When the agent saves insights, recommendations, or documents, a `creation` event is emitted with inline result cards:

{
  "type": "insights",
  "ids": ["uuid-1", "uuid-2"],
  "count": 5,
  "preview": [
    {"id": "uuid-1", "title": "Onboarding drop-off", "severity": "high"}
  ]
}
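In the sample payload, `count` is larger than the number of `preview` entries, so a UI probably needs to show how many items are not previewed. The sketch below assumes `count` is the total number of created entities and `preview` is a possibly shorter list of cards; that reading is an inference from the example, and `describeCreation` is a hypothetical helper.

```javascript
// Hypothetical helper: builds display data for a `creation` event.
// Assumes `count` is the total created and `preview` may list only some of them.
function describeCreation(event) {
  const preview = Array.isArray(event.preview) ? event.preview : [];
  return {
    label: `${event.count} ${event.type} created`,
    titles: preview.map((card) => card.title),
    hidden: Math.max(0, event.count - preview.length), // created but not previewed
  };
}
```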

Chat Stream

POST /api/stream/chat

Stream a chat response from the AI copilot.

Request Body

Name	Type	Required	Description
`workspace_id`	UUID	Yes	Target workspace
`prompt`	string	Yes	User message (1-10,000 characters)
`conversation_id`	UUID	No	Existing conversation to continue
`session_id`	string	No	Session identifier
`max_turns`	integer	No	Maximum agent turns (1-30)
`mode`	string	No	`"plan"` or `"research"`
`reasoning_mode`	string	No	`"fast"`, `"thorough"`, or `"deep"`
`model`	string	No	Claude model ID override
`attachment_ids`	UUID[]	No	File attachment IDs (max 10)
`language`	string	No	ISO language code for response
`effort`	string	No	Direct effort override: `"low"`, `"medium"`, `"high"`, `"max"`
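The limits in this table can be checked on the client before opening a stream. The following is a sketch only: `validateChatRequest` is a hypothetical helper, the server remains the source of truth, and string length here counts UTF-16 code units, which may differ slightly from how the server counts characters.

```javascript
// Hypothetical client-side check mirroring the documented request limits.
// Returns a list of problems; an empty list means the body looks valid.
function validateChatRequest(body) {
  const problems = [];
  if (!body.workspace_id) problems.push("workspace_id is required");
  if (typeof body.prompt !== "string" || body.prompt.length < 1 || body.prompt.length > 10000) {
    problems.push("prompt must be 1-10,000 characters");
  }
  if (body.max_turns !== undefined &&
      (!Number.isInteger(body.max_turns) || body.max_turns < 1 || body.max_turns > 30)) {
    problems.push("max_turns must be an integer from 1 to 30");
  }
  if (body.attachment_ids !== undefined &&
      (!Array.isArray(body.attachment_ids) || body.attachment_ids.length > 10)) {
    problems.push("attachment_ids accepts at most 10 IDs");
  }
  // Enumerated string fields and their documented values.
  const enums = {
    mode: ["plan", "research"],
    reasoning_mode: ["fast", "thorough", "deep"],
    effort: ["low", "medium", "high", "max"],
  };
  for (const [field, allowed] of Object.entries(enums)) {
    if (body[field] !== undefined && !allowed.includes(body[field])) {
      problems.push(`${field} must be one of: ${allowed.join(", ")}`);
    }
  }
  return problems;
}
```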

Example Request

const response = await fetch("https://api.praxiomai.xyz/api/stream/chat", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${token}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    workspace_id: "550e8400-e29b-41d4-a716-446655440000",
    prompt: "What are the top 3 pain points from our user interviews?",
    conversation_id: "a7b8c9d0-e1f2-3456-abcd-789012345678",
    reasoning_mode: "thorough",
  }),
});

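if (!response.ok) {
  throw new Error(`Stream request failed: ${response.status}`);
}

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  // stream: true keeps multi-byte characters intact across chunk boundaries
  buffer += decoder.decode(value, { stream: true });

  // A chunk can end mid-line, so parse only complete lines and keep the rest buffered
  const lines = buffer.split("\n");
  buffer = lines.pop();
  for (const line of lines) {
    if (line.startsWith("data: ")) {
      const data = JSON.parse(line.slice(6));
      console.log(data);
    }
  }
}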

Response (SSE Stream)

event: progress
data: {"step": "context_assembly", "percentage": 10, "message": "Loading workspace context..."}

event: thinking
data: {"content": "Let me analyze the research sources..."}

event: tool_use
data: {"name": "get_workspace_context", "status": "running", "input_summary": "workspace_id=550e..."}

event: tool_result
data: {"name": "get_workspace_context", "status": "done", "output_summary": "12 sources, 45 insights", "duration_ms": 230}

event: token
data: {"content": "Based on "}

event: token
data: {"content": "your research, the top 3 pain points are:\n\n"}

event: creation
data: {"type": "insights", "ids": ["d1e2f3..."], "count": 3, "preview": [{"id": "d1e2f3...", "title": "Onboarding drop-off at step 3", "severity": "high"}]}

event: done
data: {"message_id": "c9d0e1f2-...", "summary": "Identified top 3 pain points", "model": "claude-sonnet-4-20250514", "num_turns": 2, "tool_count": 3, "duration_ms": 8500, "credits": 2}
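Each frame in the stream above pairs an `event:` line with a `data:` line and ends with a blank line. A client that needs the event type as well as the payload can parse whole frames rather than individual `data:` lines. The sketch below assumes every `data` payload is JSON, as in the sample stream; `createSseParser` is a hypothetical helper.

```javascript
// Hypothetical parser: turns raw SSE text into { event, data } objects.
// Feed it decoded text chunks; partial frames are buffered between calls.
function createSseParser(onEvent) {
  let buffer = "";
  return function feed(chunk) {
    // Normalize CRLF on the whole buffer so a split "\r" + "\n" is still handled.
    buffer = (buffer + chunk).replace(/\r\n/g, "\n");
    let boundary;
    while ((boundary = buffer.indexOf("\n\n")) !== -1) {
      const frame = buffer.slice(0, boundary);
      buffer = buffer.slice(boundary + 2);
      let event = "message"; // SSE default when a frame has no `event:` field
      const dataLines = [];
      for (const line of frame.split("\n")) {
        if (line.startsWith("event:")) {
          event = line.slice(6).trim();
        } else if (line.startsWith("data:")) {
          dataLines.push(line.slice(5).replace(/^ /, ""));
        }
      }
      if (dataLines.length > 0) {
        onEvent({ event, data: JSON.parse(dataLines.join("\n")) });
      }
    }
  };
}
```

Pass each decoded chunk from the response reader to `feed`; the callback fires once per complete frame, even when a frame is split across network chunks.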

Synthesis Stream

POST /api/stream/synthesis

Stream a synthesis operation with real-time progress events.

Request Body

Name	Type	Required	Description
`workspace_id`	UUID	Yes	Target workspace
`prompt`	string	Yes	Synthesis instructions
`source_ids`	UUID[]	Yes	Research source IDs (1-20)
`synthesis_type`	string	No	Default: `"comprehensive"`
`conversation_id`	UUID	No	Conversation to log to
`max_turns`	integer	No	Maximum agent turns (1-30)

Recommendation Stream

POST /api/stream/recommendation

Stream a recommendation generation with real-time progress.

Request Body

Name	Type	Required	Description
`workspace_id`	UUID	Yes	Target workspace
`prompt`	string	Yes	Generation instructions
`insight_ids`	UUID[]	Yes	Insight IDs (1-20)
`focus_areas`	string[]	No	Focus areas to guide generation
`conversation_id`	UUID	No	Conversation to log to
`max_turns`	integer	No	Maximum agent turns (1-30)

Draft Stream

POST /api/stream/draft

Stream a document drafting operation.

Request Body

Name	Type	Required	Description
`workspace_id`	UUID	Yes	Target workspace
`prompt`	string	Yes	Drafting instructions
`document_type`	string	No	Default: `"prd"`
`recommendation_id`	UUID	No	Source recommendation
`user_instructions`	string	No	Additional instructions
`conversation_id`	UUID	No	Conversation to log to
`max_turns`	integer	No	Maximum agent turns (1-30)

Pipeline Stream

POST /api/stream/pipeline

Stream a multi-step pipeline operation (synthesis + recommendations + drafting in sequence).

Request Body

Uses the StreamRequest base schema:

Name	Type	Required	Description
`workspace_id`	UUID	Yes	Target workspace
`prompt`	string	Yes	Pipeline instructions
`conversation_id`	UUID	No	Conversation to log to
`max_turns`	integer	No	Maximum agent turns (1-30)
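The five stream endpoints share the `/api/stream/` prefix, so one helper can build the request for any of them. This sketch assumes the other endpoints use the same host and bearer-token headers as the chat example, which this page does not state explicitly; `buildStreamRequest` is a hypothetical name.

```javascript
// Base URL taken from the chat example; the helper itself is hypothetical.
const BASE_URL = "https://api.praxiomai.xyz";
const STREAM_WORKFLOWS = ["chat", "synthesis", "recommendation", "draft", "pipeline"];

// Returns the URL and fetch options for a stream endpoint without sending anything.
function buildStreamRequest(workflow, token, body) {
  if (!STREAM_WORKFLOWS.includes(workflow)) {
    throw new Error(`Unknown stream workflow: ${workflow}`);
  }
  return {
    url: `${BASE_URL}/api/stream/${workflow}`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}
```

The result can be passed straight to `fetch(url, init)`, keeping the stream-reading loop identical across workflows.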

Mission Events

Emitted when a query is classified as COMPLEX and decomposed into a multi-agent mission. See the Missions guide for the full execution flow.

`mission_proposed`

{
  "event": "mission_proposed",
  "data": {
    "mission_id": "550e8400-e29b-41d4-a716-446655440000",
    "subtasks": [
      {
        "index": 0,
        "title": "Research market trends",
        "workflow_type": "synthesis",
        "depends_on": []
      },
      {
        "index": 1,
        "title": "Generate recommendations",
        "workflow_type": "recommendation",
        "depends_on": [0]
      }
    ],
    "agent_count": 2,
    "parallel_count": 1
  }
}
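A client can use `depends_on` to draw the proposed mission as ordered stages. The sketch below groups subtasks into waves in which every subtask depends only on earlier waves. It assumes `depends_on` holds the `index` values of other subtasks, as the example suggests, and it says nothing about how the server actually schedules agents. `planWaves` is a hypothetical helper.

```javascript
// Hypothetical helper: groups subtasks into waves; each wave depends only on earlier waves.
function planWaves(subtasks) {
  const finished = new Set();
  let remaining = [...subtasks];
  const waves = [];
  while (remaining.length > 0) {
    // A subtask is ready once all of its dependencies are in earlier waves.
    const ready = remaining.filter((t) => t.depends_on.every((d) => finished.has(d)));
    if (ready.length === 0) {
      throw new Error("Cyclic or unresolved depends_on");
    }
    waves.push(ready.map((t) => t.index));
    for (const t of ready) finished.add(t.index);
    remaining = remaining.filter((t) => !finished.has(t.index));
  }
  return waves;
}
```

For the payload above this yields two waves: subtask 0 first, then subtask 1.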

`agent_thread_start`

Emitted when a subtask begins execution.

{
  "event": "agent_thread_start",
  "data": {
    "mission_id": "uuid",
    "subtask_index": 0,
    "agent_name": "Research market trends",
    "workflow_type": "synthesis"
  }
}

`agent_thread_done`

Emitted when a subtask completes.

{
  "event": "agent_thread_done",
  "data": {
    "mission_id": "uuid",
    "subtask_index": 0,
    "artifact_summary": "Created 4 insights on market positioning",
    "artifact_counts": {
      "insights": 4,
      "documents": 1
    },
    "duration_ms": 34200
  }
}
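Because `agent_thread_start` and `agent_thread_done` share `subtask_index`, a client can fold them into per-subtask state for a progress view. A minimal sketch, with `createMissionTracker` and its state fields as hypothetical names:

```javascript
// Hypothetical tracker: folds agent_thread_* events into per-subtask state.
function createMissionTracker() {
  const subtasks = new Map(); // subtask_index -> state object

  return {
    handle(eventType, data) {
      if (eventType === "agent_thread_start") {
        subtasks.set(data.subtask_index, {
          name: data.agent_name,
          workflow: data.workflow_type,
          status: "running",
        });
      } else if (eventType === "agent_thread_done") {
        // Keep name/workflow from the start event if it was seen.
        const previous = subtasks.get(data.subtask_index) || {};
        subtasks.set(data.subtask_index, {
          ...previous,
          status: "done",
          summary: data.artifact_summary,
          durationMs: data.duration_ms,
        });
      }
    },
    runningCount() {
      return [...subtasks.values()].filter((s) => s.status === "running").length;
    },
    get(index) {
      return subtasks.get(index);
    },
  };
}
```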

`harness_subtask_start`

Detailed harness-level lifecycle event for subtask start.

{
  "event": "harness_subtask_start",
  "data": {
    "subtask_index": 0,
    "total_subtasks": 3,
    "title": "Research market trends",
    "workflow_type": "synthesis",
    "parallel": false
  }
}

`harness_subtask_done`

Detailed harness-level lifecycle event for subtask completion.

{
  "event": "harness_subtask_done",
  "data": {
    "subtask_index": 0,
    "total_subtasks": 3,
    "artifacts_created": 5,
    "is_last": false
  }
}

Milestone Events

`milestone_unlocked`

Emitted when the workspace reaches a maturity milestone during an agent run. The frontend displays a toast notification with the milestone name and credits granted.

{
  "event": "milestone_unlocked",
  "data": {
    "milestone": "engagement",
    "credits_granted": 5.0,
    "message": "Milestone unlocked: engagement — 5.0 credits added"
  }
}

Milestones are one-time events. See Trial Milestone System for the full list.

Harness Quality Events

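Emitted by the quality harness around each agent run. Most of these events arrive post-flight (contract, verifier, workspace profile); `harness_complexity` and `harness_plan_selected` arrive pre-flight, and `harness_stream_alert` arrives mid-stream.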

`harness_contract`

Result of the algorithmic contract evaluation — checks whether the agent met minimum output requirements for the workflow type.

{
  "event": "harness_contract",
  "data": {
    "passed": true,
    "score": 0.87,
    "issues": [],
    "workflow_type": "synthesis"
  }
}

When the contract fails, `issues` lists the specific requirements that were not met:

{
  "event": "harness_contract",
  "data": {
    "passed": false,
    "score": 0.45,
    "issues": [
      "Minimum 3 insights required, found 1",
      "Citations required for synthesis workflow"
    ],
    "workflow_type": "synthesis"
  }
}

`harness_verifier`

Result of the independent verification agent (a separate Haiku call with fresh context).

{
  "event": "harness_verifier",
  "data": {
    "passed": true,
    "score": 0.82,
    "issues": ["Response could include more cross-source triangulation"],
    "feedback": "Good analysis with strong citations. Could benefit from comparing across more sources."
  }
}

`harness_stream_alert`

Real-time quality alert detected mid-stream (before completion). Used for early warning of potential issues.

{
  "event": "harness_stream_alert",
  "data": {
    "severity": "warning",
    "checkpoint": 500,
    "issue": "Response appears to be a refusal or very short answer",
    "suggestion": "Agent may lack necessary context — consider adding research sources"
  }
}

Field	Values
`severity`	`"warning"` or `"critical"`
`checkpoint`	Character position where alert was detected (500, 2000, 5000)

`harness_complexity`

Emitted in pre-flight after the Haiku complexity classifier returns.

{
  "event": "harness_complexity",
  "data": {
    "level": "complex",
    "reason": "Query requires sequential phases: analyze interviews, generate recs, draft PRD.",
    "subtask_count": 3
  }
}

`level` is `"simple"`, `"medium"`, or `"complex"`. `subtask_count` is non-zero only when `level` is `"complex"`.

`harness_plan_selected`

Emitted for COMPLEX queries after PlanGenerator produces 2-3 candidate plans and PlanSelector picks the highest-scoring one (deterministic scoring — no LLM call).

{
  "event": "harness_plan_selected",
  "data": {
    "plan": {
      "plan_id": "a1b2c3d4",
      "title": "Evidence-first synthesis",
      "description": "Analyze sources by theme, then synthesize cross-source insights",
      "suggested_workflow": "synthesis",
      "prompt_modifier": "Focus on triangulating claims across 3+ sources before...",
      "estimated_turns": 6,
      "subtasks": [],
      "rationale": "Query emphasizes evidence quality over speed"
    },
    "score": {
      "coverage": 0.82,
      "efficiency": 0.70,
      "feasibility": 0.80,
      "alignment": 1.00,
      "total": 0.818
    },
    "candidates_count": 3
  }
}

`harness_swarm_activated`

Emitted when the specialist swarm starts for high-source-count synthesis.

{
  "event": "harness_swarm_activated",
  "data": {
    "source_count": 12,
    "specialist_count": 4,
    "specialists": ["themes", "evidence", "contradictions", "opportunities"]
  }
}

`harness_swarm_complete`

Emitted when all specialists + aggregator have finished.

{
  "event": "harness_swarm_complete",
  "data": {
    "specialist_count": 4,
    "specialists_succeeded": 4,
    "aggregator_succeeded": true,
    "total_duration_ms": 48200,
    "estimated_cost_usd": 0.0412
  }
}

`harness_retry`

Emitted between attempts when RetryPolicy.should_retry returns True. The next attempt uses a healing prompt built from the prior attempt's contract + verifier issues.

{
  "event": "harness_retry",
  "data": {
    "attempt": 2,
    "reason": "Contract failed (gap=0.35) — retrying with healing prompt",
    "previous_score": 0.35
  }
}

`attempt` is 1-based and refers to the attempt that is about to start (so `attempt: 2` means the first retry). `previous_score` is the combined score, `(contract_passed + verifier_score) / 2.0`, of the attempt that just finished.
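The combined score can be reproduced as follows. This sketch assumes `contract_passed` contributes 1 when true and 0 when false, which the formula implies but the page does not spell out. Under that assumption, a failed contract with a verifier score of 0.70 would give the 0.35 seen in the example, though the payload does not say which inputs produced it.

```javascript
// Assumption: contract_passed counts as 1 when true and 0 when false.
function combinedScore(contractPassed, verifierScore) {
  return ((contractPassed ? 1 : 0) + verifierScore) / 2.0;
}

// Example: failed contract, verifier score 0.70.
const exampleScore = combinedScore(false, 0.7); // 0.35
```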

`harness_degraded`

Emitted when all retry attempts are exhausted without passing the quality threshold.

{
  "event": "harness_degraded",
  "data": {
    "attempts_made": 3,
    "best_score": 0.52,
    "workflow_type": "synthesis",
    "issues": [
      "Insufficient insights: 2 created (minimum 3)",
      "Response could include more cross-source triangulation"
    ]
  }
}

`harness_iteration_complete`

Emitted after deep-synthesis iteration runs (multi-pass refinement). Summarizes the best result across all iterations.

{
  "event": "harness_iteration_complete",
  "data": {
    "iterations": 4,
    "best_score": 8.2,
    "accepted_count": 3,
    "total_duration_ms": 62400,
    "estimated_cost_usd": 0.0583
  }
}

`harness_workspace_profile`

Emitted once per post-flight (when telemetry history exists) to surface the workspace's current quality profile.

{
  "event": "harness_workspace_profile",
  "data": {
    "maturity": "developing",
    "total_runs": 23,
    "avg_quality": 0.742,
    "feedback_rate": 0.857,
    "common_issues": [
      "Insufficient insights: 1 created (minimum 3)"
    ],
    "quality_multiplier": 1.0
  }
}

`maturity` is `"new"` (0-9 runs), `"developing"` (10-49), or `"mature"` (50+). `quality_multiplier` scales contract thresholds: 0.85 for new, 1.00 for developing, and up to 1.15 for mature workspaces with `avg_quality` > 0.75.
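The run-count bands translate directly into a lookup, sketched below. Only the two fixed multipliers are encoded. The mature tier is described as "up to 1.15" with no formula, so the sketch returns `null` there instead of guessing. Both function names are hypothetical.

```javascript
// Maps a run count to the documented maturity band.
function classifyMaturity(totalRuns) {
  if (totalRuns >= 50) return "mature";
  if (totalRuns >= 10) return "developing";
  return "new";
}

// Documented multipliers for the two fixed tiers only.
// The mature tier's exact formula is not given here, so this returns null for it.
function baseQualityMultiplier(maturity) {
  if (maturity === "new") return 0.85;
  if (maturity === "developing") return 1.0;
  return null;
}
```

For the sample payload, `classifyMaturity(23)` gives `"developing"`, which matches the `quality_multiplier` of 1.0 shown there.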

`context_metrics`

Emitted just before the `done` event. Summarizes how much tool-result context was consumed during the run, which is useful for debugging context-window pressure.

{
  "event": "context_metrics",
  "data": {
    "total_tool_calls": 8,
    "cumulative_result_chars": 52400,
    "estimated_tool_result_tokens": 13100,
    "largest_result": ["get_workspace_context", 18200],
    "calls_per_tool": {
      "get_workspace_context": 1,
      "search_sources": 4,
      "save_insights": 3
    }
  }
}

`estimated_tool_result_tokens` is a rough estimate at ~4 chars per token. `largest_result` is a `[tool_name, char_count]` tuple for the single largest tool result.
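The numbers in the sample payload follow from the 4-chars-per-token rule: 52400 / 4 = 13100. The sketch below reproduces that estimate and also unpacks the `largest_result` tuple to show what share of the context one tool consumed. The rounding mode and both helper names are assumptions for illustration.

```javascript
// Rough token estimate at ~4 characters per token; rounding is an assumption.
function estimateToolResultTokens(cumulativeChars, charsPerToken = 4) {
  return Math.round(cumulativeChars / charsPerToken);
}

// Unpacks the [tool_name, char_count] tuple and computes its share of all result chars.
function largestResultShare(metrics) {
  const [tool, chars] = metrics.largest_result;
  return { tool, chars, share: chars / metrics.cumulative_result_chars };
}
```

In the sample payload, `get_workspace_context` accounts for roughly a third of all tool-result characters, which points to it as the first place to look when context pressure is a concern.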
