Streaming (SSE)

Real-time agent responses via Server-Sent Events for all workflow types.

Overview

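Praxiom AI provides Server-Sent Events (SSE) streaming endpoints for real-time agent responses. Instead of waiting for a blocking response, the frontend opens a streaming connection and receives typed events as the agent executes. The endpoints below accept POST requests, and the browser EventSource API only issues GET requests, so the chat example reads the stream with fetch and a stream reader instead.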

Timeout: 300 seconds per query.

Token batching: Text tokens are batched every 50ms to avoid excessive re-renders.
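
If a client wants to apply the same kind of batching on its own side, a minimal sketch could look like the following. `createTokenBatcher` and its injectable `schedule` parameter are hypothetical names and not part of the Praxiom API; only the 50 ms interval comes from the text above.

```javascript
// Hypothetical helper: collects token text and flushes it at most once per interval.
// `schedule` defaults to setTimeout and is injectable so the batcher can be tested.
function createTokenBatcher(onFlush, intervalMs = 50, schedule = setTimeout) {
  let pending = "";
  let timerActive = false;

  function flush() {
    timerActive = false;
    if (pending === "") return; // nothing buffered, so do not trigger a render
    const text = pending;
    pending = "";
    onFlush(text);
  }

  return {
    // Append a token; start a flush timer only if none is running.
    push(text) {
      pending += text;
      if (!timerActive) {
        timerActive = true;
        schedule(flush, intervalMs);
      }
    },
    // Flush immediately, for example when the done event arrives.
    flush,
  };
}
```

A UI would call `push` for every token event and `flush` once when the stream ends, so the last partial batch is not lost.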


Event Types

All streaming endpoints emit the following event types:

| Event | Description |
| --- | --- |
| token | Text chunk from the agent (for incremental display) |
| thinking | Agent reasoning / chain-of-thought block (collapsible in UI) |
| tool_use | Tool invocation started (name, status, input summary) |
| tool_result | Tool execution completed (name, output summary, duration, RQS metadata) |
| turn | Agentic turn boundary marker (turn number, total tools) |
| progress | Pipeline progress update (step, percentage, message) |
| skill_activated | Skill activated for this response (skill slug, name, instructions) |
| skill_suggestion | Uninstalled skill suggested for the query (slug, name, reason) |
| briefing | Conversation resumption context (delta, active counts, memory) |
| creation | Entity created during stream (insights, recommendations, or documents with inline result cards) |
| validation_sync | Synchronous trust verification results (citation checks, cross-source, severity) |
| mission_proposed | Complex query decomposed into a multi-agent mission proposal |
| agent_thread_start | A mission subtask (agent) has begun execution |
| agent_thread_done | A mission subtask has completed with artifacts and duration |
| harness_subtask_start | Detailed harness-level subtask lifecycle start |
| harness_subtask_done | Detailed harness-level subtask lifecycle completion |
| harness_complexity | Pre-flight query complexity classification (simple / medium / complex) |
| harness_plan_selected | Pre-flight selected execution plan for COMPLEX queries |
| harness_stream_alert | Real-time quality alert detected mid-stream (500 / 2000 / 5000 chars) |
| harness_swarm_activated | Specialist swarm activated for high-source-count synthesis |
| harness_swarm_complete | Specialist swarm + aggregator finished |
| harness_contract | Post-flight contract evaluation result (pass/fail, score, issues) |
| harness_verifier | Independent verifier quality score and feedback |
| harness_workspace_profile | Workspace maturity + historical quality profile |
| harness_retry | Retry triggered between attempts (healing prompt built from prior issues) |
| harness_degraded | All retries exhausted without reaching the quality threshold |
| harness_iteration_complete | Deep-synthesis iteration run finished |
| context_metrics | Tool-result context consumption summary (emitted just before done) |
| milestone_unlocked | Workspace maturity milestone reached (milestone name, credits granted) |
| error | Error during execution |
| done | Stream complete (message ID, summary, model, cost, tokens, credits, optional mission_id) |

Done Event Detail

The done event includes execution metrics. When a mission completes, mission_id is included:

{
  "message_id": "uuid",
  "summary": "Identified top 3 pain points",
  "model": "claude-sonnet-4-20250514",
  "num_turns": 3,
  "tool_count": 5,
  "duration_ms": 12400,
  "credits": 2,
  "mission_id": "uuid-or-null"
}

Creation Event Detail

When the agent saves insights, recommendations, or documents, a creation event is emitted with inline result cards:

{
  "type": "insights",
  "ids": ["uuid-1", "uuid-2"],
  "count": 5,
  "preview": [
    {"id": "uuid-1", "title": "Onboarding drop-off", "severity": "high"}
  ]
}

Chat Stream

POST /api/stream/chat

Stream a chat response from the AI copilot.

Request Body

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| workspace_id | UUID | Yes | Target workspace |
| prompt | string | Yes | User message (1-10,000 characters) |
| conversation_id | UUID | No | Existing conversation to continue |
| session_id | string | No | Session identifier |
| max_turns | integer | No | Maximum agent turns (1-30) |
| mode | string | No | "plan" or "research" |
| reasoning_mode | string | No | "fast", "thorough", or "deep" |
| model | string | No | Claude model ID override |
| attachment_ids | UUID[] | No | File attachment IDs (max 10) |
| language | string | No | ISO language code for response |
| effort | string | No | Direct effort override: "low", "medium", "high", "max" |
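
The limits in this table can be checked on the client before a request is sent. The sketch below is one way to do that; `validateChatRequest` is a hypothetical helper, and the server's own validation and error messages may differ.

```javascript
// Hypothetical client-side check. Limits and enum values come from the table above.
const MODES = ["plan", "research"];
const REASONING_MODES = ["fast", "thorough", "deep"];
const EFFORTS = ["low", "medium", "high", "max"];

function validateChatRequest(body) {
  const errors = [];
  if (!body.workspace_id) errors.push("workspace_id is required");
  if (typeof body.prompt !== "string" || body.prompt.length < 1 || body.prompt.length > 10000) {
    errors.push("prompt must be 1-10,000 characters");
  }
  if (body.max_turns !== undefined &&
      (!Number.isInteger(body.max_turns) || body.max_turns < 1 || body.max_turns > 30)) {
    errors.push("max_turns must be an integer from 1 to 30");
  }
  if (body.mode !== undefined && !MODES.includes(body.mode)) {
    errors.push("mode must be one of: " + MODES.join(", "));
  }
  if (body.reasoning_mode !== undefined && !REASONING_MODES.includes(body.reasoning_mode)) {
    errors.push("reasoning_mode must be one of: " + REASONING_MODES.join(", "));
  }
  if (body.effort !== undefined && !EFFORTS.includes(body.effort)) {
    errors.push("effort must be one of: " + EFFORTS.join(", "));
  }
  if (body.attachment_ids !== undefined &&
      (!Array.isArray(body.attachment_ids) || body.attachment_ids.length > 10)) {
    errors.push("attachment_ids must be an array of at most 10 IDs");
  }
  return errors; // an empty array means the body passed every check
}
```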

Example Request

const response = await fetch("https://api.praxiomai.xyz/api/stream/chat", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${token}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    workspace_id: "550e8400-e29b-41d4-a716-446655440000",
    prompt: "What are the top 3 pain points from our user interviews?",
    conversation_id: "a7b8c9d0-e1f2-3456-abcd-789012345678",
    reasoning_mode: "thorough",
  }),
});

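if (!response.ok || !response.body) {
  throw new Error(`Stream request failed: ${response.status}`);
}

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  // stream: true keeps multi-byte characters that span two chunks intact
  buffer += decoder.decode(value, { stream: true });

  // A chunk can end mid-line, so parse only complete lines
  // and keep the trailing partial line in the buffer
  const lines = buffer.split("\n");
  buffer = lines.pop() ?? "";
  for (const line of lines) {
    if (line.startsWith("data: ")) {
      const data = JSON.parse(line.slice(6));
      console.log(data);
    }
  }
}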

Response (SSE Stream)

event: progress
data: {"step": "context_assembly", "percentage": 10, "message": "Loading workspace context..."}

event: thinking
data: {"content": "Let me analyze the research sources..."}

event: tool_use
data: {"name": "get_workspace_context", "status": "running", "input_summary": "workspace_id=550e..."}

event: tool_result
data: {"name": "get_workspace_context", "status": "done", "output_summary": "12 sources, 45 insights", "duration_ms": 230}

event: token
data: {"content": "Based on "}

event: token
data: {"content": "your research, the top 3 pain points are:\n\n"}

event: creation
data: {"type": "insights", "ids": ["d1e2f3..."], "count": 3, "preview": [{"id": "d1e2f3...", "title": "Onboarding drop-off at step 3", "severity": "high"}]}

event: done
data: {"message_id": "c9d0e1f2-...", "summary": "Identified top 3 pain points", "model": "claude-sonnet-4-20250514", "num_turns": 2, "tool_count": 3, "duration_ms": 8500, "credits": 2}
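
The stream above pairs each `event:` line with a `data:` line and separates events with a blank line. A minimal sketch of a parser that keeps the event name alongside the parsed payload follows. `createSseParser` is a hypothetical helper, and it assumes every `data:` payload is JSON, as in the examples shown here.

```javascript
// Hypothetical incremental SSE parser: feed it decoded text chunks in any size.
function createSseParser(onEvent) {
  let buffer = "";

  function parseBlock(block) {
    let event = "message"; // SSE default when no event: field is present
    const dataLines = [];
    for (const line of block.split("\n")) {
      if (line.startsWith("event:")) {
        event = line.slice(6).trim();
      } else if (line.startsWith("data:")) {
        // Strip the single optional space after the colon
        dataLines.push(line.slice(5).replace(/^ /, ""));
      }
    }
    if (dataLines.length === 0) return; // comment or keep-alive block
    onEvent({ event, data: JSON.parse(dataLines.join("\n")) });
  }

  return {
    feed(chunk) {
      buffer = (buffer + chunk).replace(/\r\n/g, "\n");
      let boundary;
      // A blank line terminates an event; anything after it stays buffered
      while ((boundary = buffer.indexOf("\n\n")) !== -1) {
        parseBlock(buffer.slice(0, boundary));
        buffer = buffer.slice(boundary + 2);
      }
    },
  };
}
```

Inside a read loop, each decoded chunk would be passed to `feed`, and `onEvent` would route on `event` (for example, appending `data.content` for token events and closing the UI state on done).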

Synthesis Stream

POST /api/stream/synthesis

Stream a synthesis operation with real-time progress events.

Request Body

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| workspace_id | UUID | Yes | Target workspace |
| prompt | string | Yes | Synthesis instructions |
| source_ids | UUID[] | Yes | Research source IDs (1-20) |
| synthesis_type | string | No | Default: "comprehensive" |
| conversation_id | UUID | No | Conversation to log to |
| max_turns | integer | No | Maximum agent turns (1-30) |

Recommendation Stream

POST /api/stream/recommendation

Stream a recommendation generation with real-time progress.

Request Body

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| workspace_id | UUID | Yes | Target workspace |
| prompt | string | Yes | Generation instructions |
| insight_ids | UUID[] | Yes | Insight IDs (1-20) |
| focus_areas | string[] | No | Focus areas to guide generation |
| conversation_id | UUID | No | Conversation to log to |
| max_turns | integer | No | Maximum agent turns (1-30) |

Draft Stream

POST /api/stream/draft

Stream a document drafting operation.

Request Body

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| workspace_id | UUID | Yes | Target workspace |
| prompt | string | Yes | Drafting instructions |
| document_type | string | No | Default: "prd" |
| recommendation_id | UUID | No | Source recommendation |
| user_instructions | string | No | Additional instructions |
| conversation_id | UUID | No | Conversation to log to |
| max_turns | integer | No | Maximum agent turns (1-30) |

Pipeline Stream

POST /api/stream/pipeline

Stream a multi-step pipeline operation (synthesis + recommendations + drafting in sequence).

Request Body

Same as StreamRequest base schema:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| workspace_id | UUID | Yes | Target workspace |
| prompt | string | Yes | Pipeline instructions |
| conversation_id | UUID | No | Conversation to log to |
| max_turns | integer | No | Maximum agent turns (1-30) |
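
Since the five streaming endpoints differ only in path and request body, a client can build requests through one helper. The sketch below is hypothetical: `buildStreamRequest` is not part of any SDK, and it assumes the other endpoints use the same base URL and Bearer authentication as the chat example.

```javascript
// Paths are taken from the endpoint headings in this document.
const STREAM_ENDPOINTS = {
  chat: "/api/stream/chat",
  synthesis: "/api/stream/synthesis",
  recommendation: "/api/stream/recommendation",
  draft: "/api/stream/draft",
  pipeline: "/api/stream/pipeline",
};

// Hypothetical helper: returns the URL and fetch options for a streaming request.
// Assumption: every endpoint shares the base URL and auth header of the chat example.
function buildStreamRequest(workflow, body, token, baseUrl = "https://api.praxiomai.xyz") {
  if (!Object.prototype.hasOwnProperty.call(STREAM_ENDPOINTS, workflow)) {
    throw new Error(`Unknown workflow: ${workflow}`);
  }
  return {
    url: baseUrl + STREAM_ENDPOINTS[workflow],
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}
```

The result can be passed directly to fetch as `fetch(url, init)`, after which the response body is read the same way as in the chat example.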

Mission Events

Emitted when a query is classified as COMPLEX and decomposed into a multi-agent mission. See the Missions guide for the full execution flow.

mission_proposed

{
  "event": "mission_proposed",
  "data": {
    "mission_id": "550e8400-e29b-41d4-a716-446655440000",
    "subtasks": [
      {
        "index": 0,
        "title": "Research market trends",
        "workflow_type": "synthesis",
        "depends_on": []
      },
      {
        "index": 1,
        "title": "Generate recommendations",
        "workflow_type": "recommendation",
        "depends_on": [0]
      }
    ],
    "agent_count": 2,
    "parallel_count": 1
  }
}
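
The `depends_on` arrays describe which subtasks must finish before another can start. A client that wants to visualize the proposal can group subtasks into waves, as in this sketch. `planWaves` is a hypothetical helper; how the server actually schedules subtasks, and exactly what `parallel_count` measures, is not specified here.

```javascript
// Hypothetical helper: groups subtask indexes into waves whose dependencies
// are all satisfied by earlier waves.
function planWaves(subtasks) {
  const done = new Set();
  const waves = [];
  let remaining = [...subtasks];

  while (remaining.length > 0) {
    // A subtask is ready once every index in depends_on has completed
    const ready = remaining.filter((t) => t.depends_on.every((d) => done.has(d)));
    if (ready.length === 0) {
      throw new Error("Cyclic or unresolved dependency in subtasks");
    }
    waves.push(ready.map((t) => t.index));
    for (const t of ready) done.add(t.index);
    remaining = remaining.filter((t) => !done.has(t.index));
  }
  return waves;
}
```

For the proposal above, the two subtasks form two waves, because subtask 1 depends on subtask 0.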

agent_thread_start

Emitted when a subtask begins execution.

{
  "event": "agent_thread_start",
  "data": {
    "mission_id": "uuid",
    "subtask_index": 0,
    "agent_name": "Research market trends",
    "workflow_type": "synthesis"
  }
}

agent_thread_done

Emitted when a subtask completes.

{
  "event": "agent_thread_done",
  "data": {
    "mission_id": "uuid",
    "subtask_index": 0,
    "artifact_summary": "Created 4 insights on market positioning",
    "artifact_counts": {
      "insights": 4,
      "documents": 1
    },
    "duration_ms": 34200
  }
}
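
A frontend can fold the two thread events into per-subtask state. The following is a minimal sketch; `reduceMission` and the shape of its state object are assumptions for illustration, while the field names read from `data` are the ones shown in the payloads above.

```javascript
// Hypothetical reducer: tracks each mission subtask by subtask_index.
function reduceMission(state, evt) {
  const threads = { ...state.threads };

  if (evt.event === "agent_thread_start") {
    threads[evt.data.subtask_index] = {
      name: evt.data.agent_name,
      workflow_type: evt.data.workflow_type,
      status: "running",
    };
  } else if (evt.event === "agent_thread_done") {
    threads[evt.data.subtask_index] = {
      ...threads[evt.data.subtask_index],
      status: "done",
      duration_ms: evt.data.duration_ms,
      artifact_counts: evt.data.artifact_counts,
    };
  } else {
    return state; // unrelated events leave the mission state untouched
  }
  return { ...state, threads };
}
```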

harness_subtask_start

Detailed harness-level lifecycle event for subtask start.

{
  "event": "harness_subtask_start",
  "data": {
    "subtask_index": 0,
    "total_subtasks": 3,
    "title": "Research market trends",
    "workflow_type": "synthesis",
    "parallel": false
  }
}

harness_subtask_done

{
  "event": "harness_subtask_done",
  "data": {
    "subtask_index": 0,
    "total_subtasks": 3,
    "artifacts_created": 5,
    "is_last": false
  }
}

Milestone Events

milestone_unlocked

Emitted when the workspace reaches a maturity milestone during an agent run. The frontend displays a toast notification with the milestone name and credits granted.

{
  "event": "milestone_unlocked",
  "data": {
    "milestone": "engagement",
    "credits_granted": 5.0,
    "message": "Milestone unlocked: engagement — 5.0 credits added"
  }
}

Milestones are one-time events. See Trial Milestone System for the full list.


Harness Quality Events

Emitted after each agent run as part of the post-flight quality pipeline.

harness_contract

Result of the algorithmic contract evaluation, which checks whether the agent met the minimum output requirements for the workflow type.

{
  "event": "harness_contract",
  "data": {
    "passed": true,
    "score": 0.87,
    "issues": [],
    "workflow_type": "synthesis"
  }
}

When the contract fails, issues lists the specific requirements that were not met:

{
  "event": "harness_contract",
  "data": {
    "passed": false,
    "score": 0.45,
    "issues": [
      "Minimum 3 insights required, found 1",
      "Citations required for synthesis workflow"
    ],
    "workflow_type": "synthesis"
  }
}

harness_verifier

Result of the independent verification agent (a separate Haiku call with fresh context).

{
  "event": "harness_verifier",
  "data": {
    "passed": true,
    "score": 0.82,
    "issues": ["Response could include more cross-source triangulation"],
    "feedback": "Good analysis with strong citations. Could benefit from comparing across more sources."
  }
}

harness_stream_alert

Real-time quality alert detected mid-stream (before completion). Used for early warning of potential issues.

{
  "event": "harness_stream_alert",
  "data": {
    "severity": "warning",
    "checkpoint": 500,
    "issue": "Response appears to be a refusal or very short answer",
    "suggestion": "Agent may lack necessary context — consider adding research sources"
  }
}

| Field | Values |
| --- | --- |
| severity | "warning" or "critical" |
| checkpoint | Character position where alert was detected (500, 2000, 5000) |

harness_complexity

Emitted in pre-flight after the Haiku complexity classifier returns.

{
  "event": "harness_complexity",
  "data": {
    "level": "complex",
    "reason": "Query requires sequential phases: analyze interviews, generate recs, draft PRD.",
    "subtask_count": 3
  }
}

level is "simple", "medium", or "complex". subtask_count is non-zero only when level == "complex".

harness_plan_selected

Emitted for COMPLEX queries after PlanGenerator produces 2-3 candidate plans and PlanSelector picks the highest-scoring one (deterministic scoring — no LLM call).

{
  "event": "harness_plan_selected",
  "data": {
    "plan": {
      "plan_id": "a1b2c3d4",
      "title": "Evidence-first synthesis",
      "description": "Analyze sources by theme, then synthesize cross-source insights",
      "suggested_workflow": "synthesis",
      "prompt_modifier": "Focus on triangulating claims across 3+ sources before...",
      "estimated_turns": 6,
      "subtasks": [],
      "rationale": "Query emphasizes evidence quality over speed"
    },
    "score": {
      "coverage": 0.82,
      "efficiency": 0.70,
      "feasibility": 0.80,
      "alignment": 1.00,
      "total": 0.818
    },
    "candidates_count": 3
  }
}

harness_swarm_activated

Emitted when the specialist swarm starts for high-source-count synthesis.

{
  "event": "harness_swarm_activated",
  "data": {
    "source_count": 12,
    "specialist_count": 4,
    "specialists": ["themes", "evidence", "contradictions", "opportunities"]
  }
}

harness_swarm_complete

Emitted when all specialists + aggregator have finished.

{
  "event": "harness_swarm_complete",
  "data": {
    "specialist_count": 4,
    "specialists_succeeded": 4,
    "aggregator_succeeded": true,
    "total_duration_ms": 48200,
    "estimated_cost_usd": 0.0412
  }
}

harness_retry

Emitted between attempts when RetryPolicy.should_retry returns True. The next attempt uses a healing prompt built from the prior attempt's contract + verifier issues.

{
  "event": "harness_retry",
  "data": {
    "attempt": 2,
    "reason": "Contract failed (gap=0.35) — retrying with healing prompt",
    "previous_score": 0.35
  }
}

attempt is 1-based and refers to the attempt that is about to start (so attempt: 2 means the first retry). previous_score is the combined score of the attempt that just finished, computed as (contract_passed + verifier_score) / 2.0.
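
The combined score can be worked through in a few lines. This sketch assumes `contract_passed` counts as 1 when true and 0 when false, which the formula implies but does not state; `combinedScore` is a hypothetical name.

```javascript
// Assumption: the boolean contract result is treated as 1 (passed) or 0 (failed).
function combinedScore(contractPassed, verifierScore) {
  return ((contractPassed ? 1 : 0) + verifierScore) / 2.0;
}
```

Under that assumption, a failed contract with a verifier score of 0.70 gives (0 + 0.70) / 2.0 = 0.35. That is one combination that would produce the `previous_score` in the example above; the actual inputs behind that example are not given.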

harness_degraded

Emitted when all retry attempts are exhausted without passing the quality threshold.

{
  "event": "harness_degraded",
  "data": {
    "attempts_made": 3,
    "best_score": 0.52,
    "workflow_type": "synthesis",
    "issues": [
      "Insufficient insights: 2 created (minimum 3)",
      "Response could include more cross-source triangulation"
    ]
  }
}

harness_iteration_complete

Emitted after deep-synthesis iteration runs (multi-pass refinement). Summarizes the best result across all iterations.

{
  "event": "harness_iteration_complete",
  "data": {
    "iterations": 4,
    "best_score": 8.2,
    "accepted_count": 3,
    "total_duration_ms": 62400,
    "estimated_cost_usd": 0.0583
  }
}

harness_workspace_profile

Emitted once per post-flight (when telemetry history exists) to surface the workspace's current quality profile.

{
  "event": "harness_workspace_profile",
  "data": {
    "maturity": "developing",
    "total_runs": 23,
    "avg_quality": 0.742,
    "feedback_rate": 0.857,
    "common_issues": [
      "Insufficient insights: 1 created (minimum 3)"
    ],
    "quality_multiplier": 1.0
  }
}

maturity is "new" (0-9 runs), "developing" (10-49), or "mature" (50+). quality_multiplier scales contract thresholds — 0.85 for new, 1.00 for developing, up to 1.15 for mature workspaces with avg_quality > 0.75.
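
The run-count thresholds map directly to a small classifier. `maturityFor` is a hypothetical helper that mirrors the ranges stated above. It leaves out `quality_multiplier`, because the exact scaling toward 1.15 for mature workspaces is not specified.

```javascript
// Thresholds from the text: 0-9 runs = new, 10-49 = developing, 50+ = mature.
function maturityFor(totalRuns) {
  if (totalRuns < 10) return "new";
  if (totalRuns < 50) return "developing";
  return "mature";
}
```

With the example payload's `total_runs` of 23, this returns "developing", which matches the `maturity` field shown.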

context_metrics

Emitted just before the done event. Summarizes how much tool-result context was consumed during the run — useful for debugging context-window pressure.

{
  "event": "context_metrics",
  "data": {
    "total_tool_calls": 8,
    "cumulative_result_chars": 52400,
    "estimated_tool_result_tokens": 13100,
    "largest_result": ["get_workspace_context", 18200],
    "calls_per_tool": {
      "get_workspace_context": 1,
      "search_sources": 4,
      "save_insights": 3
    }
  }
}

estimated_tool_result_tokens is a rough estimate at ~4 chars per token. largest_result is a [tool_name, char_count] tuple for the single largest tool result.
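
The estimate is plain arithmetic: 52,400 characters at roughly 4 characters per token gives 13,100 tokens, matching the example. The sketch below reproduces that and also reports how much of the total the largest result accounts for. Both function names are hypothetical, and the rounding is an assumption.

```javascript
// Rough token estimate at ~4 characters per token (rounding is an assumption).
function estimateToolResultTokens(cumulativeChars, charsPerToken = 4) {
  return Math.round(cumulativeChars / charsPerToken);
}

// Fraction of all tool-result characters taken up by the largest single result.
// largestResult is the [tool_name, char_count] pair from the event payload.
function largestResultShare(largestResult, cumulativeChars) {
  const [toolName, charCount] = largestResult;
  return { toolName, share: cumulativeChars > 0 ? charCount / cumulativeChars : 0 };
}
```

For the payload above, `largestResultShare(["get_workspace_context", 18200], 52400)` reports a share of about 0.35, so a single call accounts for roughly a third of the tool-result context.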
