

DESIGN DOCUMENT v1.1 · March 2026

Omni Command Center

The unified operational control plane for AUTONOMOUS.ML. The single interface through which human operators observe, configure, instruct, and govern every autonomous agent and AI subsystem deployed across the enterprise SDLC.

The OCC does not itself execute SDLC tasks. It acts as the mission control layer that sits above all functional agents, providing real-time visibility, policy enforcement, mode governance, and instruction dispatch.

ARCHITECTURE

Where the OCC Sits in the Stack

The OCC is the topmost layer of the AUTONOMOUS.ML architecture. It communicates with the Agent Host and AI Decision Module over a WebSocket control bus for real-time telemetry and instruction delivery. Configuration changes are pushed via tRPC procedures that the Agent Host hot-reloads without a service restart. All five integration layers are described below.

┌─────────────────────────────────────────────────────────────────────┐
│                    OMNI COMMAND CENTER (Web UI)                      │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌────────────┐ │
│  │ Live Monitor │ │Control Panel │ │  Instruction │ │  System    │ │
│  │  (Thoughts,  │ │ (Features,   │ │  Dispatch    │ │  Health    │ │
│  │  Pulse, KPIs)│ │  Models,     │ │  (Direct +   │ │  (Agents,  │ │
│  │              │ │  Prompts)    │ │  Teams)      │ │  Models)   │ │
│  └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ └─────┬──────┘ │
└─────────┼────────────────┼────────────────┼───────────────┼────────┘
          │                │                │               │
          ▼                ▼                ▼               ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    OCC CONTROL BUS (WebSocket + tRPC)                │
└─────────────────────────────────────────────────────────────────────┘
          │                │                │               │
          ▼                ▼                ▼               ▼
┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│  Agent Host  │  │ AI Decision  │  │  Azure DevOps│  │  Distributed │
│  (.NET 8.0)  │  │  Module      │  │  Integration │  │  Exec Network│
│  Workflow    │  │  (Ollama /   │  │  (Boards,    │  │  (Hub +      │
│  Engine      │  │  vLLM)       │  │  Test Plans, │  │  Minions)    │
└──────────────┘  └──────────────┘  └──────────────┘  └──────────────┘
                                           │
                                           ▼
                            ┌──────────────────────────┐
                            │  AI Model Management &   │
                            │  Training Arena (v5.0)   │
                            └──────────────────────────┘

Agent Host

.NET 8.0 WorkerService

The autonomous polling loop and workflow engine. The OCC connects via a persistent WebSocket control bus for real-time telemetry and instruction delivery, and via tRPC for configuration hot-reload without service restart.

WebSocket + tRPC
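A minimal sketch of the configuration hot-reload path, assuming a hypothetical payload shape — field names here are illustrative, not the actual AUTONOMOUS.ML contract:

```typescript
// Hypothetical shape of a per-group config update pushed over tRPC.
interface GroupConfig {
  group: string;           // capability group, e.g. "QA"
  model: string;           // model id from the v5.0 registry
  systemPrompt: string;
  confidenceFloor: number; // 0.0–1.0
}

// The Agent Host applies the update by swapping its in-memory config
// on receipt — no service restart is needed.
const activeConfig = new Map<string, GroupConfig>();

function applyConfig(update: GroupConfig): void {
  activeConfig.set(update.group, update);
}

applyConfig({
  group: "QA",
  model: "llama3:8b",
  systemPrompt: "You review test results for regressions.",
  confidenceFloor: 0.8,
});
```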

AI Decision Module

Ollama / vLLM — Local CPU Inference

Controlled indirectly through the Agent Host. Model selection changes in the Control Panel are pushed via tRPC to the Agent Host's AIDecisionService, which routes inference to the correct Ollama endpoint. No model endpoint is ever exposed to the browser.

tRPC → Agent Host → Ollama

Azure DevOps Integration

Boards · Test Plans · Repos

The OCC issues natural-language instructions to the Agent Host, which translates them into Azure DevOps API operations. Aggregated KPIs (work items claimed, test cases executed, PRs reviewed) are read from OpenTelemetry metrics — never from the Azure DevOps API directly.

Instruction Dispatch → Agent Host → Azure DevOps REST

Distributed Execution Network

Hub + Minion Test Farm

The Hub node exposes a REST API consumed by the Agent Host's DistributedExecutionService, which acts as a proxy for the OCC. Operators can pause, resume, or reassign test jobs across the minion farm. Minion health is displayed in the System Health Monitor.

Instruction Dispatch → Agent Host → Hub REST API

AI Model Management & Training Arena

v5.0 — Model Registry · Arena · Synthetic Data

The Control Panel model selector is populated from the v5.0 model registry. Only models in a deployed state can be selected without confirmation. Training Mode labelled data is exported to the Synthetic Data Generation module via S3-compatible storage for fine-tuning.

tRPC → Model Registry API + S3 Export
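The deployed-state gate on the model selector can be sketched as follows; the state names other than "deployed" are assumptions, since only that state is called out in the spec:

```typescript
// Hypothetical registry-entry shape; only "deployed" is named in the spec.
type ModelState = "registered" | "training" | "deployed" | "retired";

interface RegistryModel {
  id: string;
  state: ModelState;
}

// Deployed models can be selected directly; any other state requires an
// explicit operator confirmation in the Control Panel selector.
function requiresConfirmation(model: RegistryModel): boolean {
  return model.state !== "deployed";
}
```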
CORE MODULES

Five Functional Modules

Live Monitor

  • Active Thoughts panel — streams agent inner monologue with confidence scores, reasoning latency (ms), and semantic tags (PLAN / OBSERVE / CRITIQUE / EXECUTE)
  • SDLC Pulse panel — real-time health bars for all four capability groups with network pressure throb visualization
  • Intervention Queue — mode-aware approval workflow with mandatory rejection reason and full audit trail
  • Emergency Stop / Kill Switch — persistent header button that halts all autonomous execution within one polling cycle
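The thought entries streamed into the Active Thoughts panel can be sketched as the following type — the field names are illustrative, but the 3000 ms Deep Thought threshold and the four semantic tags come from the spec:

```typescript
type ThoughtTag = "PLAN" | "OBSERVE" | "CRITIQUE" | "EXECUTE";

interface ThoughtEntry {
  tag: ThoughtTag;
  confidence: number; // 0.0–1.0
  latencyMs: number;  // inference request → response
  text: string;
}

// A thought is flagged DEEP_THOUGHT when reasoning latency exceeds 3000 ms.
const DEEP_THOUGHT_THRESHOLD_MS = 3000;

function isDeepThought(t: ThoughtEntry): boolean {
  return t.latencyMs > DEEP_THOUGHT_THRESHOLD_MS;
}
```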

Control Panel

  • Master enable/disable toggle per capability group with per-group confidence floor threshold
  • AI model selector (Granite 4, Phi-3, Llama 3 8B/70B, Codestral)
  • Fully editable system prompt per group with prompt sanitizer, unsaved-changes indicator, and versioned prompt history with named restore points
  • Safety Guardrails field per group — negative constraints that the AI must never violate regardless of mode
  • Sub-feature toggles for all 18 individual capabilities

Operational Mode Selector

  • Training Mode — observe and log patterns, no autonomous execution
  • Intern Mode — propose interventions, wait for human approval with mandatory rejection reason
  • Autonomous Mode — execute independently with full audit trail
  • Shadow Mode — execute in isolated mirror environment, zero production impact, excluded from KPI aggregation
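The mode-to-behaviour mapping above can be sketched as a single dispatch function; the type names are illustrative:

```typescript
type Mode = "TRAINING" | "INTERN" | "AUTONOMOUS" | "SHADOW";

type Disposition =
  | { action: "LOG_ONLY" }                  // Training: observe, never execute
  | { action: "AWAIT_APPROVAL" }            // Intern: human gate per intervention
  | { action: "EXECUTE"; shadow: boolean }; // Autonomous / Shadow

function dispositionFor(mode: Mode): Disposition {
  switch (mode) {
    case "TRAINING":   return { action: "LOG_ONLY" };
    case "INTERN":     return { action: "AWAIT_APPROVAL" };
    case "AUTONOMOUS": return { action: "EXECUTE", shadow: false };
    case "SHADOW":     return { action: "EXECUTE", shadow: true };
  }
}
```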

Instruction Dispatch

  • Instruction type: Command, Query, Configuration Change, Emergency Stop
  • Target: All Agents, Requirement, Test, QA, or SDLC Agent
  • Delivery: Direct (WebSocket) or Microsoft Teams (Adaptive Card webhook)
  • Instruction History log with Pending → Dispatched → Acknowledged status

System Health Monitor

  • Per-agent status: healthy / degraded / offline
  • Per-model metrics: confidence score, activity level, last inference latency
  • Polled from Agent Host OpenTelemetry metrics endpoint (default: 5s interval)
OPERATIONAL MODES

Four Levels of Autonomy

The active mode governs the entire AUTONOMOUS.ML system. It is displayed as a persistent badge in the OCC header and drives behaviour across all modules. Mode changes take effect immediately for all new interventions. Shadow Mode (added in v1.1) enables safe production-mirror testing with zero production impact.

Training Mode

The AI observes all SDLC events and logs decision patterns without executing any interventions. Operators review and label the log to build a training dataset for model fine-tuning. Recommended during initial deployment and after major system changes.

Recommended for: Initial deployment · Post-upgrade validation · Dataset collection

Intern Mode

The AI proposes interventions and waits for explicit human approval before executing each one. A mandatory rejection reason must be entered before the Reject action is accepted; the reason is logged and fed back to the AI Decision Module as a structured negative example. Recommended for new capability groups or production environments with low error tolerance.

Recommended for: New capability rollout · Production environments · Compliance-sensitive workflows

Autonomous Mode

The AI executes all enabled interventions independently, subject only to the feature toggles and model configuration in the Control Panel. Every action is logged with a full audit trail. Recommended for stable, well-validated capability groups in non-production environments.

Recommended for: Stable CI/CD pipelines · Non-production environments · High-volume test farms

Shadow Mode

The AI executes all interventions in a fully isolated shadow environment that mirrors production state but has zero production impact. Shadow Mode enables safe validation of new capability groups, model upgrades, and prompt changes against real workloads before promoting to Autonomous Mode. All shadow executions are logged with a SHADOW prefix in the audit trail and are excluded from KPI aggregation.

Recommended for: Pre-production validation · Model upgrade testing · Prompt regression testing · Canary deployments
DATA FLOWS

Instruction & Intervention Lifecycles

Instruction Lifecycle

  1. Operator composes instruction (type, target, content)
  2. OCC validates and appends to history with status: Pending
  3. Determines delivery channel (Direct / Teams / Both)
  4. Direct: sends JSON payload via WebSocket control bus
  5. Teams: POSTs Adaptive Card to configured webhook URL
  6. Agent Host routes instruction to target agent handler
  7. Agent Host logs instruction in PostgreSQL audit trail
  8. WebSocket acknowledgement sent back to OCC
  9. OCC updates history entry status: Acknowledged

Intervention Lifecycle (Intern Mode)

  1. Agent Host analyses work item / code / test result
  2. Generates intervention proposal with confidence score
  3. Sends intervention event via WebSocket to OCC
  4. OCC renders intervention card with Approve / Reject buttons
  5. Operator reviews proposed action
  6. Approve → OCC sends approval via WebSocket
  7. Agent Host executes intervention, updates audit trail
  8. Reject → OCC sends rejection via WebSocket
  9. Agent Host logs rejection as negative example for AI learning
QUALITY GATES

Acceptance Criteria

Acceptance criteria are defined at four granularity levels — Function, Class, Module, and System — consistent with the AUTONOMOUS.ML self-testing framework. A representative subset is shown below; the full set is in the design document.

ID | Level | Criterion | Pass Condition
S-OCC-001 | System | End-to-end instruction flow | Operator instruction reaches Agent Host and is logged within 5 seconds
S-OCC-002 | System | Mode change propagation | Switching to Intern mode causes all subsequent interventions to require approval
S-OCC-003 | System | Feature disable propagation | Disabling a group stops its interventions within one polling cycle
S-OCC-004 | System | Training data export | Labelled dataset exported to S3 and ingested by Training Arena without data loss
S-OCC-005 | System | Health accuracy | OCC display matches Agent Host OpenTelemetry metrics within one polling interval
S-OCC-006 | System | WCAG 2.2 AAA compliance | All OCC panels pass axe-core AAA audit with zero violations
S-OCC-007 | System | Scroll stability | Page scroll position unchanged across 100 consecutive thought stream updates
S-OCC-008 | System | Emergency stop halts all execution | All autonomous interventions cease within one polling cycle (default 5s) of Emergency Stop activation
M-OCC-001 | Module | WebSocket reconnects after disconnect | OCC reconnects within 5 seconds of simulated disconnect
M-OCC-004 | Module | Config push hot-reloads Agent Host | Agent Host applies new model/prompt without service restart
F-OCC-004 | Function | Thought panel bounded scroll | window.scrollY unchanged after thought entry appended
F-OCC-007 | Function | Approval visible in Intern mode | Approve/Reject buttons rendered on intervention cards when mode = Intern
F-OCC-009 | Function | Feature toggle suppresses interventions | Zero interventions generated for a disabled capability group
F-OCC-010 | Function | Reasoning latency displayed per thought | Every thought entry shows latency_ms field; Deep Thought state triggered when latency > 3000ms
F-OCC-011 | Function | Confidence floor enforced | Interventions with confidence below the per-group floor are suppressed and logged as BELOW_FLOOR
F-OCC-012 | Function | Safety guardrails persist across prompt edits | Guardrail field value is stored separately from system prompt and survives prompt restore operations
F-OCC-013 | Function | Shadow mode produces no production writes | Zero Azure DevOps API write calls occur during a 60-second Shadow Mode execution session
F-OCC-014 | Function | Semantic thought tags rendered correctly | PLAN entries render blue, OBSERVE green, CRITIQUE amber, EXECUTE red with correct tag label
F-OCC-015 | Function | Prompt version history saves and restores | Named version saved, restored, and applied to Agent Host within one hot-reload cycle
F-OCC-016 | Function | Mandatory rejection reason enforced | Reject button disabled until rejection reason field contains at least 10 characters
F-OCC-017 | Function | Network pressure visualization updates | Throb animation intensity correlates with message queue depth; idle state within 2 seconds of queue drain
F-OCC-018 | Function | Prompt sanitizer blocks credential patterns | Prompt containing AWS_SECRET_ACCESS_KEY pattern is rejected with specific warning; save is blocked
DESIGN IMPROVEMENTS v1.1

10-Point Enhancement Specification

The following enhancements were formally reviewed and approved for inclusion in OCC v1.1. Each entry specifies the section of the design document it updates, the rationale, and the implementation contract.

#1 · §4.1 SDLC Pulse

Reasoning Latency KPI

Specification

Each thought entry includes a latency_ms field measuring the time from inference request to response. When latency exceeds 3000ms, the thought is tagged with a DEEP_THOUGHT indicator and the SDLC Pulse panel displays a latency warning. Aggregated P50/P95 latency is surfaced as a KPI card.
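The aggregated KPI can be sketched with a nearest-rank percentile over a sliding window of latencies; the percentile method here is an illustrative choice, not mandated by the spec:

```typescript
// Nearest-rank percentile over a sorted sample.
function percentile(sortedMs: number[], p: number): number {
  if (sortedMs.length === 0) throw new Error("empty window");
  const rank = Math.ceil((p / 100) * sortedMs.length);
  return sortedMs[Math.max(0, rank - 1)];
}

// P50/P95 reasoning-latency KPI over the current window of thought entries.
function latencyKpi(windowMs: number[]): { p50: number; p95: number } {
  const sorted = [...windowMs].sort((a, b) => a - b);
  return { p50: percentile(sorted, 50), p95: percentile(sorted, 95) };
}
```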

Rationale

Latency is a leading indicator of model overload and resource contention. Operators need early warning before throughput degrades.

#2 · §4.3 Control Panel

Confidence Floor Thresholds

Specification

Each capability group exposes a confidence_floor slider (0.0–1.0, default 0.7). Interventions with a confidence score below the floor are suppressed before reaching the Intervention Queue and logged as BELOW_FLOOR events. The floor is stored per-group in the configuration store and hot-reloaded to the Agent Host.
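A minimal sketch of the floor check, assuming an illustrative proposal shape; the 0.7 default and the BELOW_FLOOR outcome come from the spec:

```typescript
interface Proposal {
  group: string;
  confidence: number; // 0.0–1.0
  summary: string;
}

type FloorResult =
  | { kind: "QUEUED"; proposal: Proposal }
  | { kind: "BELOW_FLOOR"; proposal: Proposal; floor: number };

// Per-group floors as configured in the Control Panel; default 0.7.
function applyFloor(p: Proposal, floors: Map<string, number>): FloorResult {
  const floor = floors.get(p.group) ?? 0.7;
  return p.confidence < floor
    ? { kind: "BELOW_FLOOR", proposal: p, floor } // suppressed and logged
    : { kind: "QUEUED", proposal: p };            // reaches the Intervention Queue
}
```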

Rationale

A single global confidence threshold is too coarse. High-stakes groups (QA, Security) require a higher floor than lower-risk groups (Documentation).

#3 · §4.2 Control Panel

Safety Guardrails / Negative Constraints

Specification

A Safety Guardrails textarea is exposed per capability group, separate from the system prompt field. Guardrail content is prepended to every inference request as a system-level constraint block that the model must not violate, regardless of operational mode. Guardrails survive prompt restore operations and are versioned independently.

Rationale

System prompts are editable and can be accidentally overwritten. Guardrails provide a tamper-resistant negative constraint layer that persists across all prompt changes.

#4 · §4.3 Operational Modes

Shadow Mode (4th Operational Mode)

Specification

Shadow Mode executes all interventions in an isolated environment that mirrors production state. No Azure DevOps API write calls are issued. Shadow executions are logged with a SHADOW_ prefix in the audit trail and are excluded from all KPI aggregation. The Agent Host routes Shadow Mode instructions to a dedicated ShadowExecutionService that replays against a read-only snapshot.

Rationale

Autonomous Mode cannot be safely tested against production without Shadow Mode. It enables model upgrades and prompt changes to be validated against real workloads before promotion.

#5 · §4.1 Active Thoughts

Semantic Thought Tagging

Specification

Every thought entry is tagged with one of four semantic types: PLAN (strategic decomposition), OBSERVE (data gathering), CRITIQUE (quality assessment), EXECUTE (action dispatch). Tags are determined by the Agent Host based on the instruction type and inference context. The OCC renders each tag with a distinct colour: PLAN=blue, OBSERVE=green, CRITIQUE=amber, EXECUTE=red.

Rationale

Untagged thought streams are difficult to audit. Semantic tags allow operators to quickly identify whether the AI is planning, observing, critiquing, or acting — and to filter the stream by cognitive phase.

#6 · §4.2 Control Panel

Versioned Prompt Config Store

Specification

The system prompt editor includes a version history drawer. Operators can save named snapshots (e.g. "v2.1 — post-incident hardening") and restore any previous version. Restore triggers a hot-reload to the Agent Host without service restart. The version store persists in the database with timestamps, author, and a diff preview against the current prompt.

Rationale

Prompt engineering is iterative and error-prone. Without version history, a bad prompt change cannot be rolled back without manual reconstruction.

#7 · §4.4 Intervention Queue

Mandatory Rejection Reason

Specification

In Intern Mode, the Reject button is disabled until the operator enters a rejection reason of at least 10 characters. The reason is logged in the audit trail and transmitted to the Agent Host as a structured negative example with the rejection_reason field. The Agent Host's AI Decision Module incorporates rejection reasons into its fine-tuning dataset.
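The validation rule and the feedback payload can be sketched as follows; the 10-character minimum and the rejection_reason field name come from the spec, while the wrapper type is an assumption:

```typescript
const MIN_REASON_LENGTH = 10;

// The Reject button stays disabled until this returns true.
function isRejectEnabled(reason: string): boolean {
  return reason.trim().length >= MIN_REASON_LENGTH;
}

// Hypothetical shape of the structured negative example sent to the Agent Host.
interface RejectionFeedback {
  interventionId: string;
  rejection_reason: string;
}

function buildRejection(id: string, reason: string): RejectionFeedback {
  if (!isRejectEnabled(reason)) throw new Error("rejection reason too short");
  return { interventionId: id, rejection_reason: reason.trim() };
}
```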

Rationale

Unlabelled rejections provide no learning signal. Mandatory reasons ensure every rejection improves the model and creates a human-readable audit trail for compliance.

#8 · §5.4 SDLC Pulse

Network Pressure Visualization

Specification

The SDLC Pulse panel displays a network pressure indicator that reflects the current WebSocket message queue depth. At low pressure (queue < 5), the indicator is static. At medium pressure (5–20), a slow throb animation is applied. At high pressure (> 20), a fast throb with an amber warning badge is displayed. Queue depth is reported by the Agent Host via the OpenTelemetry metrics endpoint.
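The three pressure bands above map directly onto a small classifier:

```typescript
type Pressure = "LOW" | "MEDIUM" | "HIGH";

// Queue-depth thresholds from the spec: < 5 static, 5–20 slow throb,
// > 20 fast throb with an amber warning badge.
function classifyPressure(queueDepth: number): Pressure {
  if (queueDepth < 5) return "LOW";
  if (queueDepth <= 20) return "MEDIUM";
  return "HIGH";
}
```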

Rationale

A backed-up message queue is an early indicator of Agent Host overload. Visual pressure feedback allows operators to intervene before messages are dropped.

#9 · §4.4 OCC Header

Emergency Stop / Kill Switch

Specification

A persistent Emergency Stop button is displayed in the OCC header at all times. Clicking it opens a confirmation modal requiring the operator to type CONFIRM. On confirmation, an EMERGENCY_STOP instruction is dispatched to all agents via the WebSocket control bus. The Agent Host halts all autonomous execution within one polling cycle (default 5s). The OCC transitions to Training Mode and displays a persistent EMERGENCY STOP ACTIVE banner until manually cleared.

Rationale

Autonomous systems require a hardware-equivalent kill switch accessible from any page without navigation. The confirmation step prevents accidental activation.

#10 · §10 Security

Prompt Sanitizer / Security Judge

Specification

The system prompt editor runs a client-side sanitizer on every save attempt. The sanitizer checks for: (1) credential patterns (AWS keys, GitHub tokens, connection strings), (2) prompt injection patterns (ignore previous instructions, jailbreak templates), (3) PII patterns (SSN, credit card numbers). Any match blocks the save and displays a specific warning with the matched pattern type. Sanitizer rules are loaded from a versioned blocklist stored in the database.
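A minimal sketch of the sanitizer pass, assuming an illustrative blocklist — the real rules live in the versioned database blocklist, and the regexes below are examples of each category, not the production patterns:

```typescript
// Illustrative blocklist; one example pattern per category from the spec.
const BLOCKLIST: { category: string; pattern: RegExp }[] = [
  { category: "credential", pattern: /AWS_SECRET_ACCESS_KEY|ghp_[A-Za-z0-9]{36}/ },
  { category: "injection",  pattern: /ignore (all )?previous instructions/i },
  { category: "pii",        pattern: /\b\d{3}-\d{2}-\d{4}\b/ }, // SSN-like
];

// Returns the matched category so the UI can show a specific warning,
// or null when the prompt is clean and the save may proceed.
function sanitizePrompt(prompt: string): string | null {
  for (const rule of BLOCKLIST) {
    if (rule.pattern.test(prompt)) return rule.category;
  }
  return null;
}
```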

Rationale

System prompts are a high-value injection surface. Operators may accidentally paste credentials or be socially engineered into inserting injection payloads. Client-side scanning provides immediate feedback before any data reaches the server.

SECURITY

Security Considerations

All communication between the OCC and the Agent Host occurs within the enterprise network. The WebSocket control bus endpoint is not exposed to the public internet.

Operator authentication is enforced via the Manus OAuth flow. Only authenticated users with the operator role can access the OCC and dispatch instructions.

The Teams webhook URL is stored in the Agent Host's secrets management system (Azure Key Vault for production, DPAPI for on-premises) and is never transmitted to the OCC frontend. The OCC stores only a masked representation in localStorage.

System prompts are validated against a blocklist of credential patterns before saving to prevent accidental storage of API keys or PII in configuration.

OAuth-gated access

Only operator-role users can dispatch instructions or modify configuration

Secrets never in browser

Teams webhook URL stored in Azure Key Vault / DPAPI, masked in localStorage

Local-first inference

All model inference via Ollama/vLLM — no model endpoints exposed to browser

Prompt credential scanning

System prompts validated against credential pattern blocklist before save
