Core Concepts
Tahyi is built on four ideas: specialize, coordinate, act safely, and stay inspectable. This page defines the vocabulary used across the docs and the product.
The swarm
A swarm is a set of specialist agents plus the coordination layer that routes work between them. The swarm is the product — not any single agent.
┌─────────────────────────────────────────────────┐
│               Coordination layer                │
│   tasks · context · safety gates · audit log    │
└────────┬──────────┬──────────┬──────────┬───────┘
         │          │          │          │
    ┌────▼───┐ ┌────▼───┐ ┌────▼───┐ ┌────▼───┐
    │ Deploy │ │ Monitor│ │ Observe│ │  DBA   │
    │ agent  │ │ agent  │ │ agent  │ │ agent  │
    └────────┘ └────────┘ └────────┘ └────────┘
Why a swarm? Production infrastructure spans domains no single model context can hold well. Narrow experts with explicit handoffs outperform one generalist that drifts, forgets, or overreaches.
Specialist agents
Each specialist owns one operational domain:
| Specialist | Owns | Example work |
|---|---|---|
| Deployment | Release pipelines, rollouts, rollbacks | Promote a canary, roll back a bad deploy |
| Monitoring | Alerts, SLOs, on-call routing | Tune alert noise, acknowledge incidents |
| Observability | Logs, metrics, traces | Find root cause across signals |
| DBA | Databases, migrations, backups | Run a verified migration, restore a snapshot |
Specialists are replaceable — you should be able to swap or upgrade one without rewriting the swarm. If removing an agent breaks unrelated domains, coupling is too tight.
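One way to picture replaceability is a registry keyed by domain, where swapping a specialist touches only its own entry. This is a minimal sketch, not Tahyi's actual interface: `Swarm`, `register`, `dispatch`, and the handler functions are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical: a specialist is modeled as a plain function from request to result.
Handler = Callable[[str], str]


@dataclass
class Swarm:
    """Maps each domain name to the specialist that currently owns it."""
    specialists: Dict[str, Handler] = field(default_factory=dict)

    def register(self, domain: str, handler: Handler) -> None:
        # Registering an existing domain replaces that specialist only.
        self.specialists[domain] = handler

    def dispatch(self, domain: str, request: str) -> str:
        if domain not in self.specialists:
            raise KeyError(f"no specialist owns domain {domain!r}")
        return self.specialists[domain](request)


def dba_v1(request: str) -> str:
    return f"dba-v1 handled: {request}"


def dba_v2(request: str) -> str:
    return f"dba-v2 handled: {request}"


def deploy_v1(request: str) -> str:
    return f"deploy-v1 handled: {request}"


swarm = Swarm()
swarm.register("dba", dba_v1)
swarm.register("deployment", deploy_v1)
swarm.register("dba", dba_v2)  # swap the DBA specialist; deployment is untouched
```

In this sketch the swap is a single `register` call. If replacing one handler forced changes to another domain's handler, that would be the tight coupling the paragraph above warns about.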
Coordination layer
The coordination layer turns isolated agents into a swarm:
- Tasks — bounded units of work with an owner, status, and thread.
- Context — environment state, recent actions, and policy rules passed to agents on wake.
- Handoffs — when one specialist finishes, the layer routes follow-up work to the right peer.
- Heartbeats — agents wake on a schedule or when assigned work; they do not run unbounded loops.
Think of the coordination layer as the operating system for infrastructure agents — specialists are the processes; the layer is the scheduler, IPC, and audit log.
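The task, handoff, and heartbeat concepts above can be sketched as a small in-memory model. Everything here is an assumption made for illustration (the `Task` fields beyond owner, status, and thread, the `CoordinationLayer` class, and its method names); context passing is left out to keep the example short.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Optional


@dataclass
class Task:
    """A bounded unit of work with an owner, status, and thread."""
    title: str
    owner: str                      # specialist that owns the task
    status: str = "open"            # hypothetical states: "open" or "done"
    thread: List[str] = field(default_factory=list)


class CoordinationLayer:
    """Hypothetical in-memory stand-in for the coordination layer."""

    def __init__(self) -> None:
        self.queues: Dict[str, Deque[Task]] = {}

    def assign(self, task: Task) -> None:
        self.queues.setdefault(task.owner, deque()).append(task)

    def heartbeat(self, specialist: str) -> Optional[Task]:
        # One bounded wake-up: take at most one task, then return.
        queue = self.queues.get(specialist)
        if not queue:
            return None
        return queue.popleft()

    def hand_off(self, finished: Task, next_owner: str, title: str) -> Task:
        # Close the finished task and route follow-up work to a peer,
        # carrying the thread forward so the history stays attached.
        finished.status = "done"
        follow_up = Task(
            title=title,
            owner=next_owner,
            thread=finished.thread + [f"{finished.owner}: {finished.title}"],
        )
        self.assign(follow_up)
        return follow_up


layer = CoordinationLayer()
layer.assign(Task(title="roll back bad deploy", owner="deployment"))
task = layer.heartbeat("deployment")
follow_up = layer.hand_off(task, "observability", "find root cause of failed rollout")
```

Note that `heartbeat` returns after at most one task, which is one simple way to model agents that wake for assigned work rather than running an unbounded loop.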
Safety gates
Tahyi follows the principle of autonomy with safety. Each action is classified as a two-way or one-way door:
| Door type | Definition | Agent behavior |
|---|---|---|
| Two-way | Reversible — rollback is feasible | Agent may act autonomously within policy |
| One-way | Irreversible or high blast radius | Agent plans, logs, and waits for approval |
Examples of one-way doors: destructive DDL, production deletes, irreversible infra teardown, credential rotation that invalidates live sessions.
Every autonomous action must name its guardrail — dry-run, approval gate, reversibility proof, or blast-radius cap. No guardrail → not shippable.
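The door rules and the guardrail requirement can be combined into a single decision function. The sketch below is illustrative only: `Door`, `Action`, `gate`, and the returned strings are hypothetical names, and policy checks ("within policy") are assumed to happen elsewhere.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Door(Enum):
    TWO_WAY = "two-way"   # reversible, rollback is feasible
    ONE_WAY = "one-way"   # irreversible or high blast radius


# The four guardrail kinds named in the text above.
GUARDRAILS = {"dry-run", "approval gate", "reversibility proof", "blast-radius cap"}


@dataclass(frozen=True)
class Action:
    description: str
    door: Door
    guardrail: Optional[str] = None
    approved: bool = False


def gate(action: Action) -> str:
    """Return 'reject', 'await_approval', or 'execute' (hypothetical outcomes)."""
    # No named guardrail means the action is not shippable at all.
    if action.guardrail not in GUARDRAILS:
        return "reject"
    # One-way doors wait for approval before anything runs.
    if action.door is Door.ONE_WAY and not action.approved:
        return "await_approval"
    return "execute"
```

For example, `gate(Action("promote canary", Door.TWO_WAY, "dry-run"))` returns `"execute"`, while `gate(Action("drop table", Door.ONE_WAY, "approval gate"))` returns `"await_approval"` until `approved=True` is set.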
Inspectability
Nothing happens in the dark. Tahyi records:
- What the agent intended to do
- Why — linked task, trigger, and policy rule
- How — tool calls, API requests, and outcomes
- Who — which specialist, which environment, which approval
The audit log is append-only. You can replay an incident timeline without reconstructing it from chat history.
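A minimal sketch of an append-only log with the four recorded fields might look like the following. The class and field names (`AuditEntry`, `AuditLog`, `append`, `replay`) are assumptions for illustration, and a real log would presumably live in durable storage rather than in memory.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class AuditEntry:
    """One immutable record; fields mirror the what/why/how/who list above."""
    what: str   # what the agent intended to do
    why: str    # linked task, trigger, and policy rule
    how: str    # tool calls, API requests, and outcomes
    who: str    # specialist, environment, and approval


class AuditLog:
    """Hypothetical in-memory log that exposes append and read, never edit or delete."""

    def __init__(self) -> None:
        self._entries: List[AuditEntry] = []

    def append(self, entry: AuditEntry) -> int:
        self._entries.append(entry)
        return len(self._entries) - 1  # position of the new entry

    def replay(self) -> Tuple[AuditEntry, ...]:
        # Return entries in the order they were written, as an immutable tuple.
        return tuple(self._entries)


log = AuditLog()
log.append(AuditEntry(
    what="roll back release",
    why="task: bad deploy; trigger: error-rate alert",
    how="rollback tool call succeeded",
    who="deployment specialist, production, no approval needed",
))
log.append(AuditEntry(
    what="restore snapshot",
    why="task: data repair; trigger: handoff from deployment",
    how="plan logged, waiting for approval",
    who="DBA specialist, production, approval pending",
))
timeline = [entry.what for entry in log.replay()]
```

Because entries are frozen and `replay` returns a tuple, callers in this sketch can read the timeline in order but cannot rewrite it through the public interface.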
Founding domains (1.0 scope)
As defined in SCOPE, the first GA release ships four specialists (deployment, monitoring, observability, and DBA) plus the coordination layer and safety model.
Domains like SecOps, FinOps, and networking are out of scope for 1.0 unless the board explicitly expands the scope.
Release trajectory
Tahyi ships in usable increments (see VERSIONING):
| Version | Theme |
|---|---|
| 0.1 | Single specialist runs real work end-to-end |
| 0.2 | Coordinated swarm — multiple specialists, shared layer |
| 0.3 | Production-trustworthy — safety model hardened |
| 1.0 | GA — autonomous ops with gated one-way doors |
Each 0.x release must deliver value on its own; it must not be scaffolding that only makes sense after the next release.
Related reading
- Introduction — problem, audience, mission
- Quickstart — first-run path
- How it Works — execution flow