Squad: Open-Source Multi-Agent Orchestration

I built lobs-core over eight versions. Each time, the bottleneck wasn't the agent itself — it was coordination. Running one agent is easy. Running a team of agents that need to talk to each other, share state, spawn subagents, and route requests across platforms? That's where every framework falls apart.

Squad is what I built to solve that properly.

The Problem with Script-Based Orchestration

Most agent coordination today happens in one of two ways: a single monolithic agent that tries to do everything, or a mess of scripts that trigger other scripts with bash pipes and prayer.

The first approach hits ceiling after ceiling as you add capability. The second approach is technically functional but breaks the moment you need anything more than a linear pipeline. What happens when a subagent fails? When you need parallel execution? When you need to route a question back to the user mid-workflow?

Most frameworks patch around these cases with markdown instructions and hope the LLM follows them. Squad treats coordination as a first-class concern, handled by protocol messages rather than prompt text.

The Gateway: Center of Everything

Squad's architecture has one core principle: the gateway is the center of everything.

Connector (Discord, Slack, API...) → Gateway → Agent Runtime (Ollama, OpenAI...)

All agents register with the gateway. All communication flows through the gateway. This means:

Session management is centralized — no lost messages, no orphaned sessions
Routing is smart — the gateway knows which agents are available and what they're doing
Connectors are stateless — they just normalize messages and forward them, no agent logic lives there
You can run a Discord bot and a Slack bot with the same agent — same brain, different faces
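
To make the gateway-centered flow concrete, here is a minimal Python sketch. It is not Squad's actual code: the names Message, Gateway, register, and route are assumptions made up for illustration. It shows agents registering in one place and sessions being tracked per connector and channel, so two platforms can reach the same agent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class Message:
    connector: str  # hypothetical connector name, e.g. "discord" or "slack"
    channel: str
    text: str


@dataclass
class Gateway:
    # agent name -> handler; a real gateway would track far more state
    agents: Dict[str, Callable[[Message], str]] = field(default_factory=dict)
    # (connector, channel) -> agent name; this is the centralized session table
    sessions: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def register(self, name: str, handler: Callable[[Message], str]) -> None:
        self.agents[name] = handler

    def route(self, msg: Message, default_agent: str) -> str:
        # Reuse the session's agent if one exists, otherwise bind the default.
        key = (msg.connector, msg.channel)
        agent = self.sessions.setdefault(key, default_agent)
        return self.agents[agent](msg)


gateway = Gateway()
gateway.register("helper", lambda m: f"helper saw: {m.text}")

# Same agent, two different connectors.
reply_a = gateway.route(Message("discord", "general", "hi"), "helper")
reply_b = gateway.route(Message("slack", "ops", "hi"), "helper")
```

The connectors never appear in the handler logic here; they only show up as a field on the message, which is the property the list above describes.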

Connectors: One Agent, Many Faces

Connectors are adapters for each platform. Discord, Slack, HTTP, CLI — they all look the same from the gateway's perspective. The connector converts the platform's protocol into Squad's internal message format, and vice versa on the way back.

This means you can:

Swap Discord for Slack by swapping the connector — zero changes to agents
Deploy connectors anywhere — same machine or different servers
Build new connectors in any language — the protocol is open
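
As an illustration of what a connector does, the sketch below normalizes two platform events into one internal shape and converts a reply back. The SquadMessage type and the function names are hypothetical, and the event dictionaries are simplified stand-ins for real Discord and Slack payloads rather than Squad's actual message format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SquadMessage:
    # Hypothetical internal format; the real one is defined by Squad's spec.
    connector: str
    channel: str
    user: str
    text: str


def from_discord(event: dict) -> SquadMessage:
    # Simplified Discord-style message event.
    return SquadMessage(
        connector="discord",
        channel=str(event["channel_id"]),
        user=event["author"]["username"],
        text=event["content"],
    )


def from_slack(event: dict) -> SquadMessage:
    # Simplified Slack-style message event.
    return SquadMessage(
        connector="slack",
        channel=event["channel"],
        user=event["user"],
        text=event["text"],
    )


def to_slack(channel: str, text: str) -> dict:
    # The return path: internal reply -> platform-shaped payload.
    return {"channel": channel, "text": text}


d = from_discord({"channel_id": 42, "author": {"username": "ana"}, "content": "hello"})
s = from_slack({"channel": "C123", "user": "U456", "text": "hello"})
out = to_slack(s.channel, "hi back")
```

Both events end up as the same type with the same fields, which is why an agent behind the gateway would not need to know which platform a message came from.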

Native Primitives

The three core coordination primitives in Squad are:

Tasks — Work items with state, ownership, and lifecycle. A task can fail, be retried, or spawn sub-tasks. The gateway tracks task state so nothing gets lost.
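
One way to picture a task with state, ownership, and lifecycle is a small state machine. The sketch below is an assumption about how that could look, not Squad's implementation: the state names, the allowed transitions, and the Task fields are all invented for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class TaskState(Enum):
    PENDING = "pending"
    RUNNING = "running"
    FAILED = "failed"
    DONE = "done"


# Hypothetical lifecycle: a failed task may go back to RUNNING (a retry).
ALLOWED = {
    TaskState.PENDING: {TaskState.RUNNING},
    TaskState.RUNNING: {TaskState.DONE, TaskState.FAILED},
    TaskState.FAILED: {TaskState.RUNNING},
    TaskState.DONE: set(),
}


@dataclass
class Task:
    id: str
    owner: str
    parent_id: Optional[str] = None
    state: TaskState = TaskState.PENDING
    attempts: int = 0
    child_ids: List[str] = field(default_factory=list)

    def transition(self, new_state: TaskState) -> None:
        # Reject transitions the lifecycle does not allow.
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} not allowed")
        if new_state is TaskState.RUNNING:
            self.attempts += 1
        self.state = new_state

    def spawn(self, child_id: str, owner: str) -> "Task":
        # Sub-tasks record their parent so the tracker can walk the tree.
        self.child_ids.append(child_id)
        return Task(id=child_id, owner=owner, parent_id=self.id)


task = Task(id="t1", owner="planner")
task.transition(TaskState.RUNNING)
task.transition(TaskState.FAILED)
task.transition(TaskState.RUNNING)  # retry
child = task.spawn("t1.1", owner="researcher")
task.transition(TaskState.DONE)
```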

Subagents — Spawning a subagent isn't a markdown instruction, it's a protocol message. The gateway handles spawning, routing, and result collection, so parallel subagents report back through one place instead of coordinating with each other through shared state.
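
The idea of spawning as a message, with results collected in one place, can be sketched with asyncio. Everything named here (SpawnRequest, run_subagent, spawn_all) is hypothetical, and the subagent body is a placeholder where a real runtime call would go.

```python
import asyncio
from dataclasses import dataclass
from typing import List


@dataclass
class SpawnRequest:
    # Hypothetical protocol message: who asked, which agent, what to do.
    parent: str
    agent: str
    prompt: str


async def run_subagent(req: SpawnRequest) -> str:
    await asyncio.sleep(0)  # placeholder for a real model call
    return f"{req.agent}: finished '{req.prompt}'"


async def spawn_all(requests: List[SpawnRequest]) -> List[str]:
    # Subagents run concurrently; gather returns results in request order,
    # so the caller gets one list back instead of juggling shared state.
    return await asyncio.gather(*(run_subagent(r) for r in requests))


results = asyncio.run(
    spawn_all(
        [
            SpawnRequest("planner", "researcher", "find sources"),
            SpawnRequest("planner", "writer", "draft intro"),
        ]
    )
)
```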

Ask-user — Per-channel, blocking. The gateway pauses the workflow and asks the user a specific question before continuing. Not a tool, not a markdown instruction — a first-class protocol primitive.
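
A blocking, per-channel question can be modeled as a future that the workflow awaits and the user's reply resolves. This is a sketch under that assumption; AskUserGateway, ask_user, and on_user_reply are illustrative names, not Squad's API.

```python
import asyncio
from typing import Dict, List, Tuple


class AskUserGateway:
    def __init__(self) -> None:
        self._pending: Dict[str, asyncio.Future] = {}  # channel -> open question
        self.outbox: List[Tuple[str, str]] = []  # questions sent to connectors

    async def ask_user(self, channel: str, question: str) -> str:
        # Per-channel: only one open question per channel at a time.
        if channel in self._pending:
            raise RuntimeError(f"channel {channel} already has an open question")
        future = asyncio.get_running_loop().create_future()
        self._pending[channel] = future
        self.outbox.append((channel, question))
        try:
            return await future  # the workflow pauses here until a reply arrives
        finally:
            del self._pending[channel]

    def on_user_reply(self, channel: str, text: str) -> bool:
        # Called when a connector forwards the user's answer.
        future = self._pending.get(channel)
        if future is None or future.done():
            return False
        future.set_result(text)
        return True


async def demo() -> str:
    gw = AskUserGateway()
    workflow = asyncio.create_task(gw.ask_user("general", "Deploy to prod?"))
    await asyncio.sleep(0)  # let the workflow reach its await
    gw.on_user_reply("general", "yes")
    return await workflow


answer = asyncio.run(demo())
```

Because the pause lives in the gateway rather than in the prompt, the workflow cannot continue until on_user_reply is called for that channel.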

Pluggable Runtimes

Squad doesn't care what model you use. The runtime interface is clean and swappable. Currently supported:

OpenAI (GPT-4o, GPT-4o-mini, o1, o3, o4-mini)
Anthropic (Claude 3.5, 3.7, 4.0)
Google (Gemini 2.0, 2.5, Flash, Pro)
xAI (Grok 3, 2, 1.5)
Mistral (Large, Medium, Small)
Groq (Llama, Mixtral)
AWS Bedrock, Azure OpenAI, OpenRouter, Cloudflare Workers AI, Cohere, Replicate, and 14 more

Swap providers with a config change. Test the same workflow against different models without rewriting anything.
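
The config-driven swap could look something like the sketch below. The Runtime protocol, the registry, and the config keys are assumptions for illustration, and the two runtimes are local stand-ins that make no network calls, so they say nothing about how Squad talks to real providers.

```python
from typing import Dict, Protocol, Type


class Runtime(Protocol):
    # Hypothetical interface: one method every runtime must provide.
    def complete(self, prompt: str) -> str: ...


class EchoRuntime:
    """Stand-in runtime that echoes the prompt with the model name."""

    def __init__(self, model: str) -> None:
        self.model = model

    def complete(self, prompt: str) -> str:
        return f"[{self.model}] {prompt}"


class ShoutRuntime:
    """Second stand-in, to show that swapping changes behavior only."""

    def __init__(self, model: str) -> None:
        self.model = model

    def complete(self, prompt: str) -> str:
        return f"[{self.model}] {prompt.upper()}"


RUNTIMES: Dict[str, Type] = {"echo": EchoRuntime, "shout": ShoutRuntime}


def build_runtime(config: dict) -> Runtime:
    # The workflow code only ever sees the Runtime interface.
    return RUNTIMES[config["provider"]](config["model"])


def run_workflow(runtime: Runtime) -> str:
    return runtime.complete("summarize the thread")


first = run_workflow(build_runtime({"provider": "echo", "model": "model-a"}))
second = run_workflow(build_runtime({"provider": "shout", "model": "model-b"}))
```

Only the config dictionary changes between the two runs; run_workflow stays the same.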

Plugin API

The plugin system exposes hooks for every lifecycle event. Write once, run in any agent. Plugins handle:

Pre/post message processing
Memory layer management
Custom tools and tool routing
Metrics and observability
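
A hook-based plugin system is often built as a registry of callbacks keyed by event name. The sketch below shows that general pattern for pre-message processing. PluginHost, the on decorator, and the "pre_message" event name are hypothetical, since the real plugin API is still in progress.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class PluginHost:
    def __init__(self) -> None:
        self._hooks: DefaultDict[str, List[Callable[[str], str]]] = defaultdict(list)

    def on(self, event: str) -> Callable:
        # Decorator that registers a function for a lifecycle event.
        def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
            self._hooks[event].append(fn)
            return fn

        return decorator

    def run(self, event: str, payload: str) -> str:
        # Hooks run in registration order, each receiving the previous output.
        for fn in self._hooks[event]:
            payload = fn(payload)
        return payload


host = PluginHost()


@host.on("pre_message")
def strip_whitespace(text: str) -> str:
    return text.strip()


@host.on("pre_message")
def redact_secret(text: str) -> str:
    return text.replace("hunter2", "[redacted]")


cleaned = host.run("pre_message", "  my password is hunter2 ")
untouched = host.run("post_message", "no hooks registered here")
```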

Docker-First

Every agent runs in its own container. Isolation, reproducibility, and one command to deploy. The Docker Compose setup gets you a full Squad environment in under a minute:

git clone https://github.com/lobs-ai/squad.git
cd squad
docker compose up

Status

Gateway core and basic connectors are working. The agent runtime, workflow engine, and full plugin system are in progress. See SPEC.md for the complete design specification.

What's Next

The workflow engine is the biggest remaining piece — DAG-based workflow definitions with branching, rollback, and cron scheduling. Then the observability layer: structured logs, traces, and a UI to see what's happening across your agents in real time.
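
Since the workflow engine is not built yet, the following is only a sketch of what "DAG-based" implies for execution order, using Python's standard graphlib module. The step names and the execution_waves helper are invented for illustration. Branching, rollback, and scheduling are left out.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical workflow: each step maps to the steps it depends on.
workflow: Dict[str, Set[str]] = {
    "fetch": set(),
    "summarize": {"fetch"},
    "translate": {"fetch"},
    "publish": {"summarize", "translate"},
}


def execution_waves(dag: Dict[str, Set[str]]) -> List[List[str]]:
    """Group steps into waves; steps in the same wave could run in parallel."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(workflow)
```

Here summarize and translate land in the same wave because both depend only on fetch, and publish waits for both.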

If you want to follow along or contribute, the repo is at github.com/lobs-ai/squad.