legion

Model-agnostic orchestration engine

We are many.
We move as one.

The orchestration engine for coding agents. Give Legion a task and it plans it, runs it across Codex, Cursor, and Claude in parallel, cross-checks the work, heals what breaks, and hands you one verified pull request — reliably enough to leave running overnight.

$curl -fsSL https://github.com/Opus-Aether-AI/legion-core/releases/latest/download/install.sh | bash

or build on it — npm i @opus-aether-ai/legion-core

legion — zsh

$legion-delegate run --archetype implement-feature --task "wire OTLP export"

→ worktree .legion/worktrees/20260703-1519 (base main, sealed)

→ codex exec -m gpt-5.4 -s workspace-write (effort=high)

span gpt-5.4 implement 12,847 tok $0.13 42s ok

span gpt-5.5 verify 3,201 tok $0.06 11s ok

✓ diff verified · 4 files · +212 −38 · yours to apply

What it does

One task in. One verified PR out.

Legion is an orchestrator, not another autocomplete. It runs the whole loop and stops where you want control — the merge.

01

Plan

Grills the prompt across models until the intent is sharp, then decomposes the goal.

02

Execute

Fans each slice to the right model, in its own isolated git worktree, in parallel.

03

Review

Cross-checks across models and runs your gates. Nothing ships on one model's say-so.

04

Ship

Stitches it into a single pull request. Never auto-merged. You approve.

And it watches itself the whole way — every step metered and traced — and opens a fix PR when something breaks.

The long run

Built to run long.

Most agents drift the moment you stop watching. Legion was built for the opposite — the overnight refactor, the nightly dependency sweep, the background watch on your build. It decomposes the goal, isolates each slice, verifies across models, and stops at a reviewable PR — so a long run ends in something you can trust, not a mess you untangle in the morning.

That's the line between a coding assistant and an orchestration engine.

The machine

The engine

The core never learns your domain. It gives you the parts every agent needs — so you start at delegation, telemetry, health, and healing instead of a blank page.

01

Scoped delegation

Hand a scoped task to any model in an isolated git worktree. You get a verified, metered diff back — the model gets a task, not your repo.

02

Telemetry in dollars

Every unit of work emits a legion.span.v1 record — executor, model, tokens, cost, timing — on one dollar basis across Claude, GPT, and Cursor.

03

Self-learning

A daily loop mines spans, reviews, and failures into memory and proposals, keeping only the mutations that measure better against a baseline.

04

Auto-healing

Opt-in remediation turns a doctor finding into an isolated delegated fix, re-gates it, and opens a reviewable PR. Never auto-merged.

Routing

Work finds the cheapest capable hand.

A routing policy — yours to tune — sends each archetype to the right executor. The board is live; the spend is metered against Claude list prices so every model reads on one scale.

archetypeexecutorcost
implement-featuregpt-5.4 · codex$0.13
cross-model-verifygpt-5.5 · codex$0.06
deep-reasoningclaude · direct$0.42
bulk-mechanicalcursor · agent$0.00

Evidence

Evidence over intuition.

Be skeptical of the cheap number. The savings are model choice, not magic — GPT-5.4 via Codex passes everything Claude does on this tier at a quarter the cost, and you could get that by running Codex yourself. Legion's job is to make that call for you — per task, only when the cheap model clears the bar — and hand back a verified, metered diff.

Heldout OSS Hard · live · 2026-06-27

38 / 38
correctness · saturated
$3.95
vs $14.83 direct
cheaper model, not orchestration
direct claude$14.83
legion → codex$3.95

5 modes × 19 tasks × 2 runs; every mode passed 38/38, so the tier saturates on correctness and cost is the only signal. Honestly: this proves GPT is cheaper than Claude, not that orchestration is. The parts that would prove Legion — routing the hard tasks back to Claude, verification catching what direct Codex ships, one metered scale across models — a saturated tier can't show. Legion also adds latency (P95 ~87s vs ~41s). The real proof is cost-per-correct on a tier that isn't saturated; we'll publish it when we have it, not before.

Two ways to run it

Yours to run. Or ours to run with you.

Open source

free forever

Install it, point it at your own model accounts, and own the whole thing — it runs locally, no Legion cloud. Apache-2.0, daily releases.

Enterprise

run with us

Run Legion inside your environment, gated by your standards — your Sonar and quality gates, your guardrails — with us on the line and an SLA. Or we build your domain agent on it.

The moat

Your agent. Our spine.

Build your domain agent on legion-core. Vendor the engine into your repo, and it stays fresh, gated, and accountable — while your domain stays yours.

your-agent — marketplace.json

"plugins": [

{ "name": "legion-router", "source": "git-subdir" },

{ "name": "legion-observability", "source": "git-subdir" },

{ "name": "trading-desk", "source": "./your-domain" }

]

$ legion-doctor

✓ marketplace · ✓ skills · ✓ codex bridge · ✓ cursor bridge · 0 findings

#212 chore: weekly legion-core refresh — the spine updates itself

Vendor the engine

One marketplace entry pulls the spine in. The engine stays ours to maintain; your domain stays yours.

It stays fresh

A weekly refresh PR updates the engine like any dependency. Review, merge, move on.

Gate everything

legion-doctor runs in CI and refuses to ship a broken garrison — skills, bridges, schemas, auth.

The garrison

Install once. Command everywhere.

Skills, MCPs, and bridges follow you across harnesses. And the engine tends itself — doctor finds, heal opens the fix, self-learn keeps only what measures better.

Claude CodeCodex CLICursoropencode

the self-learn loop

observeanalyzeproposescorekeep / discard

The engine is open source.
The business is accountability.

Enterprise is for teams who want to run it with us. We onboard your harnesses, design your routing policy, and hold the SLA. Or, we build your domain agent on legion-core.

Contact ai@opusaether.com