
Tutorial: delegate an implementation to Codex via handoff

This is the flagship ContextRelay workflow: one agent scopes and reviews, the other implements. You will watch a small, well-scoped task travel through a full round trip - Claude writes the brief and hands it off, Codex implements and records evidence, and Claude reviews the result and decides whether it is done.

The example is deliberately tiny so you can focus on the mechanics and the content of a good handoff, not on the code. By the end you will know which tool each agent calls at each step, and why the handoff survives even though the two agents never see each other's hidden reasoning.

Why hand off at all?

Claude Code and Codex run as two separate processes on your machine. They cannot read each other's chain of thought - they only share what is written into ContextRelay messages and the durable ledger. A handoff is the structured way to pass ownership: it records why control is moving, what the receiving agent should do, and which files matter, then delivers that live to the other agent.

Each agent has a natural lane. A good rule of thumb (and the default the managed instruction blocks encode):

  • Codex is well suited to focused implementation, test writing, and debugging.
  • Claude is well suited to repo-wide reasoning, risk review, and a second opinion before shipping.

So "add a function plus a unit test" is a textbook job to hand to Codex, with Claude coordinating and reviewing. See Why two agents in one session for the bigger picture.

Prerequisites

You need a running paired session. If you have not done it yet, walk through your first paired session first - launch the pair with ctxrelay pair (or ctxrelay claude and ctxrelay codex in two terminals). This tutorial assumes both agents are connected to the same project daemon.

The binaries ctxrelay, contextrelay, and context-relay are interchangeable; this page uses ctxrelay.

The scenario

Goal: add a small slugify(text) helper to src/utils/slugify.ts and a unit test for it. Acceptance check: the new test passes and the existing suite stays green.

We will play it out from Claude's seat (you are pairing with Claude as the coordinator-side reviewer in this example), handing the build to Codex.

Step 1 - Scope it as Claude, and write the context down

Before handing anything off, capture the task in the ledger so the context survives the handoff and any later recovery. Claude calls append_note to record the goal, the concrete ask, the files to touch, and the acceptance check:

Claude → append_note:
  text: |
    Goal: add a slugify(text) helper.
    Files to touch: src/utils/slugify.ts (new), src/utils/slugify.test.ts (new).
    Behavior: lowercase, trim, collapse non-alphanumerics to single hyphens, strip
    leading/trailing hyphens. Pure function, no deps.
    Acceptance: new test passes; existing `bun test src` stays green.
    Owner next: Codex (implementation).

Why write a note before the handoff?

The handoff itself carries the ask, but a separate append_note makes the goal and acceptance criteria a durable, first-class ledger entry that both agents - and a later read_context after a crash - can re-read verbatim. Agents cannot see each other's reasoning; the ledger is the shared memory.
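
To make the brief concrete, here is a minimal sketch of a helper that follows the four behaviors in the note literally. It is an illustration of the spec, not the code Codex would necessarily write, and it treats only ASCII letters and digits as alphanumeric, which is an assumption the note does not settle.

```typescript
// Minimal sketch of the helper described in the note (illustrative only).
// In the real file this would be exported from src/utils/slugify.ts.
function slugify(text: string): string {
  return text
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, "-") // collapse each run of non-alphanumerics into one hyphen
    .replace(/^-+|-+$/g, ""); // strip leading and trailing hyphens
}

slugify("  Hello, World!  "); // "hello-world"
slugify("!!!"); // ""
```

Because the character class is ASCII-only, accented letters count as separators in this version. Whether that is acceptable is exactly the kind of question the acceptance criteria and the later review should pin down.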

Step 2 - Hand off to Codex

Now Claude passes ownership with the handoff tool. A strong handoff has four parts: a reason, a concrete ask, the context refs, and an implicit who-speaks-next (the handoff tool always sets the next speaker to Codex):

Claude → handoff:
  reason: "Codex is better suited to focused implementation."
  ask: |
    Implement slugify(text) in src/utils/slugify.ts per the note I just appended:
    lowercase, trim, collapse non-alphanumerics to single hyphens, strip
    leading/trailing hyphens. Add src/utils/slugify.test.ts covering empties,
    spaces, punctuation, and unicode. Then run `bun test src` and report results.
  context_refs:
    - "src/utils/slugify.ts"
    - "src/utils/slugify.test.ts"

The daemon records the handoff in the ledger and delivers it live to Codex, which now owns the next turn.

If you are driving Claude interactively, the slash command is the quick equivalent - it wraps the same handoff tool:

/contextrelay:handoff implement slugify(text) in src/utils/slugify.ts plus a unit test, then run bun test src

The slash command sends reason: "Slash command handoff" and uses your text as the ask, so for a richer reason and explicit context_refs, call the handoff tool directly as shown above.

A good handoff gets acted on; a bare one stalls

reason + concrete ask + context_refs + clear next speaker is what makes Codex pick the task up and run. A bare "over to you" forces the other agent to guess the scope, re-derive the acceptance check, or bounce it back - which is exactly the stall you are trying to avoid. Front-load the specifics.
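
One way to internalize that checklist is to treat it as a lint over a draft handoff. The sketch below is hypothetical: HandoffDraft and lintHandoff are names invented for illustration, the fields simply mirror the handoff call shown earlier, and the 20-character floor is an arbitrary stand-in for "concrete", not a ContextRelay rule.

```typescript
// Hypothetical shape mirroring the fields used in this tutorial's handoff call.
interface HandoffDraft {
  reason: string;
  ask: string;
  context_refs: string[];
}

// Returns a list of problems; an empty list means the draft covers the basics.
function lintHandoff(draft: HandoffDraft): string[] {
  const problems: string[] = [];
  if (draft.reason.trim().length === 0) problems.push("missing reason");
  // Arbitrary illustrative threshold: a very short ask is rarely a concrete one.
  if (draft.ask.trim().length < 20) problems.push("ask is too short to be concrete");
  if (draft.context_refs.length === 0) problems.push("no context_refs");
  return problems;
}

// A bare "over to you" trips every check.
const bare = lintHandoff({ reason: "", ask: "over to you", context_refs: [] });
```

The next speaker is not modeled here because, as noted above, the handoff tool sets it implicitly.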

Step 3 - Codex picks it up and implements

On its turn, Codex loads the latest active handoff with read_context. The daemon returns recent ledger entries and surfaces the still-open handoff prominently, so Codex sees Claude's note and ask without you re-pasting anything:

Codex → read_context:
  limit: 40

Codex then does the work in its own session: it writes slugify.ts and slugify.test.ts and runs the suite. When the build is done, it records evidence so Claude can review against facts, not vibes. Codex stores a patch_summary artifact describing what changed, and a test_report (or command_log) for the test run, with record_artifact:

Codex → record_artifact:
  kind: patch_summary
  title: "Add slugify helper + tests"
  summary: |
    New pure slugify(text) in src/utils/slugify.ts; src/utils/slugify.test.ts
    covers empty string, whitespace, punctuation, and unicode input.
  status: passed
  evidence:
    - "src/utils/slugify.ts"
    - "src/utils/slugify.test.ts"

Codex → record_artifact:
  kind: test_report
  title: "bun test src"
  summary: "Full suite green, including 4 new slugify cases."
  status: passed
  evidence:
    - "bun test src - N pass, 0 fail"

The valid artifact kind values are patch_summary, release_gate, test_report, command_log, escalation_suggestion, idle_opportunity, idle_action_result, idle_fleet_result, and idle_evaluation_result. The status field is one of passed, failed, blocked, unknown, skipped, or timed_out.
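
If you script against these values, the two lists translate naturally into string-literal unions. The sketch below assumes nothing about ContextRelay's own source: the constant and function names are invented here, and only the string values come from the lists above.

```typescript
// The string values are the ones listed above; the names are illustrative.
const ARTIFACT_KINDS = [
  "patch_summary",
  "release_gate",
  "test_report",
  "command_log",
  "escalation_suggestion",
  "idle_opportunity",
  "idle_action_result",
  "idle_fleet_result",
  "idle_evaluation_result",
] as const;

const ARTIFACT_STATUSES = [
  "passed",
  "failed",
  "blocked",
  "unknown",
  "skipped",
  "timed_out",
] as const;

type ArtifactKind = (typeof ARTIFACT_KINDS)[number];
type ArtifactStatus = (typeof ARTIFACT_STATUSES)[number];

// Type guards: narrow an arbitrary string to one of the allowed values.
function isArtifactKind(value: string): value is ArtifactKind {
  return (ARTIFACT_KINDS as readonly string[]).indexOf(value) !== -1;
}

function isArtifactStatus(value: string): value is ArtifactStatus {
  return (ARTIFACT_STATUSES as readonly string[]).indexOf(value) !== -1;
}
```

Checking a value with isArtifactKind before calling record_artifact catches a typo such as "test-report" locally instead of at the daemon.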

If Codex's MCP tools are not loaded

Codex normally gets these tools via ctxrelay codex or ctxrelay codex-mcp install. As a fallback, Codex can put a marker at the very start of a message and the daemon will record the equivalent artifact:

[IMPORTANT] CONTEXTRELAY_ARTIFACT:
kind: patch_summary
title: Add slugify helper + tests
summary: New pure slugify(text) + tests; full suite green.
status: passed
evidence:
- src/utils/slugify.ts
- bun test src - N pass, 0 fail
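
To see why the marker has to sit at the very start of the message, it can help to picture a parser for it. The sketch below is purely illustrative: the daemon's real parsing rules are not described on this page, and parseArtifactMarker and ParsedArtifact are hypothetical names.

```typescript
const MARKER = "[IMPORTANT] CONTEXTRELAY_ARTIFACT:";

interface ParsedArtifact {
  fields: Record<string, string>; // kind, title, summary, status, ...
  evidence: string[]; // the "- item" lines
}

// Illustrative parser only; not the daemon's actual implementation.
function parseArtifactMarker(message: string): ParsedArtifact | null {
  // A marker buried mid-message is ignored in this sketch.
  if (message.indexOf(MARKER) !== 0) return null;

  const fields: Record<string, string> = {};
  const evidence: string[] = [];
  for (const rawLine of message.slice(MARKER.length).split("\n")) {
    const line = rawLine.trim();
    if (line === "") continue;
    if (line.indexOf("- ") === 0) {
      evidence.push(line.slice(2).trim());
      continue;
    }
    const colon = line.indexOf(":");
    if (colon === -1) continue; // not a "key: value" line
    const key = line.slice(0, colon).trim();
    if (key !== "evidence") fields[key] = line.slice(colon + 1).trim();
  }
  return { fields, evidence };
}
```

Under these assumptions, a message that opens with any other text returns null, which is the practical reason to keep greetings and commentary after the artifact block.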

Step 4 - Codex reports back, Claude waits

With evidence on the ledger, Codex hands the turn back to Claude using handoff_to_claude - stating what changed, the evidence, and any remaining risk:

Codex → handoff_to_claude:
  reason: "Implementation complete; needs review before finalizing."
  ask: "Review slugify + tests for correctness and edge cases; approve or send findings."
  context_refs:
    - "src/utils/slugify.ts"
    - "src/utils/slugify.test.ts"

For a quick message rather than a full ownership transfer, Codex can use send_to_claude instead; either way the message is recorded and delivered. If Codex wants to block on Claude's answer, handoff_to_claude supports wait_for_reply: true, and wait_for_claude is the explicit follow-up wait.

On the Claude side, the reply usually arrives live (push delivery). If Claude needs to block until Codex responds, it long-polls with wait_for_messages:

Claude → wait_for_messages:
  timeout_s: 30

Step 5 - Claude reviews and decides

Claude now reads the evidence and reasons about correctness. It can pull the artifacts and the open task lane with read_context and task_state:

Claude → read_context:
  limit: 40

Claude → task_state: {}

Claude inspects the patch_summary, opens the changed files, and checks the edge cases against the acceptance criteria from Step 1. Two outcomes:

It looks good. Claude approves and proposes finality with propose_final, citing the evidence:

Claude → propose_final:
  summary: "slugify(text) implemented with unit tests; behavior matches the brief."
  evidence: "Reviewed src/utils/slugify.ts and tests; bun test src green (see test_report)."
  remaining_risk: "none"

Because ContextRelay records finality as a proposal awaiting human sign-off by default (auto-finalize is off), this does not "close" anything on its own - it stages a clean, human-visible decision. See Finality and human sign-off.

It needs work. Claude does not silently fix it - that would defeat the review split. Instead Claude hands back with specific, actionable findings, either via handoff (to return ownership) or reply (to answer in-thread):

Claude → reply:
  text: |
    Two findings before this is done:
    1) slugify("café") should produce "cafe", not "caf" - the unicode strip drops
       the accented char entirely; normalize (NFKD) before stripping.
    2) Add a test for a string of only punctuation ("!!!") → expect "".
    Back to you to fix + re-run bun test src.

Specific findings with file/line context get fixed in one pass; vague "looks off" replies trigger a guessing round trip. The same discipline applies in the risk-review tutorial.
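
For reference, a fix along the lines of finding 1 might look like the sketch below. It assumes the review's suggested approach (NFKD normalization followed by removal of combining marks) and is one plausible revision, not the change Codex would necessarily make.

```typescript
// Sketch of a revised slugify that folds accented Latin letters to ASCII.
function slugify(text: string): string {
  return text
    .normalize("NFKD") // decompose "é" into "e" plus a combining accent
    .replace(/[\u0300-\u036f]/g, "") // drop combining diacritical marks
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of non-alphanumerics
    .replace(/^-+|-+$/g, ""); // strip leading and trailing hyphens
}

slugify("café"); // "cafe"
slugify("!!!"); // ""
```

Letters that have no decomposition under NFKD (for example "ß" or "ø") are still treated as separators here, so a stricter brief would need to say how those should be handled.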

The git boundary - who commits?

You now have a reviewed change in the working tree. Turning it into a commit is a git write, and ContextRelay funnels every git write through a single owner: the coordinator (or the human). This prevents two agents racing to write history.

Only the coordinator commits

A non-coordinator agent must restrict itself to read-only git commands and hand git-sensitive work to the coordinator or the human - it must not branch, commit, merge, or push.

  • If Codex is the coordinator (and runtime permissions plus your repo policy allow it), Codex makes the commit.
  • Otherwise, the change is handed to the coordinator or to you to commit.

The coordinator is configured per project (collaboration.coordinator in .contextrelay/config.json, mirrored in the managed instruction blocks) and can be changed with ctxrelay coordinator [claude|codex|human]. Read the rules in Coordinator and git-write policy.
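
The single-owner rule reduces to a small decision function. The sketch below is a hypothetical model, not ContextRelay's implementation: ProjectConfig covers only the collaboration.coordinator key named above, and mayWriteGit is an invented name.

```typescript
type Agent = "claude" | "codex";
type Coordinator = Agent | "human";

// Hypothetical model of the one config key discussed here.
interface ProjectConfig {
  collaboration: { coordinator: Coordinator };
}

// True only when the asking agent is the configured coordinator.
// This is necessary but not sufficient: runtime permissions and
// repo policy can still forbid the write.
function mayWriteGit(agent: Agent, config: ProjectConfig): boolean {
  return config.collaboration.coordinator === agent;
}

const codexLed: ProjectConfig = { collaboration: { coordinator: "codex" } };
mayWriteGit("codex", codexLed); // true
mayWriteGit("claude", codexLed); // false
```

With the coordinator set to "human", the function returns false for both agents, which matches the rule that the change is then handed to you to commit.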

ContextRelay is also read-only by default in its autonomous modes: backup agents and autonomous edits (act:write) are off unless you explicitly enable them behind layered gates. The handoff loop in this tutorial is normal, attended collaboration - both agents are live and you are watching - so none of those gates apply here.

Recap of the loop

Step                           | Agent                | Tool
-------------------------------+----------------------+------------------------------------------------
Scope + record the brief       | Claude               | append_note
Pass ownership                 | Claude               | handoff (or /contextrelay:handoff)
Load the active handoff        | Codex                | read_context
Record evidence                | Codex                | record_artifact (patch_summary, test_report)
Report back                    | Codex                | handoff_to_claude / send_to_claude
Wait for the reply             | Claude               | wait_for_messages
Review the evidence            | Claude               | read_context, task_state
Approve, or hand back findings | Claude               | propose_final / reply / handoff
Commit                         | Coordinator or human | git (coordinator-owned)

That is the whole pattern. Everything else in ContextRelay - risk reviews, deliberation, idle automation - is a variation on "write the context down, hand it off with a clear ask, report back with evidence, review, and let one owner write history."
