Skip to content
PKM Blog
Go back

Agent Harnesses: A Standard for Structuring Agentic Systems

Edit page .md

The word “harness” is everywhere in AI agent development — but reputable sources define it differently. A new open standard proposes to fix that with a practical, opinionated definition and a concrete directory convention.

What Is a Harness? (The Definitional Problem)

Definition 1 — The agent wrapper:

agent = model + harnessLangChain

Under this view, a harness is the scaffolding that turns a raw LLM into an agent: tools, memory, a prompt loop.

Definition 2 — The task wrapper:

We developed a two-fold solution to enable the Claude Agent SDK to work effectively across many context windows: an initializer agent that sets up the environment on the first run, and a coding agent that is tasked with making incremental progress in every session.Anthropic Engineering

Under this view, a harness wraps an already-agentic system and applies it to a specific repeatable role.

The Agent Harnesses project argues for the second definition:

A harness is information and tools that allow a general purpose agentic system to do specific, complex tasks in a repeatable and maintainable manner.

The reasoning: “agent” is already a well-defined concept. Redefining it as model + harness is redundant. The real unsolved problem is how to take a general-purpose agent (like Claude Code) and reliably constrain it to a specialized role across sessions.


Why a Standard Is Needed

Ad-hoc approaches to structuring agent context — LLM wikis, loose CLAUDE.md files, scattered documentation — run into predictable problems at scale:

  1. Slow initialization — the agent spends too long understanding its environment before doing useful work
  2. Context blindness — the agent skips documentation entirely and works without grounding
  3. Wiki entropy — when an LLM maintains its own wiki, it forgets where it wrote things, producing duplicate and contradictory content
  4. Configuration drift — adjusting agent behaviour in a large unstructured wiki requires finding and updating content in many places

For Claude specifically, there’s an additional structural constraint: skills must live in a flat .claude directory and can’t be nested within a larger harness directory. This makes it hard to package a role’s prompts and skills together.


The Agent Harnesses Standard

The standard is a directory convention for structuring harnesses. The key idea is structured progressive disclosure: the agent loads a minimal entry point, reads routing files to navigate, and only loads detailed content when a task actually requires it.

Directory Structure

my-harness/
  ├── HARNESS.md          ← entry point, loaded every session
  ├── tools/
  │   ├── TOOLS.md        ← routing file for the tools/ subtree
  │   └── query-db/
  └── data/
      ├── DATA.md         ← routing file for the data/ subtree
      └── schema.md

HARNESS.md is the only required file. It uses a short YAML frontmatter block (name, description) followed by a brief markdown body: what the agent’s role is, and where to look for capabilities and context. It’s kept minimal because it’s loaded on every session start.

Routing files (TOOLS.md, DATA.md, etc.) are named after their parent directory in all-caps. They tell the agent what lives in that subtree without requiring it to scan every file. This convention propagates recursively through the directory tree.

Leaf directories prevent the agent from recursing into implementation details (skill internals, large content stores). Two mechanisms:

Information Flow

Session start

Load HARNESS.md  (always)

Task arrives

Read routing files to locate relevant context

Load specific files only when needed

This mirrors how Claude Code’s own skills standard (SKILL.md) works — the harnesses standard is designed in direct inspiration from it, extending the same progressive-disclosure principle to the full scope of a role.


Using the Standard with Claude Code

The project ships a CLI tool:

pip install agentharnesses-cli

This installs ahar (Agent Harnesses). To initialize a directory:

ahar init .

The default target is Claude Code. This:

  1. Creates the HARNESS.md entry point
  2. Installs the Agent Harnesses “meta skill” into .claude/, which teaches Claude how to navigate the harness structure

Once initialized:

claude
# then: "load the harness"

Claude reads HARNESS.md, understands its role, and navigates the directory structure using routing files when it needs deeper context.


The Broader Debate

The underlying tension is that the AI agent tooling ecosystem is still converging on vocabulary. “Harness,” “agent,” “scaffold,” and “framework” are used interchangeably across blog posts, SDKs, and papers — making it hard to reason about what a tool actually does without reading the implementation.

The Agent Harnesses project is an attempt to claim and stabilize one of those terms with a practical, opinionated definition. Whether the definition wins adoption depends on how well the standard solves the real problems it targets: initialization speed, context discipline, and role maintainability.


Resources


Edit page
Share this post:

Previous Post
Hermes Agent and the Search Provider Attack Surface
Next Post
AI Agent Security: The Lethal Trifecta and the Rule of Two