Harness Engineering

The engineering of the wrapper around the model: prompt structure, tool design, hooks, MCP, context windows, eval loops, and the feedback systems that make agents reliable.

#harness-engineering #mcp

Reading

Building Claude Code with Boris Cherny
The Pragmatic Engineer · Apr 2026 · Podcast · 1 note
Harness engineering for coding agent users
Birgitta Böckeler (martinfowler.com) · Apr 2026 · Article
Skill Issue: Harness Engineering for Coding Agents
HumanLayer · Mar 2026 · Article
Harness Engineering for AI Coding Agents: Constraints That Ship Reliable Code
Augment Code · Mar 2026 · Article
Mitchell Hashimoto’s new way of writing code
The Pragmatic Engineer · Feb 2026 · Podcast · 10 notes
Harness engineering: leveraging Codex in an agent-first world
OpenAI · Feb 2026 · Article
My AI Adoption Journey
Mitchell Hashimoto · Feb 2026 · Article
Introducing Agent Skills
Anthropic · Oct 2025 · Article
Effective context engineering for AI agents
Anthropic · Sep 2025 · Article · 1 note · derived
Software 3.0: Software in the Age of AI
Andrej Karpathy · Jun 2025 · Article
Don't Build Multi-Agents
Walden Yan / Cognition · Jun 2025 · Article · 1 note
Building effective agents
Anthropic (Erik Schluntz, Barry Zhang) · Dec 2024 · Article
Introducing the Model Context Protocol
Anthropic · Nov 2024 · Article
Your AI Product Needs Evals
Hamel Husain · Apr 2024 · Article
LLM Powered Autonomous Agents
Lilian Weng · Jun 2023 · Article
Toolformer: Language Models Can Teach Themselves to Use Tools
Timo Schick et al. · Feb 2023 · Paper
ReAct: Synergizing Reasoning and Acting in Language Models
Shunyu Yao et al. · Oct 2022 · Paper

Output

Hardening AI Agents Against the 'Lethal Trifecta'

Mar 24, 2026

Personal AI assistants like Openclaw are fantastically powerful, and quite dangerous. Here's how to harden a personal assistant without making it useless.

#AI #security #prompt-injection #agents #mcp

Synthesis

A working notebook on harness engineering: the discipline of building the wrapper around the model rather than the model itself. The argument I’m tracking is that the harness sets the productivity ceiling more than the underlying weights do. Mitchell Hashimoto crystallised the term in February 2026; within a week OpenAI and Anthropic had published their own treatments, and within two months Martin Fowler’s site had a full-length canonical article on it. The pattern matters: a vocabulary moved from one practitioner’s habit to industry consensus in eight weeks.

Threads to follow:

Agent = Model + Harness. The simplest formulation, from Hashimoto. Most discussion of “AI productivity” is really discussion of harness quality.
Context as substrate. The shift from prompt engineering (one-shot wording) to context engineering (the whole information environment the agent operates inside).
MCP as the neutral protocol. Tools were the bottleneck; MCP made them composable.
Single agent vs multi-agent. Cognition’s “Don’t Build Multi-Agents” paired with Anthropic’s research-system writeup is the cleanest disagreement in the field: same data, opposite conclusions.
Evals as steering. Hamel Husain’s argument that without evals you cannot drive the system, only watch it move.
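
A minimal sketch of the "Agent = Model + Harness" thread, to make the split concrete. Everything here is hypothetical and written for illustration: the `Harness` class, the `Action` dict shape, and `stub_model` are assumptions, not taken from Hashimoto's article or any real agent framework. The model is reduced to a function from context to next action; the harness owns the tools, the context, and the stopping rule.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical shapes for illustration only.
# An action is either {"tool": name, "arg": text} or {"final": answer}.
Action = dict
Model = Callable[[list[str]], Action]


@dataclass
class Harness:
    """Everything around the model: tools, context, and the control loop."""
    tools: dict[str, Callable[[str], str]]
    max_steps: int = 5
    context: list[str] = field(default_factory=list)

    def run(self, model: Model, task: str) -> str:
        self.context = [f"task: {task}"]
        for _ in range(self.max_steps):
            action = model(self.context)
            if "final" in action:
                return action["final"]
            tool = self.tools.get(action["tool"])
            if tool is None:
                # Feed the failure back instead of crashing, so the model can recover.
                observation = f"error: unknown tool {action['tool']!r}"
            else:
                observation = tool(action["arg"])
            self.context.append(f"{action['tool']}({action['arg']}) -> {observation}")
        # The harness, not the model, decides when to give up.
        return "stopped: step limit reached"


def stub_model(context: list[str]) -> Action:
    """Stand-in for a real model: call one tool, then answer from its observation."""
    if len(context) == 1:
        return {"tool": "word_count", "arg": "the harness shapes the outcome"}
    return {"final": context[-1].split(" -> ")[-1]}


harness = Harness(tools={"word_count": lambda text: str(len(text.split()))})
result = harness.run(stub_model, "count the words")
print(result)
```

Swapping `stub_model` for a different callable changes nothing else in the loop, which is the point of the formulation: the tool set, the error feedback, and the step limit are harness decisions and can be tuned independently of the weights.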
