Before you let AI agents loose, you'd better know what they're capable of

Charles Humble's piece in The New Stack doesn't bury the lede: if you're deploying AI agents in production, you need to understand what they can do before you find out the hard way.

The timing is relevant. Companies that spent the last two years experimenting with LLMs are now running multi-step agentic systems — workflows that take real actions, make real API calls, and can cause real damage if something goes sideways. Humble's argument is that capability assessment should happen before deployment, not after the first incident.

He identifies a few failure modes worth taking seriously. Prompt injection is the most insidious: because agents are designed to consume external input and act on it, a piece of malicious content in the environment can redirect an agent's behavior entirely. Runaway tool use — where an agent triggers unintended side effects through otherwise legitimate API calls — is harder to detect because nothing is technically broken. And in multi-agent pipelines, errors don't stay contained; they propagate downstream.

These aren't theoretical concerns, and the article points to a vendor landscape that's built around them. Galileo offers observability and evaluation tooling for monitoring agent behavior in production — flagging hallucinations, unusual tool calls, and performance degradation. SurePath AI focuses on the governance side, with policy enforcement frameworks that constrain what agents can do at runtime. Humble also mentions NanoClaw, a tool for adversarially testing agent systems — red-teaming, essentially, but purpose-built for agentic AI.

The pattern across all three vendors is the same: organizations need infrastructure to watch, constrain, and stress-test their agents before trusting them with anything consequential. That's not a novel idea in software engineering. It's one the AI industry is still catching up to.