The Witness Neither Party Can Be
Agents began to transact in production this year. The first thing that broke was not the technology. It was the record.
Something quiet happened to commerce this year. Agents began to transact. Not in a demo — in production, at volume. The major payment networks now run programs built for it; the large model providers ship agents that browse, choose, and pay on a person’s behalf. A purchase that once required a human to read a page and click a button now happens between software and a website, with no one watching in the moment it occurs.
The first thing that broke was not the technology. It was the record.
Issuers have started to report that disputes on agent-initiated transactions run well above the rate for comparable purchases a human made directly, and that the mix is different. Fewer claims of fraud. More claims of “I did not authorize this” and “this was not what was described.” The pattern is telling. The trouble is not that the agent was an impostor. The trouble is that a legitimate agent, acting inside its mandate, did something a party later wished it had not, and when the dispute arrived, there was no clean human moment to point to. Earlier this year a federal court took the next step into the unknown, finding that an AI shopping agent may have acted unlawfully on a merchant’s surface even though the user had authorized it. Permission from one side did not settle what happened at the border between them.
For now, these disputes still have a human on at least one end. Someone who can say what they intended, produce a screenshot, describe what they saw. That asymmetry is doing more work than anyone notices. It is the last thing holding the old machinery of resolution together: when one party is a person, there is at least one account anchored to a remembering, intending human being.
Now remove that anchor.
Put an agent on both sides
Picture the interaction one turn further on, which is where it is going. An agent acting for one company and an agent acting for another arrive at the same surface and settle something between them — a price, a quantity, a commitment that binds the principals behind them. It goes through in milliseconds. No human reads it. And then, weeks later, it is disputed.
Each side has a log. The first was written by the system that acted for the first party. The second was written by the system that acted for the second. Each is a faithful account of what its own author believes happened. And each belongs, wholly, to a party with an interest in how the story reads.
This is the structure that should give everyone pause. We have spent a year worrying that a system cannot be trusted to be the sole witness of its own behavior — that an account a model writes about itself attests but does not evidence. An agent-to-agent interaction takes that problem and squares it. Now both witnesses are interested parties. Both records are self-authored. When they disagree — and the disputes are already arriving — there is no neutral version. There is the word of one side’s software against the word of the other’s. Self-attestation was already thin. Self-attestation on both ends of the same crossing is not evidence at all. It is two stories.
The record has to come from somewhere with nothing to gain
The way out is not a better log on either side. A better log is still that side’s log. The way out is a record taken by someone who is not either party — taken at the place the two systems actually meet, which is the surface that receives the behavior.
That surface has a property the two agents do not. It has no stake in the outcome of the dispute. It is simply the ground on which the interaction happened, and it can hold an independent, time-stamped, signed record of the fact of what crossed it — not the intent inside either system, which it cannot see and does not claim to, but the observable event at the border. A record like that is the one thing in the whole arrangement that is not “one party’s word.” It is the witness that neither party can be, precisely because it is neither party.
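To make “independent, time-stamped, signed record” concrete, here is a minimal sketch of what a surface might keep for one crossing. Everything named in it is an assumption for illustration: the key `WITNESS_KEY`, the field names, and the functions `record_event`, `verify_record`, and `matches` do not come from any existing system. It also uses an HMAC from the Python standard library as a stand-in for a signature; a real neutral witness would more plausibly use an asymmetric scheme, so that either party could verify a record without being able to forge one.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone

# Hypothetical secret held only by the neutral surface, never by either agent.
# HMAC is a stand-in here; an asymmetric signature would be the likelier choice.
WITNESS_KEY = b"demo-key-held-by-the-surface-only"


def canonical(body: dict) -> bytes:
    """Serialize deterministically so the same body always yields the same bytes."""
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def record_event(payload: bytes, observed_at: datetime) -> dict:
    """Record the observable event only: a digest of what crossed, and when.

    Nothing about intent is stored, because the surface cannot see it.
    """
    body = {
        "observed_at": observed_at.astimezone(timezone.utc).isoformat(),
        "payload_sha256": hashlib.sha256(payload).hexdigest(),
    }
    signature = hmac.new(WITNESS_KEY, canonical(body), hashlib.sha256).hexdigest()
    return {"body": body, "signature": signature}


def verify_record(record: dict) -> bool:
    """Return True if the record body is unchanged since the surface signed it."""
    expected = hmac.new(
        WITNESS_KEY, canonical(record["body"]), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, record["signature"])


def matches(record: dict, claimed_payload: bytes) -> bool:
    """Check one party's claimed payload against what the surface observed."""
    claimed_digest = hashlib.sha256(claimed_payload).hexdigest()
    return verify_record(record) and record["body"]["payload_sha256"] == claimed_digest


# Illustrative use: each side's log is checked against the same neutral record.
offer = b'{"sku": "A-100", "qty": 500, "unit_price": "12.40"}'
record = record_event(offer, datetime.now(timezone.utc))
buyer_log_agrees = matches(record, offer)
seller_log_agrees = matches(record, offer.replace(b"12.40", b"12.90"))
```

In a dispute, neither log is taken on its own word. Each is compared to the one record that neither party wrote, and the comparison says only whether the claimed bytes are the bytes that crossed.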
This is not a new idea so much as an old one catching up to a new actor. We did not resolve disputes between drivers by asking each of them to submit their own account of the intersection. We put a camera at the intersection — owned by no one in the car. The point was never to see inside either engine. It was to have one neutral record of what happened at the place the two of them met.
A note on what this is and isn’t
The two-agent case is not yet the headline. Today’s disputes still mostly have a human on one side, and that human is doing the quiet work of being the anchor. This note is written now, while that is still true, for two reasons: the anchor is being removed in plain sight, and the field tends to build the record only after it needs it. A record cannot be backfilled. The interaction that is disputed next year is happening, unrecorded, this year.
It is also worth being exact about the limits. A record at the border can establish what crossed and when; it cannot read the mind of a system on either side, and it cannot, on its own, name an actor who arrived determined not to be named. It evidences the event, not the intent, and not always the author. That is less than people will want, and it is still more than the alternative — which, for an interaction between two interested machines, is nothing that anyone neutral can stand behind.
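The claim that a border record can establish “what crossed and when” holds only if entries cannot be quietly added or altered later. One common way to make such edits detectable is a hash chain, sketched below under stated assumptions: the names `GENESIS`, `entry_hash`, `append`, and `first_broken_link` are hypothetical, and the chain is tamper-evident only if its latest hash is also anchored somewhere the record keeper cannot rewrite, a step this sketch leaves out.

```python
import hashlib
import json
from typing import List, Optional

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, observed_at: str, payload_sha256: str) -> str:
    """Bind an entry to its predecessor by hashing both together."""
    material = json.dumps([prev_hash, observed_at, payload_sha256]).encode("utf-8")
    return hashlib.sha256(material).hexdigest()


def append(chain: List[dict], observed_at: str, payload: bytes) -> None:
    """Append a record of what crossed, linked to the entry before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    payload_sha256 = hashlib.sha256(payload).hexdigest()
    chain.append({
        "observed_at": observed_at,
        "payload_sha256": payload_sha256,
        "prev_hash": prev_hash,
        "hash": entry_hash(prev_hash, observed_at, payload_sha256),
    })


def first_broken_link(chain: List[dict]) -> Optional[int]:
    """Return the index of the first entry that fails verification, else None."""
    prev_hash = GENESIS
    for index, entry in enumerate(chain):
        recomputed = entry_hash(
            prev_hash, entry["observed_at"], entry["payload_sha256"]
        )
        if entry["prev_hash"] != prev_hash or entry["hash"] != recomputed:
            return index
        prev_hash = entry["hash"]
    return None
```

An entry inserted after the fact can be made internally consistent, but the entry that originally followed still points at the old predecessor, so verification fails at that position. The chain says nothing about intent or authorship; it only makes the sequence of observed events hard to rearrange unnoticed.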
The foundational work on multi-agent risk, written by researchers across the major labs and universities, already lists the problem of commitments between agents among the field’s open questions, and names the practical infrastructure for trust between them as unsolved. This is a note about one piece of that infrastructure: the part that has no opinion about who should win.
BotConduct — independent behavioral observatory. Evidence, not enforcement.