Research Note · May 2026 · Vol. I №13

Three documents, one gap

On what the week of May 22, 2026 made visible.

Filed by the BotConduct Observatory Desk · May 2026


Within the span of four days, three institutional documents were released. Each was authored by a separate body, addressed a separate audience, and described what appears at first reading to be a separate phenomenon. Read together, they describe the same one.

On Friday, May 22, the Office of Management and Budget published Memorandum M-26-14, rescinding M-21-31 and replacing the federal logging framework with a new, risk-based architecture organized around two objectives: Continuous Event Monitoring and Threat Hunting, Investigation, Response, and Forensics. The memorandum explicitly names artificial intelligence as both a threat vector and a defensive capability, and directs CISA to publish, within 90 days, a Logging Reference Architecture addressing, among other topics, “methods of using AI technologies for enhancing CEM and THIRF capabilities.”

That same week, Cisco released AI Impact on Wide Area Networks: Cisco 2026 Report. The report, based on measurements taken across live service provider networks using Cisco’s own observation infrastructure, documents that a single agentic AI task generates approximately 450% more network traffic than the same task performed by a human, that roughly 70% of that incremental traffic is inference communication between the agent and its underlying model, and that approximately 9% of AI inference flows carry more data upstream than downstream — compared with about 0.5% of conventional web traffic. The report concludes that AI inference paths “will become strategic network assets, requiring high levels of resilience, observability, and differentiated treatment.”
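The proportions quoted above are easier to hold in mind once worked through. The sketch below applies the report's approximate percentages to a baseline of 100 traffic units for the human-performed task. That baseline is an assumption chosen for illustration, not a figure from the report, and the function name is hypothetical.

```python
def agentic_traffic_breakdown(human_units: float) -> dict:
    """Apply the report's approximate percentages to a human-task baseline."""
    incremental = human_units * 4.50   # "450% more" than the human task
    total = human_units + incremental  # so the agentic task is 5.5x the baseline
    inference = incremental * 0.70     # roughly 70% of the increment is inference
    return {
        "total": total,
        "incremental": incremental,
        "inference": inference,
        "other_incremental": incremental - inference,
    }

# Assumed baseline of 100 units, for illustration only.
breakdown = agentic_traffic_breakdown(100.0)
print(breakdown)
```

On these assumptions, "450% more" means the agentic task produces five and a half times the baseline, and inference traffic alone is a little over three times the whole of the human task's traffic.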

On Sunday, May 24, Socket researchers documented an active supply chain campaign they named TrapDoor: 34 malicious packages, spanning more than 384 versions, distributed across npm, PyPI, and Crates.io, beginning with eth-security-auditor@0.1.0 on PyPI at 20:20:18 UTC on May 22. Socket characterizes the campaign as explicitly targeting developers in crypto, DeFi, Solana, and AI communities. The primary payload is a credential and crypto-wallet stealer that exfiltrates SSH keys, AWS credentials, GitHub tokens, browser stores, and wallet keystores across Coinbase, MetaMask, Solana, Sui, and Aptos.

One dimension of the campaign, however, operates at a different layer. In parallel with the package payloads, the attacker — operating through a GitHub account tracked as ddjidd564 — submitted pull requests to prominent open-source AI projects, including LangChain, MetaGPT, and OpenHands, containing modified .cursorrules and CLAUDE.md files. Those configuration files carry instructions encoded with zero-width Unicode characters, designed to be read and acted upon by legitimate AI coding assistants during subsequent developer sessions. The package binary is malicious in the conventional sense. The configuration file vector is not. The agent reading the poisoned .cursorrules is unchanged. Its TLS fingerprint is unchanged. Its signature is valid. What changes is what the agent has been instructed to do once a developer opens the project.
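To make the configuration-file vector concrete, the following is a minimal sketch of how a reviewer might surface invisible characters in a file such as .cursorrules or CLAUDE.md before an assistant reads it. It flags every character in Unicode general category Cf (format characters), which includes the zero-width space and zero-width joiner. The function name and the sample text are hypothetical, and the check is an assumption about what a useful first pass looks like, not a reproduction of Socket's detection logic.

```python
import unicodedata

def find_invisible_chars(text: str) -> list[tuple[int, int, str]]:
    """Return (line, column, character name) for each Unicode format character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            # Category "Cf" covers zero-width and other invisible format characters.
            if unicodedata.category(ch) == "Cf":
                name = unicodedata.name(ch, f"U+{ord(ch):04X}")
                hits.append((line_no, col, name))
    return hits

# Hypothetical rules file: the second line hides two zero-width characters.
sample = "Use tabs for indentation.\nPrefer\u200b small\u200d functions.\n"
hits = find_invisible_chars(sample)
for line_no, col, name in hits:
    print(f"line {line_no}, col {col}: {name}")
```

A check of this kind sees only the file. It will also flag legitimate uses of format characters, such as joiners inside emoji sequences, so its output is a prompt for review and not a verdict.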

Three documents. Three audiences. Three vantage points. One phenomenon.

What each document sees

The OMB memorandum sees the federal agency. Its concern is operational: what logs to keep, for how long, in what state of retrievability, to enable response after an incident. Its implicit assumption is that there will be incidents involving AI-accelerated attacks against federal systems, and that the agencies that figure out how to operationalize their logging before such incidents will fare materially better than those that do not. The memorandum names AI in the threat description; it does not yet define what AI-aware logging looks like in practice. That definition is delegated to CISA’s forthcoming Reference Architecture.

The Cisco report sees the wire. Its concern is structural: how the shape of network traffic is changing as agentic AI moves from experimentation to production. Its measurements describe a traffic regime that is, in three independent dimensions (volume, symmetry, persistence), qualitatively different from the human-driven web traffic the existing infrastructure was optimized for. The report names observability among the properties that AI inference paths now require. It does not specify what receiver-side observability of agents looks like at the application layer, because that is not the report’s vantage point.

The Socket disclosure sees the package. Its concern is forensic: who published what, when, with what payload, and what indicators of compromise propagate from that publication. Within its frame, the disclosure is complete and technically rigorous. What it cannot describe — because the vantage point does not admit it — is what happens after the malicious package has been removed, the developer has updated dependencies, and the injected .cursorrules file persists in a project repository, waiting to be read by the next AI coding session. The persistence of the attack lives outside the package — in the codebase itself, propagated through legitimate pull requests to legitimate projects, indistinguishable at the GitHub API level from any other developer contribution.

None of the three vantage points is wrong. Each captures what it can see from where it stands. The phenomenon they share — the one that none of them, alone, makes legible — is what happens between the wire and the package, in the behavior of the individual agent as it operates along the flow.

The shape of the gap

The gap can be stated precisely.

Federal logging guidance, in its new form, directs agencies to collect logs that support, among other activities, “monitoring, detecting, and hunting for anomalous system or user activity” and “determining the attack vector(s) of a cybersecurity attack, including any associated with initial access as well as lateral movement.” These are activities that, in a world where the relevant actor is a human user or a deterministic process, are well-served by existing logging primitives: who logged in, what command was issued, what file was touched, what network connection was made.

In a world where the relevant actor is an AI agent, those primitives become insufficient. The user identity is the developer’s. The process identity is node, python, cursor, claude. The file accesses are paths the agent routinely reads. The network connections are to model providers the agent routinely calls. Each individual log entry, by itself, describes a normal operation. The anomaly, when it exists, is not in any single entry. It is in the sequence, compared against a baseline of what that particular agent instance, in similar context, has done before. That comparison requires three properties that conventional logging does not natively provide: per-instance identity that survives across sessions, cohort-relative reasoning, and signed observation records that persist independently of the agent’s own execution environment.
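The claim that the anomaly lives in the sequence and not in any single entry can be illustrated with a small sketch. It builds a per-instance baseline from the ordered pairs of actions seen in earlier sessions, then scores a new session by the fraction of its pairs the baseline has never recorded. The action labels, function names, and the pair-based score itself are assumptions chosen for brevity; a deployed system would presumably use a richer model.

```python
from collections import Counter
from typing import Iterable, Sequence

def action_pairs(actions: Sequence[str]) -> list[tuple[str, str]]:
    """Consecutive (previous, next) action pairs within one session."""
    return list(zip(actions, actions[1:]))

def build_baseline(sessions: Iterable[Sequence[str]]) -> Counter:
    """Count every action pair this agent instance has produced before."""
    baseline = Counter()
    for session in sessions:
        baseline.update(action_pairs(session))
    return baseline

def novelty_score(session: Sequence[str], baseline: Counter) -> float:
    """Fraction of the session's action pairs absent from the baseline."""
    pairs = action_pairs(session)
    if not pairs:
        return 0.0
    unseen = sum(1 for pair in pairs if pair not in baseline)
    return unseen / len(pairs)

# Hypothetical history for one agent instance.
history = [
    ["read:config", "call:model", "write:src"],
    ["read:credentials", "net:post"],
]
baseline = build_baseline(history)

routine = ["read:config", "call:model", "write:src"]
suspect = ["read:config", "read:credentials", "call:model", "net:post"]

print(novelty_score(routine, baseline))  # 0.0
print(novelty_score(suspect, baseline))  # 1.0
```

Every action in the suspect session appears somewhere in the instance's history, so an entry-by-entry review finds nothing unusual. Only the ordering is new, and only a record keyed to the instance across sessions can show that.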

The Cisco report observes the same gap from the network side. Inference paths now last twice as long as conventional web transactions, and roughly 9% of them carry more data upstream than downstream, but the report cannot, from the wire, distinguish an agent acting on a developer’s own instructions from the same agent acting, still nominally on the developer’s behalf, under instructions the developer did not write. The flow signature is identical in both cases.

The Socket disclosure makes the gap concrete. TrapDoor is not unusual because it injects instructions into AI assistants. It is unusual because the act of injecting those instructions, on its own, leaves no detectable trace in either the federal logging framework or the network telemetry. The attack persists in the gap between the two — propagated through legitimate developer workflows, into the configuration files that agents read before they begin work.

What “AI-aware logging” needs to mean

CISA has 90 days to define, in its Reference Architecture, what AI-aware logging looks like in practice for federal agencies. The published base requirements for that architecture include a discussion of “methods of using AI technologies for enhancing CEM and THIRF capabilities.” A narrow reading would interpret this as the application of AI tooling to the existing logging stack: anomaly detection models running on conventional log streams. A broader reading would interpret it as the inclusion of AI agents themselves as a category of observable entity, with logging primitives appropriate to that category.

The narrow reading is necessary but insufficient. Conventional log streams, no matter how well-modeled, do not contain the information needed to detect the case TrapDoor demonstrates. The agent did not do anything the log stream records as unusual, because the log stream records process-level events, not agent-level reasoning. The poisoned configuration file is not a CVE. The pull request that introduced it is not a privilege escalation. The agent reading it is not malware.

The broader reading is what the gap requires. It implies a new class of observation record, distinct from process logs, network flows, and authentication events, that captures the agent as an entity:

Per-instance identity over time. The unit of observation is not “Claude Code sessions” in aggregate but a specific agent instance, indexed in a way that survives across sessions and across the developer’s working week.

Behavioral signature, cohort-relative. The signal is the divergence of a particular instance from its own historical baseline and from the cohort of similar agents operating in similar contexts. Absolute logs are insufficient; relative reasoning is the operational layer.

Cross-property recurrence. A compromised agent operating across multiple repositories, multiple endpoints, multiple infrastructure boundaries leaves a pattern that no single property records. The unit of observation is the agent, not the site.

Cryptographic attestation of the observation. The observation, signed by the receiver at the moment of occurrence, becomes a forensic artifact that survives the session and the eventual remediation. The observation is not a log entry the agent or its host can rewrite after the fact.
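The four properties above can be sketched as a single observation record. The example below scores one instance against its cohort with a standard score, then signs the record with a key held by the receiver. All field names, the key, and the sample values are hypothetical. HMAC-SHA256 is used only because it is available in the Python standard library; it is a symmetric scheme, and an implementation that needs third-party verification would more plausibly use an asymmetric signature. Nothing here is drawn from any of the three source documents.

```python
import hashlib
import hmac
import json
import statistics

def cohort_zscore(value: float, cohort_values: list[float]) -> float:
    """Standard score of one instance's metric against its cohort."""
    mean = statistics.mean(cohort_values)
    spread = statistics.pstdev(cohort_values)
    return 0.0 if spread == 0 else (value - mean) / spread

def _canonical(record: dict) -> bytes:
    # Stable serialization so signer and verifier hash identical bytes.
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")

def sign_observation(record: dict, receiver_key: bytes) -> dict:
    """Attach a receiver-side HMAC tag to an observation record."""
    tag = hmac.new(receiver_key, _canonical(record), hashlib.sha256).hexdigest()
    return {"record": record, "sig": tag}

def verify_observation(signed: dict, receiver_key: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = hmac.new(receiver_key, _canonical(signed["record"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed["sig"])

receiver_key = b"placeholder-key-held-by-the-receiver"  # hypothetical key
record = {
    "agent_instance": "instance-0001",     # per-instance identity
    "property": "repo-a",                  # where the agent was observed
    "observed_at": "2026-05-24T12:00:00Z",
    "cohort_z": round(cohort_zscore(0.9, [0.0, 0.1, 0.0, 0.1]), 2),
}
signed = sign_observation(record, receiver_key)

# A record altered after the fact no longer matches its signature.
tampered = {"record": {**record, "cohort_z": 0.0}, "sig": signed["sig"]}
print(verify_observation(signed, receiver_key))    # True
print(verify_observation(tampered, receiver_key))  # False
```

Because the key stays with the receiver, neither the agent nor its host can rewrite the record and have it still verify. Grouping signed records by agent_instance across different property values is what would make cross-property recurrence visible.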

None of these requirements is speculative. Each is being implemented today in deployed observation infrastructure outside the federal agency context. The contribution this note seeks to make is not to specify how they should be built but to argue that they constitute, collectively, the layer the three documents converge on.

Why the convergence matters

The convergence is not coincidence and it is not trend. Three institutions — the federal executive, a network infrastructure provider, and a supply chain security researcher — arrived in the same week at the same boundary, from incompatible directions, with no coordination among them. Each named what it could see. None named the whole.

This is the empirical condition under which categories form. A phenomenon that is visible to three independent vantage points, none of which sees all of it, is a phenomenon for which a name does not yet exist in the working vocabulary of the practitioners who will eventually have to manage it. The federal logging architecture, the network observability stack, and the supply chain detection layer are each, in their current form, well-developed. The receiver-side observation of AI agents as longitudinal entities, with per-instance behavior compared cohort-relative and attested cryptographically, is not yet a developed practice. It is the thing the three documents make visible by virtue of describing its absence.

As argued in an earlier note in this series, observability precedes modeling. The federal agencies, network operators, and security teams that begin accumulating per-agent observation records now will, when their respective TrapDoor-equivalents arrive, have something to consult. Those that do not — those that treat the three documents as separate problems requiring separate responses — will be solving the visible parts of an invisible whole.

The 90-day window between the publication of M-26-14 and the release of CISA’s Logging Reference Architecture is the period during which the architecture of federal AI-aware logging will be specified. Whatever vocabulary the Reference Architecture adopts will propagate, through procurement cycles and through the regulatory mimesis that typically follows federal cybersecurity guidance, to enterprise security teams, to financial sector regulators, and, on a 12- to 24-month delay, to the LATAM and European jurisdictions that take their cues from the United States framework. The vocabulary that gets written into the Reference Architecture is the vocabulary that will define the category for the rest of this decade.

The work of receiver-side AI agent observation will be done either inside that vocabulary or outside of it. Doing it inside requires that the vocabulary admit it.

The receiver sees what arrived. The architecture either accommodates that observation or excludes it. The next 90 days decide which.


Methodological note

This research note draws on three publicly available documents, each independently verifiable: Office of Management and Budget Memorandum M-26-14, published May 22, 2026; Cisco’s AI Impact on Wide Area Networks: Cisco 2026 Report, published the week of May 22, 2026; and Socket Research Team’s disclosure of the TrapDoor supply chain campaign, published May 24, 2026 and updated through May 25. No claims are made beyond what is established by the cited sources. References to capabilities of receiver-side agent observation are stated in generic form and do not constitute endorsement of any specific implementation. This note does not attribute intent to any AI agent, model provider, vendor, or framework maintainer.

References

Office of Management and Budget. Memorandum M-26-14: Ensuring Effective and Efficient Agency Logging and Network Visibility to Defend Against Evolving Cyber Threats. May 22, 2026.

Cisco. AI Impact on Wide Area Networks: Cisco 2026 Report. May 2026.

Socket Research Team. TrapDoor Crypto Stealer Supply Chain Attack Hits 34 Packages and Hundreds of Versions Across npm, PyPI, and Crates.io. May 24, 2026.

The Second Standard. Research Note, Vol. I №11.

Synthetic Contribution Overload. Research Note, Vol. I №12.


This research note is published under the BotConduct Standard. Companion documentation, methodology overviews, and verification bundles are available at botconduct.org/research.
