A Bot Spent 17 Days Studying My Site Before Attacking. Every WAF Would Have Cleared It as Legitimate.

What a single observed actor reveals about reconnaissance in the agentic era

Published April 28, 2026 · BotConduct Observatory


On the morning of April 27, 2026, my behavioral observatory flagged an actor on my site with the highest sustained behavioral score I had observed in 19 days of operation. Memory score 70. Susceptibility 53. Both well above any other actor in the dataset. The combination is what mattered: an actor that cared about the site (high memory across consecutive days) and was probing for weaknesses (high susceptibility to lures). I assumed it was a bug.

It wasn't. The system was right. The actor had been visiting my site for 17 consecutive days, methodically, without pause, from 20+ different cloud providers and ISPs across four continents. Every Web Application Firewall I have ever worked with would have classified it as a legitimate user. By the time I noticed, the actor was actively trying to extract credentials from my API.

This post documents what those 17 days looked like from the receiving site's perspective. The pattern is more important than the actor itself, because it represents a category of behavior that current detection tooling cannot see.


The opening: looking like a normal browser

The actor presented itself as Python Requests originating from Hetzner. That alone tells you almost nothing. Python Requests is the most common HTTP library for any non-browser interaction with the web. Hetzner is a perfectly reputable European cloud provider. Tens of thousands of legitimate scripts hit any public site every day with that exact signature.

What made this one different wasn't its identity. It was its persistence and its trajectory.

Days 1–4 (April 11–14): the doormat phase, with one anomaly.
The actor mostly fetched the home page. Once or twice per day. Different times, no obvious pattern. If you were looking at logs, you'd see a few harmless GETs from various cloud IPs and move on. The behavioral score was rising silently because the system was tracking consecutive days of activity, not volume.

But within the first two days, embedded among the innocent home page requests, was a single probe to /.git/HEAD. That path doesn't exist on the site. It never has. It's not in any sitemap. It's not linked from anywhere. The probe was buried under reading activity that looked harmless in isolation, and a standard log review would have classified the day as "normal user."

/.git/HEAD is a probe for an exposed Git repository. If a site has its .git directory accessible, an attacker can clone the entire codebase and history, including any credentials accidentally committed. A reader does not guess at /.git/HEAD. This was the first signal of intent that went beyond reading — placed early, on purpose, when no defender would be looking.
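Flagging a probe like this in isolation is trivial; the hard part, as the rest of this post shows, is correlation. A minimal sketch of the isolated check (the path list and helper name are illustrative, not the observatory's actual ruleset):

```python
# Sketch: flag requests for sensitive paths that no human reader would
# guess at. Path list and function name are illustrative, not the
# observatory's actual rules.
SENSITIVE_PATHS = {
    "/.git/HEAD",            # exposed Git repository: full source + history
    "/.env",                 # application secrets
    "/wp-json/wp/v2/posts",  # unauthenticated WordPress API
}

def is_probe(path: str) -> bool:
    """True if the requested path matches a known reconnaissance probe."""
    return path in SENSITIVE_PATHS

# One probe buried in a day of apparently harmless reading:
day_one = ["/", "/blog", "/.git/HEAD", "/", "/blog"]
probes = [p for p in day_one if is_probe(p)]  # -> ["/.git/HEAD"]
```

The lookup is not the problem. Noticing that a single hit in a 17-day stream belongs to the same actor as all the others is.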

Day 5 (April 15): content discovery.
The actor started reading more systematically. It found the blog. It found two technical reports the site had published earlier in the month. It found the leaderboard. From the outside, this looked like a researcher reading the publicly indexed content.

Day 6 (April 16): the second probe.
A request to /RECORDINGS/ORIG/. Another path that doesn't exist on the site, never has, isn't linked anywhere. The actor was guessing at what might be there based on what similar sites typically expose. Two probes for non-existent paths in five days, separated by reading activity that masked them.

Days 7–8 (April 17–18): systematic content consumption + WordPress probe.
The actor read the second technical report, the leaderboard again, and the sensor documentation. In the same window, it probed /wp-json/wp/v2/posts — an unprotected WordPress JSON API endpoint. The site doesn't run WordPress. But the probe is part of the actor's standard reconnaissance playbook.


Mid-investigation: discovering the system

Day 9 (April 19): retest.
The actor returned to /.git/HEAD and probed a new path (/3/) that doesn't exist. It was checking whether anything had changed since the earlier probes. Establishing a baseline.

Day 10 (April 20): reading the blog post on agent evaluation.
The actor specifically requested /blog/two-generations-of-agent-evaluation. Of all the content on the site, it picked the most technical post: the one that describes the methodology behind the observatory. This is no longer generic reconnaissance. This is studying the defender.

Day 11 (April 21): finding the playground.
The actor probed /playground/ and found the Training Center. The Training Center is a non-trivial part of the site to discover. It is not heavily promoted. The actor was systematically mapping every component of the site that could be operationally relevant.

Days 12–13: silent days.
No activity. The actor disappeared for two days. In hindsight, this is when planning happens. Or rotation of operators. Or both.

Day 14 (April 24): looking for the SDK and a known RAT.
Two requests that escalated everything: /sdk/bcs.py and /systembc/.

The first is a probe for the site's SDK file — a Python module that would expose the internal API surface. The second is interesting in a different way. SystemBC is a known Remote Access Trojan family, used in financially motivated intrusions for years. Probing for /systembc/ on a third-party site doesn't deploy SystemBC. But it tells me the actor's playbook includes patterns associated with offensive operators, not researchers.

Day 15 (April 25): the credential probe.
A request to /.env. The .env file is where applications typically store database credentials, API keys, and secrets. If exposed, it is the keys to the kingdom. The actor wasn't going to find one — but it had to check.
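To make the stakes concrete, a typical .env holds material like the following (every value here is a placeholder):

```
DATABASE_URL=postgres://admin:PLACEHOLDER@db.internal:5432/prod
STRIPE_SECRET_KEY=sk_live_PLACEHOLDER
AWS_SECRET_ACCESS_KEY=PLACEHOLDER
JWT_SIGNING_SECRET=PLACEHOLDER
```

Any one of these lines, exposed, is a direct path to data or money, which is why /.env appears in every offensive playbook.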


The escalation: April 26–28

Day 16 (April 26): 190 requests.
The actor moved from probing to attacking. 190 requests in a single day, focused on /api/report-hit and /.env. It tried variations. It tried different headers. It tried at different times. It was looking for any window where the endpoint would respond differently.

This was a fundamentally different actor from the one I had seen for the previous 15 days. Or rather: this was the same actor, but no longer pretending to be passive.

Day 17 (April 27): observed and documented.
By the time I noticed the behavioral score, the actor had already made another 68 requests against the same endpoints. The persistence pattern, combined with the probe trajectory, was unmistakable.

Day 18 (April 28): still active.
At the time of writing, the actor continues to probe /api/report-hit. The escalation has plateaued but the activity has not stopped. It is, by now, the longest sustained reconnaissance operation we have observed against the site.


Twenty-four cloud providers and ISPs in 17 days

What makes this case structurally interesting isn't only the trajectory. It's the infrastructure rotation.

Over those 17 days, the actor used 24 distinct cloud providers and ISPs across four continents.

No single IP range concentrated enough activity to trigger reputation-based blocking on any standard WAF. The User-Agent stayed broadly consistent (Python Requests with minor variations). The TLS fingerprint stayed identical across all infrastructures, which is what allowed our system to identify it as the same operator across rotations.

This is the operationally important point: from any one cloud provider's perspective, this actor was a low-volume, low-frequency client. From the receiving site's perspective, it was an obvious, sustained, escalating reconnaissance operation. The signal only exists when you cross-correlate across infrastructures, and current commercial WAFs don't do that across providers because they don't have the data.


Why every WAF would have cleared it as legitimate

To make this concrete, here's what each layer of standard defense saw:

Cloudflare or similar reputation-based filtering:
Saw a Python Requests user from a clean IP, low request rate, no rapid-fire patterns. Reputation score: clean. Verdict: pass.

Rate-limiting and burst detection:
Saw at most 5–10 requests per day for the first two weeks, below the threshold of any reasonable rate-limiting policy. Verdict: pass.
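To make the evasion concrete, here is a minimal sliding-window limiter (an illustrative sketch, not any vendor's implementation) and why the actor's cadence never trips it:

```python
from collections import deque

class SlidingWindowLimiter:
    """Illustrative per-client rate limiter: at most `limit` requests
    per `window` seconds. Not any particular vendor's implementation."""

    def __init__(self, limit: int, window: float):
        self.limit, self.window = limit, window
        self.hits: dict[str, deque] = {}

    def allow(self, client: str, now: float) -> bool:
        q = self.hits.setdefault(client, deque())
        while q and now - q[0] > self.window:   # drop hits outside the window
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True

# Ten requests spread evenly over a day never come close to tripping
# a 60-requests-per-minute policy.
limiter = SlidingWindowLimiter(limit=60, window=60.0)
DAY = 86_400
verdicts = [limiter.allow("203.0.113.7", t) for t in range(0, DAY, DAY // 10)]
# every single request passes
```

Any limiter keyed on short windows, however tuned, is blind to an actor whose budget is measured in weeks rather than seconds.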

Bot management based on behavioral heuristics (User-Agent + headers):
Saw a consistent identity, no header rotation, no JavaScript challenge failures (because Python Requests doesn't run JavaScript and that's normal for a research tool). Verdict: pass.

Content access pattern analysis:
Saw an actor reading public content in a roughly logical sequence. No deep crawl, no aggressive enumeration in the first two weeks. Verdict: pass, possibly flagged as "researcher" or "academic crawler."

SIEM-style log correlation:
Would have caught the /.git/HEAD and /.env probes as suspicious individual events. But correlating them with the prior 14 days of innocent activity from rotating IPs requires retention and infrastructure that most sites don't have.
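The correlation itself is simple once you key events on a stable fingerprint rather than an IP. A sketch with illustrative field names and values:

```python
from collections import defaultdict

# Sketch: correlate rare suspicious events across days by a stable client
# fingerprint (a TLS hash, say) rather than by IP. Field names and the
# fingerprint value are illustrative.
events = [
    {"day": 2,  "ip": "192.0.2.10",   "fp": "tls:a1b2", "path": "/.git/HEAD"},
    {"day": 6,  "ip": "198.51.100.3", "fp": "tls:a1b2", "path": "/RECORDINGS/ORIG/"},
    {"day": 15, "ip": "203.0.113.9",  "fp": "tls:a1b2", "path": "/.env"},
]

by_fp = defaultdict(list)
for e in events:
    by_fp[e["fp"]].append(e)

# Three "one-off" probes from three unrelated IPs collapse into one actor.
campaign = by_fp["tls:a1b2"]
```

The technique is trivial; the cost is retention. You need all 17 days of logs, with fingerprints attached, in one queryable place.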

The actor knew this. It wasn't accidental low volume. It was deliberate low volume designed to stay below every detection threshold while building a complete map of the site over 17 days.


What our system saw that others didn't

The observatory tracks behavior across sessions and infrastructure. Three signals combined to produce the unusually high behavioral score:

Signal 1: memory of the site across consecutive days.
A scraper running once and disappearing has near-zero memory across days. An actor returning every single day for 17 days is unusual regardless of volume per day. The memory score (0–100) measures sustained interest, not bursts.
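One illustrative way to turn consecutive-day presence into a 0–100 score (the observatory's actual formula is redacted; this is a sketch of the idea):

```python
def memory_score(active_days: list[int], horizon: int = 30) -> int:
    """0-100 score from the longest run of consecutive active days.
    Illustrative formula; the observatory's actual scoring is redacted."""
    days = sorted(set(active_days))
    best = run = 1 if days else 0
    for prev, cur in zip(days, days[1:]):
        run = run + 1 if cur == prev + 1 else 1
        best = max(best, run)
    return min(100, round(100 * best / horizon))

burst_scraper = memory_score([3])                  # one visit, then gone
patient_actor = memory_score(list(range(1, 18)))   # 17 consecutive days
# patient_actor dwarfs burst_scraper despite far lower per-day volume
```

Note what the score ignores: request counts. A thousand requests in one day and one request on each of seventeen days are opposite signals here.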

Signal 2: path evolution.
Most automated scrapers fetch the same paths repeatedly. They don't browse, they extract. This actor's paths evolved over time: home page → content → infrastructure probes → SDK → credentials → API attack surface. That is the signature of cognition, not automation.
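That evolution can be made machine-checkable by mapping paths onto escalation stages. The taxonomy below is illustrative, not the observatory's real classifier:

```python
# Illustrative escalation taxonomy; a real classifier is more nuanced.
STAGES = {"content": 0, "infra_probe": 1, "sdk": 2, "credential": 3, "api_attack": 4}

def stage(path: str) -> int:
    if path.startswith("/api/"):
        return STAGES["api_attack"]
    if path == "/.env":
        return STAGES["credential"]
    if path.startswith("/sdk/"):
        return STAGES["sdk"]
    if path.startswith("/.") or "wp-json" in path:
        return STAGES["infra_probe"]
    return STAGES["content"]

trajectory = ["/", "/blog", "/.git/HEAD", "/sdk/bcs.py", "/.env", "/api/report-hit"]
stages = [stage(p) for p in trajectory]   # [0, 0, 1, 2, 3, 4]
escalating = stages == sorted(stages)     # monotone escalation over time
```

A real trajectory interleaves rereads with probes, so in practice you would compare per-day maxima rather than the raw sequence; the principle is the same.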

Signal 3: cross-infrastructure consistency.
The TLS fingerprint stayed identical across 20+ different cloud providers and ISPs. That means the same software stack was being used from all of them. Either it's the same operator using rotating proxies, or it's a distributed system with synchronized client behavior. Neither is benign.

The combination of memory score 70, susceptibility 53, and 24-infrastructure rotation is what flagged this actor. None of the three signals alone would have. The system's design assumption is that a determined adversary will look benign on any single dimension — but cannot stay benign on all three simultaneously over time.


The pattern, not the actor

I don't know who is behind this specific actor. I don't know if it's offensive security research, competitive intelligence, intelligence services, or someone preparing for monetized exploitation. The infrastructure rotation makes attribution effectively impossible.

What matters is that this pattern — sustained, low-volume, cross-infrastructure reconnaissance with progressive escalation — is invisible to current bot management and security tooling. It is the signature of operations that don't need to move fast because they're not constrained by detection windows.

In our broader observatory data over 15 days, we found that 79% of bot traffic to the site was reconnaissance and competitive intelligence — not the high-volume scraping that the public conversation about bots typically focuses on. Only 0.9% of bot traffic was mass scraping. The rest was studying.

This case is one example of what those numbers look like operationally.


What this means for site operators

Three implications, in order of immediacy:

One: persistence and trajectory are more diagnostic than volume or identity. If your detection logic is rate-limit-based, you're missing the actors who care most about you, because the ones who care most are the ones most willing to operate slowly.

Two: infrastructure rotation works because no single provider has visibility across the others. Cross-correlation across infrastructures requires either the receiving site to do it itself, or independent observatories that aggregate fingerprints and behavior across many sites. WAF vendors won't do this because their economic model is per-customer.

Three: the path patterns matter more than the User-Agent. Actors that probe .git, .env, SDK paths, or known malware family directories are not researchers. The fact that those probes were buried under two weeks of innocent reading does not change their meaning.


What the case is still doing

As of this writing, the actor is still active on the site. We have not blocked it. The decision is deliberate: telemetry on a sustained reconnaissance operation is more valuable than mitigation, particularly because there is nothing on this site that warrants protection (no user data, no transactional infrastructure, no proprietary content beyond what is already public).

The trajectory continues to evolve. Whether it culminates in a public disclosure, an attempted exploitation, or quiet abandonment, we will document it.

Think your site might have a similar pattern hiding in your logs?

Request a site risk assessment →

We analyze your traffic for sustained reconnaissance patterns that WAFs miss.


Methodology note

The observatory described here operates from the position of the receiving site, not the agent. It uses behavioral trajectory analysis to identify operators across rotated identities and infrastructures — patterns in how actors navigate, how their sessions evolve over days, and inconsistencies between their declared identity and their technical fingerprint. It does not require access to the visitor's code, model, or configuration. It only sees what arrives at the site.

This is a different position from that of commercial agent evaluation tools (AIUC, ACF Standards, Microsoft PyRIT, Palo Alto Prisma AIRS), which require access to the agent being evaluated. From the receiving site's position, you cannot evaluate the agent's internals, but you can do something else: you can characterize the actor's behavior across time and infrastructure in ways that the actor itself probably doesn't realize are visible.

The actor described in this post is one of approximately 7,200 sessions captured by the observatory in its first weeks of operation. The vast majority of those sessions are uneventful. A few are not. This is one of the few.


About

This observatory is operated by BotConduct, an independent measurement infrastructure for bot and AI agent behavior on the open web. We publish technical analyses of patterns we observe. The goal is to make the actual shape of the agentic web visible to site operators, security researchers, and policy people who currently have to rely on vendor narratives or speculation.

If you operate a site and you suspect you may be experiencing a similar pattern, you can reach out at hello@botconduct.org. We don't sell defensive products. We measure behavior and produce reports.


Posted independently. No affiliation with any vendor named. Data referenced is real and verifiable, with operational details (IP addresses, internal detection methodology, scoring details) intentionally redacted.
