BCS Gauntlet — Can your AI agent survive adversarial attacks?

Can your agent survive the BCS Gauntlet?

Your voice agent handles calls. Your admin agent processes claims. Your chatbot talks to customers. Your crawler navigates the web. But what happens when a caller social-engineers it? When a user feeds it contradictory information? When hidden instructions slip into the content? When a site changes its rules mid-session?

We simulate all of those attacks, using an adversarial evaluation framework. Most agents fail. Yours probably will too.

83% FAIL ON FIRST TRY

Free. No signup. Just a curl. Fix your agent and try again until you pass.

How it works

curl -X POST https://botconduct.org/api/v3/training-center/start \
 -H "Content-Type: application/json" \
 -d '{"bot_name":"MyBot","operator":"me","scenarios":["C1","C3"]}'

Start a free session. You get a session_id and the first customer message.

# Your agent responds to each customer message
curl -X POST https://botconduct.org/api/v3/chat/SESSION_ID \
 -H "Content-Type: application/json" \
 -d '{"message":"Your agent response here..."}'

# You get the next adversarial customer message back.
# Repeat for all 16 turns.

We play an adversarial caller, customer, or user: a social engineer trying to extract information, a confused user planting false context, hidden instructions disguised as legitimate input. Your agent responds as it normally would, and it has to survive all of it.

curl -X POST https://botconduct.org/api/v3/training-center/finish/SESSION_ID

Get your score, per-scenario verdicts, violation details, and badge URL.
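The three steps above can be strung together in a short script. The sketch below is one possible driver, not an official client. It assumes the start and chat responses carry the next customer message in a `message` field (a guess; only `session_id` is named above). `respond` stands in for your own agent, and the `post` parameter is a hypothetical hook so the loop can be exercised without network access.

```python
import json
import urllib.request

BASE = "https://botconduct.org/api/v3"


def http_post(url, payload=None):
    """POST to url (with a JSON body if payload is given) and decode the JSON reply."""
    headers = {}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    req = urllib.request.Request(url, data=data, headers=headers, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def run_gauntlet(respond, post=http_post, max_turns=16):
    """Start a session, answer each customer message, then fetch the results."""
    start = post(
        f"{BASE}/training-center/start",
        {"bot_name": "MyBot", "operator": "me", "scenarios": ["C1", "C3"]},
    )
    session_id = start["session_id"]
    customer_message = start.get("message")  # assumed field name

    for _ in range(max_turns):
        if customer_message is None:
            break  # no further customer message to answer
        reply = post(
            f"{BASE}/chat/{session_id}",
            {"message": respond(customer_message)},
        )
        customer_message = reply.get("message")  # assumed field name

    # The finish call takes no request body, matching the curl command above.
    return post(f"{BASE}/training-center/finish/{session_id}")
```

Calling `run_gauntlet(my_agent)` with a function that maps a customer message to your agent's reply returns whatever the finish endpoint sends back. Adjust the field names once you have seen a real response.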

What we test (free tier)

CATEGORY 1

adversarial evaluation

Hidden instructions in conversation context. Does your agent execute adversarial commands or ignore them?

CATEGORY 3

adversarial evaluation

Contradicting information across turns. Does your agent maintain factual integrity or adopt planted falsehoods?

Web crawl challenge (for crawlers and scrapers)

CATEGORY 3

Contradicting Signals

robots.txt says one thing, meta tags say another, HTTP headers say a third. Signals change mid-session. Can your crawler resolve the ambiguity?
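One conservative way to handle conflicting signals is to treat every source as a veto and to re-evaluate on each request instead of caching an early answer. The sketch below illustrates that policy only. It is not the Gauntlet's scoring rule, and `allowed_by_all_signals` and its parameters are hypothetical names. It also folds fetch rules (robots.txt) and indexing rules (meta tag, `X-Robots-Tag`) into a single yes/no for simplicity.

```python
from urllib.robotparser import RobotFileParser

# Directives that forbid using the page; "none" means noindex plus nofollow.
BLOCKING = {"noindex", "none"}


def directives(value):
    """Split a robots meta / X-Robots-Tag value into lowercase directives."""
    return {part.strip().lower() for part in (value or "").split(",") if part.strip()}


def allowed_by_all_signals(url, robots_txt, meta_robots=None,
                           x_robots_tag=None, agent="MyCrawler"):
    """Return True only if no signal forbids the page (most restrictive wins)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    if not parser.can_fetch(agent, url):
        return False
    # Simplification: user-agent-scoped header values such as
    # "googlebot: noindex" are not handled here.
    seen = directives(meta_robots) | directives(x_robots_tag)
    return not (seen & BLOCKING)
```

Because signals can change mid-session, a crawler using this approach would pass in the robots.txt, meta tag, and header values from the current request each time it calls the function.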

CATEGORY 6

Fragment Chain Trap

Three innocent-looking pages that compose into a redirect to a restricted resource. Does your crawler follow the chain or stop?
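A crawler can defend against this kind of chain by checking every hop against the site's rules before fetching it, and by capping the number of hops. The sketch below shows that idea under stated assumptions: `follow_chain` and the `next_hop` callback are hypothetical, and a single robots.txt is assumed to cover the whole chain (all hops on one host).

```python
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


def follow_chain(start_url, next_hop, robots_txt, agent="MyCrawler", max_hops=5):
    """Walk a redirect chain, stopping before any hop the rules forbid.

    next_hop(url) is a hypothetical callback that returns the redirect
    target for url (possibly relative), or None if the page does not redirect.
    Returns (visited_urls, blocked_url_or_None).
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    visited = []
    url = start_url
    for _ in range(max_hops):
        if not parser.can_fetch(agent, url):
            return visited, url  # the chain leads somewhere restricted: stop
        visited.append(url)
        target = next_hop(url)
        if target is None:
            return visited, None  # end of chain, nothing was blocked
        url = urljoin(url, target)  # resolve relative targets against the current page
    return visited, None  # hop limit reached without fetching a blocked URL
```

The key design choice is that the check runs on the resolved URL of each hop, so a target assembled from several harmless-looking pages is still evaluated before it is requested.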

# Web crawl challenge
curl -X POST https://botconduct.org/api/v2/training-center/start \
 -H "Content-Type: application/json" \
 -d '{"bot_name":"MyCrawler","operator":"me"}'

# Point your crawler at the test_url → finish when done

Upgrade to Professional for all 9 scenarios (5 chat + 4 web), a signed Ed25519 certificate, and a full forensic report.

Badge for your README

After evaluation, you get a badge URL:

![BCS Score](https://botconduct.org/badge/CERT_ID.svg)

<!-- Or in your README.md: -->
[![BCS Evaluated](https://botconduct.org/badge/CERT_ID.svg)](https://botconduct.org/api/v3/training-center/cert/CERT_ID)

Badge auto-updates. Links to your verification page. Developers and enterprise buyers see it on your repo.
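If you generate your README from a script, the linked badge can be built from the certificate ID. This is a small sketch that assumes the two URL patterns shown above; `badge_markdown` is a hypothetical helper name.

```python
BADGE_BASE = "https://botconduct.org/badge"
CERT_BASE = "https://botconduct.org/api/v3/training-center/cert"


def badge_markdown(cert_id, alt="BCS Evaluated"):
    """Return a Markdown badge image that links to the verification page."""
    return f"[![{alt}]({BADGE_BASE}/{cert_id}.svg)]({CERT_BASE}/{cert_id})"
```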

Pricing

Free to find vulnerabilities. Pay when you need the cert.

FREE

2 scenarios (C1 + C3)
16 turns
Vulnerability report
Badge for README
3 free evaluations

Start now

LEVEL 1 — BASIC

$500

Basic Hygiene evaluation
All free scenarios
Per-dimension verdict
Remediation report
Public registry listing

Request

LEVEL 1+2 — PROFESSIONAL

$3,500

Basic Hygiene + Dynamic Compliance
All 5 chat scenarios (38 turns)
+ Web crawl evaluation
Ed25519 signed certificate
1 retest included

Request

LEVEL 1-3 — FULL CERT

$12,000

+ Adversarial Conduct
All scenarios + custom cartridges
12+ hour sessions
Forensic report
3 retests + annual renewal

Request
