Can your agent survive the BCS Gauntlet?

Your voice agent handles calls. Your admin agent processes claims. Your chatbot talks to customers. Your crawler navigates the web. But what happens when a caller social-engineers your agent? When a user feeds it contradictory information? When hidden instructions slip into the content? When a site changes its rules mid-session?

We simulate all of those attacks, based on the adversarial evaluation framework. Most agents fail. Yours probably will too.

83% FAIL ON FIRST TRY

Free. No signup. Just a curl. Fix your agent and try again until you pass.

The Challenge

$0 · 3 free evaluations

2 adversarial scenarios · 16 turns · Your agent vs our adversarial customer
Pass = badge for your README. Fail = see exactly what went wrong. Fix it. Come back.

Every attempt teaches your agent. Every attempt makes ours smarter.

How it works

1
curl -X POST https://botconduct.org/api/v3/training-center/start \
 -H "Content-Type: application/json" \
 -d '{"bot_name":"MyBot","operator":"me","scenarios":["C1","C3"]}'

Start a free session. You get a session_id and the first customer message.
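If your agent lives in Python rather than a shell, step 1 can be sketched with the standard library alone. This is a minimal sketch, not an official client: the URL, headers, and payload mirror the curl command above, while the function names `build_start_request` and `start_session` are hypothetical. Only `session_id` is named on this page; the key that carries the first customer message is not documented here, so inspect the returned body to find it.

```python
import json
import urllib.request

BASE_URL = "https://botconduct.org/api/v3"


def build_start_request(bot_name, operator, scenarios=("C1", "C3")):
    """Build the POST request from step 1 without sending it."""
    payload = {
        "bot_name": bot_name,
        "operator": operator,
        "scenarios": list(scenarios),
    }
    return urllib.request.Request(
        f"{BASE_URL}/training-center/start",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_session(bot_name, operator):
    """Send the request and return the decoded JSON body.

    The body should contain a session_id (named in the text above).
    Other key names are not documented here, so print the body once
    and read them off before relying on them.
    """
    request = build_start_request(bot_name, operator)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `start_session("MyBot", "me")` performs the same request as the curl command and counts as one of your evaluations.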

2
# Your agent responds to each customer message
curl -X POST https://botconduct.org/api/v3/chat/SESSION_ID \
 -H "Content-Type: application/json" \
 -d '{"message":"Your agent response here..."}'

# You get the next adversarial customer message back.
# Repeat for all 16 turns.

We play an adversarial caller, customer, or user: a social engineer trying to extract information, a confused user planting false context, hidden instructions disguised as legitimate input. Your agent responds as it normally would. It has to survive all of it.
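The turn-by-turn exchange in step 2 can be wrapped in a small loop. The sketch below makes several assumptions: `run_session`, `post_reply`, and the `agent` callable are hypothetical names, and the response key holding the next customer message is assumed to be `"message"` (only the request key `"message"` appears in the curl command above). It also assumes one turn means one agent reply. Check a real response and adjust the key if it differs.

```python
import json
import urllib.request

BASE_URL = "https://botconduct.org/api/v3"
MAX_TURNS = 16  # the free tier runs 16 turns


def post_reply(session_id, reply):
    """POST one agent reply to /chat/SESSION_ID and return the JSON body."""
    request = urllib.request.Request(
        f"{BASE_URL}/chat/{session_id}",
        data=json.dumps({"message": reply}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def run_session(session_id, first_message, agent, send=post_reply,
                max_turns=MAX_TURNS):
    """Alternate customer message and agent reply until the turns run out.

    ASSUMPTION: each response carries the next customer message under
    the key "message". That key name is a guess, not documented here.
    The loop also stops early if no further message comes back.
    """
    transcript = []
    customer = first_message
    for _ in range(max_turns):
        if not customer:
            break
        reply = agent(customer)  # your agent answers as it normally would
        transcript.append((customer, reply))
        customer = send(session_id, reply).get("message")
    return transcript
```

Passing `send` as a parameter lets you replay the loop against a local stub before spending one of your free evaluations.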

3
curl -X POST https://botconduct.org/api/v3/training-center/finish/SESSION_ID

Get your score, per-scenario verdicts, violation details, and badge URL.
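Step 3 is a body-less POST, which can be sketched the same way. The function names below are hypothetical, and because the JSON key names for the score, verdicts, violations, and badge URL are not listed on this page, the sketch prints the whole report rather than guessing at them.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://botconduct.org/api/v3"


def build_finish_request(session_id):
    """Build the body-less POST from step 3 without sending it."""
    safe_id = urllib.parse.quote(session_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/training-center/finish/{safe_id}",
        method="POST",
    )


def finish_session(session_id):
    """Send the finish request and return the decoded JSON report."""
    request = build_finish_request(session_id)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def format_report(report):
    """Pretty-print whatever came back, without assuming key names."""
    return json.dumps(report, indent=2, sort_keys=True)
```

`print(format_report(finish_session(session_id)))` shows the full result, so you can read off the verdicts and badge URL directly.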

What we test (free tier)

CATEGORY 1
adversarial evaluation
Hidden instructions in conversation context. Does your agent execute adversarial commands or ignore them?
CATEGORY 3
adversarial evaluation
Contradicting information across turns. Does your agent maintain factual integrity or adopt planted falsehoods?

Web crawl challenge (for crawlers and scrapers)

CATEGORY 3
Contradicting Signals
robots.txt says one thing, meta tags say another, HTTP headers say a third. Signals change mid-session. Can your crawler resolve the ambiguity?
CATEGORY 6
Fragment Chain Trap
Three innocent-looking pages that compose into a redirect to a restricted resource. Does your crawler follow the chain or stop?

# Web crawl challenge
curl -X POST https://botconduct.org/api/v2/training-center/start \
 -H "Content-Type: application/json" \
 -d '{"bot_name":"MyCrawler","operator":"me"}'
# Point your crawler at the test_url → finish when done
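One conservative way for a crawler to handle contradicting signals is to proceed only when every signal permits it. The sketch below illustrates that policy with Python's standard `urllib.robotparser`. It is an assumption about sensible crawler behavior, not a statement of how the challenge scores you. The names `may_process` and `directives` are hypothetical, and the directive parsing is simplified: it ignores user-agent prefixes in `X-Robots-Tag`.

```python
from urllib.robotparser import RobotFileParser

# Directives that tell a crawler to hold back (simplified set).
BLOCKING_DIRECTIVES = {"noindex", "nofollow", "none"}


def directives(value):
    """Split a robots meta or X-Robots-Tag value into lowercase directives."""
    if not value:
        return set()
    return {part.strip().lower() for part in value.split(",") if part.strip()}


def may_process(robots_txt, user_agent, url, meta_robots=None,
                x_robots_tag=None):
    """Return True only if robots.txt, meta tag, and header all permit.

    Most-restrictive-wins policy: a single blocking signal is enough
    to stop, whatever the other two say.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    if not parser.can_fetch(user_agent, url):
        return False
    seen = directives(meta_robots) | directives(x_robots_tag)
    return not (seen & BLOCKING_DIRECTIVES)
```

Because signals can change mid-session, a check like this would need to run on every request with freshly fetched signals rather than once at the start.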

Upgrade to Professional for all 5 chat and 4 web scenarios, an Ed25519-signed certificate, and a full forensic report.

Badge for your README

After evaluation, you get a badge URL:

![BCS Score](https://botconduct.org/badge/CERT_ID.svg)

<!-- Or in your README.md: -->
[![BCS Evaluated](https://botconduct.org/badge/CERT_ID.svg)](https://botconduct.org/api/v3/training-center/cert/CERT_ID)

The badge auto-updates and links to your verification page. Developers and enterprise buyers see it on your repo.
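If you generate your README from a script, the linked badge can be built from the certificate ID. This is a small sketch: the two URL patterns are copied from the snippet above, and the function name `badge_markdown` is hypothetical.

```python
def badge_markdown(cert_id):
    """Return the linked-badge Markdown for a given certificate ID."""
    badge_url = f"https://botconduct.org/badge/{cert_id}.svg"
    cert_url = f"https://botconduct.org/api/v3/training-center/cert/{cert_id}"
    return f"[![BCS Evaluated]({badge_url})]({cert_url})"
```

Replace `CERT_ID` with the ID from your own evaluation result.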

Pricing

Free to find vulnerabilities. Pay when you need the cert.

FREE
$0

2 scenarios (C1 + C3)
16 turns
Vulnerability report
Badge for README
3 free evaluations
Start now

LEVEL 1 · BASIC
$500

Basic Hygiene evaluation
All free scenarios
Per-dimension verdict
Remediation report
Public registry listing
Request

LEVEL 1+2 · PROFESSIONAL
$3,500

Basic Hygiene + Dynamic Compliance
All 5 chat scenarios (38 turns)
+ Web crawl evaluation
Ed25519 signed certificate
1 retest included
Request

LEVEL 1-3 · FULL CERT
$12,000

+ Adversarial Conduct
All scenarios + custom cartridges
12+ hour sessions
Forensic report
3 retests + annual renewal
Request

Start now

No signup. No API key. Just a curl.

curl -X POST https://botconduct.org/api/v3/training-center/start \
 -H "Content-Type: application/json" \
 -d '{"bot_name":"MyBot","operator":"me","scenarios":["C1","C3"]}'
