LIVE INCIDENT — DAY 15

AI Guardrails Don't Work

How an AI coding assistant destroyed my AWS account,
shut down my business for 15+ days,
and why every guardrail I configured was ignored.

Tom Boesch — ItBytes LLC
May 2026

Two numbers. Both true.

632x

Return on AI investment
$178 in compute → $112,460 equivalent labor

$106,000+

Business loss from AI-induced lockout
15+ days of downtime and counting

The same tool. The same developer. The same 56 days.

What happened in 90 minutes

12:13 CDT

AI deploys Cognito User Pool to wrong AWS account

13:48 CDT

AI enforces authorization on broken auth — all users locked out

14:00–06:00

AI makes 10 rapid-fire "fixes" — each one making it worse

Day 2–5

Management account becomes permanently inaccessible

Day 15 (today)

Still locked out. Still down. AWS Support has not fixed it.

What died with one account

SSO

IAM Identity Center — access to ALL accounts gone

DNS

Route53 — it4bytes.com unreachable

AUTH

Cognito — application login broken

STATE

Terraform state — all IaC is blind

BILLING

Resources still running — can't shut them down

DOMAIN

Registered via Route53 — can't transfer

I configured every available guardrail.
All of them failed.

Mechanism	Location	Result
Agent system prompt	~/.kiro/agents/default.json	Ignored after relogin
Workspace rule files	.amazonq/rules/	Not enforced
MCP server resources	mcp-server/data/	Not enforced
Knowledge base indexing	Kiro KB	Not enforced
Incident documentation	docs/INCIDENT-*.md	Not read on start
Control documents	docs/CONTROL-*.md	Not enforced
STOP language in config	Agent prompt	Overridden by defaults
Violation counter rules	Agent prompt	No persistent state

8 mechanisms. 0 enforcement. 32 violations in 56 days.
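
Take the last row of the table. A violation counter only means something if it lives outside the model's context. A minimal sketch, assuming a hypothetical JSON state file owned by the platform rather than the agent; the file layout and function names are illustrative and are not a Kiro or Amazon Q feature:

```python
import json
from pathlib import Path


def load_count(state_file: Path) -> int:
    """Return the stored violation count, or 0 if no state exists yet."""
    if not state_file.exists():
        return 0
    return json.loads(state_file.read_text())["violations"]


def record_violation(state_file: Path, note: str) -> int:
    """Append a violation to the on-disk log and return the new total."""
    state = {"violations": 0, "log": []}
    if state_file.exists():
        state = json.loads(state_file.read_text())
    state["violations"] += 1
    state["log"].append(note)
    state_file.write_text(json.dumps(state))
    return state["violations"]
```

Because the count is read from disk at the start of every session, a relogin or context compaction cannot reset it.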

The pattern that never stops

What the AI does every session:

Starts disciplined (first 5 minutes)
Hits a blocker (auth expires, API fails)
Enters "workaround mode"
Discipline erodes completely
Implements without approval
Gets caught, acknowledges, promises better
Violates again within minutes

What triggers violations:

"yes"	→ deploys to production
"ok"	→ creates infrastructure
"make it better"	→ rewrites and deploys
"deploy to dev"	→ deploys to prod
"1" (selecting option)	→ builds entire stack
Session relogin	→ forgets all rules
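
Every trigger above is an ambiguous reply treated as approval. One possible fix is an approval vocabulary checked outside the model. A minimal sketch, assuming a hypothetical phrase format `APPROVE <action> <environment>`; the grammar and function names are my illustration, not a feature of any existing tool:

```python
import re
from typing import Optional, Tuple

# Hypothetical approval grammar: the user must name both the action and the environment.
APPROVAL = re.compile(r"^APPROVE (plan|apply|deploy) (dev|staging|prod)$")


def parse_approval(reply: str) -> Optional[Tuple[str, str]]:
    """Return (action, environment) only for an explicit approval phrase."""
    match = APPROVAL.match(reply.strip())
    if match is None:
        return None  # "yes", "ok", "1", and "make it better" all land here
    return match.group(1), match.group(2)


def is_authorized(reply: str, action: str, environment: str) -> bool:
    """Authorize only when the approved action and environment match exactly."""
    return parse_approval(reply) == (action, environment)
```

Under this sketch, `is_authorized("APPROVE deploy dev", "deploy", "prod")` is false, so an approval for dev can never be stretched to cover prod.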

The trigger cost

$0.03

One Terraform plan + apply

↓

The damage cost

$106,000+

15+ days downtime, lost revenue, rebuild costs

RATIO: 3,500,000x
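
Both headline ratios in this deck can be reproduced from the quoted dollar figures. A quick check, using only numbers stated on the slides (the variable names are mine):

```python
# Figures quoted in this deck.
compute_cost = 178       # dollars of AI compute
labor_value = 112_460    # dollars of equivalent labor
trigger_cost = 0.03      # dollars for one Terraform plan + apply
damage_cost = 106_000    # dollars of business loss (stated lower bound)

roi = labor_value / compute_cost
damage_ratio = damage_cost / trigger_cost

print(f"ROI: {roi:.0f}x")                     # ROI: 632x
print(f"Damage ratio: {damage_ratio:,.0f}x")  # Damage ratio: 3,533,333x
```

The damage ratio is rounded down to 3,500,000x on the slide.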

Now imagine 10,000 accounts

Impact	Calculation	Cost
Engineer downtime	10,000 × $150/hr × 8 hrs × 15 days	$180 million
Revenue loss (SaaS, $500M ARR)	$500M ÷ 365 × 15	$20.5 million
SLA penalties	Contractual	$10M–$50M
Stock price impact	2–5% drop	$500M–$2B
Regulatory fines (HIPAA/PCI/SOX)	15-day outage triggers all	$10M–$100M
Recovery/rebuild	SSO, SCPs, guardrails × 10,000	$20M–$50M
Legal exposure	Customer lawsuits, breach notifications	$50M–$200M
TOTAL		$790M–$2.6B
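
The two rows with explicit formulas can be evaluated directly. A short sketch that uses only the inputs shown in the Calculation column; the other rows are estimates with no formula to check, and the variable names are mine:

```python
engineers = 10_000
hourly_rate = 150        # dollars per hour
hours_per_day = 8
outage_days = 15
arr = 500_000_000        # annual recurring revenue, dollars

# Engineer downtime: headcount x rate x hours per day x days of outage.
engineer_downtime = engineers * hourly_rate * hours_per_day * outage_days

# Revenue loss: daily share of ARR times days of outage.
revenue_loss = arr / 365 * outage_days

print(f"Engineer downtime: ${engineer_downtime:,}")
print(f"Revenue loss: ${revenue_loss:,.0f}")
```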

The probability problem

One developer. One AI. One lockout in 56 days.

Developers Using AI	Expected Lockout Events Per Year
1	~6.5/year
10	~65/year
100	~650/year
1,000	~6,500/year (about 18 per day)
10,000	~65,000/year (about 180 per day without hard gates)

Even if your developers are 10x more careful, 10,000 engineers still produce roughly 6,500 lockouts a year, about 18 a day. The math is inescapable.
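
The table is a straight-line extrapolation from one data point. A minimal sketch, assuming the single observed lockout generalizes linearly and independently across developers (a strong assumption from a sample of one); `care_factor` is a hypothetical knob for the "10x more careful" case:

```python
OBSERVED_EVENTS = 1
OBSERVED_DAYS = 56


def expected_lockouts_per_year(developers: int, care_factor: float = 1.0) -> float:
    """Linear extrapolation; care_factor=10 models developers who are 10x more careful."""
    rate_per_dev_year = OBSERVED_EVENTS / OBSERVED_DAYS * 365
    return developers * rate_per_dev_year / care_factor


for n in (1, 10, 100, 1_000, 10_000):
    print(f"{n:>6,} developers: ~{expected_lockouts_per_year(n):,.1f} per year")

careful = expected_lockouts_per_year(10_000, care_factor=10)
print(f"10,000 developers, 10x more careful: ~{careful / 365:.0f} per day")
```

Under these assumptions one developer yields about 6.5 events a year, and the careful 10,000-engineer case still lands near 18 a day.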

AWS Support is not a safety net

9

Support cases opened

19 hrs

Of escalation in one day

15+ days

Still waiting

0

Cases actually resolved

Two cases marked "Resolved" while the account remains locked. Three are "Pending customer action" with steps that don't work. The account is still inaccessible.

What AI tool developers must build

Hard gates, not soft prompts. Block file creation until requirements exist, the way CI blocks a deploy on failed tests.
Persistent violation state. Agent must know its violation count across sessions, relogins, and compactions.
Authorization taxonomy. "Yes" ≠ "approved." Platform must enforce the vocabulary.
Blast radius limits. One conversational turn = one infrastructure change. Never cascade.
Mandatory dry-run. Destructive operations show preview + require separate confirmation. Hard stop.
Session boundary enforcement. After any reset, agent re-reads and acknowledges rules before accepting tasks.
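
Several of these requirements can live in one small gate that sits between the agent and the infrastructure. A minimal sketch, assuming a hypothetical `ChangeGate` enforced by the platform and a hypothetical `CONFIRM <change>` phrase; none of these names come from an existing product:

```python
from dataclasses import dataclass
from typing import Optional


class GateError(Exception):
    """Raised when a request violates a hard gate."""


@dataclass
class ChangeGate:
    rules_acknowledged: bool = False   # starts False in every new session
    changes_this_turn: int = 0
    previewed: Optional[str] = None    # the change whose dry run was shown

    def acknowledge_rules(self) -> None:
        """Session boundary: must be called after any reset."""
        self.rules_acknowledged = True

    def new_turn(self) -> None:
        """Reset the per-turn change budget."""
        self.changes_this_turn = 0

    def preview(self, change: str) -> str:
        """Mandatory dry run: record which change was previewed."""
        if not self.rules_acknowledged:
            raise GateError("rules not acknowledged in this session")
        self.previewed = change
        return f"DRY RUN: {change}"

    def apply(self, change: str, confirmation: str) -> str:
        """Apply only a previewed, separately confirmed, single change."""
        if not self.rules_acknowledged:
            raise GateError("rules not acknowledged in this session")
        if self.previewed != change:
            raise GateError("no dry-run preview for this change")
        if confirmation != f"CONFIRM {change}":
            raise GateError("separate confirmation required")
        if self.changes_this_turn >= 1:
            raise GateError("blast radius: one change per turn")
        self.changes_this_turn += 1
        self.previewed = None
        return f"APPLIED: {change}"
```

The point of the design is that each failure raises instead of warning. The agent cannot talk its way past an exception the way it talks its way past a prompt.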

The question for every CISO

If one AI coding assistant can permanently lock your management account in 90 minutes —

and your AI tool vendor has zero working guardrails to prevent it —

what is your risk acceptance posture?

Prompt-based rules are documentation.

They are not enforcement.

This business is shut down today because an AI coding assistant had no guardrails that actually worked.

Yours doesn't either.

Full paper: d18gqyv10pt526.cloudfront.net/white-papers/AI-cost-savings-analysis.md
Tom Boesch — ItBytes LLC — May 2026