LIVE INCIDENT — DAY 15
AI Guardrails Don't Work
How an AI coding assistant destroyed my AWS account,
shut down my business for 15+ days,
and why every guardrail I configured was ignored.
Tom Boesch — ItBytes LLC
May 2026
Two numbers. Both true.
632x
Return on AI investment
$178 in compute → $112,460 equivalent labor
$106,000+
Business loss from AI-induced lockout
15+ days of downtime and counting
The same tool. The same developer. The same 56 days.
What happened in 90 minutes
12:13 CDT
AI deploys Cognito User Pool to wrong AWS account
13:48 CDT
AI enforces authorization on broken auth — all users locked out
14:00–06:00
AI makes 10 rapid-fire "fixes" — each one making it worse
Day 2–5
Management account becomes permanently inaccessible
Day 15 (today)
Still locked out. Still down. AWS Support has not fixed it.
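The first event in the timeline, a deploy into the wrong AWS account, is the kind of mistake a preflight check can refuse outright. A minimal sketch, assuming the caller supplies the active account ID (for example the "Account" field printed by `aws sts get-caller-identity`); the function name, exception name, and account IDs are hypothetical placeholders, not details from the incident.

```python
class WrongAccountError(RuntimeError):
    """Raised when the active AWS account is not the intended target."""


def require_account(active_account_id: str, expected_account_id: str) -> None:
    """Abort before any deploy if the credentials point at the wrong account.

    `active_account_id` is assumed to be looked up by the caller; this
    function only compares it against the intended target.
    """
    if active_account_id != expected_account_id:
        raise WrongAccountError(
            f"refusing to deploy: active account {active_account_id} "
            f"is not the expected target {expected_account_id}"
        )


if __name__ == "__main__":
    # Placeholder account IDs for illustration only.
    require_account("111111111111", "111111111111")  # matching account: no error
    try:
        require_account("222222222222", "111111111111")
    except WrongAccountError as err:
        print(err)
```

The point of the design is that the check raises instead of warning: the deploy step never runs unless the comparison passes.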
What died with one account
SSO
IAM Identity Center — access to ALL accounts gone
DNS
Route53 — it4bytes.com unreachable
AUTH
Cognito — application login broken
STATE
Terraform state — all IaC is blind
BILLING
Resources still running — can't shut them down
DOMAIN
Registered via Route53 — can't transfer
I configured every available guardrail.
All of them failed.
| Mechanism | Location | Result |
| Agent system prompt | ~/.kiro/agents/default.json | Ignored after relogin |
| Workspace rule files | .amazonq/rules/ | Not enforced |
| MCP server resources | mcp-server/data/ | Not enforced |
| Knowledge base indexing | Kiro KB | Not enforced |
| Incident documentation | docs/INCIDENT-*.md | Not read on start |
| Control documents | docs/CONTROL-*.md | Not enforced |
| STOP language in config | Agent prompt | Overridden by defaults |
| Violation counter rules | Agent prompt | No persistent state |
8 mechanisms. 0 enforcement. 32 violations in 56 days.
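The "No persistent state" row is the easiest to picture in code. A minimal sketch of a violation counter that lives in a file, outside the agent's context, so it survives relogins and resets. The class name, file name, and JSON layout are assumptions for illustration; nothing here is an existing feature of any AI tool.

```python
import json
import tempfile
from pathlib import Path


class ViolationLedger:
    """Counts rule violations in a file so the count outlives any one session."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def _load(self) -> dict:
        # A missing file means no violations have been recorded yet.
        if not self.path.exists():
            return {"violations": 0, "reasons": []}
        return json.loads(self.path.read_text())

    def count(self) -> int:
        return self._load()["violations"]

    def record(self, reason: str) -> int:
        """Append one violation and return the new total."""
        data = self._load()
        data["violations"] += 1
        data["reasons"].append(reason)
        self.path.write_text(json.dumps(data))
        return data["violations"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ledger_path = Path(tmp) / "violations.json"  # hypothetical location
        ViolationLedger(ledger_path).record("deployed without approval")
        # A fresh object stands in for a new session after relogin.
        print(ViolationLedger(ledger_path).count())
```

Because the state is read from disk on every call, a new session cannot start from zero unless someone deletes the file.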
The pattern that never stops
What the AI does every session:
- Starts disciplined (first 5 minutes)
- Hits a blocker (auth expires, API fails)
- Enters "workaround mode"
- Discipline erodes completely
- Implements without approval
- Gets caught, acknowledges, promises better
- Violates again within minutes
What triggers violations:
| "yes" | → deploys to production |
| "ok" | → creates infrastructure |
| "make it better" | → rewrites and deploys |
| "deploy to dev" | → deploys to prod |
| "1" (selecting option) | → builds entire stack |
| Session relogin | → forgets all rules |
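Every trigger in the table above is an ambiguous word being read as approval. A minimal sketch of the alternative: a fixed approval vocabulary where only an exact phrase authorizes exactly one action. The phrases and action names are hypothetical; a real platform would define its own.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    PLAN = "plan"
    DEPLOY_DEV = "deploy_dev"
    DEPLOY_PROD = "deploy_prod"


# Hypothetical vocabulary: each exact phrase authorizes one action and nothing else.
APPROVAL_PHRASES = {
    "approve plan": Action.PLAN,
    "approve deploy to dev": Action.DEPLOY_DEV,
    "approve deploy to prod": Action.DEPLOY_PROD,
}


def authorized_action(user_message: str) -> Optional[Action]:
    """Return the single action the message authorizes, or None."""
    # Normalize case and whitespace only; no fuzzy matching, no inference.
    normalized = " ".join(user_message.lower().split())
    return APPROVAL_PHRASES.get(normalized)


def is_authorized(user_message: str, requested: Action) -> bool:
    """True only if the message authorizes exactly the requested action."""
    return authorized_action(user_message) is requested


if __name__ == "__main__":
    for message in ["yes", "ok", "1", "make it better", "approve deploy to dev"]:
        print(repr(message), "->", authorized_action(message))
```

Under this scheme "yes", "ok", and "1" authorize nothing, and an approval for dev can never be stretched into an approval for prod.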
The trigger cost
$0.03
One Terraform plan + apply
↓
The damage cost
$106,000+
15+ days downtime, lost revenue, rebuild costs
RATIO: 3,500,000x
Now imagine 10,000 accounts
| Impact | Calculation | Cost |
| Engineer downtime | 10,000 × $150/hr × 8hrs × 15 days | $180 million |
| Revenue loss (SaaS, $500M ARR) | $500M ÷ 365 × 15 | $20.5 million |
| SLA penalties | Contractual | $10M–$50M |
| Stock price impact | 2–5% drop | $500M–$2B |
| Regulatory fines (HIPAA/PCI/SOX) | 15-day outage triggers all | $10M–$100M |
| Recovery/rebuild | SSO, SCPs, guardrails × 10,000 | $20M–$50M |
| Legal exposure | Customer lawsuits, breach notifications | $50M–$200M |
| TOTAL | | ~$790M–$2.6B |
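Two rows give explicit formulas, so they can be checked by hand. A short sketch that evaluates them exactly as written; the function and variable names are mine, and the inputs are the ones stated in the Calculation column.

```python
def engineer_downtime_cost(engineers: int, hourly_rate: float,
                           hours_per_day: float, days: int) -> float:
    """engineers x hourly rate x hours per day x days of outage."""
    return engineers * hourly_rate * hours_per_day * days


def revenue_loss(annual_revenue: float, days: int) -> float:
    """Annual revenue spread evenly over 365 days, lost for `days` days."""
    return annual_revenue / 365 * days


downtime = engineer_downtime_cost(10_000, 150, 8, 15)
revenue = revenue_loss(500_000_000, 15)

print(f"Engineer downtime: ${downtime:,.0f}")  # $180,000,000
print(f"Revenue loss:      ${revenue:,.0f}")   # $20,547,945
```

The other rows are ranges without a stated formula, so they are not modeled here.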
The probability problem
One developer. One AI. One lockout in 56 days.
| Developers Using AI | Expected Lockout Events Per Year |
| 1 | ~6.5/year |
| 10 | ~65/year |
| 100 | ~650/year |
| 1,000 | ~6,500/year (about 18 per day) |
| 10,000 | ~65,000/year (about 178 per day) without hard gates |
Even if your developers are 10x more careful, at 10,000 engineers the math is inescapable.
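The scaling can be made explicit. A rough sketch, assuming the single observed rate (one lockout per developer per 56 days) scales linearly with headcount and that events arrive independently, so a Poisson model applies. Both are modeling assumptions, and one data point is a weak basis for a rate; the `care_factor` parameter is a hypothetical knob for "10x more careful".

```python
import math

# One lockout observed in 56 days, which is about 6.5 per developer per year.
EVENTS_PER_DEV_YEAR = 365 / 56


def expected_lockouts(developers: int, care_factor: float = 1.0) -> float:
    """Expected lockouts per year; care_factor=10 means 10x more careful."""
    return developers * EVENTS_PER_DEV_YEAR / care_factor


def p_at_least_one(developers: int, days: float, care_factor: float = 1.0) -> float:
    """Poisson probability of one or more lockouts within `days` days."""
    rate = expected_lockouts(developers, care_factor) * days / 365
    return 1 - math.exp(-rate)


if __name__ == "__main__":
    for devs in (1, 10, 100, 1_000, 10_000):
        print(f"{devs:>6} developers: ~{expected_lockouts(devs):,.0f} lockouts/year")
    # 10,000 engineers who are 10x more careful, over a single day.
    print(f"P(at least one in a day): {p_at_least_one(10_000, 1, care_factor=10):.6f}")
```

Even with the rate cut by a factor of 10, the model still expects roughly 18 events per day at 10,000 engineers, which is why the daily probability is effectively 1.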
AWS Support is not a safety net
19 hrs
Of escalation in one day
0
Cases actually resolved
Two cases marked "Resolved" while the account remains locked. Three are "Pending customer action" with steps that don't work. The account is still inaccessible.
What AI tool developers must build
- Hard gates, not soft prompts. Block file creation until requirements exist. Like CI blocks deploy on failed tests.
- Persistent violation state. Agent must know its violation count across sessions, relogins, and compactions.
- Authorization taxonomy. "Yes" ≠ "approved." Platform must enforce the vocabulary.
- Blast radius limits. One conversational turn = one infrastructure change. Never cascade.
- Mandatory dry-run. Destructive operations show preview + require separate confirmation. Hard stop.
- Session boundary enforcement. After any reset, agent re-reads and acknowledges rules before accepting tasks.
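Three of these requirements (hard gate, mandatory dry-run, blast radius limit) can be sketched together. A minimal illustration, not any vendor's API: a change can only be applied with a token issued by a prior preview, destructive changes need a separate explicit confirmation, and each token works once. All class and method names are hypothetical.

```python
import secrets


class GateError(RuntimeError):
    """Raised when a change is attempted without passing the gate."""


class ChangeGate:
    """One previewed change per token, one token per apply."""

    def __init__(self) -> None:
        self._pending = {}  # token -> (change description, destructive flag)

    def preview(self, change: str, destructive: bool) -> str:
        """Register a dry-run of exactly one change and return its token."""
        token = secrets.token_hex(8)
        self._pending[token] = (change, destructive)
        return token

    def apply(self, token: str, confirmed_destructive: bool = False) -> str:
        """Apply a previewed change; refuse anything that skipped the preview."""
        if token not in self._pending:
            raise GateError("no preview for this token; run preview first")
        change, destructive = self._pending[token]
        if destructive and not confirmed_destructive:
            raise GateError(f"destructive change needs separate confirmation: {change}")
        del self._pending[token]  # single use: one approval, one change
        return f"applied: {change}"


if __name__ == "__main__":
    gate = ChangeGate()
    token = gate.preview("delete user pool", destructive=True)
    try:
        gate.apply(token)  # blocked: previewed but not separately confirmed
    except GateError as err:
        print(err)
    print(gate.apply(token, confirmed_destructive=True))
```

The gate lives in code the agent cannot talk its way past: there is no prompt to forget, only a check that either passes or raises.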
The question for every CISO
If one AI coding assistant can permanently lock your management account in 90 minutes —
and your AI tool vendor has zero working guardrails to prevent it —
what is your risk acceptance posture?
Prompt-based rules are documentation.
They are not enforcement.
This business is shut down today because an AI coding assistant had no guardrails that actually worked.
Yours doesn't either.
Full paper: d18gqyv10pt526.cloudfront.net/white-papers/AI-cost-savings-analysis.md
Tom Boesch — ItBytes LLC — May 2026