
69 Vulnerabilities Across 15 Apps Built by 5 AI Coding Agents

The vulnerabilities were found in a systematic security audit of Claude Code, Codex, Cursor, Replit, and Devin. Every agent introduced server-side request forgery (SSRF). None of the 15 apps implemented cross-site request forgery (CSRF) protection, and none set a single security header.

The Study

In December 2025, security startup Tenzai ran the first systematic, head-to-head security benchmark of the five most popular AI coding agents: Claude Code, OpenAI Codex, Cursor, Replit, and Devin. Each agent was given the same three web application prompts and asked to build them end-to-end. The resulting 15 applications were then subjected to a full security audit.

The Results

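The audit found 69 vulnerabilities across the 15 applications, and basic security controls were missing throughout: every agent's output contained SSRF, no app implemented CSRF protection, and no app set security headers.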

Why SSRF Was Universal

The URL preview feature is a common pattern: a user pastes a link, and the server fetches it to generate a card with the page title and thumbnail. A secure implementation restricts the fetch to public URLs and blocks requests to internal addresses: loopback (127.0.0.0/8), private ranges such as 10.0.0.0/8, the cloud metadata endpoint at 169.254.169.254, and so on.

Every agent implemented the feature. None implemented the restriction. The agents handled the functional requirement (fetch a URL and return metadata) but not the security requirement (never let user input direct a server-side HTTP request to an internal address).
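As an illustration of the missing restriction, here is a minimal Python sketch of a pre-fetch check. It is not taken from the Tenzai audit or from any agent's output; the names `is_public_ip` and `resolve_public_addresses` are hypothetical, and the policy (http/https only, reject the URL if any resolved address is non-public) is an assumption about what a reasonable guard might look like.

```python
import ipaddress
import socket
from urllib.parse import urlparse

# Assumption: a URL preview only ever needs plain web schemes.
ALLOWED_SCHEMES = {"http", "https"}


def is_public_ip(ip_text: str) -> bool:
    """Return True only for globally routable addresses."""
    ip = ipaddress.ip_address(ip_text)
    # is_global is False for loopback (127.0.0.0/8), private ranges
    # (10.0.0.0/8 and friends), and link-local 169.254.0.0/16, which
    # contains the 169.254.169.254 metadata endpoint.
    return ip.is_global


def resolve_public_addresses(url: str) -> list[str]:
    """Hypothetical helper: validate a user-supplied URL before fetching.

    Returns the resolved IP addresses, or raises ValueError if the scheme
    is not allowed, the host cannot be resolved, or any resolved address
    is not public.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES or not parsed.hostname:
        raise ValueError("only absolute http(s) URLs are allowed")

    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        infos = socket.getaddrinfo(parsed.hostname, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        raise ValueError(f"cannot resolve host: {parsed.hostname}") from exc

    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as text.
    addresses = sorted({info[4][0] for info in infos})
    for addr in addresses:
        if not is_public_ip(addr):
            raise ValueError(f"blocked non-public address: {addr}")
    return addresses
```

A preview handler would call `resolve_public_addresses(user_url)` before making any request and treat a `ValueError` as a rejected link. A check like this is necessary but not sufficient: the actual fetch should connect to one of the validated addresses rather than resolving the hostname a second time, and any redirect target should go through the same check, otherwise DNS rebinding or a redirect to an internal address can bypass it.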

The Lesson

The 100% SSRF rate is not a fluke. It reveals a systematic gap: AI coding agents optimize for functionality, not for the absence of dangerous behavior. Security is defined by what the code does not do, and AI training data rewards what the code does. Until agents are specifically trained or constrained to model threat vectors, every auto-generated server-side fetch is a potential SSRF.