OpenAI Launches Codex Security to Detect and Patch Vulnerabilities
OpenAI has launched Codex Security, an advanced AI-driven application security agent now available in research preview. This tool promises to transform how developers identify and fix complex vulnerabilities by deeply analyzing project context, cutting through false positives that plague traditional scanners.
Originally called Aardvark, Codex Security emerged from a private beta last year, where it quickly proved its worth. Internal tests uncovered serious issues, including SSRF exploits and cross-tenant authentication flaws, which OpenAI’s team patched within hours.
External pilots refined its context handling and improved precision on repeated scans of the same repositories. One case showed an 84% reduction in noise, over-reported severity fell by more than 90%, and false positives dropped by more than 50% across the board.
Now rolling out to ChatGPT Pro, Enterprise, Business, and Edu users via the Codex web interface, it offers free access for the first month. This phased rollout prioritizes high-signal alerts, letting security teams skip the triage grind. Codex Security stands out by grounding its analysis in your codebase’s unique reality rather than generic rules.
Context Building and Threat Modeling: It scans repositories commit by commit, mapping the security-relevant structure of the system: what it trusts, what it exposes, and what it handles. From this it generates an editable threat model in natural language, highlighting risks such as user-upload interfaces prone to injection attacks. Teams can edit the model to match their architecture, keeping the agent aligned with how the system actually works.
Smart Prioritization and Validation: Armed with the threat model, it hunts vulnerabilities and ranks them by real-world impact. High-potential issues go to sandboxed environments for pressure-testing, simulating exploits without risking production. Tailored setups even run proofs of concept in project-specific contexts, further reducing false positives and providing logged evidence for review.
Context-Aware Patching: For validated flaws, it crafts fixes that respect surrounding code intent, minimizing regressions. Patches land as easy-to-review pull requests in GitHub, with filters to spotlight critical items.
| Phase | Key Technique | Benefit |
|---|---|---|
| Threat Modeling | Repo analysis + editable doc | Custom risk focus |
| Validation | Sandboxed PoCs | 50%+ false positive cut |
| Patching | System-intent patches | Safer, faster merges |
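The rank-then-validate step described above can be pictured with a small sketch. This is not OpenAI's implementation: the `Finding` fields, the `risk_score` formula, and the threshold are all hypothetical, chosen only to show how a threat model's notion of exposure could feed prioritization.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    title: str
    impact: int    # hypothetical 1-10 estimate of real-world impact
    exposed: bool  # hypothetical flag: threat model says attackers can reach it


def risk_score(finding: Finding) -> int:
    # Assumption: findings in unexposed components count for half as much.
    return finding.impact if finding.exposed else finding.impact // 2


def prioritize(findings, validate_threshold=7):
    """Rank findings by score and select those worth sandbox validation."""
    ranked = sorted(findings, key=risk_score, reverse=True)
    to_validate = [f for f in ranked if risk_score(f) >= validate_threshold]
    return ranked, to_validate


findings = [
    Finding("Verbose error page", impact=2, exposed=True),
    Finding("SSRF in URL fetcher", impact=9, exposed=True),
    Finding("Weak hash in offline tool", impact=8, exposed=False),
]
ranked, to_validate = prioritize(findings)
print([f.title for f in ranked])
print([f.title for f in to_validate])
```

In this toy run only the SSRF finding clears the threshold, so it alone would be sent on for a sandboxed proof of concept.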
Over 30 days in beta, it scanned 1.2 million commits, flagging 792 critical and 10,561 high-severity issues, with critical ones appearing in under 0.1% of scanned commits, according to OpenAI.
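The quoted figures can be sanity-checked with simple arithmetic. The sketch below assumes each finding maps to a distinct commit, which the source does not state, so the results are upper bounds on the share of commits affected.

```python
# Figures as quoted from OpenAI's beta numbers.
commits_scanned = 1_200_000
critical = 792
high = 10_561

# Assumption: at most one finding per commit, so these are upper bounds.
critical_rate = critical / commits_scanned * 100
high_rate = high / commits_scanned * 100

print(f"critical: {critical_rate:.3f}% of commits")  # critical: 0.066% of commits
print(f"high:     {high_rate:.3f}% of commits")      # high:     0.880% of commits
```

Under that assumption, critical findings touch roughly 0.066% of commits, which is below the 0.1% level.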
| CVE ID | Project | What Went Wrong (Easy Tech Talk) | Risk (What Attackers Gain) |
|---|---|---|---|
| CVE-2025-32990 | GnuTLS | Buffer too small for cert data, overflows | Crash or run bad code |
| CVE-2025-32989 | GnuTLS | Reads extra memory in cert parsing | Leaks secret keys |
| CVE-2025-32988 | GnuTLS | Frees the same memory twice while handling cert data | Corrupt data, run code |
| CVE-2025-64175 | GOGS | Skips 2nd password check | Steal user accounts |
| CVE-2026-25242 | GOGS | No login needed for key spots | Full control without login |
| CVE-2025-35430 | Thorium | Tricks file paths to write anywhere | Change or delete files |
| CVE-2025-35431 | Thorium (LDAP) | Bad inputs mess up user lookups | Steal user info |
| CVE-2025-35432 | Thorium | Unlimited spam or crash on verify emails | Block service, spam users |
| CVE-2025-35433 | Thorium (Users) | Old logins work after the password change | Keep access forever |
| CVE-2025-35434 | Thorium (Elastic) | Ignores bad certs in connections | Spy on traffic |
| CVE-2025-35435 | Thorium (API) | Math error (divide by zero) crashes it | Shut down the service |
| CVE-2026-24881 | GnuPG (gpg) | Stack overflow in the decrypt tool | Run code on victim’s machine |
NETGEAR’s Head of Product Security, Chandan Nandakumaraiah, praised its seamless integration: “Findings were clear and comprehensive, like having an expert researcher on board.” This echoes the broader gains: faster remediation without overwhelming reviewers.
Codex Security tackles a familiar pain for maintainers: low-quality reports flooding inboxes. OpenAI used it to scan dependencies and responsibly disclosed its findings to projects such as OpenSSH, GnuTLS, GOGS, Thorium, libssh, PHP, and Chromium. This yielded 14 CVEs, including heap overflows (CVE-2025-32990), 2FA bypasses (CVE-2025-64175), and path traversals (CVE-2025-35430).
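To make one of these bug classes concrete, here is a generic sketch of a path traversal guard. It is illustrative only: the `safe_join` helper is hypothetical and is not taken from Thorium or any other affected project.

```python
from pathlib import Path


def safe_join(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting escapes such as '../'."""
    base = Path(base_dir).resolve()
    target = (base / user_path).resolve()
    # Path.is_relative_to requires Python 3.9 or newer.
    if not target.is_relative_to(base):
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return target


for candidate in ("reports/q1.txt", "../etc/passwd", "/etc/passwd"):
    try:
        print("allowed:", safe_join("/tmp", candidate))
    except ValueError as err:
        print("blocked:", err)
```

The check runs after resolving the path, so both `../` sequences and absolute paths supplied by the user are rejected before any file is written.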
A new “Codex for OSS” program offers select maintainers free ChatGPT Pro/Plus accounts, code review, and scans; some have already used it to patch issues. OpenAI invites more OSS projects to join, aiming for a sustainable security uplift.
​Site: https://cybersecuritypath.com