DARPA AI Cyber Challenge · 2nd place · Open source

Buttercup

An open-source AI Cyber Reasoning System that finds and patches vulnerabilities on its own.

Buttercup is a fully automated, AI-driven system for discovering and patching vulnerabilities in open-source software. Trail of Bits built it for DARPA's AI Cyber Challenge, and now that the competition has concluded, it is open source for the whole security community to use, extend, and build on.

AIxCC results

Buttercup earned a total score of 219 at a cost of $181 per point, using exclusively non-reasoning LLMs. Across the competition, teams patched bugs in 54 million lines of code.

Presentations

Trail of Bits won second place in DARPA's AI Cyber Challenge at DEF CON 33. These presentations cover Buttercup's journey and what it was like to compete.

Buttercup: Building an AI Cyber Reasoning System

How Buttercup works and the approach behind it.

DARPA's Main Stage Presentation

The competition format and the final outcome.

How Buttercup works

Buttercup discovers and patches real vulnerabilities using static analysis and AI-guided fuzzing, then proves each bug and verifies each fix before reporting it. A multi-agent architecture runs the whole pipeline without a human in the loop.

Adaptive vulnerability discovery

Pairs static analysis with AI-guided fuzzing, adapting its search to each target to surface real, reachable bugs.

Extensive validation of bugs

Every candidate is reproduced and proven before it is reported, which kept Buttercup's results at 90% accuracy.

AI-driven patching

Generates fixes with LLMs and verifies each one closes the bug without breaking the build.

Fully autonomous system

Runs end to end with no human in the loop, from discovery through a validated patch.

Scalable architecture

A multi-agent design that runs anywhere, from a single laptop to enterprise Kubernetes clusters.

Language versatility

Finds and fixes vulnerabilities across many languages and 20 CWE categories.
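
The discover, prove, patch, and verify flow described in this section can be pictured with a small Python sketch. It is illustrative only and is not taken from Buttercup's codebase: the `Candidate` and `Report` types, the `run_pipeline` function, and the four callbacks are hypothetical names, and the stubs in the demo stand in for real fuzzing, building, and LLM calls. The one idea it tries to capture is the gating: nothing is reported unless it reproduces, and no patch is accepted unless it builds and stops the crash.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Candidate:
    """A suspected bug plus the input believed to trigger it (hypothetical type)."""
    name: str
    crashing_input: bytes


@dataclass(frozen=True)
class Report:
    """A proven bug and, if one passed both checks, the accepted patch."""
    candidate: Candidate
    patch: Optional[str]


def run_pipeline(
    candidates: Iterable[Candidate],
    reproduces: Callable[[Candidate], bool],
    propose_patches: Callable[[Candidate], Iterable[str]],
    builds: Callable[[str], bool],
    still_crashes: Callable[[Candidate, str], bool],
) -> List[Report]:
    """Report only reproduced bugs; accept only patches that build and fix them."""
    reports: List[Report] = []
    for cand in candidates:
        # Gate 1: drop any candidate that cannot be reproduced.
        if not reproduces(cand):
            continue
        # Gate 2: take the first patch that builds and no longer crashes.
        accepted: Optional[str] = None
        for patch in propose_patches(cand):
            if builds(patch) and not still_crashes(cand, patch):
                accepted = patch
                break
        reports.append(Report(cand, accepted))
    return reports


# Demo with stub callbacks (assumptions standing in for real tooling).
candidates = [
    Candidate("heap-overflow", b"A" * 64),
    Candidate("false-alarm", b""),
]
reports = run_pipeline(
    candidates,
    reproduces=lambda c: len(c.crashing_input) > 0,
    propose_patches=lambda c: ["bad-patch", "good-patch"],
    builds=lambda p: True,
    still_crashes=lambda c, p: p == "bad-patch",
)
for r in reports:
    print(r.candidate.name, r.patch)
```

In the demo, the candidate with no crashing input is filtered out at the first gate, and the first proposed patch is rejected because the stubbed crash check still fails for it. Keeping both gates as plain callbacks is only a convenience for the sketch; a real system would presumably run these steps as separate services.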

The Buttercup team

The engineers and researchers behind Buttercup.

Michael Brown
Ian Smith
Evan Downing
Eric Kilmer
Riccardo Schirone
Francesco Bertolaccini
Ronald Eytchison
Henrik Brodin
Brad Swain
Boyan Milanov
Alessandro Gario

Competition resources

Talks & events

Webinar
Hardening the Code: A Q&A on AI-Powered Security

Aired Sep 11, 2025 · Recording available

Edera's Dan Fernández with Trail of Bits' Michael Brown. A recorded Q&A on AI-driven vulnerability management, agent design, and deploying secure AI systems at scale.

Workshop
Frontier AI in Cybersecurity: Risks and Opportunities

Nov 6 & 12, 2025 · Online · Berkeley RDI and Schmidt Sciences

Dan Guido and Riccardo Schirone. Our talk, "AIxCC Floating All Boats," on making Buttercup usable for everyone, alongside other AIxCC winning teams and frontier AI labs.

Conference talk
Buttercup and DARPA's AI Cyber Challenge

Nov 7, 2025 · RingZer0 COUNTERMEASURE, Ottawa

Henrik Brodin and Ronald Eytchison. How Buttercup discovers and patches real vulnerabilities with static analysis and AI-guided fuzzing, and what we learned about when AI helps versus hurts.

From @trailofbits

Buttercup won the $3M second prize at DARPA's AIxCC. We found 28 vulnerabilities across 20 CWEs with 90% accuracy at just $181/point, achieving this with exclusively non-reasoning LLMs.
@trailofbits · Aug 9, 2025

Buttercup is now open source! Here's our updated and refactored repo, suitable for use by individuals. The blog has key architectural background about how it works.
@dguido · Aug 8, 2025

DARPA's AIxCC finals: 7 autonomous AI systems are competing right now to find and patch vulnerabilities in critical open-source programs like the Linux kernel, SQLite, and cURL.
@trailofbits · Jul 2, 2025

News & coverage
