You might think that one special advantage of AI systems is that they should be immune to attack. After all, their methods are so complex and opaque that even we can’t understand how they work, so what hope would a hacker have against them? But you would be mistaken; there’s nothing hackers like better than a challenge. DARPA ran its Cyber Grand Challenge at DEF CON this year, with the objective of pitting AI systems against each other in real-time attack and defense.
I should mention a few important points up front. First, these folks are all white hats; their goal is to research more responsive methods to defend systems, particularly against zero-day attacks. Second, as is the way with DARPA challenges, the playing field was a simplified subset of a production environment – a pared-down 32-bit Linux derivative running on an Intel architecture, with access to only a handful of system calls. So you probably don’t need to worry about bad actors hacking your smart car through these methods, at least in the near future. Finally, the contest pitted AIs against each other rather than corrupting the AIs themselves with bad behavior. Compromising AIs is a fascinating but different topic, which I may cover in a separate blog.
The contest was a version of capture the flag (CTF), in which each competing system had to defend against attacks from the other systems while also probing those systems for a proof of vulnerability (POV): an exploit that successfully attacks a vulnerability in an opponent’s system. To ensure a level playing field, each team started with an identical system (apart from the AI software running on it), and teams were then given identical challenge sets containing one or more vulnerabilities.
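To make the idea of a POV concrete, here is a minimal sketch (in C) of the kind of memory-corruption flaw a challenge set might contain. It is purely illustrative, not code from an actual challenge binary:

```c
#include <stdio.h>
#include <unistd.h>

/* Illustrative only: a classic stack buffer overflow of the sort a
 * challenge binary might hide. read() accepts up to 256 bytes into a
 * 64-byte stack buffer, so a long input overwrites the saved return
 * address. An attacking system's POV would be an input that provably
 * triggers this; a patch would bound the read to sizeof(buf). */
static void handle_request(void) {
    char buf[64];
    ssize_t n = read(STDIN_FILENO, buf, 256);  /* bug: 256 > sizeof(buf) */
    if (n > 0)
        printf("received %zd bytes\n", n);
}

int main(void) {
    handle_request();
    return 0;
}
```

Finding this flaw in source code is trivial; finding it automatically in a stripped binary, and proving it exploitable, is the hard part the competitors had to automate.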
Each system probed its own copy of a challenge set to find vulnerabilities, and then had two possible courses of action: it could patch a vulnerability, in which case the patch was made available to all systems, or it could exploit the vulnerability on competing systems. Points were awarded for successful patches (and subtracted if a patch was found to introduce new vulnerabilities), for capturing the flag by demonstrating a POV, and for successfully deflecting attacks.
The AI in all of this is that each system had to respond and evolve its strategy autonomously, balancing deflection (presumably scoring was arranged so that simply blocking all communication would automatically lose), patching (which required some downtime to install patches, reducing the time available to respond to newly discovered threats) and attack. Early in the contest new challenge sets were added every 10 minutes, and in later stages every 5 minutes, so strategies could not afford to waste time.
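To make that balancing act concrete, here is a toy sketch of the patch-versus-exploit decision, with entirely hypothetical point values and probabilities; the real contest’s scoring formula was more involved:

```c
#include <stdio.h>

/* Toy model of the patch-vs-exploit trade-off each system had to weigh
 * autonomously. All numbers are hypothetical, not the actual CGC scoring. */
typedef enum { PATCH, EXPLOIT } action_t;

static action_t choose_action(double patch_points,      /* reward for a clean patch */
                              double downtime_cost,     /* points lost while installing it */
                              double pov_points,        /* reward for a successful POV */
                              double pov_success_prob)  /* estimated chance the exploit lands */
{
    double patch_value   = patch_points - downtime_cost;
    double exploit_value = pov_points * pov_success_prob;
    return (patch_value >= exploit_value) ? PATCH : EXPLOIT;
}

int main(void) {
    /* Example: a reliable exploit beats a patch with heavy downtime. */
    action_t a = choose_action(10.0, 6.0, 12.0, 0.8);
    printf("decision: %s\n", a == PATCH ? "patch" : "exploit");
    return 0;
}
```

A real strategy also had to account for the rule that a published patch becomes available to every competitor, and to revise its success estimates as its attacks landed or failed.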
The winning system was MAYHEM from ForAllSecure.com, which is working on productizing the defensive part of its solution (detecting and patching vulnerabilities at the binary level). It looks like they won the contest by a pretty wide margin, though several other competitors turned in very respectable performances. Perhaps not surprisingly, ForAllSecure.com doesn’t reveal much about its AI architecture(s), but I doubt the basics can deviate too much from one (or more) of the well-known frameworks.
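What has been published about MAYHEM describes hunting vulnerabilities at the binary level with techniques such as fuzzing and symbolic execution. As a flavor of the simplest end of that spectrum, here is a bare-bones random fuzzer: it feeds random bytes to a target binary over stdin and flags crashes as candidate vulnerabilities. This is a sketch of the general technique, not MAYHEM’s method:

```c
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>
#include <sys/wait.h>

/* Bare-bones random fuzzer: run the target with random stdin and report
 * crashes. Real systems add coverage feedback, symbolic execution, and
 * crash triage; this sketch shows only the core loop. */
static void fuzz_once(const char *target, const unsigned char *buf, size_t len) {
    int fds[2];
    if (pipe(fds) < 0) return;
    pid_t pid = fork();
    if (pid == 0) {                        /* child: run the target */
        dup2(fds[0], STDIN_FILENO);        /* fuzz input arrives on stdin */
        close(fds[0]);
        close(fds[1]);
        execl(target, target, (char *)NULL);
        _exit(127);                        /* exec failed */
    }
    close(fds[0]);
    if (write(fds[1], buf, len) < 0) { /* target may crash before reading */ }
    close(fds[1]);
    int status;
    waitpid(pid, &status, 0);
    if (WIFSIGNALED(status))               /* crash = candidate vulnerability */
        printf("crash: signal %d on %zu-byte input\n", WTERMSIG(status), len);
}

int main(int argc, char **argv) {
    if (argc != 2) {
        fprintf(stderr, "usage: %s <target-binary>\n", argv[0]);
        return 1;
    }
    signal(SIGPIPE, SIG_IGN);              /* survive writes to a dead child */
    srand(1234);                           /* fixed seed for repeatable runs */
    unsigned char buf[256];
    for (int i = 0; i < 1000; i++) {
        size_t len = 1 + (size_t)(rand() % (int)sizeof(buf));
        for (size_t j = 0; j < len; j++)
            buf[j] = (unsigned char)(rand() & 0xff);
        fuzz_once(argv[1], buf, len);
    }
    return 0;
}
```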
This should be an interesting direction to follow. Signature-based detection methods are already outdated and behavior-based methods are becoming more common, so AI controlling and evolving those methods is a natural next step. You can learn more about the DARPA challenge HERE and more about MAYHEM at ForAllSecure.com.