For decades, cybersecurity operated under a predictable equilibrium: humans wrote the code, and humans, aided by relatively simple automated tools, found the bugs. But as Nicholas Carlini, a lead researcher at Anthropic, argued at the [un]prompted 2026 conference, that era is over. We have entered the age of "Black-hat LLMs," in which the most significant shift in security since the invention of the internet is unfolding over months rather than decades.
The Arrival of the Autonomous Attacker
Carlini’s core thesis is unsettling: today's Large Language Models (LLMs) are better at finding and exploiting software vulnerabilities than elite human professionals. To support this, he showcased a series of exploits discovered by AI with almost no human intervention. The "scaffold" was simple: the model was told it was in a hacking competition and given access to a virtual machine. With that alone, the AI tore through hardened codebases that have been public for decades.
From 2003 to the Present: The Linux Kernel Breach
Perhaps the most shocking revelation in Carlini’s presentation was the discovery of a remotely triggerable "heap buffer overflow" in the Linux kernel. The Linux kernel is among the most scrutinized pieces of software on the planet, yet the AI identified a vulnerability in its Network File System (NFS) code that had remained hidden since 2003.
The AI didn't just find the bug; it also explained the logic behind it. Carlini displayed a flow diagram created entirely by the model, detailing how two competing adversaries could trick the system into crashing. That a machine could grasp the "nuance" of such a complex, deep-rooted bug, one Carlini admits he likely couldn't have found himself, marks a fundamental shift in AI capability.
The Ghost in the Machine
Beyond the kernel, the AI targeted the Ghost Content Management System. Although Ghost has 50,000 stars on GitHub and had a clean record regarding critical vulnerabilities, the model found the project's first-ever critical flaw: a "SQL injection" (a way to trick a database into giving up secrets). The AI then autonomously wrote a script to exploit the vulnerability, extracting admin API keys and password hashes without any prior login. The lack of friction is what makes this "scary": a malicious actor no longer needs years of training; they simply need to ask the model to find the hole.
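For readers unfamiliar with the bug class, here is a minimal sketch of how a SQL injection works in general. It is not the Ghost vulnerability, whose details are not given here; the table, column names, and helper functions are all hypothetical, and the example uses Python's built-in sqlite3 module with a throwaway in-memory database.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (slug TEXT, title TEXT, visibility TEXT)")
    conn.executemany(
        "INSERT INTO posts VALUES (?, ?, ?)",
        [
            ("hello", "Hello World", "public"),
            ("draft", "Admin API key notes", "private"),
        ],
    )
    return conn


def find_posts_vulnerable(conn: sqlite3.Connection, slug: str) -> list[str]:
    # UNSAFE: user input is pasted directly into the SQL text, so a quote
    # character in `slug` can end the string literal and add new conditions.
    query = (
        "SELECT title FROM posts "
        f"WHERE visibility = 'public' AND slug = '{slug}'"
    )
    return [row[0] for row in conn.execute(query)]


def find_posts_safe(conn: sqlite3.Connection, slug: str) -> list[str]:
    # SAFE: the `?` placeholder sends `slug` as data, never as SQL syntax.
    query = "SELECT title FROM posts WHERE visibility = 'public' AND slug = ?"
    return [row[0] for row in conn.execute(query, (slug,))]


if __name__ == "__main__":
    conn = make_db()
    payload = "' OR visibility = 'private"  # attacker-controlled input
    print(find_posts_vulnerable(conn, "hello"))  # normal request
    print(find_posts_vulnerable(conn, payload))  # leaks the private row
    print(find_posts_safe(conn, payload))        # matches nothing
```

In the unsafe version, the attacker's quote character closes the string early and the rest of the input is read as SQL, so the "public only" filter is bypassed. The parameterized version treats the same input as an ordinary, non-matching slug.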
The "Transitionary Period"
Carlini warns that we are in a "transitionary period." In the long run, AI might make us safer by rewriting all software in memory-safe languages like Rust or by formally verifying every line of code. Right now, however, we are in a race. Billions of lines of legacy code are still in use, and AI can find bugs in that code at a scale and speed humans cannot match. With AI capabilities currently doubling roughly every four months (about an eightfold increase per year if the pace holds), the "average model on a laptop" will soon possess the skills of a world-class hacker.
How You Can Stay Safe (Non-Developer Guide)
For those of us who aren't software developers, this shift means that the apps and devices we use daily—from banking tools to smart home gadgets—may temporarily become more vulnerable as AI finds old bugs faster than humans can fix them. To protect yourself during this period, you must change your digital habits:
1. Embrace "Update Culture": Since AI can now discover vulnerabilities in hours, software companies are racing to release "patches" (digital band-aids). Turn on automatic updates for every device you own (phones, laptops, and even smart TVs) so that these fixes are applied as soon as they are available.
2. Use Multi-Factor Authentication (MFA): AI makes it easier for hackers to steal login credentials. MFA is no longer optional; with an authenticator app or a physical security key, a hacker who "breaks the lock" on your password still needs a second factor to get into your account.
3. Audit Your Digital Footprint: Treat old, unmaintained apps or "smart" gadgets from unknown brands as high risks. If you don't use an account or a device anymore, delete it or disconnect it. In an AI-driven world, your "attack surface" (the accounts, devices, and data an attacker could reach) should be as small as possible.
Conclusion: A Call for Defensive AI
The message to the tech industry is clear: the time for denial is over. The "dual-use" nature of AI means that while companies like Anthropic implement safeguards, malicious actors will inevitably seek to "jailbreak" these models for harm. To counter this, the world needs an immediate, massive influx of talent focused on using these same AI tools for defense. As Carlini concluded, "Waiting a year is going to be too long." We are living through a revolution that requires us to rethink the very foundations of how we protect our digital lives.
Watch the full talk on YouTube