CrowdStrike Research: Securing AI-Generated Code with Multiple Self-Learning AI Agents

  • CrowdStrike data scientists are researching innovative self-learning, multi-agent AI systems that employ Red Teaming capabilities
  • This new approach, presented at the NVIDIA GTC 2025 conference, is designed to minimize vulnerabilities in the coming flood of AI agent-developed code
  • While still in the proof-of-concept stage, our research holds significant promise as a necessary step toward preventing unpatched vulnerabilities from becoming a larger cybersecurity problem

Applying robust security measures to automated software development is no longer a luxury but a necessity. CrowdStrike data scientists have developed an AI-driven, multi-agent proof of concept that leverages Red Teaming capabilities to identify vulnerabilities in code developed by AI agents. While it is still in the research stage, our work shows this advanced AI technology has the potential to revolutionize software security.

In our novel self-learning multi-agent AI systems (MAS), each agent fulfills a distinct security role, and the agents work together to reinforce one another's knowledge and actions. We employ proactive vulnerability detection and automatic exploitation to protect autonomous code generation processes. Our research team determined that adopting a multi-agent approach enables the identification of potential vulnerabilities before adversaries can exploit them, ensuring the integrity of software systems and empowering developers to focus on what matters most: creating innovative and secure software solutions.

With AI agent-developed code becoming increasingly common, self-learning MAS such as those presented by CrowdStrike data scientists at the NVIDIA GTC 2025 conference could be the key to preventing a flood of unpatched vulnerabilities and the cybersecurity challenges they cause.

The Future of Coding Brings New Cybersecurity Challenges

In the realm of software development, autonomous code generation agents are a game-changer. By automating complex coding tasks, these agents free developers to focus on higher-value work. No longer are they tied to mundane, time-consuming processes that drain their creative energy.

Under the "vibe coding" approach — a concept popularized by OpenAI co-founder Andrej Karpathy — large language models (LLMs) handle much of the coding. No programming experience or technical know-how is required — users can simply type prompts about the task they are attempting to accomplish into a text box and let the AI tool output a prototype app. Vibe coding is attracting more people than ever, as it allows them to tap into their creativity and channel their passion into building innovative applications. The result? Plenty of new software systems, but also plenty of risk.

Vulnerabilities have long been a concern for the cybersecurity industry. The gap between vulnerability discovery and patching leaves organizations at considerable risk of exploitation. Given the rapid speed of autonomous code development, this gap is poised to grow wider.

With the industry shifting toward automated code generation and review, securing AI agent-developed code has become a key security challenge. While human software testers and tools can validate code security, the scale and pace of dynamic code generation pose a significant hurdle. The problem is akin to trying to keep up with a speeding bullet — it's a daunting task that requires intelligent solutions to automate and enhance security processes.

In this new landscape where autonomous code generation is the norm, the need for intelligent solutions that scale is more pressing than ever. We require systems that can both automate security processes and anticipate and adapt to emerging threats. Harnessing the power of AI at the inception of the software development lifecycle ensures security is built into the pre-release stages of the process.

Evolving the Art of Code Security with AI Agents

In this automated world of “vibe coding” and AI-augmented code generation, we have initiated a comprehensive testing regime for an advanced AI-powered system designed to enforce strict adherence to secure software development practices. This autonomous agent system leverages the latest threat detection capabilities to identify vulnerabilities in the codebase, thereby providing enhanced protection against a wide range of security threats, including unauthorized access, backdoor insertion, vulnerability exploitation, and other malicious activities.

Our proof of concept consists of three agentic AI systems, each fulfilling a distinct security role; the agents work together to reinforce one another's knowledge and actions. These systems, illustrated with a brief code sketch after Figure 1, include:

  1. Vulnerability scanning agent: identifies code vulnerabilities and determines which static application security testing (SAST) tool best fits each application.
  2. Red Teaming agent: builds exploitation scripts using internal knowledge and information from historical exploitation databases. This agent learns from previous iterations, associating tuples of a specific vulnerability and its exploitation code with the best results.
  3. Patching agent: generates security unit tests and code patches based on the input from the Vulnerability AI agent, the compound feedback from unit tests, and the exploitation results driven by the Red Teaming AI agent.
Figure 1. The three AI agents working together
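
To make the division of labor concrete, here is a minimal Python sketch of the scan-exploit-patch loop pictured in Figure 1. Every class, method, and the canned finding below is a hypothetical stand-in for illustration, not CrowdStrike's actual implementation.

```python
# Minimal sketch of the three-agent loop; all names are hypothetical.
from dataclasses import dataclass

@dataclass
class Finding:
    rule_id: str    # SAST rule that fired
    location: str   # file:line of the suspect code

class VulnerabilityAgent:
    def scan(self, repo: str) -> list[Finding]:
        """Run the selected SAST tool(s); stubbed with a canned finding."""
        return [Finding("py.sql-injection", f"{repo}/app.py:42")]

class RedTeamAgent:
    def exploit(self, finding: Finding) -> bool:
        """Attempt exploitation; True means the finding is validated (stub)."""
        return finding.rule_id.endswith("sql-injection")

class PatchingAgent:
    def patch(self, finding: Finding) -> str:
        """Generate a security unit test and a candidate patch (stub)."""
        return f"candidate patch for {finding.location}"

def secure(repo: str) -> None:
    vuln, red, patcher = VulnerabilityAgent(), RedTeamAgent(), PatchingAgent()
    for finding in vuln.scan(repo):
        if red.exploit(finding):           # validate before patching
            print(patcher.patch(finding))

secure("demo-repo")  # prints: candidate patch for demo-repo/app.py:42
```

In the full workflow described above, the candidate patch would be fed back to the Red Teaming agent for re-exploitation, closing the loop.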

The novelty of this workflow lies in the self-learning processes that allow the system not only to identify similar situations and apply the best-known solution, but also to automatically adapt to new cases based on the interaction among all of the security roles as they reinforce one another's knowledge and actions.
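
One way to picture that self-learning element, under the assumption of a simple success-rate heuristic: an experience store that records which exploitation script worked against which vulnerability class, echoing the tuple association described for the Red Teaming agent. The `ExploitMemory` class below is purely illustrative; the actual system presumably relies on RAG and fine-tuning rather than a lookup table.

```python
# Illustrative experience store; a stand-in for the system's actual memory.
from collections import defaultdict

class ExploitMemory:
    """Associates (vulnerability type, exploit script) tuples with outcomes."""

    def __init__(self) -> None:
        # (vuln_type, exploit) -> [successes, attempts]
        self._stats = defaultdict(lambda: [0, 0])

    def record(self, vuln_type: str, exploit: str, succeeded: bool) -> None:
        stats = self._stats[(vuln_type, exploit)]
        stats[0] += int(succeeded)
        stats[1] += 1

    def best_exploit(self, vuln_type: str) -> str | None:
        """Return the historically most successful exploit for this bug class."""
        candidates = [
            (successes / attempts, exploit)
            for (vt, exploit), (successes, attempts) in self._stats.items()
            if vt == vuln_type and attempts > 0
        ]
        return max(candidates)[1] if candidates else None
```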

The Power of Mixology: Combining LLMs and SAST

Given a code repository, the Vulnerability AI agent is responsible for finding vulnerabilities in the received source code or in the pull request updates of an application. With each new scan, the agent selects one or more SAST tools that fit the target application, based on the information it obtains from README files and from previous knowledge of similar cases.

The Vulnerability AI agent then initiates a comprehensive scan of the source code, leveraging custom rules for the SAST tool(s) to identify potential security weaknesses. This process is facilitated through collaboration with the Patching AI agent and the Red Teaming AI agent, which assist in generating security unit tests and validating vulnerabilities through exploitation.
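
As a rough illustration of the tool selection step, the snippet below keys candidate scanners off ecosystems mentioned in a README. The mapping table and the `select_sast_tools` helper are assumptions made for this sketch; Semgrep, Bandit, and gosec are real open-source SAST tools, but not necessarily the ones our agent chooses from.

```python
# Hypothetical SAST selection keyed on README contents.
SAST_BY_ECOSYSTEM = {
    "python": ["bandit", "semgrep"],
    "javascript": ["semgrep"],
    "go": ["gosec"],
}

def select_sast_tools(readme_text: str) -> list[str]:
    """Pick SAST tools whose target ecosystem the README mentions."""
    text = readme_text.lower()
    tools: list[str] = []
    for ecosystem, candidates in SAST_BY_ECOSYSTEM.items():
        if ecosystem in text:
            tools.extend(t for t in candidates if t not in tools)
    return tools or ["semgrep"]  # polyglot fallback

print(select_sast_tools("A Python web app with a JavaScript frontend."))
# ['bandit', 'semgrep']
```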

Upon completing the vulnerability scan and validation processes, the Vulnerability AI agent refines its knowledge base by assessing the validity of identified vulnerabilities and evaluating the accuracy of SAST tools for specific application types. This refined understanding enables the AI agent to generate highly effective patches that address the disclosed vulnerabilities.

Discovering and addressing a pre-release code vulnerability traditionally requires considerable manual effort: a SAST alert is reviewed, the codebase is opened to validate the issue, a fix is written, a pull request is created, and the change is merged. With our approach, the time required for this process in our testing environment is reduced by approximately 90%.

Combining LLMs and SAST tools in a closed loop, with the LLM interpreting README files and SAST tool output, enables continuous improvement of the AI agent's capabilities in detecting and mitigating security threats.
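
Sketched in Python, that refinement loop might look like the following, with a stubbed `llm_triage` standing in for a real model and an exponential moving average as an assumed accuracy-update rule; neither detail comes from the actual system.

```python
# Illustrative refinement loop; the triage stub and EMA rule are assumptions.
def llm_triage(finding: dict) -> bool:
    """Ask an LLM whether a SAST finding is a true positive (stubbed)."""
    return finding.get("severity") == "high"  # placeholder heuristic

def refine_tool_accuracy(findings: list[dict], accuracy: dict[str, float]) -> None:
    """Update per-tool precision estimates from triage and exploitation results."""
    for finding in findings:
        confirmed = llm_triage(finding) and finding.get("exploited", False)
        prior = accuracy.get(finding["tool"], 0.5)  # neutral prior for new tools
        # nudge the estimate toward the observed outcome
        accuracy[finding["tool"]] = 0.9 * prior + 0.1 * float(confirmed)

scores: dict[str, float] = {}
refine_tool_accuracy(
    [{"tool": "semgrep", "severity": "high", "exploited": True}], scores
)
print(scores)  # {'semgrep': 0.55}
```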

AI-Powered Validation Through Exploitation

Our Red Teaming AI agent acts as a guardian against our software's unseen vulnerabilities. With “instincts” honed from past experiences and grounded in retrieval-augmented generation (RAG) systems, this technology identifies and validates vulnerabilities through exploitation, and then uses its expertise to uncover new weaknesses that might have been hiding in plain sight. The results are used to refine and optimize our multi-AI agent approach through fine-tuning.

This agent’s LLM receives contextual data from the Vulnerability role, allowing it to generate exploitation code and start the application by reading its configuration files (Dockerfiles, makefiles, etc.). It then gathers critical reconnaissance information, including the application endpoint, exposed ports, and available web routes. These steps support the final purpose of executing the targeted exploitation against the running code, ensuring maximum impact.
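
For a concrete flavor of that reconnaissance step, the snippet below pulls exposed ports from a Dockerfile's EXPOSE directives. This is a narrow, assumption-laden sketch: the `exposed_ports` helper is hypothetical, and real targets may declare ports in docker-compose files or makefiles instead.

```python
# Hypothetical recon helper: collect ports from Dockerfile EXPOSE directives.
import re

def exposed_ports(dockerfile_text: str) -> list[int]:
    """Return port numbers declared via EXPOSE, ignoring the protocol suffix."""
    ports: list[int] = []
    for line in dockerfile_text.splitlines():
        match = re.match(r"\s*EXPOSE\s+(.+)", line, re.IGNORECASE)
        if match:
            ports.extend(int(spec.split("/")[0]) for spec in match.group(1).split())
    return ports

print(exposed_ports("FROM python:3.12\nEXPOSE 8000 9090/tcp\n"))  # [8000, 9090]
```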

Together, our trio of AI agents collaborates in a glass-box testing role to deliver cybersecurity expertise at machine speed. They identify potential vulnerabilities through exhaustive scanning and analysis, conducting rigorous testing using proven exploit simulation techniques. This approach shows promise in ensuring that vulnerabilities are not left unaddressed, preventing them from being released and subsequently requiring expensive and time-consuming human remediation.

Research Is Key to Staying Ahead of Adversaries – and Technology

It is important to stay one step ahead of malicious actors by simulating real-world attacks on your systems before adversaries can exploit them. The research undertaken by CrowdStrike data scientists facilitates the identification of vulnerabilities and weaknesses in real time.

By harnessing the collective power of our AI agents, organizations can look forward to a future where cybersecurity visibility, agility, and effectiveness have the potential to reach new levels. This will help to protect their most valuable assets in a rapidly evolving threat landscape and reduce the time and effort required to identify vulnerabilities with automated rigorous tests.

Our presentation of novel, self-learning MAS at the NVIDIA GTC 2025 conference is a part of CrowdStrike’s commitment to research, cybersecurity industry thought leadership, and staying ahead of adversaries to stop breaches. It is this focus on innovation that ensures the AI-native CrowdStrike Falcon® platform remains at the forefront of cybersecurity protection, even as game-changing technology like AI-augmented code generation comes into play.

Additional Resources

CrowdStrike 2025 Global Threat Report

Get your copy of the must-read cybersecurity report of the year.