Red team security testing with agentic AI

    TL;DR

    In the hands of an experienced pentester, agentic AI can accelerate what’s possible and spot flaws easily overlooked by the human eye.

    In the age of agentic AI, the scope of what’s possible from a red team perspective has absolutely exploded. This post will discuss how I use an agentic workflow to increase the surface area I can cover when I perform manual internal security testing on our applications here at Nutrient. I’ll walk through my setup, share prompting strategies, and offer guidance on integrating AI tools into your own pentesting workflow.

    Basic setup

    To assist with penetration testing, I use Claude Code in a Kali Linux virtual machine (VM). This is for two reasons:

    1. A common pentesting workflow involves installing various programs and accessing system tools to do inherently unsafe tasks, such as stealing browser cookies and other nefarious activities.
      • Running Claude in a VM ensures that no matter what it does, Claude won’t be able to access my real user credentials from my actual browser where I’m signed into company systems. Note: Never sign into production credentials from within a pentesting VM.
    2. By using Kali Linux, I have a wide assortment of system tools at Claude’s disposal for conducting penetration testing.
      • Furthermore, since my throwaway VM hasn’t been locked down at all and doesn’t have access to sensitive production information, I can freely allow Claude to install tooling as needed.
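    As a rough sketch of that setup, here’s how I might provision the throwaway VM. The package names and install command are assumptions; check the current Claude Code install instructions for your platform:

```shell
# Provision a disposable Kali VM for agentic pentesting.
# Run inside the VM only -- never on a machine holding real credentials.
sudo apt update
sudo apt install -y nodejs npm              # runtime for the Claude Code CLI
npm install -g @anthropic-ai/claude-code    # install path is an assumption
claude --version                            # sanity-check the install
```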

    MCP

    MCP (Model Context Protocol) is your friend because it lets you give Claude access to external tools, most importantly a browser. This is incredibly useful if you’re testing a live web application that can be accessed in a browser, since you can combine source code access with full web interactivity for your digital pentest partner.

    You can use the Playwright MCP server for browser functionality:

    claude mcp add --transport stdio playwright -- npx -y @playwright/mcp@latest

    A web browser is one of the most common MCP integrations for pentesting purposes, but I encourage you to look for more ways to improve your workflow. As of this writing, Claude Code is less than a year old, and its integration ecosystem is already enormous. Your options will only continue to expand as time goes on.
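    Once a server is added, a quick sanity check confirms that Claude Code registered it, using the same claude mcp subcommand family:

```shell
# List registered MCP servers; "playwright" should appear in the output.
claude mcp list
```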

    Prompting

    AI tooling will amplify whatever you give it, so your output quality is directly proportional to the quality of your prompt. When writing prompts, give your AI agent the information and context it needs so that its response is as accurate as possible.

    Here’s an example of what a general prompt for a first pass of testing with an AI agent looks like:

    You are a penetration tester with access to all the tools at your disposal on a Kali virtual machine right now. I would like you to help me find vulnerabilities in a web application that I made.
    My team built a web application that [EXPLAIN WHAT THE TARGET DOES]. The web application’s live staging address is [TARGET]. The corresponding source code to the application is present in this folder, so you can look through the entire codebase and use that to look for issues and inform the testing.
    [CREDENTIALS / SESSION TOKENS] for a test user who has access to [EXPLAIN THE USER LEVEL OF ACCESS] are: [PASTE CREDS IF YOU HAVE A TEST USER FOR IT]
    Please come up with a plan to check the application for vulnerabilities and then do the testing. Preferably, I want to test in the application itself to see if there’s anything wrong with [PUT ANY DETAILS IN THAT YOU WANT TO FOCUS ON], but I want to look for other issues too.
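    In practice, I like to script this template so the placeholders live in one place. Here’s a minimal sketch; the target URL, description, and focus area below are hypothetical examples:

```shell
#!/usr/bin/env bash
# Build a first-pass pentest prompt from a template.
# All placeholder values below are hypothetical examples.
TARGET_URL="https://staging.example.internal"
TARGET_DESC="lets customers upload and share documents"
ACCESS_LEVEL="a standard, non-admin account"
FOCUS="the upload and sharing flows"

cat > prompt.txt <<EOF
You are a penetration tester with access to all the tools on this Kali
virtual machine. Help me find vulnerabilities in a web application my
team built that ${TARGET_DESC}. The live staging address is ${TARGET_URL}.
The corresponding source code is in this folder; use it to inform testing.
Credentials for a test user with ${ACCESS_LEVEL} are: [PASTE CREDS]
Come up with a plan to check the application for vulnerabilities, then do
the testing. Focus on ${FOCUS}, but look for other issues too.
EOF

echo "Wrote $(wc -l < prompt.txt) prompt lines to prompt.txt"
# Then: claude "Read prompt.txt and begin."
```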

    Framing and model alignment

    Red team testing has always existed, and always will, in a sort of gray area, because you’re performing the exact same set of actions a malicious actor would. Because of the alignment controls placed on most AI models, you cannot blatantly ask AI agents to exploit things for you in those terms. Instead, you must ask nicely.

    Make sure to always frame your prompts from the perspective of “I’m a developer and I’m trying to find security problems in code that I wrote.” If you don’t include some amount of positive framing in your request, your AI agent may start to argue with you about doing testing that you’re fully authorized to do.

    To be clear, the contents of this post are intended purely for educational use and ethical penetration testing, such as internal testing against your own products. Don’t ever attempt to exploit software that you haven’t been given explicit written permission to test, or you risk not only ruining your career, but also going to jail for a very long time.

    The role of an AI agent on the red team

    AI agents don’t currently possess the high-context general problem solving that human beings are capable of, but there are certain categories of tasks that AI tools are inherently better at than humans. The greatest advantage that an AI agent has over a human is its ability to consume, retain, and apply verbose documentation across a large content surface.

    To give a simple and concrete example, Claude can fetch online documentation, tell you the exact nuances of any given iptables flag, and cross-reference it across a large configuration file faster than you can even type man iptables. This is a superpower when it comes to picking software apart for the purpose of breaking it:

    • Claude can be used for deobfuscating code, along with other reverse engineering and decompilation work that’s incredibly tedious to do as a human.
      • It’s actually shocking how much you can do just by giving Claude any random file or folder with instructions along the lines of “reverse engineer this for me.” It will plod off and figure out whatever tools are needed to solve your problem while you go do other work. We’re now living in a world where it takes almost no human effort to reverse engineer software.
    • Claude can quickly prototype attacks when given tools, examples, and specific actionable goals. For example, do you think there might be an exposed database call somewhere? Tell Claude to throw sqlmap at your app.
      • Remember, AI agents are command-line beasts. Luckily, there are plenty of great command-line tools for breaking into things, so you don’t even need an MCP to tap into the existing ecosystem of CLI exploit tools. Your imagination really is the limit.
    • Product documentation is your best friend. If you can, scrape any live product documentation for your target, put it in the folder your agent has access to, and mention it in your prompt. An AI agent can keep an entire documentation repository in mind when traversing and inspecting a huge codebase.
      • A side benefit of giving Claude documentation for a better understanding of your target is that it will also often point out discrepancies and undocumented functionality. This kind of finding (while not always directly security-related) can be a sort of extra treat for improving the developer experience of your documentation.
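    For instance, the sqlmap prototyping mentioned above might start from something like this sketch, where the target URL and session cookie are hypothetical placeholders:

```shell
# Assemble a sqlmap probe against a suspected injectable parameter.
# The target URL and session cookie below are hypothetical placeholders.
TARGET="https://staging.example.internal/items?id=1"
COOKIE="session=PASTE_TEST_SESSION_TOKEN"

# Write the command to a script so you can review it before running it.
echo "sqlmap -u '${TARGET}' --cookie='${COOKIE}' --batch --level=2 --risk=1" > run-sqlmap.sh
cat run-sqlmap.sh
# bash run-sqlmap.sh   # run only with written authorization for the target
```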

    Human testing still matters

    As the human behind Claude, it’s your job to determine what the most vulnerable surface area in your target app is. After doing an initial pass telling Claude to do more general testing, I like to then zero in on specific attack paths and have Claude exhaust them.

    Using agentic tools is sort of like having an extra employee with a different skillset than your own, but it doesn’t actually replace the human side of manual penetration testing. It’s still important to open up the usual exploitation tools and perform the same kind of testing you’ve always done. You’ll continue to find things that AI agents won’t by virtue of the fact that you have a human perspective on the business logic of what your target actually does and why it matters to the humans who use it.

    The other incredibly important responsibility you have when making use of AI agent tools for security testing is triage. The unfortunate reality with agentic AI tools is that they can and will regularly just lie to you. Claude will happily present you with lists of findings where a nontrivial portion of the items either aren’t real or aren’t impactful in the way the tool describes. However, agentic AI tools will inevitably find problems that you may have overlooked, and in the business of red teaming, that’s gold.

    As a security engineer, there’s nothing more important than your reputation and your relationship with your corresponding development teams. A development team that doesn’t trust you or that sees you as an enemy who just wants to throw work at it won’t want to work with you. If you’re the girl who cried wolf too many times, you’ll foster a team that skims and closes issues that you raise without giving them consideration. The single most important responsibility you have when using agentic AI tooling as part of your penetration testing workflow is to independently validate any issue presented to you prior to sending it off as a ticket for someone to fix.

    Between code review, automated testing tools, manual exploitation tools, and an AI agent at your side, you can assess sizable codebases extremely effectively.

    Takeaways for blue teams

    I wanted to end this discussion with what I think are some important notes if you’re an engineer working on software development or trying to navigate application security in your own environment:

    • Security through obscurity is completely dead; there are no excuses anymore in an age where an AI agent can explain and comment any given piece of code you throw at it.
      • Minifying code still makes sense for performance reasons, and it might deter unsophisticated actors who haven’t yet adopted an agentic workflow, but I promise you that code obfuscation no longer costs state-level actors any meaningful hours to decipher.
    • Even if you already “know how to code” and have a robust test suite, you should still use AI tooling to double-check your work for logical issues and best practices. Some issues are easier for AI to spot than for humans, and attackers will be out there on the lookout.
    • While security tools and agentic AI raise the bar of what’s possible, you can’t outsource the human element of information security.
      • At the end of the day, if you want to build secure products, you’ll have to have humans who care about what they’re doing to test, verify, and prioritize the remediation of real security issues.
    • There are more possibilities than ever for building AI testing into your CI/CD pipelines. Needless to say, while AI isn’t a replacement for human review, you can provide invaluable additional information to developers by leveraging AI automation as part of your development process.
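    As one sketch of what that automation could look like, a CI step might feed a pull request diff to Claude Code’s non-interactive print mode for a security-focused pass. This assumes the Claude Code CLI is installed on the runner, and the branch name is an example:

```shell
# CI step sketch: AI-assisted security review of a pull request diff.
# Assumes the Claude Code CLI is available on the CI runner.
git diff origin/main...HEAD > pr.diff
claude -p "Review this diff for security issues: injection, broken access \
control, and secrets committed to code. List findings with file and line \
references." < pr.diff > security-review.txt
cat security-review.txt   # surface the review in the CI log
```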

    At Nutrient, we’re constantly striving to raise the bar and provide industry-leading security for our products. Learn more about how Nutrient is building the next generation of secure document solutions.

    Serana Warren

    Information Security Officer

    Serana is a cybersecurity specialist with a passion for open source technologies, digital rights, and online privacy. Outside of work, she enjoys writing science fiction and playing tabletop roleplaying games.
