Really cool to see this! Building a security scanner specifically for LLM apps feels like an important step given how quickly production AI workflows are proliferating.
What stood out to me in the blog is how the scanner isn’t just a general linting tool — it actually traces inputs and outputs through the code to understand how untrusted user data might flow into prompts, models, and then back into privileged operations. That focus on data flow and behavior rather than just surface diffs seems like a solid way to reduce both blind spots and noise in alerts.
I also appreciate the emphasis on concrete vulnerabilities and real CVEs (e.g., LLM output being executed as arbitrary shell commands, or LLM output being translated directly into database queries), showing that these aren't just hypothetical risk categories but things happening in the wild.
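For anyone who hasn't read the post yet, the rough shape of that failure mode (my own hypothetical sketch, not code from the actual CVEs; llm_complete is just a stand-in for whatever client the app uses) looks something like this:

    import sqlite3
    import subprocess

    def handle_shell_request(llm_complete, user_message: str) -> None:
        # Untrusted user input flows straight into the prompt...
        command = llm_complete(f"Turn this request into a shell command: {user_message}")
        # ...and the model's output is executed with no validation or sandboxing,
        # so a prompt-injected request can make `command` be anything at all.
        subprocess.run(command, shell=True)

    def handle_search_request(llm_complete, conn: sqlite3.Connection, user_message: str) -> None:
        # Same pattern, different sink: model output becomes a raw SQL string.
        sql = llm_complete(f"Write a SQL query for: {user_message}")
        conn.execute(sql)  # no parameterization, no allowlist of statements

The data-flow framing described in the post maps nicely onto this: user input is the source, the prompt is the propagation step, and subprocess.run / conn.execute are the privileged sinks.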
A few thoughts / questions from my side:
- Balancing precision vs. noise: The blog mentions tailoring what counts as a real finding so you don't overwhelm engineers with false positives. It'd be interesting to hear more about how that balance was tuned in practice, especially on larger codebases.
- Integration with existing pipelines: I saw the GitHub Action auto-reviews PRs, but how do teams handle this alongside other scanners (SAST, dependency scanners, etc.) without ballooning CI times?
- Vulnerability taxonomy: Prompt injection, jailbreak risk, and sensitive information leaks are all big categories, but there are other vectors (RAG-specific issues, tool misuse in agents). Curious how far the scanner's heuristics go vs. where red-teaming still wins.
Overall, a much-needed tool as LLMs go from experiment to core business logic. Would love to hear from others about how they’ve integrated this kind of scanning or what other categories of LLM security risk they’re watching for.
Thanks for the comment.
- On precision vs. noise: yeah, this is a core challenge. The quick answer is that the scanner tries to be conservative and leans toward not flagging borderline issues. There's also a custom guidance field in the config that lets users adjust sensitivity and severity based on their domain and preferences.
- CI times: on a medium-sized PR (say 10k lines) in a fairly large codebase (say a few hundred thousand lines), it generally runs in 5-15 minutes and runs in parallel with other CI actions. In our case, we have other actions that already take this long, so it doesn't increase total CI time at all.
- Vulnerability types: the post goes into this a bit, but I'd look at scanning and red teaming as working together for defense in depth. RAG and tool-misuse vulnerabilities are definitely things the scanner can catch (rough sketch of the tool-misuse case below). Red teaming is better for vulnerabilities that might not be visible in the code and/or require complex setup, state, or back-and-forth interaction to attack successfully.
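To make the tool-misuse case a bit more concrete, here's a rough hypothetical sketch (not from the post or from any real finding, and the function names are made up) of the kind of agent tool definition that is visible in code, which is exactly what static scanning is good at:

    import shutil
    from pathlib import Path

    # A tool exposed to an agent; the model chooses the arguments.
    def delete_path(path: str) -> str:
        # No allowlist, no confirmation step, and no check that the path stays
        # inside a workspace, so a prompt-injected document pulled in via RAG
        # can steer the model into deleting something it never should touch.
        target = Path(path).expanduser()
        if target.is_dir():
            shutil.rmtree(target)
        else:
            target.unlink(missing_ok=True)
        return f"deleted {target}"

    # A safer shape: constrain the tool's reach before it does anything.
    WORKSPACE = Path("./workspace").resolve()

    def delete_workspace_file(relative_path: str) -> str:
        target = (WORKSPACE / relative_path).resolve()
        if target != WORKSPACE and WORKSPACE not in target.parents:
            raise ValueError("path escapes the workspace")
        target.unlink(missing_ok=True)
        return f"deleted {target}"

Whether a specific tool definition is actually dangerous depends on how the agent wires it up, which is also why red teaming stays useful for the cases the code alone doesn't show.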
Hey all,
I'm an engineer at Promptfoo (open source evals and red teaming for AI). We're launching a tool that scans GitHub PRs for common LLM-related vulnerabilities. This post goes into detail on how it was built and the kinds of vulnerabilities that LLM apps are most prone to.
It includes a few real CVEs in open source projects that we reproduced as PRs so we could test the scanner.
I'd love to hear your thoughts.