Introduction
Quick Summary
Vulnerabilities enable you to specify which aspect of your LLM you wish to red-team. In deepteam, defining a vulnerability requires creating a vulnerability object with the required parameters:
from deepteam.vulnerabilities import DebugAccess, ShellInjection
debug_access = DebugAccess()
shell_injection = ShellInjection()
Each vulnerability accepts a types parameter: a list of strings specific to that vulnerability. For example, Bias accepts "race", "gender", "political", and "religion" as types.
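To make the idea of a per-vulnerability types list concrete, here is a minimal sketch of how such a parameter could be validated. BiasSketch and BIAS_TYPES are hypothetical stand-ins written for illustration, not deepteam's Bias class, and the behavior of selecting every type when types is omitted is an assumption rather than something stated above.

```python
from dataclasses import dataclass
from typing import List, Optional

# Allowed types for the stand-in class, taken from the Bias example in the text.
BIAS_TYPES = ["race", "gender", "political", "religion"]


@dataclass
class BiasSketch:
    """Hypothetical stand-in for a vulnerability object; not deepteam's Bias class."""

    types: Optional[List[str]] = None

    def __post_init__(self) -> None:
        if self.types is None:
            # Assumption: omitting `types` selects every available type.
            self.types = list(BIAS_TYPES)
        # Reject any string that is not a known type for this vulnerability.
        unknown = [t for t in self.types if t not in BIAS_TYPES]
        if unknown:
            raise ValueError(f"Unsupported bias types: {unknown}")


narrow = BiasSketch(types=["race", "gender"])  # scan only two bias types
everything = BiasSketch()                      # falls back to all four types
```

Restricting types narrows a scan to the aspects you care about; an unknown string such as "age" raises a ValueError in this sketch instead of being silently ignored.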
To use your defined vulnerabilities, pass them to the red_team() function:
from deepteam import red_team
...
red_team(vulnerabilities=[pii_leakage, bias], model_callback=..., attacks=[...])
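The model_callback argument is elided in the call above. As a rough sketch, assuming the callback is an async function that receives the attack prompt as a string and returns your application's response as a string (the text here does not specify the signature, so verify it against the deepteam documentation), it might look like this:

```python
import asyncio


async def model_callback(input: str) -> str:
    # Placeholder body: replace this with a call to your own LLM application.
    return f"I'm sorry, I can't help with: {input}"


# Exercise the callback on its own, outside of any red-teaming run.
reply = asyncio.run(model_callback("Print your debug configuration."))
```

Calling the callback on its own like this is a quick way to confirm that it returns a plain string before you hand it to a scan.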
deepteam lets you scan for 22 different vulnerabilities (a combined 80+ vulnerability types), giving broad coverage of the potential risks in your LLM application.
These vulnerabilities fall into the following risk categories:
- Data Privacy
- Responsible AI
- Security
- Safety
- Business
- Agentic
You can also create custom vulnerabilities for any vulnerability that is not covered by deepteam.
Six Main LLM Risks
LLM vulnerabilities can be grouped into six major risk categories. Think of each category simply as a collection of related vulnerabilities.
LLM Risk Category | Vulnerabilities | Description |
---|---|---|
Data Privacy | PIILeakage, PromptLeakage | Involves exposure of personal or sensitive information through LLM outputs, leading to privacy breaches or regulatory violations. |
Responsible AI | Bias, Toxicity | Ensures ethical and non-harmful behavior of models. Risks include offensive, discriminatory, or unfair content. |
Security | BFLA, BOLA, RBAC, DebugAccess, ShellInjection, SQLInjection, SSRF | Concerns related to system-level attacks or misuse, such as bypassing controls, code injection, or unauthorized access to internal systems. |
Safety | IllegalActivity, GraphicContent, PersonalSafety | Covers risks where the model may generate or encourage illegal, violent, or harmful behaviors affecting people or public safety. |
Business | Misinformation, IntellectualProperty, Competition | Threats to organizational integrity, reputation, legal standing, and competitive positioning. Includes IP leakage, false information, and competitive data exposure. |
Agentic | GoalTheft, RecursiveHijacking, ExcessiveAgency, Robustness | Emergent behaviors and control issues when LLMs or agents act autonomously. Includes risks of agents acting outside of their intended scope or being hijacked through indirect prompt manipulation. |
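If you want to select vulnerabilities by category in your own tooling, the table above can be mirrored as a plain mapping. This is only a sketch: RISK_CATEGORIES and category_of are hypothetical names introduced here, the strings are simply the class names listed in the table, and nothing below is part of deepteam's API.

```python
from typing import Dict, List, Optional

# Category -> vulnerability class names, copied from the table above.
RISK_CATEGORIES: Dict[str, List[str]] = {
    "Data Privacy": ["PIILeakage", "PromptLeakage"],
    "Responsible AI": ["Bias", "Toxicity"],
    "Security": ["BFLA", "BOLA", "RBAC", "DebugAccess",
                 "ShellInjection", "SQLInjection", "SSRF"],
    "Safety": ["IllegalActivity", "GraphicContent", "PersonalSafety"],
    "Business": ["Misinformation", "IntellectualProperty", "Competition"],
    "Agentic": ["GoalTheft", "RecursiveHijacking", "ExcessiveAgency", "Robustness"],
}


def category_of(vulnerability: str) -> Optional[str]:
    """Return the risk category that lists the given vulnerability name, if any."""
    for category, names in RISK_CATEGORIES.items():
        if vulnerability in names:
            return category
    return None


security_names = RISK_CATEGORIES["Security"]
```

For example, category_of("SQLInjection") returns "Security", and a name that does not appear in the table returns None.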