
Introduction

Quick Summary

Vulnerabilities let you specify which aspects of your LLM you want to red-team. In deepteam, you define a vulnerability by instantiating the corresponding vulnerability class with any parameters it requires.

from deepteam.vulnerabilities import DebugAccess, ShellInjection

debug_access = DebugAccess()
shell_injection = ShellInjection()

Each vulnerability takes a types parameter: a list of strings specific to that vulnerability. For example, Bias accepts "race", "gender", "political", and "religion" as types.
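As a rough illustration of what "a list of strings specific to that vulnerability" means, the sketch below checks a requested list against the four Bias types named above. The helper `select_types` and the constant `BIAS_TYPES` are hypothetical names introduced here for illustration; they are not part of deepteam, and the library may validate types differently.

```python
# Hypothetical helper, NOT part of deepteam: it only illustrates that the
# `types` list must be drawn from the values a given vulnerability supports.
BIAS_TYPES = ("race", "gender", "political", "religion")  # values listed in the text


def select_types(requested, allowed=BIAS_TYPES):
    """Return the requested types as a list, raising on unsupported names."""
    unknown = [t for t in requested if t not in allowed]
    if unknown:
        raise ValueError(
            f"unsupported types: {unknown}; expected a subset of {list(allowed)}"
        )
    return list(requested)


# Pick a subset of the Bias types to red-team.
bias_types = select_types(["race", "gender"])
```

Under these assumptions, the resulting list is what you would pass as the types argument, for example `Bias(types=bias_types)`.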

To use your defined vulnerabilities, pass them to the red_team() function:

from deepteam import red_team
...

red_team(vulnerabilities=[pii_leakage, bias], model_callback=..., attacks=[...])

deepteam lets you scan for 22 different vulnerabilities (a combined 80+ vulnerability types), giving broad coverage of potential risks in your LLM application.

These risks and vulnerabilities include:

You can also create custom vulnerabilities for any vulnerability that is not covered by deepteam.

Six Main LLM Risks

LLM vulnerabilities fall into 6 major risk categories. Think of each category simply as a collection of vulnerabilities.

| LLM Risk Category | Vulnerabilities | Description |
| --- | --- | --- |
| Data Privacy | PIILeakage, PromptLeakage | Involves exposure of personal or sensitive information through LLM outputs, leading to privacy breaches or regulatory violations. |
| Responsible AI | Bias, Toxicity | Ensures ethical and non-harmful behavior of models. Risks include offensive, discriminatory, or unfair content. |
| Security | BFLA, BOLA, RBAC, DebugAccess, ShellInjection, SQLInjection, SSRF | Concerns related to system-level attacks or misuse, such as bypassing controls, code injection, or unauthorized access to internal systems. |
| Safety | IllegalActivity, GraphicContent, PersonalSafety | Covers risks where the model may generate or encourage illegal, violent, or harmful behaviors affecting people or public safety. |
| Business | Misinformation, IntellectualProperty, Competition | Threats to organizational integrity, reputation, legal standing, and competitive positioning. Includes IP leakage, false information, and competitive data exposure. |
| Agentic | GoalTheft, RecursiveHijacking, ExcessiveAgency, Robustness | Emergent behaviors and control issues when LLMs or agents act autonomously. Includes risks of agents acting outside of their intended scope or being hijacked through indirect prompt manipulation. |
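Since each category is just a collection of vulnerabilities, the table can be sketched as a plain mapping. The snippet below is an illustrative, standalone sketch: `RISK_CATEGORIES`, `category_of`, and `group_by_category` are hypothetical names that do not come from deepteam, and the entries simply mirror the table above.

```python
# Illustrative mapping of the table above; these names are NOT deepteam APIs.
RISK_CATEGORIES = {
    "Data Privacy": ["PIILeakage", "PromptLeakage"],
    "Responsible AI": ["Bias", "Toxicity"],
    "Security": ["BFLA", "BOLA", "RBAC", "DebugAccess",
                 "ShellInjection", "SQLInjection", "SSRF"],
    "Safety": ["IllegalActivity", "GraphicContent", "PersonalSafety"],
    "Business": ["Misinformation", "IntellectualProperty", "Competition"],
    "Agentic": ["GoalTheft", "RecursiveHijacking", "ExcessiveAgency", "Robustness"],
}


def category_of(vulnerability_name):
    """Return the risk category listing this vulnerability, or None."""
    for category, names in RISK_CATEGORIES.items():
        if vulnerability_name in names:
            return category
    return None


def group_by_category(vulnerability_names):
    """Group vulnerability names under their risk category."""
    grouped = {}
    for name in vulnerability_names:
        key = category_of(name) or "Uncategorized"
        grouped.setdefault(key, []).append(name)
    return grouped


# Example: see which categories a planned scan would touch.
planned = group_by_category(["DebugAccess", "Bias", "ShellInjection"])
```

A grouping like this can help you check that a planned scan covers the categories you care about before you build the actual vulnerability objects.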