120+ vulnerabilities
Bias, Toxicity, PII leakage, SQL Injection, BFLA, BOLA, RBAC, Hallucination, and dozens more — every major LLM risk category covered out of the box.
Framework-aligned risk assessments straight from Python — OWASP Top 10, NIST AI RMF, MITRE ATLAS, and more. Surface jailbreaks, vulnerabilities, and agentic risks before they ship.
from deepteam import red_teamfrom deepteam.test_case import RTTurnfrom deepteam.frameworks import OWASPTop10def model_callback(prompt: str, turns: List[RTTurn]= None) -> RTTurn:return your_llm_app(prompt, turns)red_team(model_callback=model_callback, framework=OWASPTop10())
DeepTeam uses LLMs to simulate adversarial attacks across every vulnerability and attack vectors so you find the failures before real attackers do.
Bias, Toxicity, PII leakage, SQL Injection, BFLA, BOLA, RBAC, Hallucination, and dozens more — every major LLM risk category covered out of the box.
Crescendo, Linear, Tree, Sequential, and Bad-Likert-Judge — research-backed adversarial chains that simulate sophisticated multi-turn attackers.
OWASP Top 10 for LLMs, MITRE ATLAS, NIST AI RMF, EU AI Act — plus safety benchmark datasets like BeaverTails and Aegis. Every major taxonomy plugged into a single framework.
OWASP Top 10 for LLMs and MITRE ATLAS — industry-standard adversarial taxonomies in one framework.
Map your model's risk posture against NIST AI RMF and EU AI Act controls — audit-ready, no separate tooling required.
Test against curated red-teaming corpora — BeaverTails and Aegis — straight from the runner, no glue code.
DeepTeam plugs into the LLM providers you already use, the safety frameworks your auditors require, and the CI/CD runners your team already trusts — continuous red teaming with zero glue code.
Nothing would be possible without our community of amazing contributors, thank you!
Get Started