Hands-on workshop

How to Break an AI

Adversarial Attacks, Jailbreaks & Defenses That Actually Work

AI systems don't break like software; they fail in silence, misclassify with confidence, and hallucinate under pressure. This 4-hour hands-on workshop exposes the core vulnerabilities of modern AI, from adversarial image attacks to LLM manipulation. You will actively craft exploits using the open-source Adversarial Lab toolkit. The focus is practical: how do these attacks work, how can you launch them, and what can actually stop them?

Workshop Materials

LLM Attacks Notebook

An interactive Colab notebook to explore and execute jailbreaks and adversarial attacks on Large Language Models.

Open Colab

Image Attacks Notebook

Dive into the world of adversarial images. This Colab notebook guides you through crafting images that fool computer vision models.

Open Colab

Workshop GitHub Repo

All the code, resources, and tools for the workshop. Clone, fork, and experiment with the Adversarial Lab toolkit.

View on GitHub

Resources for Hands-On Experiments

Assignment 1: Image Puzzle

A practical challenge to test your understanding of adversarial image manipulation.

Start Assignment 1

Assignment 2: Secure Login

Attempt to bypass a secure login system using AI-specific attack vectors.

Start Assignment 2

Reference Document

A supplementary graph for analysis during the workshop exercises.

View PDF

Feedback

Your feedback helps improve future workshops!

About the Instructor

Pavan Reddy

Pavan Reddy is an AI security researcher and builder, and the founder of QBTrain — a hands-on platform for learning AI security and AppSec. He started inside AI (adversarial ML, model internals) and now focuses on breaking and securing real LLM and agentic systems: prompt injection, data exfiltration, and the systemic weaknesses of foundation models. He has published at AAAI, ACM, NeurIPS, and FLAIRS, and teaches a small set of signature workshops across BSides, OWASP, and academic venues. As Principal Developer at Automata, he owns a security product end to end.