Hands-on workshop

Breaking AI

Prompt Injection, Data Exfiltration & Practical Defenses That Work

AI systems don't fail like traditional software — they fail silently, follow the wrong authority, and can be steered into leaking data or taking unintended actions. This 4-hour hands-on workshop teaches you how modern AI vulnerabilities actually show up in deployed LLM features by attacking and defending a sandboxed car dealership chatbot that's connected to an internal database. Then you will pivot to real-world data exfiltration patterns via direct and indirect prompt injection (including untrusted content in RAG-like workflows).

Quick Links

Open QBTrain Colab Notebook

Prompts

Workshop Materials

PDF 1View PDF

PDF 2View PDF

Assignment 1Start Assignment

Assignment 2Start Assignment

Slides & Notes

Prompt Injection FundamentalsView Slides

Information DisclosureView Slides

Model TheftView Slides

Feedback

Your feedback helps improve future workshops!

About the Instructor

Pavan Reddy

Pavan Reddy is an AI security researcher and builder, and the founder of QBTrain — a hands-on platform for learning AI security and AppSec. He started inside AI (adversarial ML, model internals) and now focuses on breaking and securing real LLM and agentic systems: prompt injection, data exfiltration, and the systemic weaknesses of foundation models. He has published at AAAI, ACM, NeurIPS, and FLAIRS, and teaches a small set of signature workshops across BSides, OWASP, and academic venues. As Principal Developer at Automata, he owns a security product end to end.