Building security around ML: Dr. Andrew Davis

434 views · Feb 08, 2025 · 25:01 min
Takeaway

ML security has moved from applying ML to security problems such as malware detection to defending the ML pipeline itself; poisoning, theft, and adversarial examples remain open problems.

Summary

  • Andrew Davis, HiddenLayer's Chief Data Scientist, surveys the ML threat surface: data poisoning, model theft (which enables transferable adversarial attacks), adversarial examples, supply-chain attacks, and CVEs in ML software (e.g. Ollama).
  • Adversarial examples remain unsolved and now extend to multimodal LLMs that accept images: the same attack class applied to a new modality.
  • Model theft lets an adversary with only black-box access to production probe the stolen copy offline and craft attacks that transfer to the production model.
  • HiddenLayer monitors requester-level access patterns to production models to detect adversarial probing and model-extraction attempts.
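
The summary does not say how requester-level monitoring is implemented, so the following is only a minimal sketch of one plausible approach, not HiddenLayer's method. It assumes that probing and extraction traffic tends to contain many queries that are small perturbations of earlier queries from the same requester. The class name `ProbeMonitor`, its parameters, and all thresholds are hypothetical.

```python
import math
from collections import defaultdict, deque


class ProbeMonitor:
    """Hypothetical per-requester monitor (illustrative, not a vendor API)."""

    def __init__(self, window=50, radius=0.1, max_near_fraction=0.5, min_queries=10):
        self.radius = radius                        # L2 distance below which two queries count as "near"
        self.max_near_fraction = max_near_fraction  # flag when more than this share of recent queries are near-duplicates
        self.min_queries = min_queries              # do not judge a requester on too little data
        # Per requester: the last `window` inputs and whether each was a near-duplicate.
        self.history = defaultdict(lambda: deque(maxlen=window))
        self.near_flags = defaultdict(lambda: deque(maxlen=window))

    def observe(self, requester_id, features):
        """Record one query; return True if the requester now looks suspicious."""
        past = self.history[requester_id]
        is_near = any(math.dist(features, p) < self.radius for p in past)
        flags = self.near_flags[requester_id]
        flags.append(is_near)
        past.append(tuple(features))
        if len(flags) < self.min_queries:
            return False
        return sum(flags) / len(flags) > self.max_near_fraction


monitor = ProbeMonitor()

# A requester nudging one input in tiny steps, as a query-based attack might.
probing = [monitor.observe("client-a", [0.5 + 0.001 * i, 0.5, 0.5]) for i in range(20)]

# A requester sending well-separated inputs.
benign = [monitor.observe("client-b", [float(i), float(-i), 0.0]) for i in range(20)]

print("client-a flagged:", probing[-1])
print("client-b flagged:", any(benign))
```

A real deployment would likely compare inputs in an embedding space rather than on raw features, and would combine this signal with others such as query volume; those choices are outside what the summary describes.
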
security · ml-security · adversarial
Original description
The field of Adversarial ML has been active since at least 2013 and despite over a decade of attempts to make models more robust to imperceptible changes in the input, attack methods still outpace the ability to defend neural networks and other machine learning models. In this talk, we'll get into why adversarial examples are becoming increasingly relevant with the advent of agentic multimodal LLMs and what we can do to defend these models.

Recorded live in San Francisco at the AI Engineer World's Fair. See the full schedule of talks at https://www.ai.engineer/worldsfair/2024/schedule & join us at the AI Engineer World's Fair in 2025! Get your tickets today at https://ai.engineer/2025

About Dr. Andrew
Dr. Andrew Davis is Chief Data Scientist at HiddenLayer, where he leads research defending and detecting attacks on ML systems. Coming from a cybersecurity background, Andrew has been interested in the problem of solving adversarial examples since seeing the "Intriguing Properties of Neural Networks" poster at the ICLR 2014 workshop.