Safe and Secure Generative AI

Publications

Prompt injection attacks and defenses

Detecting and attributing AI-generated content

Preventing harmful content generation and jailbreaking

Hallucinations

Robustness to common perturbations

Poisoning and backdoor attacks on embedding foundation models (e.g., CLIP) and their defenses

These embedding foundation models underpin many generative AI systems. For instance, CLIP's text encoder is used in text-to-image models, and its vision encoder is used in multi-modal LLMs.
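A minimal sketch, assuming the Hugging Face `transformers` implementation of CLIP, of the two encoders that downstream generative systems reuse; a poisoned or backdoored CLIP checkpoint would propagate to any system built on these embeddings.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Text encoder: the component reused for conditioning in text-to-image models.
text_inputs = processor(text=["a photo of a cat"], return_tensors="pt", padding=True)
with torch.no_grad():
    text_emb = model.get_text_features(**text_inputs)      # shape: (1, 512)

# Vision encoder: the component reused as the visual front end of multi-modal LLMs.
image = Image.new("RGB", (224, 224))                       # placeholder image
image_inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    image_emb = model.get_image_features(**image_inputs)   # shape: (1, 512)
```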

Intellectual property and data-use auditing in model training

We study intellectual property protection for both model providers and users. For model providers, we study model-stealing attacks on foundation models and their defenses. For users, we study auditing/tracing of the data used to pre-train foundation models.
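The auditing methods themselves are detailed in the publications; as a generic illustration only (not a method from this group), the sketch below shows the classic loss-based membership-inference signal a data owner might start from, assuming a Hugging Face causal LM. The model name and texts are placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model; a real audit would target the suspected foundation model.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def sample_loss(text: str) -> float:
    """Mean per-token cross-entropy of the model on `text`. Unusually low loss,
    relative to comparable reference text, is one (noisy) signal that the text
    may have appeared in the model's training data."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return out.loss.item()

suspected = "Text the data owner believes was used in pre-training."  # hypothetical
reference = "Freshly written text that cannot have been in the training set."
print(f"suspected: {sample_loss(suspected):.3f}  reference: {sample_loss(reference):.3f}")
```

In practice, single-sample loss thresholding is unreliable; making such audits rigorous, e.g., with calibrated statistics or deliberately inserted markers, is precisely the gap this research area addresses.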

Talks

Code and Data

Slides