As AI and LLMs increasingly power sensitive domains such as healthcare and finance, safeguarding user data has become a central challenge. This tutorial surveys the evolving landscape of privacy risks and protection strategies in the age of AI and LLMs, with particular attention to the Big Data characteristics that make these systems both powerful and vulnerable: longitudinality (data collected and linked over time) and multimodality (structured records, free text, and images).
The tutorial introduces key categories of privacy attacks—including membership inference, attribute inference, and data extraction—followed by case studies on healthcare data. It then discusses defenses across the AI lifecycle, from synthetic data generation and differentially private or privacy-enhanced training to post-training methods such as machine unlearning. Finally, we discuss open challenges arising from longitudinal and multimodal data, as well as broader privacy risks that extend beyond training data in the expanding LLM ecosystem.
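To make the first attack category concrete, the sketch below shows a minimal loss-thresholding membership inference attack: an overfit model assigns lower loss to records it was trained on, so an adversary can guess membership by comparing per-record loss against a threshold. Everything here is an illustrative assumption, not material from the tutorial: the toy Gaussian data, the kernel-smoothed "model" (chosen because a tiny bandwidth makes it memorize its training set), and the fixed threshold all stand in for a real model and dataset.

```python
import math
import random

random.seed(0)

def sample(n):
    """Draw toy (x, y) records from two overlapping Gaussians (a stand-in cohort)."""
    data = []
    for _ in range(n):
        y = random.randint(0, 1)
        x = random.gauss(2.0 * y - 1.0, 1.5)
        data.append((x, y))
    return data

train = sample(30)    # "members": records the model was fit on
holdout = sample(30)  # "non-members": records it never saw

BANDWIDTH = 0.05  # deliberately tiny, so the model memorizes its training points

def predict(x):
    """Kernel-smoothed label estimate; at small bandwidth it memorizes train."""
    num = den = 0.0
    for xi, yi in train:
        w = math.exp(-((x - xi) / BANDWIDTH) ** 2 / 2.0)
        num += w * yi
        den += w
    if den == 0.0:  # all kernel weights underflowed: fall back to a coin flip
        return 0.5
    return num / den

def loss(x, y):
    """Per-record cross-entropy loss, clamped for numerical safety."""
    p = min(max(predict(x), 1e-6), 1.0 - 1e-6)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

member_losses = [loss(x, y) for x, y in train]
nonmember_losses = [loss(x, y) for x, y in holdout]

# Attack rule: records with loss below a threshold are guessed to be members.
THRESHOLD = 0.1
correct = sum(l < THRESHOLD for l in member_losses) + \
          sum(l >= THRESHOLD for l in nonmember_losses)
attack_accuracy = correct / (len(train) + len(holdout))

avg_member_loss = sum(member_losses) / len(member_losses)
avg_nonmember_loss = sum(nonmember_losses) / len(nonmember_losses)
print(f"attack accuracy: {attack_accuracy:.2f}")
```

The gap between average member and non-member loss is exactly the signal that defenses such as differentially private training aim to suppress: by bounding each record's influence on the model, they push the attack's accuracy back toward chance.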