Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational AgentsIvoline NgongSwanand Ravindra Kadheet al.2025ACL 2025
MAD-MAX: Modular And Diverse Malicious AttackMiXtures for Automated LLM Red TeamingStefan SchoepfMuhammad Zaid Hameedet al.2025ICML 2025
In-Context Bias Propagation in LLM-Based Tabular Data GenerationPol Garcia RecasensAlberto Gutierrez-torreet al.2025ICML 2025
VP-NTK: Exploring the Benefits of Visual Prompting in Differentially Private Data SynthesisChia-yi HsuJia You Chenet al.2025ICASSP 2025
Retention Score: Quantifying Jailbreak Risks for Vision Language ModelsZhaitang LiPin-Yu Chenet al.2025AAAI 2025
Token Highlighter: Inspecting and Mitigating Jailbreak Prompts for Large Language ModelsXiaomeng XuPin-Yu Chenet al.2025AAAI 2025
Is My Data in Your Retrieval Database? Membership Inference Attacks Against Retrieval Augmented GenerationMaya AndersonGuy Amitet al.2025ICISSP 2025