The Inherent Adversarial Robustness of Analog In-Memory Computing. Corey Liam Lammie, Julian Büchel, et al. Nature Communications, 2025.
Privacy without Noisy Gradients: Slicing Mechanism for Generative Model Training. Kristjan Greenewald, Yuancheng Yu, et al. NeurIPS 2024.
Unified Lookup Tables: Privacy-Preserving Foundation Models. Nikita Janakarajan, Irina Espejo Morales, et al. NeurIPS 2024.
Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI. Ambrish Rawat, Stefan Schoepf, et al. NeurIPS 2024.
Membership Inference Attacks Against Time-Series Models. Noam Koren, Abigail Goldsteen, et al. ACML 2024.
MoJE: Mixture of Jailbreak Experts, Naive Tabular Classifiers as Guard for Prompt Attacks. Giandomenico Cornacchia, Kieran Fraser, et al. AIES 2024.
On Robustness-Accuracy Characterization of Language Models using Synthetic Datasets. Ching-yun Ko, Pin-Yu Chen, et al. COLM 2024.
Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic Prompts. Zhi-yi Chin, Chieh-ming Jiang, et al. ICML 2024.
Be Your Own Neighborhood: Detecting Adversarial Examples by the Neighborhood Relations Built on Self-Supervised Learning. Zhiyuan He, Yijun Yang, et al. ICML 2024.
What Would Gauss Say About Representations? Probing Pretrained Image Models using Synthetic Gaussian Benchmarks. Irene Ko, Pin-Yu Chen, et al. ICML 2024.