Workshop on Data Integrity and Secure Cloud Computing (DISCC). Pradip Bose, Augusto Vega, et al. 2025. HPCA 2025.
Retention Score: Quantifying Jailbreak Risks for Vision Language Models. Zhaitang Li, Pin-Yu Chen, et al. 2025. AAAI 2025.
Token Highlighter: Inspecting and Mitigating Jailbreak Prompts for Large Language Models. Xiaomeng Xu, Pin-Yu Chen, et al. 2025. AAAI 2025.
Neural Reasoning Networks: Efficient interpretable neural networks with automatic textual explanations. Steve Carrow, Kyle Harper Erwin, et al. 2025. AAAI 2025.
Agent Trajectory Explorer: Visualizing and Providing Feedback on Agent Trajectories. Michael Desmond, Ja Young Lee, et al. 2025. AAAI 2025.
Foundation Models at Work: Fine-Tuning for Fairness in Algorithmic Hiring. Buse Korkmaz, Rahul Nair, et al. 2025. AAAI 2025.