Navigating the Safety Landscape: Measuring Risks in Finetuning Large Language Models. Shengyun Peng, Pin-Yu Chen, et al. NeurIPS 2024.
Multivariate Stochastic Dominance via Optimal Transport and Applications to Models Benchmarking. Gabriel Rioux, Apoorva Nitsure, et al. NeurIPS 2024.
Distributional Preference Alignment of LLMs via Optimal Transport. Igor Melnyk, Youssef Mroueh, et al. NeurIPS 2024.
Graph-based Uncertainty Metrics for Long-form Language Model Generations. Mingjian Jiang, Yangjun Ruan, et al. NeurIPS 2024.
Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents. Ivoline Ngong, Swanand Ravindra Kadhe, et al. NeurIPS 2024.
Better Bias Benchmarking of Language Models via Multi-factor Analysis. Hannah Powers, Ioana Baldini Soares, et al. NeurIPS 2024.
Learning to Optimize Molecules with a Chemical Language Model. Jerret Ross, Samuel Hoffman, et al. NeurIPS 2024.
Final-Model-Only Data Attribution with a Unifying View of Gradient-Based Methods. Dennis Wei, Inkit Padhi, et al. NeurIPS 2024.
Consistency-based Black-box Uncertainty Quantification for Text-to-SQL. Debarun Bhattacharjya, Balaji Ganesan, et al. NeurIPS 2024.
On the role of noise in factorizers for disentangling distributed representations. Kumudu Geethan Karunaratne, Michael Hersche, et al. NeurIPS 2024.