Publications

88 results at NeurIPS 2024

Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes
- - Xiaomeng Hu
  - Pin-Yu Chen
  - et al.
- 2024
- NeurIPS 2024
Safe LoRA: the Silver Lining of Reducing Safety Risks when Fine-tuning Large Language Models
- - Chia-yi Hsu
  - Yu-Lin Tsai
  - et al.
- 2024
- NeurIPS 2024
Navigating the Safety Landscape: Measuring Risks in Finetuning Large Language Models
- - Shengyun Peng
  - Pin-Yu Chen
  - et al.
- 2024
- NeurIPS 2024
Neural Network Reparametrization for Accelerated Optimization in Molecular Simulations
- - Nima Dehmamy
  - Csaba Both
  - et al.
- 2024
- NeurIPS 2024
Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning
- - Runqian Wang
  - Soumya Ghosh
  - et al.
- 2024
- NeurIPS 2024
Symmetry-Informed Governing Equation Discovery
- - Jianke Yang
  - Wang Rao
  - et al.
- 2024
- NeurIPS 2024
Abductive Reasoning in Logical Credal Networks
- - Radu Marinescu
  - Junkyu Lee
  - et al.
- 2024
- NeurIPS 2024
WikiContradict: A Benchmark for Evaluating LLMs on Real-World Knowledge Conflicts from Wikipedia
- - Yufang Hou
  - Alessandra Pascale
  - et al.
- 2024
- NeurIPS 2024
Dense Associative Memory Through the Lens of Random Features
- - Benjamin Hoover
  - Duen Horng Chau
  - et al.
- 2024
- NeurIPS 2024
Abstracted Shapes as Tokens - A Generalizable and Interpretable Model for Time-series Classification
- - Yunshi Wen
  - Tengfei Ma
  - et al.
- 2024
- NeurIPS 2024