Publications

822 results for Trustworthy AI

CharED: Character-wise Ensemble Decoding for Large Language Models
- Kevin Gu, Eva Tuecke, et al.
- 2024
- ICML 2024

Split, Unlearn, Merge: Leveraging Data Attributes for More Effective Unlearning in LLMs
- Swanand Ravindra Kadhe, Farhan Ahmed, et al.
- 2024
- ICML 2024

Towards Assurance of LLM Adversarial Robustness using Ontology-Driven Argumentation
- Tomas Bueno Momcilovic, Beat Buesser, et al.
- 2024
- xAI 2024

Identifying Homogeneous and Interpretable Groups for Conformal Prediction
- Natalia Martinez Gil, Dhaval Patel, et al.
- 2024
- UAI 2024

AUTOLYCUS: Exploiting Explainable Artificial Intelligence (XAI) for Model Extraction Attacks against Interpretable Models
- Abdullah Caglar Oksuz, Anisa Halimi, et al.
- 2024
- PETS 2024

Exploring Vulnerabilities in LLMs: A Red Teaming Approach to Evaluate Social Bias
- Yuya Jeremy Ong, Jay Pankaj Gala, et al.
- 2024
- IEEE CISOSE 2024

Quantifying Representation Reliability in Self-Supervised Learning Models
- Young Jin Park, Hao Wang, et al.
- 2024
- UAI 2024

Privacy-Preserving Verification of Preprocessing in Machine Learning Models
- Wenbiao Li, Anisa Halimi, et al.
- 2024
- PETS 2024

Effective In-Silico Gene Perturbation by Machine Learning Model Interpretation for Immunotherapies
- Tanwi Biswas, Akira Koseki, et al.
- 2024
- ISMB 2024

Effect of dataset partitioning strategies for evaluating out-of-distribution generalisation for predictive models in biochemistry
- Raúl Fernández Díaz, Lam Thanh Hoang, et al.
- 2024
- ISMB 2024