Publications

62 results for AI Testing

Workshop on Data Integrity and Secure Cloud Computing (DISCC)
- - Pradip Bose
  - Augusto Vega
  - et al.
- 2025
- HPCA 2025
Rest API Functional Tester
- - Diptikalyan Saha
  - Devika Sondhi
  - et al.
- 2025
- ISEC 2025
Multivariate Stochastic Dominance via Optimal Transport and Applications to Models Benchmarking
- - Gabriel Rioux
  - Apoorva Nitsure
  - et al.
- 2024
- NeurIPS 2024
A Novel Metric for Measuring the Robustness of Large Language Models in Non-adversarial Scenarios
- - Samuel Ackerman
  - Ella Rabinovich
  - et al.
- 2024
- EMNLP 2024
Towards a Benchmark for Causal Business Process Reasoning with LLMs
- - Fabiana Fournier
  - Lior Limonad
  - et al.
- 2024
- BPM 2024
Why Don't Prompt-Based Fairness Metrics Correlate?
- - Abdelrahman Zayed
  - Gonçalo Mordido
  - et al.
- 2024
- ACL 2024
Data Contamination Report from the 2024 CONDA Shared Task
- - Oscar Sainz
  - Iker García-Ferrero
  - et al.
- 2024
- ACL 2024
Risk Aware Benchmarking of Large Language Models
- - Apoorva Nitsure
  - Youssef Mroueh
  - et al.
- 2024
- ICML 2024
Towards Assurance of LLM Adversarial Robustness using Ontology-Driven Argumentation
- - Tomas Bueno Momcilovic
  - Beat Buesser
  - et al.
- 2024
- xAI 2024
Exploring Vulnerabilities in LLMs: A Red Teaming Approach to Evaluate Social Bias
- - Yuya Jeremy Ong
  - Jay Pankaj Gala
  - et al.
- 2024
- IEEE CISOSE 2024