Naorin Hossain, William Santiago Fernandez, et al.
ICMC 2024
Confidential collaborative machine learning (ML) enables multiple mutually distrusting data holders to jointly train an ML model while keeping their private datasets confidential for regulatory or competitive reasons. However, existing approaches require frequent exchanges of data and model updates during training over comparatively slow conventional links. They face growing challenges as the sizes of models and datasets in modern training workloads, such as large language models (LLMs), increase rapidly, resulting in prohibitively high communication costs. In this paper, we propose a novel mechanism called GPU Travelling that leverages recently emerged confidential GPUs. With our rigorous design, the GPU can securely travel to a specific data holder, load the dataset directly into the GPU's protected memory, and then return for training, eliminating the need for data transmission while ensuring confidentiality up to the data-centre level. We developed a prototype using Intel TDX and NVIDIA H100, evaluated it on llm.c, a CUDA-based LLM training project, and demonstrated its feasibility and performance while maintaining strong security guarantees. The results show a speedup of at least 4x when loading a 512 MiB dataset chunk compared with conventional transmission.
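A minimal sketch of the data-loading step the abstract describes, assuming a standard CUDA toolchain: on an H100 running in confidential-computing mode inside a TDX trust domain, an ordinary host-to-device copy is transparently encrypted in transit, so a dataset chunk on the data holder's local storage can be staged into the GPU's protected memory without application-level changes. The file name is hypothetical and the 512 MiB chunk size simply mirrors the evaluation; this is not the paper's actual implementation.

#include <cuda_runtime.h>
#include <stdio.h>

/* 512 MiB: the chunk size used in the paper's evaluation. */
#define CHUNK_BYTES (512ULL * 1024 * 1024)

int main(void) {
    /* Dataset chunk on the data holder's local storage (hypothetical path). */
    FILE *f = fopen("dataset_chunk.bin", "rb");
    if (!f) { perror("fopen"); return 1; }

    /* Pinned host staging buffer; in H100 confidential-computing mode the
       driver stages such transfers through an encrypted bounce buffer. */
    void *host_buf = NULL;
    if (cudaMallocHost(&host_buf, CHUNK_BYTES) != cudaSuccess) {
        fprintf(stderr, "cudaMallocHost failed\n");
        return 1;
    }
    size_t n = fread(host_buf, 1, CHUNK_BYTES, f);
    fclose(f);

    /* Destination allocation inside the GPU's protected memory. */
    void *dev_buf = NULL;
    if (cudaMalloc(&dev_buf, n) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    /* With confidential computing enabled, this copy is encrypted in
       transit and decrypted only inside the GPU's protected region;
       the CUDA call itself is unchanged. */
    cudaMemcpy(dev_buf, host_buf, n, cudaMemcpyHostToDevice);

    printf("Loaded %zu bytes into GPU memory\n", n);

    cudaFree(dev_buf);
    cudaFreeHost(host_buf);
    return 0;
}

The design point worth noting is that confidentiality of the transfer comes from the platform (TDX plus the GPU's CC mode), not from the application code, which is why an unmodified CUDA training project such as llm.c can run on top of it.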
Liubov Nedoshivina, Anisa Halimi, et al.
AMIA Informatics Symposium 2024
Shengwei An, Sheng-Yen Chou, et al.
AAAI 2024
Jeffrey Burdges, Luca De Feo
Eurocrypt 2021