Isha Puri, Shivchander Sudalairaj, et al.
NeurIPS 2025
Molecular property prediction has greatly benefited from learned embeddings such as SMILES-based, SELFIES-based, and graph-derived representations. However, existing approaches often rely on a single modality or naïvely concatenate multiple modalities, which limits robustness and fails under missing-modality conditions. In this work, we propose a novel self-supervised fusion framework, dynamic fusion, that dynamically integrates multiple molecular embeddings. The proposed framework employs intra-modal gating for feature selection, inter-modal attention for adaptive weighting, and cross-modal reconstruction to ensure information exchange. Through progressive modality masking during training, the dynamic fusion approach learns to generate fused embeddings that are resilient to missing modalities. We conduct preliminary evaluations of the proposed approach on MoleculeNet benchmarks and demonstrate superior performance in reconstruction, modality alignment, and downstream property prediction compared to unimodal baselines. Our findings highlight the importance of feature-level gating, entropy-regularized attention, and cross-modal reconstruction in achieving robust fusion.
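The abstract only names the building blocks (intra-modal gating, inter-modal attention, cross-modal reconstruction, modality masking), so the following is a minimal PyTorch sketch of how such a module could be wired, not the authors' implementation. The layer shapes, the sigmoid gate parameterization, the scalar per-modality attention scores, the masking scheme, and the 0.01 entropy weight are all illustrative assumptions.

```python
# Hypothetical sketch of a dynamic-fusion module; every design choice below is
# an assumption made for illustration, not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicFusion(nn.Module):
    def __init__(self, modality_dims: dict[str, int], fused_dim: int = 256):
        super().__init__()
        self.names = list(modality_dims)
        # Project each modality (e.g. SMILES / SELFIES / graph embedding) to a shared width.
        self.proj = nn.ModuleDict({m: nn.Linear(d, fused_dim) for m, d in modality_dims.items()})
        # Intra-modal gating: per-feature sigmoid gate for feature selection.
        self.gate = nn.ModuleDict({m: nn.Linear(fused_dim, fused_dim) for m in self.names})
        # Inter-modal attention: one scalar score per modality, softmax-normalized.
        self.score = nn.ModuleDict({m: nn.Linear(fused_dim, 1) for m in self.names})
        # Cross-modal reconstruction: decode every modality back from the fused vector.
        self.decoder = nn.ModuleDict({m: nn.Linear(fused_dim, d) for m, d in modality_dims.items()})

    def forward(self, inputs: dict[str, torch.Tensor], mask_prob: float = 0.0):
        gated, scores = [], []
        for m in self.names:
            h = self.proj[m](inputs[m])
            h = torch.sigmoid(self.gate[m](h)) * h                      # intra-modal gating
            keep = 1.0 if (not self.training or torch.rand(()) > mask_prob) else 0.0
            gated.append(h * keep)                                      # modality masking
            scores.append(self.score[m](h) + (0.0 if keep else -1e9))   # masked modalities get no attention
        h = torch.stack(gated, dim=1)                                   # (B, M, D)
        attn = torch.softmax(torch.cat(scores, dim=-1), dim=-1)         # (B, M) inter-modal weights
        fused = (attn.unsqueeze(-1) * h).sum(dim=1)                     # adaptively weighted fusion
        # Entropy regularizer discourages attention from collapsing onto one modality.
        entropy = -(attn * attn.clamp_min(1e-9).log()).sum(-1).mean()
        # Cross-modal reconstruction loss: recover every input modality from the fused embedding.
        recon_loss = sum(F.mse_loss(self.decoder[m](fused), inputs[m]) for m in self.names)
        return fused, recon_loss - 0.01 * entropy
```

A usage sketch under the same assumptions: "progressive" masking would correspond to raising `mask_prob` over the course of training so the model gradually learns to cope with absent modalities.

```python
dims = {"smiles": 768, "selfies": 768, "graph": 300}   # hypothetical embedding sizes
model = DynamicFusion(dims)
batch = {m: torch.randn(8, d) for m, d in dims.items()}
model.train()
fused, aux_loss = model(batch, mask_prob=0.3)           # increase mask_prob over epochs
```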
Minghao Guo, Bohan Wang, et al.
NeurIPS 2024
Djallel Bouneffouf, Matthew Riemer, et al.
NeurIPS 2025
Jannis Born, Filip Skogh, et al.
NeurIPS 2025