Publication
ICML 2024
Workshop paper

Masking in Molecular Graphs Leveraging Reaction Context

Abstract

Token masking has proven useful for self-supervised learning in various modalities, including the sequential SMILES representation of molecules. Yet research on masking over molecular graph structures has not received enough attention, and existing methods often focus on single molecules. We propose ReaCTMask (Reaction ConText-based Masking), a novel approach that leverages reaction knowledge to provide critical context from outside the molecular structures themselves to guide the graph masking. We show that graph transformers can exploit this additional knowledge when a unified masking scheme is applied within and across the molecules in a reaction. Our experiments cover probing and transfer learning, compare against various baselines, and provide insights into the intricate nature of the task. Overall, the results demonstrate the effectiveness of our approach and, more generally, the usefulness of reaction context in graph pre-training.
