Property Value
?:abstract
  • Machine learning (ML) outperforms traditional approaches in many molecular design tasks. ML models usually predict molecular properties from a 2D chemical graph or a single 3D structure, but neither of these representations accounts for the ensemble of 3D conformers accessible to a molecule. Property prediction could be improved by using conformer ensembles as input, but there is no large-scale dataset that contains graphs annotated with high-quality conformers and experimental data. Here we use first-principles simulations to generate accurate conformers for over 430,000 molecules, including 300,000 with experimental data for the inhibition of various pathogens. The Geometric Ensemble Of Molecules (GEOM) dataset contains over 33 million molecular conformers labeled with their relative energies and statistical probabilities at room temperature. GEOM will assist in the development of models that predict properties from conformer ensembles, and generative models that sample 3D conformations.
?:arxiv_id
  • 2006.05531
?:creator
?:externalLink
?:license
  • arxiv
?:publication_isRelatedTo_Disease
?:source
  • ArXiv
?:title
  • GEOM: Energy-annotated molecular conformations for property prediction and molecular generation
?:type
?:year
  • 2020-06-09
