How to compute RDKit molecular features in Mathematica…

First, start an external Python evaluation session; this sets up a temporary virtual environment containing RDKit and its dependencies. Then import the relevant RDKit modules into that session:

session = ResourceFunction["StartPythonSession"][{"rdkit"}]
ExternalEvaluate[session, "from rdkit.Chem import MolFromSmiles, Descriptors"]

Second, define a function (features) which will run in that session:

features = ExternalFunction[session, 
   "def features(smiles):
      m = MolFromSmiles(smiles)
      vals = Descriptors.CalcMolDescriptors(m)
      return vals"]
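
RDKit's MolFromSmiles returns None when it cannot parse a SMILES string, in which case the descriptors computed from it would not be meaningful. A minimal sketch of a guarded variant, assuming the same session and imports as above; the names safeFeatures and safe_features are hypothetical and not part of the original recipe:

```wolfram
(* Sketch only: safeFeatures / safe_features are illustrative names. *)
(* The function returns None (expected to arrive in Mathematica as Null) *)
(* when the SMILES string cannot be parsed, instead of computing descriptors. *)
safeFeatures = ExternalFunction[session, 
   "def safe_features(smiles):
      m = MolFromSmiles(smiles)
      if m is None:
          return None
      return Descriptors.CalcMolDescriptors(m)"]
```

Checking the result for Null on the Mathematica side then separates unparsable inputs from genuine descriptor sets.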

You may now use the defined ExternalFunction like any other Mathematica function:

features["CCN(CC)C(=O)[C@H]1CN([C@@H]2CC3=CNC4=CC=CC(=C34)C2=C1)C"]
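
The Python dictionary returned by CalcMolDescriptors should arrive in Mathematica as an Association keyed by descriptor name, so individual values can be picked out with the usual Association functions. A short sketch, assuming the session is still open and that the keys "MolWt" and "TPSA" and the "fr_" fragment-count prefix are present in your RDKit version; the variable name desc is hypothetical:

```wolfram
(* Sketch only: desc is an illustrative variable name. *)
desc = features["CCN(CC)C(=O)[C@H]1CN([C@@H]2CC3=CNC4=CC=CC(=C34)C2=C1)C"];

(* Number of descriptors returned *)
Length[desc]

(* Look up selected descriptors by their RDKit names *)
Lookup[desc, {"MolWt", "TPSA"}]

(* Keep only the fragment-count descriptors, whose names start with "fr_" *)
KeySelect[desc, StringStartsQ["fr_"]]
```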

Be a good citizen and clean up your session when you are finished:

DeleteObject[session]

Parerga and Paralipomena

  • Under the hood, Mathematica itself uses RDKit to compute many of the MoleculeValue properties; however, these do not cover every RDKit descriptor, and the property names differ from RDKit's.
  • If your goal is to compute binary fingerprint vectors, it is more elegant to use the MoleculeFingerprints paclet instead.
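
To illustrate the naming mismatch mentioned in the first note, here is a hedged sketch using the built-in MoleculeValue. It assumes that "MolecularMass" and "TopologicalPolarSurfaceArea" are the built-in property names that correspond most closely to RDKit's "MolWt" and "TPSA"; the values may also differ in form, for example by carrying units:

```wolfram
(* Sketch only: the correspondence between these built-in properties *)
(* and the RDKit descriptors "MolWt" and "TPSA" is an assumption. *)
mol = Molecule["CCN(CC)C(=O)[C@H]1CN([C@@H]2CC3=CNC4=CC=CC(=C34)C2=C1)C"];
MoleculeValue[mol, {"MolecularMass", "TopologicalPolarSurfaceArea"}]
```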