How to compute RDKit molecular features in Mathematica
[chemistry
cheminformatics
mathematica
]
First, start an external Python evaluation session. StartPythonSession sets up a temporary virtual environment containing RDKit and its dependencies; then import the relevant RDKit modules into that session:
session = ResourceFunction["StartPythonSession"][{"rdkit"}]
ExternalEvaluate[session, "from rdkit.Chem import MolFromSmiles, Descriptors"]

Second, define a function (features) which will run in that session:
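features = ExternalFunction[session,
"def features(smiles):
    # Parse the SMILES string into an RDKit molecule
    m = MolFromSmiles(smiles)
    # Compute every descriptor RDKit knows about; returns a dict of name -> value
    vals = Descriptors.CalcMolDescriptors(m)
    return vals"]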

You may now use the defined ExternalFunction like any other Mathematica function:
features["CCN(CC)C(=O)[C@H]1CN([C@@H]2CC3=CNC4=CC=CC(=C34)C2=C1)C"]
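The Python dict returned by CalcMolDescriptors arrives in Mathematica as an Association keyed by descriptor name, so the usual Association tools apply. A minimal sketch, assuming the features function defined above and a session that is still open; the three descriptor names and the short SMILES strings are chosen purely for illustration:

```mathematica
(* Assumes `features` is the ExternalFunction defined above and `session` is still running *)
desc = features["CCN(CC)C(=O)[C@H]1CN([C@@H]2CC3=CNC4=CC=CC(=C34)C2=C1)C"];

(* Number of descriptors returned; the count depends on the installed RDKit version *)
Length[desc]

(* Pick out a few descriptors by their RDKit names *)
KeyTake[desc, {"MolWt", "MolLogP", "TPSA"}]

(* Tabulate the same descriptors for several illustrative SMILES strings *)
Dataset[
 AssociationMap[
  KeyTake[features[#], {"MolWt", "MolLogP", "TPSA"}] &,
  {"CCO", "c1ccccc1", "CC(=O)O"}]]
```

Each call to features makes a round trip to the Python session, so for large batches it may be worth passing a list of SMILES strings to a single Python function instead.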

Be a good citizen and clean up your session when you are finished:
DeleteObject[session]
Parerga and Paralipomena
- “Under the hood”, Mathematica itself uses RDKit to compute many of the MoleculeValue properties; however, these properties do not cover all of the RDKit descriptors, and their names do not match RDKit's.
- If your goal is to compute binary fingerprint vectors, it is more elegant to use the MoleculeFingerprints paclet instead.
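For comparison, some properties can be read directly with MoleculeValue, with no Python session at all. A minimal sketch; the two property names below are assumed to be the built-in counterparts of RDKit's MolWt and TPSA and should be checked against the MoleculeValue documentation for your version:

```mathematica
(* Built-in route: no external session needed *)
mol = Molecule["CCO"];  (* ethanol, given as a SMILES string; illustrative only *)

(* Property names are assumptions; consult the MoleculeValue documentation for the full list *)
MoleculeValue[mol, {"MolarMass", "TopologicalPolarSurfaceArea"}]
```

Built-in values may come back as Quantity objects with units, whereas the RDKit descriptors are plain numbers, so strip the units before comparing the two.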