Simplifies cheminformatics and drug discovery workflows with a Pythonic wrapper around RDKit.
Datamol is a Claude Code skill for molecular modeling and cheminformatics tasks. It provides a high-level, Pythonic interface to RDKit, with sensible defaults for operations such as SMILES parsing, structure standardization, 3D conformer generation, and scaffold analysis. Because its functions accept and return native RDKit molecule objects, it can be mixed freely with existing RDKit code. It also supports parallelized batch processing and reads and writes common molecular file formats, including files on cloud storage via fsspec.
Key Features
01 Molecular format conversion for SMILES, SELFIES, InChI, and InChIKey
02 Batch computation of molecular descriptors and fingerprints for ML
03 High-performance 3D conformer generation and energy minimization
04 Automated structure standardization and sanitization for chemical datasets
05 Scaffold analysis and fragmentation using BRICS and RECAP methods
Use Cases
01 Preparing and cleaning molecular datasets for machine learning property prediction
02 Conducting scaffold-based library analysis and diversity selection for drug discovery
03 Generating 3D conformations and performing spatial analysis of chemical structures