DeepRank (protein interfaces)

Background

DeepRank (protein interfaces) is a configurable deep-learning framework for mining and learning patterns from protein–protein interactions (PPIs). It turns structural interface information into machine-readable 3D representations that capture atomic and residue-level physico-chemical properties. Using those representations, DeepRank trains convolutional neural networks (CNNs) to perform tasks such as docking-decoy scoring, interface classification and regression of structural quality metrics.

Core capabilities

DeepRank offers a complete pipeline from raw PDB files to trained models. Key features include predefined atom- and residue-level feature modules (for example atomic densities, van der Waals energy, Coulombic terms, residue contacts and PSSM-derived sequence features), flexible target definitions (binary labels, CAPRI categories, DockQ, iRMSD, FNAT and other metrics), and mapping of features onto 3D grids. Mapped data are stored in HDF5 files in which each complex or decoy has its own group containing the original structure, computed features and targets. The mapping follows established approaches (atomic densities are mapped according to van der Waals radii; continuous features are mapped with Gaussian spreading) so that volumetric CNNs can exploit local 3D context.

Data generation and processing

The DataGenerator and database API orchestrate dataset creation. Given a directory of decoy PDBs and the corresponding native structures, the generator computes the requested features and targets, organizes them in an HDF5 database, and maps the features onto a configurable 3D grid (number of points, resolution and per-atom density parameters are user-definable). The workflow is MPI-aware to support parallel feature computation at scale (the code expects mpi4py in the environment).
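
To make the grid mapping more concrete, the following is a minimal, self-contained sketch of Gaussian spreading on a regular 3D grid defined by a number of points and a resolution per axis. It is not DeepRank's implementation: the function names (make_grid_axis, map_gaussian), the single isotropic sigma parameter and the toy atom list are assumptions chosen for illustration only.

```python
import math
from itertools import product


def make_grid_axis(center, n_points, resolution):
    """Return n_points coordinates spaced by `resolution`, centred on `center`."""
    half_span = (n_points - 1) * resolution / 2.0
    return [center - half_span + i * resolution for i in range(n_points)]


def map_gaussian(points, grid_center, n_points=(5, 5, 5),
                 resolution=(1.0, 1.0, 1.0), sigma=1.0):
    """Spread point features onto a 3D grid with isotropic Gaussians.

    points: iterable of ((x, y, z), value) pairs.
    Returns a nested list grid[i][j][k] of accumulated feature values.
    """
    axes = [make_grid_axis(c, n, r)
            for c, n, r in zip(grid_center, n_points, resolution)]
    nx, ny, nz = n_points
    grid = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for (px, py, pz), value in points:
        for i, j, k in product(range(nx), range(ny), range(nz)):
            # Squared distance between the grid point and the feature position.
            d2 = ((axes[0][i] - px) ** 2
                  + (axes[1][j] - py) ** 2
                  + (axes[2][k] - pz) ** 2)
            grid[i][j][k] += value * math.exp(-d2 / (2.0 * sigma ** 2))
    return grid


# Two hypothetical interface atoms carrying a charge-like feature value.
atoms = [((0.0, 0.0, 0.0), 1.0), ((1.0, 0.0, 0.0), -0.5)]
grid = map_gaussian(atoms, grid_center=(0.0, 0.0, 0.0))
```

Each grid point accumulates contributions from every atom, weighted by distance, so one 5 x 5 x 5 array per feature would form one input channel of a volumetric CNN. A real pipeline would use array libraries and sparse storage rather than nested Python lists.
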
The produced HDF5 is suitable both for local experiments and for sharing between teams; a companion browser (DeepXplorer) can be used to inspect HDF5 contents and export viewable files for VMD or PyMOL.

Training and modeling

DeepRank integrates with PyTorch to provide model training utilities. HDF5 datasets are wrapped into Torch datasets, and a NeuralNet abstraction simplifies experiments: selecting features/targets, defining optimizers and running training loops. The framework supports both classification and regression tasks out of the box. Example scripts show how to create datasets, filter conformations, specify normalization and launch training (including options for batch size, number of epochs and GPU usage). Although this DeepRank repository focused on 3D CNNs, the project ecosystem has expanded to graph-based representations in subsequent releases.

Example use cases

- Docking-decoy scoring and ranking: evaluate sets of docking models against native structures by learning to predict DockQ, iRMSD or CAPRI classes.
- Interface classification: distinguish biologically relevant interfaces from crystal contacts or false positives using atomistic and sequence-derived features.
- Regression of structural quality: predict continuous quality metrics (e.g., RMSD or FNAT) for downstream filtering in docking pipelines.
- Feature exploration and visualization: inspect spatial distributions of learned features using DeepXplorer and exportable VMD/PyMOL visualizations to validate model hypotheses.

Integration, installation and status

DeepRank requires Python 3.7 or 3.8 on Linux/macOS and mpi4py for parallel feature generation. The package was available on PyPI (pip install deeprank) and the source included examples and tests (pytest). A browser tool (DeepXplorer) supports HDF5 inspection and visualization.
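
As an illustration of the docking-decoy ranking use case, the sketch below converts DockQ values into CAPRI-like quality classes and counts how many decoys of at least acceptable quality appear among the top-ranked models. The cut-offs (0.23, 0.49, 0.80) are the commonly cited DockQ thresholds and are an assumption here, not values taken from the DeepRank code; the function names and the decoy list are hypothetical.

```python
def capri_class(dockq):
    """Map a DockQ score in [0, 1] to a CAPRI-like quality label."""
    if dockq >= 0.80:
        return "high"
    if dockq >= 0.49:
        return "medium"
    if dockq >= 0.23:
        return "acceptable"
    return "incorrect"


def hits_in_top_n(decoys, n):
    """Count decoys of at least acceptable quality among the n best-scored ones.

    decoys: list of (name, predicted_score, true_dockq) tuples,
    where a higher predicted score means a better predicted model.
    """
    ranked = sorted(decoys, key=lambda d: d[1], reverse=True)
    return sum(capri_class(d[2]) != "incorrect" for d in ranked[:n])


# Hypothetical decoys: (name, model score, DockQ against the native structure).
decoys = [
    ("decoy_1", 0.91, 0.85),
    ("decoy_2", 0.40, 0.10),
    ("decoy_3", 0.75, 0.30),
    ("decoy_4", 0.66, 0.05),
]
top2 = hits_in_top_n(decoys, 2)
```

A count like this, computed per complex for several values of n, is one simple way to compare a learned scoring function against a baseline ranking.
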
Note: this particular DeepRank repository has been archived and is no longer maintained; development has moved to DeepRank2, which unifies the original DeepRank functionality with graph-based tooling (DeepRank-GNN) and mutation-focused modules (DeepRank-Mut). DeepRank2 supports both volumetric grids and graph representations and provides pre-implemented pipelines for graph neural networks (GNNs) as well as CNNs. Users starting new projects are encouraged to adopt DeepRank2 and to treat the archived DeepRank code as a historical reference.