Efficiency of a hierarchical protocol for high-throughput structure-based virtual screening on Grid5000 cluster grid

Leo Ghemtio 1 Emmanuel Jeannot 2, 3 Bernard Maigret 1, *
* Corresponding author
1 ORPAILLEUR - Knowledge representation, reasoning
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
2 RUNTIME - Efficient runtime systems for parallel architectures
Inria Bordeaux - Sud-Ouest, UB - Université de Bordeaux, CNRS - Centre National de la Recherche Scientifique : UMR5800
3 ALGORILLE - Algorithms for the Grid
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: Most modern computational techniques in drug discovery place heavy demands on computing resources. Grid computing offers a powerful alternative for running computationally intensive applications. One field of the drug innovation process that can benefit greatly from grid resources is high-throughput virtual screening, in which huge chemical compound libraries are docked into known protein-binding sites. A computational grid combines heterogeneous, geographically dispersed computer resources from multiple administrative domains and applies them to a common task that requires a great number of processing cycles or the processing of large amounts of data. This study details a screening campaign on the Grid5000 cluster grid infrastructure in which a subset of ∼600,000 “drug-like” molecules extracted from the ZINC database was screened against three structures of the liver-X receptor β (LXRβ). A funnel strategy was used for that purpose, starting with a fast but simple shape-matching procedure and ending with more complex molecular dynamics simulations. From a total of ∼91 million three-dimensional conformations generated at the beginning of the funnel, and after intermediate filtering steps, the process ended with 45 putative hits. Grid5000 is a highly reconfigurable, controllable, and monitorable experimental cluster grid connecting nine geographically distributed sites in France and featuring more than 3,200 processors and 5,700 cores. To hide the complexity of the grid system from the user, Grid5000 was used through the virtual screening manager for grid computing (VSM-G) platform, which is dedicated to in silico screening and designed to provide maximum computing power by using grid resources efficiently. The whole screening process required around 82 days (78 days of pre-processing and 3.6 days for the docking funnel itself) and used 3,144 nodes over nine sites.
The use of grid infrastructures and a hierarchical filtering protocol enables us to evaluate the binding capabilities of millions of compounds on several conformations of a given target and to propose, at low cost, the most promising compounds for in vitro tests.
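
The funnel strategy summarized in the abstract (a cheap filter applied to everything, followed by costlier scoring applied only to the survivors) can be illustrated with a minimal Python sketch. It is not the VSM-G implementation: the function names, the toy scoring functions, and the cut-off sizes are all hypothetical stand-ins chosen only to show the shape of a hierarchical filter.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

# A stage is (scoring function, number of candidates to keep).
# Both the scorers and the cut-offs used below are illustrative assumptions.
Stage = Tuple[Callable[[str], float], int]


def funnel(candidates: Iterable[str], stages: Sequence[Stage]) -> List[str]:
    """Run candidates through successive stages, keeping the top scorers of each."""
    survivors = list(candidates)
    for score, keep in stages:
        # Higher score means better; later (costlier) stages see fewer candidates.
        survivors = sorted(survivors, key=score, reverse=True)[:keep]
    return survivors


def cheap_shape_score(mol: str) -> float:
    """Toy stand-in for a fast shape-matching score (here: string length)."""
    return float(len(mol))


def costly_refinement_score(mol: str) -> float:
    """Toy stand-in for an expensive refinement score (here: count of 'N')."""
    return float(mol.count("N"))


library = ["CCO", "CCN", "CCCCN", "NCCNCC", "C"]
hits = funnel(library, [(cheap_shape_score, 3), (costly_refinement_score, 1)])
print(hits)
```

The point of the ordering is cost: the expensive scorer is called only on the candidates that survive the cheap one, which is what makes screening millions of conformations tractable.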

https://hal.archives-ouvertes.fr/hal-00547970
Contributor: Bernard Maigret
Submitted on: Saturday, December 18, 2010 - 1:41:39 AM
Last modification on: Thursday, May 16, 2019 - 6:46:06 PM

Citation

Leo Ghemtio, Emmanuel Jeannot, Bernard Maigret. Efficiency of a hierarchical protocol for high-throughput structure-based virtual screening on Grid5000 cluster grid. Open Access Bioinformatics, Dove Medical Press, 2010, 2, pp.41-53. ⟨10.2147/OAB.S7272⟩. ⟨hal-00547970⟩