Many interesting properties of molecular motion are best characterized statistically by considering an ensemble of motion pathways rather than an individual one. Classic simulation techniques, such as the Monte Carlo method and molecular dynamics, generate individual pathways one at a time and are easily “trapped” in the local minima of the energy landscape. They are computationally inefficient if applied in a brute-force fashion to deal with many pathways. We introduce Stochastic Roadmap Simulation (SRS), a randomized technique for sampling molecular motion and exploring the kinetics of such motion by examining multiple pathways simultaneously.
Stochastic Roadmap Simulation
SRS compactly encodes many motion pathways in a directed graph, called a probabilistic conformational roadmap. A node in the graph is a point sampled at random from the conformation space of a molecule. Every path in the roadmap is a potential motion pathway for the molecule. A roadmap thus contains many pathways, with associated probabilities indicating the likelihood that the molecule follows each of them. Using tools from Markov chain theory, we can efficiently obtain kinetic information on the motion of molecules from the roadmap.
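As a rough illustration of the roadmap idea, the sketch below samples nodes at random from a toy one-dimensional conformation space, links each node to its nearest neighbors, and assigns transition probabilities so that each row of the matrix forms a valid Markov chain step. The double-well energy, the sampling interval, the neighbor count `k`, and the Metropolis-style weighting are all assumptions chosen for illustration. The names `build_roadmap` and `double_well` are hypothetical and do not come from the SRS software.

```python
import numpy as np

def double_well(q):
    """Toy energy with two minima at q = -1 and q = +1 (an assumption)."""
    return float((q[0] ** 2 - 1.0) ** 2)

def build_roadmap(energy, n_nodes=200, k=6, kT=1.0, seed=0):
    """Sample nodes at random and return (nodes, energies, transition matrix)."""
    rng = np.random.default_rng(seed)
    # Nodes: points drawn uniformly from a toy 1-D conformation space.
    nodes = rng.uniform(-2.0, 2.0, size=(n_nodes, 1))
    E = np.array([energy(q) for q in nodes])

    # Pairwise distances; a node is never its own neighbor.
    dist = np.linalg.norm(nodes[:, None, :] - nodes[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)

    P = np.zeros((n_nodes, n_nodes))
    for i in range(n_nodes):
        neighbors = np.argsort(dist[i])[:k]
        for j in neighbors:
            # Metropolis-style weight (assumed rule): downhill moves are
            # always accepted, uphill moves with probability exp(-dE / kT).
            dE = E[j] - E[i]
            P[i, j] = np.exp(-max(dE, 0.0) / kT) / k
        # The remaining probability mass is a self-transition.
        P[i, i] = max(0.0, 1.0 - P[i].sum())
    return nodes, E, P

nodes, E, P = build_roadmap(double_well, n_nodes=50, k=4)
```

Each row of `P` sums to one, so any path through the graph carries a probability equal to the product of its edge probabilities.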
We tested SRS by first using it to compute the probability of folding, an important order parameter that measures the “kinetic distance” of a conformation to the native state of a protein. Our computational studies demonstrate that, compared with the Monte Carlo method, SRS obtains more accurate results and reduces the running time by several orders of magnitude. Problems that required 100 days of computation with the Monte Carlo method were solved in an hour with SRS. Furthermore, we proved that, in the limit, SRS converges to the same distribution as the Monte Carlo method.
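One way to see why a roadmap can replace many Monte Carlo runs is first-step analysis on the Markov chain: the probability of folding at a node equals the transition-weighted average of the values at its successors, with the value fixed to 1 on folded nodes and 0 on unfolded nodes. This gives a single linear system for all nodes at once. The sketch below is a minimal version of that calculation on a hand-built five-state chain. The function name `folding_probability` and the toy chain are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def folding_probability(P, folded, unfolded):
    """Probability of reaching a folded node before an unfolded one.

    P        : (n, n) row-stochastic transition matrix
    folded   : list of node indices treated as the folded set
    unfolded : list of node indices treated as the unfolded set
    """
    n = P.shape[0]
    pfold = np.zeros(n)
    pfold[folded] = 1.0

    boundary = np.zeros(n, dtype=bool)
    boundary[folded] = True
    boundary[unfolded] = True
    interior = np.flatnonzero(~boundary)

    # For interior nodes: pfold_i = sum_j P_ij * pfold_j, rearranged as
    # (I - P_interior) x = (one-step probability of entering the folded set).
    A = np.eye(len(interior)) - P[np.ix_(interior, interior)]
    b = P[np.ix_(interior, folded)].sum(axis=1)
    pfold[interior] = np.linalg.solve(A, b)
    return pfold

# Toy chain: states 0..4, state 0 unfolded, state 4 folded,
# interior states step left or right with probability 0.5 each.
P = np.zeros((5, 5))
P[0, 0] = P[4, 4] = 1.0
for i in (1, 2, 3):
    P[i, i - 1] = P[i, i + 1] = 0.5

pfold = folding_probability(P, folded=[4], unfolded=[0])
print(pfold)  # approximately [0, 0.25, 0.5, 0.75, 1]
```

A Monte Carlo estimate would need many simulated trajectories per starting node to approach these values, whereas the linear solve yields all of them together.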
We then used the probability of folding to estimate the transition state ensemble (TSE) and predicted the rates and the phi-values for protein folding. For the 16 proteins we studied, comparison with experimental data shows that our method estimates the TSE much more accurately than an existing method based on dynamic programming. This improvement leads to better folding-rate predictions. We also computed the mean first passage time of the unfolded states and showed that the computed values correlate with experimentally determined folding rates. The results on phi-value predictions are mixed, possibly due to the simple energy model used in the tests. The comparison with experimental data further validates the SRS method and indicates its potential as a general tool for studying protein folding kinetics.
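The mean first passage time mentioned above can also be read off a roadmap with one linear solve: for every node outside the folded set, the expected number of steps to reach that set is one plus the transition-weighted average over its successors. The sketch below applies this to a small hand-built chain with a reflecting unfolded end. The function name `mean_first_passage_time` and the toy chain are assumptions for illustration, and times are measured in roadmap steps rather than physical units.

```python
import numpy as np

def mean_first_passage_time(P, folded):
    """Expected number of steps to first reach the folded set from each node.

    P      : (n, n) row-stochastic transition matrix
    folded : list of node indices treated as the folded set
    """
    n = P.shape[0]
    is_folded = np.zeros(n, dtype=bool)
    is_folded[folded] = True
    rest = np.flatnonzero(~is_folded)

    # For nodes outside the folded set: t_i = 1 + sum_j P_ij * t_j,
    # with t_j = 0 on folded nodes, i.e. (I - P_rest) t = 1.
    A = np.eye(len(rest)) - P[np.ix_(rest, rest)]
    t = np.zeros(n)
    t[rest] = np.linalg.solve(A, np.ones(len(rest)))
    return t

# Toy chain: states 0..4, state 4 folded, state 0 reflects back to state 1,
# interior states step left or right with probability 0.5 each.
P = np.zeros((5, 5))
P[0, 1] = 1.0
P[4, 4] = 1.0
for i in (1, 2, 3):
    P[i, i - 1] = P[i, i + 1] = 0.5

mfpt = mean_first_passage_time(P, folded=[4])
print(mfpt)  # approximately [16, 15, 12, 7, 0]
```

In the same spirit, a transition state ensemble could be approximated by collecting the roadmap nodes whose probability of folding is close to 0.5; the width of that band would be a modeling choice.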
Predicted folding rates versus the experimentally measured folding rates. The 16 proteins ranged from 56 to 128 amino acids in length.
Sequence of secondary structure formation for 3CHY. Warmer colors indicate a higher degree of native contacts attained. The colored bar on the left indicates secondary structures: red for helices and green for strands.
- T.H. Chiang, M.S. Apaydin, D.L. Brutlag, D. Hsu and J.C. Latombe. Predicting Experimental Quantities in Protein Folding Kinetics using Stochastic Roadmap Simulation. In Proc. ACM Int. Conf. on Computational Biology (RECOMB), pp. 410–424, 2006.
- M.S. Apaydin, D.L. Brutlag, C. Guestrin, D. Hsu, J.C. Latombe, and C. Varma. Stochastic roadmap simulation: an efficient representation and algorithm for analyzing molecular motion. J. Computational Biology, 10(3-4):247–281, 2003.
- M.S. Apaydin, D.L. Brutlag, C. Guestrin, D. Hsu, and J.C. Latombe. Stochastic roadmap simulation: An efficient representation and algorithm for analyzing molecular motion. In Proc. ACM Int. Conf. on Computational Biology (RECOMB), pp. 12–21, 2002.