Structure prediction tests

Loop modeling

Figure
KIC protocol
A diagram of the KIC protocol (left), in which moves are accepted or rejected based on the Metropolis criterion (MC). The four additional next-generation KIC (NGK) strategies apply at different stages in the protocol (right).

The Rosetta loop modeling benchmark [Mandell09, Stein13] tests the ability of a protocol to reconstruct the backbone conformation of loop segments in experimentally determined protein structures. The short loops benchmark set consists of 45 protein segments, each 12 residues long and without regular secondary structure. In each case, the given segment is deleted from the protein structure and then reconstructed de novo, given a fixed backbone environment for the rest of the protein. All side chains in the segment and those within 10 Å of the segment are modeled based on a rotamer library that does not include the native side chain conformations.

The long loops benchmark [Stein13] analogously tests whether protocols are able to reconstruct longer loop segments of 14 to 17 residues. This benchmark set consists of 27 non-redundant long loops. De novo loop reconstruction and side chain optimization are performed as described above for the short loops benchmark.

After superposition of each generated model onto the native structure (excluding the reconstructed loop), the loop backbone heavy-atom RMSD of the model to the native loop conformation is calculated. Overall benchmark performance of each protocol is then evaluated using two different metrics across the entire benchmark set: (i) median loop backbone RMSD of the lowest-energy model to the native structure (or median lowest loop backbone RMSD of the 5 lowest-energy models, which is less susceptible to stochastic fluctuations [Leaver-Fay12]), and (ii) the median percentage of models generated that have a sub-angstrom loop backbone RMSD.
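As an illustration, the two benchmark metrics can be computed from per-model energies and loop RMSDs roughly as follows. This is a minimal sketch, not the benchmark's own analysis code: the function names, the `(energy, rmsd)` tuple layout, and the strict `< 1.0` Å cutoff for "sub-angstrom" are assumptions made for this example.

```python
from statistics import median


def best_rmsd_among_lowest_energy(models, top_n=1):
    """Lowest loop RMSD among the top_n lowest-energy models of one benchmark case.

    models: list of (energy, rmsd) tuples (hypothetical layout).
    top_n=1 gives the RMSD of the lowest-energy model; top_n=5 gives the
    lowest RMSD among the 5 lowest-energy models.
    """
    ranked = sorted(models, key=lambda m: m[0])  # ascending energy
    return min(rmsd for _, rmsd in ranked[:top_n])


def percent_sub_angstrom(models, cutoff=1.0):
    """Percentage of models of one case with loop RMSD below cutoff (in angstroms)."""
    hits = sum(1 for _, rmsd in models if rmsd < cutoff)
    return 100.0 * hits / len(models)


def benchmark_summary(cases, top_n=1):
    """Medians across all cases; cases maps a case id to its list of models."""
    return {
        "median_rmsd": median(
            best_rmsd_among_lowest_energy(m, top_n) for m in cases.values()
        ),
        "median_pct_sub_angstrom": median(
            percent_sub_angstrom(m) for m in cases.values()
        ),
    }
```

Calling `benchmark_summary(cases, top_n=5)` gives the variant based on the 5 lowest-energy models.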

The cyclic coordinate descent (CCD) protocol uses insertion of fragments from proteins of known structure to efficiently sample the loop backbone degrees of freedom [Wang07], followed by torsion angle adjustments via cyclic coordinate descent to close the resulting chain break [Canutescu03].
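The idea behind the cyclic coordinate descent closure step can be illustrated with a simplified planar analogue. The sketch below is not the Rosetta implementation and not the 3D torsion-space algorithm of [Canutescu03]: it assumes a 2D chain of rigid links with hypothetical function names, and adjusts one joint angle at a time so that the chain end moves as close as possible to a target point.

```python
import math


def forward(angles, lengths):
    """Joint positions of a planar chain anchored at the origin.

    angles are relative joint angles in radians; lengths are link lengths.
    """
    pts = [(0.0, 0.0)]
    heading = 0.0
    for angle, length in zip(angles, lengths):
        heading += angle
        x, y = pts[-1]
        pts.append((x + length * math.cos(heading), y + length * math.sin(heading)))
    return pts


def ccd_close(angles, lengths, target, tol=1e-6, max_sweeps=200):
    """Adjust joint angles one at a time until the chain end reaches target."""
    angles = list(angles)
    for _ in range(max_sweeps):
        # Sweep from the last joint back to the first.
        for i in reversed(range(len(angles))):
            pts = forward(angles, lengths)
            px, py = pts[i]    # pivot joint
            ex, ey = pts[-1]   # current chain end
            to_end = math.atan2(ey - py, ex - px)
            to_target = math.atan2(target[1] - py, target[0] - px)
            # Rotating about the pivot by this amount points the chain end
            # at the target, which minimizes the end-to-target distance
            # with respect to this single angle.
            angles[i] += to_target - to_end
        ex, ey = forward(angles, lengths)[-1]
        if math.hypot(ex - target[0], ey - target[1]) < tol:
            break
    return angles
```

In the protein setting the adjusted variables are backbone torsion angles and the target is the set of atom positions across the chain break, but the one-variable-at-a-time structure is the same.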

The kinematic closure (KIC) protocol is a robotics-based method [Mandell09], which samples all but six loop backbone degrees of freedom probabilistically from Ramachandran space. Those remaining three pairs of φ/ψ torsion angles are then solved analytically through kinematic closure to close the chain break [Coutsias05]. The figure below demonstrates a KIC move, which is a purely local conformational change in the chosen protein segment without affecting the rest of the structure.

Figure
Kinematic closure example
Kinematic closure for local conformational sampling: (left) 3 Cα atoms in the segment to be remodeled are designated as pivots (orange), the remaining N-3 Cα atoms are non-pivots (yellow). (middle) Torsion angles at the non-pivot atoms are sampled from residue-specific Ramachandran distributions in standard KIC, opening the segment. (right) Analytical closure calculates values for the pivot φ/ψ torsions that form a closed conformation ("KIC move").

Next-generation KIC (NGK) adds four additional sampling strategies to the standard KIC protocol: (i) the selection of pairs of φ/ψ torsions from neighbor-dependent Ramachandran distributions, (ii) sampling of ω degrees of freedom, as well as annealing methods that gradually ramp the weights of (iii) the repulsive and (iv) Ramachandran terms of the Rosetta energy function to overcome energy barriers [Stein13].

All three loop modeling protocols use Monte Carlo simulated annealing for rotamer-based side-chain optimization ("repacking") of the loop residues and those within 10 Å of the loop, followed by gradient-based minimization. Both the KIC and NGK protocols have been shown to successfully sample sub-angstrom protein loop conformations in many cases [Mandell09, Stein13], with NGK generating a higher median percentage of sub-angstrom models than standard KIC [Stein13].
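For readers unfamiliar with the Metropolis criterion used to accept or reject moves in Monte Carlo sampling, a minimal sketch of the standard rule follows. The function name, the parameter `kT`, and the use of Python's `random` module are assumptions for illustration; this is the textbook criterion, not code taken from Rosetta.

```python
import math
import random


def metropolis_accept(e_old, e_new, kT, rng=random):
    """Return True if a move from energy e_old to e_new is accepted.

    Downhill (or equal-energy) moves are always accepted; uphill moves are
    accepted with probability exp(-(e_new - e_old) / kT).
    """
    if e_new <= e_old:
        return True
    return rng.random() < math.exp(-(e_new - e_old) / kT)
```

Lowering `kT` over the course of a simulation makes uphill moves progressively less likely, which is the basic idea of simulated annealing.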

Note that the protocols described below do not make use of native or homologous fragments, as we wish to present a fair comparison between the methods. If you have homologous fragments for your protein of interest, you may wish to use one of the methods that supports fragments (e.g. CCD).

Publications
Mandell DJ, Coutsias EA, Kortemme T. Sub-angstrom accuracy in protein loop reconstruction by robotics-inspired conformational sampling. 2009. Nat Methods 6(8):551-2. doi: 10.1038/nmeth0809-551.
Stein A, Kortemme T. Improvements to Robotics-Inspired Conformational Sampling in Rosetta. 2013. PLoS ONE 8(5):e63090. doi: 10.1371/journal.pone.0063090.
Leaver-Fay A, O'Meara MJ, Tyka M, Jacak R, Song Y, Kellogg EH, Thompson J, Davis IW, Pache RA, Lyskov S, Gray JJ, Kortemme T, Richardson JS, Havranek JJ, Snoeyink J, Baker D, Kuhlman B. Scientific benchmarks for guiding macromolecular energy function improvement. 2013. Methods Enzymol 523:109-43. doi: 10.1016/B978-0-12-394292-0.00006-0.
Wang C, Bradley P, Baker D. Protein-protein docking with backbone flexibility. 2007. J Mol Biol 373(2):503-19. doi: 10.1016/j.jmb.2007.07.050.
Canutescu AA, Dunbrack RL. Cyclic coordinate descent: A robotics algorithm for protein loop closure. 2003. Protein Sci 12(5):963-72. doi: 10.1110/ps.0242703.
Coutsias EA, Seok C, Jacobson MP, Dill KA. A kinematic view of loop closure. 2004. J Comput Chem 25(4):510-28. doi: 10.1002/jcc.10416.