In our experiments we used a set of candidate protein structures (decoys)
generated by the I-TASSER ab initio predictor. For each of the 56 non-homologous
small protein chains, I-TASSER generated from 12.5k to 20k decoys.
We used 54 of these chains (excluding 1ogwA and 1cy5A)
and sampled every 10th decoy in order of generation.
We implemented 8 selected I-TASSER energy terms and calculated their values for each decoy.
We left out the energy terms that use data from the threading process (e.g. distance map or contact order), as well as the hydrophobic potential, because they depend on external feature predictors.
Download energy terms: energy_terms.tar.gz [2.8 MB]
The archive contains 54 files, one for each protein. Each line in a file contains a space-separated list of energy terms for a single decoy. The decoys (lines) in each file are sorted in increasing order of the original I-TASSER energy.
Line format: T1 T2 T3 T4 T5 T6 T7 T8
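A minimal sketch of reading one of these files, assuming only the line format described above (8 space-separated numeric values per decoy). The function name `parse_energy_terms` and the sample values are hypothetical and serve only as an illustration; they are not taken from the archive.

```python
from typing import List

N_TERMS = 8  # T1..T8, as in the line format


def parse_energy_terms(text: str) -> List[List[float]]:
    """Parse the contents of one per-protein file into a list of decoys.

    Each decoy is a list of 8 floats. Blank lines are skipped; any other
    line that does not hold exactly 8 values raises ValueError.
    """
    decoys = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue
        if len(fields) != N_TERMS:
            raise ValueError(
                f"line {line_no}: expected {N_TERMS} values, got {len(fields)}"
            )
        decoys.append([float(f) for f in fields])
    return decoys


# Made-up values, for illustration only (not real I-TASSER energies).
sample = (
    "1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0\n"
    "-1.5 0.0 2.5 0.0 0.0 0.0 0.0 1.0\n"
)
terms = parse_energy_terms(sample)
```

For a real file, the text would come from something like `open(path).read()`, where `path` points at one of the 54 extracted files.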
For each decoy we measured its similarity to the known native structure. As a measure we used the root mean square deviation (RMSD) between the 3D coordinates of the Cα atoms of the two structures, minimised with respect to rotation.
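The RMSD described above can be sketched with the Kabsch algorithm, a standard way of minimising RMSD over rotations. This is an assumption: the text does not say which superposition method was actually used. The sketch also assumes NumPy is available, that both structures are given as (N, 3) arrays of Cα coordinates in the same residue order, and that both are centred before rotation. The function name is hypothetical.

```python
import numpy as np


def rmsd_after_superposition(a: np.ndarray, b: np.ndarray) -> float:
    """RMSD between two (N, 3) coordinate sets, minimised over rotations.

    Both sets are first centred on their centroids, so any translation
    between them is removed as well.
    """
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)

    # Singular values of the 3x3 covariance matrix give the best overlap.
    u, s, vt = np.linalg.svd(a.T @ b)

    # If the best orthogonal transform is a reflection, flip the smallest
    # singular value so that only proper rotations are considered.
    if np.linalg.det(u @ vt) < 0:
        s[-1] = -s[-1]

    # Minimal sum of squared distances: |a|^2 + |b|^2 - 2 * sum(s).
    msd = ((a ** 2).sum() + (b ** 2).sum() - 2.0 * s.sum()) / len(a)
    return float(np.sqrt(max(msd, 0.0)))  # guard against tiny negatives


# Toy coordinates: a rotated and shifted copy should give RMSD close to 0.
native = np.array([[0.0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]])
rot_z = np.array([[0.0, -1, 0], [1, 0, 0], [0, 0, 1]])  # 90 degrees about z
decoy = native @ rot_z.T + np.array([5.0, -3.0, 2.0])
toy_rmsd = rmsd_after_superposition(native, decoy)
```

The formula avoids building the rotation matrix explicitly, since only the RMSD value is needed here.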
We assigned each decoy a rank based on increasing RMSD, averaging the ranks in case of ties. Two decoys were considered tied when their RMSD values agreed to the first two decimal places.
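The ranking rule can be sketched as follows. Two details are assumptions, as the text does not specify them: ties are detected by rounding RMSD to two decimal places (truncation would be the other reading), and ranks start at 1. The function name `average_ranks` is hypothetical.

```python
from typing import List


def average_ranks(rmsd_values: List[float], decimals: int = 2) -> List[float]:
    """Rank decoys by increasing RMSD; tied decoys share the mean rank.

    Ties are detected after rounding to `decimals` places (an assumption).
    Ranks are 1-based (also an assumption).
    """
    keys = [round(v, decimals) for v in rmsd_values]
    order = sorted(range(len(keys)), key=keys.__getitem__)
    ranks = [0.0] * len(keys)

    i = 0
    while i < len(order):
        # Extend j to the end of the run of equal keys starting at i.
        j = i
        while j + 1 < len(order) and keys[order[j + 1]] == keys[order[i]]:
            j += 1
        # Mean of the 1-based positions i+1 .. j+1.
        mean_rank = (i + 1 + j + 1) / 2.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


# Made-up RMSD values: the first and third agree to two decimals.
example_ranks = average_ranks([2.504, 1.20, 2.50, 3.10])
```

In this example the two tied decoys would occupy positions 2 and 3, so each receives rank 2.5.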
Download distances/ranks: distances.tar.gz [384 kB]
The archive contains 54 files, one for each protein. Each line in a file contains a space-separated list of 3 values: rank, RMSD, and the original I-TASSER energy. The order of decoys (lines) in each file is the same as for the energy terms.
Line format: rank RMSD energy
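A short sketch of reading one distances file and checking the stated ordering, assuming only the three-column line format described above. The names `DecoyDistance`, `parse_distances` and `is_sorted_by_energy`, and the sample values, are hypothetical.

```python
from typing import List, NamedTuple


class DecoyDistance(NamedTuple):
    rank: float    # may be fractional because tied ranks are averaged
    rmsd: float
    energy: float  # original I-TASSER energy


def parse_distances(text: str) -> List[DecoyDistance]:
    """Parse 'rank RMSD energy' lines into records, skipping blank lines."""
    records = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue
        if len(fields) != 3:
            raise ValueError(f"line {line_no}: expected 3 values")
        records.append(DecoyDistance(*(float(f) for f in fields)))
    return records


def is_sorted_by_energy(records: List[DecoyDistance]) -> bool:
    """True if the energy never decreases from one line to the next."""
    return all(a.energy <= b.energy for a, b in zip(records, records[1:]))


# Made-up values, for illustration only.
sample = "2.5 3.41 -120.0\n1.0 2.10 -118.5\n2.5 3.41 -90.0\n"
records = parse_distances(sample)
best = min(records, key=lambda r: r.rmsd)  # decoy closest to the native
```

Because the line order matches the energy-terms file for the same protein, the i-th record here corresponds to the i-th line of energy terms.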
High-resolution versions of the plots from our paper, together with several extra plots not included there, are available in the two image galleries listed below: