YUP.SCX: Tutorial



This tutorial is based on the text of a paper (see the Introduction). It is easier to follow the tutorial along with the paper. (This tutorial does not stand on its own!) The goal is to flexibly fit the structure of the closed form of a molecule into the map of the open conformer at two resolutions, and vice-versa. Thus, the exact solutions to all four problems are known.

You will need to have YUP installed before you can run these examples. Click here to obtain a copy. We also use the program corr_coef from the NMFF suite to calculate cross correlation scores; a custom script or the VMD RMSD Calculator to calculate RMSD (RMSD° before alignment and superimposition and RMSD after); and UCSF Chimera for molecular graphics. There are many alternatives to these programs.

In the following, each command is presented in a separate colored box. The command has to be typed on a single line, but your browser may display it over multiple lines. You can also copy the command from the browser, paste it into your terminal and edit the command to your taste before submitting it.

Yup.scx --help | more; Yup.which; Yup.scx --version

This tutorial requires YUP version 1.071115 or higher (type Yup.which) and Yup.scx version 1.071023 or higher (type Yup.scx --version).
Problems
The object in each problem is to refine a Starting structure to fit into the Target map. Of course, the exact solution is an ideal that is unattainable except by chance.

Problem  Starting          Target                       Initial Placement  Exact Solution
K145     Closed conformer  5 Å map of open conformer    (image)            (image)
K149     Closed conformer  9 Å map of open conformer    (not shown)        (not shown)
K415     Open conformer    5 Å map of closed conformer  (not shown)        (not shown)
K419     Open conformer    9 Å map of closed conformer  (image)            (image)

Download
In order to run these examples, right-click to download yupscxtut.tgz [11,641,467 bytes] and extract the files into an empty directory:

tar -xzf yupscxtut.tgz

You will get the following:

File Name        Size [bytes]  Contents
1AKE_A0R.pdb     132,285       Initial structure for the K145/9 problems and exact solution to the K415/9 problems
1AKE_A0R5.xplor  19,426,098    Target electron density map for the K415 problem
1AKE_A0R9.xplor  239,078       Target electron density map for the K419 problem
4AKE_A0R.pdb     131,808       Initial structure for the K415/9 problems and exact solution to the K145/9 problems
4AKE_A0R5.xplor  23,517,667    Target electron density map for the K145 problem
4AKE_A0R9.xplor  305,828       Target electron density map for the K149 problem
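
A quick way to confirm that the archive was extracted intact is to compare each file's size with the table above. The following is a minimal sketch, not part of the YUP distribution; the function name check_size is our own, and it assumes a POSIX shell with wc available.

```shell
# check_size FILE EXPECTED_BYTES
# Compares the size of FILE with the byte count listed in the table.
# (check_size is a hypothetical helper, not a YUP command.)
check_size() {
    if [ ! -f "$1" ]; then
        echo "missing  $1"
        return 1
    fi
    actual=$(wc -c < "$1" | tr -d ' ')
    if [ "$actual" -eq "$2" ]; then
        echo "ok       $1"
    else
        echo "MISMATCH $1 (expected $2, found $actual)"
        return 1
    fi
}

# Usage, in the directory where the archive was extracted:
#   check_size 1AKE_A0R.pdb 132285
#   check_size 4AKE_A0R9.xplor 305828
```

The byte counts are written without the thousands separators used in the table.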

Defaults
Most Yup.scx options have default values. Since these can change from one version to another, we list the most important of the defaults that were current at the time the paper was written:

Option  Value  Description
pdbprefix SCX- The output structures are written to PDB files with names corresponding to the input files with the added prefix.
numtimes 1 Do a simulated annealing run and finish with energy minimization. The same action is taken for any value greater than 0. If 0, only minimization will be performed.
outratio 4.0 Initial value of the ratio of the outer to inner radius in the SCX function.
finalratio 0 Final value of the ratio of the outer to inner radius in the SCX function. The radius ratio is varied between the values of outratio and finalratio during the cooling phase of the simulated annealing procedure. 0 is an illegal value and therefore the value of outratio is used instead, which means that the ratio is held constant throughout the annealing process.
maxTemp 10.0 Maximum temperature [K] in the simulated annealing procedure.
stepfactor 0.5 This constant is multiplied by the number of atoms in the system rounded up to the next 1000 to yield the number of integration steps for the cooling phase of the simulated annealing procedure. An equal number of steps is used in the heating and holding stages.
hookecut 0 Atom pairs separated by this distance are recruited into the elastic network. At the default value, cutoff distances vary by atom types. Otherwise, disregard atom types and use a single cutoff criterion.
hookeconst 1 Scale all the elastic network constants by this value.
hookeoption 3 Assign force constants for the elastic network according to bond length and residue separation.

In the following, options are specified only when it is desired to override the default values. One option that is routinely specified is pdbprefix because we want to preserve all the refined structures.

Since the publication of the paper, the program has gained new options:

Option  Value  Description
chiralconst 1 Scale the chiral force constants by the specified factor.
chiralcut -1.8 Impose chiral constraints on the network consisting of bonds within the same chain that are of length not exceeding this value. This locks chiral centers whether they are correct or not. Set a negative value to turn this off.
preminimize 1000 Energy minimize the starting structure by the specified number of steps before annealing. Set to 0 to turn this off.

If you are using version 1.080601 or higher, and you want to reproduce the published results, please include the following option in each command:

Yup.scx --preminimize=0

Convert
Yup.scx needs an initial structure in the form of a partition map file and a target electron density map in the SCX format. The following commands will create these files by conversion from the original files.

Note that two parameters are always required, and they must be placed after all the options. The first is the name of the partition map file; it must start with a letter, followed by letters or numbers (the .py suffix is assumed). The second is the name of the SCX file (the .scx suffix is assumed). The partition map file encodes the names of the original PDB files (there may be more than one) but does not contain any coordinates. Therefore, the PDB files are still required for the refinement. The SCX files are complete replacements for the XPLOR files.

In the following, numtimes is specified as a negative number to prevent any refinement calculations and the production of PDB output. (A value of 0 will cause the simulated annealing calculations to be skipped but not the energy minimization.)

Yup.scx --pdb=1AKE_A0R.pdb --xplor=4AKE_A0R5.xplor --numtimes=-1 pdb1 map45

A partition map file (pdb1.py) is created from the PDB file (1AKE_A0R.pdb) and the XPLOR file 4AKE_A0R5.xplor is converted into the SCX file map45.scx.

Yup.scx --xplor=4AKE_A0R9.xplor --numtimes=-1 pdb1 map49

The XPLOR file 4AKE_A0R9.xplor is converted into the SCX file map49.scx. Note that the pdb1 parameter is simply a placeholder.

Yup.scx --pdb=4AKE_A0R.pdb --xplor=1AKE_A0R5.xplor --numtimes=-1 pdb4 map15

A partition map file (pdb4.py) is created from the PDB file (4AKE_A0R.pdb) and the XPLOR file 1AKE_A0R5.xplor is converted into the SCX file map15.scx.

Yup.scx --xplor=1AKE_A0R9.xplor --numtimes=-1 pdb4 map19

The XPLOR file 1AKE_A0R9.xplor is converted into the SCX file map19.scx.

Note that we could have specified the source files (*.pdb and *.xplor) as part of a refinement calculation. However, we will be using the converted files more than once in the following exercises.

At this time, there should be six new files: map15.scx, map19.scx, map45.scx, map49.scx, pdb1.py, pdb4.py.
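
Before moving on, it may be worth confirming that all six converted files are present. The sketch below assumes a POSIX shell; check_converted is a name we made up for this illustration and is not part of YUP.

```shell
# check_converted [DIR]
# Reports which of the six converted files are missing from DIR
# (default: the current directory). Hypothetical helper, not a YUP command.
check_converted() {
    dir=${1:-.}
    missing=0
    for f in map15.scx map19.scx map45.scx map49.scx pdb1.py pdb4.py; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f"
            missing=$((missing + 1))
        fi
    done
    echo "$missing of 6 files missing"
    [ "$missing" -eq 0 ]
}

# Usage:
#   check_converted
```
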
In the following, the section numbers are from the paper. Execution times are given for a 2001 vintage 3.2 GHz Intel Xeon processor, which is roughly equivalent to a 2 GHz Intel Core Duo of 2006. Each run starts with an initial velocity distribution that is generated by a random number generator, seeded with the current time. Runs with the same settings will have different outcomes if they are started at different times. On most machines, one second between runs is enough to prevent identical outcomes.

It is a good idea to redirect the output from each refinement to a file although that will prevent you from seeing the progress of the calculation. In the following, commands are given without the file redirection.
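
If you want both a saved copy and a view of the progress, the standard tee utility can do both at once. The wrapper below is only a sketch: run_logged and the log file name are our own choices, and we assume that the progress messages arrive on standard output or standard error.

```shell
# run_logged LOGFILE COMMAND [ARGS...]
# Runs COMMAND, shows its output on the terminal and saves a copy in LOGFILE.
# Note: the exit status is that of tee, not of COMMAND.
run_logged() {
    log=$1
    shift
    "$@" 2>&1 | tee "$log"
}

# Example (the file name k145.log is arbitrary):
#   run_logged k145.log Yup.scx --pdbprefix=k145- --stepfactor=2 pdb1 map45
```
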
Section 2.8
In this section of the paper, we established the optimum duration for the cooling phase of the simulated annealing procedure. The equations of motion are integrated numerically at a given step size. The duration is the number of steps multiplied by the step size. We tried step sizes of 1 femtosecond, 2.5 fs, and 5 fs.

The duration study was carried out for the K419 problem. Therefore, we will be using the partition map file pdb4.py and the SCX map file map19.scx.

The step size is not designed to be changed, but a secret option is provided for testing. The duration is controlled through the stepfactor option. The value of this option is multiplied by the number of atoms in the starting structure rounded up to the next 1000 to yield the number of steps. We also want to save the structure from each run, so that we can calculate the quality of the refinement. Thus, we specify a unique value for the pdbprefix option for each run.

Yup.scx --pdbprefix=P --secret=S --stepfactor=T pdb4 map19

We carried out the following runs and compiled the results in Figure 3 (of the paper).

T      S = 1                S = 2                S = 5
       P     Duration [ps]  P     Duration [ps]  P     Duration [ps]
0.1    Za1-  0.2            Za2-  0.5            Za5-  1.0
0.25   Zb1-  0.5            Zb2-  1.25           Zb5-  2.5
0.5    Zc1-  1.0            Zc2-  2.5            Zc5-  5.0
1.0    Zd1-  2.0            Zd2-  5.0            Zd5-  10.0
2.5    Ze1-  5.0            Ze2-  12.5           Ze5-  25.0
5.0    Zf1-  10.0           Zf2-  25.0           Zf5-  50.0
10.0   Zg1-  20.0           Zg2-  50.0           Zg5-  100.0
25.0   Zh1-  50.0           Zh2-  125.0          Zh5-  250.0
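
The durations in the table can be reproduced from the rule given above: the number of steps is stepfactor multiplied by the atom count rounded up to a multiple of 1000, and the duration is the number of steps multiplied by the step size. The sketch below rests on two assumptions that we read off the table rather than from the program: the atom count of the starting structure rounds up to 2000 (the value 1600 is a placeholder, not a count taken from the PDB file), and S = 2 corresponds to the 2.5 fs step size. The helper name duration_ps is ours.

```shell
# duration_ps STEPFACTOR NATOMS STEP_FS
# steps    = STEPFACTOR * (NATOMS rounded up to a multiple of 1000)
# duration = steps * STEP_FS / 1000   [ps]
# (hypothetical helper; an exact multiple of 1000 is left unchanged here)
duration_ps() {
    awk -v t="$1" -v n="$2" -v s="$3" 'BEGIN {
        rounded = int((n + 999) / 1000) * 1000
        steps = t * rounded
        printf "%g\n", steps * s / 1000
    }'
}

duration_ps 1.0 1600 2.5    # the Zd2- run: prints 5
```

With these assumptions, a stepfactor of 1.0 gives 2000 steps, and 2000 steps of 2.5 fs give 5 ps, which matches the Zd2- entry.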

A specific example:

Yup.scx --pdbprefix=Zd2- --secret=2 --stepfactor=1.0 pdb4 map19

This run takes 2 minutes and 3 seconds. The fitted structure is in the file Zd2-4AKE_A0R.pdb, because the output name is the prefix followed by the name of the input PDB file behind pdb4.

As one would expect, the execution time increases proportionally with stepfactor. For example:

Yup.scx --pdbprefix=Zh1- --secret=1 --stepfactor=25 pdb4 map19

finished in 44 minutes, and produced a result file Zh1-4AKE_A0R.pdb.

We wrote a script to carry out all 24 fittings, including the calculations of the cross-correlation coefficients and the root mean square deviation between the refined structure and the exact solution. These calculations show that it is duration that matters, not the step size or the number of steps.
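
Our own script is not reproduced here, but the 24 command lines can be generated with a short loop. The sketch below only prints the commands (a dry run). The function name list_duration_runs is hypothetical, and the scoring steps with corr_coef and the RMSD calculation are left out because their command lines are not given in this tutorial.

```shell
# Print the 24 Yup.scx command lines of the duration study (dry run).
# The prefix letters a..h follow the T column of the table; S is 1, 2 or 5.
list_duration_runs() {
    for pair in a:0.1 b:0.25 c:0.5 d:1.0 e:2.5 f:5.0 g:10.0 h:25.0; do
        letter=${pair%%:*}
        t=${pair#*:}
        for s in 1 2 5; do
            echo "Yup.scx --pdbprefix=Z${letter}${s}- --secret=$s --stepfactor=$t pdb4 map19"
        done
    done
}

list_duration_runs
```

To actually execute the runs one after the other, the output can be piped to a shell, for example with list_duration_runs | sh. Expect the longest runs to dominate the total time.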
Section 2.10
We use a highly modified Gaussian Network Model (GNM). In this section, we show how the conventional GNM fails. In order to use a conventional elastic network, we have to override the default values of three options:

Yup.scx --pdbprefix=fx145- --stepfactor=2.0 --hookecut=3.5 --hookeconst=0.5 --hookeoption=0 pdb1 map45

That run took three and a half minutes and the results are saved in fx145-1AKE_A0R.pdb.

For the lower resolution map, the command is:

Yup.scx --pdbprefix=fx149- --stepfactor=2.0 --hookecut=3.5 --hookeconst=0.5 --hookeoption=0 pdb1 map49

This run also took three and a half minutes. The refined structure is in fx149-1AKE_A0R.pdb.

The starting structure is a closed conformer, which we are trying to open to fit into the map of the open conformer. This is difficult because of the dense network of bonds in the conventional GNM.

Notice that we have softened the force constants and reduced the cutoff distance to obtain a sparser elastic network. The fitted structure is actually not bad. If you display it along with the starting structure and the exact solution, you will probably see that the fitted structure is closer to the target than to the starting structure. Try a fitting with a hookecut of 5 Å (which is a more conventional choice), and do not forget to set a different pdbprefix value. You should get a much worse fit. If you set hookeconst to a very small value like 0.1, you may get a better fit, but the stereochemistry will be poor.
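
The two variations suggested above might be written out as follows. This is a sketch that only prints the command lines; gnm_command is a hypothetical helper, and the prefixes fx145c5- and fx145k01- are arbitrary names chosen so that earlier results are not overwritten.

```shell
# gnm_command CUTOFF CONST PREFIX
# Prints a conventional-network command for problem K145 with the given
# hookecut and hookeconst values. PREFIX is any unused pdbprefix value.
gnm_command() {
    echo "Yup.scx --pdbprefix=$3 --stepfactor=2.0 --hookecut=$1 --hookeconst=$2 --hookeoption=0 pdb1 map45"
}

gnm_command 5   0.5 fx145c5-     # wider, more conventional cutoff
gnm_command 3.5 0.1 fx145k01-    # much softer network
```
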

In the highly modified GNM that we finally adopted as the default, the short bonds that are within a residue or between two connected residues are made very stiff while the long bonds are made very weak.
Section 2.9
Simulated annealing is a stochastic procedure, i.e., the outcome is not deterministic. In this section, we show that, at least for these four problems, this does not matter. The fitted structures cluster tightly together and therefore, there is a good chance of getting a good result from just one attempt.

These runs are carried out with default options except for stepfactor:

Yup.scx --pdbprefix=P --stepfactor=2 PDB MAP

In the following, the pdbprefix option is assigned a different value P for each run. The runs are carried out one after the other as many times as required. Running the cases serially ensures that every run starts with a different velocity distribution.

Problem  PDB   MAP
K145     pdb1  map45
K149     pdb1  map49
K415     pdb4  map15
K419     pdb4  map19

A specific example:

Yup.scx --pdbprefix=A00- --stepfactor=2 pdb1 map45

This run took a few seconds under four minutes and produced a file A00-1AKE_A0R.pdb containing the fitted structure. This command is very typical of how Yup.scx is used. The default options have been found from experience to be best and are therefore usually acceptable. In this case, the atomic model is small enough that we can afford the luxury of extending the duration to four times the default. This is so that the results are more consistent.
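
Repeating a run serially with a fresh prefix each time can be scripted along these lines. The sketch prints the commands instead of running them. repeat_runs is a hypothetical helper, and the numbering scheme (A00-, A01-, ...) simply extends the A00- prefix used above; it is an assumption about how the remaining runs were named.

```shell
# repeat_runs PDB MAP LETTER COUNT
# Prints COUNT command lines with prefixes LETTER00-, LETTER01-, ... (dry run).
# Executing them one after the other gives each run a different start time
# and therefore a different random seed.
repeat_runs() {
    i=0
    while [ "$i" -lt "$4" ]; do
        printf 'Yup.scx --pdbprefix=%s%02d- --stepfactor=2 %s %s\n' "$3" "$i" "$1" "$2"
        i=$((i + 1))
    done
}

repeat_runs pdb1 map45 A 3
```

Piping the output to a shell (repeat_runs pdb1 map45 A 3 | sh) runs the commands in sequence.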
Section 3.1
All the runs for the K149 problem using the default settings (other than stepfactor) yield the wrong result. The reasons are discussed in the paper.

Yup.scx --pdbprefix=B00- --stepfactor=2 pdb1 map49

This run took a few seconds under four minutes and produced a file B00-1AKE_A0R.pdb containing the fitted structure. See Figure 4 (bottom left) for the erroneous result. If you compare that with your result, you should see that the default settings give the wrong fit, and that repeated runs give consistently incorrect results.
Section 3.2
In this section, we collect representative solutions to each problem. A correct solution to the K149 problem can be obtained by reducing the radius ratio:

Yup.scx --pdbprefix=k149- --stepfactor=2 --outratio=3 pdb1 map49

This run took a few seconds over two minutes and produced a file k149-1AKE_A0R.pdb containing the fitted structure.

If you have not worked out a solution for the remaining problems, here are the relevant commands:

Yup.scx --pdbprefix=k145- --stepfactor=2 pdb1 map45

Yup.scx --pdbprefix=k415- --stepfactor=2 pdb4 map15

Yup.scx --pdbprefix=k419- --stepfactor=2 pdb4 map19

Each run should take about four minutes and you should have the following files: k145-1AKE_A0R.pdb, k415-4AKE_A0R.pdb, k419-4AKE_A0R.pdb.
You may find it easier to visually compare a solution structure with the exact solution to see how well the refinement works. It is hard to judge how well a structure fits into a map, particularly a low resolution map. The structural comparison is even clearer if you switch to a simplified display (e.g. ribbons).
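
A numerical companion to the visual comparison is the RMSD between a fitted structure and the exact solution without any superposition (the quantity called RMSD° earlier in this tutorial). The following is a minimal sketch and not the custom script we used. It assumes standard fixed-column PDB records, matches atoms by atom name, chain and residue number, and ignores atoms that occur in only one file. The name rmsd_no_fit is ours.

```shell
# rmsd_no_fit A.pdb B.pdb
# RMSD over atoms present in both files, matched by atom name (columns 13-16)
# and chain plus residue number (columns 22-27). No superposition is done.
rmsd_no_fit() {
    awk -v first="$1" '
        /^(ATOM|HETATM)/ {
            key = substr($0, 13, 4) substr($0, 22, 6)
            x = substr($0, 31, 8) + 0
            y = substr($0, 39, 8) + 0
            z = substr($0, 47, 8) + 0
            if (FILENAME == first) {
                ax[key] = x; ay[key] = y; az[key] = z
            } else if (key in ax) {
                dx = x - ax[key]; dy = y - ay[key]; dz = z - az[key]
                sum += dx * dx + dy * dy + dz * dz
                n++
            }
        }
        END {
            if (n == 0) { print "no matching atoms"; exit 1 }
            printf "%.3f\n", sqrt(sum / n)
        }
    ' "$1" "$2"
}

# Usage, comparing the K145 solution with its exact solution:
#   rmsd_no_fit k145-1AKE_A0R.pdb 4AKE_A0R.pdb
```

The two arguments must be different files. The result is in the same units as the coordinates (Å for PDB files).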