YUP.SCX: Tutorial
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
This tutorial is based on the text of a paper (see the Introduction), and it is easier to follow with the paper at hand. (This tutorial does not stand on its own!) The goal is to flexibly fit the structure of the closed form of a molecule into the map of the open conformer at two resolutions, and vice versa. Thus, the exact solutions to all four problems are known.
You will need to have YUP installed before you can run these examples. We also use the program corr_coef from the NMFF suite to calculate cross-correlation scores; a custom script or the VMD RMSD Calculator to calculate RMSD (RMSD° before alignment and superimposition, and RMSD after); and UCSF Chimera for molecular graphics. There are many alternatives to these programs.
In the following, each command is presented in a separate colored box. The command has to be typed on a single line, but your browser may display it over multiple lines. You can also copy the command from the browser, paste it into your terminal, and edit it to your taste before submitting it.
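The RMSD° value mentioned above (the deviation before any alignment) is simple enough to compute by hand. The following is a minimal sketch, not the custom script used for the paper: the function name `rmsd_unaligned` and the plain coordinate-list input are assumptions made for illustration. The RMSD after superimposition still needs a fitting step, for example with the VMD RMSD Calculator.

```python
import math


def rmsd_unaligned(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples.

    No superposition is performed; atoms are paired by list position.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures must have the same number of atoms")
    if not coords_a:
        raise ValueError("need at least one atom")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

The coordinates would typically be read from matching atoms (for instance, the alpha carbons) of the refined structure and the exact solution.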
This tutorial requires YUP version 1.071115 or higher (type Yup.which) and Yup.scx version 1.071023 or higher (type Yup.scx --version). |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Problems |
The objective in each problem is to refine a Starting structure to fit into the Target map. Of course, the exact solution is an ideal that is unattainable except by chance.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Download |
To run these examples, right-click to download yupscxtut.tgz [11,641,467 bytes] and extract the files into an empty directory:
You will get the following:
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Defaults |
Most Yup.scx options have default values. Since these can change from one version to another, we list the most important of the defaults that were current at the time the paper was written:
In the following, options are specified only when we want to override the default values. One option that is routinely specified is pdbprefix, because we want to preserve all the refined structures.
Since the publication of the paper, the program has gained new options:
Include the following options in all of the commands below if you want to reproduce the published results; otherwise, there should be no harm in using the default settings.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Convert |
Yup.scx needs an initial structure in the form of a partition map file and a target electron density map in the SCX format. The following commands create these files by conversion from the original files.
Note that two parameters are always required, and they must be placed after all the options. The first is the name of the partition map file; it must start with a letter, followed by letters or numbers (the .py suffix is assumed). The second is the name of the SCX file (the .scx suffix is assumed). The partition map file encodes the names of the original PDB files (there may be more than one) but does not contain any coordinates. Therefore, the PDB files are still required for the refinement. The SCX files are complete replacements for the XPLOR files.
In the following, numtimes is specified as a negative number to prevent any refinement calculations and the production of PDB output. (A value of 0 will cause the simulated annealing calculations to be skipped but not the energy minimization.)
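The naming rule for the partition map parameter (a letter followed by letters or numbers, with the .py suffix left off) can be checked before a run. The following is only a sketch of that rule as stated in this tutorial; the helper name `is_valid_partition_name` is hypothetical, and the restriction to ASCII letters and the rejection of a typed-out suffix are assumptions, not documented Yup.scx behavior.

```python
import re

# A letter first, then any number of letters or digits (ASCII assumed).
_PARTITION_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_valid_partition_name(name):
    """Return True if `name` follows the rule quoted in the tutorial.

    The name is given without the .py suffix, which is assumed to be
    added by the program.
    """
    return _PARTITION_NAME.fullmatch(name) is not None
```

Under these assumptions, names such as pdb1 and pdb4 pass, while a name that starts with a digit or contains a hyphen does not.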
A partition map file (pdb1.py) is created from the PDB file (1AKE_A0R.pdb), and the XPLOR file 4AKE_A0R5.xplor is converted into the SCX file map45.scx.
The XPLOR file 4AKE_A0R9.xplor is converted into the SCX file map49.scx. Note that the pdb1 parameter is simply a placeholder.
A partition map file (pdb4.py) is created from the PDB file (4AKE_A0R.pdb), and the XPLOR file 1AKE_A0R5.xplor is converted into the SCX file map15.scx.
The XPLOR file 1AKE_A0R9.xplor is converted into the SCX file map19.scx.
Note that we could have specified the source files (*.pdb and *.xplor) as part of a refinement calculation. However, we will be using the converted files more than once in the following exercises.
At this time, there should be six new files: map15.scx, map19.scx, map45.scx, map49.scx, pdb1.py, pdb4.py. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| In the following, the section numbers are from the paper. Execution times are given for a 2001 vintage 3.2 GHz Intel Xeon processor, which is roughly equivalent to a 2 GHz Intel Core Duo of 2006.
Each run starts with an initial velocity distribution that is generated by a random number generator seeded with the current time. Runs with the same settings will therefore have different outcomes if they are started at different times. On most machines, one second between runs is enough to prevent identical outcomes.
It is a good idea to redirect the output from each refinement to a file, although that will prevent you from seeing the progress of the calculation. In the following, commands are given without the file redirection. |
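The effect of seeding with the clock can be illustrated with Python's standard generator. This is a sketch only: the generator that Yup actually uses, and the resolution of its time seed, are not described here, so the whole-second seed and the function name `draw_velocities` are assumptions chosen to match the "one second between runs" remark.

```python
import random
import time


def draw_velocities(seed, count=3):
    """Draw `count` Gaussian numbers from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


# Assumption: the seed is the clock reading truncated to whole seconds.
seed_now = int(time.time())

# Two runs started within the same second share a seed and a distribution.
same_second = draw_velocities(seed_now) == draw_velocities(seed_now)

# A run started one second later gets a different seed.
next_second = draw_velocities(seed_now) == draw_velocities(seed_now + 1)

print(same_second, next_second)
```

Identical seeds reproduce the same numbers, which is why runs launched at the same instant would not be independent.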
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Section 2.8 |
In this section of the paper, we established the optimum duration for the cooling phase of the simulated annealing procedure. The equations of motion are integrated numerically at a given step size. The duration is the number of steps multiplied by the step size. We tried step sizes of 1 femtosecond (fs), 2.5 fs, and 5 fs. The duration study was carried out for the K419 problem; therefore, we will be using the partition map file pdb4.py and the SCX map file map19.scx.
The step size is not designed to be changed, but a secret option is provided for testing. The duration is controlled through the stepfactor option. The value of this option is multiplied by the number of atoms in the starting structure, rounded up to the next 1000, to yield the number of steps. We also want to save the structure from each run so that we can calculate the quality of the refinement. Thus, we specify a unique value for the pdbprefix option for each run.
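The relation between stepfactor, the number of steps, and the duration can be written out as a short calculation. This sketch makes two assumptions that are not confirmed by the program's documentation: the rounding is applied to the atom count (not to the product), and an atom count that is already a multiple of 1000 is left unchanged. The function names and the atom count used in the usage note are illustrative only.

```python
def number_of_steps(stepfactor, n_atoms, block=1000):
    """Steps = stepfactor x (atom count rounded up to a multiple of `block`)."""
    if n_atoms <= 0:
        raise ValueError("n_atoms must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    rounded_atoms = -(-n_atoms // block) * block
    return stepfactor * rounded_atoms


def duration_fs(stepfactor, n_atoms, step_size_fs):
    """Duration in femtoseconds = number of steps x step size."""
    return number_of_steps(stepfactor, n_atoms) * step_size_fs
```

For a hypothetical structure of 1656 atoms, the atom count rounds up to 2000, so a stepfactor of 4 gives 8000 steps. The same duration of 20000 fs is reached with a stepfactor of 10 at 1 fs, 4 at 2.5 fs, or 2 at 5 fs.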
We carried out the following runs and compiled the results in Figure 3 (of the paper).
A specific example:
This run takes 2 minutes and 3 seconds. The fitted structure is in the file Zd2-1AKE_A0R.pdb.
As one would expect, the execution time increases proportionally with stepfactor. For example:
This run finished in 44 minutes and produced the result file Zh1-1AKE_A0R.pdb.
We wrote a script to carry out all 24 fittings, including the calculations of the cross-correlation coefficients and the root mean square deviation between the refined structure and the exact solution. These calculations show that it is duration that matters, not the step size or the number of steps. |
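For readers who want to see what a cross-correlation coefficient measures, here is a minimal sketch using the common mean-subtracted (Pearson) definition over two density arrays sampled on the same grid. This is an assumption: the corr_coef program from NMFF may use a different normalization or masking, and the function name `cross_correlation` is hypothetical.

```python
import math


def cross_correlation(density_a, density_b):
    """Pearson correlation between two equally long sequences of densities."""
    if len(density_a) != len(density_b) or not density_a:
        raise ValueError("maps must be non-empty and of equal size")
    n = len(density_a)
    mean_a = sum(density_a) / n
    mean_b = sum(density_b) / n
    covariance = sum((a - mean_a) * (b - mean_b)
                     for a, b in zip(density_a, density_b))
    var_a = sum((a - mean_a) ** 2 for a in density_a)
    var_b = sum((b - mean_b) ** 2 for b in density_b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("a constant map has no defined correlation")
    return covariance / math.sqrt(var_a * var_b)
```

With this definition, a score near 1 means the two maps rise and fall together voxel by voxel, regardless of their absolute scale.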
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Section 2.10 |
We use a highly modified Gaussian Network Model (GNM). In this section, we show how the conventional GNM fails. In order to use a conventional elastic network, we have to override the default values of three options:
That run took three and a half minutes, and the results are saved in fx145-1AKE_A0R.pdb.
For the lower resolution map, the command is:
This run also took three and a half minutes. The refined structure is in fx149-1AKE_A0R.pdb.
The starting structure is a closed conformer, which we are trying to open to fit into the map of the open conformer. This is difficult because of the dense network of bonds in the conventional GNM. Notice that we have softened the force constants and reduced the cutoff distance to obtain a sparser elastic network. The fitted structure is actually not bad. If you display it along with the starting structure and the exact solution, you will probably see that the fitted structure is closer to the target than to the starting structure. Try a fitting with a hookecut of 5 Å (which is a more conventional choice); do not forget to set a different pdbprefix value. You should get a much worse fit. If you set hookeconst to a very small value like 0.1, you may get a better fit, but the stereochemistry will be poor.
In the highly modified GNM that we finally adopted as the default, the short bonds that are within a residue or between two connected residues are made very stiff, while the long bonds are made very weak. |
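How a cutoff distance controls the density of a conventional elastic network can be shown with a generic construction: every pair of atoms closer than the cutoff is joined by a spring with one uniform force constant. This is a sketch of the textbook idea, not Yup's implementation; the arguments `cutoff` and `force_constant` only mirror the roles that hookecut and hookeconst appear to play, and the tuple layout is an assumption.

```python
import math
from itertools import combinations


def build_elastic_network(coords, cutoff, force_constant):
    """Return springs as (i, j, rest_length, force_constant) tuples.

    Two atoms are connected when their distance does not exceed `cutoff`;
    the rest length of each spring is the starting distance.
    """
    springs = []
    for (i, a), (j, b) in combinations(enumerate(coords), 2):
        distance = math.dist(a, b)  # Euclidean distance (Python 3.8+)
        if distance <= cutoff:
            springs.append((i, j, distance, force_constant))
    return springs
```

For four atoms spaced 4 Å apart on a line, a cutoff of 5 Å connects only the three nearest-neighbor pairs, while a cutoff of 9 Å adds the two next-nearest pairs. The number of springs grows quickly with the cutoff, which is what makes a densely connected network hard to deform.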
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Section 2.9 |
Simulated annealing is a stochastic procedure, i.e., the outcome is not deterministic. In this section, we show that, at least for these four problems, this does not matter. The fitted structures cluster tightly together, and therefore there is a good chance of getting a good result from just one attempt. These runs are carried out with default options except for stepfactor:
In the following, the pdbprefix option is assigned a different value P for each run. The runs are carried out one after the other as many times as required. Running the cases serially ensures that every run starts with a different velocity distribution.
A specific example:
This run took a few seconds under four minutes and produced a file A00-1AKE_A0R.pdb containing the fitted structure. This command is very typical of how Yup.scx is used. The default options have been found from experience to be best and are therefore usually acceptable. In this case, the atomic model is small enough that we can afford the luxury of extending the duration to four times the default, which makes the results more consistent. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Section 3.1 |
All the runs for the K149 problem using the default settings (other than stepfactor) yield the wrong result. The reasons are discussed in the paper.
This run took a few seconds under four minutes and produced a file B00-1AKE_A0R.pdb containing the fitted structure. See Figure 4 (bottom left) for the erroneous result. If you compare that with your result, you can see that the default settings give the wrong fit; multiple runs should produce consistently incorrect results. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Section 3.2 |
In this section, we collect representative solutions to each problem. A correct solution to the K149 problem can be obtained by reducing the radius ratio:
This run took a few seconds over two minutes and produced a file k149-1AKE_A0R.pdb containing the fitted structure.
If you have not worked out a solution for the remaining problems, here are the relevant commands:
Each run should take about four minutes, and you should have the following files: k145-1AKE_A0R.pdb, k415-4AKE_A0R.pdb, k419-4AKE_A0R.pdb. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| You may find it easier to visually compare a solution structure with the exact solution to see how well the refinement works. It is hard to judge how well a structure fits into a map, particularly a low-resolution map. The structural comparison is even clearer if you switch to a simplified display (e.g., ribbons). |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||