NAME
mkd3d - generate descriptor for closed circular 3DNA model
SYNOPSIS
mkd3d [ options ] -n Nbp -c conf -h helf -d desf
mkd3d [ options ] -n Nbp -c conf -s seqf -d desf
mkd3d [ options ] -n Nbp -c conf -H vhel -d desf
DESCRIPTION
Generate a text descriptor file, named desf, for the 3DNA reduced representation model of DNA containing Nbp base pairs. The input is the constants file conf together with one of the following: the helical file helf (first form of the command), the sequence file seqf (second form), or the helical file vhel in the old AUGUR format (third form). The file helf, vhel or seqf must contain parameters for at least Nbp base pairs. Use of the old AUGUR format helical file is discouraged, so the third form of the command should not be used.
The generated force field takes its equilibrium dimensions from the helical file or, when a sequence file is specified instead, from values predicted from the base sequence using the Bolshoy rules. The force constants are taken from the constants file.
The options are:
-open: Generate a linear (open) model, in which the two ends of the DNA are not linked. By default the model is closed into a circle, with the two ends of the DNA linked.
-x C: Ignore the first C parameters in the helical file. The i-th base pair of the model will receive the helical parameters of the C+i-th base pair of the helical file.
-z Z: Generate a double well potential for the twist of the base pairs named in Z. The double well potential has energy minima at ±to degrees, where to is the equilibrium twist from the iton record of the constants file. Z is a contiguous string (no spaces) of the form "a,bb,c-dd,e(f)ggg", where a, bb, c, dd, e, f and ggg are positive whole numbers. The string expands to the following base pairs: a; bb; c, c+1, c+2, ... up to dd; and e, e+f, e+2f, ... up to ggg.
-aroll A: Generate anisotropic roll bending. By default A is one, in which case the roll angle is represented by a simple angle function with force constant kroll and equilibrium roll angle qo. When -aroll is specified, bending in the roll direction is represented by a piecewise power series: for roll angle 0 < q <= qo the potential function is kroll[q - qo]^2/A, and for roll angle qo < q <= 180 degrees the potential function is A kroll[q - qo]^2. This is an asymmetrical function with its minimum at qo.
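The -z string expansion and the potentials named by -z and -aroll can be sketched in Python as follows. This is an illustration under stated assumptions, not the mkd3d source. The function names expand_z, double_well and roll_energy are hypothetical. Duplicate base pairs are merged and the result is sorted. The double well expression is the one given for the iton record under FILES. The roll angle difference is converted to radians on the assumption that kroll is in erg/rad^2, as the constants file lists angle force constants.

```python
import math
import re

# One comma-separated token: "a", "c-dd", or "e(f)ggg".
_TOKEN = re.compile(r"(\d+)(?:(?:-|\((\d+)\))(\d+))?")


def expand_z(spec):
    """Expand a -z style string such as "a,bb,c-dd,e(f)ggg" into base pair numbers.

    Assumption: duplicates are merged and the result is returned sorted.
    """
    pairs = set()
    for token in spec.split(","):
        match = _TOKEN.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognized token {token!r}")
        start = int(match.group(1))
        step = int(match.group(2)) if match.group(2) else 1
        stop = int(match.group(3)) if match.group(3) else start
        if start < 1 or step < 1 or stop < start:
            raise ValueError(f"invalid range in token {token!r}")
        # "c-dd" uses step 1; "e(f)ggg" uses step f and never exceeds ggg.
        pairs.update(range(start, stop + 1, step))
    return sorted(pairs)


def double_well(t, barrier, to):
    """Double well twist energy E [t^2 - to^2]^2 / to^4 (t and to in the same unit)."""
    if to == 0:
        raise ValueError("equilibrium twist must be non-zero")
    return barrier * (t * t - to * to) ** 2 / to ** 4


def roll_energy(q_deg, kroll, qo_deg, a=1.0):
    """Anisotropic roll energy for a roll angle q in degrees, 0 < q <= 180.

    Assumption: kroll is in erg/rad^2, so the angle difference is
    converted from degrees to radians before squaring.
    """
    if not 0.0 < q_deg <= 180.0:
        raise ValueError("roll angle must satisfy 0 < q <= 180 degrees")
    dq = math.radians(q_deg - qo_deg)
    if q_deg <= qo_deg:
        return kroll * dq * dq / a   # softer side when a > 1
    return a * kroll * dq * dq       # stiffer side when a > 1
```

With a equal to one, roll_energy reduces to the simple symmetric angle function. For a greater than one, the same deviation from qo costs a^2 times more energy above qo than below it.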
In the 3DNA model, each base pair of the DNA is represented by three particles. The first is located in the major groove and is named "FRON", the second is located on one of the backbones and is named "LEFT", and the last is located in the center of the base pair and is named "CENT". Each group of three particles may be assigned to one of four groups named "A", "C", "G" or "T", and different physical properties may be assigned according to the group. (These names appear in the name record of the generated descriptor.)
A text descriptor is generated rather than the binary version. This is to allow modification of the descriptor. One would almost certainly want to add notes to the descriptor (see the note record of the descriptor file format). Once a final descriptor is obtained, convert it to binary. See des.
EXAMPLES
mkd3d -n 360 -c 3dna.cnt -h circ360.hel -d circ360.dx
Create a text descriptor named circ360.dx for a 360 base pair 3DNA model using the force constants from 3dna.cnt and the equilibrium dimensions from the helical file circ360.hel.
mkd3d -s c1500.seq -d c1500.dx -n 1500 -open -c 3dna.cnt
Create a text descriptor named c1500.dx for a 1500 base pair 3DNA model that is open (linear), using the force constants from 3dna.cnt and the equilibrium dimensions predicted from the base sequence in c1500.seq using the Bolshoy rules.
FILES
The constants file (-c) has the following format:
The file starts with the header line
    CON1 1000 Nclass
The keyword CON1 identifies the file as a constants file. The number that follows is the version number, which must be 1000. The last number, Nclass, is the maximum number of classes; it must be larger than the number of bonds, angles, improper torsions and so on that follow in the file. The records that follow are identified by keywords and can appear in any order. Comments run from the character # to the end of the line and may appear between records but not within them. The records must conform to the 3DNA model, so each record has a rather strict internal layout.
The bond record is laid out as follows:
    bond Nbond
    kbij boij
    ...  ...
Following the keyword bond is the number of bond records that follow, Nbond. Each bond record consists of the force constant kbij [erg/Å^2] and the equilibrium bond length boij [Å]. In practice, Nbond is always 12. The first bond record must be for the CENT(i-1)-CENT(i) bond (the index in parentheses denotes the base pair number), the second for CENT(i)-FRON(i) and the third for CENT(i)-LEFT(i). The first three records are for base type A; the next groups of three are for C, G and T in that order, each group covering the same three bonds. Typically the equilibrium bond lengths are 3.4, 2.5 and 5 Å respectively.
The angle record is laid out as follows:
    angl Nangl
    kqijk qoijk
    ...   ...
Following the keyword angl is the number of angle records that follow, Nangl. Each angle record consists of the force constant kqijk [erg/rad^2] and the equilibrium angle qoijk [deg]. In practice, Nangl is always 20: the first five angles are for base type A and the next groups of five are for C, G and T in that order. In each group of five angles, the first is CENT(i-1)-CENT(i)-FRON(i) (the "roll" angle), the second CENT(i-1)-CENT(i)-LEFT(i) (the "tilt" angle), the third FRON(i)-CENT(i)-LEFT(i), the fourth LEFT(i)-CENT(i)-CENT(i+1) and the fifth FRON(i)-CENT(i)-CENT(i+1). The equilibrium values of the first two angles are ignored; they are instead taken from the helical file or predicted using the Bolshoy rules. The last three angles are usually assigned equilibrium values of 90 degrees to keep the base plane regular.
The improper torsion record is laid out as follows:
    itor Nitor
    ktijkl toijkl
    ...    ...
Following the keyword itor is the number of improper torsion records, Nitor. Each improper torsion record consists of the force constant ktijkl [erg/rad^2] and the equilibrium torsion angle toijkl [deg]. In practice, Nitor is always 4. The torsion angle is defined by the atoms LEFT(i-1)-CENT(i-1)-CENT(i)-LEFT(i) (the "twist" angle), and the four torsions are for base types A, C, G and T in that order. The equilibrium torsion angle is ignored; it is instead taken from the helical file or generated from the base sequence using the Bolshoy rules.
The torsion power series record is laid out as follows:
    iton Niton
    Eijkl toijkl
    ...   ...
Following the keyword iton is the number of torsion power series terms, Niton. Each record consists of an energy barrier Eijkl [erg] and the equilibrium angle toijkl. This record is used only when the -z option is active. In practice, Niton is always 4 and the four records are for base types A, C, G and T in that order. A power series in torsion is used to represent the function Eijkl [tijkl^2 - toijkl^2]^2 / toijkl^4, which is a symmetrical function with zeros at torsion angles ±toijkl and a barrier of Eijkl at zero torsion.
The non-bond record is laid out as follows:
    nbn+ Nnb n Di-j Dj-j
    knij doij
    ...  ...
Following the keyword nbn+ is the number of non-bond records Nnb, the power n and the exclusion parameters Di-j and Dj-j. In the 3DNA model, the base pairs, each represented by three particles, are connected through CENT to form a long chain. Only CENT has any bulk, and advantage is taken of the fact that DNA is locally stiff. This means that the non-bond interactions need to be evaluated for only a small number of atoms. In addition to ignoring two thirds of the atoms (the FRON and LEFT atoms), for each CENT atom non-bond interactions are considered only with other CENT atoms that are separated from it by at least Di-j bonds, and then only with every Dj-j-th CENT atom thereafter. Each non-bond record consists of the force constant knij [erg/Å^2] and the contact exclusion distance doij [Å]. In practice, Nnb is always 4 and the non-bond records are given for A, C, G and T in that order.
The mass record is laid out as follows:
    mass Nmass
    Mi
    ...
Following the keyword mass is the number of mass records, Nmass. The masses Mi are in units of amu. In practice, Nmass is always 12: four groups of masses are given, for the base types A, C, G and T in that order. Each group gives the masses of FRON, LEFT and CENT in that order.
The helical file format (-h) has the following layout:
The first line of the file must contain the following five words. They can be separated by any number of spaces but must appear in exactly this order:
    Base Rise[ang] Twist[deg] Roll[deg] Tilt[deg]
The helical parameters are then assumed to follow and are read as follows:
    Bi  Rise Twist Roll Tilt
    ... ...  ...   ...  ...
where Bi is a string that starts with a single character denoting the base symbol, followed without an intervening space by the base pair number; for example, G1501 for guanine at base pair number 1501. Only the following base symbols are recognized: A, C, G and T. This is followed by four numbers: the rise in Å and the twist, roll and tilt angles in degrees.
The sequence file format (-s) has the following layout:
The sequence file may contain any number of text lines, which are ignored until the occurrence of two contiguous periods.
    text
    ..
Once the two periods appear together, the comment is assumed to have ended and the rest of the file contains the base sequence. Columns 1 to 10 are ignored. (They may contain a sequence number to keep track of the lines.) The base sequence starts in column 11 and the base symbols may be separated by any amount of white space.
    1234567890
              A CCGT TT AAAA GCGGG...
              ...
The only base symbols that are allowed to appear are A, C, G and T. The base sequence is then used to generate the helical parameters using the Bolshoy, McNamara and Trifonov rules. The helical parameters are then used as equilibrium dimensions in the generated model.
The old AUGUR helical file (-H) is laid out as follows:
The file starts with the header line
    HEL1 1000
The file identifier is the string HEL1, which must start in column one of line one. It is followed by the file version number, which must be 1000.
This is followed by any number of lines until one that has the format:
     Helix Parameters (degrees), bp:bp[A]= R
This line must start in column two (with a space in column one) and be exactly as shown above; the only variable is R, the rise in Å.
The appearance of this line signals the start of the data lines, which are read as:
    Bi  Tw  Ro  Ti  Bi  Tw  Ro  Ti
    ... ... ... ... ... ... ... ...
where Bi is a string starting with a single letter denoting the base symbol B (A, C, G or T) followed by the number i of the base pair, e.g. A301 for adenine at base pair 301. Tw, Ro and Ti are the twist, roll and tilt angles respectively, all expressed in degrees. Two sets of base symbol and helical parameters are expected on each line. The first Bi starts in column 5 and the second starts in column 45. The angles can be placed freely within these limits, separated by spaces. The helical parameters need not appear in sequence order: the parameters are assigned to the i-th base pair as read from this file. If the same base pair appears more than once, the parameters read last take effect. If parameters are not defined for a base pair, a descriptor is still produced but it will not be of any use.
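As an illustration of the helical file format (-h) described above, the following Python sketch reads the header line and the Bi Rise Twist Roll Tilt records. It is not the mkd3d reader. The names HelicalRecord and read_helical are hypothetical, the skip argument mimics the -x option, and skipping blank lines is an assumption that the format description neither confirms nor rules out.

```python
from typing import Iterable, List, NamedTuple

# The five words required on the first line, in this exact order.
HEADER = ["Base", "Rise[ang]", "Twist[deg]", "Roll[deg]", "Tilt[deg]"]


class HelicalRecord(NamedTuple):
    base: str      # one of A, C, G, T
    number: int    # base pair number taken from the Bi label
    rise: float    # in Angstrom
    twist: float   # in degrees
    roll: float    # in degrees
    tilt: float    # in degrees


def read_helical(lines: Iterable[str], nbp: int, skip: int = 0) -> List[HelicalRecord]:
    """Return nbp records after ignoring the first `skip` records (as -x does)."""
    rows = iter(lines)
    if next(rows, "").split() != HEADER:
        raise ValueError("first line must be: " + " ".join(HEADER))
    records = []
    for line in rows:
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are skipped
        if len(fields) != 5:
            raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
        label = fields[0]
        if label[0] not in "ACGT" or not label[1:].isdigit():
            raise ValueError(f"bad base label {label!r}")
        rise, twist, roll, tilt = (float(x) for x in fields[1:])
        records.append(HelicalRecord(label[0], int(label[1:]), rise, twist, roll, tilt))
    selected = records[skip:skip + nbp]
    if len(selected) < nbp:
        raise ValueError("file does not hold parameters for enough base pairs")
    return selected
```

Passing an open file object as lines works because the function only iterates over it. The final length check mirrors the requirement that the file contain parameters for at least Nbp base pairs.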
SEE ALSO
mkd3c, xmd3, des, the descriptor file format.
DIAGNOSTICS
BUGS
NOTES
mkd3d is one of three programs specifically designed for the 3DNA model. In this model, each base pair of a DNA double helix is represented by only three masses. This reduces the size of the problem by one to two orders of magnitude, allowing large pieces of DNA to be studied.