Input File Format
CS23D2.0 accepts and processes backbone and side chain 1H, 13C or 15N chemical shift data in almost any
combination (HA only, HN only, HA+HN only, HA+HN+sidechain H, CA only, CA+CB only, CA+CO only,
HA+CA+CB, HN+CA+CB, HN+15N only, HN+15N+CA, HN+15N+CA+CB, etc.). This allows CS23D2.0 to
handle anything from small peptides (where typically only H shifts are measured) to large proteins
(where only N or C shifts might be available).
The input file must include sequence data and chemical shift data in either BMRB STAR 2.1 (or 2.1.1)
format or SHIFTY format. The minimum sequence length is 3 residues. The maximum is 1000 residues.
BMRB Format
Examples of allowable BMRB files (with and without different headers) are shown below:
Example #1: This is an example of a generic BMRB file extracted from the BMRB. The entire
file is ~500 lines, and only a portion is shown here. The header is not important for CS23D2.0
data processing; only the chemical shift list (at the bottom of the file) is used. CS23D2.0 ignores
most (if not all) of the header text.
data_548
#######################
# Entry information #
#######################
save_entry_information
_Saveframe_category entry_information
_Entry_title
;
Sequence-Specific 1H NMR Assignment and Secondary Structure of Neuropeptide Y in
Aqueous Solution
;
loop_
_Author_ordinal
_Author_family_name
_Author_given_name
_Author_middle_initials
_Author_family_title
1 Saudek Vladimir . .
2 Pelton John T. .
stop_
_BMRB_accession_number 548
_BMRB_flat_file_name bmr548.str
_Entry_type revision
_Submission_date 1995-07-31
_Accession_date 1996-04-12
_Entry_origination BMRB
_NMR_STAR_version 2.1
_Experimental_method NMR
ETC.
ETC.
loop_
_Atom_shift_assign_ID
_Residue_seq_code
_Residue_label
_Atom_name
_Atom_type
_Chem_shift_value
_Chem_shift_value_error
_Chem_shift_ambiguity_code
1 1 TYR HA H 4.53 . 1
2 1 TYR HB2 H 3.05 . 2
3 1 TYR HB3 H 3.28 . 2
4 1 TYR HD1 H 7.28 . 1
5 1 TYR HD2 H 7.28 . 1
6 1 TYR HE1 H 6.93 . 1
7 1 TYR HE2 H 6.93 . 1
8 2 PRO HA H 4.59 . 1
9 2 PRO HB2 H 2.01 . 2
10 2 PRO HB3 H 2.39 . 2
11 2 PRO HG2 H 1.48 . 1
12 2 PRO HG3 H 1.48 . 1
13 2 PRO HD2 H 3.38 . 2
14 2 PRO HD3 H 3.74 . 2
15 3 SER H H 8.42 . 1
16 3 SER HA H 4.38 . 1
17 3 SER HB2 H 3.83 . 1
18 3 SER HB3 H 3.83 . 1
Example #2: This is an example of a slightly shortened BMRB format where only the assigned
chemical shift section of the BMRB file is provided.
##############################
# assigned chemical shifts #
##############################
save_assigned_chem_shift_list_1
_Saveframe_category assigned_chemical_shifts
loop_
_Software_label
$NMRPipe
stop_
loop_
_Sample_label
$sample_1
$sample_2
stop_
_Sample_conditions_label $sample_conditions_1
_Chem_shift_reference_set_label $chemical_shift_reference_1
_Mol_system_component_name entity_1
loop_
_Atom_shift_assign_ID
_Residue_author_seq_code
_Residue_seq_code
_Residue_label
_Atom_name
_Atom_type
_Chem_shift_value
_Chem_shift_value_error
_Chem_shift_ambiguity_code
1 1 1 GLY HA2 H 4.44 0.0300 2
2 1 1 GLY HA3 H 3.72 0.0300 2
3 1 1 GLY CA C 44.81 0.4000 1
4 2 2 SER H H 8.70 0.0300 1
5 2 2 SER N N 121.24 0.4000 1
6 4 4 MET HA H 4.30 0.0300 1
7 4 4 MET HB2 H 2.11 0.0300 2
8 4 4 MET HB3 H 1.94 0.0300 2
9 4 4 MET HG2 H 2.30 0.0300 2
10 4 4 MET HG3 H 2.30 0.0300 2
11 4 4 MET C C 172.22 0.4000 1
12 4 4 MET CA C 55.62 0.4000 1
13 4 4 MET CB C 29.60 0.4000 1
Example #3: This is an example of the simplest BMRB format that CS23D2.0 accepts. Only the
chemical shift list is provided, with no preceding data tags. The number of columns in this
example is 9.
1 1 1 GLY HA2 H 4.44 0.0300 2
2 1 1 GLY HA3 H 3.72 0.0300 2
3 1 1 GLY CA C 44.81 0.4000 1
4 2 2 SER H H 8.70 0.0300 1
5 2 2 SER N N 121.24 0.4000 1
6 4 4 MET HA H 4.30 0.0300 1
7 4 4 MET HB2 H 2.11 0.0300 2
8 4 4 MET HB3 H 1.94 0.0300 2
9 4 4 MET HG2 H 2.30 0.0300 2
10 4 4 MET HG3 H 2.30 0.0300 2
11 4 4 MET C C 172.22 0.4000 1
12 4 4 MET CA C 55.62 0.4000 1
13 4 4 MET CB C 29.60 0.4000 1
Example #4: This is another example of a simplified BMRB format that CS23D2.0 also accepts.
The number of data columns in this example is 8, which is the minimum number of columns that
CS23D2.0 accepts. If no data are available for the chemical shift error or ambiguity code, these
values can be replaced by a period (as seen in this example).
loop_
_Atom_shift_assign_ID
_Residue_author_seq_code
_Residue_seq_code
_Residue_label
_Atom_name
_Atom_type
_Chem_shift_value
_Chem_shift_value_error
_Chem_shift_ambiguity_code
1 1 GLY HA2 H 4.44 . .
2 1 GLY HA3 H 3.72 . .
3 1 GLY CA C 44.81 . .
4 2 SER H H 8.70 . .
5 2 SER N N 121.24 . .
6 4 MET HA H 4.30 . .
7 4 MET HB2 H 2.11 . .
8 4 MET HB3 H 1.94 . .
9 4 MET HG2 H 2.30 . .
10 4 MET HG3 H 2.30 . .
11 4 MET C C 172.22 . .
12 4 MET CA C 55.62 . .
13 4 MET CB C 29.60 . .
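The 8- and 9-column layouts described above can be read with a few lines of code. The following is a minimal sketch in Python, assuming the column order shown in Examples #3 and #4; the names ShiftRow and parse_shift_row are hypothetical and this is not the parser that CS23D2.0 itself uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShiftRow:
    residue_number: int
    residue_name: str
    atom_name: str
    atom_type: str
    shift_ppm: float
    shift_error: Optional[float]   # None when the file holds a period
    ambiguity_code: Optional[str]  # None when the file holds a period


def parse_shift_row(line: str) -> ShiftRow:
    """Parse one 8- or 9-column row of a simplified BMRB shift list."""
    fields = line.split()  # any run of spaces or tabs separates columns
    if len(fields) == 9:
        # Assumed 9-column order: ID, author seq code, seq code, residue,
        # atom, type, shift, error, ambiguity. Drop the author seq code.
        fields = fields[:1] + fields[2:]
    if len(fields) != 8:
        raise ValueError(f"expected 8 or 9 columns, got {len(fields)}")
    _, seq, res, atom, atype, shift, err, amb = fields
    return ShiftRow(
        residue_number=int(seq),
        residue_name=res,
        atom_name=atom,
        atom_type=atype,
        shift_ppm=float(shift),
        shift_error=None if err == "." else float(err),
        ambiguity_code=None if amb == "." else amb,
    )
```

For instance, parse_shift_row("3 1 GLY CA C 44.81 . .") gives a row for residue 1 (GLY), atom CA, with a shift of 44.81 ppm and no error or ambiguity value.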
Example #5: Here is another example of an acceptable BMRB format. In this case the
assignment loop tags are written in upper case (instead of the usual lower case). The number of
data columns is 9, even though the Author_seq_code and residue_seq_code are duplicated.
loop_
_ATOM_SHIFT_ASSIGN_ID
_RESIDUE_AUTHOR_SEQ_CODE
_RESIDUE_SEQ_CODE
_RESIDUE_LABEL
_ATOM_NAME
_ATOM_TYPE
_CHEM_SHIFT_VALUE
_CHEM_SHIFT_VALUE_ERROR
_CHEM_SHIFT_AMBIGUITY_CODE
1 1 1 GLY HA2 H 4.44 0.0300 .
2 1 1 GLY HA3 H 3.72 0.0300 .
3 1 1 GLY CA C 44.81 0.4000 .
4 2 2 SER H H 8.70 0.0300 .
5 2 2 SER N N 121.24 0.4000 .
6 4 4 MET HA H 4.30 0.0300 .
7 4 4 MET HB2 H 2.11 0.0300 .
8 4 4 MET HB3 H 1.94 0.0300 .
9 4 4 MET HG2 H 2.30 0.0300 .
10 4 4 MET HG3 H 2.30 0.0300 .
11 4 4 MET C C 172.22 0.4000 .
12 4 4 MET CA C 55.62 0.4000 .
13 4 4 MET CB C 29.60 0.4000 .
Example #6: In this example the data is presented in a tab-delimited format rather than
following the usual 3-character spacing found in most BMRB files. Comments have also been
added below the chemical shift assignment loop and above the data columns. This format
(and modest variations of it) is also accepted by CS23D2.0.
loop_
_ATOM_CHEM_SHIFT.ID
_ATOM_CHEM_SHIFT.COMP_INDEX_ID
_ATOM_CHEM_SHIFT.COMP_ID
_ATOM_CHEM_SHIFT.ATOM_ID
_ATOM_CHEM_SHIFT.ATOM_TYPE
_ATOM_CHEM_SHIFT.VAL
_ATOM_CHEM_SHIFT.VAL_ERR
_ATOM_CHEM_SHIFT.AMBIGUITY_CODE
_ATOM_CHEM_SHIFT.OCCUPANCY
#
# some comments placed here
# more comments
#
1 1 GLY HA2 H 4.44 0.0300 2
2 1 GLY HA3 H 3.72 0.0300 2
3 1 GLY CA C 44.81 0.4000 1
4 2 SER H H 8.70 0.0300 1
5 2 SER N N 121.24 0.4000 1
6 4 MET HA H 4.30 0.0300 1
7 4 MET HB2 H 2.11 0.0300 2
8 4 MET HB3 H 1.94 0.0300 2
9 4 MET HG2 H 2.30 0.0300 2
10 4 MET HG3 H 2.30 0.0300 2
11 4 MET C C 172.22 0.4000 1
12 4 MET CA C 55.62 0.4000 1
13 4 MET CB C 29.60 0.4000 1
Example #7: In this example the data is presented in a single-space-delimited format rather
than following the usual 3-character spacing found in most BMRB files. Comments have also
been added below the chemical shift assignment loop and above the data columns. This
format (and modest variations of it) is also accepted by CS23D2.0.
loop_
_ATOM_CHEM_SHIFT.ID
_ATOM_CHEM_SHIFT.COMP_INDEX_ID
_ATOM_CHEM_SHIFT.COMP_ID
_ATOM_CHEM_SHIFT.ATOM_ID
_ATOM_CHEM_SHIFT.ATOM_TYPE
_ATOM_CHEM_SHIFT.VAL
_ATOM_CHEM_SHIFT.VAL_ERR
_ATOM_CHEM_SHIFT.VAL_ERROR
_ATOM_CHEM_SHIFT.AMBIGUITY_CODE
_ATOM_CHEM_SHIFT.OCCUPANCY
_ATOM_CHEM_SHIFT.DETAILS
#
# some comments placed here
# more comments
1 1 1 GLY HA2 H 4.44 0.03 2.
2 1 1 GLY HA3 H 3.72 0.03 2.
3 1 1 GLY CA C 44.81 0.4 1.
4 2 2 SER H H 8.70 0.03 1.
5 2 2 SER N N 121.24 0.4 1.
6 4 4 MET HA H 4.30 0.03 2.
7 4 4 MET HB2 H 2.11 0.03 2.
8 4 4 MET HB3 H 1.94 0.03 2.
9 4 4 MET HG2 H 2.30 0.03 2.
10 4 4 MET HG3 H 2.30 0.03 1.
11 4 4 MET C C 172.22 0.4 1.
12 4 4 MET CA C 55.62 0.4 1.
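Examples #6 and #7 differ from the earlier ones only in their delimiters and comment lines. As a rough illustration, a script that prepares or checks a file before upload might separate data rows from tags and comments as sketched below in Python. The function name iter_data_rows and its filtering rules are assumptions made for this sketch, not the CS23D2.0 implementation, and the sketch is meant for files that contain only the chemical shift loop.

```python
def iter_data_rows(text: str):
    """Yield whitespace-split data rows, skipping tags, comments and keywords."""
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, '#' comments and '_Tag' lines.
        if not line or line.startswith("#") or line.startswith("_"):
            continue
        # Skip loop keywords in either case (loop_, LOOP_, stop_).
        if line.lower() in ("loop_", "stop_"):
            continue
        fields = line.split()  # handles tabs, single spaces and wider spacing
        # Assumption: a shift row starts with an integer assignment ID and
        # has at least the 8 columns stated as the minimum above.
        if fields[0].isdigit() and len(fields) >= 8:
            yield fields
```

Because str.split() with no argument treats any run of whitespace as one separator, the same function handles tab-delimited, single-space-delimited and column-aligned rows.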
Example #8: This is an example of a generic file in the new BMRB format, extracted from the BMRB.
LOOP_
_ATOM_CHEM_SHIFT.ID
_ATOM_CHEM_SHIFT.ASSEMBLY_ATOM_ID
_ATOM_CHEM_SHIFT.ENTITY_ASSEMBLY_ID
_ATOM_CHEM_SHIFT.ENTITY_ID
_ATOM_CHEM_SHIFT.COMP_INDEX_ID
_ATOM_CHEM_SHIFT.SEQ_ID
_ATOM_CHEM_SHIFT.COMP_ID
_ATOM_CHEM_SHIFT.ATOM_ID
_ATOM_CHEM_SHIFT.ATOM_TYPE
_ATOM_CHEM_SHIFT.ATOM_ISOTOPE_NUMBER
_ATOM_CHEM_SHIFT.VAL
_ATOM_CHEM_SHIFT.VAL_ERR
_ATOM_CHEM_SHIFT.ASSIGN_FIG_OF_MERIT
_ATOM_CHEM_SHIFT.AMBIGUITY_CODE
_ATOM_CHEM_SHIFT.OCCUPANCY
_ATOM_CHEM_SHIFT.RESONANCE_ID
_ATOM_CHEM_SHIFT.AUTH_ENTITY_ASSEMBLY_ID
_ATOM_CHEM_SHIFT.AUTH_SEQ_ID
_ATOM_CHEM_SHIFT.AUTH_COMP_ID
_ATOM_CHEM_SHIFT.AUTH_ATOM_ID
_ATOM_CHEM_SHIFT.DETAILS
_ATOM_CHEM_SHIFT.ENTRY_ID
_ATOM_CHEM_SHIFT.ASSIGNED_CHEM_SHIFT_LIST_ID
1 . 1 1 1 1 LYS HA H 1 4.133 0.000 . 1 . . . 1 K HA . 16747 1
2 . 1 1 1 1 LYS HB2 H 1 1.685 0.000 . 2 . . . 1 K HB . 16747 1
3 . 1 1 1 1 LYS HB3 H 1 1.685 0.000 . 2 . . . 1 K HB . 16747 1
4 . 1 1 1 1 LYS HD2 H 1 1.435 0.000 . 2 . . . 1 K HD2 . 16747 1
5 . 1 1 1 1 LYS HD3 H 1 1.401 0.000 . 2 . . . 1 K HD3 . 16747 1
6 . 1 1 1 1 LYS HE2 H 1 2.830 0.000 . 2 . . . 1 K HE . 16747 1
7 . 1 1 1 1 LYS HE3 H 1 2.830 0.000 . 2 . . . 1 K HE . 16747 1
8 . 1 1 1 1 LYS HG2 H 1 1.334 0.000 . 2 . . . 1 K HG . 16747 1
9 . 1 1 1 1 LYS HG3 H 1 1.334 0.000 . 2 . . . 1 K HG . 16747 1
10 . 1 1 1 1 LYS CA C 13 51.650 0.000 . 1 . . . 1 K CA . 16747 1
11 . 1 1 1 1 LYS CB C 13 29.270 0.000 . 1 . . . 1 K CB . 16747 1
12 . 1 1 1 1 LYS CD C 13 26.130 0.000 . 1 . . . 1 K CD . 16747 1
13 . 1 1 1 1 LYS CE C 13 39.360 0.000 . 1 . . . 1 K CE . 16747 1
SHIFTY
SHIFTY is a simplified chemical shift data entry format developed in the Sykes Lab in
1991 and is one of the more common “alternate” formats for chemical shift information.
Examples of allowable SHIFTY formats are shown below (note that any combination of shifts
may be listed in any order, as long as the columns are labeled with a header). The first-line header
is essential. The header can be aligned with the column positions or it can be presented as a
single-spaced row. At a minimum, a SHIFTY file must have 3 columns: a residue number column, a
single-letter residue name column and a chemical shift column. Unmeasured or undetectable
chemical shifts can be entered as 0.00, – or *.
# AA HA HN N15 CA CB CO
1 M 4.6128 8.3509 128.1401 55.5746 33.1840 174.0504
2 F 5.1658 9.1754 128.0914 56.8722 43.2068 172.6446
3 Q 5.0880 7.8251 122.4598 54.4658 32.9175 174.3090
4 Q 4.6980 8.4214 119.1251 54.3607 33.5503 173.9477
5 E 5.1262 8.3247 122.6401 54.8529 31.9685 176.1557
6 V 4.5204 8.4684 123.4184 61.4330 34.6444 173.0311
7 T 4.9002 8.2696 119.8067 62.2487 70.0431 174.1138
8 I 4.1698 8.8360 129.2597 61.8793 37.2884 176.4472
9 T 4.4136 8.2868 115.9694 60.8221 70.1452 174.6432
10 A 4.2796 8.0655 127.7723 50.9885 19.0033 176.6414
11 P 4.3562 0.0000 0.0000 65.5591 31.2252 177.2392
12 N 4.8824 7.8942 112.1161 52.5902 39.2484 177.0207
13 G 3.7309 7.5941 106.4993 46.8305 0.0000 174.5358
14 L 4.6853 9.7859 121.2612 53.1092 41.6631 175.3041
15 D 4.6986 7.0435 114.6080 52.0224 40.8042 177.3864
16 T 4.0677 7.8732 114.9997 67.0623 68.7506 177.2631
17 R 3.9316 8.0671 119.4180 60.4646 30.5755 177.9282
18 P 4.2658 0.0000 0.0000 65.3875 30.9009 178.6357
19 A 4.0015 8.5778 121.5522 55.2170 18.1581 179.5463
20 A 4.0493 7.9442 119.6336 55.1010 18.1309 179.7605
21 Q 4.0158 7.9651 115.7440 58.4227 28.2881 178.1323
22 F 4.1284 8.6923 121.2872 61.8092 39.3486 177.1596
23 V 4.0272 8.4435 118.5810 65.9995 31.2267 178.5363
24 K 3.9445 7.8277 117.7576 58.7971 31.7623 178.6483
Example 2: Here is an example where only HA, HN and N15 shifts are presented. The header
spacing is aligned with the columns in this case, although alignment is not necessary.
# AA HA HN N15
1 M 4.6128 8.3509 128.1401
2 F 5.1658 9.1754 128.0914
3 Q 5.0880 7.8251 122.4598
4 Q 4.6980 8.4214 119.1251
5 E 5.1262 8.3247 122.6401
6 V 4.5204 8.4684 123.4184
7 T 4.9002 8.2696 119.8067
Example 3: Acceptable SHIFTY format can include any of the following column headers,
where the # sign is replaced by “NUM”, “>” or “#NUM”:
# AA HA HN N15 CA CB CO
1 M 4.6128 8.3509 128.1401 55.5746 33.1840 174.0504
2 F 5.1658 9.1754 128.0914 56.8722 43.2068 172.6446
3 Q 5.0880 7.8251 122.4598 54.4658 32.9175 174.3090
4 Q 4.6980 8.4214 119.1251 54.3607 33.5503 173.9477
5 E 5.1262 8.3247 122.6401 54.8529 31.9685 176.1557
or
NUM AA HA HN N15 CA CB CO
1 M 4.6128 8.3509 128.1401 55.5746 33.1840 174.0504
2 F 5.1658 9.1754 128.0914 56.8722 43.2068 172.6446
3 Q 5.0880 7.8251 122.4598 54.4658 32.9175 174.3090
4 Q 4.6980 8.4214 119.1251 54.3607 33.5503 173.9477
5 E 5.1262 8.3247 122.6401 54.8529 31.9685 176.1557
or
> AA HA HN N15 CA CB CO
1 M 4.6128 8.3509 128.1401 55.5746 33.1840 174.0504
2 F 5.1658 9.1754 128.0914 56.8722 43.2068 172.6446
3 Q 5.0880 7.8251 122.4598 54.4658 32.9175 174.3090
4 Q 4.6980 8.4214 119.1251 54.3607 33.5503 173.9477
5 E 5.1262 8.3247 122.6401 54.8529 31.9685 176.1557
or
#NUM AA HA HN N15 CA CB CO
1 M 4.6128 8.3509 128.1401 55.5746 33.1840 174.0504
2 F 5.1658 9.1754 128.0914 56.8722 43.2068 172.6446
3 Q 5.0880 7.8251 122.4598 54.4658 32.9175 174.3090
4 Q 4.6980 8.4214 119.1251 54.3607 33.5503 173.9477
5 E 5.1262 8.3247 122.6401 54.8529 31.9685 176.1557
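The SHIFTY rules above (a required header line whose first token is #, NUM, > or #NUM, at least 3 columns, and 0.00, – or * for missing shifts) can be sketched in Python as follows. The function name parse_shifty is hypothetical, and accepting a plain hyphen in addition to the dash character is an assumption; this is an illustration of the format, not the reader used by CS23D2.0.

```python
# Markers for a missing shift. Assumption: a plain hyphen is accepted
# alongside the dash character and the asterisk.
MISSING = {"-", "–", "*"}


def parse_shifty(text: str):
    """Return (shift column names, rows) for a SHIFTY-format string.

    Each row is (residue number, one-letter residue, {column: shift or None}).
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    if header[0] not in ("#", "NUM", ">", "#NUM") or len(header) < 3:
        raise ValueError("first line must be a SHIFTY header such as '# AA HA'")
    columns = header[2:]  # header[1] labels the residue name column (AA)
    rows = []
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) != len(header):
            raise ValueError(f"row has {len(fields)} fields, header has {len(header)}")
        shifts = {}
        for name, value in zip(columns, fields[2:]):
            # 0.00 (in any precision) also marks an unmeasured shift.
            if value in MISSING or float(value) == 0.0:
                shifts[name] = None
            else:
                shifts[name] = float(value)
        rows.append((int(fields[0]), fields[1], shifts))
    return columns, rows
```

Splitting on whitespace means the header does not need to line up with the data columns, which matches the note that alignment is not necessary.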
Output File Format
An email will be sent to the address entered in the email box. The email should include the
final PDB structure if the program successfully generated one. From that email, the user can
also view the 3D structure of the protein. Lastly, statistical information on the quality of the
structure is made available to the recipient. The link to the results page leads to a summary of
the contents of the email and can be bookmarked for future visits. The following is an example
of an email sent after CS23D2.0 successfully generated a structure:
Your CS23D2.0 structure prediction is complete.
Link to PDB structure: http://busby1.cs.ualberta.ca/CS23D2.0/tmp/1203980205.pdb
Link to View Structure: http://busby1.cs.ualberta.ca/cgi-bin/CS23D/show_struct.cgi?id=1203980205
Link to results page: http://busby1.cs.ualberta.ca/cgi-bin/GenMR/Results.cgi?dir=/usr/scratch/prion/GENMR/tmp&Input=1203980205&email=peter.tang.lai@gmail.com
                                 Before optimization   After optimization   Expected
CS23D2.0 energy                  -18.84                -33.75
Mean chemical shift correlation  0.740                 0.742
Torsion angles
#res in phi/psi core             79                    79                   72 (90%)
#res in phi/psi allowed          1                     1                    6 ( 7%)
#res in phi/psi generous         0                     0                    1 ( 1%)
#res in phi/psi disallowed       0                     0                    0 ( 0%)
#res in omega allowed            80                    80                   80 (99%)
#res in omega disallowed         1                     1                    1 ( 1%)
Final structure reliability: Good
Mean chemical shift correlation
0.75 - 1.00 = High
0.65 - 0.75 = Good
0.55 - 0.65 = Moderate
0.00 - 0.55 = Poor
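The correlation ranges listed above can be expressed as a small lookup. This is a minimal sketch in Python with a hypothetical function name, reliability. Since the listed ranges share their end points (0.55, 0.65 and 0.75), the sketch assumes that a boundary value belongs to the higher category; how CS23D2.0 itself treats boundary values is not stated here.

```python
def reliability(correlation: float) -> str:
    """Map a mean chemical shift correlation to the listed reliability label."""
    if not 0.0 <= correlation <= 1.0:
        raise ValueError("correlation must lie between 0.00 and 1.00")
    # Assumption: each lower bound is inclusive (0.75 counts as High).
    if correlation >= 0.75:
        return "High"
    if correlation >= 0.65:
        return "Good"
    if correlation >= 0.55:
        return "Moderate"
    return "Poor"
```

With the value from the sample email, reliability(0.742) returns "Good", which agrees with the reliability reported there.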