The "Active site" record is applied to residues of enzymes known or thought to
function in the actual catalytic reaction of the enzyme. It should be applied
to a single residue or a short list of residues; it should not be applied to a
range (a hyphenated pair). If the active site residues are not specifically
known but have been localized to a segment of the sequence, the "Region" record
rather than the "Active site" record should be used. "Active site" features in
entries without an Enzyme Commission notation in either their title or
"Contains" records are suspect and will be flagged as possible errors.
The format for the "Active site" record is
"Active site: " res ["," res...]
["(" description ")"] ["#link " link] "#status " status
The status is required for this feature. All the residues participating in an
active site that do not require different modifiers should be combined in the
same feature. Do not combine residues from different active sites or residues
that need different modifiers. The use of description fields, discussed below,
should be avoided if possible.
Examples:
Active site: Arg #status experimental
Active site: Asp, His, Ser #status predicted
Active site: His, His, Asp #status experimental
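The record format above lends itself to a mechanical check. The following is
a minimal Python sketch, not the tooling actually used for the database. The
names ACTIVE_SITE_RE and parse_active_site are hypothetical, and the sketch
assumes that residues are three-letter codes separated by ", " and that the
link tag and the status are single words.

```python
import re

# Hypothetical pattern for:
#   "Active site: " res ["," res...] ["(" description ")"]
#   ["#link " link] "#status " status
ACTIVE_SITE_RE = re.compile(
    r"^Active site: "
    r"(?P<residues>[A-Z][a-z]{2}(?:, [A-Z][a-z]{2})*)"
    r"(?: \((?P<description>[^()]*)\))?"
    r"(?: #link (?P<link>\S+))?"
    r" #status (?P<status>\S+)$"
)


def parse_active_site(line):
    """Return the record's fields as a dict, or None if the line does not match."""
    match = ACTIVE_SITE_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Turn "Asp, His, Ser" into ["Asp", "His", "Ser"].
    fields["residues"] = fields["residues"].split(", ")
    return fields
```

Because the pattern accepts only a comma-separated residue list, a hyphenated
range such as "Asp-Ser" is rejected, as is a record without a status.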
A residue list may be used only for those residues which participate in
the same concerted catalytic reaction. If all the residues participating in
one active site are the same type, then only one residue need be shown.
Enzymes recognized to have several distinct catalytic reactions should have an
"Active site" record for each active site. Multiple "Active site"
records for what is, in fact, a single active site should be combined into one record
using a list of residues, unless different status conditions apply.
[GRAY] Formerly, mechanisms were presented but this should no
longer be done except when the mechanism is used as a description. Generally such a
description should be applied only when multiple active sites occur in
the same
entry.
[BLACK] In particular, the description "charge relay
system" should not be used except in enzymes with multiple
activities.
Examples currently used are:
Active site: Cys (amide transfer)
Active site: Cys (of 3-oxoacyl-[acyl-carrier-protein] synthase)
Active site: Lys (of 3-oxoacyl-[acyl-carrier-protein] reductase)
Active site: Lys (of enoyl-[acyl-carrier-protein] reductase)
Active site: Ser (of enoyl-[acyl-carrier-protein] reductase)
Active site: Ser (of oleoyl-[acyl-carrier-protein] hydrolase)
Active site: Ser (of [acyl-carrier-protein] acetyl/malonyltransferase)
Active site: Glu (alpha-reaction)
Active site: His, Lys, Cys (beta-reaction)
Descriptors like these may be replaced with "#link"
modifiers which point to tags in appropriate Function records, or Domain or Product
features.
Active site: Cys #link ARD #status predicted
Here the link "ARD" points to a Function record with the tag
"<ARD>". This
mechanism will also be used to link active site records with different
status conditions but which belong to the same active site object.
When a residue has a stable, covalently-bound, catalytically-active
prosthetic group, only the "Binding site: ... (covalent)" feature should be
used. An "Active site" record should not also be used because it is the
prosthetic group which is active and not the amino acid as such. In particular, for an
active site phosphoserine only the annotation:
Binding site: phosphate (Ser) (covalent) #status experimental
should appear. When a residue forms a transient, covalent bond in its
role as an active site then the "Active site" record should be used and
the description field may be used. The nature of the intermediate should be made as
clear as practical. Annotators should consider carefully whether a
covalently-bound group is stable or transient in determining whether an annotation should
be for a modified or an active site. The following possible features show
active sites with transient groups that could easily be confused with a binding
site.
Active site: Ser (phosphoserine intermediate)
Active site: Tyr (phosphotyrosine intermediate)
No examples yet exist of the second feature. Other currently acceptable
examples are:
Active site: Asp (aspartylphosphate intermediate)
Active site: Cys (phosphocysteine intermediate)
Active site: Cys (S-acetylcysteine intermediate)
Active site: Cys (sulfocysteine intermediate)
Active site: His (phosphohistidine intermediate)
Active site: Lys (ribulose-bisphosphate-binding)
Most of these features are documented in the RESID database. Avoid
records that are unnecessarily detailed or are synonymous with existing
features, like:
Active site: His (covalent intermediate)
Active site: Asp (phosphate-binding)
Be particularly suspicious of claims that Gly, Val, Leu, Ile, Pro, Asn,
Gln, Met or Phe residues are active site residues. It is chemically
dubious that such residues function in the actual catalytic reaction of an
enzyme. Glycine and a few other residues can form free radicals that take part
in catalysis, but for physical reasons such reactions are
extremely rare in biochemistry.
Current examples are:
Active site: Cys (cysteine thiyl radical intermediate)
Active site: Gly (stable glycyl radical)
Active site: Trp (tryptophyl radical intermediate)
Active site: Tyr (stable tyrosyl radical)
These features are documented in the RESID database.
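The suspicion rule above can be sketched as a simple filter. This is an
illustrative Python sketch only; the names SUSPECT_RESIDUES and
suspect_active_site_residues are hypothetical, and treating any description
that contains the word "radical" as a documented free-radical site is an
assumption of the sketch, not a rule of the database.

```python
# Residues the guideline treats as chemically dubious in an "Active site" record.
SUSPECT_RESIDUES = {"Gly", "Val", "Leu", "Ile", "Pro", "Asn", "Gln", "Met", "Phe"}


def suspect_active_site_residues(residues, description=None):
    """Return the residues that deserve a second look, in input order.

    Assumption of this sketch: a description mentioning "radical" marks a
    free-radical site such as the stable glycyl radical, so nothing is flagged.
    """
    if description and "radical" in description:
        return []
    return [res for res in residues if res in SUSPECT_RESIDUES]
```

A flagged residue is a prompt for review by the annotator, not proof of an
error.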
Residues that are structurally located near an active site but do not
participate directly in the catalytic reaction of that active site
should not be annotated in the PIR databases. Annotations for such residues will
only be carried from PDB entries in the NRL_3D database. Not all reactive
compounds that block an enzymatic reaction wind up reacting with an active site
residue; they may react with a residue near the active site and block the
substrate's access to the active site. The labeled residue may be more of a "reactive
site" than an "active site", so be cautious about accepting such labeling as experimental evidence
for active site residues.
For cysteine residues that form catalytically active disulfide bonds
only the annotation
Disulfide bonds: redox-active
should appear.
Even though selenocysteine may function as an active site, only the
feature
Modified site: selenocysteine
should be used.
Residues that participate in allosteric control of enzyme activity but
are not catalytically active should not be annotated as active sites but as
binding sites or as regions. Residues that participate in different, symmetry-related active
sites of complexes should not be combined in the same feature, but an
appropriate description should be used to indicate the relationship.
Active site: Asp (shared with dimeric partner)
Active site: Cys (shared with dimeric partner)
These features imply that there are two symmetry-related active sites.
Each site consists of an aspartate and a cysteine contributed by different
chains of the homodimer.
[BLACK] The annotation
Active site: ... inhibitory ...
should not be used. Instead, use the annotation
"Inhibitory site: "
[BLACK] Do not use the term "active site" in either
"Domain" or "Region" features. Instead,
use the term "catalytic".
In binding sites and modified sites, the following definitions are very
important. Because they include historical accidents and
grammatical exigencies, these are operational definitions and do not
necessarily extend beyond the purposes of this document.
Generally, an attachment site is an amino acid residue which has its
side chain chemically changed post-translationally in such a way that it could be
restored by physiological processes of hydrolysis, ammonolysis or simple (2H)
reduction. Such chemical changes may occur transiently, or more or less
permanently, but they must be covalent. The idea is that attachment site residues
could in principle be recovered and detected by typical methods of sequence
analysis, whereas modified sites could not be.
The "Binding site" feature includes two classes, attachment sites
and binding sites. A "binding site" is an amino acid residue, or
a group of them, that forms biochemically important, non-covalent bonds
with ions or molecules (other than the protein constituting the entry).
These bonds may be ionic, ligand (dative), van der Waals, or donative or
receptive hydrogen bonds. One borderline case is the sulfur-metal bond, which will be regarded as
covalent for cysteine when a cluster of atoms is bound, and non-covalent (dative ligand) otherwise.
Methionine sulfur-metal bonds will be regarded as non-covalent (dative ligand). Attachment sites
will be distinguished by using "(covalent)" in "Binding site"
records. All new "Binding sites" without "(covalent)" are
reviewed and subject to conversion. Consequently it is very important for annotators
to provide the "(covalent)" designation in every case when it
should be applied.
A "modified site" is an amino acid residue which is any of the following:
- chemically changed post-translationally in such a way that it could
not be restored by physiological processes of hydrolysis, ammonolysis or simple
(2H) reduction (that is, it is not a side-chain attachment site),
- chemically changed in any way involving the alpha amino group,
including N-formylmethionine (this applies to both the amino terminus and internal
residues),
- a carboxyl terminal residue with any chemical change involving the
alpha-carboxyl group,
- a selenocysteine residue (these are translationally incorporated but
for historical reasons are regarded as modified cysteine residues),
- an aspartate or glutamate ester that can arise from either the acid or
the amide form.
Using the foregoing definitions, "Binding site" records are
applied in two cases:
- when an amino acid residue, or a group of them, forms biochemically
important, non-covalent bonds with ions or molecules (other than the
protein constituting the entry); or
- when an amino acid residue forms an attachment site in which its
side chain is chemically changed post-translationally in such a way that it
could in principle be restored by physiological processes. Such cases
must have a "(covalent)" bond description.
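The operational definitions above amount to a small decision procedure. The
following Python sketch is one possible reading of them; the function name
classify_site and all of its parameter names are hypothetical, and each flag
simply mirrors one clause of the definitions.

```python
def classify_site(*, covalent, restorable=False, alpha_amino=False,
                  terminal_alpha_carboxyl=False, selenocysteine=False,
                  asp_or_glu_ester=False):
    """Map a description of a residue to the feature the definitions call for.

    covalent:   the bond to the group is covalent.
    restorable: the side chain could be restored by hydrolysis, ammonolysis
                or simple (2H) reduction.
    The remaining flags are the special cases listed under "modified site".
    """
    # Special cases are modified sites whatever their chemistry.
    if selenocysteine or alpha_amino or terminal_alpha_carboxyl or asp_or_glu_ester:
        return "Modified site"
    # Biochemically important non-covalent bonds are plain binding sites.
    if not covalent:
        return "Binding site"
    # Covalent side-chain change: restorable means attachment site.
    return "Binding site (covalent)" if restorable else "Modified site"
```

For instance, classify_site(covalent=True, restorable=True) gives
"Binding site (covalent)", which is the attachment-site case.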
The format for the "Binding site" record is
"Binding site:" ["(or" position ")"]
bound-group name "(" res ["," res...] ")"
["(covalent)" | "(" bonding description ")"]
["(" form ")"]
["(partial)"] ["#link " link] "#status " status
The status is required for this feature.
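As an illustration of how this format might be parsed, here is a minimal
Python sketch. It is not the database's own parser. The names BONDING,
BINDING_SITE_RE and parse_binding_site are hypothetical, the optional
"(or position)" element is ignored, residues are assumed to be three-letter
codes, and the set of bonding descriptions is limited to the five that this
document names as presently used.

```python
import re

# Bonding descriptions this document lists as presently in use.
BONDING = {"covalent", "axial ligand", "axial ligands",
           "proximal axial ligand", "distal axial ligand"}

BINDING_SITE_RE = re.compile(
    r"^Binding site: (?P<group>.+?) "
    r"\((?P<residues>[A-Z][a-z]{2}(?:, [A-Z][a-z]{2})*)\)"
    r"(?P<modifiers>(?: \([^()]*\))*)"
    r"(?: #link (?P<link>\S+))?"
    r" #status (?P<status>\S+)$"
)


def parse_binding_site(line):
    """Return the record's fields as a dict, or None if the line does not match."""
    match = BINDING_SITE_RE.match(line.strip())
    if match is None:
        return None
    # Split " (covalent) (in ...)" into its parenthesized parts.
    modifiers = re.findall(r"\(([^()]*)\)", match.group("modifiers"))
    bonding = modifiers.pop(0) if modifiers and modifiers[0] in BONDING else None
    forms = [m for m in modifiers if m != "partial"]
    return {
        "group": match.group("group"),
        "residues": match.group("residues").split(", "),
        "bonding": bonding,
        "form": forms[0] if forms else None,
        "partial": "partial" in modifiers,
        "link": match.group("link"),
        "status": match.group("status"),
    }
```

The sketch requires the parenthesized residue list, so a record that omits it
is rejected rather than guessed at.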
Currently acceptable covalent examples are listed below. The status, link
and partial descriptors have been removed, and a few minor variants have
been eliminated. Most of these features are documented in the RESID
database.
Binding site: 2Fe-2S cluster (Cys) (covalent)
Binding site: 2Fe-2S cluster (Cys, His, Cys, His) (covalent)
Binding site: 3Fe-4S cluster (Cys) (covalent)
Binding site: 4-hydroxycinnamyl (Cys) (covalent)
Binding site: 4Fe-4S cluster (Cys) (covalent)
Binding site: 4Fe-4S cluster (Cys) (covalent) (shared with dimeric partner)
Binding site: 4Fe-4S cluster 1 (Cys) (covalent)
Binding site: 4Fe-4S cluster 2 (Cys) (covalent)
Binding site: AMP (Tyr) (covalent)
Binding site: UMP (Tyr) (covalent)
Binding site: acetyl (Lys) (covalent)
Binding site: biotin (Lys) (covalent)
Binding site: carbohydrate (Asn) (covalent)
Binding site: carbohydrate (Asn) (covalent) (in ...)
Binding site: carbohydrate (Cys) (covalent)
Binding site: carbohydrate (Lys) (covalent)
Binding site: carbohydrate (Ser) (covalent)
Binding site: carbohydrate (Thr) (covalent)
Binding site: carbohydrate (Trp) (covalent)
Binding site: carbohydrate (Tyr) (covalent)
Binding site: carbon dioxide (Lys) (covalent) (by ...)
Binding site: chondroitin sulfate (Ser) (covalent)
Binding site: cysteine (Cys) (covalent)
Binding site: cysteine (Cys) (covalent) (in ...)
Binding site: dermatan sulfate (Ser) (covalent)
Binding site: farnesyl (Cys) (covalent)
Binding site: fatty acid (Ser) (covalent)
Binding site: fatty acid (Thr) (covalent)
Binding site: formyl (Lys) (covalent)
Binding site: geranyl-geranyl (Cys) (covalent)
Binding site: glutathione (Cys) (covalent)
Binding site: glycerylphosphorylethanolamine (Glu) (covalent)
Binding site: heme (Cys) (covalent)
Binding site: heme (Glu) (covalent)
Binding site: heme, high potential (Cys) (covalent)
Binding site: heme, low potential (Cys) (covalent)
Binding site: heparan sulfate (Ser) (covalent)
Binding site: homocitryl Mo-7Fe-8S cluster (Cys) (covalent)
Binding site: keratan sulfate (Thr) (covalent)
Binding site: lipoamide (Lys) (covalent)
Binding site: methyl (Cys) (covalent)
Binding site: molybdopterin (Cys) (covalent)
Binding site: molybdopterin guanine dinucleotide (Cys) (covalent)
Binding site: murein (Lys) (covalent)
Binding site: myristate (Lys) (covalent)
Binding site: nitrosonium (Cys) (covalent)
Binding site: palmitate (Cys) (covalent)
Binding site: palmitate (Lys) (covalent)
Binding site: phosphate (Arg) (covalent)
Binding site: phosphate (Asp) (covalent)
Binding site: phosphate (His) (covalent)
Binding site: phosphate (His) (covalent) (by ...)
Binding site: phosphate (Ser) (covalent)
Binding site: phosphate (Ser) (covalent) (by ...)
Binding site: phosphate (Ser) (covalent) (in ...)
Binding site: phosphate (Thr) (covalent)
Binding site: phosphate (Thr) (covalent) (by ...)
Binding site: phosphate (Tyr) (covalent)
Binding site: phosphate (Tyr) (covalent) (by ...)
Binding site: phosphopantetheine (Ser) (covalent)
Binding site: phosphoribosyl dephospho-coenzyme A (Ser) (covalent)
Binding site: phosphoryl-DNA (Ser) (covalent)
Binding site: phosphoryl-DNA (Thr) (covalent)
Binding site: phosphoryl-DNA (Tyr) (covalent)
Binding site: phosphoryl-RNA (Ser) (covalent)
Binding site: phosphoryl-RNA (Tyr) (covalent)
Binding site: phycocyanobilin (Cys) (covalent)
Binding site: phycoerythrobilin (Cys) (covalent)
Binding site: phytochromobilin (Cys) (covalent)
Binding site: polyglutamate (Glu) (covalent)
Binding site: polyglycine (Glu) (covalent)
Binding site: pyridoxal phosphate (Lys) (covalent)
Binding site: retinal (Lys) (covalent)
Binding site: sn-2,3-diacylglycerol (Cys) (covalent)
Binding site: sn-2,3-diphytanylglycerol diether (Cys) (covalent)
Binding site: sulfate (Tyr) (covalent)
Binding site: vanadium cofactor (Cys) (covalent)
Binding site: iron-sulfur clusters (Cys) (covalent)
[use this only when the cluster form has not been determined
and cannot be predicted]
A large variety of "(by ...)" descriptors exists.
Please consult the database to determine currently used forms.
Examples of currently acceptable "Binding site" features not
labeled "covalent" are listed below. The residue lists (in all but a few cases), status, link
and partial descriptors have been removed, and a few minor variants have
been eliminated.
[the following with one locant]
Binding site: heme iron (His) (axial ligand)
Binding site: heme iron (His) (axial ligand) (shared with alpha chain)
Binding site: heme iron (His) (axial ligand) (shared with beta chain)
[the following with two locants]
Binding site: heme iron (His) (axial ligands)
Binding site: heme iron (His) (proximal axial ligand)
Binding site: heme iron (His, Met) (axial ligands)
Binding site: heme iron (Met, His) (axial ligands)
Binding site: heme iron (Tyr) (axial ligand)
Binding site: heme iron, high potential (His) (axial ligand)
Binding site: heme iron, high potential (His) (axial ligands)
Binding site: heme iron, high potential (His, Met) (axial ligands)
Binding site: heme iron, high potential (His, Tyr) (axial ligands)
Binding site: heme iron, low potential (His) (axial ligand)
Binding site: heme iron, low potential (His) (axial ligands)
Binding site: heme iron, low potential (His, Tyr) (axial ligands)
Binding site: heparin
Binding site: histamine
Binding site: homocitryl Mo-7Fe-8S cluster molybdenum (His) (ligand)
Binding site: iron
Binding site: iron (Asp) (shared with tetrameric partners)
Binding site: iron (His) (shared with chain M)
Binding site: iron (His, Glu, His) (shared with chain L)
Binding site: iron (Lys) (shared with tetrameric partners)
Binding site: magnesium
Binding site: magnesium (Glu) (shared with chain I)
Binding site: magnesium (His) (shared with chain II)
Binding site: manganese
Binding site: mercury
Binding site: metal
Binding site: methylcobalamin cobalt
Binding site: micellar substrate
Binding site: molybdopterin (Arg)
Binding site: molybdopterin cytosine dinucleotide (Arg)
Binding site: nickel
Binding site: nickel 1
Binding site: nickel 2
Binding site: omega-aminocarboxylic acids
Binding site: oxygen (His) (distal axial ligand)
Binding site: oxygen (Tyr) (distal axial ligand)
Binding site: phospholipid
Binding site: plastoquinone
Binding site: potassium
Binding site: pyrophosphate
Binding site: retinoic acid
Binding site: siroheme iron (Cys) (axial ligand)
Binding site: substrate
Binding site: substrate phosphate
Binding site: thyroxine
Binding site: transition metal ions
Binding site: ubiquinone
Binding site: zinc
Binding site: zinc, catalytic
[see note below on the next two]
Binding site: zinc, catalytic (Cys, His, His, His) (inhibited)
Binding site: zinc, catalytic (His) (active)
Binding site: zinc, high affinity
Binding site: zinc, noncatalytic
The bound-group name must always be followed by a set of parentheses
enclosing a residue or a list of residues that matches sequence residues
corresponding to the preceding numbers. Strict parsing is enforced for
this rule. If all the residues participating in one binding site are the
same type, then only one residue need be shown, for example:
The only bonding descriptions presently used are "covalent", "axial ligand",
"axial ligands", "proximal axial ligand" and "distal axial ligand". For these
ligand cases, care must be taken in specifying the bound entity:
"heme iron" rather than simply "heme".
Covalent bonds to heme and similar prosthetic groups are to the group
and not to the metal.
Also, use "ligand" if there is only one locant in the feature, and
"ligands" if there are two or more locants, even though they are all the same type
of residue and only one residue is shown. Thus,
A single substrate may be listed simply as "substrate". For
multiple substrates, other than water, in the same entry the substrate may be
named.
When it is experimentally observed that a group is covalently bound
at less than 95 mole per cent, the "(partial)" annotation should be
used.
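The threshold in this rule can be written down directly. The following is a
small Python sketch under stated assumptions: the names PARTIAL_THRESHOLD and
with_partial are hypothetical, and the sketch assumes the descriptor is
appended to a feature string that does not yet carry link or status fields.

```python
PARTIAL_THRESHOLD = 95.0  # mole per cent, taken from the rule above


def with_partial(feature, mole_percent_bound):
    """Append "(partial)" when the observed occupancy is below the threshold."""
    if mole_percent_bound < PARTIAL_THRESHOLD:
        return feature + " (partial)"
    return feature
```

A group observed at exactly 95 mole per cent is left unmarked, since the rule
applies only below that value.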
The "in" form should be used very sparingly when the
covalent bond is known to occur only in the mature form or in one of
several alternative polypeptide products and the entry presents an
immature sequence.
The "by" form is used to distinguish among different binding sites
of the same group, for example:
Some covalent binding sites can occur only as a consequence of a prior
modification. These are nonetheless biochemically separate and distinct
features. For such cases we use two features, one to indicate the nature
of the modification and the other to indicate the secondary change.
For example:
When there are biochemically significantly different binding sites for
the same compound in the same entry (rare), the bound-group name may include
modifiers that distinguish between the functional differences of the bound-group
or of the binding sites. These modifiers should be placed after the bound-group,
without parentheses and separated from it by a comma.
Where the sequence was determined by protein sequencing and the nature
of the covalently attached group precludes assignment of a residue as either an
acid or an amide, and unless there is unequivocal evidence to the contrary
(for example, the nucleotide sequence), there is a reasonable biochemical
presumption that the residue should be the amide. The reported sequence
should be presented with the ambiguity explicit in the "Residues"
record, the amide presented in the sequence and feature records and an appropriate note
like
The format for the "Inhibitory site" record is
"Inhibitory site:" res ["," res...]
"(" activity ["," activity ...] ")"
"#status " status
An inhibitory site is to an inhibitor what an active site is to an
enzyme. It is the residue, or small set of residues, that is responsible for
blocking the activity of an enzyme or set of enzymes. It should be applied to single
residues, and to a small list of residues only sparingly. The status is
required for this feature. Without a crystallographic structure it is very
difficult to obtain experimental evidence that a particular residue is
an inhibitory site, so most will have predicted status.
Some examples, with status omitted:
Inhibitory site: Arg (acrosin)
Inhibitory site: Arg (thrombin, coagulation factor Xa)
Inhibitory site: Arg (trypsin)
Inhibitory site: Arg (unknown proteinase)
Inhibitory site: Cys (thermolysin)
Inhibitory site: Leu (chymotrypsin)
Inhibitory site: Leu (chymotrypsin, elastase)
Inhibitory site: Lys (trypsin)
Inhibitory site: Met (chymotrypsin, subtilisin)
Inhibitory site: Tyr (chymotrypsin)
[GRAY] In the case that one of two residues is thought to be
responsible for the inhibitory action, the record may be applied to a list and this
format is used
"Inhibitory site:" res "or" res
"(" activity ["," activity ...] ")"
"#status " status
For example,
Inhibitory site: Leu or Met (elastase, chymotrypsin) #status predicted
The "or" form should be avoided whenever possible.
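Both forms of the "Inhibitory site" record can be recognized with one
pattern. The following Python sketch is illustrative only; the names RES,
INHIBITORY_SITE_RE and parse_inhibitory_site are hypothetical, residues are
assumed to be three-letter codes, and the status is assumed to be a single
word on the same line as the rest of the record.

```python
import re

RES = r"[A-Z][a-z]{2}"  # assumed three-letter residue code

# Either the two-residue "or" form or a comma-separated residue list,
# then the parenthesized activities and the required status.
INHIBITORY_SITE_RE = re.compile(
    rf"^Inhibitory site: (?P<residues>{RES} or {RES}|{RES}(?:, {RES})*)"
    r" \((?P<activities>[^()]+)\)"
    r" #status (?P<status>\S+)$"
)


def parse_inhibitory_site(line):
    """Return the record's fields as a dict, or None if the line does not match."""
    match = INHIBITORY_SITE_RE.match(line.strip())
    if match is None:
        return None
    residues = match.group("residues")
    return {
        "residues": re.split(r", | or ", residues),
        # True when the record names alternatives rather than a joint list.
        "alternatives": " or " in residues,
        "activities": [a.strip() for a in match.group("activities").split(",")],
        "status": match.group("status"),
    }
```

Keeping an "alternatives" flag makes it easy to list the records that still
use the "or" form so that they can be revisited.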
[BLACK] The "Inhibitory site" record is not used for
allosteric inhibitor sites; those may be annotated as binding sites.
We have chosen to use the unambiguous IUPAC numbered position forms, in
preference to the IUB Greek letter designations, when such usage allows us to
avoid inconsistencies between common usage ("epsilon-aminomethyl") and IUB
recommended usage ("zeta-amino-methyl").
Note that standard abbreviations for the modified residues are not used, so
that the correct feature is
Modified Amino Terminus
The format for this form of the "Modified site" record is
"Modified site: " name "(" res ")" ["(" form ")"] ["(" extent ")"]
"#status " status
The chemical name should be as specific as possible and should usually
include the term "amino end" at the end. When an unblocked or longer
precursor form is presented in the entry and the modified site is not position 1, the
"in mature form" modifier should be used, for example:
Modified site: acetylated amino end (Ala) (in mature form) #status experimental
[GRAY] Because not all processed forms requiring this modifier
are the final "mature" form, it may become necessary to replace
this modifier with something like "(in processed form) #link
...". Annotators are invited to comment on this proposal.
Currently acceptable examples are:
Modified site: 2-oxobutanoic acid (Thr)
Modified site: L-3-phenyllactic acid (Phe)
Modified site: N-formylmethionine
Modified site: acetylated amino end (xxx)
[the following form is used only when the presented
sequence is completely ambiguous at the amino terminus]
Modified site: blocked amino end
Modified site: blocked amino end (xxx)
Modified site: dimethylated amino end (Pro)
Modified site: fatty acylated amino end (Cys)
Modified site: formylated amino end (Gly)
Modified site: glucuronylated amino end (Gly)
Modified site: methylated amino end (Ala)
Modified site: myristylated amino end (Gly)
Modified site: succinylated amino end (Trp)
Modified site: pyrrolidone carboxylic acid (Gln)
Modified site: pyruvic acid (Ser)
Modified site: trimethylated amino end (Ala)
The form descriptor "(probably ...)" should be used with
"blocked amino end" whenever an appropriate prediction can be made for an otherwise experimentally determined ambiguous feature.
Modified site: blocked amino end (Ala) (probably acetylated) #status experimental
The "blocked amino end" is usually only appropriate with
experimental status, because otherwise the specific modification would be used with a
predicted status. With increasing degrees of certainty:
Modified site: acetylated amino end (Ala) #status predicted
says you are guessing both whether and by what,
Modified site: blocked amino end (Ala) (probably acetylated) #status experimental
says you know whether but are guessing by what,
Modified site: acetylated amino end (Ala) #status experimental
says you know both whether and by what.
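The three levels of certainty can be expressed as a small selection function.
This Python sketch is an illustration, not part of the annotation software;
the function name amino_end_annotation and its parameter names are
hypothetical.

```python
def amino_end_annotation(residue, blocked_observed, group=None, group_observed=False):
    """Pick the "Modified site" record for an amino-terminal blocking group.

    blocked_observed: the blocking itself was shown experimentally.
    group:            the blocking group as an adjective, e.g. "acetylated".
    group_observed:   the identity of the group was shown experimentally.
    """
    if blocked_observed and group and group_observed:
        # You know both whether and by what.
        return f"Modified site: {group} amino end ({residue}) #status experimental"
    if blocked_observed and group:
        # You know whether but are guessing by what.
        return (f"Modified site: blocked amino end ({residue}) "
                f"(probably {group}) #status experimental")
    if blocked_observed:
        # Blocking observed, no prediction available for the group.
        return f"Modified site: blocked amino end ({residue}) #status experimental"
    if group:
        # You are guessing both whether and by what.
        return f"Modified site: {group} amino end ({residue}) #status predicted"
    return None
```

With neither an observation nor a prediction the function returns None,
meaning that no feature should be written.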
Formylated amino terminal methionine is coded for and, like selenocysteine, is
not really a modified site. However, it should be annotated as a modified site
when it is experimentally observed in a protein. Making the residue explicit
is not required in this case. No occurrence has yet been noted of this
modified residue in other than the first position.
For amino terminal glutamine undergoing cyclization the format is
"Modified site: pyrrolidone carboxylic acid (Gln)" ["(in mature form)"]
["#link " link] "#status " status
When the amino terminus is known to be glutamine and blocked, pyrrolidone
carboxylic acid can be assumed unless a reason to believe otherwise is
explicitly provided, in which case
Modified site: blocked amino end (Gln) (in mature form) #status experimental
should be used. The form
Modified site: pyrrolidone carboxylic acid (Glx)
should be avoided.
The ambiguity should be explicitly noted in the
"Residues" record, an appropriate comment made, and the sequence
and feature presented as Gln. People entering sequences should be
explicitly warned about the notation "E" appearing in some
articles; such sequences should be entered with a "Q" and an
appropriate feature prepared.
[BLACK] Combined annotated forms like
Modified site: acetylated and phosphorylated amino end (Ser)
should not be used. These should appear in two records:
Modified site: acetylated amino end (Ser)
Binding site: phosphate (Ser) (covalent)
See also the discussion of incidental and secondary modifications
under the covalent type "Binding site" section above.
In the case where a residue is enzymatically cleaved at the bond between
the alpha carbon and the alpha amino-nitrogen to produce a new amino
terminus blocked with a 2-oxo or a 2-hydroxy acid, the residue giving rise to the
blocking group is entered in the sequence and one of these annotations
is used
Modified site: 2-oxobutanoic acid (Thr)
Modified site: L-3-phenyllactic acid (Phe)
Modified site: pyruvic acid (Ser)
These features do not have "amino end" in the chemical name. However, if the
preceding sequence is shown, these features should have the "(in mature form)"
modifier.
The format for this form of the "Modified site" record is the same
as for the modified amino terminus
"Modified site:" name "(" res ")" ["(" extent ")"] ["(" form ")"]
"#status " status
Current examples are:
Modified site: amidated carboxyl end (xxx)
Modified site: amidated carboxyl end (xxx) (in mature form)
Modified site: amidated carboxyl end (xxx) (amide in mature form ... from following glycine)
Modified site: amidated carboxyl end (Ala) (amide in mature form ... from following serine)
Modified site: amidated carboxyl end (Tyr) (amide in mature form ... from following leucine)
Modified site: blocked carboxyl end (xxx)
Modified site: chondroitin sulfate ester carboxyl end (Asp) (in mature form)
Modified site: GPI-anchor ethanolamine amidated carboxyl end (xxx) (in mature form)
Modified site: GSI-anchor ethanolamine amidated carboxyl end (Ser) (in mature form)
Modified site: methyl ester carboxyl end (Cys) (in mature form)
The chemical name should be as specific as possible and should include
the term "carboxyl end" at the end. The "in" form should be used when a
longer immature sequence is presented in the entry and the modified site is not at the
final position.
In the case where the carboxyl amide arises from enzymatic cleavage of
the bond between the alpha-carbon and amino nitrogen of the following
glycine residue, a special form of the "in mature form" annotation is
used
Modified site: amidated carboxyl end (Ile) (amide in mature form from following glycine)
All but a very small number of amidations arise from this mechanism. The
cases where leucine and serine are used are documented but not
well-understood.
The GSI-anchor is a chemically distinct modification that must be
carefully distinguished from the better-known GPI-anchor.
Connections through the amino- or carboxyl-ends to other encoded peptide
chains are now all treated uniformly as Cross-link features.
The format for this form of the "Modified site" record is
"Modified site: selenocysteine" "#status " status
It had formerly been thought that selenocysteine arose from
post-translational modification of cysteine residues and no single-letter code was
assigned. When it was discovered to be encoded, the assignment of a special
single-letter code presented an insurmountable software implementation problem. Instead
this feature record is applied to those residues, or list of residues.
Although it usually serves as an active site, a second feature for that annotation
is superfluous. However, when it also serves as a covalent binding site for
a prosthetic group, it is considered a secondary modification and two
feature records are used.
Modified site: selenocysteine
Binding site: molybdopterin guanine dinucleotide (Cys) (covalent)
Two different things are going on here. The first feature indicates the
true coding identity of the residue. The second indicates the true prosthetic
group covalently bound to the sequence-presented residue. [This all
arises because of the terrible historical accident that no one knew
selenocysteine was encoded until it was too late. Every computer database uses "C"
and everyone's computer program will break if a new letter is introduced for it.]
Do not use the 1-letter code "X" in the canonical sequence or the
3-letter code "Sec" in a feature for selenocysteine. "X" may, of course, be used
in "Residues" records for encoded selenocysteine.