                       IUPred RELEASE NOTES
                       ====================

IUPred Version 1.0

by Zsuzsanna Dosztanyi


IUPred is supplied in source code form along with the required data files. The
program is written in ANSI C, and the code should compile with any ANSI C
compiler, e.g. the GNU C compiler. To run IUPred from outside the source
directory, set the IUPred_PATH environment variable to the location of the
source files.

TO COMPILE:

cc iupred.c -o iupred

TO RUN IUPred:

iupred seqfile type

  where seqfile is the name of the sequence file and type is one of the
  following options:

        long
        short
        glob

  for prediction of long disorder, short disorder (e.g. missing residues in
  X-ray structures), or globular domains, respectively.


INPUT FILE: a sequence file (seqfile) in FASTA format, one sequence per file.

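Because only one sequence per file is accepted, it can be useful to check a
file before passing it to IUPred. The following is a minimal Python sketch and
is not part of the IUPred distribution. The function name read_single_fasta is
hypothetical, and the check assumes the common FASTA convention that a record
starts with a header line beginning with ">".

```python
def read_single_fasta(text):
    """Return (header, sequence) from FASTA text holding exactly one record."""
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                # A second header means the file holds more than one sequence.
                raise ValueError("more than one sequence in input")
            header = line[1:].strip()
        elif header is None:
            # Assumption: sequence lines must follow a '>' header line.
            raise ValueError("sequence data found before a '>' header line")
        else:
            chunks.append(line)
    if header is None or not chunks:
        raise ValueError("no FASTA record found")
    return header, "".join(chunks)
```

A file with several records would have to be split into one file per sequence
before running the program.
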
EXAMPLE RUN:

iupred P53_HUMAN.seq long


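For scripted use, the command line above can be wrapped in a few lines of
Python. This is only a sketch under stated assumptions: the helper names
build_command and run_iupred are hypothetical, the compiled executable is
assumed to be reachable as "iupred" (otherwise pass its full path), and the
predictions are assumed to be written to standard output.

```python
import os
import subprocess

# The three prediction types listed in this README.
VALID_TYPES = ("long", "short", "glob")


def build_command(seqfile, pred_type, executable="iupred"):
    """Build the argument list for: iupred seqfile type."""
    if pred_type not in VALID_TYPES:
        raise ValueError("type must be one of: " + ", ".join(VALID_TYPES))
    return [executable, seqfile, pred_type]


def run_iupred(seqfile, pred_type, iupred_dir, executable="iupred"):
    """Run IUPred with IUPred_PATH set and return the captured output text."""
    # Copy the current environment and add the variable this README requires.
    env = dict(os.environ, IUPred_PATH=iupred_dir)
    result = subprocess.run(
        build_command(seqfile, pred_type, executable),
        env=env,
        capture_output=True,  # assumption: results are printed to stdout
        text=True,
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout
```

Validating the type before launching the program gives a clear error in the
calling script instead of relying on the program's own handling of an unknown
option.
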
INTERPRETATION OF THE OUTPUT:

For the long and short prediction types, the output gives the likelihood of
disorder for each residue, i.e. a value between 0 and 1, where higher values
indicate a higher probability of disorder. Residues with values above 0.5 can
be regarded as disordered; at this cutoff, 5% of globular proteins are
expected to be predicted as disordered (false positives).

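The 0.5 cutoff described above can be applied to the per-residue scores with a
short script. The sketch below is illustrative only. It assumes an output
layout that this README does not specify: comment lines starting with "#",
followed by one line per residue with position, residue letter and score
separated by whitespace. The function names are hypothetical.

```python
def parse_scores(text):
    """Parse lines of the assumed form 'position residue score'."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # assumption: '#' marks comment lines
        pos, res, score = line.split()[:3]
        rows.append((int(pos), res, float(score)))
    return rows


def disordered_regions(rows, cutoff=0.5):
    """Return (start, end) runs of consecutive residues scoring above cutoff."""
    regions = []
    start = None
    last = None
    for pos, _res, score in rows:
        if score > cutoff:
            if start is None:
                start = pos  # a new disordered run begins here
            last = pos
        elif start is not None:
            regions.append((start, last))
            start = None
    if start is not None:
        regions.append((start, last))  # run extends to the end of the sequence
    return regions
```

A residue scoring exactly 0.5 is treated as ordered here, following the
"above 0.5" wording of the text.
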
For the globular domain prediction type, the output gives the number of
globular domains and lists their start and end positions in the sequence. This
is followed by the submitted sequence, with residues of globular domains
indicated by uppercase letters.


Please see the LICENSE file for the license terms for the software. It is
basically free for academic users, but a license fee applies to commercial
users.

THE PUBLICATION OF RESEARCH USING IUPred MUST INCLUDE AN APPROPRIATE
CITATION TO THE METHOD:

The Pairwise Energy Content Estimated from Amino Acid Composition Discriminates
between Folded and Intrinsically Unstructured Proteins
Zsuzsanna Dosztányi, Veronika Csizmók, Péter Tompa and István Simon
J. Mol. Biol. (2005) 347, 827-839.


SHORT SUMMARY OF THE METHOD

Intrinsically unstructured/disordered proteins (IUPs) have no single
well-defined tertiary structure in their native, functional state. Our server
recognizes such regions from the amino acid sequence based on the estimated
pairwise energy content. The underlying assumption is that globular proteins
make a large number of interresidue interactions, providing the stabilizing
energy to overcome the entropy loss during folding. In contrast, IUPs have
special sequences that do not have the capacity to form sufficient
interresidue interactions. Taking a set of globular proteins with known
structure, we have developed a simple formalism that allows the estimation of
the pairwise interaction energies of these proteins. It uses a quadratic
expression in the amino acid composition, which takes into account that the
contribution of an amino acid to order/disorder depends not only on its own
chemical type, but also on its sequential environment, including its potential
interaction partners. When this calculation is applied to IUP sequences, their
estimated energies are clearly shifted towards less favorable values compared
to those of globular proteins, enabling the prediction of protein disorder on
this basis.
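
To make the phrase "quadratic expression in the amino acid composition"
concrete, the sketch below evaluates a generic quadratic form, the sum over
residue types a and b of n_a * P_ab * n_b, where n is the composition. It is
not the IUPred implementation: the two-letter matrix TOY_P holds made-up
placeholder values, the function names are hypothetical, and the composition
is taken over the whole sequence rather than over the sequential environment
of each residue. The actual parameters and procedure are given in the cited
publication.

```python
from collections import Counter

# Placeholder interaction matrix for a two-letter alphabet.
# These numbers are invented for illustration, not IUPred parameters.
TOY_P = {
    "A": {"A": -1.0, "K": 0.5},
    "K": {"A": 0.5, "K": 1.0},
}


def composition(seq):
    """Return the fraction of each residue type in seq."""
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}


def quadratic_energy(seq, matrix):
    """Evaluate sum over a, b of n_a * P_ab * n_b for the composition n."""
    freq = composition(seq)
    return sum(freq[a] * matrix[a][b] * freq[b] for a in freq for b in freq)
```

With these placeholder values, quadratic_energy("AAAA", TOY_P) gives -1.0 and
quadratic_energy("AAKK", TOY_P) gives 0.25. The toy matrix was chosen so that
the mixed sequence has the less favorable (higher) energy, which mirrors the
kind of shift the summary describes.
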