IUPred is supplied in source code form along with the required data files. The
program is written in ANSI C and should compile with any ANSI C compiler,
e.g. the GNU C compiler. To run the program outside the source directory, the
IUPred_PATH environment variable has to be set to the location of the source
directory.

where seqfile is the name of the sequence file and type is one of the
options for predicting long disorder, short disorder (e.g. missing residues
in X-ray structures), or globular domains.

INPUT FILE: a sequence file in FASTA format, with one sequence per file.
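
As a minimal sketch of the one-sequence-per-file requirement, the Python helper
below reads FASTA text and rejects input with more than one header line. The
function name read_single_fasta is hypothetical and is not part of the IUPred
distribution; how IUPred itself reacts to a multi-sequence file is not stated
here.

```python
def read_single_fasta(text):
    """Return (header, sequence) from FASTA text holding exactly one record.

    Hypothetical helper for illustration; not part of the IUPred distribution.
    """
    header = None
    residues = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                raise ValueError("more than one sequence in file")
            header = line[1:].strip()
        else:
            if header is None:
                raise ValueError("sequence data before FASTA header")
            residues.append(line)
    if header is None:
        raise ValueError("no FASTA header found")
    # Sequence lines are joined so that line wrapping does not matter.
    return header, "".join(residues)
```

Checking the file before running a prediction makes a malformed input fail
early with a clear message.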

Example:

    iupred P53_HUMAN.seq long
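
The same invocation can be scripted. The Python sketch below sets IUPred_PATH
for the child process only and builds the command line in the order used in
this README. The function names, the default executable name lookup, and the
directory /opt/iupred in the usage note are assumptions made for illustration.

```python
import os
import subprocess


def build_iupred_call(seqfile, pred_type, source_dir, exe="iupred"):
    """Return (argv, env) for running IUPred outside its source directory.

    Hypothetical wrapper; `source_dir` is wherever the IUPred sources and
    data files were unpacked.
    """
    env = dict(os.environ)            # copy, so the caller's environment is untouched
    env["IUPred_PATH"] = source_dir   # variable name as given in this README
    argv = [exe, seqfile, pred_type]  # same order as: iupred P53_HUMAN.seq long
    return argv, env


def run_iupred(seqfile, pred_type, source_dir, exe="iupred"):
    """Run IUPred and return its standard output as text."""
    argv, env = build_iupred_call(seqfile, pred_type, source_dir, exe)
    result = subprocess.run(argv, env=env, capture_output=True,
                            text=True, check=True)
    return result.stdout
```

With a compiled binary on the search path, a call such as
run_iupred("P53_HUMAN.seq", "long", "/opt/iupred") would return the prediction
as a string.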

INTERPRETATION OF THE OUTPUT:

For the long and short disorder types, the output gives the likelihood of
disorder for each residue, i.e. a value between 0 and 1, where higher values
indicate a higher probability of disorder. Residues with values above 0.5 can
be regarded as disordered. At this cutoff, 5% of globular proteins are
expected to be predicted as disordered (false positives).
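
To illustrate the 0.5 cutoff, the Python sketch below takes per-residue scores
that have already been read into a list and reports contiguous disordered
segments. The exact layout of the output file is not described here, so no
parsing is attempted. The function name and the sample scores are made up for
the example.

```python
def disordered_segments(scores, cutoff=0.5):
    """Return 1-based (start, end) ranges of residues scoring above cutoff."""
    segments = []
    start = None
    for pos, score in enumerate(scores, start=1):
        if score > cutoff:
            if start is None:
                start = pos          # a new disordered stretch begins
        elif start is not None:
            segments.append((start, pos - 1))
            start = None
    if start is not None:            # stretch runs to the end of the sequence
        segments.append((start, len(scores)))
    return segments


# Made-up scores, not real IUPred output.
example = [0.91, 0.74, 0.48, 0.12, 0.50, 0.63, 0.88]
print(disordered_segments(example))  # [(1, 2), (6, 7)]
```

A score of exactly 0.5 is treated as ordered here, following the wording
"above 0.5".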

For the globular domain prediction type, the output gives the number of
globular domains and lists their start and end positions in the sequence. This
is followed by the submitted sequence with residues of globular domains
indicated by

Please see the LICENSE file for the license terms of the software. It is
basically free for academic users, but a license fee applies to commercial
users.

THE PUBLICATION OF RESEARCH USING IUPred MUST INCLUDE AN APPROPRIATE
CITATION TO THE METHOD:

The Pairwise Energy Content Estimated from Amino Acid Composition Discriminates
between Folded and Intrinsically Unstructured Proteins
Zsuzsanna Dosztányi, Veronika Csizmók, Péter Tompa and István Simon
J. Mol. Biol. (2005) 347, 827-839.

SHORT SUMMARY OF THE METHOD

Intrinsically unstructured/disordered proteins (IUPs) have no single
well-defined tertiary structure in their native, functional state. IUPred
recognizes such regions from the amino acid sequence based on the estimated
pairwise energy content. The underlying assumption is that globular proteins
make a large number of interresidue interactions, providing the stabilizing
energy to overcome the entropy loss during folding. In contrast, IUPs have
special sequences that do not have the capacity to form sufficient
interresidue interactions. Taking a set of globular proteins with known
structure, we have developed a simple formalism that allows the estimation of
the pairwise interaction energies of these proteins. It uses a quadratic
expression in the amino acid composition, which takes into account that the
contribution of an amino acid to order/disorder depends not only on its own
chemical type, but also on its sequential environment, including its potential
interaction partners. When this calculation is applied to IUP sequences, their
estimated energies are clearly shifted towards less favorable values compared
to globular proteins, enabling the prediction of protein disorder on this
basis.
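
The quadratic expression can be illustrated with a toy calculation. In the
Python sketch below, the estimated energy per residue is taken as the sum over
i and j of n_i * P_ij * n_j, where n is the amino acid composition (as
fractions) and P is an energy predictor matrix. This is one form consistent
with the description above; the cited paper gives the actual definition. The
two-letter alphabet and the matrix values are invented for the example and are
NOT the IUPred parameters, and the composition is taken over the whole
sequence rather than over a local sequential environment.

```python
from collections import Counter


def composition(seq, alphabet):
    """Fraction of each letter of `alphabet` in `seq`."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in alphabet]


def estimated_energy_per_residue(seq, alphabet, P):
    """Quadratic form sum_i sum_j n_i * P[i][j] * n_j in the composition n."""
    n = composition(seq, alphabet)
    size = len(alphabet)
    return sum(n[i] * P[i][j] * n[j]
               for i in range(size)
               for j in range(size))


# Toy two-letter alphabet and invented matrix values (NOT the IUPred parameters).
ALPHABET = "AB"
P_TOY = [[-1.0, 0.5],
         [0.5, 2.0]]

# Composition of "AAAB" is (0.75, 0.25), so the result is
# -0.5625 + 2 * 0.09375 + 0.125 = -0.25
print(estimated_energy_per_residue("AAAB", ALPHABET, P_TOY))
```

Because every term is a product of two composition values, the contribution
of one residue type depends on which partner types are present, which is the
point made in the summary above.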