Uppsala Software Factory

Uppsala Software Factory - MSEQPRO Manual


1 MSEQPRO - GENERAL INFORMATION

Program : MSEQPRO
Version : 020819
Author : Gerard J. Kleywegt, Dept. of Cell and Molecular Biology, Uppsala University, Biomedical Centre, Box 596, SE-751 24 Uppsala, SWEDEN
E-mail : gerard@xray.bmc.uu.se
Purpose : generate PROSITE profiles from aligned protein sequences
Package : SBIN


2 REFERENCES

Reference(s) for this program:

* 1 * G.J. Kleywegt & T.A. Jones (1998). Databases in protein crystallography. Acta Cryst D54, 1119-1131. [http://xray.bmc.uu.se/gerard/papers/databases.html] [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Retrieve&list_uids=10089488&dopt=Citation] [http://scripts.iucr.org/cgi-bin/paper?ba0001]

* 2 * Kleywegt, G.J., Zou, J.Y., Kjeldgaard, M. & Jones, T.A. (2001). Around O. In: "International Tables for Crystallography, Vol. F. Crystallography of Biological Macromolecules" (Rossmann, M.G. & Arnold, E., Editors). Chapter 17.1, pp. 353-356, 366-367. Dordrecht: Kluwer Academic Publishers, The Netherlands.


3 VERSION HISTORY

97???? - 0.1 - first version
971023 - 0.3 - improvements
971111 - 1.0 - cleaned up code and manual
980206 - 1.1 - minor changes
000508 - 1.2 - implemented Henikoff & Henikoff method to weight the sequences (JMB 243, pp. 574-578, 1994), which is now the default
001122 - 1.3 - better profile parameters; flexible indel penalty
010621 -1.3.1- minor changes
020819 - 1.4 - can now handle both real and integer substitution matrices


4 INTRODUCTION

This program generates PROSITE profiles from a set of aligned protein sequences. It is extremely similar to STRUPRO (quod vide). It only uses uninterrupted stretches of aligned residues to calculate the profile (the rationale being that stretches without insertions/deletions are more likely to correspond to structurally conserved features. It is mainly intended as an add-on to STRUPRO, the idea being that an initial profile is generated using aligned 3D structures (with STRUPRO); this is scanned against the sequence database (with the "pfsearch" program from the pftools package), and the alignment of the matching sequences is used to produce a new profile (with MSEQPRO). Program PRF2MSEQ can convert the output of a profile/database ("pfsearch -y") into a partial multiple sequence alignment file suitable as input to MSEQPRO. Alternatively, you can make a multiple sequence alignment file yourself and use that as input to MSEQPRO directly.

In order to scan sequence profiles against SWISS-PROT, you will need:

(1) the "pftools" suite of programs, written by Philipp Bucher ( mailto:pbucher@isrec-sun1.unil.ch ) and available by ftp from http://ulrec3.unil.ch:80/ftp-server/pftools/ (the suite should compile on most Unix machines).

(2) the SWISS-PROT database of protein sequences ( http://www.expasy.ch/sprot/sprot-top.html ), which can be downloaded by ftp from ftp://ftp.expasy.ch/databases/swiss-prot/ (at the time of writing, the file "compressed/sprot40.dat.gz").


5 INPUT TO THE PROGRAM

The input to this program is largely a subset of the input for STRUPRO (see the STRUPRO manual).


5.1 Start-up

When you start the program, it prints some information:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 *** MSEQPRO *** MSEQPRO *** MSEQPRO *** MSEQPRO *** MSEQPRO *** MSEQPRO ***

Version - 001122/1.3 (C) 1992-2000 Gerard J. Kleywegt, Dept. Cell Mol. Biol., Uppsala (S) User I/O - routines courtesy of Rolf Boelens, Univ. of Utrecht (NL) Others - T.A. Jones, G. Bricogne, Rams, W.A. Hendrickson Others - W. Kabsch, CCP4, PROTEIN, E. Dodson, etc. etc.

Started - Wed Nov 22 21:22:43 2000 User - gerard Mode - interactive Host - sarek ProcID - 11822 Tty - /dev/ttyq12

*** MSEQPRO *** MSEQPRO *** MSEQPRO *** MSEQPRO *** MSEQPRO *** MSEQPRO ***

Reference(s) for this program:

* 1 * G.J. Kleywegt & T.A. Jones (1998). Databases in protein crystallography. Acta Cryst D54, 1119-1131. [http://xray.bmc.uu.se/gerard/papers/databases.html] [http://www.ncbi.nlm.nih.gov/htbin-post/Entrez/query?uid=10089488&form=6&db=m&Dopt=b] [http://www.iucr.org/iucr-top/journals/acta/tocs/actad/1998/actad5406_1.html]

* 2 * G.J. Kleywegt, J.Y. Zou, M. Kjeldgaard & T.A. Jones (2000). Chapter 17.1. Around O. Int. Tables for Crystallography, Volume F. Submitted.

==> For manuals and up-to-date references, visit: ==> http://xray.bmc.uu.se/usf ==> For downloading up-to-date versions, visit: ==> ftp://xray.bmc.uu.se/pub/gerard

*** MSEQPRO *** MSEQPRO *** MSEQPRO *** MSEQPRO *** MSEQPRO *** MSEQPRO ***

Max nr of molecules : ( 500) Max nr of residues in sequence : ( 2000) Nr of amino-acid types : ( 20) Random sequence length : ( 2000000) One-letter codes : ( A R N D C E Q G H I L K M F P S T W Y V) Three-letter codes : ( ALA ARG ASN ASP CYS GLU GLN GLY HIS ILE LEU LYS MET PHE PRO SER THR TRP TYR VAL) ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----


5.2 Random-number seed

The first bit of input is an integer seed for the random-number generator. This will be used to generate a random amino-acid sequence, and to generate random sequences when calculating the weight of each structure/sequence. If you repeat this run of the program on the same machine with the same seed, you should be getting identical results.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Random-number seed ? (  123456)
 Random-number seed : (  123456)
 => Random number generator initialised with seed :     123456
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


5.3 Random sequence

The program will now generate a random amino-acid sequence of (at present) 2,000,000 residues. This sequence has an amino-acid distribution similar to that found in proteins in the PDB (GJK, unpublished results). It will be used later to calculate scores for the profile parts, which gives you some idea of the "signal-to-noise".

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Generating random sequence ...
 Target composition    : (   0.081    0.044    0.046    0.058    0.019
  0.058    0.037    0.080    0.022    0.053    0.081    0.059    0.020
  0.040    0.047    0.068    0.063    0.016    0.038    0.071)
 Working ...
 Actual composition    : (   0.081    0.044    0.046    0.058    0.019
  0.057    0.037    0.080    0.022    0.053    0.081    0.060    0.020
  0.040    0.046    0.068    0.063    0.015    0.038    0.070)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


5.4 Substitution matrix

Next, you need to provide the name of a file which contains the matrix to be used in the construction of the profiles. A number of matrices are available; others can be made by the user.

Note: if you have defined the environment variable GKLIB so that it points to the directory where you keep your collection of these matrix files (in Uppsala: /nfs/public/lib), the program will use this to generate the default file name.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Library file with matrix ? (/home/gerard/lib/sbin_blosum45.lib)
 Library file with matrix : (/home/gerard/lib/sbin_blosum45.lib)
 Comment : (! BLOSUM 45 matrix made from BLOCKS v. 5.0 and scaled in half-
  bits.)
 Comment : (! ARNDCQEGHILKMFPSTWYVBZX)
 Comment : (! integer matrix)
 Average matrix value : (  -0.918)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

Such a matrix file may look as follows (if it would contain real, instead of integer, numbers, replace the MATI by MATR and the format by something appropriate):

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
!
! PAM 250 matrix recommended by Gonnet, Cohen & Benner
! Science June 5, 1992.
! Values rounded to nearest integer
!
TYPE 22 (30(2x,a1))
  C  S  T  P  A  G  N  D  E  Q  H  R  K  M  I  L  V  F  Y  W  X  *
!
MATI (30i3)
 12  0  0 -3  0 -2 -2 -3 -3 -2 -1 -2 -3 -1 -1 -2  0 -1  0 -1 -3 -8
  0  2  2  0  1  0  1  0  0  0  0  0  0 -1 -2 -2 -1 -3 -2 -3  0 -8
  0  2  2  0  1 -1  0  0  0  0  0  0  0 -1 -1 -1  0 -2 -2 -4  0 -8
 -3  0  0  8  0 -2 -1 -1  0  0 -1 -1 -1 -2 -3 -2 -2 -4 -3 -5 -1 -8
  0  1  1  0  2  0  0  0  0  0 -1 -1  0 -1 -1 -1  0 -2 -2 -4  0 -8
 -2  0 -1 -2  0  7  0  0 -1 -1 -1 -1 -1 -4 -4 -4 -3 -5 -4 -4 -1 -8
 -2  1  0 -1  0  0  4  2  1  1  1  0  1 -2 -3 -3 -2 -3 -1 -4  0 -8
 -3  0  0 -1  0  0  2  5  3  1  0  0  0 -3 -4 -4 -3 -4 -3 -5 -1 -8
 -3  0  0  0  0 -1  1  3  4  2  0  0  1 -2 -3 -3 -2 -4 -3 -4 -1 -8
 -2  0  0  0  0 -1  1  1  2  3  1  2  2 -1 -2 -2 -2 -3 -2 -3 -1 -8
 -1  0  0 -1 -1 -1  1  0  0  1  6  1  1 -1 -2 -2 -2  0  2 -1 -1 -8
 -2  0  0 -1 -1 -1  0  0  0  2  1  5  3 -2 -2 -2 -2 -3 -2 -2 -1 -8
 -3  0  0 -1  0 -1  1  0  1  2  1  3  3 -1 -2 -2 -2 -3 -2 -4 -1 -8
 -1 -1 -1 -2 -1 -4 -2 -3 -2 -1 -1 -2 -1  4  2  3  2  2  0 -1 -1 -8
 -1 -2 -1 -3 -1 -4 -3 -4 -3 -2 -2 -2 -2  2  4  3  3  1 -1 -2 -1 -8
 -2 -2 -1 -2 -1 -4 -3 -4 -3 -2 -2 -2 -2  3  3  4  2  2  0 -1 -1 -8
  0 -1  0 -2  0 -3 -2 -3 -2 -2 -2 -2 -2  2  3  2  3  0 -1 -3 -1 -8
 -1 -3 -2 -4 -2 -5 -3 -4 -4 -3  0 -3 -3  2  1  2  0  7  5  4 -2 -8
  0 -2 -2 -3 -2 -4 -1 -3 -3 -2  2 -2 -2  0 -1  0 -1  5  8  4 -2 -8
 -1 -3 -4 -5 -4 -4 -4 -5 -4 -3 -1 -2 -4 -1 -2 -1 -3  4  4 14 -4 -8
 -3  0  0 -1  0 -1  0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -2 -2 -4 -1 -8
 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8
!
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

Note that the matrix may contain entries for residue types not used by MSEQPRO (e.g., "X", "B", "Z", "*"); the program will ignore these.


5.5 Minimum fragment length

Only uninterrupted stretches of a certain minimum length will be used in the profile (they must be at least 3 residues long).

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Min fragment length ? (      10)
 Min fragment length : (      10)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


5.6 Indel penalty

You may supply a (negative) penalty for indels (or, rather, for MI, MD, IM and ID transitions inside the structurally conserved bits of the sequences). If you don't, a reasonable penalty will be calculated as minus one times the maximum of 100 and 1/10-th of the minimum raw score.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Indel penalty (>=0 auto) ? (       0)
 Indel penalty (>=0 auto) : (       0)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


5.7 Sequence weighting

Appropriate weighting of the various structures/sequences is important to minimise bias in the profile (e.g., five different structures of the same human protein and only one of an insect form of the protein will bias the profile towards human sequences). The following weights can be used:

- uniform weights, i.e. all weights equal; this is not advisable

- sequence distance weights, as defined by Sibbald and Argos; this is probably the most sensible choice (in this implementation, the number of "Monte Carlo" cycles executed lies between 100,000 and 1,000,000, or fewer if the weights converge to within 1%)

- Henikoff & Henikoff weights (strongly prefered !)

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Sequences may be weighted:
 U = uniform weights
 S = Sibbald-Argos sequence distance weights
 H = Henikoff^2 position-based sequence weights
 Weighting scheme ? (H)
 Weighting scheme : (H)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


5.8 Sequence and profile files

Provide the name of the file containing the (partial) multiple sequence alignment, as well as of the (output) profile file (these customarily have an extension ".prf").

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Name of sequence file ? (aligned.seq) aligned.mseq
 Name of sequence file : (aligned.mseq)

Name of profile file ? (aligned.prf) qqq.prf Name of profile file : (qqq.prf)

Remark : (! Sequence alignment file) Remark : (! Created by STRUPRO V. 001122/1.4 at Wed Nov 22 21:01:15 2000 for gerard) Remark : (! REMARK Created by MOLEMAN2 V. 001117/2.8 at Tue Nov 21 20:09: 46 2000 for gerard) Remark : (! REMARK Created by MOLEMAN2 V. 001117/2.8 at Tue Nov 21 20:08: 21 2000 for gerard) Remark : (! NOT ALIGNED MOL 1 FROM PRO- 5 TO PRO- 5) Remark : (!> P) Remark : (! NOT ALIGNED MOL 2 FROM THR- 1 TO CYS- 8) [...] Remark : (! NOT ALIGNED MOL 4 FROM GLY- 325 TO GLY- 401) Remark : (!> GEEPGTPSYLNTCYSILAPAYGISVAAIYRPNADGSAIESVPDSGGVTPVDAPDWVLEREV QYAYSWYNNIVHDTFG) Nr of sequences : ( 4) Nr of residues : ( 97) SEQ > (RVIVVGAGMSGISAAKRLSEAG-DLLILEATDH-RLQLNKVVREIKY-NSVYSADYVMVSAS- VGRVYFTGEH-GYVHGAYLSGIDSAEILINCAQ-) SEQ > (DVLIVGAGPAGLMAARVLSEYV-KVRIIDKRST-KVERPLIPEKMEI-IETVHCKYVIGCDG- DERVFIAGDA-QGMNTSMMDTYNLGWKLGLVLT-) SEQ > (QVAIIGAGPSGLLLGQLLHKAG-DNVILERQTP-TTVYQAAEVRLHD-RLRLDCDYIAGCDG- HGRLFLAGDA-KGLNLAASDVSTLYRLLLKAYR-) SEQ > (KVVVVGGGTGGATAAKYIKLAD-EVTLIEPNTD-IQVVHDSATGIDP-GAEFGYDRCVVAPG- HKGIHVIGDA-KSGYSANSQGKVAAAAVVVLLK-)

Nr of positions without INDELs : ( 91) ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----


6 OUTPUT

MSEQPRO will now start looking for aligned stretches of residues which do not have any insertions/deletions in any of the sequences. If such a fragment (exceeding the minimum required length) is found, it will be used to generate a part of the new profile.

For every successful stretch of residues that the program encounters, the output includes:

- information about the length of the stretch of residues

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 New conserved stretch !
 Length : (         22)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

- weights are calculated and printed. In the case of sequence distance weights ,this may take a little while since thousands of random sequences need to be generated, and statistics accumulated.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Calculating sequence distances ...
 Weights      : (   0.216    0.239    0.269    0.277)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

- for every residue, the amino acid in every sequence, and the profile matrix entries are listed

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 AA-TYPE :  ALA ARG ASN ASP CYS GLU GLN GLY HIS ILE LEU LYS MET PHE PRO SER THR TRP TYR VAL
 |RDQK|
 PROFILE :  -15  21   5  20 -30  10  24 -18  -3 -32 -25  23 -17 -30  -9  -5 -10 -27 -15 -25
 |VVVV|
 PROFILE :    0 -20 -30 -30 -10 -30 -30 -30 -30  30  10 -20  10   0 -30 -10   0 -30 -10  50
 |ILAV|
 PROFILE :    9 -22 -22 -29 -17 -20 -22 -24 -25  21  16 -22   9  -3 -22 -12  -5 -23  -8  23
 |VIIV|
 PROFILE :   -5 -25 -25 -35 -20 -25 -30 -35 -30  40  15 -25  15   0 -25 -15  -5 -25  -5  40
 [...]
|AYAA|
 PROFILE :   33 -18 -12 -20 -15 -10 -12  -7 -10  -8  -8 -10  -8  -8 -15   3  -2  -8   4  -2
 |GVGD|
 PROFILE :   -6 -17  -2   7 -25 -17 -11  24 -17 -23 -20 -14 -16 -26 -20  -2 -12 -28 -22 -11
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

- next, the program slides the profile along the entire random amino acid sequence and calculates statistics:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Random sequence tests :      1999979
 Average, St.dev.      :       -176.1        88.1
 Minimum, Maximum      :       -513.0       333.0
 Z-min, Z-max          :        -3.82        5.78

Mol # 1 Raw score = 707 Z-score = 10.02 Mol # 2 Raw score = 695 Z-score = 9.88 Mol # 3 Raw score = 688 Z-score = 9.81 Mol # 4 Raw score = 651 Z-score = 9.39 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----


7 RESULTS

When the program has finished, it will print a summary:

- the pairwise sequence identity matrix (in %), *ONLY* counting the residues that ended up being in the profile:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Nr of residues in profile : (         91)

Sequence identity for these residues only: % Seq id mol # 1 -> 100.0 19.8 22.0 19.8 % Seq id mol # 2 -> 19.8 100.0 29.7 15.4 % Seq id mol # 3 -> 22.0 29.7 100.0 18.7 % Seq id mol # 4 -> 19.8 15.4 18.7 100.0

Average sequence identity (%) : ( 20.879) St. dev. : ( 4.396) Minimum : ( 15.385) Maximum : ( 29.670) ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----

- some results pertaining to the random sequence

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Sum of maximum random scores : (        746)
 Sum AVE+3SIGMA random scores : (        243)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

- the accumulated raw scores of the input structures/sequences.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Score for molecule   1 =       1893
 Score for molecule   2 =       2001
 Score for molecule   3 =       2043
 Score for molecule   4 =       1829
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

- a suggestion is made for the minimum raw score to be used in searches against the (SWISS-PROT) sequence database (note that it is better to scan the whole sequence database to get realistic statistics)

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Minimum raw score : (       1500)
 Indel penalty : (    -150)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


8 PROFILE FILE

For the example above, the following profile file is generated:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
ID   MSEQPRO; MATRIX.
AC   PS99999;
DT   JAN-1900 (CREATED);
DE   Created by MSEQPRO V. 001122/1.3 at Wed Nov 22 21:24:58 2000 for gerard
CC
CC   Substitution matrix file : /home/gerard/lib/sbin_blosum45.lib
CC   Nr of sequences used : 4
CC   Min fragment length : 10
CC   Indel penalty : -150
CC   Weighting scheme : H
CC
MA   /GENERAL_SPEC: ALPHABET='ARNDCEQGHILKMFPSTWYV'; LENGTH= 91;
MA   TOPOLOGY=LINEAR;
MA   /DISJOINT: DEFINITION=PROTECT; N1=1; N2= 91;
MA   /CUT_OFF: LEVEL=0; SCORE= 1500;
MA   /DEFAULT: MI=    -150; IM=    -150; MD=    -150; DM=    -150;
MA   /M: SY='Q'; M=-15,21,5,20,-30,10,24,-18,-3,-32,-25,23,-17,-30,-9,-5,-10,-27,-15,-25;
MA   /M: SY='V'; M=0,-20,-30,-30,-10,-30,-30,-30,-30,30,10,-20,10,0,-30,-10,0,-30,-10,50;
MA   /M: SY='V'; M=9,-22,-22,-29,-17,-20,-22,-24,-25,21,16,-22,9,-3,-22,-12,-5,-23,-8,23;
MA   /M: SY='I'; M=-5,-25,-25,-35,-20,-25,-30,-35,-30,40,15,-25,15,0,-25,-15,-5,-25,-5,40;
MA   /M: SY='V'; M=-3,-23,-27,-33,-15,-27,-30,-33,-30,35,13,-23,13,0,-27,-13,-3,-27,-7,45;
MA   /M: SY='G'; M=0,-20,0,-10,-30,-20,-20,70,-20,-40,-30,-20,-20,-30,-20,0,-20,-20,-30,-30;
MA   /M: SY='A'; M=36,-20,-7,-17,-16,-13,-13,19,-20,-18,-16,-13,-13,-23,-13,7,-6,-20,-23,-8;
MA   /M: SY='G'; M=0,-20,0,-10,-30,-20,-20,70,-20,-40,-30,-20,-20,-30,-20,0,-20,-20,-30,-30;
MA   /M: SY='P'; M=-7,-15,-14,-14,-27,-8,-7,-20,-16,-9,-14,-10,0,-18,39,-4,7,-28,-18,-13;
MA   /M: SY='S'; M=17,-15,2,-8,-16,-8,-8,19,-15,-23,-25,-13,-18,-23,-13,22,4,-30,-23,-13;
MA   /M: SY='G'; M=0,-20,0,-10,-30,-20,-20,70,-20,-40,-30,-20,-20,-30,-20,0,-20,-20,-30,-30;
MA   /M: SY='L'; M=7,-22,-22,-29,-19,-17,-19,-24,-22,18,27,-24,12,0,-22,-17,-7,-20,-6,12;
MA   /M: SY='M'; M=-3,-13,-11,-18,-15,-8,-13,-18,-13,3,9,-15,13,-4,-18,1,13,-27,-7,3;
MA   /M: SY='A'; M=34,-20,-15,-23,-13,-13,-13,-8,-20,-2,6,-15,-2,-12,-15,-1,-3,-20,-15,3;
MA   /M: SY='A'; M=37,-20,-7,-17,-15,-13,-13,19,-20,-18,-15,-13,-13,-23,-13,7,-5,-20,-23,-8;
MA   /M: SY='K'; M=-12,31,0,3,-30,13,21,-20,-5,-30,-25,34,-13,-28,-10,-7,-10,-23,-13,-23;
MA   /M: SY='Y'; M=-13,2,-21,-23,-23,-13,-18,-28,-7,6,12,-9,6,7,-28,-18,-8,-9,18,8;
MA   /M: SY='L'; M=-10,-23,-27,-33,-23,-20,-23,-33,-23,28,42,-30,20,7,-27,-27,-10,-20,0,16;
MA   /M: SY='H'; M=-4,4,7,0,-21,5,3,-11,20,-25,-27,7,-12,-23,-13,13,1,-32,-6,-18;
MA   /M: SY='E'; M=-10,7,-8,-8,-27,24,6,-23,-4,-12,-3,10,3,-23,-16,-11,-10,-20,-7,-16;
MA   /M: SY='A'; M=33,-18,-12,-20,-15,-10,-12,-7,-10,-8,-8,-10,-8,-8,-15,3,-2,-8,4,-2;
MA   /M: SY='G'; M=-6,-17,-2,7,-25,-17,-11,24,-17,-23,-20,-14,-16,-26,-20,-2,-12,-28,-22,-11;
MA     /I: MI=0; MD=0; IM=0; DM=0; /M: SY='X'; M=0;
MA   /M: SY='D'; M=-15,6,9,33,-30,18,17,-15,0,-32,-27,16,-17,-37,-10,-3,-10,-29,-15,-27;
MA   /M: SY='V'; M=-5,-16,-10,-19,-15,-21,-21,-23,-19,16,11,-18,6,-2,-28,-10,-2,-30,-10,22;
MA   /M: SY='T'; M=-8,7,-14,-19,-18,-11,-14,-25,-17,1,6,-6,2,-6,-22,-7,8,-25,-8,8;
MA   /M: SY='I'; M=-10,-27,-23,-37,-27,-20,-27,-37,-27,42,28,-30,20,3,-23,-23,-10,-20,0,25;
MA   /M: SY='I'; M=-10,-25,-25,-35,-25,-20,-25,-35,-25,36,34,-30,20,5,-25,-25,-10,-20,0,21;
MA   /M: SY='E'; M=-13,5,6,19,-30,44,20,-17,7,-26,-23,7,-8,-40,-10,0,-10,-26,-13,-30;
MA   /M: SY='K'; M=2,14,-8,-10,-28,0,0,-15,-13,-23,-23,16,-13,-25,14,-5,-8,-23,-18,-18;
MA   /M: SY='R'; M=-10,17,16,5,-23,5,11,-15,-2,-23,-20,8,-15,-20,-13,5,7,-30,-15,-20;
MA   /M: SY='T'; M=-2,-10,8,12,-15,-5,0,-12,-12,-20,-20,-8,-18,-20,-10,21,27,-35,-15,-10;
MA   /M: SY='H'; M=-12,-10,3,13,-27,-3,2,-17,14,-25,-22,-7,-15,-25,10,1,4,-33,-10,-22;
MA     /I: MI=0; MD=0; IM=0; DM=0; /M: SY='X'; M=0;
MA   /M: SY='R'; M=-10,16,-5,-14,-25,-2,-7,-25,-15,-6,-11,11,-3,-15,-15,-5,5,-23,-8,-3;
MA   /M: SY='V'; M=-5,-13,-15,-13,-17,-11,-1,-25,-18,3,8,-13,0,-7,-18,-5,8,-28,-10,9;
MA   /M: SY='V'; M=-5,-7,-15,-10,-20,6,6,-25,-12,2,-5,-5,0,-18,-17,-5,-5,-27,-13,9;
MA   /M: SY='Y'; M=-13,6,-20,-22,-23,-12,-17,-27,-7,4,10,-7,5,5,-27,-18,-8,-10,15,7;
MA   /M: SY='H'; M=-12,-5,12,7,-30,5,15,-15,21,-25,-25,-3,-15,-25,14,-2,-10,-33,-13,-30;
MA   /M: SY='D'; M=3,-5,-6,4,-22,-5,-1,-15,-13,-14,-4,2,-7,-19,-15,-8,-8,-25,-12,-9;
MA   /M: SY='V'; M=12,-20,-13,-23,-15,-15,-18,-18,-23,13,-2,-18,1,-10,-18,4,2,-27,-12,18;
MA   /M: SY='P'; M=7,-13,-15,-15,-23,2,-5,-18,-15,-5,-13,-8,-5,-23,11,-3,-5,-25,-18,-3;
MA   /M: SY='R'; M=-8,13,-8,-12,-20,8,-5,-23,-10,-8,-10,3,-2,-18,-18,0,7,-25,-10,0;
MA   /M: SY='R'; M=-10,23,0,-5,-30,15,3,1,-5,-30,-25,18,-10,-30,-15,-5,-12,-20,-15,-25;
MA   /M: SY='I'; M=-10,-22,-23,-35,-25,-15,-25,-32,-20,35,28,-25,31,3,-23,-23,-10,-20,0,20;
MA   /M: SY='H'; M=-15,8,7,17,-30,21,13,-18,25,-30,-25,13,-10,-33,-13,-5,-13,-27,-5,-28;
MA   /M: SY='Y'; M=-15,-18,-10,0,-32,-10,-8,-25,-8,-2,-9,-13,-7,-10,6,-13,-10,-15,8,-9;
MA     /I: MI=0; MD=0; IM=0; DM=0; /M: SY='X'; M=0;
MA   /M: SY='N'; M=-9,3,11,-9,-27,-8,-13,6,-10,-12,-16,-6,-9,-18,-20,-4,-10,-25,-16,-14;
MA   /M: SY='A'; M=12,-10,-7,-12,-17,7,-3,-11,-10,-8,-4,-10,-4,-18,-14,7,1,-25,-13,-7;
MA   /M: SY='R'; M=-7,11,-8,-13,-20,9,-5,-23,-10,-6,-10,2,-2,-18,-18,0,7,-25,-10,1;
MA   /M: SY='F'; M=-13,-17,-25,-30,-20,-25,-25,-30,-12,11,16,-22,7,33,-30,-20,-8,-1,27,11;
MA   /M: SY='H'; M=-6,-10,9,13,-25,-3,-1,13,15,-32,-28,-11,-18,-27,-15,8,-7,-32,-14,-25;
MA   /M: SY='C'; M=3,-22,-17,-25,43,-19,-22,-22,-13,-16,-12,-19,-12,-6,-29,-7,-7,-20,3,-7;
MA   /M: SY='D'; M=-18,-1,15,54,-30,2,18,-12,-2,-38,-30,12,-25,-38,-10,-2,-10,-35,-18,-28;
MA   /M: SY='Y'; M=-20,12,-14,-17,-30,-4,-14,-27,14,-8,-6,1,-3,16,-27,-17,-10,16,55,-13;
MA   /M: SY='V'; M=-5,-25,-25,-32,22,-28,-30,-32,-30,18,4,-25,4,-6,-31,-12,-5,-33,-13,29;
MA   /M: SY='V'; M=6,-20,-21,-30,-17,-15,-23,-23,-20,23,11,-17,21,-4,-21,-11,-5,-23,-7,24;
MA   /M: SY='G'; M=0,-20,-16,-21,-19,-25,-25,15,-25,-2,-8,-20,-4,-14,-25,-5,-9,-25,-19,14;
MA   /M: SY='C'; M=12,-22,-9,-19,49,-16,-16,-14,-22,-22,-20,-19,-17,-20,-24,9,1,-39,-25,-7;
MA   /M: SY='D'; M=2,-15,1,24,-27,-5,6,-10,-11,-26,-25,-5,-22,-32,18,0,-7,-32,-23,-22;
MA   /M: SY='G'; M=3,-17,3,-7,-25,-15,-15,51,-17,-35,-30,-17,-20,-27,-17,11,-9,-25,-27,-25;
MA     /I: MI=0; MD=0; IM=0; DM=0; /M: SY='X'; M=0;
MA   /M: SY='H'; M=-14,-8,0,6,-24,-4,-5,-21,39,-14,-13,-11,-3,-18,-21,-8,-12,-32,2,-6;
MA   /M: SY='G'; M=-5,1,0,-5,-30,6,-3,25,-11,-33,-28,6,-13,-32,-15,-3,-15,-20,-20,-27;
MA   /M: SY='R'; M=-14,45,0,-10,-30,1,-6,6,-6,-33,-23,16,-13,-23,-20,-7,-13,-20,-16,-23;
MA   /M: SY='V'; M=-5,-23,-27,-33,-18,-25,-28,-33,-28,34,21,-25,15,2,-27,-17,-5,-25,-5,36;
MA   /M: SY='Y'; M=-20,-11,-11,-23,-26,-17,-19,-27,26,-9,-2,-18,0,37,-27,-17,-13,5,42,-12;
MA   /M: SY='F'; M=-10,-22,-25,-35,-19,-29,-28,-32,-25,23,20,-27,11,26,-28,-19,-7,-14,6,23;
MA   /M: SY='A'; M=18,-20,-10,-23,-16,-13,-16,-17,-23,7,-1,-16,-1,-11,-13,4,12,-23,-11,9;
MA   /M: SY='G'; M=0,-20,0,-10,-30,-20,-20,70,-20,-40,-30,-20,-20,-30,-20,0,-20,-20,-30,-30;
MA   /M: SY='D'; M=-17,-4,14,49,-30,18,20,-13,3,-34,-27,3,-21,-40,-10,0,-10,-34,-17,-30;
MA   /M: SY='A'; M=29,-14,-4,-14,-16,-4,-7,-6,16,-16,-13,-10,-7,-20,-13,4,-6,-23,-8,-9;
MA     /I: MI=0; MD=0; IM=0; DM=0; /M: SY='X'; M=0;
MA   /M: SY='K'; M=-7,9,0,2,-30,5,15,4,-10,-33,-27,22,-15,-30,-10,-5,-13,-23,-18,-25;
MA   /M: SY='G'; M=-3,-15,-3,-10,-25,-12,-15,26,-7,-24,-22,-15,-15,-12,-20,5,-7,-12,1,-20;
MA   /M: SY='M'; M=-5,-17,-20,-25,-20,-18,-23,-2,-18,7,12,-20,17,-5,-25,-15,-10,-23,-10,10;
MA   /M: SY='H'; M=-15,-3,27,5,-25,0,-5,-13,36,-18,-20,-5,-10,-7,-23,-3,-8,-20,16,-25;
MA   /M: SY='T'; M=0,-15,-4,-12,-18,-12,-12,6,-17,-13,-6,-17,-8,-13,-17,8,10,-28,-15,-8;
MA   /M: SY='A'; M=40,-17,-5,-15,-10,-7,-7,0,-17,-13,-15,-10,-13,-20,-10,18,5,-25,-20,-3;
MA   /M: SY='Y'; M=1,-10,3,-12,-20,-5,-13,-13,3,-2,-5,-7,8,-2,-20,-5,-5,-12,11,-8;
MA   /M: SY='M'; M=0,-13,-8,-15,-15,-5,-10,-13,-10,1,4,-15,11,-7,-18,6,5,-30,-10,0;
MA   /M: SY='D'; M=-10,-7,12,39,-25,5,25,-10,-3,-32,-27,0,-25,-32,-7,10,-2,-37,-20,-25;
MA   /M: SY='G'; M=0,-17,-7,-15,-20,-20,-20,24,-22,-16,-16,-17,-11,-18,-20,3,2,-25,-20,-4;
MA   /M: SY='Y'; M=-8,-5,-8,-16,-25,-5,-10,-23,-8,1,-9,0,-2,-5,-18,-4,-3,-12,13,-2;
MA   /M: SY='N'; M=-8,-10,13,13,-18,-10,-5,-15,-10,-10,-15,-7,-13,-18,-18,5,9,-35,-15,-3;
MA   /M: SY='L'; M=10,-17,-14,-20,-15,-12,-12,-15,-17,2,14,-20,2,-5,-20,-2,0,-25,-10,2;
MA   /M: SY='A'; M=21,-18,-10,-17,-20,-13,-15,11,-11,-15,-13,-13,-10,-11,-17,1,-7,-8,1,-10;
MA   /M: SY='W'; M=0,9,-13,-18,-30,10,-5,-15,-10,-20,-17,2,-10,-18,-17,-10,-13,23,-2,-20;
MA   /M: SY='I'; M=5,-10,-15,-22,-23,-10,-12,-22,-20,8,7,-5,5,-10,-17,-12,-7,-20,-8,5;
MA   /M: SY='L'; M=-7,-20,-30,-30,-17,-23,-23,-30,-23,23,40,-27,17,7,-30,-25,-7,-23,-3,20;
MA   /M: SY='I'; M=-5,-23,-20,-28,-23,-23,-25,-7,-25,15,12,-25,7,-5,-25,-15,-10,-23,-10,15;
MA   /M: SY='V'; M=-7,-3,0,-10,-20,-10,-10,-20,-12,1,0,-1,0,-10,-23,-10,-5,-28,-10,3;
MA   /M: SY='C'; M=6,-23,-23,-28,21,-23,-23,-23,-25,3,8,-23,0,-7,-28,-10,-5,-30,-15,13;
MA   /M: SY='L'; M=3,-18,-22,-25,-20,-15,-17,-22,-11,8,23,-20,8,7,-25,-17,-7,-8,13,3;
MA   /M: SY='R'; M=-10,21,0,0,-25,8,16,-20,-8,-25,-20,20,-13,-23,-10,0,5,-25,-13,-18;
//
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

Note that insertions and deletions in the uninterrupted stretches are severely penalised (since there were none in the set of aligned sequences !), whereas they may occur anywhere in between such stretches (since they occur in at least one of the input sequences !).


9 KNOWN BUGS

None, at present ("peppar, peppar").


10 UNKNOWN BUGS

Does not compute.


Uppsala Software Factory Created at Fri Jan 14 20:12:43 2005 by MAN2HTML version 050114/2.0.6 . This manual describes MSEQPRO, a program of the Uppsala Software Factory (USF), written and maintained by Gerard Kleywegt. © 1992-2005.