Uppsala Software Factory - PRF2MSEQ Manual


1 PRF2MSEQ - GENERAL INFORMATION

Program : PRF2MSEQ
Version : 971111
Author : Gerard J. Kleywegt, Dept. of Cell and Molecular Biology, Uppsala University, Biomedical Centre, Box 596, SE-751 24 Uppsala, SWEDEN
E-mail : gerard@xray.bmc.uu.se
Purpose : convert profile scan results into multiple sequence alignment
Package : SBIN


2 REFERENCES

Reference(s) for this program:

* 1 * Kleywegt, G.J. & Jones, T.A. (1998). Databases in protein crystallography. Acta Cryst. D54, 1119-1131. [http://xray.bmc.uu.se/gerard/papers/databases.html] [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Retrieve&list_uids=10089488&dopt=Citation] [http://scripts.iucr.org/cgi-bin/paper?ba0001]

* 2 * Kleywegt, G.J., Zou, J.Y., Kjeldgaard, M. & Jones, T.A. (2001). Around O. In: "International Tables for Crystallography, Vol. F. Crystallography of Biological Macromolecules" (Rossmann, M.G. & Arnold, E., Editors). Chapter 17.1, pp. 353-356, 366-367. Dordrecht: Kluwer Academic Publishers, The Netherlands.


3 VERSION HISTORY

971023 - 0.1 - first version
971111 - 1.0 - cleaned up code and manual


4 DESCRIPTION

PRF2MSEQ is a simple non-interactive program that reads a list of profile/sequence matches (calculated with the pftools program "pfsearch", using the "-y" command-line option) and converts it into a partial multiple sequence alignment of the matching sequences. The output is written in a format suitable for MSEQPRO, which can be used to generate a new profile from these aligned sequences.

Usage: PRF2MSEQ < scan_results.file > multi_sequence.file

Typical example:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
pfsearch -ry pfx.prf /home/gerard/lib/sprot.dat > pfx.hits
PRF2MSEQ < pfx.hits > pfx.mseq
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

Example of part of an input file:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
  1131 pos.     28 -   250 P35694|BRU1_SOYBN BRASSINOSTEROID-REGULATED PROTEIN BRU1.
#
# P       4 CSXDNWTVADXNGVTTS----------------XSLQLGQITQ-XNVCARYYYMNXYKMF    -164
# S      28 CA--GSFYQD-----FDltwggdrakifnggqlLSLSLDKVSGsGFKSKKEYLFG-----    -209
#
# P      47 HLWXGLYSFDVDPAEQP------XGLNGSFFMGPM-XCCDEMDIEFDNXPHIALNPHXCD    -111
# S      76 RID----------MQLKlvagnsAGTVTAYYLSSQgPTHDEIDFEFLG----NLSGD---    -166
#
# P     100 SGGCEWNPY--XTGPFS-------------XLDTSKFHTVVFQWDPSXKITRYYQ----X     -70
# S     119 -------PYilHTNIFTqgkgnreqqfylwFDPTRNFHTYSIIWKPQ-HIIFLVDntpiR    -113
#
# P     141 TFPQAXNTLTAXGLANMPKAPXSWMDIMMSLWNGTXFSNPWLD-----------------     -27
# S     171 VFKNA-EPLGV----PFPKNQ--PMRIYSSLWNAD----DWATrgglvktdwskapftay     -65
#
# P     183 -----------------XGAPNDAEXNDAPNTHVVYS      -7
# S     220 yrnfkaiefsskssisnSGAEYEAN------ELDAYS     -34
#
  1231 pos.     64 -   268 P33693|EXOK_RHIME SUCCINOGLYCAN BIOSYNTHESIS PROTEIN EXOK.
#
# P       4 CSXDNWTVADXNGVTTS-XSLQLGQITQ-----XNVCARYYYMNXYKMFHLWXGLYSFDV    -153
# S      64 CT---WSKKQ---VKTVdGILELTFEEKkvkerNFACGEIQTRK---RFGYG--TYEARI    -158
#
# P      58 DPAEQPXGLNGSFFMGPM----XCCDEMDIEFDN--XPHIALNPHXCDSGGCEWNPYXTG     -99
# S     113 KAADGS-GLNSAFFTYIGpadkKPHDEIDFEVLGknTAKVQINQYVSAKGGNEFLAD--V    -101
#
# P     112 PFSXLDTSKFHTVVFQWDPSXKITRYYQXTFPQA--XNTLTAXGLANMPKAPXSWMDIMM     -41
# S     170 PGG--ANQGFNDYAFVWEKN-RIRYYVN-----GelVHEVTD--PAKIPVNA---QKIFF     -54
#
# P     170 SLWNGTXFSNPWLD-------------------XGAPNDAE--------XNDA     -15
# S     217 SLWGTD-TLTDWMGtfsykeptklqvdrvaftaAGDECQFAesvacqleRAQS      -2
#
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
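
The layout of the input shown above can be read with a few lines of code. The following is a minimal Python sketch and not part of PRF2MSEQ. The names "Hit" and "parse_hits" are hypothetical, and the meaning of the fields (leading number as the match score, the "pos." range as residue numbers in the matched sequence, "P" lines as profile and "S" lines as sequence) is inferred from the example, not taken from the pfsearch documentation.

```python
import re
from dataclasses import dataclass

# Hit header, e.g. "  1131 pos.     28 -   250 P35694|BRU1_SOYBN BRASSINO..."
HEADER = re.compile(r"^\s*(-?\d+)\s+pos\.\s+(\d+)\s+-\s+(\d+)\s+(\S+)\s*(.*)$")
# Alignment line, e.g. "# P       4 CSXDNW...YKMF    -164"
ALIGN = re.compile(r"^#\s+([PS])\s+(\d+)\s+(\S+)\s+(-?\d+)\s*$")


@dataclass
class Hit:
    score: int             # assumed: first number on the header line
    start: int             # first matched residue in the sequence
    end: int               # last matched residue in the sequence
    ident: str             # e.g. "P35694|BRU1_SOYBN"
    description: str
    profile_start: int = 0  # number on the first "P" line of the hit
    profile: str = ""       # concatenated "P" alignment strings
    sequence: str = ""      # concatenated "S" alignment strings


def parse_hits(lines):
    """Collect hits and their concatenated P/S alignment strings."""
    hits = []
    for line in lines:
        m = HEADER.match(line)
        if m:
            hits.append(Hit(int(m.group(1)), int(m.group(2)), int(m.group(3)),
                            m.group(4), m.group(5).strip()))
            continue
        m = ALIGN.match(line)
        if m and hits:
            hit = hits[-1]
            if m.group(1) == "P":
                if not hit.profile:
                    hit.profile_start = int(m.group(2))
                hit.profile += m.group(3)
            else:
                hit.sequence += m.group(3)
        # Bare "#" separator lines and anything else are ignored.
    return hits
```

Called on the lines of a file such as pfx.hits, "parse_hits" would return one "Hit" per matched sequence, with the alignment blocks of that hit joined into two strings of equal length.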
   

Example of part of an output file:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
!   1131 pos.     28 -   250 P35694|BRU1_SOYBN BRASSINOSTEROID-REGULATED PROTEIN BRU1.
   CA  GSFYQD     FD SLSLDKVSG FKSKKEYLFG     RID          MQLK GTVTAYYLSSQ THDEIDFEFLG    NLSGD          PY TNIFT DPTRNFHTYSIIWKPQ HIIFLVD VFKNA EPLGV    PFPKNQ  PMRIYSSLWNAD    DWAT GAEYEAN      ELDAYS
!   1231 pos.     64 -   268 P33693|EXOK_RHIME SUCCINOGLYCAN BIOSYNTHESIS PROTEIN EXOK.
   CT   WSKKQ   VKTV ILELTFEEK FACGEIQTRK   RFGYG  TYEARIKAADGS GLNSAFFTYIG PHDEIDFEVLG AKVQINQY SAKGGNEFLAD  VPGG  ANQGFNDYAFVWEKN RIRYYVN     G HEVTD  PAKIPVNA   QKIFFSLWGTD TLTDWMG GDECQFA AQS
!   2317 pos.     40 -   236 P07980|GUB_BACAM BETA-GLUCANASE PRECURSOR (EC 3.2.1.73) (ENDO-BETA-1,3-1,4 GLUCANASE) (1,3-1,4-BETA-D-GLU
    S DGYSNGD NNVSMT EMRLALTSP FDCGENRSVQ   TYGYG  LYEVRMKPAKNT GIVSSFFTYTG PWDEIDIEFLG TKVQFNYY NGAGNHEKFAD    LG DAANAYHTYAFDWQPN SIKWYVD     G HTATT    QIPAAP   GKIMMNLWNGT GVDDWLG  SYNGVN     PIYAHYDWMRY
!   1971 pos.     41 -   251 P37073|GUB_BACBR BETA-GLUCANASE PRECURSOR (EC 3.2.1.73) (ENDO-BETA-1,3-1,4 GLUCANASE) (1,3-1,4-BETA-D-GLU
 FYES      FD AGVWTN RLTIAKKTT   SARNYKAG NDFYHYG  LFEVSMKPAKVE GTVSSFFTYTG PWDEIDIEFLG TRIQFNYF NGVGGNEFYYD    LG DASESFNTYAFEWRED SITWYVN   GEA HTATE    NIPQTP   QKIMMNLWPGV GVDGWTG      VF GDNTPVYSYYDWVRYTP
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


5 KNOWN BUGS

None, at present ("peppar, peppar", Swedish for "touch wood").


6 UNKNOWN BUGS

Does not compute.


Created at Fri Jan 14 20:12:40 2005 by MAN2HTML version 050114/2.0.6. This manual describes PRF2MSEQ, a program of the Uppsala Software Factory (USF), written and maintained by Gerard Kleywegt. © 1992-2005.