UDELPTR1CPK0002P19 Length = 493 Frame = +1
The hit would extend before query residue 1, but it is either a different gene or bad sequence (note the X residues in the Sbjct). In any case there is no homology before residue 19, so the EST is either different or truncated. Still need to find the start.
Query: 19 LDNGLRVASEQSSQPTCTVGVWIDAGSRYESEKNNGAGYFV 59 L+NGLRVAS CTVG+ I++GSR E++ +G +F+ Sbjct: 37 LENGLRVASXNKFGQFCTVGLLINSGSRXEAKYLSGIAHFL 159
FM_ROS005D10-T3-1 REFORMAT of: EST0548-T3-1.seq Length = 643 Frame = +1
Query: 19 LDNGLRVASEQSSQPTCTVGVWIDAGSRYESEKNNGAGYFVEHLAFKGTKNRPGNALEKE 78 LDNGLRVASE+SSQPTCTVGVWI AGSRYE+EKNNGAGYFVEHLAFKGTK RP A EKE Sbjct: 7 LDNGLRVASEESSQPTCTVGVWIGAGSRYENEKNNGAGYFVEHLAFKGTKKRPCAAFEKE 186
Query: 79 VESMGAHLNAYSTREHTAYYIKALSKDLPKAVELLADIVQNCSLEDSQIEKERDVILQEL 138 VESMGAH N Y++RE TA+YIKALSKD+PK VELLAD+VQNC+LE+SQIEKER VILQEL Sbjct: 187 VESMGAHFNGYTSREQTAFYIKALSKDMPKVVELLADVVQNCALEESQIEKERGVILQEL 366
Query: 139 QENDTSMRDVVFNYLHATAFQGTPLAQSVEGPSENVRKLSRADLTEYLSRHYKAPRMVLA 198 +E D M +V F+YLHATAFQGT LA++VEG +EN++ L+RADL Y+ H+KAPRMVLA Sbjct: 367 KEMDNDMTNVTFDYLHATAFQGTALARTVEGTTENIKHLTRADLASYIDTHFKAPRMVLA 546
Query: 199 AAGGLEHRQLLDLAQKHFSGLSGTY 223
DKFZ426_26E8R1 REFORMAT of: dkfz426_26e8r1.dat Length = 776 Frame = +3
Query: 126 QIEKERDVILQELQENDTSMRDVVFNYLHATAFQGTPLAQSVEGPSENVRKLSRADLTEY 185 QIEKER VILQEL+E D M +V F+YLHATAFQGT LA++VEG +EN++ L+RADL Y Sbjct: 3 QIEKERGVILQELKEMDNDMTNVTFDYLHATAFQGTALARTVEGTTENIKHLTRADLASY 182
Query: 186 LSRHYKAPRMVLAAAGGLEHRQLLDLAQKHFSGLSGTYDEDAVPTLSPCRFTGSQICHRE 245 + H+KAPRMVLAAAGG+ H++L+D A++HFSG+S TY EDAVP L CRFTGS+I R+ Sbjct: 183 IDTHFKAPRMVLAAAGGISHKELVDAARQHFSGVSFTYKEDAVPILPRCRFTGSEIRARD 362
Query: 246 DGLPLAHVAIAVEGPGWAHPDNVALQVANAIIGHYDCTYGGGAHLSSPLASIAATNKLCQ 305 D LP+AHVA+AVEGPGWA PDNV L VANAIIG YD T+GGG HLSS LA++A +KLC Sbjct: 363 DALPVAHVALAVEGPGWADPDNVVLHVANAIIGRYDRTFGGGKHLSSRLAALAVEHKLCH 542
Query: 306 SFQTFNICYADTGLLGAHFVCDHMSIDDMMFVLQGQWMRLCTSATESEVLRGKNLLRNAL 365 SFQTFN Y+DTGL G HFV D +SIDDMM QG+WMRLCTS TESEV R KN LR+A+ Sbjct: 543 SFQTFNTSYSDTGLFGFHFVADPLSIDDMMXCAQGEWMRLCTSTTESEVKRAKNHLRSAM 722
Query: 366 VSHLDGTTPVCEDIGR 381 V+ LDGTTPVCE IG+ Sbjct: 723 VAQLDGTTPVCETIGK 770
FM_ROS052D01-T3-1 REFORMAT of: EST4796-T3-1.seq Length = 494 Frame = -3
Query: 342 WMRLCTSATESEVLRGKNLLRNALVSHLDGTTPVCEDIGRSLLTYGRRIPLAEWESRIAE 401 WMRLCTS TESEV R KN LR+A+V+ LDGTTPVCE IG LL YGRRI L EW+SRI+ Sbjct: 492 WMRLCTSTTESEVKRAKNHLRSAMVAQLDGTTPVCETIGSHLLNYGRRISLEEWDSRISA 313
Query: 402 VDARVVREVCSKYFYDQCPAVAGFGPIEQLPDYNRIRSGMFWLRF 446 VDAR+VR+VCSKY YD+CPA+A GPIEQL DYNRIRSGM+W+RF Sbjct: 312 VDARMVRDVCSKYIYDKCPALAAVGPIEQLLDYNRIRSGMYWIRF 178
--
LDNGLRVASEESSQPTCTVGVWIGAGSRYENEKNNGAGYFVEHLAFKGTKKRPCAAFEKE
QIEKERGVILQEL
KEMDNDMTNVTFDYLHATAFQGTALARTVEGTTENIKHLTRADLASYIDTHFKAPRMVLA
KEMDNDMTNVTFDYLHATAFQGTALARTVEGTTENIKHLTRADLASYIDTHFKAPRMVLA
AAGGISHKELVDAARQHFSGVSFTY
AAGGISHKELVDAARQHFSGVSFTY KEDAVPILPRCRFTGSEIRARDDALPVAHVALAVE
GPGWADPDNVVLHVANAIIGRYDRTFGGGKHLSSRLAALAVEHKLCHSFQTFNTSYSDTG
LFGFHFVADPLSIDDMMXCAQGEWMRLCTSTTESEVKRAKNHLRSAMVAQLDGTTPVCET
WMRLCTSTTESEVKRAKNHLRSAMVAQLDGTTPVCET
IG K
IG SHLLNYGRRISLEEWDSRISAVDARMVRDVCSKYIYDKCPALAAVGPIEQLLDYNRIR
SGMYWIRF
--
LDNGLRVASEESSQPTCTVGVWIGAGSRYENEKNNGAGYFVEHLAFKGTKKRPCAAFEKE
VESMGAHFNGYTSREQTAFYIKALSKDMPKVVELLADVVQNCALEESQIEKERGVILQEL
KEMDNDMTNVTFDYLHATAFQGTALARTVEGTTENIKHLTRADLASYIDTHFKAPRMVLA
AAGGISHKELVDAARQHFSGVSFTYKEDAVPILPRCRFTGSEIRARDDALPVAHVALAVE
GPGWADPDNVVLHVANAIIGRYDRTFGGGKHLSSRLAALAVEHKLCHSFQTFNTSYSDTG
LFGFHFVADPLSIDDMMXCAQGEWMRLCTSTTESEVKRAKNHLRSAMVAQLDGTTPVCET
IGSHLLNYGRRISLEEWDSRISAVDARMVRDVCSKYIYDKCPALAAVGPIEQLLDYNRIR
SGMYWIRF
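The assembled sequence above is built by joining the EST translations where they overlap. A minimal Python sketch of that joining step, assuming exact suffix/prefix overlaps; `merge_overlap` and `min_overlap` are hypothetical names, and the two fragments are short excerpts of the translations listed above, not the full ESTs:

```python
def merge_overlap(left, right, min_overlap=5):
    """Join two peptide fragments on their longest exact suffix/prefix overlap."""
    longest = min(len(left), len(right))
    # Try the longest possible overlap first and shrink until one fits.
    for k in range(longest, min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    raise ValueError("no exact overlap of at least %d residues" % min_overlap)


# Excerpts only (assumption): tail of the FM_ROS005D10-T3-1 translation and
# a stretch of the DKFZ426_26E8R1 translation that overlaps it.
est1_tail = "HFKAPRMVLAAAGGISHKELVDAARQHFSGVSFTY"
est2_part = "AAGGISHKELVDAARQHFSGVSFTYKEDAVPILPRCRFTGSEIRARD"

merged = merge_overlap(est1_tail, est2_part)
print(merged)
```

Exact matching like this would not join DKFZ426_26E8R1 to FM_ROS052D01-T3-1 as they stand, because the two disagree at the residue after CETIG (K versus S); the trailing K has to be dropped first, as in the layout above.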
subunit 2:
Only two of the hits seem to be core II; they cover residues 126-273 and 402-439 (end)
of the bovine sequence.
Need to check those ESTs to see whether more exist but with low homology.
No. For 19R1 we are using the entire EST except 1 base on either end;
for M13F, start at the end (300) and work back to the stop codon,
which is exactly where the bovine sequence terminates.
DKFZ426_1L19R1 REFORMAT of: dkfz426_1l19r1.dat Length = 446 Frame = +2 (highest homology, pretty sure this is core 2)
Query: 126 VTTAPEFRRWEVAALQPQLRIDKAVALQNPQAHVIENLHAAAYRNALANSLYCPDYRIGK 185 VTTAPEFR WEV LQPQL++DKAVA Q+PQ V+ENLHAAAY+ ALAN LYCPDYRIGK Sbjct: 2 VTTAPEFRPWEVTDLQPQLKVDKAVAFQSPQVGVLENLHAAAYKTALANPLYCPDYRIGK 181
Query: 186 VTPVELHDYVQNHFTSARMALIGLGVSHPVLKQVAEQFLNIRGGLGLSGAKAKYHGGEIR 245 +T +LH +VQN+FTSARMAL+G+GV H LKQVAEQFLNIR G G S AKA Y GGEIR Sbjct: 182 ITSEQLHHFVQNNFTSARMALVGIGVKHSDLKQVAEQFLNIRSGAGTSSAKATYWGGEIR 361
Query: 246 EQNGDSLVHAALVAESAAIGSAEANAFS 273 EQNG SLVHAA+V E AA+GSAEANAFS Sbjct: 362 EQNGHSLVHAAVVTEGAAVGSAEANAFS 445
Sbjct: VTTAPEFRPWEVTDLQPQLKVDKAVAFQSPQVGVLENLHAAAYKTALANPLYCPDYRIGK ITSEQLHHFVQNNFTSARMALVGIGVKHSDLKQVAEQFLNIRSGAGTSSAKATYWGGEIR EQNGHSLVHAAVVTEGAAVGSAEANAFS
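As a rough check on "highest homology", identities can be counted straight from an aligned Query/Sbjct pair. A small Python sketch, assuming the two strings are already aligned column for column (gap columns, if any, count as non-identical); `percent_identity` is a hypothetical helper, and the example uses the last block of the DKFZ426_1L19R1 alignment (query 246-273):

```python
def percent_identity(query, sbjct):
    """Return (identities, alignment length, percent) for two aligned strings."""
    if len(query) != len(sbjct):
        raise ValueError("aligned strings must have the same length")
    same = sum(1 for q, s in zip(query, sbjct) if q == s and q != "-")
    return same, len(query), 100.0 * same / len(query)


query_246_273 = "EQNGDSLVHAALVAESAAIGSAEANAFS"
sbjct_362_445 = "EQNGHSLVHAAVVTEGAAVGSAEANAFS"

identities, length, pct = percent_identity(query_246_273, sbjct_362_445)
print("%d/%d identities (%.1f%%)" % (identities, length, pct))
```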
FM_ROS057B07-M13F-1 REFORMAT of: EST5322-M13F-1.seq check: 4084 from: 1 to: 300 Length = 300 Frame = -1
Query: 402 IDAVADADVINAAKKFVSGRKSMAASGNLGHTPFIDEL 439 ID+V ADV+NAAKKFVSG+KSMAASG+ G TPF+DEL Sbjct: 300 IDSVTSADVVNAAKKFVSGKKSMAASGDXGSTPFLDEL 187
(blast -> core II) Decoding the entire EST in the same frame gives:
IDSVTSADVVNAAKKFVSGKKSMAASGDxGSTPFLDEL0KQSRDADSEWD0LSVTRV0RNDKRVIYEFSYWRKLIWSN
CGLALIK0SAEFIKKKKK
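Decoding an EST in a given BLAST frame can be sketched as below. This assumes the standard genetic code, treats negative frames as reading the reverse complement, and writes stop codons as 0, which appears to be the convention in the decoded sequence above. `translate` is a hypothetical helper, and the DNA string in the example is made up for illustration, since the EST nucleotide sequence is not in these notes:

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {a + b + c: AMINO[16 * i + 4 * j + k]
          for i, a in enumerate(BASES)
          for j, b in enumerate(BASES)
          for k, c in enumerate(BASES)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna, frame, stop="0"):
    """Translate dna in frame +1..+3 or -1..-3; unknown codons become X."""
    dna = dna.upper()
    if frame < 0:
        dna = dna.translate(COMPLEMENT)[::-1]  # reverse complement
    start = abs(frame) - 1
    peptide = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODONS.get(dna[i:i + 3], "X")
        peptide.append(stop if aa == "*" else aa)
    return "".join(peptide)


demo = "ATGGCCTAA"  # hypothetical sequence, not from the EST
print(translate(demo, 1), translate(demo, 2), translate(demo, -1))
```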
(the rest are all core 1 or MPP)
FM_ROS005D10-T3-1 REFORMAT of: EST0548-T3-1.seq check: -1 ... 83 2e-16
Query: 27 TRLPNGLVIASLENYAPASRIGLFIKAGSRYENSNNLGTSHLLRLASSLTTKGASSFKIT 86 T L NGL +AS E+ P +G++I AGSRYEN N G + + + TK Sbjct: 1 TTLDNGLRVASEESSQPTCTVGVWIGAGSRYENEKNNGAGYFVEHLAFKGTKKRPCAAFE 180
Query: 87 RGIEAVGGKLSVTSTRENMAYTVECLRDDVDILMEFLLNVTTAPEFRRWEVAALQPQLRI 146 + +E++G + ++RE A+ ++ L D+ ++E L +V A + Q+ Sbjct: 181 KEVESMGAHFNGYTSREQTAFYIKALSKDMPKVVELLADVVQ-------NCALEESQIEK 339
Query: 147 DKAVALQ------NPQAHV-IENLHAAAYR-NALANSLYCPDYRIGKVTPVELHDYVQNH 198 ++ V LQ N +V + LHA A++ ALA ++ I +T +L Y+ H Sbjct: 340 ERGVILQELKEMDNDMTNVTFDYLHATAFQGTALARTVEGTTENIKHLTRADLASYIDTH 519
Query: 199 FTSARMALIGL-GVSHPVLKQVAEQ 222 F + RM L G+SH L A Q Sbjct: 520 FKAPRMVLAAAGGISHKELVDAARQ 594
sbjct (This is closest to core 1 proteins in blast):
TTLDNGLRVASEESSQPTCTVGVWIGAGSRYENEKNNGAGYFVEHLAFKGTKKRPCAAFE
KEVESMGAHFNGYTSREQTAFYIKALSKDMPKVVELLADVVQNCALEESQIEK
ERGVILQELKEMDNDMTNVTFDYLHATAFQGTALARTVEGTTENIKHLTRADLASYIDTH
FKAPRMVLAAAGGISHKELVDAARQ
--
DKFZ426_26E8R1 REFORMAT of: dkfz426_26e8r1.dat check: 155 ... 72 2e-13
DKFZ426_26E8R1 REFORMAT of: dkfz426_26e8r1.dat Length = 776 Frame = +3
Query: 143 QLRIDKAVALQ------NPQAHV-IENLHAAAYRN-ALANSLYCPDYRIGKVTPVELHDY 194 Q+ ++ V LQ N +V + LHA A++ ALA ++ I +T +L Y Sbjct: 3 QIEKERGVILQELKEMDNDMTNVTFDYLHATAFQGTALARTVEGTTENIKHLTRADLASY 182
Query: 195 VQNHFTSARMALIGLG-VSHPVLKQVAEQFLNIRGGLGLSGA-----KAKYHGGEIREQN 248 + HF + RM L G +SH L A Q + A + ++ G EIR ++ Sbjct: 183 IDTHFKAPRMVLAAAGGISHKELVDAARQHFSGVSFTYKEDAVPILPRCRFTGSEIRARD 362
Query: 249 GDSL--VHAALVAESAAIGSAEANAFSVLQHVLGAGPHVKRGSNATSSLYQAVAKGVHQP 306 D+L H AL E + V ++G G SS A+A Sbjct: 363 -DALPVAHVALAVEGPGWADPDNVVLHVANAIIGRYDRTFGGGKHLSSRLAALAVEHKLC 539
Query: 307 FDVSAFNASYSDSGLFGFYTISQAASAGDVIKAAYNQVKTIAQGNLSNPDVQAAKNKLKA 366 FN SYSD+GLFGF+ ++ S D++ A + + + + +V+ AKN L++ Sbjct: 540 HSFQTFNTSYSDTGLFGFHFVADPLSIDDMMXCAQGEWMRLCT-STTESEVKRAKNHLRS 716
Query: 367 GYLMSVESSEGFLDEVG 383 + ++ + + +G Sbjct: 717 AMVAQLDGTTPVCETIG 767
sbjct (this brings up mammalian core 1 proteins in blast):
QIEKERGVILQELKEMDNDMTNVTFDYLHATAFQGTALARTVEGTTENIKHLTRADLASY
IDTHFKAPRMVLAAAGGISHKELVDAARQHFSGVSFTYKEDAVPILPRCRFTGSEIRARD
DALPVAHVALAVEGPGWADPDNVVLHVANAIIGRYDRTFGGGKHLSSRLAALAVEHKLC
HSFQTFNTSYSDTGLFGFHFVADPLSIDDMMXCAQGEWMRLCTSTTESEVKRAKNHLRS
AMVAQLDGTTPVCETIG
DKFZ426_30L2R1 REFORMAT of: dkfz426_30l2r1.dat check: 5750 ... 59 2e-09
DKFZ426_30L2R1 REFORMAT of: dkfz426_30l2r1.dat Length = 607 Frame = +3
Query: 171 ALANSLYCPDYRIGKVTPVELHDYVQNHFTSARMALIGLG-VSHPVLKQVAEQFLNIRGG 229 ALA ++ I +T +L Y+ HF + RM L G +SH L A Q + Sbjct: 18 ALARTVEGTTENIKHLTRADLASYIDTHFKAPRMVLAAAGGISHKELVDAARQHFSGVSF 197
Query: 230 LGLSGA-----KAKYHGGEIREQNGDSL--VHAALVAESAAIGSAEANAFSVLQHVLGAG 282 A + ++ G EIR ++ D+L H AL E + V ++G Sbjct: 198 TYKEDAVPILPRCRFTGSEIRARD-DALPVAHVALAVEGPGWADPDNVVLHVANAIIGRY 374
Query: 283 PHVKRGSNATSSLYQAVAKGVHQPFDVSAFNASYSDSGLFGFYTISQAASAGDVIKAAYN 342 G SS A+A FN SYSD+GLFGF+ ++ S D++ A Sbjct: 375 DRTFGGGKHLSSRLAALAVEHKLCHSFQTFNTSYSDTGLFGFHFVADPLSIDDMMFCAQG 554
Query: 343 Q 343 + Sbjct: 555 E 557
Sbjct(blast -> core 1):
ALARTVEGTTENIKHLTRADLASYIDTHFKAPRMVLAAAGGISHKELVDAARQHFSGVSF
TYKEDAVPILPRCRFTGSEIRARD-DALPVAHVALAVEGPGWADPDNVVLHVANAIIGRY
DRTFGGGKHLSSRLAALAVEHKLCHSFQTFNTSYSDTGLFGFHFVADPLSIDDMMFCAQG
E
FM_ROS009A07-T3-1 REFORMAT of: EST905-T3-1.seq check: 6419 ... 47 7e-06
FM_ROS009A07-T3-1 REFORMAT of: EST905-T3-1.seq check: 6419 from: 1 to: 605 Length = 605 Frame = +3
Query: 274 VLQHVLGAGPHVKRGSNATSSLYQAVAKGVHQPFDVSAFNASYSDSGLFGFYTISQAASA 333 V ++G G SS A+A FN SYSD+GLFGF+ ++ S Sbjct: 126 VANAIIGRYDRTFGGGKHLSSRLAALAVEHKLCHSFQTFNTSYSDTGLFGFHFVADPLSI 305
Query: 334 GDVIKAAYNQVKTIAQGNLSNPDVQAAKNKLKAGYLMSVESSEGFLDEVGSQALAAGSYT 393 D++ A + + + + +V+ AKN L++ + ++ + + +GS L G Sbjct: 306 DDMMFCAQGEWMRLCT-STTESEVKRAKNHLRSAMVAQLDGTTPVCETIGSHLLNYGRRI 482
Query: 394 PPSTVLQQIDAVADADVINAAKKFVSGR-KSMAASG 428 +I AV V + K++ + ++AA G Sbjct: 483 SLEEWDSRISAVDARMVRDVCSKYIYDKCPALAAVG 590
Sbjct(blast ->Core 1):
VANAIIGRYDRTFGGGKHLSSRLAALAVEHKLCHSFQTFNTSYSDTGLFGFHFVADPLSI
DDMMFCAQGEWMRLCTSTTESEVKRAKNHLRSAMVAQLDGTTPVCETIGSHLLNYGRRI
SLEEWDSRISAVDARMVRDVCSKYIYDKCPALAAVG
UDELPATPK0064A3F udel.pat.pk0064.a3.f 45 5e-05
UDELPATPK0064A3F udel.pat.pk0064.a3.f Length = 642 Frame = +3
Query: 236 KAKYHGGEIREQNGDSL--VHAALVAESAAIGSAEANAFSVLQHVLGAGPHVKRGSNATS 293 + ++ G EIR ++ D+L H AL E + V ++G G S Sbjct: 39 RCRFTGSEIRARD-DALPVAHVALAVEGPGWADPDNVVLHVANAIIGRYDRTFGGGKHLS 215
Query: 294 SLYQAVAKGVHQPFDVSAFNASYSDSGLFGFYTISQAASAGDVIKAAYNQ 343 S A+A FN SYSD+GLFGF+ ++ S D++ A + Sbjct: 216 SRLAALAVEHKLCHSFQTFNTSYSDTGLFGFHFVADPLSIDDMMFCAQGE 365
Sbjct (blast -> core 1):
RCRFTGSEIRARDDALPVAHVALAVEGPGWADPDNVVLHVANAIIGRYDRTFGGGKHLS
SRLAALAVEHKLCHSFQTFNTSYSDTGLFGFHFVADPLSIDDMMFCAQGE
UDELPFT1CPK0001G5 udelpft1cpk0001g5 33 0.12
UDELPFT1CPK0001G5 udelpft1cpk0001g5 Length = 473 Frame = +2
Query: 141 QPQLRIDKAVALQ------NPQAHV-IENLHAAAYRN-ALANSLYCPDYRIGKVTPVELH 192 + Q+ ++ V LQ N +V + LHA A++ ALA ++ I +T +L Sbjct: 59 ESQIEKERGVILQELKEMDNDMTNVTFDYLHATAFQGTALARTVEGTTENIKHLTQADLA 238
Query: 193 DYVQNHFTSARMALIGL-GVSHPVL 216 Y+ HF + RM L G+SH L Sbjct: 239 SYIDTHFKAPRMVLAAAGGISHKEL 313
sbjct(Blast finds core 1):
ESQIEKERGVILQELKEMDNDMTNVTFDYLHATAFQGTALARTVEGTTENIKHLTQADLA
SYIDTHFKAPRMVLAAAGGISHKEL
UDELPTR1CPK0002P19 udelptr1cpk0002p19 32 0.20
UDELPTR1CPK0002P19 udelptr1cpk0002p19 Length = 493 Frame = +1
Query: 27 TRLPNGLVIASLENYAPASRIGLFIKAGSRYENSNNLGTSHLL 69 T L NGL +AS + +GL I +GSR E G +H L Sbjct: 31 TVLENGLRVASXNKFGQFCTVGLLINSGSRXEAKYLSGIAHFL 159
This had slightly higher score against core 1.
Blast search gives mammal alpha-MPP
xxxxxxxx
Chicken ISP: searching the chicken database with the Rieske precursor
from the Berkeley data
FM_ROS021A05-T3-1 REFORMAT of: EST1849-T3-1.seq Length = 527 Frame = +1
Query: 1 MLSVAARSGPFAPYLSAAAHAVPGPLKALAPAALRAEKVVLDLKRPLLCRESMSGRSARR 60 MLSVAARSGPFAPYLSAAAHAVPGPLKALAPAALR EKVVLDLKRPLLCRESMSGRSARR Sbjct: 82 MLSVAARSGPFAPYLSAAAHAVPGPLKALAPAALRPEKVVLDLKRPLLCRESMSGRSARR 261
Query: 61 DLVAGISLNAPASVRYVHNDVTVPDFSAYRREDVMDATTSSQTSSEDRKGFSYLVTATAC 120 DLVAGISLNAPASVRYVHNDVTVPDFSAYRREDVMDATTSSQTSSEDRKGFSYLVTATAC Sbjct: 262 DLVAGISLNAPASVRYVHNDVTVPDFSAYRREDVMDATTSSQTSSEDRKGFSYLVTATAC 441
Query: 121 VATAYAAKNVVTQFISSLSASADVLALS 148 VATAYAAKNVVTQFISSLSASADVLALS Sbjct: 442 VATAYAAKNVVTQFISSLSASADVLALS 525
UDELPATPK0053G8F udel.pat.pk0053.g8.f 308 9e-85
UDELPATPK0053G8F udel.pat.pk0053.g8.f Length = 609 Frame = +2
Query: 66 ISLNAPASVRYVHNDVTVPDFSAYRREDVMDATTSSQTSSEDRKGFSYLVTATACVATAY 125 ISLNAPASVRYVHNDVTVPDFSAYRREDVMDATTSSQTSSEDRKGFSYLVTATACVATAY Sbjct: 2 ISLNAPASVRYVHNDVTVPDFSAYRREDVMDATTSSQTSSEDRKGFSYLVTATACVATAY 181
Query: 126 AAKNVVTQFISSLSASADVLALSKIEIKLSDIPEGKNVAFKWRGKPLFVRHRTQAEINQE 185 AAKNVVTQFISSLSASADVLALSKIEIKLSDIPEGKNVAFKWRGKPLFVRHRTQAEINQE Sbjct: 182 AAKNVVTQFISSLSASADVLALSKIEIKLSDIPEGKNVAFKWRGKPLFVRHRTQAEINQE 361
Query: 186 AEVDVSKLRDPQHDLDRVKKPEWVILVGVCTHLG 219 AEVDVSKLRDPQHDLDRVKKPEWVILVGVCTHLG Sbjct: 362 AEVDVSKLRDPQHDLDRVKKPEWVILVGVCTHLG 463
UDELPATPK0008E6 udel.pat.pk0008.e6 Length = 494 Frame = -3
Query: 177 RTQAEINQEAEVDVSKLRDPQHDLDRVKKPEWVILVGVCTHLGCVPIANSGDFGGYYCPC 236 +TQAE+ EVDVSKLRDPQHDLDRVKKPEWVILVGVCTHLGCVPIANSGDFGGYYCPC Sbjct: 474 QTQAEL-PGXEVDVSKLRDPQHDLDRVKKPEWVILVGVCTHLGCVPIANSGDFGGYYCPC 298
Query: 237 HGSHYDASGRIRKGPAPYNLEVPTYQFVGDDLVVVG 272 HGSHYDASGRIRKGPAPYNLEVPTYQFVGDDLVVVG Sbjct: 297 HGSHYDASGRIRKGPAPYNLEVPTYQFVGDDLVVVG 190
This confirms the entire chicken sequence we are using, with overlap in the green region, except for residue 36 in the presequence (A->P). No need to assemble the sequence here as we already have it. Still, for reference:
>precursor:
MLSVAARSGPFAPYLSAAAHAVPGPLKALAPAALRPEKVVLDLKRPLLCRESMSGRSARR 60
DLVAGISLNAPASVRYVHNDVTVPDFSAYRREDVMDATTSSQTSSEDRKGFSYLVTATAC
VATAYAAKNVVTQFISSLSASADVLALSKIEIKLSDIPEGKNVAFKWRGKPLFVRHRTQA
EINQEAEVDVSKLRDPQHDLDRVKKPEWVILVGVCTHLGCVPIANSGDFGGYYCPCHGSH
YDASGRIRKGPAPYNLEVPTYQFVGDDLVVVG
>presequence(su 9):
MLSVAARSGPFAPYLSAAAHAVPGPLKALAPAALRPEKVVLDLKRPLLCRESMSGRSARR 60
DLVAGISLNAPASVRY
>Mature sequence:
VHNDVTVPDFSAYRREDVMDATTSSQTSSEDRKGFSYLVTATACVATAYAAKNVVTQFIS
SLSASADVLALSKIEIKLSDIPEGKNVAFKWRGKPLFVRHRTQAEINQEAEVDVSKLRDP
QHDLDRVKKPEWVILVGVCTHLGCVPIANSGDFGGYYCPCHGSHYDASGRIRKGPAPYNL
EVPTYQFVGDDLVVVG
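The A->P difference at residue 36 of the presequence can be checked directly from the first alignment block. A minimal Python sketch, assuming an ungapped comparison; `differences` is a hypothetical helper, and the two strings are residues 1-60 of the Rieske precursor query and the matching FM_ROS021A05-T3-1 translation as they appear in the alignment:

```python
def differences(reference, observed):
    """List (position, reference residue, observed residue), 1-based."""
    return [(i, r, o)
            for i, (r, o) in enumerate(zip(reference, observed), start=1)
            if r != o]


query_1_60 = "MLSVAARSGPFAPYLSAAAHAVPGPLKALAPAALRAEKVVLDLKRPLLCRESMSGRSARR"
est_1_60 = "MLSVAARSGPFAPYLSAAAHAVPGPLKALAPAALRPEKVVLDLKRPLLCRESMSGRSARR"

for pos, ref, obs in differences(query_1_60, est_1_60):
    print("residue %d: %s->%s" % (pos, ref, obs))
```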
SU 6:
DKFZ426_11A21R1 REFORMAT of: dkfz426_11a21r1.dat check: 6685 from: 1 to: 495 Length = 495 Frame = +2
Query: 1 AGRPAVSASSRWLEGIRKWYYNAAGFNKLGLMRDDTIHENDDVKEAIRRLPENLYDDRVF 60 A R V+ R ++ IRKWYYNAAGFNK GLMRDDT++E+DDVKEA++RLPE+LY++R+F Sbjct: 2 AARATVAGGGRLMDRIRKWYYNAAGFNKYGLMRDDTLYEDDDVKEALKRLPEDLYNERMF 181
Query: 61 RIKRALDLSMRQQILPKEQWTKYEEDKSYLEPYLKEVIRERKEREEWAKK 110 RIKRALDLS++ +ILPKEQW KYEEDK YLEPYLKEVIRER ERE W KK Sbjct: 182 RIKRALDLSLKHRILPKEQWVKYEEDKPYLEPYLKEVIRERLEREAWNKK 331
AARATVAGGGRLMDRIRKWYYNAAGFNKYGLMRDDTLYEDDDVKEALKRLPEDLYNERMF RIKRALDLSLKHRILPKEQWVKYEEDKPYLEPYLKEVIRERLEREAWNKK
SU 7: Two separate ESTs contain the entire sequence, and they are apparently identical
FM_ROS015D09-T3-1 REFORMAT of: EST1500-T3-1.seq check: 3585 from: 1 to: 362 Length = 362 Frame = +2
Query: 1 GRQFGHLTRVRHVITYSLSPFEQRAFPHYFSKGIPNVLRRTRACILRVAPPFVAFYLVYT 60 G FG+L RVRH+ITYSLSPFEQRA P+ FS +PNV RR + + +VAPPF+ YL+Y+ Sbjct: 47 GIHFGNLARVRHIITYSLSPFEQRAIPNIFSDALPNVWRRFSSQVFKVAPPFLGAYLLYS 226
Query: 61 WGTQEFEKSKRKNPAAYENDR 81 WGTQEFE+ KRKNPA YEND+ Sbjct: 227 WGTQEFERLKRKNPADYENDQ 289
UDELPATPK0013B5 udel.pat.pk0013.b5 Length = 398 Frame = +1
Query: 1 GRQFGHLTRVRHVITYSLSPFEQRAFPHYFSKGIPNVLRRTRACILRVAPPFVAFYLVYT 60 G FG+L RVRH+ITYSLSPFEQRA P+ FS +PNV RR + + +VAPPF+ YL+Y+ Sbjct: 25 GIHFGNLARVRHIITYSLSPFEQRAIPNIFSDALPNVWRRFSSQVFKVAPPFLGAYLLYS 204
Query: 61 WGTQEFEKSKRKNPAAYENDR 81 WGTQEFE+ KRKNPA YEND+ Sbjct: 205 WGTQEFERLKRKNPADYENDQ 267
GIHFGNLARVRHIITYSLSPFEQRAIPNIFSDALPNVWRRFSSQVFKVAPPFLGAYLLYS
WGTQEFERLKRKNPADYENDQ
SU 8:
UDELPATPK0073C10F udel.pat.pk0073.c10.f Length = 535 Frame = +1
Query: 1 GDPKEEEEEEEELVDPLTTVREQCEQLEKCVKARERLELCDERVSSRSQTEEDCTEELLD 60 G+P EEEEEEELVDPLTT+RE CEQ EKCVKARERLELCD RVSSRS TEE CTEEL D Sbjct: 76 GEP--EEEEEEELVDPLTTIREHCEQTEKCVKARERLELCDARVSSRSHTEEQCTEELFD 249
Query: 61 FLHARDHCVAHKLFNSLK 78 FLHARDHCVAHKLFN LK Sbjct: 250 FLHARDHCVAHKLFNKLK 303
GEP--EEEEEEELVDPLTTIREHCEQTEKCVKARERLELCDARVSSRSHTEEQCTEELFDFLHARDHCVAHKLFNKLK
SU 9: see ISP
SU 10
UDELPNF-BPK00002B9 udelpnf-bpk00002b9 Length = 366 Frame = +2
Query: 5 LTARLYSLLFRRTSTFALTIVVGALFFERAFDQGADAIYEHINEGKLWKHIKHKYENKE 63 L + YS LFRRTSTFALT+V+GA+ FERAFDQGADAI+EH+NEGKLWKHIKHKYE E Sbjct: 8 LLRQAYSALFRRTSTFALTVVLGAVLFERAFDQGADAIFEHLNEGKLWKHIKHKYEASE 184
LLRQAYSALFRRTSTFALTVVLGAVLFERAFDQGADAIFEHLNEGKLWKHIKHKYEASE
SU 11
UDELPATPK0035A5F udel.pat.pk0035.a5.f Length = 330 Frame = +1
(chromatin assembly factor)
Query: 14 ARNWVPTAQLWGAVGA---VG-LVSATDSRLILDWV 45 A +W P +LW +G VG L+SA+D I W+ Sbjct: 136 ASSWTPERRLWVVMGTXT*VGHLLSASDDHTICLWI 243
DKFZ426_9L9R1 REFORMAT of: dkfz426_9l9r1.dat Length = 658 Frame = -3
Query: 6 LGPRYRQLARNWVPTAQLWGAVGAVGLVSATDSRLILDWVP 46 L P W P WG + A G+ TD++++L W P Sbjct: 146 LWPTMGSYGHQWGPRGHQWGPMAANGV--RTDTKVVL-WPP 33
WVPTASLWGAVGAVGLV
Aligning the second EST with other SU 11 sequences gives the motif: (wiy)x(PA)xxxxwgxx(gas)xxg(lva)
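The motif can be tried against the two hits as a regular expression. This is a sketch under assumptions: parenthesised groups are read as alternatives at one position, x as any residue, and letter case as not significant. `SU11_MOTIF` and `find_motif` are hypothetical names; the three strings are the query and Sbjct sequences from the two alignments above, with gap characters removed:

```python
import re

# (wiy)x(PA)xxxxwgxx(gas)xxg(lva) written as a regular expression.
SU11_MOTIF = re.compile(r"[WIY].[PA].{4}WG.{2}[GAS].{2}G[LVA]")


def find_motif(seq):
    """Return the first stretch of seq that fits the motif, or None."""
    match = SU11_MOTIF.search(seq.upper())
    return match.group() if match else None


bovine_query = "LGPRYRQLARNWVPTAQLWGAVGAVGLVSATDSRLILDWVP"
dkfz_9l9r1 = "LWPTMGSYGHQWGPRGHQWGPMAANGVRTDTKVVLWPP"
udel_0035a5f = "ASSWTPERRLWVVMGTXT*VGHLLSASDDHTICLWI"

for name, seq in [("query", bovine_query),
                  ("DKFZ426_9L9R1", dkfz_9l9r1),
                  ("UDELPATPK0035A5F", udel_0035a5f)]:
    print(name, find_motif(seq))
```

Under these assumptions the query and DKFZ426_9L9R1 both contain a stretch that fits, while the UDELPATPK0035A5F hit (chromatin assembly factor) does not.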