Confirmation message
The first e-mail message you receive should be a confirmation message
to let you know that your motif discovery request has been received.
It should look something like this:
Subject: MEME job 7215957 confirmation: alcohol dehydrogenases (adh.s)
Your MEME request number 7215957 is being processed.
You should receive two subsequent messages containing:
1) your MEME results
2) a MAST analysis of your sequences using the MEME results
Remember to save your MEME results to a file so that you can use the motifs
to search sequence databases with MAST (http://www.sdsc.edu/MEME).
If you don't receive the confirmation message, check that your e-mail address
is correct and reachable from outside your location.
MEME Results
The second e-mail message you should receive contains the results of the MEME
analysis of your sequences for shared motifs. An example of each type of
information described here appears in the sample MEME results at the end of
this document.
For each motif that it discovers in the training set,
MEME prints the following information:
Summary Line
This line gives the width (`width') of the motif and its expected number of
occurrences in the training set (`sites').
MEME numbers the motifs consecutively from one as it finds them. MEME
usually finds the most statistically significant motifs first.
Each motif describes a pattern of a fixed width--no gaps are allowed in
MEME motifs. The `sites' value is MEME's estimate of the number of places
the motif occurs in the training set; because it is an estimate, it need
not be an integer.
Simplified Motif Letter-probability Matrix
MEME motifs are represented by letter-probability matrices that specify the
probability of each possible letter appearing at each
possible position in an occurrence of the motif. To make it easier
to see which letters are most likely in each column of the
motif, the simplified motif shows each letter probability multiplied by 10
and rounded to the nearest integer.
Zeros are replaced by ":" (the colon) for readability.
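As a rough illustration of this rule, the following Python sketch renders one row of a simplified motif from a list of letter probabilities. The function name simplify_row is hypothetical and is not part of MEME. Printing "a" when the rounded value reaches 10 is an assumption inferred from the sample output (letter Y in motif 3), not from documented behavior.

```python
def simplify_row(probs):
    """Render one letter's probabilities across the motif positions.

    Each probability is multiplied by 10 and rounded to the nearest
    integer; zeros are shown as ':'.
    """
    digits = []
    for p in probs:
        d = int(p * 10 + 0.5)  # round half up (Python's round() rounds half to even)
        if d == 0:
            digits.append(":")
        elif d >= 10:
            digits.append("a")  # assumption: inferred from the sample output
        else:
            digits.append(str(d))
    return "".join(digits)


# Probabilities of letter D at the nine positions of motif 1,
# copied from the letter-probability matrix in the sample output.
d_row = [0.001405, 0.841017, 0.001915, 0.001544, 0.003660,
         0.011025, 0.010872, 0.003528, 0.005880]
print(simplify_row(d_row))
```

The printed row can be compared with the D line of the simplified motif for motif 1 in the sample output.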
Information Content Diagram
The information content diagram provides
an idea of which positions in the motif are most highly conserved.
Each column (position) in a motif can be characterized by the amount of
information it contains (measured in bits). Highly conserved positions
in the motif have high information; positions where all letters are equally
likely have low information.
The diagram is printed so that each column lines up with the same column in
the simplified motif letter-probability matrix above it.
Summing the information content for each position in the motif gives
the total information content of the motif (shown in parentheses to the
left of the diagram). This gives a measure of
the usefulness of the motif for database searches.
For a motif to be useful for database searches, it must as a rule contain at
least log_2(N) bits of information
where N is the number of sequences in the database being searched.
For example, to effectively search a database containing 100,000 sequences
for occurrences of a single motif, the motif should have an information
content of at least 16.6 bits (log_2(100,000) is approximately 16.6).
Motifs with lower information content are still useful when a
family of sequences shares more than one motif since they can be combined
in multiple motif searches (using MAST).
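The rule of thumb above is a one-line calculation. The sketch below only restates it in Python; the function name min_information_bits is hypothetical.

```python
import math


def min_information_bits(num_sequences):
    """Rule-of-thumb minimum information content, in bits, for searching
    a database of num_sequences sequences with a single motif."""
    return math.log2(num_sequences)


print(round(min_information_bits(100_000), 1))  # 16.6
```

By this rule, the three motifs in the sample output (22.1, 23.0 and 26.7 bits) each carry enough information for a single-motif search of a database of that size.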
Multilevel Consensus Sequence
The multilevel consensus sequence corresponding to the motif is an aid in
remembering and understanding the motif. It is calculated from the motif
letter-probability matrix as follows. Separately for each column of the motif,
the letters in the alphabet are sorted in decreasing order by the probability
with which they are expected to occur in that position of motif occurrences.
The sorted letters are then printed vertically with the most probable letter
on top. Only letters with probabilities of 0.2 or higher at that position in
the motif are printed. As an example, the multilevel consensus sequence of
motif 2 in the sample output is:
Multilevel LITGAASGIG
consensus   V  GS
sequence        G
This multilevel consensus sequence says several things about the motif.
First, the most likely form of the motif
can be read from the top line as LITGAASGIG.
Second, only the letter L has a probability of 0.2 or higher in
position 1 of the motif, both I and V have probabilities of
0.2 or higher in position 2, and so on.
Third, a rough approximation of the motif can be made by converting the
multilevel consensus sequence into the Prosite signature
L-[IV]-T-G-[AG]-[ASG]-S-G-I-G.
The multilevel consensus sequence is printed so that each column lines up
with the same column in the simplified motif and information
content diagrams above it.
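The procedure described in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not MEME's own code: the function names are hypothetical, each column is given as a dictionary that lists only its most probable letters (a real column has one entry per letter of the alphabet), and showing "x" for a position with no letter at or above the threshold is inferred from position 9 of motif 3 in the sample output.

```python
def letters_above(column, threshold=0.2):
    """Letters of one column with probability >= threshold, most probable first."""
    ranked = sorted(column, key=column.get, reverse=True)
    return [letter for letter in ranked if column[letter] >= threshold]


def multilevel_consensus(columns, threshold=0.2):
    """Return the consensus as a list of lines, most probable letters on top."""
    kept = [letters_above(col, threshold) for col in columns]
    depth = max([1] + [len(k) for k in kept])
    lines = []
    for level in range(depth):
        # "x" on the top line marks a position with no qualifying letter
        # (assumption based on the sample output).
        line = "".join(
            k[level] if level < len(k) else ("x" if level == 0 else " ")
            for k in kept
        )
        lines.append(line.rstrip())
    return lines


def prosite_signature(columns, threshold=0.2):
    """Approximate the motif as a Prosite-style signature."""
    parts = []
    for col in columns:
        letters = letters_above(col, threshold)
        if not letters:
            parts.append("x")
        elif len(letters) == 1:
            parts.append(letters[0])
        else:
            parts.append("[" + "".join(letters) + "]")
    return "-".join(parts)


# Positions 1, 2, 5 and 6 of motif 2, top three letters only,
# copied from the letter-probability matrix in the sample output.
cols = [
    {"L": 0.502095, "I": 0.190393, "V": 0.140870},
    {"I": 0.450462, "V": 0.414313, "L": 0.051966},
    {"A": 0.497179, "G": 0.314556, "S": 0.069019},
    {"A": 0.279861, "S": 0.252178, "G": 0.236434},
]
print("\n".join(multilevel_consensus(cols)))
print(prosite_signature(cols))
```

For these four positions the signature pieces are L, [IV], [AG] and [ASG], which agree with the corresponding pieces of the Prosite signature quoted above.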
Possible Examples of the Motif
As a further aid in understanding the motif, MEME displays a list of
possible occurrences of the motif in the training set. This list
is made by converting the motif letter-probability matrix into a
position-dependent scoring matrix (log-odds matrix) and using that to
compute a match score between each position in the training set and the
motif. All positions which score above a threshold score are listed.
(The threshold score is chosen by MEME such that the expected number of
non-motif positions listed in error
will equal the number of actual motif positions not listed.)
The format of the list is sequence name, starting position of the
(putative) occurrence, match score of the position, and
the actual sequence including the ten positions before and
after the motif occurrence (`site').
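A minimal Python sketch of this listing step follows, assuming the match score of a position is the sum of the log-odds entries selected by the letters of the site. The text does not spell that out, but the assumption is consistent with the sample output, as the example shows. The function names and the data layout (one dictionary per motif position) are hypothetical.

```python
def match_score(site, log_odds):
    """Sum the log-odds entries selected by each letter of the site."""
    return sum(column[letter] for column, letter in zip(log_odds, site))


def list_sites(sequence, log_odds, threshold):
    """Return (start, score, site) for every window scoring above threshold.

    start is 1-based, as in the MEME listing.
    """
    width = len(log_odds)
    hits = []
    for i in range(len(sequence) - width + 1):
        site = sequence[i:i + width]
        score = match_score(site, log_odds)
        if score > threshold:
            hits.append((i + 1, score, site))
    return hits


# Entries copied from the motif 1 log-odds matrix in the sample output,
# restricted to the letters of the site VDGLVNNAG (a real matrix row
# holds one entry for every letter of the alphabet).
motif1_subset = [
    {"V": 2.096}, {"D": 4.095}, {"G": 0.017}, {"L": 3.061}, {"V": 3.054},
    {"N": 4.531}, {"N": 4.555}, {"A": 2.883}, {"G": 3.304},
]
print(round(match_score("VDGLVNNAG", motif1_subset), 2))
```

The result can be compared with the score of 27.60 listed for the 2BHD_STREX site in the sample output. Scanning a whole sequence with list_sites requires complete matrix rows, since every letter of the sequence is looked up.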
Position-dependent Scoring Matrix
The position-dependent scoring matrix corresponding to the motif is printed
for use by database search programs such as MAST. This matrix is a
log-odds matrix calculated
by taking the log (base 2) of the ratio p/f at each position in
the motif where p is the probability of a particular letter at that
position in the motif, and f is the average frequency of that
letter in the training set. The scoring matrix is printed "sideways"--columns
correspond to the letters in the alphabet (in the same order as shown in
the simplified motif) and rows correspond to the positions of the motif,
position one first. The scoring matrix is preceded by a line starting with
"log-odds matrix:" and containing the length of the alphabet, width
of the motif, number of characters in the training set and the scoring
threshold used in the list of possible motif examples.
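The log-odds entry for a single letter can be sketched in Python directly from the formula above. The function name log_odds is hypothetical. The example takes p from the motif 1 letter-probability matrix and f from the "Letter frequencies" table in the sample output.

```python
import math


def log_odds(p, f):
    """Log-odds score, in bits, of a letter with motif probability p
    and average training-set frequency f."""
    return math.log2(p / f)


# Letter D at position 2 of motif 1: p = 0.841017, f = 0.050.
print(round(log_odds(0.841017, 0.050), 3))
```

This gives roughly 4.07, close to but not identical to the 4.095 printed in the sample log-odds matrix. The difference may come from rounding of the printed frequencies or from details of MEME's calculation that are not described here.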
Motif Letter-probability Matrix
The motif itself is a position-dependent letter-probability matrix giving,
for each position in the pattern, the probabilities of each possible
letter occurring there.
The letter-probability matrix is printed "sideways"--columns
correspond to the letters in the alphabet (in the same order as shown in
the simplified motif) and rows correspond to the positions of the motif,
position one first.
The motif is preceded by a line starting with
"letter-probability matrix:" and containing the length of the alphabet, width
of the motif and number of characters in the training set.
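For readers who want to process these matrices with their own programs, the header lines can be split into named fields. The following Python sketch is one way to do it, not a parser supplied with MEME. The function name is hypothetical, the field names (alength, w, n, bayes) are taken from the sample output, and reading bayes as the scoring threshold is inferred from the order of the fields described above.

```python
import re


def parse_matrix_header(line):
    """Split a matrix header line into its name and numeric fields."""
    name, _, rest = line.partition(":")
    fields = {}
    for key, value in re.findall(r"(\w+)=\s*([-+0-9.eE]+)", rest):
        is_float = "." in value or "e" in value.lower()
        fields[key] = float(value) if is_float else int(value)
    return name.strip(), fields


header = "log-odds matrix: alength= 20 w= 9 n= 9732 bayes= 8.36137"
name, fields = parse_matrix_header(header)
print(name, fields)
```

The w field gives the number of matrix rows that follow the header, and alength gives the number of values on each row.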
MAST Results
The third e-mail message you should receive contains the results of a
MAST search of the training set using the motifs found by MEME.
This gives several additional ways of viewing how the motifs occur in
the training set. You may also want to use the motifs to search one
of the large sequence databases available through the
MAST website.
Sample MEME Results
Here is an actual MEME results file.
********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 2.0 (Release date: 1996/08/29 00:23:06)
For further information on how to interpret these results or to get
a copy of the MEME software please access http://www.sdsc.edu/MEME.
This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://www.sdsc.edu/MEME.
********************************************************************************
********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:
Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************
********************************************************************************
TRAINING SET
********************************************************************************
Sequence name    Length    Sequence name    Length
-------------    ------    -------------    ------
2BHD_STREX          255    3BHD_COMTE          253
ADH_DROME           255    AP27_MOUSE          244
BA72_EUBSP          249    BDH_HUMAN           343
BPHB_PSEPS          275    BUDC_KLETE          241
DHES_HUMAN          327    DHGB_BACME          262
DHII_HUMAN          292    DHMA_FLAS1          270
ENTA_ECOLI          248    FIXR_BRAJA          278
GUTD_ECOLI          259    HDE_CANTR           906
HDHA_ECOLI          255    LIGD_PSEPA          305
NODG_RHIME          245    RIDH_KLEAE          249
YINL_LISMO          248    YRTP_BACSU          238
CSGA_MYXXA          166    DHB2_HUMAN          387
DHB3_HUMAN          310    DHCA_HUMAN          276
FABI_ECOLI          262    FVT1_HUMAN          332
HMTR_LEIMA          287    MAS1_AGRRA          476
PCR_PEA             399    RFBB_NEIGO          346
YURA_MYXXA          258
********************************************************************************
********************************************************************************
EXPLANATION OF RESULTS
********************************************************************************
For each motif that it discovers in the training set, MEME prints the
following information:
Summary Line
This line gives the width (`width') and expected number of occurrences in
the training set (`sites') of the motif. MEME numbers the motifs
consecutively from one as it finds them. MEME usually finds the most
statistically significant motifs first. Each motif describes a pattern of a
fixed width--no gaps are allowed in MEME motifs. MEME estimates
the number of places the motif occurs in the training set. This need
not be an integer value.
Simplified Motif Letter-probability Matrix
MEME motifs are represented by letter-probability matrices that
specify the probability of each possible letter appearing at each
possible position in an occurrence of the motif. In order to make it
easier to see which letters are most likely in each of the columns of
the motif, the simplified motif shows the letter probabilities multiplied
by 10 rounded to the nearest integer. Zeros are replaced by ":" (the
colon) for readability.
Information Content Diagram
The information content diagram provides an idea of which positions
in the motif are most highly conserved. Each column (position) in a
motif can be characterized by the amount of information it contains
(measured in bits). Highly conserved positions in the motif have high
information; positions where all letters are equally likely have low
information. The diagram is printed so that each column lines up with
the same column in the simplified motif letter-probability matrix above
it. Summing the information content for each position in the motif
gives the total information content of the motif (shown in parentheses
to the left of the diagram). This gives a measure of the usefulness of
the motif for database searches. For a motif to be useful for database
searches, it must as a rule contain at least log_2(N) bits of
information where N is the number of sequences in the database
being searched. For example, to effectively search a database
containing 100,000 sequences for occurrences of a single motif, the
motif should have an IC of at least 16.6 bits. Motifs with lower
information content are still useful when a family of sequences shares
more than one motif since they can be combined in multiple motif
searches (using MAST).
Multilevel Consensus Sequence
The multilevel consensus sequence corresponding to the motif is an
aid in remembering and understanding the motif. It is calculated from
the motif letter-probability matrix as follows. Separately for each
column of the motif, the letters in the alphabet are sorted in
decreasing order by the probability with which they are expected to
occur in that position of motif occurrences. The sorted letters are
then printed vertically with the most probable letter on top. Only
letters with probabilities of 0.2 or higher at that position in the motif
are printed. As an example, the multilevel consensus sequence of
motif 2 in the sample output is:
Multilevel LITGAASGIG
consensus   V  GS
sequence        G
This multilevel consensus sequence says several things about the
motif. First, the most likely form of the motif can be read from the top
line as LITGAASGIG. Second, that only letter L has probability
more than 0.2 in position 1 of the motif, both I and V have probability
greater than 0.2 in position 2, etc. Third, a rough approximation of the
motif can be made by converting the multilevel consensus sequence
into the Prosite signature
L-[IV]-T-G-[AG]-[ASG]-S-G-I-G. The multilevel
consensus sequence is printed so that each column lines up with
the same column in the simplified motif and information content
diagrams above it.
Possible Examples of the Motif
As a further aid in understanding the motif, MEME displays a list of
possible occurrences of the motif in the training set. This list is made
by converting the motif letter-probability matrix into a
position-dependent scoring matrix (log-odds matrix) and using that
to compute a match score between each position in the training set
and the motif. All positions which score above a threshold score are
listed. (The threshold score is chosen by MEME such that the
expected number of non-motif positions listed in error will equal the
number of actual motif positions not listed.) The format of the list is
sequence name, starting position of the (putative) occurrence, match
score of the position, and the actual sequence including the ten
positions before and after the motif occurrence (`site').
Position-dependent Scoring Matrix
The position-dependent scoring matrix corresponding to the motif is
printed for use by database search programs such as MAST. This
matrix is a log-odds matrix calculated by taking the log (base 2) of
the ratio p/f at each position in the motif where p is the probability
of a particular letter at that position in the motif, and f is the average
frequency of that letter in the training set. The scoring matrix is
printed "sideways"--columns correspond to the letters in the
alphabet (in the same order as shown in the simplified motif) and
rows corresponding to the positions of the motif, position one first.
The scoring matrix is preceded by a line starting with "log-odds
matrix:" and containing the length of the alphabet, width of the motif,
number of characters in the training set and the scoring threshold
used in the list of possible motif examples.
Motif Letter-probability Matrix
The motif itself is a position-dependent letter-probability matrix
giving, for each position in the pattern, the probabilities of each
possible letter occurring there. The letter-probability matrix is printed
"sideways"--columns correspond to the letters in the alphabet (in
the same order as shown in the simplified motif) and rows
corresponding to the positions of the motif, position one first. The
motif is preceded by a line starting with "letter-probability matrix:" and
containing the length of the alphabet, width of the motif and number of
characters in the training set.
********************************************************************************
********************************************************************************
MOTIF 1 width = 9 sites = 29.5
********************************************************************************
Simplified A ::1::::8:
motif letter- C :::::::::
probability D :8:::::::
matrix E :::::::::
F :::::::::
G ::1:::::9
H :::::::::
I 2:212::::
K :::::::::
L 3:18:::::
M :::::::::
N :::::89::
P :::::::::
Q :::::::::
R :::::::::
S :::::::::
T :::::::::
V 3:3:7::::
W :::::::::
Y :::::::::
bits 6.7
6.0
5.4
4.7
Information 4.0
content 3.4 *
(22.1 bits) 2.7 * **
2.0 * * ** *
1.3 ** ******
0.7 *********
0.0 ---------
Multilevel VDVLVNNAG
consensus  L
sequence
----------------------------------------------------------------------
Possible examples of motif 1 in the training set
----------------------------------------------------------------------
Sequence name Start Score Site
------------- ----- ----- ---------
2BHD_STREX 81 27.60 VAYAREEFGS VDGLVNNAG ISTGMFLETE
3BHD_COMTE 81 25.43 MAAVQRRLGT LNVLVNNAG ILLPGDMETG
ADH_DROME 86 20.80 LKTIFAQLKT VDVLINGAG ILDDHQIERT
AP27_MOUSE 77 23.29 TEKALGGIGP VDLLVNNAA LVIMQPFLEV
BA72_EUBSP 86 25.63 VGQVAQKYGR LDVMINNAG ITSNNVFSRV
BDH_HUMAN 138 22.74 PFEPEGPEKG MWGLVNNAG ISTFGEVEFT
BPHB_PSEPS 79 18.42 ASRCVARFGK IDTLIPNAG IWDYSTALVD
BUDC_KLETE 80 20.52 VEQARKALGG FNVIVNNAG IAPSTPIESI
DHES_HUMAN 84 24.73 AARERVTEGR VDVLVCNAG LGLLGPLEAL
DHGB_BACME 87 25.63 VQSAIKEFGK LDVMINNAG MENPVSSHEM
DHMA_FLAS1 198 15.82 ILVNMIAPGP VDVTGNNTG YSEPRLAEQV
ENTA_ECOLI 73 19.88 CQRLLAETER LDALVNAAG ILRMGATDQL
FIXR_BRAJA 112 22.78 EVKKRLAGAP LHALVNNAG VSPKTPTGDR
GUTD_ECOLI 82 16.09 SRGVDEIFGR VDLLVYSAG IAKAAFISDF
HDE_CANTR 92 20.26 VETAVKNFGT VHVIINNAG ILRDASMKKM
HDE_CANTR 396 28.82 IKNVIDKYGT IDILVNNAG ILRDRSFAKM
HDHA_ECOLI 89 29.35 ADFAISKLGK VDILVNNAG GGGPKPFDMP
NODG_RHIME 81 29.35 GQRAEADLEG VDILVNNAG ITKDGLFLHM
RIDH_KLEAE 89 15.33 LQGILQLTGR LDIFHANAG AYIGGPVAEG
YINL_LISMO 83 13.35 VELAIERYGK VDAIFLNAG IMPNSPLSAL
YRTP_BACSU 84 27.25 VAQVKEQLGD IDILINNAG ISKFGGFLDL
CSGA_MYXXA 13 28.10 AFATNVCTGP VDVLINNAG VSGLWCALGD
DHB2_HUMAN 161 18.93 KVAAMLQDRG LWAVINNAG VLGFPTDGEL
DHB3_HUMAN 125 17.96 HIKEKLAGLE IGILVNNVG MLPNLLPSHF
DHCA_HUMAN 83 29.40 RDFLRKEYGG LDVLVNNAG IAFKVADPTP
FVT1_HUMAN 115 23.34 IKQAQEKLGP VDMLVNCAG MAVSGKFEDL
HMTR_LEIMA 103 24.28 VAACYTHWGR CDVLVNNAS SFYPTPLLRN
MAS1_AGRRA 320 27.07 VTAAVEKFGR IDGLVNNAG YGEPVNLDKH
PCR_PEA 165 23.25 VDNFRRSEMP LDVLINNAA VYFPTAKEPS
YURA_MYXXA 90 16.83 IRALDAEAGG LDLVVANAG VGGTTNAKRL
----------------------------------------------------------------------
log-odds matrix: alength= 20 w= 9 n= 9732 bayes= 8.36137
-3.316 1.390 -5.130 -4.404 0.058 -4.789 -3.234 1.570 -4.081 1.829 0.447 -3.786 -2.854 -3.417 -4.034 -3.394 -2.682 2.096 -2.790 -2.790
-4.033 -3.269 4.095 -1.071 -4.174 -2.754 -0.610 -4.217 -3.875 -4.555 -4.136 0.118 -4.394 -3.117 -3.904 -3.169 -3.548 -4.336 0.889 -3.624
0.176 -1.770 -4.683 -4.057 -2.179 0.017 -2.875 1.767 -3.736 0.304 0.661 -2.809 -3.940 -3.061 -3.170 -3.013 -0.542 2.092 -2.559 -2.446
-3.975 -2.026 -4.994 -4.198 -0.445 -5.448 -3.389 -0.072 -3.964 3.061 0.853 -4.055 -3.998 -2.953 -3.660 -4.054 -1.412 -0.726 -2.922 -3.186
-1.964 -1.322 -3.749 -3.440 -0.950 -2.314 -0.190 1.486 -3.429 -2.219 -2.090 -3.363 -3.322 -3.106 -3.234 -2.706 -1.858 3.054 -3.177 -3.444
-2.471 -0.408 -2.158 -4.055 -3.667 -3.923 -0.612 -3.297 -3.104 -2.740 -3.592 4.531 -1.987 -2.320 -3.493 -1.798 -2.661 -4.113 -2.858 -1.451
-3.055 -0.342 -2.178 -4.043 -3.692 -2.752 -0.688 -3.329 -3.126 -4.240 -3.617 4.555 -3.737 -2.337 -3.524 -1.256 -2.532 -4.134 -2.882 -3.337
2.883 -0.565 -3.802 -3.403 -3.164 -2.439 -1.813 -2.816 -3.410 -3.111 -2.517 -3.254 -4.311 -3.103 -3.463 -0.982 -1.132 -1.123 -3.023 -3.059
-1.286 -3.261 -3.065 -3.746 -4.823 3.304 -3.368 -3.368 -3.582 -5.271 -4.466 -2.511 -4.230 -3.725 -3.621 -1.376 -4.067 -4.803 -3.714 -4.224
letter-probability matrix: alength= 20 w= 9 n= 9732
0.011080 0.032025 0.001405 0.002686 0.038068 0.003217 0.001965 0.165994 0.003148 0.322410 0.037507 0.002644 0.005738 0.002833 0.003100 0.006207 0.008997 0.345601 0.001381 0.003994
0.006738 0.001268 0.841017 0.027062 0.002026 0.013179 0.012108 0.003008 0.003632 0.003860 0.001565 0.039559 0.001973 0.003488 0.003392 0.007253 0.004937 0.004001 0.017696 0.002240
0.124615 0.003583 0.001915 0.003418 0.008071 0.089970 0.002520 0.190285 0.004000 0.112016 0.043511 0.005203 0.002703 0.003624 0.005641 0.008084 0.039659 0.344488 0.001622 0.005070
0.007016 0.003000 0.001544 0.003098 0.026855 0.002037 0.001765 0.053208 0.003415 0.756863 0.049680 0.002194 0.002597 0.003906 0.004018 0.003928 0.021703 0.048874 0.001261 0.003037
0.028280 0.004887 0.003660 0.005242 0.018923 0.017877 0.016204 0.156612 0.004946 0.019488 0.006463 0.003543 0.004148 0.003514 0.005397 0.010000 0.015926 0.671296 0.001056 0.002539
0.019897 0.009211 0.011025 0.003422 0.002878 0.005864 0.012092 0.005691 0.006199 0.013578 0.002282 0.842847 0.010466 0.006059 0.004511 0.018761 0.009126 0.004671 0.001318 0.010104
0.013281 0.009641 0.010872 0.003450 0.002828 0.013202 0.011474 0.005567 0.006102 0.004802 0.002241 0.857100 0.003112 0.005987 0.004414 0.027305 0.009986 0.004603 0.001296 0.002735
0.813838 0.008258 0.003528 0.005378 0.004079 0.016395 0.005262 0.007941 0.005014 0.010497 0.004806 0.003820 0.002090 0.003520 0.004605 0.033031 0.026343 0.037103 0.001176 0.003315
0.045248 0.001275 0.005880 0.004238 0.001291 0.878111 0.001790 0.005418 0.004451 0.002350 0.001245 0.006394 0.002211 0.002287 0.004126 0.025136 0.003445 0.002896 0.000728 0.001478
Time 297.99 secs.
********************************************************************************
MOTIF 2 width = 10 sites = 30.6
********************************************************************************
Simplified A ::::531:::
motif letter- C ::::1:::::
probability D :::::1::::
matrix E ::::::::::
F ::::::::::
G :::93219:9
H ::::::::::
I 25::::::6:
K ::::::2:::
L 51::::::2:
M ::::::::::
N :::::1::::
P ::::::::::
Q ::::::1:::
R ::::::1:::
S ::1:133:::
T ::8:::::::
V 14::::::1:
W ::::::::::
Y ::::::::::
bits 6.7
6.0
5.4
4.7
Information 4.0
content 3.4
(23.0 bits) 2.7 ** *
2.0 *** ***
1.3 ***** ***
0.7 **********
0.0 ----------
Multilevel LITGAASGIG
consensus   V  GS
sequence        G
-----------------------------------------------------------------------
Possible examples of motif 2 in the training set
-----------------------------------------------------------------------
Sequence name Start Score Site
------------- ----- ----- ----------
2BHD_STREX 10 25.00 MNDLSGKTV IITGGARGLG AEAARQAVAA
3BHD_COMTE 10 24.81 TNRLQGKVA LVTGGASGVG LEVVKLLLGE
AP27_MOUSE 11 27.80 MKLNFSGLRA LVTGAGKGIG RDTVKALHAS
BA72_EUBSP 10 25.51 MNLVQDKVT IITGGTRGIG FAAAKIFIDN
BDH_HUMAN 59 23.13 AAEPVGSKAV LVTGCDSGFG FSLAKHLHSK
BPHB_PSEPS 9 26.22 MKLKGEAV LITGGASGLG RALVDRFVAE
BUDC_KLETE 6 27.30 MQKVA LVTGAGQGIG KAIALRLVKD
DHES_HUMAN 6 29.63 ARTVV LITGCSSGIG LHLAVRLASD
DHGB_BACME 11 20.81 MYKDLEGKVV VITGSSTGLG KSMAIRFATE
DHII_HUMAN 38 27.61 RPEMLQGKKV IVTGASKGIG REMAYHLAKM
DHMA_FLAS1 18 25.77 RPGRLAGKAA IVTGAAGGIG RATVEAYLRE
ENTA_ECOLI 9 26.79 MDFSGKNV WVTGAGKGIG YATALAFVEA
FIXR_BRAJA 40 25.09 RVDRGEPKVM LLTGASRGIG HATAKLFSEA
GUTD_ECOLI 6 12.85 MNQVA VVIGGGQTLG AFLCHGLAAE
HDE_CANTR 12 24.25 SPVDFKDKVV IITGAGGGLG KYYSLEFAKL
HDE_CANTR 326 23.93 PTVSLKDKVV LITGAGAGLG KEYAKWFAKY
HDHA_ECOLI 15 25.75 DNLRLDGKCA IITGAGAGIG KEIAITFATA
LIGD_PSEPA 10 20.38 MKDFQDQVA FITGGASGAG FGQAKVFGQA
NODG_RHIME 10 21.69 MFELTGRKA LVTGASGAIG GAIARVLHAQ
RIDH_KLEAE 18 24.83 MNTSLSGKVA AITGAASGIG LECARTLLGA
YINL_LISMO 9 28.91 MTIKNKVI IITGASSGIG KATALLLAEK
YRTP_BACSU 10 28.30 MQSLQHKTA LITGGGRGIG RATALALAKE
DHB2_HUMAN 86 23.63 ELLPVDQKAV LVTGGDCGLG HALCKYLDEL
DHB3_HUMAN 52 24.89 SFLRSMGQWA VITGAGDGIG KAYSFELAKR
DHCA_HUMAN 8 26.83 SSGIHVA LVTGGNKGIG LAIVRDLCRL
FABI_ECOLI 10 12.21 MGFLSGKRI LVTGVASKLS IAYGIAQAMH
FVT1_HUMAN 36 26.93 KPLALPGAHV VVTGGSSGIG KCIAIECYKQ
HMTR_LEIMA 10 20.41 MTAPTVPVA LVTGAAKRLG RSIAEGLHAE
MAS1_AGRRA 249 17.80 TVEIHQSPVI LVSGSNRGVG KAIAEDLIAH
PCR_PEA 90 25.41 GKKTLRKGNV VITGASSGLG LATAKALAES
RFBB_NEIGO 10 21.71 MQTEGKKNI LVTGGAGFIG SAVVRHIIQN
YURA_MYXXA 120 8.85 WERVRGIIDT NVTGAAATLS AVLPQMVERK
-----------------------------------------------------------------------
log-odds matrix: alength= 20 w= 10 n= 9699 bayes= 8.3045
-1.611 -1.886 -4.941 -4.153 -0.143 -4.526 -2.998 1.807 -3.802 2.447 -1.369 -2.086 -3.972 -3.029 -3.680 -3.212 -2.527 0.761 1.445 -2.399
-3.505 -1.107 -4.745 -4.400 -1.129 -4.838 -3.491 3.049 -4.044 -0.826 -1.706 -3.873 -4.360 -3.611 -3.453 -3.548 -2.738 2.317 -3.208 -3.054
-3.822 -2.373 -4.182 -4.547 -4.279 -5.128 -3.594 -1.270 -3.681 -4.502 -3.152 -2.508 -4.228 -3.067 -3.800 -0.110 3.945 -2.130 -3.929 -4.486
-2.480 -3.319 -2.766 -3.763 -4.815 3.498 -3.361 -4.939 -3.574 -4.785 -4.505 -2.686 -4.258 -3.716 -3.614 -2.887 -4.016 -4.854 -3.742 -4.290
2.156 2.239 -5.339 -5.005 -4.638 1.948 -4.590 -4.417 -4.995 -3.814 -4.017 -4.564 -4.724 -3.653 -4.877 0.121 -3.114 -1.547 -4.569 -4.936
1.327 -3.608 0.426 -2.949 -5.103 1.536 -2.684 -5.215 -2.992 -2.392 -4.701 0.780 -4.205 -2.631 -3.616 1.991 -0.622 -4.806 -4.926 -4.264
-0.600 1.151 -0.420 -1.334 -3.513 0.359 -1.425 -3.224 1.535 -3.390 -2.760 -1.653 -2.913 1.038 1.521 2.105 -0.457 -3.450 -3.256 -2.751
-1.991 -3.324 -3.141 -3.762 -1.103 3.391 -3.393 -4.915 -1.671 -5.400 -4.486 -2.576 -4.256 -3.728 -1.326 -2.888 -1.487 -4.811 -3.736 -4.244
-2.220 -2.663 -4.568 -4.441 -0.717 -5.093 -3.990 3.488 -4.017 1.029 -1.412 -3.987 -4.632 -3.765 -4.347 -3.770 -2.976 0.354 -3.430 -3.191
-3.185 -3.326 -3.109 -3.768 -4.822 3.483 -3.397 -4.905 -3.244 -5.360 -4.506 -2.671 -4.256 -3.687 -3.603 -1.246 -3.807 -4.824 -3.742 -4.290
letter-probability matrix: alength= 20 w= 10 n= 9699
0.036508 0.003317 0.001677 0.003211 0.033203 0.003538 0.002352 0.190393 0.003792 0.502095 0.010864 0.009793 0.002663 0.003700 0.003929 0.006850 0.009557 0.140870 0.026408 0.005279
0.009821 0.005693 0.001921 0.002706 0.016762 0.002851 0.001672 0.450462 0.003207 0.051966 0.008600 0.002837 0.002035 0.002471 0.004599 0.005424 0.008258 0.414313 0.001049 0.003354
0.007884 0.002367 0.002837 0.002443 0.001888 0.002331 0.001556 0.022576 0.004124 0.004066 0.003158 0.007310 0.002229 0.003603 0.003615 0.058806 0.848339 0.018990 0.000637 0.001243
0.019983 0.001228 0.007567 0.004207 0.001302 0.921021 0.001829 0.001774 0.004444 0.003341 0.001236 0.006461 0.002184 0.002298 0.004114 0.008581 0.003404 0.002875 0.000725 0.001424
0.497179 0.057863 0.001272 0.001778 0.001472 0.314556 0.000780 0.002548 0.001659 0.006549 0.001733 0.001758 0.001581 0.002400 0.001714 0.069019 0.006364 0.028459 0.000408 0.000910
0.279861 0.001006 0.069195 0.007395 0.001067 0.236434 0.002925 0.001466 0.006649 0.017550 0.001079 0.071398 0.002266 0.004876 0.004109 0.252178 0.035807 0.002972 0.000319 0.001450
0.073567 0.027220 0.038496 0.022646 0.003213 0.104546 0.006996 0.005825 0.153324 0.008786 0.004142 0.013216 0.005548 0.062001 0.144517 0.273064 0.040131 0.007610 0.001015 0.004136
0.028048 0.001224 0.005838 0.004209 0.017069 0.855067 0.001789 0.001804 0.016618 0.002181 0.001252 0.006970 0.002187 0.002279 0.020085 0.008570 0.019647 0.002963 0.000728 0.001470
0.023937 0.001935 0.002171 0.002629 0.022304 0.002388 0.001182 0.610812 0.003267 0.187978 0.010548 0.002621 0.001685 0.002221 0.002475 0.004650 0.007004 0.106243 0.000899 0.003049
0.012264 0.001223 0.005967 0.004192 0.001297 0.911416 0.001784 0.001816 0.005584 0.002242 0.001235 0.006526 0.002187 0.002344 0.004144 0.026762 0.003935 0.002935 0.000725 0.001424
Time 602.81 secs.
********************************************************************************
MOTIF 3 width = 13 sites = 31.4
********************************************************************************
Simplified A :171:47111:1:
motif letter- C :::::::::::::
probability D ::::::::1::::
matrix E ::::::::1:::2
F :::::2::::2::
G :1:::12::3:::
H :::::::::1::1
I :::::::11::::
K ::::9:::1:::2
L :::::::21:7::
M :::::::1:1:1:
N :::::::::1:::
P :::::::::::::
Q ::::::::::::1
R ::::::::1:::2
S :517::::11:21
T :1:2::::1::4:
V :::::::41::1:
W :::::1:::::::
Y a::::::::::::
bits 6.7
6.0
5.4
4.7 *
Information 4.0 *
content 3.4 * *
(26.7 bits) 2.7 * *
2.0 * ** *
1.3 ***** * **
0.7 ******** ****
0.0 -------------
Multilevel YSASKAAVxGLTR
consensus       F L    E
sequence
--------------------------------------------------------------------------
Possible examples of motif 3 in the training set
--------------------------------------------------------------------------
Sequence name Start Score Site
------------- ----- ----- -------------
2BHD_STREX 152 30.78 GLMGLALTSS YGASKWGVRGLSK LAAVELGTDR
3BHD_COMTE 151 34.37 SWLPIEQYAG YSASKAAVSALTR AAALSCRKQG
ADH_DROME 152 25.46 GFNAIYQVPV YSGTKAAVVNFTS SLAKLAPITG
AP27_MOUSE 149 29.96 AHVTFPNLIT YSSTKGAMTMLTK AMAMELGPHK
BA72_EUBSP 157 28.01 GIFGSLSGVG YPASKASVIGLTH GLGREIIRKN
BDH_HUMAN 208 21.60 GRMANPARSP YCITKFGVEAFSD CLRYEMYPLG
BPHB_PSEPS 153 24.39 GFYPNGGGPL YTAAKQAIVGLVR ELAFELAPYV
BUDC_KLETE 152 34.42 GHVGNPELAV YSSSKFAVRGLTQ TAARDLAPLG
DHES_HUMAN 155 32.89 GLMGLPFNDV YCASKFALEGLCE SLAVLLLPFG
DHGB_BACME 160 22.13 WKIPWPLFVH YAASKGGMKLMTE TLALEYAPKG
DHII_HUMAN 183 29.96 GKVAYPMVAA YSASKFALDGFFS SIRKEYSVSR
DHMA_FLAS1 165 23.00 SFMAEPEAAA YVAAKGGVAMLTR AMAVDLARHG
ENTA_ECOLI 144 22.81 AHTPRIGMSA YGASKAALKSLAL SVGLELAGSG
FIXR_BRAJA 189 24.64 SRVHPFAGSA YATSKAALASLTR ELAHDYAPHG
GUTD_ECOLI 154 26.21 GKVGSKHNSG YSAAKFGGVGLTQ SLALDLAEYG
HDE_CANTR 163 15.37 GLYGNFGQAN YASAKSALLGFAE TLAKEGAKYN
HDE_CANTR 467 28.38 GIYGNFGQAN YSSSKAGILGLSK TMAIEGAKNN
HDHA_ECOLI 159 25.05 AENKNINMTS YASSKAAASHLVR NMAFDLGEKN
LIGD_PSEPA 157 26.29 GFMGSALAGP YSAAKAASINLME GYRQGLEKYG
NODG_RHIME 152 30.31 GAIGNPGQTN YCASKAGMIGFSK SLAQEIATRN
RIDH_KLEAE 160 29.06 GVVPVIWEPV YTASKFAVQAFVH TTRRQVAQYG
YINL_LISMO 154 27.95 GLKAYPGGAV YGATKWAVRDLME VLRMESAQEG
YRTP_BACSU 155 35.05 GQRGAAVTSA YSASKFAVLGLTE SLMQEVRKHN
CSGA_MYXXA 88 21.27 AANTDGGAYA YRMSKAALNMAVR SMSTDLRPEG
DHB2_HUMAN 232 27.37 GGAPMERLAS YGSSKAAVTMFSS VMRLELSKWG
DHB3_HUMAN 198 30.12 ALFPWPLYSM YSASKAFVCAFSK ALQEEYKAKE
DHCA_HUMAN 193 18.86 HQKEGWPSSA YGVTKIGVTVLSR IHARKLSEQR
FVT1_HUMAN 186 31.79 GQLGLFGFTA YSASKFAIRGLAE ALQMEVKPYN
HMTR_LEIMA 193 26.46 TNQPLLGYTI YTMAKGALEGLTR SAALELAPLQ
MAS1_AGRRA 392 24.07 GQRVLNPLVG YNMTKHALGGLTK TTQHVGWDRR
RFBB_NEIGO 165 30.45 ETTPYAPSSP YSASKAAADHLVR AWQRTYRLPS
YURA_MYXXA 160 28.06 AGFRGLPATR YSASKAFLSTFME SLRVDLRGTG
--------------------------------------------------------------------------
log-odds matrix: alength= 20 w= 13 n= 9600 bayes= 8.25243
-6.789 -4.769 -6.508 -6.871 -1.048 -6.141 -2.789 -5.559 -5.658 -4.975 -2.791 -4.011 -5.723 -5.118 -5.602 -5.574 -6.098 -5.094 -1.942 5.260
-0.748 2.026 -3.223 -3.163 -3.486 -0.087 -2.670 -3.739 -2.445 -3.541 -3.104 -0.597 -1.084 -2.467 -1.155 3.159 1.047 -2.002 -3.371 -3.079
2.645 -0.605 -3.718 -3.429 -3.071 -1.575 -3.240 -1.481 -3.343 -2.871 0.836 -3.491 -4.376 -3.134 -3.475 0.780 -1.130 -1.217 -3.066 -3.394
-0.082 -1.742 -3.838 -4.239 -3.822 -4.021 -3.032 -4.216 -3.327 -4.352 -3.327 -2.166 -3.530 -3.299 -3.374 3.462 1.561 -4.535 -3.683 -3.496
-4.089 -2.986 -4.803 -4.292 -5.176 -5.166 -3.412 -4.097 4.201 -4.875 -4.126 -3.484 -4.344 -3.485 -0.518 -4.217 -3.880 -5.087 -3.716 -4.383
1.914 -1.807 -4.291 -3.382 2.600 0.280 0.639 -0.404 -3.076 -1.953 -1.729 -3.400 -3.949 0.010 -3.435 -2.381 -2.372 -1.848 2.470 -2.426
2.607 -0.890 -4.642 -4.202 0.420 1.041 -3.921 -3.853 -4.066 -3.604 -3.313 -4.060 -4.580 -3.799 -3.828 -0.432 -2.808 -2.726 -3.831 -4.171
-0.691 -1.285 -4.853 -3.634 -2.198 -1.560 -2.965 0.804 -3.704 1.394 1.746 -3.327 -4.045 -3.171 -3.412 -0.806 -2.387 2.164 -2.633 -2.403
-0.647 1.131 0.279 0.836 -3.468 -1.444 -1.452 0.471 0.588 -0.639 -2.431 -0.072 -2.918 0.459 1.217 0.649 0.682 -0.016 -3.267 -2.524
0.150 -2.907 -0.431 -1.415 -3.268 1.924 1.623 -3.152 -1.238 -1.342 1.991 0.579 -2.880 -0.921 -1.913 0.175 -0.534 -1.280 -2.888 -2.522
-2.195 -2.777 -5.264 -4.382 2.102 -5.476 -3.547 -1.579 -4.054 2.961 0.295 -3.691 -4.207 -3.149 -3.841 -3.993 -3.336 -2.482 -3.064 -3.127
-0.877 1.153 -4.536 -4.150 -0.301 -4.908 -3.270 -2.261 -3.092 -2.847 1.559 -2.734 -4.212 -3.125 -3.269 1.663 2.879 0.716 -2.943 -3.168
-2.700 -3.239 -0.524 1.824 -3.629 -3.654 1.643 -3.709 1.851 -1.438 -2.841 -1.720 -2.836 1.202 2.315 0.628 -2.150 -3.612 -3.359 -2.740
letter-probability matrix: alength= 20 w= 13 n= 9600
0.000978 0.000447 0.000570 0.000484 0.017138 0.001278 0.002673 0.001233 0.000989 0.002934 0.003884 0.002593 0.000797 0.000872 0.001030 0.001274 0.000821 0.002459 0.002522 0.955023
0.064402 0.049632 0.005562 0.006325 0.003162 0.084904 0.002903 0.004357 0.009175 0.007928 0.003124 0.027640 0.019870 0.005475 0.022471 0.542024 0.116183 0.020975 0.000937 0.002950
0.676781 0.008011 0.003947 0.005260 0.004217 0.030271 0.001955 0.020837 0.004924 0.012609 0.047975 0.003718 0.002028 0.003447 0.004500 0.104184 0.025680 0.036128 0.001158 0.002370
0.102160 0.003643 0.003632 0.002999 0.002505 0.005555 0.002259 0.003131 0.004979 0.004517 0.002677 0.009312 0.003645 0.003076 0.004826 0.668679 0.165818 0.003623 0.000755 0.002209
0.006355 0.001538 0.001860 0.002893 0.000980 0.002512 0.001735 0.003400 0.919107 0.003144 0.001539 0.003735 0.002073 0.002703 0.034940 0.003263 0.003819 0.002471 0.000738 0.001195
0.407591 0.003483 0.002654 0.005432 0.214825 0.109438 0.028769 0.043959 0.005925 0.023826 0.008104 0.003960 0.002727 0.030487 0.004627 0.011645 0.010860 0.023340 0.053711 0.004638
0.659085 0.006577 0.002080 0.003078 0.047386 0.185556 0.001219 0.004026 0.002983 0.007590 0.002703 0.002505 0.001761 0.002175 0.003524 0.044960 0.008028 0.012700 0.000681 0.001383
0.067010 0.005001 0.001797 0.004563 0.007719 0.030571 0.002366 0.101552 0.003834 0.242438 0.090110 0.004165 0.002550 0.003361 0.004702 0.034702 0.010745 0.376540 0.001563 0.004713
0.069057 0.026677 0.063036 0.101116 0.003202 0.033133 0.006750 0.080639 0.075134 0.059240 0.004982 0.039782 0.005573 0.041613 0.116364 0.095109 0.090162 0.083094 0.001007 0.004334
0.119989 0.001625 0.038517 0.021240 0.003677 0.342149 0.056885 0.006543 0.021186 0.036405 0.106786 0.062460 0.005722 0.015988 0.013286 0.068492 0.038813 0.034588 0.001309 0.004339
0.023627 0.001778 0.001351 0.002717 0.152150 0.002025 0.001580 0.019466 0.003009 0.718492 0.032960 0.003237 0.002281 0.003412 0.003492 0.003810 0.005567 0.015033 0.001159 0.002853
0.058901 0.027091 0.002238 0.003192 0.028758 0.003003 0.001915 0.012137 0.005861 0.012824 0.079192 0.006284 0.002272 0.003471 0.005191 0.192171 0.413450 0.138016 0.001261 0.002773
0.016647 0.001291 0.036131 0.200603 0.002863 0.007161 0.057665 0.004447 0.180282 0.034047 0.003750 0.012691 0.005899 0.069646 0.248936 0.093732 0.012665 0.006871 0.000945 0.003730
Time 862.02 secs.
Stopped because nmotifs = 3 reached.
CPU: paragon
********************************************************************************
DEBUG INFORMATION
********************************************************************************
Some of the information in this section is used by MAST.
Please do not delete it if you wish to search sequence databases for
matches to the above motifs. This information can also be useful in the
event you wish to report a problem with the MEME software.
ALPHABET= ACDEFGHIKLMNPQRSTVWY
DATAFILE= meme.7209850.data (deleted by web version of MEME)
model: mod= zoops nmotifs= 3 chi= 1
width: minw= 12 maxw= 55 shorten= yes
lambda: minsites= 0 maxsites= 33
theta: prob= 1 spmap= pam spfuzz= 120
em: prior= megap b= 99960 maxiter= 20
distance= 0.001
data: n= 9996 N= 33
strands: w53
sample: seed= 0 seqfrac= 1
LRT: adj= root
Dirichlet mixture priors file: prior30.plib
Letter frequencies:
A 0.111 C 0.012 D 0.050 E 0.055 F 0.036 G 0.090 H 0.018 I 0.057 K 0.052
L 0.092 M 0.027 N 0.041 P 0.041 Q 0.029 R 0.049 S 0.064 T 0.057 V 0.083
W 0.010 Y 0.027
Effective length of alphabet = 20
Entropy of dataset (bits) = -4.11
/mount4a/app/tbailey/meme.1.95/bin/intelparagon_p/meme meme.7209850.data -protein -mod zoops -nmotifs 3 -maxw 55 -minw 12 -nostatus -maxiter 20 -maxsize 10000
********************************************************************************