MAST - Introduction

MAST -- Motif Alignment and Search Tool

MAST is a tool for searching biological sequence databases for sequences that contain one or more of a group of known motifs.

A motif is a sequence pattern that occurs repeatedly in a group of related protein or DNA sequences. Motifs are represented as position-dependent scoring matrices that describe the score of each possible letter at each position in the pattern. Individual motifs may not contain gaps. Patterns with variable-length gaps must be split into two or more separate motifs before being submitted as input to MAST.
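To make the idea of a position-dependent scoring matrix concrete, the following is a minimal Python sketch of scoring every ungapped window of a sequence against one motif. It illustrates the concept only and is not MAST's implementation; the matrix `PSSM`, its score values, and the helper names `match_scores` and `best_match` are hypothetical.

```python
# Hypothetical 3-position DNA motif. Each column maps a letter to its score
# at that position; the values are invented for illustration.
PSSM = [
    {"A": 2, "C": -1, "G": -1, "T": -2},
    {"A": -2, "C": -1, "G": 2, "T": -1},
    {"A": -1, "C": 2, "G": -2, "T": -1},
]


def match_scores(sequence, pssm):
    """Return the match score of the motif at every start position.

    The score of a window is the sum of the per-position letter scores.
    No gaps are allowed, so each window is exactly len(pssm) letters wide.
    """
    width = len(pssm)
    return [
        sum(column[letter]
            for column, letter in zip(pssm, sequence[start:start + width]))
        for start in range(len(sequence) - width + 1)
    ]


def best_match(sequence, pssm):
    """Return (start, score) of the highest-scoring window."""
    scores = match_scores(sequence, pssm)
    best_start = max(range(len(scores)), key=scores.__getitem__)
    return best_start, scores[best_start]


if __name__ == "__main__":
    print(match_scores("TTAGCAT", PSSM))
    print(best_match("TTAGCAT", PSSM))
```

Because each window has a fixed width, a pattern with a variable-length gap cannot be expressed as one matrix of this kind, which is why such patterns have to be split into separate motifs.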

MAST takes as input a file describing one or more motifs and searches a sequence database of your choice for sequences that match them. The motif file can be the output of the MEME motif discovery tool or any file in the appropriate format.

MAST outputs three things:

  1. The names of the high-scoring sequences sorted by the strength of the combined match of the sequence to all of the motifs in the group.
  2. Motif diagrams showing the order and spacing of the motifs within each matching sequence.
  3. Detailed annotation of each matching sequence showing the sequence and the locations and strengths of matches to the motifs.

MAST works by calculating a match score for each sequence in the database against each motif in the group you provide. For each sequence, the match scores are converted into several types of p-values, which are used to determine the overall match of the sequence to the group of motifs and the probable order and spacing of motif occurrences in the sequence.
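The conversion from match scores to p-values can be sketched as follows. This is a simplified illustration under stated assumptions, not a description of MAST's exact statistics: it assumes integer scores, independent letters drawn from a fixed background distribution, independent positions within a sequence, and independent motifs. The matrix `PSSM`, the `BACKGROUND` frequencies, and all function names are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical 3-position DNA motif with invented integer scores.
PSSM = [
    {"A": 2, "C": -1, "G": -1, "T": -2},
    {"A": -2, "C": -1, "G": 2, "T": -1},
    {"A": -1, "C": 2, "G": -2, "T": -1},
]
# Assumed background model: all four letters equally likely.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def score_distribution(pssm, background):
    """Exact distribution of the match score of one random window."""
    dist = {0: 1.0}
    for column in pssm:
        nxt = defaultdict(float)
        for score, prob in dist.items():
            for letter, letter_prob in background.items():
                nxt[score + column[letter]] += prob * letter_prob
        dist = dict(nxt)
    return dist


def position_pvalue(score, dist):
    """Probability that a random window scores at least `score`."""
    return sum(p for s, p in dist.items() if s >= score)


def sequence_pvalue(best_position_pvalue, n_positions):
    """Probability that the best of n independent windows does as well."""
    return 1.0 - (1.0 - best_position_pvalue) ** n_positions


def product_pvalue(pvalues):
    """P-value of the product of independent, uniformly distributed p-values."""
    product = math.prod(pvalues)
    if product == 0.0:
        return 0.0
    log_term = -math.log(product)
    return product * sum(log_term ** i / math.factorial(i)
                         for i in range(len(pvalues)))


if __name__ == "__main__":
    dist = score_distribution(PSSM, BACKGROUND)
    p_pos = position_pvalue(6, dist)          # best possible score for this motif
    p_seq = sequence_pvalue(p_pos, 5)         # e.g. 5 windows in a 7-letter sequence
    print(p_pos, p_seq, product_pvalue([p_seq, 0.5]))
```

In this sketch a smaller combined p-value means a stronger overall match to the group of motifs, so sorting sequences by it in ascending order puts the best matches first.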

