The Schistosoma Database was generated by aligning 3473 Schistosoma
sequences contained in dbEST as of February 5, 1998. These sequences were first
sorted by length; poly(A) tracts were then trimmed, and regions at the 5' and
3' ends containing more than 25% N's were removed. Sequences shorter than 20 bp
were eliminated, leaving 3471 sequences. These sequences were aligned using the cap2
program (Xiaoqiu Huang, 1996, Genomics 33, 21-31). The cap2 output file
was then parsed to generate the files containing a library of consensus
sequences and the clustering.
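
The cleaning steps described above can be sketched roughly as follows. This is an illustration only, not the script that was actually used. The 20 bp minimum and the 25% N threshold come from the text. The 20-base end window, the 10-base minimum poly(A) run, the longest-first sort order, the treatment of a leading poly(T) run as a poly(A) tract, and all function names are assumptions made for the example.

```python
import re

MIN_LENGTH = 20        # from the text: sequences shorter than 20 bp are dropped
MAX_N_FRACTION = 0.25  # from the text: end regions with more than 25% N are removed
WINDOW = 20            # assumption: size of the end region that is tested
MIN_TAIL = 10          # assumption: shortest A/T run treated as a poly(A) tract

POLY_A_3PRIME = re.compile(r"A{%d,}$" % MIN_TAIL)
POLY_T_5PRIME = re.compile(r"^T{%d,}" % MIN_TAIL)


def trim_poly_a(seq):
    """Remove a trailing poly(A) run and a leading poly(T) run."""
    seq = POLY_A_3PRIME.sub("", seq)
    return POLY_T_5PRIME.sub("", seq)


def n_fraction(region):
    """Fraction of the region that is the ambiguity code N."""
    return region.count("N") / len(region) if region else 0.0


def trim_n_rich_ends(seq):
    """Chop WINDOW-sized blocks off either end while they exceed MAX_N_FRACTION."""
    while seq and n_fraction(seq[:WINDOW]) > MAX_N_FRACTION:
        seq = seq[WINDOW:]
    while seq and n_fraction(seq[-WINDOW:]) > MAX_N_FRACTION:
        seq = seq[:-WINDOW]
    return seq


def preprocess(records):
    """records: dict of id -> sequence. Returns a list of (id, cleaned sequence)."""
    # Sort first, as in the text (longest first is an assumption).
    ordered = sorted(records.items(), key=lambda kv: len(kv[1]), reverse=True)
    cleaned = []
    for seq_id, seq in ordered:
        seq = trim_n_rich_ends(trim_poly_a(seq.upper()))
        if len(seq) >= MIN_LENGTH:
            cleaned.append((seq_id, seq))
    return cleaned
```

The list returned by preprocess() would then be written out as input for the alignment program. Because sorting happens before trimming, the final order reflects the original lengths rather than the trimmed ones.
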
The Schistosoma consensus database contains 2077 consensus sequences, 1629 of which are singletons; the remaining 448 each combine two or more sequences. Forty-one different cDNA libraries were used as templates for sequencing.
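
Counts like these can be derived from the clustering with a few lines of code. The sketch below assumes a hypothetical layout in which each input sequence id maps to the id of the consensus sequence it was assigned to. The actual format of the clustering files is not described here, and the function name is made up for the example.

```python
from collections import Counter


def cluster_summary(assignments):
    """assignments: dict of sequence id -> consensus id (hypothetical layout)."""
    # Number of member sequences per consensus.
    sizes = Counter(assignments.values())
    singletons = sum(1 for n in sizes.values() if n == 1)
    return {
        "consensus": len(sizes),
        "singletons": singletons,
        "multi_member": len(sizes) - singletons,
    }


example = {"est1": "c1", "est2": "c1", "est3": "c2", "est4": "c3"}
summary = cluster_summary(example)
```

For the toy input, c1 has two members while c2 and c3 are singletons.
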
Several analyses were run in an attempt to assign relatedness or function to the consensus sequences. The results are presented here in text form. The files are quite large, so they are best accessed with a text search or through the individual sequences, which are linked from within the database when information is available.
Questions and complaints to email@example.com
Contents by Brian Brunk.