The predicted GO Function(s), for the translated DoTS transcripts in allgenes.org or the predicted protein sequences in PlasmoDB, are displayed in the following manner:
 


 

Q. How are the GO Function Predictions generated for the protein sequences?

A.

To assign GO Function(s) to proteins computationally, we assume that a protein domain contained within a protein is responsible for a particular function.
For example, a DNA binding domain within a protein is taken to be responsible for the GO Function DNA binding.

Using this premise, we created protein domain-GO Function rules associating GO Function(s) with protein domains.
 

To create the protein domain-GO Function rules, we used protein domains and a set of yeast, fly and mouse proteins with manually annotated GO Functions.
The protein domains were defined by the ProDom domain database and the Conserved Domain Database (CDD) at NCBI: CDD-pfam, CDD-smart and CDD-LOAD. The yeast, fly and mouse proteins with manually annotated GO Functions were obtained from the Gene Ontology (GO) Consortium.
 

The example below illustrates how a protein domain was associated with particular GO Function(s) to create a protein domain-GO Function rule:

The protein domain pfam00125 is a protein sequence described as Core histone H2A/H2B/H3/H4.

Through BLAST similarity searching, the protein domain is found in three proteins from the yeast, fly and mouse set that have annotated GO Function(s).
 
 
Name     Identifier     Annotated GO Function(s)            p-value of similarity to pfam domain
H2AX     P27661         nucleic acid binding:DNA binding    5 x 10^-19
HTB1     YDR224C        nucleic acid binding:DNA binding    9 x 10^-22
His4r    FBgn0013981    nucleic acid binding:DNA binding    1 x 10^-7

The GO Function(s) annotated for the three proteins are in good agreement (their intersection is nucleic acid binding:DNA binding), so the following GO Function rule is generated for the protein domain:
 

pfam00125    nucleic acid binding:DNA binding    1 x 10^-7
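
The rule-generation step in this example can be sketched in Python. This is only an illustration under stated assumptions, not the pipeline actually used: "agreement" is simplified to a strict set intersection of the annotated GO Functions, and the rule's p-value threshold is taken to be the weakest (largest) p-value among the supporting proteins, which matches the pfam00125 example but is not stated as a general rule. The names Hit and derive_rule are hypothetical.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    """One annotated protein that matches the domain (hypothetical structure)."""
    name: str
    identifier: str
    go_functions: frozenset
    p_value: float  # BLAST similarity of the protein to the domain


def derive_rule(domain_id: str, hits: list) -> Optional[tuple]:
    """Return (domain_id, shared GO functions, p-value threshold), or None.

    Assumption: a rule is only produced when every matching protein shares
    at least one GO Function, and the threshold is the largest p-value seen.
    """
    if not hits:
        return None
    shared = frozenset.intersection(*(h.go_functions for h in hits))
    if not shared:
        return None  # annotations disagree, so no rule is generated
    threshold = max(h.p_value for h in hits)
    return domain_id, shared, threshold


DNA_BINDING = "nucleic acid binding:DNA binding"

# The three proteins from the table above.
hits = [
    Hit("H2AX", "P27661", frozenset({DNA_BINDING}), 5e-19),
    Hit("HTB1", "YDR224C", frozenset({DNA_BINDING}), 9e-22),
    Hit("His4r", "FBgn0013981", frozenset({DNA_BINDING}), 1e-7),
]

rule = derive_rule("pfam00125", hits)
print(rule)
```

With the three proteins from the table, the sketch yields the same rule as shown above: pfam00125, nucleic acid binding:DNA binding, threshold 1 x 10^-7.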
 

In the above example, if a protein sequence contains the domain with a BLAST similarity (p-value) of 1 x 10^-7 or lower, the sequence is given the predicted GO Function nucleic acid binding:DNA binding.

Many such protein domain-GO Function rules were applied to the translated DoTS transcripts and to the predicted protein sequences in PlasmoDB to generate the GO Function predictions.
Not every protein contains a protein domain that is covered by a rule, so some proteins have no predicted GO Functions.
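
Applying the rules to sequences can be sketched as a simple threshold check. The following Python is a minimal illustration, assuming each sequence has already been searched against the domains and reduced to a mapping from domain identifier to BLAST p-value. The function name predict_go_functions, the sequence identifiers and their p-values are hypothetical; only the pfam00125 rule comes from the example above.

```python
def predict_go_functions(domain_hits: dict, rules: dict) -> set:
    """Collect the GO Functions predicted for one protein sequence.

    domain_hits: domain identifier -> BLAST p-value for this sequence.
    rules: domain identifier -> (set of GO Functions, p-value threshold).
    """
    predicted = set()
    for domain_id, p_value in domain_hits.items():
        rule = rules.get(domain_id)
        if rule is None:
            continue  # no rule exists for this domain
        go_functions, threshold = rule
        if p_value <= threshold:  # "threshold or lower" triggers the rule
            predicted |= go_functions
    return predicted


# The single rule derived in the example above.
rules = {"pfam00125": ({"nucleic acid binding:DNA binding"}, 1e-7)}

# Hypothetical sequences with made-up p-values, for illustration only.
sequences = {
    "seq_A": {"pfam00125": 3e-12},  # strong match: rule applies
    "seq_B": {"pfam00125": 2e-4},   # match too weak: no prediction
    "seq_C": {},                    # no domain found: no prediction
}

predictions = {
    seq_id: predict_go_functions(hits, rules)
    for seq_id, hits in sequences.items()
}
for seq_id, functions in predictions.items():
    print(seq_id, sorted(functions))
```

Only seq_A receives a prediction here. The other two show the two ways a protein can end up without one: the domain match is weaker than the rule's threshold, or no rule-bearing domain is found at all.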