Predicting GO Functions Using Protein Domains
Introduction

We describe a heuristic algorithm for associating Gene Ontology™ (GO) molecular functions with protein domains listed in ProDom and CDD. For each domain, the algorithm generates function-domain association rules from the intersection of the functions that the GO Consortium has assigned to gene products containing that ProDom and/or CDD domain at varying levels of sequence similarity. Rule generation takes the hierarchical structure of GO molecular functions into account. Manual review of a subset of the generated rules indicates an accuracy of 87% for ProDom rules and 84% for CDD rules. These associations allow a novel sequence to be assigned a putative function when it is sufficiently similar to a ProDom or CDD domain with which one or more GO functions have been associated. Although functional assignments are increasingly being made for gene products from model organisms, the needs of investigators will likely continue to outpace the efforts of curators, particularly for non-model organisms. A comparison with other methods in terms of coverage and agreement supports the utility of the approach.
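The idea of intersecting functions across a hierarchy can be illustrated with a minimal Python sketch. It is not the published algorithm: the tiny hierarchy, the domain identifier `DOMAIN_A`, and the helper names (`with_ancestors`, `domain_rule`, `predict`) are all hypothetical, and the sketch ignores sequence-similarity thresholds and evidence codes. It assumes that a rule for a domain is the set of most specific functions shared, directly or through ancestors, by every annotated gene product containing that domain.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical toy fragment of a molecular-function hierarchy: child -> parents.
PARENTS: Dict[str, Set[str]] = {
    "protein kinase activity": {"kinase activity"},
    "lipid kinase activity": {"kinase activity"},
    "kinase activity": {"transferase activity"},
    "transferase activity": {"catalytic activity"},
    "catalytic activity": set(),
}


def with_ancestors(terms: Iterable[str]) -> Set[str]:
    """Return the given terms plus every ancestor reachable through PARENTS."""
    closed: Set[str] = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in closed:
            closed.add(term)
            stack.extend(PARENTS.get(term, ()))
    return closed


def domain_rule(annotations: List[Set[str]]) -> Set[str]:
    """Intersect the ancestor-closed function sets of all gene products that
    contain a domain, then keep only the most specific shared functions."""
    if not annotations:
        return set()
    shared = set.intersection(*(with_ancestors(a) for a in annotations))
    # A shared term is redundant if it is a proper ancestor of another shared term.
    redundant: Set[str] = set()
    for term in shared:
        redundant |= with_ancestors({term}) - {term}
    return shared - redundant


def predict(domain_hits: Iterable[str], rules: Dict[str, Set[str]]) -> Set[str]:
    """Assign putative functions to a novel sequence from its domain hits."""
    predicted: Set[str] = set()
    for domain in domain_hits:
        predicted |= rules.get(domain, set())
    return predicted


# Two gene products containing the hypothetical domain DOMAIN_A.
rules = {
    "DOMAIN_A": domain_rule([
        {"protein kinase activity"},
        {"lipid kinase activity"},
    ])
}
print(rules["DOMAIN_A"])
print(predict(["DOMAIN_A", "DOMAIN_UNKNOWN"], rules))
```

In this toy case the two gene products share no exact function, but expanding each annotation to its ancestors lets the intersection settle on their common parent, which is how a hierarchy can rescue a rule that a flat intersection would leave empty.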

Access
Downloads

We have made the domain-GO function associations and protein-GO function predictions available here.

  • Associations were made both with and without IEA (Inferred from Electronic Annotation) annotations. These are available in two separate files.
  • Predictions were made for several organisms as well as SwissProt.

Browsing

Follow this link to our rule browser. Using the browser, you can look up a ProDom or CDD domain to see the rule we have for it and the evidence that was considered when the rule was made.

Querying

The sites listed below use our GO associations. They can be queried to retrieve genes with a given GO function.

Documentation

Here are links to documentation of our algorithm.

Links

Here are additional related links.



CBIL