Annotation of Magnaporthe grisea with Gene Ontology Terms

 

 Explore our research with a single click: query the preliminary functional categories of Magnaporthe grisea proteins from the Version 6 genome assembly and visualize their GO annotations.
Magnaporthe grisea, the rice blast fungus, infects rice and other agriculturally important cereals, such as wheat, rye and barley. Each year the fungus is estimated to destroy enough rice to feed more than 60 million people. A comprehensive annotation of the rice blast genome is a crucial step toward understanding the biology of this destructive pathogen and developing effective disease management strategies.

 

The Magnaporthe grisea research community, led by Prof. Ralph Dean, sequenced the genome (assembly Version 6) and published the results of an initial biological analysis and automatic annotation in the journal Nature. The next step is a curated community annotation effort.

 

The Gene Ontology (GO) has evolved into a reliable and rapid means of assigning functional information. Our annotation of M. grisea with GO terms provides a foundation for further functional analysis of M. grisea and for assigning newly created plant-pathogen-specific terms from the PAMGO community. We generated the annotations in a two-step process:

 

1) Orthology-based GO annotation. Orthologs were identified by iteratively searching for reciprocal best hits between the predicted proteins of M. grisea and proteins from multiple organisms with published GO term associations. Pairwise alignments were manually reviewed for hits with an e-value of 1e-20 or lower, a percent identity (pid) of 35 or higher, and sequence coverage of 80% or higher.

 

2) Literature-based GO annotation. More than 1,646 M. grisea proteins were annotated based on published experiments, including microarray, MPSS, T-DNA insertion mutation, and gene knockout mutation studies.
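
The orthology-based step (1) can be illustrated with a minimal Python sketch. It is not the project's actual pipeline: the Hit record, the function names, and the use of bit score to rank "best" hits are all assumptions made for illustration, and the iterative aspect of the search is not modeled. Only the three thresholds (e-value, pid, coverage) come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One pairwise alignment result (hypothetical record layout)."""
    query: str
    subject: str
    evalue: float
    pid: float       # percent identity, 0-100
    coverage: float  # percent of the sequence covered, 0-100
    bitscore: float  # assumption: used here to rank hits


def best_hits(hits):
    """Keep the highest-scoring hit for each query."""
    best = {}
    for hit in hits:
        current = best.get(hit.query)
        if current is None or hit.bitscore > current.bitscore:
            best[hit.query] = hit
    return best


def reciprocal_best_hits(forward, reverse):
    """Return forward hits whose subject's best hit points back to the query."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    pairs = []
    for query, hit in fwd.items():
        back = rev.get(hit.subject)
        if back is not None and back.subject == query:
            pairs.append(hit)
    return pairs


def passes_thresholds(hit, max_evalue=1e-20, min_pid=35.0, min_coverage=80.0):
    """Apply the review thresholds stated in step 1."""
    return (hit.evalue <= max_evalue
            and hit.pid >= min_pid
            and hit.coverage >= min_coverage)


def candidates_for_review(forward, reverse):
    """Reciprocal best hits that qualify for manual alignment review."""
    return [h for h in reciprocal_best_hits(forward, reverse)
            if passes_thresholds(h)]
```

Here forward holds M. grisea proteins searched against the GO-annotated proteins and reverse holds the opposite direction; the returned hits are only candidates, since the source describes a manual review of the alignments afterwards.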

 

In total, our gene association file includes 34,683 records representing 8,283 proteins of M. grisea. Of these 8,283 proteins, 5,804 were annotated with specific GO terms, covering more than 52.5% of the whole genome. In addition, many new programs and protocols were developed while parsing the data.
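
Counts like those above could be derived from a gene association file with a short script. The following Python sketch assumes the file follows the standard tab-separated GO annotation layout (lines starting with "!" are comments, column 2 is the protein identifier, column 5 is the GO ID). It also assumes that "specific" means a protein has at least one term other than the three GO root terms; that reading is an interpretation, not something the text confirms. The function name summarize_gaf is hypothetical.

```python
# GO root terms: molecular_function, biological_process, cellular_component.
ROOT_TERMS = {"GO:0003674", "GO:0008150", "GO:0005575"}


def summarize_gaf(lines):
    """Count records, distinct proteins, and proteins with a non-root GO term."""
    records = 0
    terms_by_protein = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("!"):
            continue  # skip blank lines and comment/header lines
        cols = line.split("\t")
        if len(cols) < 5:
            continue  # skip malformed rows
        records += 1
        protein_id, go_id = cols[1], cols[4]
        terms_by_protein.setdefault(protein_id, set()).add(go_id)
    # Assumption: "specific" = annotated with at least one non-root term.
    specific = sum(1 for terms in terms_by_protein.values()
                   if terms - ROOT_TERMS)
    return {
        "records": records,
        "proteins": len(terms_by_protein),
        "specific": specific,
    }
```

Passing an open file handle, for example summarize_gaf(open(path)) with a path of your own, yields the three counts in one pass over the file.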

 

 This project is part of PAMGO, which is supported by USDA NRI-CSREES grant number 2005-35600-16370 and National Science Foundation grant number EF-0523736.