
Annotation of Magnaporthe grisea with Gene Ontology Terms
Explore our results with a single click:
query the functional categories of Magnaporthe grisea
proteins and visualize their GO annotations.
Magnaporthe grisea, the rice blast fungus, infects rice and other
agriculturally important cereals, such as wheat, rye and barley.
Each year the fungus is estimated to destroy enough rice to feed
more than 60 million people. A comprehensive annotation of the rice
blast genome is a crucial step in efforts to understand the biology
of this destructive pathogen and to develop effective strategies
for managing the disease.
The Magnaporthe grisea research community, led by Prof. Ralph Dean, sequenced the genome (assembly Version 5) and published the results of an initial biological analysis and automated annotation in the scientific journal Nature. The next step is a curated community annotation effort.
The Gene Ontology (GO) has evolved into a reliable and rapid means of assigning functional information. Our annotation of M. grisea with GO terms provides a foundation for further functional analysis of M. grisea, and for assigning newly created plant-pathogen-specific terms from the PAMGO community. We applied a two-step process to generate the annotations:
1) Orthology-based GO annotation. Orthologs were identified by iteratively searching for reciprocal best hits between the predicted proteins of M. grisea and proteins from multiple organisms with published associations to GO terms. The pairwise alignments were manually reviewed for hits with an e-value of zero and at least 80% coverage of both the query and subject sequences, and for hits with an e-value of 1e-20 or better, a percent identity (pid) of 35 or higher, and sequence coverage of at least 80%. These manual reviews also assigned functions to 2,870 hypothetical proteins. Eighty percent of the functional assignments were based on reciprocal best BLAST hits with an e-value of zero.
2) Literature-based GO annotation. More than 2,680 proteins of M. grisea were annotated based on published experiments, including microarray, MPSS, T-DNA insertion mutation, and gene knockout mutation studies.
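The orthology step above can be illustrated with a minimal Python sketch. It is not the project's actual pipeline. The `Hit` record, its field names, and the tie-breaking rule (lowest e-value, then highest identity) are assumptions made for illustration. The sketch also assumes that the 80% coverage requirement in the second review criterion applies to both the query and the subject sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One pairwise alignment (hypothetical record layout)."""
    query: str
    subject: str
    evalue: float
    pident: float  # percent identity
    qcov: float    # percent of the query sequence covered
    scov: float    # percent of the subject sequence covered


def best_hits(hits):
    """Map each query to its best hit: lowest e-value, ties broken by identity."""
    best = {}
    for h in hits:
        cur = best.get(h.query)
        if cur is None or (h.evalue, -h.pident) < (cur.evalue, -cur.pident):
            best[h.query] = h
    return best


def reciprocal_best_hits(forward, reverse):
    """Keep forward best hits whose subject's best hit points back to the query."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return [
        h for h in fwd.values()
        if h.subject in rev and rev[h.subject].subject == h.query
    ]


def qualifies_for_review(h):
    """Apply the two review criteria described in the text."""
    covered = h.qcov >= 80.0 and h.scov >= 80.0  # assumed to apply to both sequences
    if h.evalue == 0.0:
        return covered
    return covered and h.evalue <= 1e-20 and h.pident >= 35.0
```

Here `forward` would hold M. grisea proteins searched against the GO-annotated proteins and `reverse` the opposite direction. Pairs returned by `reciprocal_best_hits` that also pass `qualifies_for_review` are the candidates for manual inspection.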
In total, our gene association file includes 52,132 records, which represent 12,876 proteins of M. grisea in NCBI. Of these 12,876 proteins, 7,412 were annotated with specific GO terms, covering more than 57% of the whole genome. The remaining proteins were annotated with the three root GO terms. Our GO annotation therefore covers the whole genome of M. grisea, and each protein is annotated with GO terms from all three GO categories. In addition, many new programs and protocols were developed while parsing the data.
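The coverage figure follows directly from the counts: 7,412 / 12,876 is about 57.6%. As a rough sketch of how such a tally could be computed, the Python below assumes the gene association file uses the GO Consortium's tab-separated GAF layout, with the protein identifier in column 2, the GO ID in column 5, and comment lines starting with `!`. The function name `annotation_coverage` is hypothetical and not one of the project's own programs.

```python
# The three root terms of the Gene Ontology.
ROOT_TERMS = {
    "GO:0008150",  # biological_process
    "GO:0003674",  # molecular_function
    "GO:0005575",  # cellular_component
}


def annotation_coverage(gaf_lines):
    """Return (total proteins, proteins with a specific term, fraction).

    Assumes GAF-style tab-separated lines: column 2 is the protein
    identifier and column 5 is the GO ID.
    """
    all_proteins = set()
    specific = set()
    for line in gaf_lines:
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and header comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 5:
            continue  # skip malformed records
        protein_id, go_id = cols[1], cols[4]
        all_proteins.add(protein_id)
        if go_id not in ROOT_TERMS:
            specific.add(protein_id)
    total = len(all_proteins)
    fraction = len(specific) / total if total else 0.0
    return total, len(specific), fraction


# Check the reported figure from the stated counts.
reported_coverage = 7412 / 12876
```

A protein counts as specifically annotated if at least one of its records carries a non-root term, so a protein with both root and specific terms is counted once, in the specific group.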
This project is part of PAMGO, which is supported by the USDA NRI-CSREES (grant number 2005-35600-16370) and the National Science Foundation (grant number EF-0523736).