
Copy of What is genome annotation

Page history last edited by Yoana Daoud 13 years, 6 months ago


 

What is genome annotation?

 

 

It is difficult to give an exact definition but, for the purpose of this discussion, it is useful to define genome annotation as a subfield of genome analysis, a general field that includes more or less anything that can be done with genome sequences by computational means. Genomic annotations are the focal point of sequencing, bioinformatics analysis, and molecular biology: they are the means by which we attach what we know about a genome to its sequence. Annotation is customarily performed before a genome sequence is deposited in GenBank and described in a published paper.

 

The “unit” of genome annotation is the description of an individual gene and its protein (or RNA) product, and the focal point of each such record is the function assigned to the gene product. The record may also include a brief account of the evidence leading to this assigned function.
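To make the idea of an annotation "unit" concrete, here is a minimal sketch in Python of what one such record might hold. The class name, the field names and the example values are all hypothetical and chosen only for illustration; real annotation formats (GenBank feature tables, for instance) carry many more fields.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AnnotationRecord:
    """One annotated gene: where it is, what it makes, and why we think so."""
    gene_id: str        # hypothetical locus identifier
    start: int          # 1-based start coordinate on the genome
    end: int            # 1-based, inclusive end coordinate
    strand: str         # "+" or "-"
    product_type: str   # "protein" or "RNA"
    function: str       # the assigned function, the focal point of the record
    evidence: List[str] = field(default_factory=list)  # brief account of the evidence

    def length(self) -> int:
        """Length of the gene in nucleotides."""
        return self.end - self.start + 1


# Made-up example values, not taken from any real genome.
record = AnnotationRecord(
    gene_id="gene0001",
    start=190,
    end=255,
    strand="+",
    product_type="protein",
    function="hypothetical protein",
    evidence=["sequence similarity to a characterized protein"],
)
print(record.gene_id, record.function, record.length())
```

Keeping the evidence next to the assigned function lets a later reader judge how far to trust the assignment.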

 

How is annotation done?

 

A variety of methods have been devised for identifying the genes in a genome and determining the functions of these genes. Genome annotation necessarily involves some level of automation. For annotation to be practicable at all, software is needed to run routine tasks in batch mode and to organize the results from different programs in a convenient form. Beyond that point, however, genome annotation is still mostly “manual”, because decisions on how to assign gene functions are made by humans. Some genomes come practically unannotated; in most genomes, however, a functional prediction has been made for the majority of the genes.
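As a rough illustration of what "organizing the results from different programs in a convenient form" could mean, the sketch below regroups per-tool output into one summary per gene, so that a human annotator sees all the evidence for a gene in one place. The function name, the input layout and the sample findings are assumptions made for this example; they do not correspond to the output format of any particular program.

```python
from collections import defaultdict


def organize_results(results_by_tool):
    """Regroup {tool: [(gene_id, finding), ...]} into {gene_id: {tool: [findings]}}."""
    per_gene = defaultdict(lambda: defaultdict(list))
    for tool, hits in results_by_tool.items():
        for gene_id, finding in hits:
            per_gene[gene_id][tool].append(finding)
    # Convert the nested defaultdicts to plain dicts for easier printing.
    return {gene: dict(tools) for gene, tools in per_gene.items()}


# Made-up findings, for illustration only.
raw = {
    "similarity_search": [("gene0001", "hit A")],
    "domain_search": [("gene0001", "domain X"), ("gene0002", "domain Y")],
}
summary = organize_results(raw)
for gene, tools in summary.items():
    print(gene, tools)
```

The decision about which function to assign is still left to the person reading the summary.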

 

The process for genome annotation generally consists of the following steps:

 

1. Locate the positions of all the genes: This is the initial objective. For protein-coding genes, this can be attempted by searching for open reading frames. Genes can also be located by homology analysis, which uses the presence of an equivalent gene in a second genome as evidence that a putative gene in the test genome is genuine.

 

2. Feedback from gene identification: needed to correct sequencing errors, which are primarily frameshifts.

 

3. General database search: searching sequence databases (typically, NCBI NR) for sequence similarity, usually using BLAST.

 

4. Specialized database search: searching domain databases, such as Pfam, SMART, and CDD, for conserved domains; genome-oriented databases, such as COGs, for identification of orthologous relationships and refined functional prediction; metabolic databases, such as KEGG, for metabolic pathway reconstruction; and, possibly, other databases. The video below takes a look at finding conserved domains.

 

 

5. Statistical gene prediction: use of methods like GeneMark or Glimmer to predict protein-coding genes.

 

6. Prediction of structural features: prediction of signal peptides, transmembrane segments, coiled-coil domains and other features in putative proteins.

Figure 1: The process of genome annotation
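The open reading frame search mentioned in step 1 can be sketched in a few lines of Python. This is a deliberately naive illustration, not how GeneMark or Glimmer work: it assumes ATG is the only start codon, uses the three standard stop codons, ignores introns and alternative start codons, and reports coordinates relative to whichever strand was scanned. The function names and the `min_codons` threshold are choices made for this example.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string made of A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """Scan all six reading frames for ATG...stop stretches.

    Returns a list of (strand, frame, start, end) tuples. Coordinates are
    0-based, end-exclusive, and relative to the strand that was scanned.
    min_codons counts the codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            orf_start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if orf_start is None and codon == START:
                    orf_start = i              # first start codon opens the ORF
                elif orf_start is not None and codon in STOPS:
                    if (i - orf_start) // 3 >= min_codons:
                        orfs.append((strand, frame, orf_start, i + 3))
                    orf_start = None           # look for the next ORF in this frame
    return orfs


# Toy sequence: ATG AAA TTT GGG TAA embedded in flanking bases.
toy = "CCATGAAATTTGGGTAACC"
print(find_orfs(toy, min_codons=4))
```

On real genomes a length threshold alone produces many false positives, which is one reason the steps above combine ORF searching with homology evidence and statistical gene prediction.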

 

In conclusion, genome annotation is essentially the process of understanding a genome sequence. Speaking from personal experience, genome annotation is not the routine, mundane activity it might seem to an outside observer. On the contrary, it is exciting research, somewhat akin to detective work, with the potential to tease out deep mysteries of life from genome sequences.


 

 

Comments (1)

NARRAV@sgu.edu said

at 1:47 pm on Nov 2, 2010

Please fix figure 1. If you intend to just show that region of the image, edit the image in an image editing software (paint), and remove the unnecessary regions using the eraser tool. Also place label close to the image (there is a big blank space below the image).
