Introduction to the Annotate Sequence with GFF File plugin

The Annotate Sequence with GFF File plugin makes it easy to annotate a sequence with annotations from a GFF (Generic Feature Format) or GTF (Gene Transfer Format) file. A GFF/GTF file does not contain any sequence information; it only contains a list of annotations. You can read more about the formats at http://www.sanger.ac.uk/resources/software/gff/spec.html and http://mblab.wustl.edu/GTF22.html.

There are many different versions of GFF and GTF. We support a large part of the GFF3 definition (see http://www.sequenceontology.org/gff3.shtml), and we also support the GTF format as defined at http://mblab.wustl.edu/GTF22.html. In practice, most GFF3 files can be used to annotate sequences using this tool.
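To make the file layout concrete, the following Python sketch reads a single GFF3 feature line into its nine tab-separated columns, as described in the GFF3 definition linked above. It is only an illustration of the format, not the plugin's own parser; the function name parse_gff3_line and the example line are made up for this purpose.

```python
from urllib.parse import unquote

# The nine tab-separated columns of a GFF3 feature line.
GFF3_COLUMNS = ("seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes")


def parse_gff3_line(line):
    """Return a dict for one GFF3 feature line, or None for comments and blank lines.

    Hypothetical helper for illustration; it does not validate every rule
    of the GFF3 definition.
    """
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    record = dict(zip(GFF3_COLUMNS, fields))
    # Coordinates are 1-based and inclusive in GFF3.
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    attributes = {}
    if record["attributes"] != ".":
        # Attributes are semicolon-separated key=value pairs with URL-style escapes.
        for pair in record["attributes"].split(";"):
            pair = pair.strip()
            if not pair:
                continue
            key, _, value = pair.partition("=")
            attributes[unquote(key)] = unquote(value)
    record["attributes"] = attributes
    return record


# Made-up example line: a gene annotation on a sequence named chr1.
example = "chr1\texample\tgene\t1000\t9000\t.\t+\t.\tID=gene0001;Name=EXAMPLE1"
feature = parse_gff3_line(example)
print(feature["type"], feature["start"], feature["end"], feature["attributes"]["Name"])
```

Note that the line carries a sequence name, a region and descriptive attributes, but no sequence data, which is why the annotations have to be applied to a sequence that is already available in the Workbench.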

The GFF and GTF files can contain various types of annotations. In general, the Annotate Sequence with GFF File action adds the annotation on each line of the file to the chosen sequence, at the position or region specified in the file, and with the annotation type, name, description, etc. as given in the file. However, annotations of the types CDS, exon, mRNA, transcript and gene are given special treatment. For these, the following applies:

For a comprehensive source of genomic annotation of genes and transcripts, see the Ensembl web site at http://www.ensembl.org/info/data/ftp/index.html. From this page, you can download GTF files that can be used to annotate genomes for use in other analyses in the CLC Genomics Workbench, e.g. RNA-Seq Analysis.
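Before annotating a genome with a downloaded GTF file, it can be useful to check which annotation types the file contains, in particular the types CDS, exon, mRNA, transcript and gene mentioned earlier. The following Python sketch tallies the third column of each line and reads the quoted key "value"; attribute pairs used by GTF. The function names and sample lines are hypothetical, and as a simplification, unquoted attribute values are ignored and repeated keys keep only their last value.

```python
import re
from collections import Counter

# GTF attributes look like: gene_id "G1"; transcript_id "T1";
# Simplification: only quoted values are captured.
GTF_ATTRIBUTE = re.compile(r'(\S+)\s+"([^"]*)"')


def parse_gtf_attributes(text):
    """Return the quoted attributes of a GTF line as a dict (last value wins)."""
    return dict(GTF_ATTRIBUTE.findall(text))


def count_feature_types(lines):
    """Count how often each annotation type (column 3) occurs."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            counts[fields[2]] += 1
    return counts


# Made-up sample lines; a real file would be read with open(path).
sample = [
    '#!genome-build example\n',
    'chr1\texample\tgene\t1000\t9000\t.\t+\t.\tgene_id "G1";\n',
    'chr1\texample\ttranscript\t1000\t9000\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\texample\texon\t1000\t1200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\texample\tCDS\t1050\t1200\t.\t+\t0\tgene_id "G1"; transcript_id "T1";\n',
]

counts = count_feature_types(sample)
for feature_type in ("CDS", "exon", "mRNA", "transcript", "gene"):
    print(feature_type, counts[feature_type])
```

For a real file, the sample list can be replaced by an open file handle, since count_feature_types only iterates over lines.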

This manual shows two examples of how to use the plugin to annotate a genome, for instance in preparation for RNA-Seq analysis in the CLC Genomics Workbench version 6.5.x and earlier.

If you are using the CLC Genomics Workbench and are interested in standard reference genomic data, please also take a look at the Download Genomes tool as described in the CLC Genomics Workbench manual at:

http://www.clcsupport.com/clcgenomicsworkbench/current/index.php?manual=Download_reference_genome.html