RNA Seq Analysis with DESeq

Analysis of gene expression with RNA Seq is fairly straightforward using commonly available tools. Specifically, a gapped aligner (TopHat), the R environment with Bioconductor, and the differential expression package DESeq can be combined to generate expression differences very similar to the results one would expect from a microarray analysis. This page describes how to go from sequence reads in FASTQ format to ratios and p-values of differential expression.

DESeq requires a table of integer counts with one row per feature, such as a transcript or exon, giving the number of sequence reads that overlap that feature. To generate this table, you need a description of the features and a way to count the overlap between those features and your aligned reads.
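To make the idea of a count table concrete, here is a minimal Python sketch of overlap counting. It is only a toy illustration, not how htseq-count or any other tool implements it. The feature names, coordinates, and reads are invented for the example, and intervals are assumed to be 0-based and half-open.

```python
# Hypothetical toy data (not from any real annotation or alignment).
# Each feature and read is (chromosome, start, end), 0-based, half-open.
features = {
    "geneA": ("chr1", 100, 200),
    "geneB": ("chr1", 300, 400),
}
reads = [
    ("chr1", 90, 126),    # overlaps geneA at its left edge
    ("chr1", 150, 186),   # inside geneA
    ("chr1", 310, 346),   # inside geneB
    ("chr1", 500, 536),   # overlaps no feature
    ("chr2", 150, 186),   # different chromosome, overlaps no feature
]


def overlaps(a_start, a_end, b_start, b_end):
    """Return True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def count_reads(features, reads):
    """Count, for each feature, the reads that overlap it."""
    counts = {name: 0 for name in features}
    for chrom, start, end in reads:
        for name, (f_chrom, f_start, f_end) in features.items():
            if chrom == f_chrom and overlaps(start, end, f_start, f_end):
                counts[name] += 1
    return counts


counts = count_reads(features, reads)
for name, n in counts.items():
    print(f"{name}\t{n}")
```

The result is one integer per feature, which is the shape of input DESeq expects. A real counting tool also has to decide what to do with reads that overlap more than one feature, and it would use an indexed lookup rather than this nested loop, which is far too slow for real data.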

There are two easy ways to do this: (1) use an existing GFF or GTF file of feature descriptions together with htseq-count, a standalone Python tool from the Huber lab, or (2) retrieve feature descriptions from Ensembl using BioMart from within R. Both are described below.

RNA Seq (last edited 2011-10-10 15:20:24 by ChrisSeidel)