solexa ChIP SEQ data in R

This is an example workflow for analysis of solexa ChIP SEQ data in R.

The basic idea is to go from sequence reads in export format (i.e. from Eland files of the form: s_n_export.txt) to a track you can see in the genome browser. Currently (October 2009) I think this only works in the R-devel version.

library(ShortRead)
library(chipseq)

# create a path to the location of your export files
spath <- "/n/analysis/cws42/42TH9AAXX"

# create some filters
nfilt <- nFilter()
# you can use a regular expression to grab only reads
# which align to a set of chromosomes
cfilt <- chromosomeFilter("Drosophila_melanogaster.BDGP5.4.54.dna.chromosome.[234|L|R|X]+.fa")
# quality filter
qfilt <- alignDataFilter(expression(filtering == "Y"))
# combine filters together into one
filt <- compose(nfilt, cfilt, qfilt)

# Read in a lane of data
aln <- readAligned(spath, pattern="s_1_export.txt", type="SolexaExport", filter=filt)

# convert to GenomeData object
aln.gd <- as(aln,"GenomeData")


You can execute the code on the command line as follows:

cat code.r | R-devel --vanilla

R ChIP-SEQ (last edited 2010-03-27 05:18:49 by ChrisSeidel)