#format wiki
#language en
= Using R to evaluate DGE for differential expression =

'''Moderated statistical tests for assessing differences in tag abundance.''' (2007). Robinson MD, Smyth GK. ''Bioinformatics''. 23(21):2881-7. [http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=retrieve&db=pubmed&uid=17881408 PMID: 17881408]

The authors of this paper develop a statistical test that uses a negative binomial distribution to model noise in the data. They released R code (available at [http://bioinf.wehi.edu.au/resources/ bioinf.wehi.edu.au/resources/]) to support their method, but it is uncommented and does not run as distributed. Here I apply the code and examine the results on a sample data set.

The website above hosts a tar archive called '''msage_0.7.tar.gz'''. I could not get the package to install in R, so I simply unpacked it and used the .R files directly.

The msage package contains two files of R code that can be sourced: fitNBP.R and msage.R. The file msage.R contains the statement source("kepler.R"), but kepler.R is not included in the distribution. To get the code to run, download ''sage_analysis_code.r'' from http://dulci.biostat.duke.edu/sage/sage_analysis_code.r and rename it ''kepler.R'' (the code was written by Thomas Kepler at Duke). It contains a function, glm.poisson.disp(), that is called by msage.R.

=== Assemble some data ===

msage includes a function, read.sage(), that assembles data from individual tab-delimited tag files of the form: tag, count. It takes an array of file names to read.
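
To make the input format concrete, the following is a minimal base-R sketch that writes two small tab-delimited tag files and merges them into a tag-by-library count matrix. It does not call read.sage() and is not the msage implementation. The file names, tags, counts, the helper read.tags(), and the list element names (data, lib.size) are all made up for illustration, and using the column sums as library sizes is an assumption.

{{{
# Hypothetical example data: two tiny libraries in "tag<TAB>count" form
tmp <- tempdir()
files <- file.path(tmp, c("libA.txt", "libB.txt"))
writeLines(c("AAAAAAAAAA\t5", "CCCCCCCCCC\t2"), files[1])
writeLines(c("AAAAAAAAAA\t1", "GGGGGGGGGG\t7"), files[2])

# Read one file into a vector of counts named by tag (hypothetical helper)
read.tags <- function(f) {
  d <- read.delim(f, header = FALSE, col.names = c("tag", "count"),
                  stringsAsFactors = FALSE)
  setNames(d$count, d$tag)
}
libs <- lapply(files, read.tags)

# Take the union of all tags; a tag missing from a library gets a count of 0
tags <- sort(unique(unlist(lapply(libs, names))))
counts <- sapply(libs, function(x) {
  v <- x[tags]
  v[is.na(v)] <- 0
  unname(v)
})
dimnames(counts) <- list(tags, basename(files))

# Assumption: library size is the total count in each library
sage <- list(data = counts, lib.size = colSums(counts))
}}}

The counts stay as raw integers here, and a tag absent from one file is filled in as zero rather than dropped, so every library ends up with the same rows.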
read.sage() returns a list with two elements: a data matrix in the first position and an array of library sizes in the second. The input files only need to list tags (or gene names) and their associated counts, separated by a tab character. The counts do not have to be normalized to tags per million (tpm). However, I already have my data in a table.

{{{
# set working directory and load library functions
setwd("E:/cry1/Solexa/Smyth")
source("fitNBP.R")
# msage.R itself sources kepler.R, so that file must be in the working directory
source("msage.R")
}}}

CategoryR