Random sampling of fastq files

Use the ShortRead library from Bioconductor to take random samples of a fastq file. This can be useful for systematically downsampling data.

The script below reads a directory of gzipped fastq files with names based on a bar code index, takes 1 million reads at random from each, and writes a new file to the current directory.


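library(ShortRead)
fqpath <- "/fastq/C0TD0ACXX/"
# gzipped fastq files whose names end in a bar code index
fqfiles <- dir(pattern="[AGCT]\\.fastq\\.gz$", path=fqpath)

for( f in fqfiles){
   s1 <- FastqSampler(paste(fqpath,f,sep=""),1e6)   # sample 1 million reads
   newName <- sub("\\.gz$","",f)
   # draw the sample and write it, uncompressed, to the current directory
   writeFastq(yield(s1), newName, compress=FALSE)
   close(s1)
}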
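
When down sampling systematically it often helps to be able to repeat a draw. The sketch below is a minimal illustration, not part of the original script: it assumes FastqSampler draws from R's random number generator, so that calling set.seed() before each yield() should give the same reads. The temporary file, the read names, and the sample size of 5 are made up for the example.

```r
library(ShortRead)

# Build a small throwaway fastq file (20 records) so the example is self-contained.
tmp <- tempfile(fileext = ".fastq")
ids <- sprintf("read%02d", 1:20)
writeLines(as.vector(rbind(paste0("@", ids), "ACGTACGT", "+", "IIIIIIII")), tmp)

# Draw 5 reads at random, fixing the seed first.
set.seed(42)
sampler <- FastqSampler(tmp, n = 5)
first <- yield(sampler)
close(sampler)

# Repeat the draw with the same seed.
set.seed(42)
sampler <- FastqSampler(tmp, n = 5)
second <- yield(sampler)
close(sampler)

# Number of reads sampled, and whether both draws picked the same read ids.
length(first)
identical(as.character(id(first)), as.character(id(second)))
```

The same idea could be applied to the loop above by calling set.seed() before each yield(), if a repeatable sample is wanted.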

R/Fastq (last edited 2013-03-11 18:00:27 by ChrisSeidel)