In this post, we will use previously calculated scale factors to create scale and strand-specific BigWig files. These files can be used to display data in the Genome Browser as a graph. We will use SAMtools split the BAM files into reads mapping to the forward and reverse strands. We will then apply bamCoverage function with -scaleFactor flag (deepTools) to convert strand-specific BAM files to a scaled BigWig files. The following script can be called using:
sbatch script bamfile samplename scalefactor
script.sh:
#!/bin/bash
#SBATCH --job-name=alignment
#SBATCH -p biology_weekend
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=28
#SBATCH --mem=30gb
#SBATCH -e error.txt
#SBATCH -o output.txt
#### Set working, temporary and results directories
export WORKDIR=/path/to/working/dir/
export HOMEDIR=/path/to/home/dir/
##modules
module load samtools/1.3.1
TMPDIR="${WORKDIR}tmp/"
BIGWIGDIR="${WORKDIR}bigwigdir/"
mkdir -p $BIGWIGDIR
mkdir -p $TMPDIR
##assign variables used when calling the script to BAM, SAMPLE and SCALEFACTOR variables
BAM=${HOMEDIR}${1}
SAMPLE=$2
SCALEFACTOR=$3
#sanity check:
echo "for file" $BAM "scalefactor is" $SCALEFACTOR "and samplename is" $SAMPLE
#### Create temporaty BAM files representing reads mapping to forward or reverse strands
BAMFOR="${TMPDIR}${SAMPLE}.fwd.bam"
BAMREV="${TMPDIR}${SAMPLE}.rev.bam"
BAMFOR1="${TMPDIR}${SAMPLE}.fwd1.bam"
BAMFOR2="${TMPDIR}${SAMPLE}.fwd2.bam"
BAMREV1="${TMPDIR}${SAMPLE}.rev1.bam"
BAMREV2="${TMPDIR}${SAMPLE}.rev2.bam"
#### Create bigwig files.
BIGWIG="${BIGWIGDIR}${SAMPLE}.bigwig" # all reads
BIGWIGFOR="${BIGWIGDIR}${SAMPLE}.for.bigwig" # reads mapping to forward strand
BIGWIGREV="${BIGWIGDIR}${SAMPLE}.rev.bigwig" # reads mapping to reverse strand
#### Create bigwig file for all reads.
bamCoverage --scaleFactor $SCALEFACTOR -b ${BAM} -o $BIGWIG
#### Create bigwig file for the forward strand.
#Include reads that are 2nd in a pair (128). Exclude reads that are mapped to the reverse strand (16)
samtools view -b -f 128 -F 16 $BAM > $BAMFOR1
#Exclude reads that are mapped to the reverse strand (16) and first in a pair (64): 64 + 16 = 80
samtools view -b -f 80 $BAM > $BAMFOR2
samtools merge -f $BAMFOR $BAMFOR1 $BAMFOR2
samtools index $BAMFOR
bamCoverage --scaleFactor $SCALEFACTOR -b $BAMFOR -o $BIGWIGFOR
#### Create bigwig file for the reverse strand.
#Include reads that map to the reverse strand (128) and are second in a pair (16): 128 + 16 = 144
samtools view -b -f 144 $BAM > $BAMREV1
#Include reads that are first in a pair (64), but exclude those ones that map to the reverse strand (16)<br>
samtools view -b -f 64 -F 16 $BAM > $BAMREV2
samtools merge -f $BAMREV $BAMREV1 $BAMREV2
samtools index $BAMREV
bamCoverage --scaleFactor $SCALEFACTOR -b $BAMREV -o $BIGWIGREV
#### Remove temporary files.
rm $BAMFOR $BAMFFOR1 $BAMFFOR2 $BAMREV $BAMREV1 $BAMREV2
#### Copy data to home directory
cp -R $BIGWIGDIR $HOMEDIR
Script has been adapted from Gregersen et al. (1).
References:
- Using TTchem-seq for profiling nascent transcription and measuring transcript elongation. Lea H. Gregersen, Richard Mitter & Jesper Q. Svejstrup