Monday, August 1, 2016

Categorizing the mapped reads



The mapped reads fall primarily into four categories: 1) properly paired; 2) improperly paired; and broken reads, which are either 3) pairs whose mates map to different scaffolds or 4) pairs in which only one mate maps.

For convenience, I have wrapped the samtools commands in a script that categorizes the paired-end (PE) reads mapped to the reference. Please find it here.

Running the script prints the number of reads in each category. Expect longer run times with very large BAM files.

./SAM_stats.sh file_coordinate_sorted.bam
Properly paired with correct insert distances: 191012
Wrong insert Distance or mates inverted: 1858
Broken Reads: 6284
Out of 6284 broken reads 6100 does not have a mate mapped
Out of 6284 broken reads, 92 pairs fall on different contigs
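
For readers who want to see how such a classification can be derived from SAM flags, here is a minimal sketch. It is not the SAM_stats.sh script itself: the function name classify_sam and the category labels are made up for illustration, and it assumes that "properly paired" means flag bit 0x2 is set, that a read whose mate is unmapped has bit 0x8 set, and that mates on different scaffolds show a reference name other than "=" in the RNEXT column. It counts individual reads, so the different-reference figure is in reads rather than pairs.

```shell
# Hypothetical helper (not the author's SAM_stats.sh): tally mapped paired-end
# reads by category from SAM text on stdin.
classify_sam() {
  awk -F '\t' '
    # bit(f, b): 1 if the flag bit with value b is set in f (plain arithmetic,
    # so it works in any POSIX awk)
    function bit(f, b) { return int(f / b) % 2 }
    /^@/ { next }                              # skip SAM header lines
    {
      f = $2
      if (!bit(f, 1) || bit(f, 4)) next        # keep paired, mapped reads only
      if (bit(f, 256) || bit(f, 2048)) next    # skip secondary/supplementary
      if (bit(f, 2))                     proper++         # 0x2 proper pair
      else if (bit(f, 8))                mate_unmapped++  # 0x8 mate unmapped
      else if ($7 != "=" && $7 != "*")   diff_ref++       # mate on another reference
      else                               improper++       # same reference, not proper
    }
    END {
      printf "proper\t%d\n", proper
      printf "improper\t%d\n", improper
      printf "mate_unmapped\t%d\n", mate_unmapped
      printf "different_reference\t%d\n", diff_ref
    }
  '
}
```

With samtools installed, it could be used as: samtools view file_coordinate_sorted.bam | classify_sam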
