Tuesday, June 28, 2022

Contigs with no read coverage (zero reads mapped) after mapping using Bowtie2

I recently mapped reads using Bowtie2 against viral genome segments to find the read counts for each of the segments. To my surprise, I found contigs with no reads mapped to them.
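For readers who want to reproduce this kind of per-segment count, here is a minimal sketch that tallies mapped reads per reference directly from SAM text. It is not the exact procedure I used; the function name `count_mapped_reads`, the example read names and the segment lengths are made up for illustration. It reads the `@SQ` header lines so that contigs with zero mapped reads still appear in the result.

```python
from collections import Counter


def count_mapped_reads(sam_lines):
    """Return {reference_name: mapped_read_count}, keeping zero-count references."""
    counts = Counter()
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            # @SQ header lines list every reference, whether or not reads hit it.
            if line.startswith("@SQ"):
                for field in line.split("\t")[1:]:
                    if field.startswith("SN:"):
                        counts[field[3:]] += 0  # register the name with a zero count
            continue
        fields = line.split("\t")
        flag = int(fields[1])
        rname = fields[2]
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800) records.
        if flag & 0x4 or flag & 0x100 or flag & 0x800 or rname == "*":
            continue
        counts[rname] += 1
    return dict(counts)


# Hypothetical SAM records, for illustration only (lengths are not real data).
example_sam = [
    "@HD\tVN:1.0\tSO:unsorted",
    "@SQ\tSN:PB2\tLN:2341",
    "@SQ\tSN:HA\tLN:1701",
    "read1\t0\tHA\t10\t42\t50M\t*\t0\t0\t*\t*",
    "read2\t16\tHA\t90\t42\t50M\t*\t0\t0\t*\t*",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
print(count_mapped_reads(example_sam))  # {'PB2': 0, 'HA': 2}
```

Here PB2 is listed in the header but receives no alignments, which is the situation described above.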

That is when I learned that Bowtie2 performs end-to-end read alignment by default (see here). Many assemblers do not build contigs from whole reads: the reads are first broken down into k-mers, which are then stitched together. In this case, I used the Megahit metagenome assembler, which by default iterates over k-mer sizes of 21, 41, 61, 81 and 99 (Reference). End-to-end mode requires every base of a read to take part in the alignment, so aligning reads this way may sometimes fail, leaving a particular contig with a zero read count or no read coverage.
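The two alignment modes differ by a single Bowtie2 flag: `--end-to-end` (the default) or `--local`. The sketch below only assembles the command line so the two runs can be compared side by side; it does not run Bowtie2. The helper name `bowtie2_command`, the index prefix and the file names are placeholders, and it assumes unpaired reads passed with `-U` (paired reads would use `-1` and `-2` instead).

```python
import shlex


def bowtie2_command(index_prefix, reads, out_sam, mode="end-to-end", threads=4):
    """Build a Bowtie2 command line for the given alignment mode."""
    if mode not in ("end-to-end", "local"):
        raise ValueError("mode must be 'end-to-end' or 'local'")
    return [
        "bowtie2",
        f"--{mode}",          # --end-to-end is Bowtie2's default; --local allows soft clipping
        "-p", str(threads),   # number of alignment threads
        "-x", index_prefix,   # index built beforehand with bowtie2-build
        "-U", reads,          # unpaired reads (assumption for this sketch)
        "-S", out_sam,        # SAM output file
    ]


# Placeholder names, not the files from this post.
for mode in ("end-to-end", "local"):
    cmd = bowtie2_command("segments_index", "reads.fastq", f"mapped_{mode}.sam", mode=mode)
    print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)` on a machine where Bowtie2 is installed.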

In the figure below, PB2 and NP (colored red) have no reads mapped in end-to-end mode, even though the assembler produced contigs for them. This most likely happened because assemblers use a k-mer approach to create contigs. This is when changing the setting from end-to-end to local makes more sense.

In local alignment, bases at the ends of a read that do not take part in the alignment are omitted ("soft trimmed" or "soft clipped") from the beginning or the end of the read. That is why PB2 and NP now have read counts. For the other segments, the read counts increased.
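Soft clipping shows up in the CIGAR column of the SAM output as `S` operations at either end of the alignment. As a small illustration (the function name `soft_clipped_bases` is my own, and the CIGAR strings are invented examples), this sketch reports how many bases were clipped from the start and the end of a read:

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clipped_bases(cigar):
    """Return (leading, trailing) soft-clipped base counts for a CIGAR string."""
    ops = [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) can sit outside soft clips, so ignore them here.
    ops = [(length, op) for length, op in ops if op != "H"]
    if not ops:
        return (0, 0)  # e.g. "*" for an unmapped read
    leading = ops[0][0] if ops[0][1] == "S" else 0
    trailing = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return (leading, trailing)


print(soft_clipped_bases("50M"))      # (0, 0)  all 50 bases aligned
print(soft_clipped_bases("5S40M5S"))  # (5, 5)  5 bases clipped at each end
print(soft_clipped_bases("12S38M"))   # (12, 0) clipped at the start only
```

In end-to-end mode Bowtie2 does not soft clip, so every aligned read there would report `(0, 0)`; non-zero values only appear in the local-mode output.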

So, in this particular case of alignment, local alignment of the reads using bowtie2 makes more sense.



For more discussion, see here:
