26-May-2017, 02:41 PM
Where does the big peak at maximum sequencing length (for me, 51 nt) come from? Or am I the only person who gets this?
It is in both my total RNA and RPF samples. When I plot the size distribution of my trimmed reads, they fall mostly between about 15-36 nt (centred around about 28 nt), but there is also a huge peak at 51 nt.
I could understand if I was getting a range all the way up to 51 nt, but I'm not - there's a big gap above about 36 nt and then the peak at 51 nt.
It is still there after removing rRNA reads and after aligning to the genome.
When we plot where these 51 nt sequences map to on the genome, the coverage is fairly even across it, so it doesn't seem to be one repetitive region or anything like that.
We plan to discard those reads, but I am interested in why we get them in the first place... Can anyone explain, please?
Thanks!
Sorry - I should have said, all of this is after adapter trimming.
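For anyone who wants to check their own data for the same peak, here is a minimal sketch of how the length distribution can be tallied and the full-length reads set aside. It assumes an uncompressed, 4-lines-per-record FASTQ of trimmed reads; the function names and the `max_len=51` default are just illustrative choices for this thread, not part of any particular pipeline.

```python
from collections import Counter


def read_lengths(fastq_handle):
    """Yield the sequence length of each record in a 4-line-per-record FASTQ stream."""
    for i, line in enumerate(fastq_handle):
        if i % 4 == 1:  # the second line of each record is the sequence
            yield len(line.strip())


def length_histogram(fastq_handle):
    """Count how many reads there are at each length."""
    return Counter(read_lengths(fastq_handle))


def drop_full_length(fastq_handle, max_len=51):
    """Yield records (as 4-tuples of lines) whose sequence is shorter than max_len.

    Assumption: reads still at the full sequencing length after trimming
    are the ones to be discarded.
    """
    record = []
    for line in fastq_handle:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            if len(record[1]) < max_len:
                yield tuple(record)
            record = []
```

Usage would be along the lines of `with open("trimmed.fastq") as fh: hist = length_histogram(fh)` (the file name is hypothetical), then plotting `hist` to see whether there is an isolated peak at the maximum read length.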